Lesson 20 Flashcards

1
Q

High Availability

A

High
availability is usually loosely described as 24x7 (24 hours per day, 7 days per week) or
24x365 (24 hours per day, 365 days per year). For a critical system, availability will be
described as “two-nines” (99%) up to five- or six-nines (99.9999%):

Availability    Annual Downtime
99.9999%        00:00:32
99.999%         00:05:15
99.99%          00:52:34
99.9%           08:45:36
99.0%           87:36:00
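As a quick check on the table above, the annual downtime figures can be derived directly from the availability percentage (a minimal sketch; assumes a 365-day year, matching the "24x365" description):

```python
# Annual downtime implied by an availability percentage.
# Assumes a 365-day year (365 * 24 * 3600 seconds).

def annual_downtime(availability_pct: float) -> str:
    """Return annual downtime as hh:mm:ss for a given availability %."""
    seconds = (1 - availability_pct / 100) * 365 * 24 * 3600
    h, rem = divmod(round(seconds), 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

print(annual_downtime(99.999))   # five nines  -> 00:05:15
print(annual_downtime(99.0))     # two nines   -> 87:36:00
```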

2
Q

Maximum tolerable downtime (MTD)

A

The maximum tolerable downtime (MTD) metric
expresses the availability requirement for a particular business function.


3
Q

Scheduled service intervals versus unplanned outages

A

Downtime is calculated from the sum of scheduled service intervals plus unplanned outages over
the period.

4
Q

Scalability

A

Scalability
•Increase capacity within similar cost ratio
•Scale out versus scale up

Scalability is the capacity to
increase resources to meet demand within similar cost ratios. This means that if service
demand doubles, costs do not more than double. There are two types of scalability:
• To scale out is to add more resources in parallel with existing resources.
• To scale up is to increase the power of existing resources.

5
Q

Elasticity

A

Elasticity
•Cope with changes to demand in real time

Elasticity refers to a system’s ability to handle changes in demand in real
time. A system with high elasticity will not experience loss of service or performance if
demand suddenly increases rapidly.

6
Q

Fault tolerance and redundancy

A

A system that can experience failures and continue to provide the same (or nearly the
same) level of service is said to be fault tolerant. Fault tolerance is often achieved
by provisioning redundancy for critical components and single points of failure. A
redundant component is one that is not essential to the normal function of a system
but that allows the system to recover from the failure of another component.

7
Q

Power problems
•Spikes and surges
•Blackouts and brownouts

A

All types of computer systems require a stable power supply to operate. Electrical
events, such as voltage spikes or surges, can crash computers and network appliances,
while loss of power from brownouts or blackouts will cause equipment to fail.

8
Q

Power management

A

Power management means deploying systems to ensure that equipment is protected against
these events [blackouts, brownouts, spikes and surges] and that network operations can either continue uninterrupted or be
recovered quickly.

9
Q

Dual Power Supplies

A

Dual power supplies
•Component redundancy for server chassis

An enterprise-class server or appliance enclosure is likely to feature two or more power
supply units (PSUs) for redundancy. A hot plug PSU can be replaced (in the event of
failure) without powering down the system.
10
Q

Managed power distribution units (PDUs)

A

Managed power distribution units (PDUs)
•Protection against spikes, surges, and brownouts
•Remote monitoring

The power circuits supplying grid power to a rack, network closet, or server room
must be enough to meet the load capacity of all the installed equipment, plus room
for growth. Consequently, circuits to a server room will typically be higher capacity
than domestic or office circuits (30 or 60 amps as opposed to 13 amps, for instance).
These circuits may be run through a power distribution unit (PDU). These come with
circuitry to “clean” the power signal, provide protection against spikes, surges, and
brownouts, and can integrate with uninterruptible power supplies (UPSs). Managed
PDUs support remote power monitoring functions, such as reporting load and
status, switching power to a socket on and off, or switching sockets on in a particular
sequence.

11
Q

Battery backups and uninterruptible power supply (UPS)

A

Battery backups and uninterruptible power supply (UPS)
•Battery backup at component level
•UPS battery backups for servers and appliances

If there is loss of power, system operation can be sustained for a few minutes or hours
(depending on load) using battery backup. Battery backup can be provisioned at the
component level for disk drives and RAID arrays. The battery protects any read or write
operations cached at the time of power loss. At the system level, an uninterruptible
power supply (UPS) will provide a temporary power source in the event of a blackout
(complete power loss). This may range from a few minutes for a desktop-rated model
to hours for an enterprise system. In its simplest form, a UPS comprises a bank of
batteries and their charging circuit plus an inverter to generate AC voltage from the DC
voltage supplied by the batteries.
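A rough battery-runtime estimate makes the "few minutes to hours, depending on load" point concrete (an illustrative sketch only; the 480 Wh and 300 W figures and the 90% inverter efficiency are invented assumptions, and real runtime also varies with battery age and load curve):

```python
# Rough UPS runtime estimate from battery capacity and load.
# Inverter efficiency of 0.9 is an assumed typical value.

def ups_runtime_minutes(battery_wh: float, load_w: float,
                        inverter_efficiency: float = 0.9) -> float:
    """Estimate minutes of runtime for a given steady load."""
    return battery_wh * inverter_efficiency / load_w * 60

# A small 480 Wh desktop-rated UPS feeding a 300 W workstation:
print(round(ups_runtime_minutes(480, 300)))  # 86 (minutes)
```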

12
Q

Generators

A

A backup power generator can provide power to the whole building, often for several
days. Most generators use diesel, propane, or natural gas as a fuel source. With diesel
and propane, the main drawback is safe storage (diesel also has a shelf-life of between
18 months and two years); with natural gas, the issue is the reliability of the gas
supply in the event of a natural disaster. Data centers are also investing in renewable
power sources, such as solar, wind, geothermal, hydrogen fuel cells, and hydro. The
ability to use renewable power is a strong factor in determining the best site for new
data centers. Large-scale battery solutions, such as Tesla’s Powerpack (tesla.com/
powerpack), may be able to provide an alternative to backup power generators. There
are also emerging technologies to use all the battery resources of a data center as a
microgrid for power storage (scientificamerican.com/article/how-big-batteries-at-datacenters-
could-replace-power-plants/).

13
Q

Network redundancy

A

Networking is another critical resource where a single point of failure could cause
significant service disruption.

14
Q

Network Interface Card (NIC) Teaming

A

Network interface card (NIC) teaming, or adapter teaming, means that the server
is installed with multiple NICs, or NICs with multiple ports, or both. Each port is
connected to separate network cabling. During normal operation, this can provide a
high-bandwidth link. For example, four 1 Gbps ports give an overall bandwidth of 4 Gbps.
If there is a problem with one cable, or one NIC, the network connection will continue
to work, though at just 3 Gbps.

From Wikipedia: A network interface controller (NIC, also known as a network interface card, network adapter, LAN adapter or physical network interface, and by similar terms) is a computer hardware component that connects a computer to a computer network.
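The card's four-port example can be sketched in a couple of lines (illustrative only; the function name and port list are invented for the example):

```python
# Aggregate bandwidth of a NIC team: the team keeps working when a
# link fails, but total bandwidth drops by that link's capacity.

def team_bandwidth_gbps(ports_gbps: list, failed: int = 0) -> float:
    """Total bandwidth of the surviving links in the team."""
    surviving = sorted(ports_gbps)[failed:]  # drop the failed links
    return sum(surviving)

print(team_bandwidth_gbps([1, 1, 1, 1]))            # 4 - normal operation
print(team_bandwidth_gbps([1, 1, 1, 1], failed=1))  # 3 - one cable/NIC down
```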

15
Q

Switching and Routing (for network redundancy)

A

Switching and routing
•Design network with multiple paths

Network cabling should be designed to allow for multiple paths between the various
switches and routers, so that during a failure of one part of the network, the rest
remains operational.

16
Q

Load balancers (for network redundancy)

A

Load balancers
•Load balancing switch to distribute workloads
•Clusters provision multiple redundant servers to share data and session information

NIC teaming provides load balancing at the adapter level. Load balancing and
clustering can also be provisioned at a service level:
• A load balancing switch distributes workloads between available servers.
• A load balancing cluster enables multiple redundant servers to share data and
session information to maintain a consistent service if there is failover from one
server to another.

17
Q

Disk Redundancy

A

Disk and storage resources are critically dependent on redundancy. While backup provides
integrity for when a disk fails, to restore from backup would require installing a new
storage unit, restoring the data, and testing the system configuration. Disk redundancy
ensures that a server can continue to operate if one, or possibly more, storage devices fail

18
Q

Redundant array of independent disks (RAID)

A

When a storage system is configured as a Redundant Array of Independent Disks
(RAID), many disks can act as backups for each other to increase reliability and fault
tolerance. If one disk fails, the data is not lost, and the server can keep functioning.
The RAID advisory board defines RAID levels, numbered from 0 to 6, where each level
corresponds to a specific type of fault tolerance. There are also proprietary and nested
RAID solutions. Some of the most commonly implemented types of RAID are listed in
the following table.

19
Q

RAID 1, 5, 6, Nested, and Level 0

A

RAID 1
•Mirroring
•50% storage efficiency

RAID 5 and RAID 6
•Striping with distributed parity
•Better storage efficiency

Nested RAID
•Better performance or redundancy

RAID Level 1

Mirroring means that data is written to two
disks simultaneously, providing redundancy
(if one disk fails, there is a copy of data
on the other). The main drawback is that
storage efficiency is only 50%.

RAID Level 5

Striping with parity means that data is
written across three or more disks, but
additional information (parity) is calculated.
This allows the volume to continue if one
disk is lost. This solution has better storage
efficiency than RAID 1.

RAID Level 6

Double parity, or level 5 with an additional
parity stripe, allows the volume to continue
when two devices have been lost.

Nested RAID (0+1, 1+0, or 5+0)

Nesting RAID sets generally improves
performance or redundancy. For example,
some nested RAID solutions can support the
failure of more than one disk.

RAID Level 0

RAID level 0 refers to striping without parity. Data is written in blocks across several disks
simultaneously, but with no redundancy. This can improve performance, but if one disk
fails, so does the whole volume, and data on it will be corrupted. There are some use cases
for RAID 0, but typically striping without parity is only implemented to improve performance
in a nested RAID solution.
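The storage-efficiency trade-offs above can be made concrete by computing usable capacity per level (a minimal sketch; assumes equal-sized disks, which is how RAID capacity is normally reckoned):

```python
# Usable capacity for the basic RAID levels, given equal-sized disks.

def usable_capacity(level: int, disks: int, disk_tb: float) -> float:
    if level == 0:                      # striping, no redundancy
        return disks * disk_tb
    if level == 1:                      # mirroring: 50% efficiency
        return disks * disk_tb / 2
    if level == 5:                      # one disk's worth of parity
        return (disks - 1) * disk_tb
    if level == 6:                      # two disks' worth of parity
        return (disks - 2) * disk_tb
    raise ValueError("unsupported RAID level")

print(usable_capacity(1, 2, 4))   # 4.0  TB usable from 8 TB raw
print(usable_capacity(5, 4, 4))   # 12.0 TB usable from 16 TB raw
print(usable_capacity(6, 4, 4))   # 8.0  TB usable from 16 TB raw
```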

20
Q

Multipath

A

Multipath
•Controller and cabling redundancy

Where RAID provides redundancy for the storage devices, multipath is focused on
the bus between the server and the storage devices or RAID array. A storage system is
accessed via some type of controller. The controller might be connected to disk units
locally installed in a server, or it might connect to storage devices within a storage area
network (SAN). Multipath input/output (I/O) ensures that there is controller redundancy
and/or multiple network paths to the storage devices.

21
Q

Replication context

A
  • Local storage (RAID)
  • Storage area network (SAN)
  • Database
  • Virtual machine (VM)

Data replication is technology that maintains exact copies of data at more than one
location. RAID mirroring and parity implement types of replication between local
storage devices. Data replication can be applied in many other contexts:
• Storage Area Network (SAN)—most enterprise storage is configured as a SAN. A
SAN is a high-speed fiber optic network of storage devices built from technologies
such as Fibre Channel, Small Computer System Interface (SCSI), or Infiniband.
Redundancy can be provided within the SAN, and replication can also take place
between SANs using WAN links.
• Database—much data is stored within a database. Where a database is replicated
between multiple servers or sites, it is very important to maintain consistency
between the replicas. Database management systems come with specific tools to
implement different kinds of replication.
• Virtual Machine (VM)—the same VM instance may need to be deployed in multiple
locations. This can be achieved by replicating the VM’s disk image and configuration
settings.

22
Q

Geographic dispersal

A

Geographic dispersal refers to replicating data between hot and warm sites that are
physically distant from one another. This means that data is protected against a
natural disaster wiping out storage at one of the sites. This is also described as a
geo-redundant solution.

23
Q

Asynchronous and synchronous replication

A
  • Synchronous (must be written at both sites—expensive)
  • Asynchronous (one site is primary and the others secondary)
  • Optimum distances between sites

Synchronous replication is designed to write data to all replicas simultaneously.
Therefore, all replicas should always have the same data all of the time. Asynchronous
replication writes data to the primary storage first, and then copies data to the replicas
at scheduled intervals.
Asynchronous replication isn’t a good choice for a solution that requires data in
multiple locations to be consistent, such as data from product inventory lists accessed
in different regions. Many geo-redundant replication services rely on asynchronous
replication due to the distances between data centers in multiple regions. In some
cases, business solutions work around the limitations of asynchronous replication. For
example, an online retailer may choose only to show inventory from their local regional
warehouse.
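The difference between the two modes can be sketched in a few lines (illustrative only; the class, the dict-based replicas, and the flush step standing in for the "scheduled interval" are all invented for the example):

```python
# Synchronous vs. asynchronous replication, in miniature.

class Replicator:
    def __init__(self, replicas):
        self.replicas = replicas          # each replica is a dict store
        self.pending = []                 # async catch-up queue

    def write_sync(self, key, value):
        # Synchronous: every replica is written before the call returns,
        # so all copies hold the same data all of the time.
        for r in self.replicas:
            r[key] = value

    def write_async(self, key, value):
        # Asynchronous: only the primary is written now; the others
        # catch up later, at a scheduled interval.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def flush(self):
        # The scheduled-interval catch-up for async replication.
        for key, value in self.pending:
            for r in self.replicas[1:]:
                r[key] = value
        self.pending.clear()

primary, secondary = {}, {}
rep = Replicator([primary, secondary])
rep.write_async("stock", 42)
print(secondary)         # {} - the replica lags until the next flush
rep.flush()
print(secondary)         # {'stock': 42}
```

This is why a geo-redundant async design can briefly show stale inventory, as in the retailer example above.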

24
Q

On-Premises versus Cloud

A

High availability through redundancy and replication is resource-intensive, especially
when configuring multiple hot or warm sites. For on-premises sites, provisioning the
storage devices and high-bandwidth, low-latency WAN links required between two
geographically dispersed hot sites could incur unaffordable costs. This cost is one of the
big drivers of cloud services, where local and geographic redundancy are built into the
system, if you trust the CSP to operate the cloud effectively. For example, in the cloud,
geo-redundancy replicates data or services between data centers physically located
in two different regions. Disasters that occur at the regional level, like earthquakes,
hurricanes, or floods, should not impact availability across multiple zones.

25
Q

Backups and Retention Policy

A
  • Short term retention
    • Version control and recovery from corruption/malware
  • Long term retention
    • Regulatory/business requirements
  • Recovery window
    • Recovery point objective (RPO)
26
Q

Short term retention

A
  • Short term retention
    • Version control and recovery from corruption/malware

In the short term, files that change frequently might need retaining for version
control. Short-term retention is also important in recovering from malware
infection. Consider the scenario where a backup is made on Monday, a file is
infected with a virus on Tuesday, and when that file is backed up later on Tuesday,
the copy made on Monday is overwritten. This means that there is no good means
of restoring the uninfected version of the file. Short-term retention is determined by
how often the youngest media sets are overwritten.

27
Q

Long term retention

A
  • Long term retention
    • Regulatory/business requirements

28
Q

Recovery window

A
  • Recovery window
    • Recovery point objective (RPO)

For these reasons, backups are retained going back to certain points in time. As backups take up
a lot of space, and there is never limitless storage capacity, this introduces the need for
storage management routines to reduce the amount of data occupying backup storage
media while giving adequate coverage of the required recovery window. The recovery
window is determined by the recovery point objective (RPO), which is determined
through business continuity planning. Advanced backup software can prevent media
sets from being overwritten in line with the specified retention policy.
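The link between backup frequency and the RPO can be stated as a one-line check (a sketch; the example intervals are invented): in the worst case, a failure occurs just before the next backup, so the data lost equals one full backup interval.

```python
# Does a backup schedule cover the recovery point objective (RPO)?
# Worst-case data loss is the time since the most recent backup,
# i.e. up to one full backup interval.

def meets_rpo(backup_interval_hours: float, rpo_hours: float) -> bool:
    return backup_interval_hours <= rpo_hours

print(meets_rpo(24, 4))   # False - nightly backups cannot meet a 4-hour RPO
print(meets_rpo(1, 4))    # True  - hourly backups can
```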

29
Q

Backup Types

A

See slide or guide; there is a graphic showing the three types:

Full: backs up everything selected, regardless of previous backups, and clears the archive attribute.
Incremental: backs up only data changed since the last backup of any type, and clears the archive attribute.
Differential: backs up all data changed since the last full backup, leaving the archive attribute set.
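The three types can be modeled with the archive bit (a sketch of the standard behavior: full and incremental clear the bit, differential does not; the file names are invented):

```python
# Which files each backup type copies, modeled with the archive bit.

def run_backup(files: dict, kind: str) -> list:
    """files maps name -> archive bit (True = changed). Returns names copied."""
    if kind == "full":
        copied = list(files)                       # copy everything
        for f in files:
            files[f] = False                       # bit cleared
    else:
        copied = [f for f, changed in files.items() if changed]
        if kind == "incremental":
            for f in copied:
                files[f] = False                   # bit cleared
        # differential leaves the bit set, so it grows until the next full
    return copied

files = {"a.doc": True, "b.xls": True}
run_backup(files, "full")                          # both copied, bits cleared
files["a.doc"] = True                              # a.doc modified afterwards
print(run_backup(dict(files), "differential"))     # ['a.doc']
print(run_backup(files, "incremental"))            # ['a.doc']
```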

30
Q

Snapshots

A
  • Snapshots
    • Feature of file system allowing open file copy
    • Volume Shadow Copy Service (VSS)
    • VM snapshots and checkpoints
  • Image-based backup
    • System images

Snapshots are a means of getting around the problem of open files. If the data that
you’re considering backing up is part of a database, such as SQL data or an Exchange
messaging system, then the data is probably being used all the time. Often copy-based
mechanisms will be unable to back up open files. Short of closing the files, and so too
the database, a copy-based system will not work. A snapshot is a point-in-time copy
of data maintained by the file system. A backup program can use the snapshot rather
than the live data to perform the backup. In Windows, snapshots are provided for on
NTFS volumes by the Volume Shadow Copy Service (VSS). They are also supported on
Sun’s ZFS file system, and under some enterprise distributions of Linux.

Virtual system managers can usually take snapshot or cloned copies of VMs. A
snapshot remains linked to the original VM, while a clone becomes a separate VM from
the point that the cloned image was made.
An image backup is made by duplicating an OS installation. This can be done either
from a physical hard disk or from a VM’s virtual hard disk. Imaging allows the system
to be redeployed quickly, without having to reinstall third-party software, patches, and
configuration settings. A system image should generally not contain any user data files,
as these will quickly become out of date.

31
Q

Image-based backup

A
  • Image-based backup
    • System images

An image backup is made by duplicating an OS installation. This can be done either
from a physical hard disk or from a VM’s virtual hard disk. Imaging allows the system
to be redeployed quickly, without having to reinstall third-party software, patches, and
configuration settings. A system image should generally not contain any user data files,
as these will quickly become out of date.

32
Q

Volume Shadow Copy Service (VSS)

A

In Windows, snapshots are provided for on

NTFS volumes by the Volume Shadow Copy Service (VSS).

33
Q

VM snapshots and checkpoints

A

Virtual system managers can usually take snapshot or cloned copies of VMs. A
snapshot remains linked to the original VM, while a clone becomes a separate VM from
the point that the cloned image was made.

34
Q

Backup Storage Issues

A

Backup security
•Access control and encryption

Offsite storage
•Distance consideration
•Physical transfer
•Network/cloud backups

Online versus offline backups
•Speed of restore operations
•Risk to online backup data (offline backups take more time to bring into operation but offer better security)

3-2-1 rule

35
Q

3-2-1 rule

A

The 3-2-1 rule states that you should have three copies of your data, across two media
types, with one copy held offline and offsite.
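A trivial checker makes the rule's three conditions explicit (a sketch; the copy records and their attributes are invented, not from any real backup product):

```python
# 3-2-1 rule: three copies, two media types, one copy offline and offsite.

def satisfies_321(copies: list) -> bool:
    return (len(copies) >= 3
            and len({c["media"] for c in copies}) >= 2
            and any(c["offsite"] and c["offline"] for c in copies))

copies = [
    {"media": "disk", "offsite": False, "offline": False},  # live data
    {"media": "disk", "offsite": False, "offline": False},  # NAS copy
    {"media": "tape", "offsite": True,  "offline": True},   # vaulted tape
]
print(satisfies_321(copies))  # True
```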

36
Q

Backup Media Types

A

Disk
•SOHO backups
•Lack enterprise-level capacity and manageability

Network attached storage (NAS)
•File-level/protocol-based access
•No offsite option

Tape
•Enterprise-level capacity and manageability

Storage area network (SAN) and cloud
•Block-level access to storage devices
•Highly configurable
•Mix storage technologies to implement performance tiers

37
Q

Disk

A

Disk
•SOHO backups
•Lack enterprise-level capacity and manageability

Individual removable hard drives are an excellent low-cost option for SOHO network
backups, but they do not have sufficient capacity or flexibility to be used within an
automated enterprise backup solution.

38
Q

Network attached storage (NAS)

A

Network attached storage (NAS)
•File-level/protocol-based access
•No offsite option

A network attached storage (NAS) appliance is a specially configured type of server
that makes RAID storage available over common network protocols, such as Windows
File Sharing (SMB) or FTP. A NAS appliance is accessed via an IP address and backup
takes place at file-level. A NAS can be another good option for SOHO backup, but as
a single device, it provides no offsite option. As it is normally kept online, it can be
vulnerable to crypto-ransomware as well.

39
Q

Tape

A

Tape
•Enterprise-level capacity and manageability
•Slow, especially for restore operations

Digital tape systems are a popular choice for institutions with multi-terabyte storage
requirements. Tape is very cost effective and, given a media rotation system, tapes can
be transported offsite. The latest generation of tape will store about 10-12 terabytes
per cartridge or up to about 30 TB with compression. The main drawback of tape is
that it is slow, compared to disk-based solutions, especially for restore operations.

40
Q

Storage area network (SAN) and cloud

A

Storage area network (SAN) and cloud
•Block-level access to storage devices
•Highly configurable
•Mix storage technologies to implement performance tiers

A RAID array or tape drive/autoloader can be provisioned as direct attached storage,
where a server hosts the backup devices, usually over serial attached SCSI (SAS).
Direct attached storage has limited scalability, so enterprise and cloud storage
solutions often use storage area networks (SAN) as a layer of abstraction between
the file system objects presented to servers and the configuration of the actual storage
media. Where NAS uses file-level access to storage, a SAN is based on block-level
addressing. A SAN can incorporate RAID arrays and tape systems within the same
network. SANs can achieve offsite storage through replication.

41
Q

Restoration Order

A

A complex facility such as a data
center or campus network must be reconstituted according to a carefully designed
order of restoration. If systems are brought back online in an uncontrolled way, there
is the serious risk of causing additional power problems or of causing problems in the
network, OS, or application layers because dependencies between different appliances
and servers have not been met.

  1. Power delivery systems
  2. Switch infrastructure then routing appliances and systems
  3. Network security appliances
  4. Critical network servers
  5. Backend and middleware and verify data integrity
  6. Front-end applications
  7. Client workstations and devices and client browser access
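The ordering above is really a dependency graph, so it can be derived with a topological sort (a sketch; the dependency map simply re-encodes the numbered list, and the stage names are shortened):

```python
# Deriving a restoration order from dependencies with a topological sort.
from graphlib import TopologicalSorter

depends_on = {
    "switches":       ["power"],
    "routers":        ["switches"],
    "security":       ["routers"],
    "servers":        ["security"],
    "middleware":     ["servers"],
    "front-end apps": ["middleware"],
    "clients":        ["front-end apps"],
}

order = list(TopologicalSorter(depends_on).static_order())
print(order)   # power delivery first, client devices last
```

Bringing systems up in this order guarantees every dependency is met before the layer above it starts.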
42
Q

Non-Persistence

A

Separate compute instance from data
•Snapshot/revert to known state
•Rollback to known configuration
•Live boot media

Provisioning
•Master image
•Automated build from template

Configuration validation

43
Q

Definition of Non-persistence

A

Separate compute instance from data
•Snapshot/revert to known state
•Rollback to known configuration
•Live boot media

When recovering systems, it may be necessary to ensure that any artifacts from
the disaster, such as malware or backdoors, are removed when reconstituting the
production environment. This can be facilitated in an environment designed for
nonpersistence.

Nonpersistence means that any given instance is completely static
in terms of processing function. Data is separated from the instance so that it can be
swapped out for an “as new” copy without suffering any configuration problems. There
are various mechanisms for ensuring nonpersistence:

44
Q

Provisioning

A

Provisioning
•Master image
•Automated build from template

When provisioning a new or replacement instance automatically, the automation
system may use one of two types of mastering instructions:
• Master image—this is the “gold” copy of a server instance, with the OS, applications,
and patches all installed and configured. This is faster than using a template, but
keeping the image up to date can involve more work than updating a template.
• Automated build from a template—similar to a master image, this is the build
instructions for an instance. Rather than storing a master image, the software may
build and provision an instance according to the template instructions.

45
Q

Configuration Validation

A

Another important process in automating resiliency strategies is to provide
configuration validation. This process ensures that a recovery solution is working
at each layer (hardware, network connectivity, data replication, and application). An
automation solution for incident and disaster recovery will have a dashboard of key
indicators and may be able to evaluate metrics such as compliance with RPO and RTO
from observed data.

46
Q

Configuration Management

A

Configuration management ensures that each component of ICT infrastructure is in
a trusted state that has not diverged from its documented properties. Change control
and change management reduce the risk that changes to these components could
cause service disruption.

  • Service assets
  • Configuration items (CIs)
    • Assets that require configuration management
  • Baseline configuration
  • Configuration management system (CMS)
    • Creating and updating diagrams
    • Workflows
    • Physical and logical network topologies
    • Network rack layouts
47
Q

Service Assets

A

Service assets are things, processes, or people that contribute to the delivery of an
IT service.

48
Q

Configuration Items (CIs)

A
  • Configuration items (CIs)
    • Assets that require configuration management

A Configuration Item (CI) is an asset that requires specific management procedures
for it to be used to deliver the service. Each CI must be identified by some sort of
label, ideally using a standard naming convention. CIs are defined by their attributes
and relationships, which are stored in a configuration management database
(CMDB).

49
Q

•Baseline configuration

A

A baseline configuration is the template of settings that a device, VM instance, or
other CI was configured to, and that it should continue to match. You might also
record performance baselines, such as the throughput achieved by a server, for
comparison with monitored levels.
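Checking a CI against its baseline amounts to a diff of settings (a sketch; the function name and the example settings are invented):

```python
# Detect drift from a baseline configuration.

def config_drift(baseline: dict, current: dict) -> dict:
    """Return settings whose current value differs from the baseline,
    as {setting: (baseline_value, current_value)}."""
    return {k: (baseline[k], current.get(k))
            for k in baseline
            if current.get(k) != baseline[k]}

baseline = {"ssh": "disabled", "ntp": "10.0.0.1", "snmp": "v3"}
current  = {"ssh": "enabled",  "ntp": "10.0.0.1", "snmp": "v3"}
print(config_drift(baseline, current))  # {'ssh': ('disabled', 'enabled')}
```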

50
Q

•Configuration management system (CMS)

A
  • Configuration management system (CMS)
    • Creating and updating diagrams
    • Workflows
    • Physical and logical network topologies
    • Network rack layouts

A configuration management system (CMS) is the tools and databases that collect,
store, manage, update, and present information about CIs and their relationships.
A small network might capture this information in spreadsheets and diagrams;
there are dedicated applications for enterprise CMS.

Diagrams are the best way to capture the complex relationships between network
elements. Diagrams can be used to show how CIs are involved in business
workflows, logical (IP) and physical network topologies, and network rack layouts.
Remember, it is not sufficient simply to create the diagram; you must also keep it
up to date.

51
Q

Asset Management

A

Inventory/asset management database

Asset identification and standard naming conventions
•Barcodes and RFID tags
•Standard naming conventions for asset IDs
•Attribute fields and tags

Internet protocol (IP) schema
•Static allocation versus DHCP ranges
•IP address management (IPAM) software suites
52
Q

Asset Management (definition)

A

An asset management process tracks all the organization’s critical systems,
components, devices, and other objects of value in an inventory. It also involves
collecting and analyzing information about these assets so that personnel can make
more informed changes or otherwise work with assets to achieve business goals.

53
Q

Inventory/asset management database

A

There are many software suites and associated hardware solutions available for
tracking and managing assets. An asset management database can be configured to
store as much or as little information as is deemed necessary, though typical data
would be type, model, serial number, asset ID, location, user(s), value, and service
information.

54
Q

Asset identification and standard naming conventions

A

Asset identification and standard naming conventions
•Barcodes and RFID tags
•Standard naming conventions for asset IDs
•Attribute fields and tags

Tangible assets can be identified using a barcode label or radio frequency ID (RFID) tag
attached to the device (or more simply, using an identification number). An RFID tag is
a chip programmed with asset data. When in range of a scanner, the chip activates and
signals the scanner. The scanner alerts management software to update the device’s
location. As well as asset tracking, this allows the management software to track the
location of the device, making theft more difficult.
A standard naming convention for hardware assets, and for digital assets such as
accounts and virtual machines, makes the environment more consistent. This means
that errors are easier to spot and that it is easier to automate through scripting. The
naming strategy should allow administrators to identify the type and function of any
particular resource or location at any point in the CMDB or network directory. Each
label should conform to rules for host and DNS names (support.microsoft.com/en-us/
help/909264/naming-conventions-in-active-directory-for-computers-domains-sitesand).
As well as an ID attribute, the location and function of tangible and digital assets
can be recorded using attribute tags and fields or DNS CNAME and TXT resource
records.
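A naming convention is easy to enforce with a generator plus a validating pattern (a sketch; the SITE-TYPE-NNNN convention and the type codes are invented examples, not a standard):

```python
# Generate and validate asset IDs under a standard naming convention.
import re

# Example convention: three-letter site, type code, four-digit sequence.
ASSET_ID = re.compile(r"^[A-Z]{3}-(SRV|SW|RTR|WKS)-\d{4}$")

def make_asset_id(site: str, kind: str, seq: int) -> str:
    return f"{site.upper()}-{kind.upper()}-{seq:04d}"

print(make_asset_id("nyc", "srv", 17))          # NYC-SRV-0017
print(bool(ASSET_ID.match("NYC-SRV-0017")))     # True
print(bool(ASSET_ID.match("nyc_server_17")))    # False - easy to spot errors
```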

55
Q

Internet protocol (IP) schema

A
Internet protocol (IP) schema
•Static allocation versus DHCP ranges
•IP address management (IPAM) software suites

The division of the IP address space into subnets should be carefully planned and
documented in an Internet Protocol (IP) schema. Using a consistent addressing
methodology makes it easier to apply firewall access control lists (ACLs) and perform
security monitoring (tools.cisco.com/security/center/resources/security_ip_addressing.
html). It also makes configuration errors less likely and easier to detect. Within
each subnet, the schema should identify IP addresses reserved for manual or static
allocation versus DHCP address pools. IP address management (IPAM) software
suites can be used to monitor IP usage.
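The schema can be documented and checked with the standard library's `ipaddress` module (a sketch; the 10.1.20.0/24 network and the static/DHCP split are invented examples):

```python
# Documenting an IP schema: a subnet with a reserved static range
# and a DHCP pool.
import ipaddress

subnet = ipaddress.ip_network("10.1.20.0/24")
hosts = list(subnet.hosts())            # 10.1.20.1 .. 10.1.20.254

static_range = hosts[:19]               # .1-.19 reserved for static allocation
dhcp_pool = hosts[19:]                  # .20-.254 handed out by DHCP

print(subnet.num_addresses)             # 256
print(static_range[-1], dhcp_pool[0])   # 10.1.20.19 10.1.20.20
```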

56
Q

Change Control and Change Management

A
Change control
•Assess whether a change should be made
•Classifying change (reactive, proactive, risk)
•Request for Change (RFC)
•Change Advisory Board (CAB)

Change management
•Ensure changes are applied with minimum disruption
•Rollback plan: every change should be accompanied by a rollback (or remediation) plan, so that the change can be reversed if it has harmful or unforeseen consequences.

57
Q

Site Resiliency

A

Alternate processing sites/recovery sites
•Provide redundancy for damage to resources stored on the primary site
•Failover to alternate processing site (or system)

Hot site
•Instantaneous failover

Warm site
•Some delay or manual configuration before failover occurs

Cold site
•Significant delay and configuration before failover can occur

58
Q

Hot Site

A

Hot site
•Instantaneous failover

A hot site can fail over almost immediately. It generally means that the site is
already within the organization’s ownership and is ready to deploy. For example,
a hot site could consist of a building with operational computer equipment that is
kept updated with a live data set.

59
Q

Warm Site

A

Warm site
•Some delay or manual configuration before failover occurs

A warm site could be similar, but with the requirement that the latest data set will
need to be loaded.

60
Q

Cold Site

A

Cold site
•Significant delay and configuration before failover can occur

A cold site takes longer to set up. A cold site may be an empty building with a lease
agreement in place to install whatever equipment is required when necessary.

61
Q

Diversity and Defense in Depth

A

Layered security and defense in depth

Technology and control diversity
•Provision different classes and types of controls
•Mix technical, administrative, and physical controls
•Deploy controls to prevent, deter, detect, and correct

Vendor diversity
•Use more than one supplier

Crypto diversity

62
Q

Layered security and defense in depth

A

Layered security is typically seen as improving cybersecurity resiliency because
it provides defense in depth. The idea is that to fully compromise a system, the
attacker must get past multiple security controls, providing control diversity. These
layers reduce the potential attack surface and make it much more likely that an attack
will be deterred or prevented, or at least detected and then prevented by manual
intervention.

63
Q

Technology and control diversity

A

Technology and control diversity
•Provision different classes and types of controls
•Mix technical, administrative, and physical controls
•Deploy controls to prevent, deter, detect, and correct

Allied with defense in depth is the concept of security through (or with) diversity.
Technology diversity refers to environments that are a mix of operating systems,
applications, coding languages, virtualization solutions, and so on. Control diversity
means that the layers of controls should combine different classes of technical and
administrative controls with the range of control functions: prevent, detect, correct,
and deter.
Consider the scenario where Alan from marketing is sent a USB stick containing
designs for a new billboard campaign from an agency. Without defense in depth, Alan
might find the USB stick on his desk in the morning, plug it into his laptop without
much thought, and from that point is potentially vulnerable to compromise. There are
many opportunities in this scenario for an attacker to tamper with the media: at the
agency, in the post, or at Alan’s desk.
Defense in depth, established by deploying a diverse range of security controls, could
mitigate the numerous risks inherent in this scenario:
• User training (administrative control) could ensure that the media is not left
unattended on a desk and is not inserted into a computer system without scanning
it first.
• Endpoint security (technical control) on the laptop could scan the media for
malware or block access automatically.
• Security locks inserted into USB ports (physical control) on the laptop could prevent
attachment of media without requesting a key, allowing authorization checks to be
performed first.
• Permissions restricting Alan’s user account (technical control) could prevent the
malware from executing successfully.
• The use of encrypted and digitally signed media (technical control) could prevent or
identify an attempt to tamper with it.
• If the laptop were compromised, intrusion detection and logging/alerting systems
(technical control) could detect and prevent the malware spreading on the network.

64
Q

Vendor diversity

A

Vendor diversity
•Use more than one supplier

As well as deploying multiple types of controls, you should consider the advantages
of leveraging vendor diversity. Vendor diversity means that security controls are
sourced from multiple suppliers. A single vendor solution is a tempting choice for
many organizations, as it provides interoperability and can reduce training and
support costs.
Some disadvantages of relying on a single vendor could include the following:
• Not obtaining best-in-class performance—one vendor might provide an effective
firewall solution, but the bundled malware scanning is found to be less effective.
• Less complex attack surface—a single vulnerability in a supplier’s code could put
multiple appliances at risk in a single vendor solution. A threat actor will be able to
identify controls and possible weaknesses more easily.
• Less innovation—dependence on a single vendor might make the organization
invest too much trust in that vendor’s solutions and less willing to research and test
new approaches.

65
Q

Crypto diversity

A

This concept can be extended to the selection of algorithms and implementations
of cryptography. Adoption of methods such as blockchain-based IAM (ibm.com/
blogs/blockchain/2018/10/decentralized-identity-an-alternative-to-password-based-authentication)
or selecting ChaCha in place of AES as a preferred cipher suite
(blog.cloudflare.com/it-takes-two-to-chacha-poly) forces threat actors to develop
new attack methods.

66
Q

Deception and Disruption Strategies

A

Asymmetry of attack and defense

Active defense

Fake/decoy assets
•Honeypots, honeynets, and honeyfiles
•Breadcrumbs

Disruption strategies
•Bogus DNS records
•Decoy directories and resources
•Port spoofing to return fake telemetry/monitoring data
•DNS sinkholes

67
Q

Asymmetry of attack and defense

A

The practice of cybersecurity is often described as asymmetric warfare; the defenders
have to win every encounter and be ready all the time. The threat actors can choose
when to attack and only have to win once. Some cybersecurity tactics aim to reduce
that asymmetry by increasing the attack cost. This means that a threat actor has to
commit more resources to even plan an attack.

68
Q

Active defense

A

Active defense means an engagement with the adversary, but this can be interpreted
in several different ways. One type of active defense involves the deployment of decoy
assets to act as lures or bait. It is much easier to detect intrusions when an attacker
interacts with a decoy resource, because you can precisely control baseline traffic and
normal behavior in a way that is more difficult to do for production assets.

69
Q

Fake/decoy assets

A

Fake/decoy assets
•Honeypots, honeynets, and honeyfiles
•Breadcrumbs
•Disruption strategies

70
Q

Honeypot and honeynet

A

A honeypot is a computer system set up to attract threat actors, with the intention
of analyzing attack strategies and tools, to provide early warnings of attack attempts,
or possibly as a decoy to divert attention from actual computer systems. Another use
is to detect internal fraud, snooping, and malpractice. A honeynet is an entire decoy
network. This may be set up as an actual network or simulated using an emulator.

71
Q

Honeyfile

A

A honeypot or honeynet can be combined with the concept of a honeyfile, which is
convincingly useful, but actually fake, data. This honeyfile can be made trackable, so
that when a threat actor successfully exfiltrates it, the attempts to reuse or exploit it
can be traced.
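One way to make a honeyfile trackable is to embed a unique canary token in its contents, so that any later appearance of the token (in proxy, web server, or DNS logs, say) identifies the exfiltrated copy. The file name and wording below are illustrative assumptions, not a real tracking service:

```python
import os
import tempfile
import uuid

# Sketch: write a decoy document containing a unique, searchable marker.
# The template text is invented to look plausibly valuable.
def make_honeyfile(path, template="Q3 payroll export - ref {token}\n"):
    token = uuid.uuid4().hex  # unique marker to watch for in logs
    with open(path, "w") as f:
        f.write(template.format(token=token))
    return token

path = os.path.join(tempfile.mkdtemp(), "payroll_backup.txt")
token = make_honeyfile(path)
print(f"planted {path}; watch logs for {token}")
```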

72
Q

Disruption strategies

A
Disruption strategies
•Bogus DNS records
•Decoy directories and resources
•Port spoofing to return fake telemetry/monitoring data
•DNS sinkholes

73
Q

•Bogus DNS records

A

Using bogus DNS entries to list multiple hosts that do not exist.
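A sketch of how a batch of such bogus entries might be generated; the domain, host-naming pattern, and address range are invented for illustration:

```python
import random

# Generate A records for hosts that do not exist, padding the zone so
# that a threat actor enumerating DNS wastes time probing dead addresses.
def bogus_records(count, domain="corp.example.com", seed=42):
    rng = random.Random(seed)  # seeded so the decoy set is reproducible
    records = []
    for i in range(count):
        name = f"srv{i:03d}.{domain}"
        addr = f"10.99.{rng.randint(0, 255)}.{rng.randint(1, 254)}"
        records.append((name, "A", addr))
    return records

for name, rtype, addr in bogus_records(3):
    print(f"{name}. IN {rtype} {addr}")
```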

74
Q

•Decoy directories and resources

A

Configuring a web server with multiple decoy directories or dynamically generated
pages to slow down scanning.
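The dynamically generated pages can link to further (equally fake) subdirectories derived from the request path, so a crawler descends an endless tree. This is a minimal sketch; the path scheme and HTML are illustrative:

```python
import hashlib

# Build a fake directory listing whose links lead to more fake listings.
def decoy_page(path, links=5):
    page = [f"<html><body><h1>Index of {path}</h1><ul>"]
    for i in range(links):
        # Derive stable but meaningless child names from the parent path.
        child = hashlib.sha256(f"{path}/{i}".encode()).hexdigest()[:8]
        page.append(f'<li><a href="{path}/{child}/">{child}/</a></li>')
    page.append("</ul></body></html>")
    return "\n".join(page)

print(decoy_page("/backups"))
```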

75
Q

•Port spoofing to return fake telemetry/monitoring data

A

Using port triggering or spoofing to return fake telemetry data when a host detects
port scanning activity. This will result in multiple ports being falsely reported as
open and will slow down the scan. Telemetry can refer to any type of measurement
or data returned by remote scanning. Similar fake telemetry could be used to report
IP addresses as up when they are not, for instance.
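A minimal sketch of the falsely-open-ports idea: binding plain TCP listeners on unused ports makes a connect scan report them all as open, even though no real service is behind them. The port numbers are arbitrary; a real deployment would cover ranges of unused ports:

```python
import socket

# Bind bare listeners so a TCP connect scan sees the ports as open.
def fake_open_ports(ports, host="127.0.0.1"):
    sockets = []
    for port in ports:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((host, port))
        s.listen(5)
        sockets.append(s)
    return sockets  # caller closes these when done

listeners = fake_open_ports([50081, 50082, 50083])
# A connect scan now reports these ports as open:
probe = socket.create_connection(("127.0.0.1", 50082), timeout=1)
print("connected - port appears open")
probe.close()
for s in listeners:
    s.close()
```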

76
Q

•DNS sinkholes

A

Using a DNS sinkhole to route suspect traffic to a different network, such as a
honeynet, where it can be analyzed.
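The core of a sinkhole is the resolution decision: queries for suspect domains get the sinkhole/honeynet address instead of the real answer. The domains and addresses in this sketch are illustrative assumptions:

```python
# Decision function for a DNS sinkhole. Blocklisted names resolve to the
# sinkhole address; everything else is passed to the upstream resolver.
SINKHOLE_IP = "10.200.0.10"
BLOCKLIST = {"evil-c2.example", "malware-drop.example"}

def resolve(domain, upstream):
    """Return the sinkhole address for suspect domains, else ask upstream."""
    if domain.rstrip(".").lower() in BLOCKLIST:
        return SINKHOLE_IP
    return upstream(domain)

# Example with a stubbed upstream resolver:
print(resolve("evil-c2.example", upstream=lambda d: "203.0.113.5"))   # 10.200.0.10
print(resolve("intranet.example", upstream=lambda d: "203.0.113.5"))  # 203.0.113.5
```

Traffic arriving at the sinkhole address can then be logged and analyzed without reaching the threat actor's infrastructure.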