Topic 7B Flashcards
continuity of operations planning (COOP)
involves developing processes and procedures to ensure critical business functions can continue during and after a disruption.
Key elements of a COOP plan include identifying critical business functions, establishing priorities, and determining the resources needed to support these functions.
Organizations may consider remote work options for their employees.
High availability (HA) clustering
ensuring systems remain operational and accessible with minimal downtime.
It involves designing and implementing hardware components, servers, networking, datacenters, and physical locations for fault tolerance and redundancy.
For a critical system, availability is described using the “nines” term, such as two-nines (99%) up to five-nines (99.999%) or six-nines (99.9999%).
High availability also means that a system can cope with rapid growth in demand. It should be able to scale by adding resources or by increasing the power of existing resources.
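The "nines" figures are easier to remember as downtime budgets. The following is a minimal sketch of that arithmetic, assuming a 365-day year; the function name `max_downtime_minutes` is illustrative and not part of the source material.

```python
# Convert an availability percentage into the maximum downtime it allows.
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year (525,600 minutes)


def max_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum downtime per year, in minutes."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


levels = [
    ("two-nines", 99.0),
    ("three-nines", 99.9),
    ("four-nines", 99.99),
    ("five-nines", 99.999),
    ("six-nines", 99.9999),
]

for label, pct in levels:
    print(f"{label:12s} {pct:>8}%  {max_downtime_minutes(pct):10.2f} min/year")
```

Under this assumption, two-nines allows roughly 5,256 minutes (about 3.65 days) of downtime per year, while five-nines allows a little over 5 minutes.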
business continuity (BC)
takes a broader approach, considering not only the continuity of critical functions but also the overall resilience and recovery of the entire organization.
Business continuity planning includes the assessment of risks, the development of strategies to mitigate those risks, and the creation of plans to maintain or restore business operations in the face of various threats.
This may involve addressing supply chain management, employee safety and communication, legal and regulatory compliance, and reputation management.
Capacity planning
a critical process in which organizations assess their current and future resource requirements to ensure they can efficiently meet their business objectives.
This process involves evaluating and forecasting the necessary resources in terms of people, technology, and infrastructure to support anticipated growth, changes in demand, or other factors that may impact operations.
may involve evaluating workforce productivity, analyzing staffing levels, and identifying potential skills gaps
Things that put CAPACITY PLANNING at risk
Lack of cross-training or succession planning can create dependency on specific individuals, increasing vulnerability to disruptions.
Cross-Training—Requires employees to develop skills and knowledge outside their primary roles to mitigate the risk of relying heavily on specific individuals or teams.
Remote Work Plans—Outline strategies for employees to work effectively outside the traditional office environment. Remote work plans define communication channels, technology requirements, and expectations for remote work arrangements.
Alternative Reporting Structures—Describe backup or temporary reporting relationships to reduce the risk associated with single points of failure in management or decision-making.
fault tolerant
A system that can experience failures and continue to provide the same (or nearly the same) level of service
often achieved by provisioning redundancy for critical components and single points of failure.
Site resiliency is described as hot, warm, or cold:
A hot site can fail over almost immediately. It generally means the site is within the organization’s ownership and ready to deploy. For example, a hot site could consist of a building with operational computer equipment kept updated with a live data set.
A warm site could be similar, but with the requirement that the latest data set needs to be loaded.
A cold site takes longer to set up. A cold site may be an empty building with a lease agreement in place to install whatever equipment is required when necessary.
Geographic dispersion
refers to the distribution of recovery sites across different geographic locations for disaster recovery (DR) purposes.
Cloud as Disaster Recovery (DR)
Cost efficiency plays a significant role, as cloud providers offer more affordable redundancy and backup options due to their economies of scale.
Simplified management is another critical factor, with cloud providers offering tools and services that reduce the complexity of managing redundant infrastructure.
Load testing
incorporates specialized software tools to validate a system’s performance under expected or peak loads and identify bottlenecks or scalability issues.
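As a rough illustration of what a load testing tool measures, the sketch below sends concurrent requests and reports latency figures. It is not a real load testing product: `handle_request` is a hypothetical stand-in that only sleeps, and the request counts are arbitrary assumptions.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(_: int) -> float:
    """Hypothetical stand-in for one call to the system under test.

    Returns the observed latency in seconds.
    """
    start = time.perf_counter()
    time.sleep(0.001)  # simulated work; replace with a real request
    return time.perf_counter() - start


def run_load_test(total_requests: int, concurrency: int) -> dict:
    """Issue requests from a pool of worker threads and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(handle_request, range(total_requests)))
    p95_index = max(0, int(len(latencies) * 0.95) - 1)
    return {
        "requests": len(latencies),
        "mean_s": statistics.mean(latencies),
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }


report = run_load_test(total_requests=200, concurrency=20)
print(report)
```

Raising `concurrency` toward the expected peak and watching how the 95th-percentile latency changes is one simple way to spot a bottleneck.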
Clustering
A load balancing technique where a group of servers is configured as a unit and works together to provide network services.
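One way to picture a cluster acting as a single unit is a dispatcher that hands requests to healthy nodes in turn. The sketch below assumes a simple round-robin policy; the `Cluster` class and node names are hypothetical, and real clusters may use other scheduling methods.

```python
from itertools import cycle


class Cluster:
    """Toy model of a server cluster with round-robin dispatch."""

    def __init__(self, nodes):
        self.healthy = {node: True for node in nodes}
        self._ring = cycle(nodes)

    def mark(self, node, healthy):
        """Record the result of a health check for one node."""
        self.healthy[node] = healthy

    def next_node(self):
        """Return the next healthy node, skipping any that are down."""
        for _ in range(len(self.healthy)):
            node = next(self._ring)
            if self.healthy[node]:
                return node
        raise RuntimeError("no healthy nodes available")


cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.mark("node-b", False)  # simulate a failed node
assignments = [cluster.next_node() for _ in range(4)]
print(assignments)
```

With `node-b` marked as failed, the requests alternate between `node-a` and `node-c`, so the service stays available.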
failover
A technique that ensures a redundant component, device, or application can quickly and efficiently take over the functionality of an asset that has failed.
Common Address Redundancy Protocol (CARP)
A redundancy protocol that lets the nodes in a cluster share a virtual IP address, enabling the active node to “own” the virtual IP and respond to connections. The redundancy protocol also implements a heartbeat mechanism to allow failover to the passive node if the active one should suffer a fault.
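The heartbeat idea can be sketched as a small simulation. This is a simplification for study purposes and not the actual CARP wire protocol: the class names, the missed-heartbeat limit of 3, and the example address (from the 192.0.2.0/24 documentation range) are all assumptions.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True


class VirtualIP:
    """Tracks which node currently owns the shared address."""

    def __init__(self, address, active, passive, missed_limit=3):
        self.address = address
        self.active = active
        self.passive = passive
        self.missed_limit = missed_limit  # assumed threshold
        self.missed = 0

    def heartbeat_tick(self):
        """Run once per heartbeat interval from the passive node's view."""
        if self.active.alive:
            self.missed = 0  # heartbeat received, nothing to do
            return
        self.missed += 1
        if self.missed >= self.missed_limit:
            # Failover: the passive node takes ownership of the virtual IP.
            self.active, self.passive = self.passive, self.active
            self.missed = 0


vip = VirtualIP("192.0.2.10", Node("fw-1"), Node("fw-2"))
vip.heartbeat_tick()
owner_before = vip.active.name

vip.active.alive = False  # simulate a fault on the active node
for _ in range(3):
    vip.heartbeat_tick()
owner_after = vip.active.name

print(owner_before, "->", owner_after)
```

Clients keep connecting to the same virtual IP throughout; only the node answering for it changes.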
power distribution unit (PDU)
An advanced strip socket that provides filtered output voltage. A managed unit supports remote administration.
PDUs also provide protection against spikes, surges, and under-voltage events, and integrate with uninterruptible power supplies (UPSs).
Managed PDUs support remote power monitoring functions, such as reporting load and status, switching power to a socket on and off, or switching sockets on in a particular sequence.
uninterruptible power supply (UPS)
In its simplest form, a UPS comprises a bank of batteries and their charging circuit plus an inverter to generate AC voltage from the DC voltage supplied by the batteries.
The UPS allows sufficient time to fail over to an alternative power source, such as a standby generator. If there is no secondary power source, a UPS will allow the administrator to at least shut down the server or appliance properly.