3.4 Explain the importance of resilience and recovery in security architecture Flashcards
High Availability
Ensures continuous service with minimal downtime using load balancing, clustering, redundancy, and multi-cloud strategies.
Key Differences:
- Load Balancing: Distributes tasks across servers for optimal performance.
- Clustering: Active teamwork. Systems share tasks and support each other. Provides failover.
- Redundancy: Duplicate components held in reserve, often passive until a failure occurs. Provides fault tolerance.
Load Balancing is about distributing requests across multiple servers for optimized performance.
Clustering is about connecting servers to act as a single entity, ensuring redundancy and fault tolerance.
Use Clustering when you need high availability, shared resources, or when servers must work together on complex tasks.
Use Load Balancing when managing traffic for stateless, independent servers, especially in applications where adding or removing servers as traffic changes is a priority.
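The load-balancing behavior described above can be sketched in a few lines. This is a minimal illustration, not a production design: the class name `RoundRobinBalancer`, its methods, and the server names are all hypothetical, and round-robin is only one of several possible distribution strategies.

```python
class RoundRobinBalancer:
    """Distributes requests across independent, stateless servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)  # servers currently able to take traffic
        self._index = 0                   # position in the rotation

    def mark_down(self, server):
        """Remove a failed server from the rotation."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Return a recovered server to the rotation."""
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        """Return the next healthy server, skipping any that are down."""
        # Try each server at most once, starting from the rotation position.
        for _ in range(len(self.servers)):
            server = self.servers[self._index]
            self._index = (self._index + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


# Hypothetical server names, for illustration only.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
first_pass = [balancer.next_server() for _ in range(4)]

balancer.mark_down("web-2")  # simulate a failure
second_pass = [balancer.next_server() for _ in range(3)]

print(first_pass)
print(second_pass)
```

Because the servers are stateless and independent, taking one out of rotation only changes where the next request goes. A cluster, by contrast, would also need to hand over shared state or resources during failover, which this sketch does not model.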
Capacity Planning
Ensures that people, technology, infrastructure, and processes can meet future demand efficiently.
- People: Forecast staffing needs and ensure the right skills.
  - Example: Hire seasonal staff.
- Technology: Plan for scalability and future use.
  - Example: Handle traffic spikes on platforms.
- Infrastructure: Optimize physical spaces and utilities.
  - Example: Allocate data center space.
- Processes: Improve workflows and efficiency.
  - Example: Automate onboarding.
Powering Data Centers
- Line Conditioners: Stabilize voltage and filter fluctuations; not suitable for major power losses.
- Uninterruptible Power Supplies (UPS): Provide short-term battery power (typically about 15-60 minutes) plus line conditioning, bridging the gap until generators start or systems shut down cleanly.
- Generators: Provide longer-term backup power. Types include portable gas-engine, permanently installed, and battery-inverter units.
- Power Distribution Centers (PDC): Receive power from the source and distribute it with circuit protection, monitoring, and load balancing; integrate with UPS and generators.
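To make the UPS-to-generator handoff concrete, here is a rough runtime estimate. It is a simplified sketch: it assumes runtime is simply usable stored energy divided by load, and the 0.9 efficiency factor, the battery size, and the load are illustrative assumptions. Real battery runtime is nonlinear, so vendor runtime charts should take precedence.

```python
def ups_runtime_minutes(battery_wh, load_w, efficiency=0.9):
    """Rough runtime estimate: usable stored energy divided by the load.

    battery_wh: battery capacity in watt-hours (assumed value)
    load_w:     connected load in watts (assumed value)
    efficiency: assumed inverter efficiency, not a vendor figure
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    return battery_wh * efficiency / load_w * 60


def bridges_generator_start(battery_wh, load_w, generator_start_minutes):
    """True if the UPS can carry the load until the generator is online."""
    return ups_runtime_minutes(battery_wh, load_w) >= generator_start_minutes


# Illustrative numbers only: a 1000 Wh battery feeding an 1800 W load.
runtime = ups_runtime_minutes(battery_wh=1000, load_w=1800)
print(f"Estimated runtime: {runtime:.0f} minutes")
print(bridges_generator_start(1000, 1800, generator_start_minutes=5))
```

The point of the check is that a UPS is sized to cover the gap before a generator takes over, not to run the data center through a long outage.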
Data Backups
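- Onsite vs. Offsite Backups: Onsite copies restore quickly; offsite copies survive a site-wide disaster.
- Full Backup: Backs up all selected data every time.
- Differential Backup: Backs up all data changed since the last full backup.
- Incremental Backup: Backs up data changed since the most recent backup, whether full or incremental.
- Snapshots: Capture point-in-time data, storing only changes to save space. Not a substitute for a real backup, since a snapshot typically depends on the original storage.
- Replication: Copies data to another location in real time.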
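The practical difference between differential and incremental backups shows up at restore time. The sketch below works out which backup sets are needed to restore the latest state. The function name `restore_chain` and the day labels are hypothetical, and it assumes a schedule that uses either differentials or incrementals after a full backup; mixed schedules behave differently across products and are not modeled.

```python
def restore_chain(backups):
    """Return the backup labels needed to restore the latest state.

    backups: list of (label, kind) tuples in chronological order, where
    kind is "full", "differential", or "incremental".
    """
    full_positions = [i for i, (_, kind) in enumerate(backups) if kind == "full"]
    if not full_positions:
        raise ValueError("a restore needs at least one full backup")

    last_full = full_positions[-1]
    chain = [backups[last_full][0]]  # every restore starts from the last full

    for label, kind in backups[last_full + 1:]:
        if kind == "differential":
            # A differential holds everything changed since the last full,
            # so only the newest one is needed.
            chain = [chain[0], label]
        elif kind == "incremental":
            # An incremental holds only changes since the previous backup,
            # so every one in the sequence is needed, in order.
            chain.append(label)
    return chain


differential_week = [("sun", "full"), ("mon", "differential"),
                     ("tue", "differential"), ("wed", "differential")]
incremental_week = [("sun", "full"), ("mon", "incremental"),
                    ("tue", "incremental"), ("wed", "incremental")]

print(restore_chain(differential_week))  # full plus the latest differential
print(restore_chain(incremental_week))   # full plus every incremental
```

The trade-off: incrementals are smaller and faster to take, but a restore needs the whole chain, while differentials grow over the week but restore from just two sets.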
Continuity of Operations Plan (COOP)
- Business Continuity Plan (BCP): Addresses incidents in general. Involves preventative actions and recovery steps to keep the business operating.
- Disaster Recovery Plan (DRP): Addresses disasters specifically. A subset of the BCP that focuses on quick recovery from disasters (e.g., floods, fires).
- Business Continuity Committee: Representatives from various departments (IT, Legal, Security, Communications, etc.).
Redundant Site Considerations
Types of Continuity Locations:
- Hot Site: Continuously running; instant switchover.
- Warm Site: Partially equipped; ready in days.
- Cold Site: Minimal setup; ready in 1-2 months.
- Mobile Site: Portable units (hot, warm, or cold); flexible deployment.
- Virtual Site: Cloud-hosted; can be hot, warm, or cold.
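Choosing among hot, warm, and cold sites is mostly a trade-off between cost and how much downtime the business can tolerate. The sketch below picks the cheapest site type that is ready in time. "Instant", "days", and "1-2 months" come from the list above; the specific figures of 7 and 60 days, the cost ordering, and the function name `cheapest_site_for` are illustrative assumptions.

```python
# Readiness estimates in days, ordered from cheapest to most expensive.
# The values 7 and 60 are assumptions chosen to match "days" and "1-2 months".
SITE_READY_DAYS = [("cold", 60), ("warm", 7), ("hot", 0)]


def cheapest_site_for(max_downtime_days):
    """Return the cheapest site type that is ready within the allowed downtime."""
    for site, ready_days in SITE_READY_DAYS:
        if ready_days <= max_downtime_days:
            return site
    raise ValueError("max_downtime_days must be zero or greater")


print(cheapest_site_for(0))   # no downtime tolerated
print(cheapest_site_for(10))  # about a week and a half is acceptable
print(cheapest_site_for(90))  # months of downtime are acceptable
```

In practice the allowed downtime would come from the organization's recovery objectives, and mobile or virtual sites would be evaluated the same way according to whether they are run hot, warm, or cold.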
Resilience and Recovery Testing
General
- Resilience Testing: Uses exercises to check whether systems keep operating during disruptions.
- Recovery Testing: Checks whether systems can restore operations after a disruption.
Types
- Tabletop Exercises: Scenario-based discussions among stakeholders to assess preparedness, identify gaps, and promote teamwork.
- Failover Tests: Controlled transition from primary to backup components.
- Simulations: Virtual scenarios to test real-time responses, assess performance, and provide feedback for improvement. May involve red and blue teams.
- Parallel Processing: Runs primary and backup systems simultaneously to test stability and ensure no disruptions during failures.