1.3 Availability vs consistency Flashcards
What are the three guarantees in the CAP theorem?
Consistency, Availability, Partition Tolerance
These guarantees define the limits of what can be achieved in distributed systems.
What does consistency mean in the context of the CAP theorem?
Every read receives the most recent write or an error.
Consistency ensures that all clients see the same data at the same time.
Define availability in the CAP theorem.
Every request receives a response, without guarantee that it contains the most recent version of the information.
Availability focuses on ensuring responses are provided even if they are not the latest data.
What is partition tolerance?
The system continues to operate despite arbitrary partitioning due to network failures.
This is crucial for systems that cannot afford to go offline during network issues.
What is CP in distributed systems?
Consistency and Partition Tolerance.
In CP systems, availability may be compromised to ensure data consistency.
What is AP in distributed systems?
Availability and Partition Tolerance.
In AP systems, consistency may be sacrificed to ensure that the system remains operational.
What is weak consistency?
After a write, reads may or may not see it. A best-effort approach is taken.
Common in systems like memcached and suitable for real-time applications.
Define eventual consistency.
After a write, reads will eventually see it, typically within milliseconds.
This approach is often used in systems like DNS and email.
What is strong consistency?
After a write, reads will see it. Data is replicated synchronously.
Strong consistency is ideal for systems requiring transactions.
What is active-passive failover?
Heartbeats are sent between the active and passive server. If interrupted, the passive server takes over.
The length of downtime depends on the state of the passive server.
What is active-active failover?
Both servers manage traffic, spreading the load between them.
Requires DNS or application logic to know about both servers.
What are the disadvantages of failover?
Failover adds more hardware and complexity, and data can be lost if the active system fails before newly written data has been replicated to the passive one.
These factors can complicate system design.
How is availability quantified?
Availability is measured in uptime as a percentage, often described in terms of ‘nines’.
For example, 99.99% availability is referred to as four nines.
What is the acceptable downtime for 99.9% availability?
8h 45min 57s per year.
This level of availability is often referred to as three nines.
What happens to overall availability when components are in sequence?
Overall availability decreases.
The formula is Availability (Total) = Availability (Foo) * Availability (Bar).
What happens to overall availability when components are in parallel?
Overall availability increases.
The formula is Availability (Total) = 1 - (1 - Availability (Foo)) * (1 - Availability (Bar)).
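The two formulas (sequence and parallel) can be checked with a small sketch. The function names and the values chosen for Foo and Bar are hypothetical, and both formulas assume the components fail independently of each other.

```python
from math import prod


def availability_in_sequence(*components: float) -> float:
    """All components must be up, so their availabilities multiply."""
    return prod(components)


def availability_in_parallel(*components: float) -> float:
    """The system is down only when every component is down at once."""
    return 1 - prod(1 - a for a in components)


# Hypothetical availabilities for the two components.
foo, bar = 0.999, 0.99

print(f"in sequence: {availability_in_sequence(foo, bar):.6f}")
print(f"in parallel: {availability_in_parallel(foo, bar):.6f}")
```

In sequence the total is lower than the weakest component; in parallel it is higher than the strongest one.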
What does the CAP theorem state about distributed systems?
You can only achieve two of the three guarantees: Consistency, Availability, and Partition Tolerance.
One must be sacrificed when designing distributed systems.
What is the main flaw in the service model of ‘Remembrance Inc’?
The system is not consistent; updates may not be synchronized between the two operators.
This leads to potential misinformation for customers.
What solution was proposed to fix the consistency problem in ‘Remembrance Inc’?
Both operators must inform each other of updates before completing calls.
This ensures that both have the latest information.
What issue arises when one operator of ‘Remembrance Inc’ does not report to work?
Availability problem; updates cannot be completed if one person is absent.
This can lead to customer dissatisfaction.
What is the proposed solution for maintaining both consistency and availability?
If one operator is unavailable, the other sends an email to update them.
This allows for updates while maintaining service availability.
What is the final flaw identified in the system of ‘Remembrance Inc’?
The system is not partition tolerant; if communication fails between operators, they cannot update each other.
This compromises availability during conflicts.
What is eventual consistency with a run-around clerk?
A clerk updates notebooks in the background, allowing updates to proceed without blocking.
This may lead to temporary inconsistencies but improves system performance.
What is the significance of the CAP theorem in distributed system design?
It helps in understanding trade-offs between consistency, availability, and partition tolerance.
This is crucial for making informed decisions in system architecture.
What is Partition Tolerance in distributed systems?
The system will continue to function when network partitions occur.
True or False: Object Oriented Programming is the same as Network Programming.
False
What is a common fallacy of distributed computing regarding networks?
Networks are reliable.
What must be tolerated in a distributed system due to network unreliability?
Partitions
According to the CAP theorem, what two options are available when a partition occurs?
Consistency or Availability: when a partition occurs, you must choose one of the two.
What does CP stand for in the context of the CAP theorem?
Consistency/Partition Tolerance
What happens in a CP system when a partition occurs?
Wait for a response from the partitioned node, which could result in a timeout error.
What is the main choice in a CP system regarding business requirements?
Choose Consistency over Availability when atomic reads and writes are needed.
What does AP stand for in the context of the CAP theorem?
Availability/Partition Tolerance
What does an AP system return when a partition occurs?
The most recent version of the data available on the node that answers, which could be stale.
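The difference between the CP and AP responses during a partition can be sketched with a toy replica. This is an illustration only: the `Replica` class, its `mode` flag, and the `Unavailable` exception are hypothetical names, not part of any real database API.

```python
class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it holds the latest write."""


class Replica:
    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = None
        self.partitioned = False  # True when cut off from its peers

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: return an error rather than risk serving stale data.
            raise Unavailable("cannot reach peers; refusing to answer")
        # AP (or no partition): answer from the local copy, stale or not.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
for replica in (cp, ap):
    replica.value = "v1"
    replica.partitioned = True  # a newer write elsewhere can no longer reach it

print("AP replica:", ap.read())
try:
    cp.read()
except Unavailable as exc:
    print("CP replica:", exc)
```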
When should you choose Availability over Consistency?
When business requirements allow flexibility around data synchronization.
What is a compelling reason to choose Availability?
When the system needs to continue to function despite external errors.
What type of decision is the choice between Consistency and Availability?
A software trade-off.
What must be acknowledged about network outages?
They are a fact of life and occur unexpectedly.
What are the advantages of building distributed systems?
There are many advantages, but distribution also adds complexity.
What is vital for the success of your application in distributed systems?
Understanding the trade-offs available in the face of network errors.
What could happen if the trade-offs between Consistency and Availability are not handled correctly?
Your application could be doomed to fail before it is ever deployed.
What is consistency in distributed systems and what does the CAP theorem state about it?
Consistency in distributed systems refers to how data synchronization is handled when there are multiple copies of the same data:
CAP Theorem Definition:
Consistency means every read operation receives either the most recent write or an error
It’s one of the three key components of the CAP theorem (Consistency, Availability, Partition Tolerance)
Systems must balance these properties as you can only guarantee two out of three
Implementation Considerations:
Multiple data copies require synchronization strategies
Different consistency models offer different trade-offs
Choice of consistency model impacts system behavior and user experience
Must consider network latency and partition scenarios
Business requirements often dictate consistency needs
What is weak consistency and when should it be used?
Weak consistency is the most relaxed consistency model where:
Core Characteristics:
Reads may or may not reflect the most recent write
No guarantees about when data will be consistent
Best-effort approach to data synchronization
Lowest latency of the three consistency models
Lowest consistency guarantees
Use Cases:
Real-time communication systems (VoIP)
Video chat applications
Multiplayer games
Systems where temporary data inconsistency is acceptable
Applications where speed is more important than accuracy
Example Scenario:
Phone call with lost reception
When connection resumes, missed audio is not replayed
System prioritizes real-time communication over data consistency
Implementation Example:
Memcached uses this model
Provides high performance
Sacrifices data consistency for speed
No guarantee of data synchronization timing
What is eventual consistency and how does it work?
Eventual consistency provides a middle-ground approach where:
Core Characteristics:
Data will become consistent over time
Reads will eventually reflect all completed writes
Typically achieves consistency within milliseconds
Data replication happens asynchronously
Better performance than strong consistency
Implementation Details:
Updates propagate gradually through the system
No immediate synchronization requirement
Systems can continue operating during network partitions
May serve stale data temporarily
Conflicts resolved through various mechanisms (vector clocks, etc.)
Common Applications:
DNS systems
Email systems
Distributed databases
Social media platforms
Content delivery networks
Advantages:
Higher availability
Better scalability
Lower latency
Continues functioning during network partitions
Good for systems that don’t require immediate consistency
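A minimal sketch of asynchronous replication, assuming one primary and one replica held in memory. The class and method names are hypothetical, and `replicate()` stands in for whatever background process a real system would run.

```python
from collections import deque


class EventuallyConsistentStore:
    """The primary acknowledges writes at once; the replica catches up later."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))  # queued, not applied immediately

    def read_from_replica(self, key):
        return self.replica.get(key)  # may be stale

    def replicate(self):
        """Stand-in for the background replication process."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value


store = EventuallyConsistentStore()
store.write("user:1", "alice")
before = store.read_from_replica("user:1")  # replica has not caught up yet
store.replicate()
after = store.read_from_replica("user:1")   # replica now reflects the write
print(before, after)
```

The write returns without waiting for the replica, which is where the lower latency and the temporary staleness both come from.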
What is strong consistency and what are its implications?
Strong consistency is the most rigid consistency model that:
Core Characteristics:
All reads reflect the most recent write
Data is replicated synchronously
Provides immediate consistency across all nodes
Highest consistency guarantees
Highest performance cost of the three models
Implementation Requirements:
Synchronous replication
Coordination between all nodes
Consensus protocols
Transaction management
Conflict prevention mechanisms
Common Applications:
File systems
Relational databases (RDBMS)
Banking systems
Financial transactions
Systems requiring ACID properties
Trade-offs:
Highest consistency guarantees
Lower availability during partitions
Higher latency for operations
More complex implementation
Resource intensive
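Synchronous replication can be sketched as a write that succeeds only when every replica can apply it. This is a deliberately simplified, single-threaded toy (no consensus protocol, no rollback logic); `Node`, `SynchronousStore`, and `ReplicaDown` are hypothetical names.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot be reached."""


class Node:
    def __init__(self):
        self.data = {}
        self.reachable = True

    def apply(self, key, value):
        if not self.reachable:
            raise ReplicaDown("replica unreachable")
        self.data[key] = value


class SynchronousStore:
    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Reject the write up front unless every replica can take part,
        # so that no replica ends up ahead of the others.
        if not all(r.reachable for r in self.replicas):
            raise ReplicaDown("write rejected: not all replicas reachable")
        for r in self.replicas:
            r.apply(key, value)

    def read(self, key, replica_index=0):
        return self.replicas[replica_index].data.get(key)


a, b = Node(), Node()
store = SynchronousStore([a, b])
store.write("balance", 100)
print(store.read("balance", 0), store.read("balance", 1))

b.reachable = False  # simulate a partition
try:
    store.write("balance", 50)
except ReplicaDown as exc:
    print("rejected:", exc)
print(store.read("balance", 0))  # still the last fully replicated value
```

The rejected write shows the trade-off listed above: during a partition the store stays consistent but stops accepting updates.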
What are the key availability patterns in distributed systems?
Availability patterns focus on ensuring system uptime through:
Primary Approaches:
Fail-over:
Systems switch to backup when primary fails
Requires redundant hardware
Can be active-passive or active-active
Requires heartbeat monitoring
Needs failover automation
Replication:
Data copied across multiple nodes
Can be synchronous or asynchronous
Supports different consistency models
Provides redundancy
Enables load distribution
Implementation Considerations:
Hardware requirements
Network configuration
Data synchronization
Monitoring systems
Recovery procedures
What is active-passive failover and how does it work?
Active-passive failover (also known as master-slave failover) is a high availability pattern that:
Core Components:
Active server handling all traffic
Passive server on standby
Heartbeat mechanism between servers
IP address takeover capability
Monitoring system
Operational Details:
Normal Operation:
Active server handles all requests
Passive server maintains standby state
Regular heartbeat checks between servers
Continuous data synchronization
System monitoring active
Failover Process:
Heartbeat interruption detected
Passive server activates
IP address migration occurs
Services resume on passive server
System alerts generated
Standby Modes:
Hot Standby:
Passive server running and ready
Minimal startup time
Higher resource usage
Faster failover
More expensive
Cold Standby:
Passive server inactive
Longer startup time
Lower resource usage
Slower failover
More cost-effective
Disadvantages:
Additional hardware costs
Complex configuration
Potential data loss during failover
Resource underutilization
Higher maintenance overhead
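The heartbeat check at the centre of the failover process can be sketched as follows. The `PassiveServer` class and its timeout value are hypothetical, and timestamps are passed in explicitly so the example is deterministic; a real standby would also take over the IP address and start its services when it promotes itself.

```python
class PassiveServer:
    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_heartbeat = None  # time of the most recent heartbeat
        self.active = False

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the active server."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Promote this server if the heartbeat has been silent too long."""
        if (
            not self.active
            and self.last_heartbeat is not None
            and now - self.last_heartbeat > self.timeout_s
        ):
            self.active = True  # IP takeover and service start would go here
        return self.active


standby = PassiveServer(timeout_s=3.0)
standby.on_heartbeat(now=0.0)
standby.on_heartbeat(now=1.0)
print(standby.check(now=2.0))  # last heartbeat was 1.0 s ago
print(standby.check(now=5.5))  # last heartbeat was 4.5 s ago
```

The timeout is a design choice: a short one fails over quickly but risks promoting the standby during a brief network hiccup.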
How does active-active failover work and what are its characteristics?
Active-active failover (also known as master-master failover) is a more complex availability pattern where:
Core Characteristics:
Multiple active servers
Load distribution across servers
Simultaneous traffic handling
Synchronized data states
No standby resources
Implementation Requirements:
Public-Facing Systems:
DNS configuration for multiple IPs
Load balancer configuration
Health checking mechanisms
Traffic distribution rules
Failover procedures
Internal Systems:
Application awareness of multiple servers
Connection management
Load distribution logic
State synchronization
Conflict resolution
Advantages:
Better resource utilization
Higher throughput capacity
Natural load balancing
Improved fault tolerance
Easier maintenance procedures
Challenges:
Complex data synchronization
Potential consistency issues
More sophisticated monitoring needed
Higher implementation complexity
Increased operational overhead
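For internal systems, the "application awareness of multiple servers" can be sketched as a round-robin picker that skips servers marked unhealthy. `ActiveActivePool` and the server names are hypothetical, and health is set by hand here rather than by a real health check.

```python
from itertools import cycle


class ActiveActivePool:
    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server in round-robin order."""
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers")


pool = ActiveActivePool(["app-1", "app-2"])
first = [pool.pick() for _ in range(4)]          # load is spread over both
pool.mark_down("app-1")                          # simulate a failure
after_failure = [pool.pick() for _ in range(3)]  # traffic shifts to app-2
print(first, after_failure)
```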
What are the key considerations for availability percentages and their implications?
Availability measurements and calculations involve:
Availability Metrics:
Three Nines (99.9%):
8h 45min 57s downtime per year
43m 49.7s downtime per month
10m 4.8s downtime per week
1m 26.4s downtime per day
Suitable for non-critical systems
Four Nines (99.99%):
52min 35.7s downtime per year
4m 23s downtime per month
1m 0.5s downtime per week
8.6s downtime per day
Required for critical systems
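The downtime budgets follow directly from the availability percentage. The sketch below assumes an average year of 365.2425 days and a month of one twelfth of that, which reproduces the yearly and monthly figures quoted in this card; the function and constant names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Period lengths in seconds (average Gregorian year assumed).
PERIODS = {
    "year": 365.2425 * SECONDS_PER_DAY,
    "month": 365.2425 / 12 * SECONDS_PER_DAY,
    "week": 7 * SECONDS_PER_DAY,
    "day": SECONDS_PER_DAY,
}


def allowed_downtime_seconds(availability: float, period: str) -> float:
    """Downtime budget for a period, given availability as a fraction."""
    return (1 - availability) * PERIODS[period]


for label, availability in (("three nines", 0.999), ("four nines", 0.9999)):
    for period in PERIODS:
        seconds = allowed_downtime_seconds(availability, period)
        print(f"{label:>11} per {period:<5}: {seconds:10.1f} s")
```

Each additional nine divides the downtime budget by ten.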
Calculation Patterns:
Sequential Systems:
Availability decreases with each component
Formula: A(total) = A(component1) * A(component2)
Example: Two 99.9% components ≈ 99.8% total (0.999 × 0.999 = 0.998001)
More components reduce overall availability
Requires higher component reliability
Parallel Systems:
Availability increases with redundancy
Formula: A(total) = 1 - (1-A(comp1)) * (1-A(comp2))
Example: Two 99.9% components = 99.9999% total
Better fault tolerance
Higher overall availability
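Worked out for two components at 99.9% each, and assuming they fail independently, the sequential and parallel examples are:

```latex
\begin{align*}
A_{\text{sequential}} &= 0.999 \times 0.999 = 0.998001 \approx 99.8\% \\
A_{\text{parallel}}   &= 1 - (1 - 0.999)(1 - 0.999) = 1 - 0.000001 = 0.999999 = 99.9999\%
\end{align*}
```

The same pair of components therefore gives roughly two nines and a bit in sequence but six nines in parallel.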
Implementation Considerations:
Cost vs availability requirements
System architecture decisions
Component reliability needs
Monitoring and alerting thresholds
Maintenance windows impact