1.3 Availability vs consistency Flashcards

1
Q

What are the three guarantees in the CAP theorem?

A

Consistency, Availability, Partition Tolerance

These guarantees define the limits of what can be achieved in distributed systems.

2
Q

What does consistency mean in the context of the CAP theorem?

A

Every read receives the most recent write or an error.

Consistency ensures that all clients see the same data at the same time.

3
Q

Define availability in the CAP theorem.

A

Every request receives a response, without guarantee that it contains the most recent version of the information.

Availability focuses on ensuring responses are provided even if they are not the latest data.

4
Q

What is partition tolerance?

A

The system continues to operate despite arbitrary partitioning due to network failures.

This is crucial for systems that cannot afford to go offline during network issues.

5
Q

What is CP in distributed systems?

A

Consistency and Partition Tolerance.

In CP systems, availability may be compromised to ensure data consistency.

6
Q

What is AP in distributed systems?

A

Availability and Partition Tolerance.

In AP systems, consistency may be sacrificed to ensure that the system remains operational.

7
Q

What is weak consistency?

A

After a write, reads may or may not see it. The system takes a best-effort approach.

Common in systems like memcached and suitable for real-time applications.

8
Q

Define eventual consistency.

A

After a write, reads will eventually see it, typically within milliseconds.

This approach is often used in systems like DNS and email.

9
Q

What is strong consistency?

A

After a write, reads will see it. Data is replicated synchronously.

Strong consistency is ideal for systems requiring transactions.
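
As a rough illustration of synchronous replication, the sketch below acknowledges a write only after every replica has applied it, and rejects the write if any replica is unreachable. The `Replica` class and `synchronous_write` function are hypothetical names invented for this example, and the code is single-threaded, so it ignores the locking and consensus a real system would need.

```python
class Replica:
    """Hypothetical in-memory replica used only for illustration."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.reachable = True


def synchronous_write(replicas, key, value) -> None:
    """Apply the write to every replica before returning.

    If any replica is unreachable, nothing is written and an error is raised,
    so readers never observe replicas that disagree.
    """
    unreachable = [r.name for r in replicas if not r.reachable]
    if unreachable:
        raise ConnectionError(f"write rejected, unreachable replicas: {unreachable}")
    for replica in replicas:
        replica.data[key] = value


replicas = [Replica("r1"), Replica("r2")]
synchronous_write(replicas, "balance", 100)
print([r.data["balance"] for r in replicas])  # every replica holds the same value
```

The cost of this guarantee is visible in the error path: one unreachable replica blocks all writes, which is the availability trade-off described for CP systems.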

10
Q

What is active-passive failover?

A

Heartbeats are sent between the active and passive server. If interrupted, the passive server takes over.

The length of downtime depends on the state of the passive server.
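
The heartbeat check can be sketched as follows. This is a minimal illustration, not a production failover mechanism: the `PassiveServer` class, the timeout value, and the injectable `clock` parameter are all assumptions made for the example, and the clock is faked so the behaviour is easy to follow.

```python
import time


class PassiveServer:
    """Standby that promotes itself when heartbeats from the active server stop."""

    def __init__(self, timeout_seconds: float, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_heartbeat = clock()
        self.is_active = False

    def receive_heartbeat(self) -> None:
        """Record the time of the latest heartbeat from the active server."""
        self.last_heartbeat = self.clock()

    def check(self) -> bool:
        """Take over if no heartbeat arrived within the timeout; return active state."""
        silent_for = self.clock() - self.last_heartbeat
        if not self.is_active and silent_for > self.timeout_seconds:
            self.is_active = True
        return self.is_active


# Fake clock so the example does not need to sleep.
now = [0.0]
standby = PassiveServer(timeout_seconds=3.0, clock=lambda: now[0])

now[0] = 2.0
standby.receive_heartbeat()
now[0] = 4.0
print(standby.check())  # False: only 2 seconds since the last heartbeat
now[0] = 6.0
print(standby.check())  # True: 4 seconds of silence exceeds the 3 second timeout
```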

11
Q

What is active-active failover?

A

Both servers manage traffic, spreading the load between them.

Requires DNS or application logic to know about both servers.
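
One way the application logic might spread load across both servers is a round-robin picker that skips servers marked unhealthy. The sketch below assumes unique server names and a health flag set by some external check; `ActiveActivePool` and the server names are hypothetical.

```python
from itertools import cycle


class ActiveActivePool:
    """Round-robin over servers, skipping any that are marked down."""

    def __init__(self, servers):
        # Assumes server names are unique.
        self.healthy = {server: True for server in servers}
        self._rotation = cycle(list(servers))

    def mark_down(self, server: str) -> None:
        self.healthy[server] = False

    def mark_up(self, server: str) -> None:
        self.healthy[server] = True

    def pick(self) -> str:
        """Return the next healthy server in rotation."""
        for _ in range(len(self.healthy)):
            server = next(self._rotation)
            if self.healthy[server]:
                return server
        raise RuntimeError("no healthy servers available")


pool = ActiveActivePool(["server-a", "server-b"])
print([pool.pick() for _ in range(4)])  # alternates between the two servers
pool.mark_down("server-a")
print([pool.pick() for _ in range(2)])  # only server-b is returned
```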

12
Q

What are the disadvantages of failover?

A

Failover adds more hardware and complexity, and data can be lost if the active system fails before newly written data has been replicated to the passive system.

These factors can complicate system design.

13
Q

How is availability quantified?

A

Availability is quantified as uptime (or downtime) expressed as a percentage of the time the service is available, often described as a number of ‘nines’.

For example, 99.99% availability is referred to as four nines.

14
Q

What is the acceptable downtime for 99.9% availability?

A

8h 45min 57s per year.

This level of availability is often referred to as three nines.
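
The downtime figure can be reproduced with a short calculation. The sketch assumes an average year of 365.25 days, which matches the figure on this card; the function names are made up for the example.

```python
def downtime_seconds(availability_pct: float, period_seconds: float) -> float:
    """Return the downtime allowed in a period at the given availability."""
    return (1 - availability_pct / 100) * period_seconds


def format_duration(seconds: float) -> str:
    """Format seconds as hours, minutes and whole seconds (fractions truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}min {secs}s"


YEAR_SECONDS = 365.25 * 24 * 3600  # assumption: average year length

print(format_duration(downtime_seconds(99.9, YEAR_SECONDS)))   # 8h 45min 57s
print(format_duration(downtime_seconds(99.99, YEAR_SECONDS)))  # 0h 52min 35s
```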

15
Q

What happens to overall availability when components are in sequence?

A

Overall availability decreases.

The formula is Availability (Total) = Availability (Foo) * Availability (Bar).
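
A small worked example of the sequence formula, generalized to any number of components. The function name is an assumption for illustration.

```python
from math import prod


def availability_in_sequence(*availabilities: float) -> float:
    """Total availability when every component must be up for a request to succeed."""
    return prod(availabilities)


# Two components, each 99.9% available, chained one after the other.
total = availability_in_sequence(0.999, 0.999)
print(f"{total:.4%}")  # 99.8001%
```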

16
Q

What happens to overall availability when components are in parallel?

A

Overall availability increases.

The formula is Availability (Total) = 1 - (1 - Availability (Foo)) * (1 - Availability (Bar)).
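
A small worked example of the parallel formula, generalized to any number of redundant components. The function name is an assumption for illustration.

```python
from math import prod


def availability_in_parallel(*availabilities: float) -> float:
    """Total availability when the system is up as long as any one component is up."""
    return 1 - prod(1 - a for a in availabilities)


# Two redundant components, each 99.9% available.
total = availability_in_parallel(0.999, 0.999)
print(f"{total:.4%}")  # 99.9999%
```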

17
Q

What does the CAP theorem state about distributed systems?

A

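During a network partition, a distributed system can guarantee only one of consistency and availability. This is often summarized as choosing two of the three guarantees: Consistency, Availability, and Partition Tolerance.

Because networks are unreliable, partition tolerance has to be supported, so in practice the trade-off is between consistency and availability.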

18
Q

What is the main flaw in the service model of ‘Remembrance Inc’?

A

The system is not consistent; updates may not be synchronized between the two operators.

This leads to potential misinformation for customers.

19
Q

What solution was proposed to fix the consistency problem in ‘Remembrance Inc’?

A

Both operators must inform each other of updates before completing calls.

This ensures that both have the latest information.

20
Q

What issue arises when one operator of ‘Remembrance Inc’ does not report to work?

A

Availability problem; updates cannot be completed if one person is absent.

This can lead to customer dissatisfaction.

21
Q

What is the proposed solution for maintaining both consistency and availability?

A

If one operator is unavailable, the other sends an email to update them.

This allows for updates while maintaining service availability.

22
Q

What is the final flaw identified in the system of ‘Remembrance Inc’?

A

The system is not partition tolerant; if communication fails between operators, they cannot update each other.

This compromises availability during conflicts.

23
Q

What is eventual consistency with a run-around clerk?

A

A clerk updates notebooks in the background, allowing updates to proceed without blocking.

This may lead to temporary inconsistencies but improves system performance.
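
The run-around clerk can be pictured as a background worker draining a queue of pending notebook updates. The sketch below is only an analogy in code: the `Operator` class, the `clerk` and `take_update` functions, and the customer data are all hypothetical.

```python
import queue
import threading


class Operator:
    """Hypothetical operator with a private notebook."""

    def __init__(self, name: str):
        self.name = name
        self.notebook = {}


def take_update(operator, others, pending, customer, note) -> None:
    """Write locally and return at once; other notebooks are updated later."""
    operator.notebook[customer] = note
    for other in others:
        pending.put((other, customer, note))


def clerk(pending: queue.Queue) -> None:
    """Background clerk: copy each pending update into the target notebook."""
    while True:
        item = pending.get()
        if item is None:  # sentinel: stop working
            pending.task_done()
            break
        target, customer, note = item
        target.notebook[customer] = note
        pending.task_done()


pending = queue.Queue()
first, second = Operator("first"), Operator("second")

take_update(first, [second], pending, "customer-1", "flight at 9 pm")
print("customer-1" in second.notebook)  # False: the clerk has not run yet

worker = threading.Thread(target=clerk, args=(pending,), daemon=True)
worker.start()
pending.join()  # wait until the clerk has caught up
print(second.notebook["customer-1"])  # flight at 9 pm

pending.put(None)
worker.join()
```

The window between the first and second `print` is the temporary inconsistency the card describes: the caller was not blocked, but the second operator briefly held stale information.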

24
Q

What is the significance of the CAP theorem in distributed system design?

A

It helps in understanding trade-offs between consistency, availability, and partition tolerance.

This is crucial for making informed decisions in system architecture.

25
Q

What is Partition Tolerance in distributed systems?

A

The system will continue to function when network partitions occur.

26
Q

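True or False: Object Oriented Programming is the same as Network Programming.

A

False. Network programming has to account for failures, such as unreliable networks, that local object-oriented code does not face.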

27
Q

What is a common fallacy of distributed computing regarding networks?

A

Networks are reliable.

28
Q

What must be tolerated in a distributed system due to network unreliability?

A

Partitions

29
Q

According to the CAP theorem, what two options are available when a partition occurs?

A

Consistency or Availability: the system must choose one of the two.
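
The choice made during a partition can be sketched with a toy two-replica cluster. Everything here is an assumption for illustration: the `Cluster` and `Node` classes, the `mode` flag, and the use of `TimeoutError` to stand in for a request that never gets a reply from the unreachable peer.

```python
class Node:
    """Hypothetical replica holding a local copy of the data."""

    def __init__(self):
        self.data = {}


class Cluster:
    """Two replicas; `partitioned` simulates a broken link between them."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.nodes = [Node(), Node()]
        self.partitioned = False

    def write(self, node_index: int, key, value) -> None:
        if not self.partitioned:
            for node in self.nodes:  # healthy network: update both replicas
                node.data[key] = value
        elif self.mode == "CP":
            raise TimeoutError("peer unreachable; write rejected to stay consistent")
        else:
            # AP: accept the write locally, so the replicas diverge for now.
            self.nodes[node_index].data[key] = value

    def read(self, node_index: int, key):
        if self.partitioned and self.mode == "CP":
            raise TimeoutError("peer unreachable; cannot confirm latest value")
        return self.nodes[node_index].data.get(key)  # AP: may be stale


ap = Cluster("AP")
ap.write(0, "x", 1)
ap.partitioned = True
ap.write(0, "x", 2)
print(ap.read(0, "x"), ap.read(1, "x"))  # 2 1: node 1 answers with stale data
```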

30
Q

What does CP stand for in the context of the CAP theorem?

A

Consistency/Partition Tolerance

31
Q

What happens in a CP system when a partition occurs?

A

The system waits for a response from the partitioned node, which could result in a timeout error.

32
Q

What is the main choice in a CP system regarding business requirements?

A

Choose Consistency over Availability when atomic reads and writes are needed.

33
Q

What does AP stand for in the context of the CAP theorem?

A

Availability/Partition Tolerance

34
Q

What does an AP system return when a partition occurs?

A

The most recent version of the data available on the node that responds, which could be stale.

35
Q

When should you choose Availability over Consistency?

A

When business requirements allow flexibility around data synchronization.

36
Q

What is a compelling reason to choose Availability?

A

When the system needs to continue to function despite external errors.

37
Q

What type of decision is the choice between Consistency and Availability?

A

A software trade-off.

38
Q

What must be acknowledged about network outages?

A

They are a fact of life and occur unexpectedly.

39
Q

What are the advantages of building distributed systems?

A

Distributed systems offer many advantages, but they also add complexity.

40
Q

What is vital for the success of your application in distributed systems?

A

Understanding the trade-offs available in the face of network errors.

41
Q

What could happen if the trade-offs between Consistency and Availability are not handled correctly?

A

Your application could fail before deployment.

42
Q

What is consistency in distributed systems and what does the CAP theorem state about it?

A

Consistency in distributed systems refers to how data synchronization is handled when there are multiple copies of the same data:

CAP Theorem Definition:

Consistency means every read operation receives either the most recent write or an error
It’s one of the three key components of the CAP theorem (Consistency, Availability, Partition Tolerance)
Systems must balance these properties as you can only guarantee two out of three

Implementation Considerations:

Multiple data copies require synchronization strategies
Different consistency models offer different trade-offs
Choice of consistency model impacts system behavior and user experience
Must consider network latency and partition scenarios
Business requirements often dictate consistency needs

43
Q

What is weak consistency and when should it be used?

A

Weak consistency is the most relaxed consistency model where:

Core Characteristics:

Reads may or may not reflect the most recent write
No guarantees about when data will be consistent
Best-effort approach to data synchronization
Fastest performance among consistency models
Lowest consistency guarantees

Use Cases:

Real-time communication systems (VoIP)
Video chat applications
Multiplayer games
Systems where temporary data inconsistency is acceptable
Applications where speed is more important than accuracy

Example Scenario:

Phone call with lost reception
When connection resumes, missed audio is not replayed
System prioritizes real-time communication over data consistency
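
The phone call scenario can be sketched as a receiver that plays frames as they arrive and silently drops anything that shows up late. The `BestEffortReceiver` class and the sequence numbers are assumptions made for this example and do not model any particular VoIP protocol.

```python
class BestEffortReceiver:
    """Plays audio frames in arrival order; lost or late frames are never replayed."""

    def __init__(self):
        self.last_played = -1
        self.played = []

    def receive(self, sequence: int, frame: str) -> bool:
        """Play a frame if it is newer than anything played so far; otherwise drop it."""
        if sequence <= self.last_played:
            return False  # late frame: dropped rather than replayed
        self.last_played = sequence
        self.played.append(frame)
        return True


receiver = BestEffortReceiver()
receiver.receive(0, "hel")
receiver.receive(1, "lo ")
# Frames 2 and 3 are lost while reception drops; no retransmission is requested.
receiver.receive(4, "you")
receiver.receive(2, "are")  # arrives too late and is discarded
print(receiver.played)  # ['hel', 'lo ', 'you']
```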

Implementation Example:

Memcached uses this model
Provides high performance
Sacrifices data consistency for speed
No guarantee of data synchronization timing

44
Q

What is eventual consistency and how does it work?

A

Eventual consistency provides a middle-ground approach where:

Core Characteristics:

Data will become consistent over time
Reads will eventually reflect all completed writes
Typically achieves consistency within milliseconds
Data replication happens asynchronously
Better performance than strong consistency

Implementation Details:

Updates propagate gradually through the system
No immediate synchronization requirement
Systems can continue operating during network partitions
May serve stale data temporarily
Conflicts resolved through various mechanisms (vector clocks, etc.)
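
As an illustration of the vector clock idea mentioned above, the sketch below tracks one counter per node and compares two clocks to decide whether one update happened before the other or whether they are concurrent (a conflict that needs resolving). The function names and node names are assumptions for the example.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: dict, b: dict) -> dict:
    """Combine two clocks by taking the larger counter for every node."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def compare(a: dict, b: dict) -> str:
    """Classify clock a relative to clock b: equal, before, after or concurrent."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


base = increment({}, "n1")       # {'n1': 1}
left = increment(base, "n1")     # n1 writes again
right = increment(base, "n2")    # n2 writes without seeing n1's second write

print(compare(base, left))       # before
print(compare(left, right))      # concurrent: a conflict to resolve
print(compare(left, merge(left, right)))  # before: the merged clock covers both
```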

Common Applications:

DNS systems
Email systems
Distributed databases
Social media platforms
Content delivery networks

Advantages:

Higher availability
Better scalability
Lower latency
Continues functioning during network partitions
Good for systems that don’t require immediate consistency

45
Q

What is strong consistency and what are its implications?

A

Strong consistency is the strictest consistency model, in which:

Core Characteristics:

All reads reflect the most recent write
Data is replicated synchronously
Provides immediate consistency across all nodes
Highest consistency guarantees
Highest performance cost among the consistency models

Implementation Requirements:

Synchronous replication
Coordination between all nodes
Consensus protocols
Transaction management
Conflict prevention mechanisms

Common Applications:

File systems
Relational databases (RDBMS)
Banking systems
Financial transactions
Systems requiring ACID properties

Trade-offs:

Highest consistency guarantees
Lower availability during partitions
Higher latency for operations
More complex implementation
Resource intensive

46
Q

What are the key availability patterns in distributed systems?

A

Availability patterns focus on ensuring system uptime through:

Primary Approaches:

Fail-over:

Systems switch to backup when primary fails
Requires redundant hardware
Can be active-passive or active-active
Requires heartbeat monitoring
Needs failover automation

Replication:

Data copied across multiple nodes
Can be synchronous or asynchronous
Supports different consistency models
Provides redundancy
Enables load distribution

Implementation Considerations:

Hardware requirements
Network configuration
Data synchronization
Monitoring systems
Recovery procedures

47
Q

What is active-passive failover and how does it work?

A

Active-passive failover (also known as master-slave failover) is a high availability pattern that:

Core Components:

Active server handling all traffic
Passive server on standby
Heartbeat mechanism between servers
IP address takeover capability
Monitoring system

Operational Details:

Normal Operation:

Active server handles all requests
Passive server maintains standby state
Regular heartbeat checks between servers
Continuous data synchronization
System monitoring active

Failover Process:

Heartbeat interruption detected
Passive server activates
IP address migration occurs
Services resume on passive server
System alerts generated

Standby Modes:

Hot Standby:

Passive server running and ready
Minimal startup time
Higher resource usage
Faster failover
More expensive

Cold Standby:

Passive server inactive
Longer startup time
Lower resource usage
Slower failover
More cost-effective

Disadvantages:

Additional hardware costs
Complex configuration
Potential data loss during failover
Resource underutilization
Higher maintenance overhead

48
Q

How does active-active failover work and what are its characteristics?

A

Active-active failover (also known as master-master failover) is a more complex availability pattern where:

Core Characteristics:

Multiple active servers
Load distribution across servers
Simultaneous traffic handling
Synchronized data states
No standby resources

Implementation Requirements:

Public-Facing Systems:

DNS configuration for multiple IPs
Load balancer configuration
Health checking mechanisms
Traffic distribution rules
Failover procedures

Internal Systems:

Application awareness of multiple servers
Connection management
Load distribution logic
State synchronization
Conflict resolution

Advantages:

Better resource utilization
Higher throughput capacity
Natural load balancing
Improved fault tolerance
Easier maintenance procedures

Challenges:

Complex data synchronization
Potential consistency issues
More sophisticated monitoring needed
Higher implementation complexity
Increased operational overhead

49
Q

What are the key considerations for availability percentages and their implications?

A

Availability measurements and calculations involve:

Availability Metrics:

Three Nines (99.9%):

8h 45min 57s downtime per year
43m 49.7s downtime per month
10m 4.8s downtime per week
1m 26.4s downtime per day
Suitable for non-critical systems

Four Nines (99.99%):

52min 35.7s downtime per year
4m 23s downtime per month
1m 5s downtime per week
8.6s downtime per day
Required for critical systems

Calculation Patterns:

Sequential Systems:

Availability decreases with each component
Formula: A(total) = A(component1) * A(component2)
Example: Two 99.9% components = about 99.8% total (0.999 * 0.999 = 0.998001)
More components reduce overall availability
Requires higher component reliability

Parallel Systems:

Availability increases with redundancy
Formula: A(total) = 1 - (1-A(comp1)) * (1-A(comp2))
Example: Two 99.9% components = 99.9999% total
Better fault tolerance
Higher overall availability
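
The parallel formula can also be turned around to estimate how much redundancy a target needs. The sketch below counts how many identical components must run in parallel to reach a target availability; it assumes the components fail independently, which real deployments often do not satisfy, and the function name is hypothetical.

```python
def replicas_needed(component_availability: float, target: float) -> int:
    """Smallest number of identical parallel components that meets the target.

    Assumes independent failures, so combined unavailability is the product
    of the individual unavailabilities.
    """
    if not 0 < component_availability < 1 or not 0 < target < 1:
        raise ValueError("availabilities must be strictly between 0 and 1")
    unavailability = 1 - component_availability
    combined_unavailability = unavailability
    replicas = 1
    while 1 - combined_unavailability < target:
        replicas += 1
        combined_unavailability *= unavailability
    return replicas


print(replicas_needed(0.9, 0.995))     # 3
print(replicas_needed(0.99, 0.99995))  # 3
```

Each extra replica multiplies the unavailability by the same factor, so the gains are large at first, but cost and synchronization complexity grow with every component added.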

Implementation Considerations:

Cost vs availability requirements
System architecture decisions
Component reliability needs
Monitoring and alerting thresholds
Maintenance windows impact