Module 13 - CAP principle Flashcards
What does “Consistency” mean in the context of CAP?
Consistency:
Clients agree on the latest state of the data
What does “Availability” mean in the context of CAP?
Availability:
Clients are able to execute both read-only queries and updates
What does “Partition tolerance” mean in the context of CAP?
Partition Tolerance:
System continues to function if the network fails and nodes are separated into disjoint sets
What is the CAP principle?
In the event of a partition (P), the system must choose either consistency (C) or availability (A), and cannot provide both simultaneously
In the event of no partition, what does the CAP principle state about availability and consistency?
During failure-free (partition-free) operation, a system may be simultaneously highly available and strongly consistent
What is a CP system?
In the event of a partition, choose C (consistency) over A (availability)
What is an AP system?
In the event of a partition, choose A (availability) over C (consistency)
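As a rough illustration of the CP and AP choices above, the sketch below models a single replica that receives a write while it may be cut off from its peers. The `Replica` class, the `mode` strings, and the `partitioned` flag are hypothetical names invented for this example, not part of any real system.

```python
class Unavailable(Exception):
    """Raised when a CP replica refuses a request during a partition."""


class Replica:
    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode          # hypothetical switch between the two behaviors
        self.partitioned = False  # True when this replica cannot reach its peers
        self.data = {}

    def put(self, key, value):
        if self.partitioned and self.mode == "CP":
            # CP: give up availability rather than risk divergent replicas.
            raise Unavailable("cannot reach peers; rejecting write")
        # AP (or no partition): accept the write locally; replicas may
        # temporarily disagree and must reconcile later.
        self.data[key] = value
        return "ok"


cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True

print(ap.put("x", 1))  # the AP replica stays available
try:
    cp.put("x", 1)
except Unavailable as err:
    print("CP replica refused:", err)
```

When `partitioned` is False, both modes accept the write, which matches the failure-free case described earlier.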
What are AP systems in the real world appropriate for?
AP systems are appropriate for latency-sensitive, inconsistency-tolerant applications
In AP systems, are there transactions?
No transactions in AP systems
Data accessed mostly using get/put operations
What are characteristics of consistency in CP systems?
- Serializability
- Linearizability
- Sequential consistency
- N_r + N_w > N
What are characteristics of consistency in AP systems?
Note that these guarantees are much weaker than those of CP systems
Depending on the type of AP system, it may provide one of the following, or neither:
- Eventual consistency
- Causal consistency
What does PACELC mean?
If there is a network Partition, then choose between Availability and Consistency. Else choose between low Latency and Consistency
What is strong consistency in the context of key-value storage systems?
When clients read and write overlapping sets of replicas, then every read is guaranteed to observe the effects of all writes that finished before the read started
What's the idea behind tunable consistency?
What are characteristics of tunable consistency systems?
The idea is that a system is neither strictly CP nor strictly AP, but can be configured to behave like either.
These systems use quorum-based replication, and N_r and N_w can be tuned to make the system behave more like an AP system or more like a CP system
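A minimal sketch of quorum-based replication with tunable N_r and N_w. It assumes in-memory replicas, versioned values, and a single global version counter (as if one coordinator ordered all writes). `QuorumStore` and its method names are hypothetical and exist only for this illustration.

```python
import random


class QuorumStore:
    """Toy key-value store with N replicas and tunable quorum sizes."""

    def __init__(self, n, n_r, n_w, seed=0):
        self.n, self.n_r, self.n_w = n, n_r, n_w
        # Each replica maps key -> (version, value).
        self.replicas = [{} for _ in range(n)]
        self.version = 0  # simplification: one global write order
        self.rng = random.Random(seed)

    def put(self, key, value):
        # Write to an arbitrary set of n_w replicas with a fresh version.
        self.version += 1
        for i in self.rng.sample(range(self.n), self.n_w):
            self.replicas[i][key] = (self.version, value)

    def get(self, key):
        # Read from an arbitrary set of n_r replicas, return the newest value.
        answers = [self.replicas[i].get(key, (0, None))
                   for i in self.rng.sample(range(self.n), self.n_r)]
        return max(answers, key=lambda a: a[0])[1]

    def is_strongly_consistent(self):
        # True when read and write quorums are guaranteed to overlap.
        return self.n_r + self.n_w > self.n


cp_like = QuorumStore(n=3, n_r=2, n_w=2)
cp_like.put("x", "v1")
cp_like.put("x", "v2")
print(cp_like.get("x"))  # the latest write, since 2 + 2 > 3

ap_like = QuorumStore(n=3, n_r=1, n_w=1)
print(ap_like.is_strongly_consistent())  # False: a read may miss the latest write
```

Small quorums touch fewer replicas per operation, which is the AP-like end of the dial. Larger, overlapping quorums trade that for the CP-like guarantee.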
In a distributed system, if there’s no network partition, and we’d like to achieve low latency, what must we give up?
If there’s no partition in the network, then we must give up consistency to achieve low latency (and vice versa)
In a tunable consistency distributed system, how do you ensure that the system is strongly consistent in the context of CAP?
Note: strongly consistent in the context of CAP, means that it is a CP system
Make N_r + N_w > N. This ensures that every read quorum overlaps every write quorum, so read-write conflicts can be detected
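The overlap claim behind N_r + N_w > N can be checked by brute force for small N. The sketch below enumerates every possible read and write quorum and tests whether each pair shares at least one replica. The function name `quorums_always_overlap` is made up for this example.

```python
from itertools import combinations


def quorums_always_overlap(n, n_r, n_w):
    """Return True if every read quorum shares a replica with every write quorum."""
    nodes = range(n)
    return all(set(r) & set(w)
               for r in combinations(nodes, n_r)
               for w in combinations(nodes, n_w))


# Compare the enumeration with the inequality for N = 3.
for n_r, n_w in [(1, 1), (2, 1), (2, 2), (1, 3)]:
    print(n_r, n_w, quorums_always_overlap(3, n_r, n_w), n_r + n_w > 3)
```

The last two columns agree on every row: the quorums are guaranteed to intersect exactly when N_r + N_w > N.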
For a distributed system, if you give up consistency, do you get availability?
Not always. If a network partition leaves all replicas of a data object on one side, then a client that can only reach nodes on the other side cannot access that object, so the system is not available to that client
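The scenario above can be written as a small check, assuming strict quorums where only the object's replicas can serve a read. The node names and the `can_read` helper are hypothetical.

```python
def can_read(replica_nodes, reachable_nodes, n_r):
    """A read is available if the client can reach at least n_r replicas."""
    return len(set(replica_nodes) & set(reachable_nodes)) >= n_r


replicas_of_x = {"a", "b", "c"}  # all replicas of x sit on one side
side_1 = {"a", "b", "c"}
side_2 = {"d", "e"}              # the client can only reach these nodes

print(can_read(replicas_of_x, side_1, n_r=1))  # True
print(can_read(replicas_of_x, side_2, n_r=1))  # False, even with the weakest quorum
```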
What are sloppy quorums?
Hinted handoff is an example of sloppy quorums. How does it work?
What assumptions does this idea make? In what case is hinted handoff not necessary?
In Sloppy Quorums, the set of replicas (partial quorums) can change dynamically.
Hinted handoff allows an arbitrary node to accept an update for a given key and hold the update until one of the key's write replicas becomes available.
This assumes that there is partial replication. In a quorum-based system with full replication, hinted handoff is not necessary, because every node is already a replica.
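A minimal sketch of hinted handoff, assuming in-memory nodes and partial replication. `Node`, `write_with_hints`, and `replay_hints` are hypothetical names, and real systems add timeouts, hint expiry, and durable hint storage that are left out here.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (intended_replica, key, value) held for later delivery


def write_with_hints(key, value, write_replicas, all_nodes):
    """Write to each replica; if one is down, park a hint on a live non-replica."""
    for replica in write_replicas:
        if replica.up:
            replica.data[key] = value
            continue
        stand_in = next((n for n in all_nodes
                         if n.up and n not in write_replicas), None)
        if stand_in is None:
            raise RuntimeError("no live node available to hold the hint")
        stand_in.hints.append((replica, key, value))


def replay_hints(node):
    """Deliver stored hints to replicas that have come back up."""
    remaining = []
    for replica, key, value in node.hints:
        if replica.up:
            replica.data[key] = value
        else:
            remaining.append((replica, key, value))
    node.hints = remaining


a, b, c, d = (Node(x) for x in "abcd")
b.up = False
write_with_hints("k", "v", write_replicas=[a, b], all_nodes=[a, b, c, d])
print("k" in b.data, len(c.hints))  # the write succeeded, but b does not have it yet
b.up = True
replay_hints(c)
print(b.data["k"])
```

Until the hint is replayed, a read served only by `b` would miss the value, so the write is accepted without the data being readable from every replica. With full replication there is no non-replica node to act as a stand-in, which is why the technique only applies to partial replication.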
Hinted handoff ensures _____ availability, but not _____ availability
write
read
Which client-side consistency level is used in Apache Cassandra to enable full write availability in the presence of network partitions?
Does Apache Cassandra support tunable consistency?
ANY (writes only), which combines partial quorums with hinted handoff
Apache Cassandra supports tunable consistency with (optional) full write availability
In Apache Cassandra, what do the terms: 1. ONE 2. ANY 3. TWO 4. THREE 5. QUORUM 6. ALL mean in terms of replication strategy?
- ONE: N_r or N_w = 1
- ANY: for writes only; the write succeeds once any node accepts it, using hinted handoff if no replica is reachable
- TWO: N_r or N_w = 2
- THREE: N_r or N_w = 3
- QUORUM: N_r or N_w = ceiling((N+1)/2)
- ALL: N_r or N_w = N
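The mapping above can be sketched as a small lookup. `required_replicas` is a hypothetical helper, not part of any Cassandra client API, and ANY is modelled here as needing zero replica acknowledgements because a stored hint is enough.

```python
import math


def required_replicas(level, n):
    """Replica acknowledgements needed for a consistency level, given N replicas."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return math.ceil((n + 1) / 2)  # same value as n // 2 + 1
    if level == "ALL":
        return n
    if level == "ANY":
        # Writes only: a hint stored on any node is enough, so no replica
        # has to acknowledge (an assumption of this sketch).
        return 0
    raise ValueError(f"unknown consistency level: {level}")


for n in (3, 4, 5):
    q = required_replicas("QUORUM", n)
    print(n, q, q + q > n)
```

Reading and writing at QUORUM always satisfies N_r + N_w > N, which is why that combination gives the strongly consistent configuration.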
In Apache Cassandra, what does the coordinator do?
Executes put and get requests on behalf of the client. It is not a centralized service: the coordinator role is implemented on every node