Tutorial 5: Consistency Flashcards
What is a consistency model?
A contract between a distributed data store (e.g., distributed database, shared memory, shared files, etc.) and a set of processes, which specifies what the results of read/write operations are in the presence of concurrency.
Why do we need consistency models in distributed systems?
With concurrent read & write operations on a distributed data store there is a potential for inconsistencies. Different consistency models describe different degrees of consistency that are tolerable for the application that resides on top of it. When designing such distributed systems a consistency model helps to identify the desired consistency level. In most cases the cost is traded off against the guarantees. High consistency is expensive but might not always be needed.
What is the difference between (data-centric) consistency models and client-centric consistency models?
(Data-centric) consistency models care about system-wide consistency, whereas client-centric consistency models focus on the client point of view.
Name two advantages and disadvantages of replication, respectively.
Advantages:
- Performance (replicas can increase the throughput, especially of read operations)
- High Availability (e.g., when a single machine fails)
- Fault-tolerance
Disadvantages:
- Communication/Storage cost
- Stale (inconsistent) data
Name three use cases for replication. Why is replication used in each case?
- Distributed file systems, e.g., GFS, HDFS → data blocks are replicated
  - Ensure availability of data
  - Provide recovery mechanisms in case of a system crash
- Database replication, e.g., MySQL, Cassandra → rows of tables are replicated
  - Ensure fault tolerance
  - Distribute the load of reads and writes (increase performance)
  - Data warehousing (run analytical queries)
- File-based replication, e.g., backup solutions → files are replicated
  - Store data/files at different places
What is Strict Consistency?
means that any read on a resource x returns a value corresponding to the result of the most recent write on x. This definition implicitly assumes the existence of absolute global time.
What is Linearizability?
means that all operations that are ordered in real time are ordered the same way in the execution sequence. We assume ordering according to a set of synchronized clocks. This implies that each operation can take effect at an arbitrary point within the interval between its invocation and its completion.
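As an illustration of the definition, here is a minimal brute-force sketch of a linearizability check for a single register. The `Op` record, the function names, and the timestamps are hypothetical, and trying every permutation only works for tiny histories.

```python
from itertools import permutations
from typing import NamedTuple

class Op(NamedTuple):
    proc: str     # issuing process
    kind: str     # "w" for write, "r" for read
    value: int    # value written, or value the read returned
    start: float  # invocation time
    end: float    # completion time

def respects_real_time(seq):
    # If one operation completed before another was invoked,
    # it must also come first in the candidate sequence.
    for i, later in enumerate(seq):
        for earlier in seq[:i]:
            if later.end < earlier.start:
                return False
    return True

def legal_register(seq, initial):
    # Every read must return the most recently written value.
    current = initial
    for op in seq:
        if op.kind == "w":
            current = op.value
        elif op.value != current:
            return False
    return True

def is_linearizable(history, initial=0):
    # Brute force: try every total order of the operations.
    return any(
        respects_real_time(seq) and legal_register(seq, initial)
        for seq in permutations(history)
    )

# The read overlaps the write, so it may take effect before the write.
overlapping = [Op("P1", "w", 1, 0.0, 2.0), Op("P2", "r", 0, 1.0, 3.0)]
# The read starts after the write completed, so returning 0 is not allowed.
stale_read = [Op("P1", "w", 1, 0.0, 1.0), Op("P2", "r", 0, 2.0, 3.0)]
```

Under these assumptions, `is_linearizable(overlapping)` holds and `is_linearizable(stale_read)` does not.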
What is Sequential Consistency?
means that any valid interleaving of read and write operation is acceptable behavior, but all processes see the same interleaving of operations. Nothing is said about time.
Forget time; only ORDER matters!
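A small sketch of what "same interleaving for all processes" means in practice: search for one interleaving that keeps each process's program order and in which every read returns the latest write. The history encoding (`"w"`/`"r"` tuples) and the function name are assumptions made for this example.

```python
def is_sequentially_consistent(history, initial=0):
    """history maps a process name to its operations in program order.
    Each operation is a tuple (kind, variable, value), kind in {"w", "r"}."""
    procs = list(history)

    def search(pos, store):
        # pos[p] = index of the next operation of process p to schedule
        if all(pos[p] == len(history[p]) for p in procs):
            return True
        for p in procs:
            if pos[p] == len(history[p]):
                continue
            kind, var, value = history[p][pos[p]]
            # A read can only be scheduled if it returns the current value.
            if kind == "r" and store.get(var, initial) != value:
                continue
            next_store = {**store, var: value} if kind == "w" else store
            if search({**pos, p: pos[p] + 1}, next_store):
                return True
        return False

    return search({p: 0 for p in procs}, {})

# P3 and P4 both see x=2 before x=1: one common interleaving exists.
same_order = {
    "P1": [("w", "x", 1)],
    "P2": [("w", "x", 2)],
    "P3": [("r", "x", 2), ("r", "x", 1)],
    "P4": [("r", "x", 2), ("r", "x", 1)],
}
# P3 and P4 disagree on the order of the two writes: no common interleaving.
different_order = {
    "P1": [("w", "x", 1)],
    "P2": [("w", "x", 2)],
    "P3": [("r", "x", 2), ("r", "x", 1)],
    "P4": [("r", "x", 1), ("r", "x", 2)],
}
```

Note that no timestamps appear anywhere in the check; only program order and a shared total order matter.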
What is Causal consistency?
means that concurrent writes do not need to be seen in the same order by all readers. Causally related writes, however, must be seen in the same order by every process. This means processes can possibly have different execution sequences. Nothing is said about time.
If the writes of P1..Pn are concurrent, put them on one line and then check the reads: if they fit, the execution is OK.
For non-concurrent writes, distinguish by process.
Give all results in the form P1: … , P2: ….
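One common way to enforce causal order is to tag writes with vector clocks and delay a write until everything it depends on has been applied. The sketch below is an illustration of that technique, not something the flashcards prescribe; the class and method names are hypothetical, and it conservatively treats every write a replica has applied as a cause of its next local write.

```python
class CausalReplica:
    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n   # clock[k] = number of writes from process k applied here
        self.store = {}
        self.buffer = []       # received writes that are not yet deliverable
        self.applied = []      # order in which writes were applied

    def local_write(self, key, value):
        self.clock[self.pid] += 1
        self.store[key] = value
        self.applied.append((key, value))
        # The message carries a copy of the clock as its causal timestamp.
        return (self.pid, list(self.clock), key, value)

    def _deliverable(self, sender, ts):
        # Next write from the sender, and all its other dependencies are applied.
        return ts[sender] == self.clock[sender] + 1 and all(
            ts[k] <= self.clock[k] for k in range(len(ts)) if k != sender
        )

    def receive(self, msg):
        self.buffer.append(msg)
        progress = True
        while progress:
            progress = False
            for m in list(self.buffer):
                sender, ts, key, value = m
                if self._deliverable(sender, ts):
                    self.store[key] = value
                    self.clock[sender] = ts[sender]
                    self.applied.append((key, value))
                    self.buffer.remove(m)
                    progress = True

p0, p1, p2 = (CausalReplica(i, 3) for i in range(3))
m1 = p0.local_write("x", 1)
p1.receive(m1)
m2 = p1.local_write("x", 2)  # causally after m1, because P1 has applied m1
p2.receive(m2)               # arrives early: buffered, not applied
p2.receive(m1)               # m1 is applied, then the buffered m2
```

After the last call, `p2.applied` lists the write of 1 before the write of 2. Concurrent writes carry no such dependency, so different replicas may apply them in different orders.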
What is FIFO consistency?
FIFO consistency means write operations by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in different orders by different processes.
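The rule can be sketched as a check on what a single observer has seen. The function name and data layout are hypothetical, and the sketch assumes an observer applies every write of a process without gaps, so the writes seen from one process must form a prefix of what that process issued.

```python
def respects_fifo(issued, observed):
    """issued: process -> list of its writes in issue order.
    observed: list of (process, write) pairs in the order one observer saw them."""
    for proc, writes in issued.items():
        seen = [w for p, w in observed if p == proc]
        # Writes of one process must appear in issue order, without gaps.
        if seen != writes[:len(seen)]:
            return False
    return True

issued = {"P1": ["x=1", "x=2"], "P2": ["x=3"]}

# Two observers interleave P1 and P2 differently: both are allowed.
observer_a = [("P1", "x=1"), ("P2", "x=3"), ("P1", "x=2")]
observer_b = [("P2", "x=3"), ("P1", "x=1"), ("P1", "x=2")]
# This observer sees P1's writes in the wrong order: not allowed.
observer_c = [("P1", "x=2"), ("P1", "x=1")]
```

Here `observer_a` and `observer_b` both pass the check even though they disagree with each other, while `observer_c` fails.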
What is Monotonic-read?
means that once a process has read a value of a data item, any subsequent read on that data item by the same process returns the same or a more recent value.
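A minimal sketch of how a client session might enforce this, assuming every value carries a version number. `Replica`, `MonotonicReadSession`, and the versioning scheme are hypothetical names chosen for the example.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def apply(self, key, version, value):
        # Keep only the newest version of each key.
        if version > self.data.get(key, (0, None))[0]:
            self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key, (0, None))

class MonotonicReadSession:
    def __init__(self):
        self.last_seen = {}  # key -> highest version this session has read

    def read(self, replica, key):
        version, value = replica.get(key)
        if version < self.last_seen.get(key, 0):
            # Serving this read would move the client back in time.
            raise RuntimeError("replica is older than a value already read")
        self.last_seen[key] = version
        return value

fresh, stale = Replica(), Replica()
for r in (fresh, stale):
    r.apply("x", 1, "a")
fresh.apply("x", 2, "b")  # the update has not reached `stale` yet

session = MonotonicReadSession()
first = session.read(fresh, "x")  # returns "b" and records version 2
```

A later `session.read(stale, "x")` raises instead of returning the older value "a". A real client would more likely retry on another replica or wait.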
What is Monotonic-write?
means that a write operation by a process on a data item x is completed before any subsequent write operation can be issued on x by the same process.
Order of w(x1)w(x2)… for each Location
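One way to get this effect at a replica is to number each client's writes and hold back any write whose predecessor has not been applied yet. This is a sketch under that assumption; the class name and per-client sequence numbers are hypothetical.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.next_seq = {}  # client -> sequence number expected next
        self.pending = {}   # (client, seq) -> value waiting for its predecessor
        self.log = []       # values in the order they were applied

    def receive(self, client, seq, value):
        self.pending[(client, seq)] = value
        expected = self.next_seq.get(client, 0)
        # Apply as many of this client's writes as are now in order.
        while (client, expected) in self.pending:
            self.value = self.pending.pop((client, expected))
            self.log.append(self.value)
            expected += 1
        self.next_seq[client] = expected

replica = Replica()
replica.receive("c1", 1, "x2")  # arrives first but must wait for write 0
log_before = list(replica.log)
replica.receive("c1", 0, "x1")  # now both writes are applied, in issue order
```

After both calls the replica has applied "x1" and then "x2", matching the order w(x1)w(x2) in which the client issued them.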
What is Read-your-writes?
means that the effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.
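A minimal sketch for a single data item x, assuming each replica exposes a version counter. The session remembers the version of its own latest write and only reads from a replica that has caught up; all names are hypothetical.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

class Session:
    def __init__(self):
        self.last_write_version = 0

    def write(self, replica, value):
        replica.version += 1
        replica.value = value
        # Remember how far a replica must be to reflect this write.
        self.last_write_version = replica.version

    def read(self, replicas):
        # Use the first replica that already contains this session's writes.
        for replica in replicas:
            if replica.version >= self.last_write_version:
                return replica.value
        raise RuntimeError("no replica has caught up with this session's writes")

primary, lagging = Replica(), Replica()
session = Session()
session.write(primary, "a")                # not yet propagated to `lagging`
result = session.read([lagging, primary])  # skips `lagging`, returns "a"
```

Without the version check, the read could be served by `lagging` and the process would not see its own write.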
What is Writes-follow-reads?
means that a write operation by a process on a data item x that follows a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x than the one that was read.
w(x)1 -> r(x)1
ws(x1,x2) -> r(x2)
ws(x1,x2,x3) -> r(x3)
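The guarantee can be sketched with a session that remembers the version it last read and refuses to write to a replica that is behind that version. The classes and the simple per-replica version counter are assumptions made for this example.

```python
class Replica:
    def __init__(self, version=0, value=None):
        self.version = version
        self.value = value

class Session:
    def __init__(self):
        self.last_read_version = 0

    def read(self, replica):
        self.last_read_version = max(self.last_read_version, replica.version)
        return replica.value

    def write(self, replica, value):
        # The write must build on the value that was read, or a newer one.
        if replica.version < self.last_read_version:
            raise RuntimeError("replica is behind the value this session read")
        replica.version += 1
        replica.value = value

up_to_date = Replica(version=2, value="x2")
lagging = Replica(version=1, value="x1")

session = Session()
seen = session.read(up_to_date)   # the session has now read x2
session.write(up_to_date, "x3")   # accepted: the replica holds x2
```

A call such as `session.write(lagging, "x3")` raises, because that replica has only reached x1 and the write would not follow the value that was read.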