Module 9a - Consistency and Replication (part 1) Flashcards
Why do we need to replicate Data?
- Improves dependability. If one replica fails, the data is not lost, because other copies exist
- Increases throughput. Replicas can be read from and written to in parallel
- Decreases latency. A replica can be kept geographically close to each client
What makes it difficult to design systems that deal with shared mutable state?
We have to account for both concurrency and failures
In a replicated data store, each data object (e.g., a row in a table) is _______ at multiple ______.
replicated
hosts
A replica of an object may be ______ to a process meaning that it resides on the same host, or it may be _______
local
remote
Why is replicating read-only data straightforward?
Because read-only data never changes, so we don’t have to worry about keeping the copies synchronized across hosts
If a data store holds ______ state, then we can never keep _______ of this state perfectly _______
mutable
replicas
synchronized
Why can’t mutable state/object replicas be perfectly synchronized all the time?
- Variations in processing speeds
- Network Delays
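Example sketch (not from the original cards): a small Python illustration of how network delays can leave replicas out of sync. It assumes a hypothetical replica that applies updates in the order they arrive, so the same two writes delivered in different orders produce different states.

```python
def apply_all(updates):
    """Apply (key, value) updates in arrival order and return the final state."""
    state = {}
    for key, value in updates:
        state[key] = value
    return state

# Two writes to the same object, issued by different clients.
w1 = ("x", 1)
w2 = ("x", 2)

# Assumption: delays cause replica B to receive the writes in reverse order.
replica_a = apply_all([w1, w2])
replica_b = apply_all([w2, w1])

print(replica_a == replica_b)  # the two replicas disagree on x
```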
What does a consistency model help us with, and how does it do it?
It helps us make sense of concurrent reads and updates of data objects in a distributed system, by describing the extent to which replicas are permitted to disagree on the state of the data
What makes it difficult to select a good consistency model?
Application requirements in general do not map neatly to a specific consistency model
Under what condition is a data store sequentially consistent?
Whenever the result of any execution is the same as if the read/write operations by all processes on the data store were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program.
i.e., the concurrent execution must produce the same result as some sequential interleaving that preserves each process's program order
There are no contradictions in the order of operations in an execution.
How do you prove if an execution is sequentially consistent in practice?
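You must brute-force the possible interleavings of the operations that preserve each process's program order, and find one in which every read returns the value of the most recent write to the same object. If such an interleaving exists, the execution is sequentially consistent
Alternatively, you can construct a graph with all the order dependencies of all operations. If there are no cycles in this graph, then the execution is sequentially consistent; a cycle means the ordering constraints contradict each other, so it is not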
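Example sketch (not from the original cards): a minimal Python version of the brute-force check. It assumes each operation is a hypothetical tuple (kind, key, value), that an object that has not been written reads as 0, and that histories are small enough to enumerate. The function names are illustrative only.

```python
def is_legal(sequence, initial=0):
    """True if every read returns the most recent write to the same key."""
    store = {}
    for kind, key, value in sequence:
        if kind == "w":
            store[key] = value
        elif store.get(key, initial) != value:
            return False
    return True

def interleavings(histories):
    """Yield every merge of the per-process lists that keeps program order."""
    if all(len(h) == 0 for h in histories):
        yield []
        return
    for i, h in enumerate(histories):
        if h:
            rest = histories[:i] + [h[1:]] + histories[i + 1:]
            for tail in interleavings(rest):
                yield [h[0]] + tail

def is_sequentially_consistent(histories):
    """Search for one legal interleaving; exponential in the worst case."""
    return any(is_legal(seq) for seq in interleavings(histories))

# P1 writes x=1, P2 writes x=2, and two readers observe x twice each.
same_order = [
    [("w", "x", 1)],
    [("w", "x", 2)],
    [("r", "x", 2), ("r", "x", 1)],
    [("r", "x", 2), ("r", "x", 1)],
]
opposite_order = [
    [("w", "x", 1)],
    [("w", "x", 2)],
    [("r", "x", 1), ("r", "x", 2)],
    [("r", "x", 2), ("r", "x", 1)],
]
print(is_sequentially_consistent(same_order))      # True
print(is_sequentially_consistent(opposite_order))  # False
```

In the second history the readers disagree on the order of the two writes, so no single interleaving can satisfy both.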
Under what condition is a data store causally consistent?
Whenever every process observes the operations in an order that respects the “causally precedes” relation, where Op1 causally precedes Op2 if either of the following holds:
- Op1 occurs before Op2 in the same process
- Op2 reads a value written by Op1
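Example sketch (not from the original cards): one way to compute the "causally precedes" pairs in Python from the two rules above. It assumes a hypothetical history format (process name mapped to a list of (kind, key, value) tuples), assumes written values are unique so each read matches one write, and additionally closes the relation under transitivity, which the cards do not state but the usual definition includes.

```python
from itertools import product

def causally_precedes(histories):
    """Return the set of (a, b) pairs where operation a causally precedes b.

    Operations are identified as (process, index) into the history lists.
    """
    edges = set()
    writes = {}
    for p, ops in histories.items():
        for i, (kind, key, value) in enumerate(ops):
            # Rule 1: earlier operations in the same process precede later ones.
            for j in range(i + 1, len(ops)):
                edges.add(((p, i), (p, j)))
            if kind == "w":
                writes[(key, value)] = (p, i)
    # Rule 2: a write precedes any read that returns its value.
    for p, ops in histories.items():
        for i, (kind, key, value) in enumerate(ops):
            if kind == "r" and (key, value) in writes:
                edges.add((writes[(key, value)], (p, i)))
    # Assumption: the relation is transitive, so close it under composition.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(edges), repeat=2):
            if b == c and (a, d) not in edges:
                edges.add((a, d))
                changed = True
    return edges

histories = {
    "P1": [("w", "x", 1)],
    "P2": [("r", "x", 1), ("w", "y", 2)],
    "P3": [("r", "y", 2)],
}
cp = causally_precedes(histories)
print((("P1", 0), ("P3", 0)) in cp)  # True: P1's write reaches P3 through P2
```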
How do you prove that an execution is causally consistent in practice?
You must show that, for each process Pi, there exists a total order Ti with these 3 properties:
- Ti contains all the operations executed by Pi as well as the writes of any values read by Pi, and nothing else
- Each read in Ti returns the value of the most recent write to the same object in Ti (i.e., order is legal)
- If 2 operations occur in Ti and also occur in some other Tj, then they must be consistent in order
______ consistency, _______, and ______ consistency are different ways to define the correct behaviour of operations on shared objects under concurrent access
Sequential
Linearizability
Causal
The _______ property assumes that operations have well-defined start and finish events, which are ordered by a _____ _____
Linearizability
global
clock
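Example sketch (not from the original cards): a small Python model of operations with start and finish times on a global clock. The Op type and its fields are hypothetical. It shows the real-time ordering that linearizability builds on: one operation precedes another only if it finishes before the other starts, and otherwise the two overlap.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    name: str
    start: float   # global-clock time of the start event
    finish: float  # global-clock time of the finish event

def precedes(a, b):
    """a precedes b in real time if a finishes before b starts."""
    return a.finish < b.start

def concurrent(a, b):
    """Two operations are concurrent if neither precedes the other."""
    return not precedes(a, b) and not precedes(b, a)

write_x = Op("write x=1", start=0.0, finish=2.0)
read_1 = Op("read x", start=1.0, finish=3.0)
read_2 = Op("read x", start=4.0, finish=5.0)

print(concurrent(write_x, read_1))  # True: the intervals overlap
print(precedes(write_x, read_2))    # True: the write finished first
```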