Chapter 6: Consistency & Replication Flashcards
Why Replication?
-
Performance
- Caching data at browsers and proxy servers
- e.g. Content Delivery Network with several locations
-
High availability
- Upon crashes, service offered by replica
- Upon network partition, service available to clients in partition
-
Fault tolerance
- Providing reliable service in face of faulty servers
The “cost” of replication NN
- Cost to keep replicas up to date in face of updates?
- E.g., additional bandwidth, number of messages exchanged, higher latency (i.e., service‐response time), complexity of code, etc.
- How can the problem of stale (out‐of‐date) data current reads at replicas be overcome?
Consistency Models
- Definition (Consistency model)
- A contract between a distributed data store and a set of processes, which specifies what the results of read/write operations are in the presence of concurrency
- Distributed data store as synonym for
distributed database, shared memory, shared files, etc.
Consistency Violation
- With concurrent read & write operations on a distributed data store, data inconsistency may arise
data:image/s3,"s3://crabby-images/6d671/6d671527df9941098acb214835a57a93241e5f4a" alt=""
Overview: Consistency Models important - know lal
- Strict consistency
- Sequential consistency
- Linearizable consistency
- Causal consistency
- FIFO consistency
- Weak consistency
Strict Consistency
- Definition: Any read on a data item x returns a value corresponding to the result of the most recent write on x
- Uni‐processor systems have traditionally observed strict consistency, … but what about multi‐processor systems?
a = 1; a = 2; print(a); Output? most recent ! - Definition assumes existence of absolute global time for unambiguous determination of “most recent“.
- What do we know about uniquely time stamping events in distributed systems via absolute time references?
- Under strict consistent, all writes are instantaneously visible to all processes and absolute global time order is maintained
- Similarly, if a read is done, then it gets the most recent value, no matter how quickly the next write is done
data:image/s3,"s3://crabby-images/138a2/138a25481c1b17aa5b14104731876fd64931f28d" alt=""
Strict Consistency: Thought Experiment
- To satisfy strict consistency, laws of physics may have to be violated! Obviously, not an option!!!
- To realize strict consistency in this case, R(X)a would have to travel at 10 times the speed of light!
- Strict consistency is impossible to guarantee in distributed systems, due to reliance on precise global time in its definition
Definition of Sequential Consistency
- The result of any execution is the same as if the operations by all processes on the data store were executed in some sequential order and …
- … the operations of each individual process appear in this sequence in the order specified by its program
- Operations refer to read and write
- Weaker than strict consistency
- When processes run concurrently, any valid interleaving of read and write operation is acceptable behavior, but all processes see the same interleaving of operations
- Nothing is said about time, i.e., no reference to “most recent”, unlike strict consistency
- Property: if the read time is r, the write time is w, and the minimal packet transfer time between nodes is t, then r+w>=t holds (if you reduce reading time, writing time goes up)
data:image/s3,"s3://crabby-images/0e6ff/0e6fffbf1c919953645aefc5da1a476f962cd602" alt=""
Definition of Linearizability
- Like Sequential consitency
- If there is an overlap order is not important - if not it is
- In addition, if tsOP1(x) < tsOP2(y), then operation OP1(x) should precede OP2(y) in this sequence (t is defined as intervals)
- tsOP(x) denotes the timestamp assigned to operation OP that is performed on data item x, and OP is either read (R) or write (W).
- Like strict consistency, assumes global time, but not absolute global time
- Assumes processes in the system have physical clocks synchronized to within an error bounded
- If W(x)b was the most recent write operation and there is no other write operation overlapping with W(x)b, then any later read should return b
- If W(x)a and W(x)b were two most‐recent overlapping write operation, then any later read should return either a or b, not something else
- A linearizable data store is also sequentially consistent
- I.e., linearizability is more restrictive
- Difference is the ordering according to a set of synchronized clocks
data:image/s3,"s3://crabby-images/f929f/f929f1ef05abeb6ec0e8a81e6f42063694a00ee2" alt=""
Intuition for Causal Consistency
data:image/s3,"s3://crabby-images/98733/98733afe732786247e43e22ffa4e63a6fcfa1639" alt=""
- Weaker than sequential consistency
- Distinguish events that are potentially causally related and those that are not (concurrent events) Rb
- If event B is influenced (caused) by an earlier event A, causality requires that everyone first sees A then B
- Concurrent events (i.e.,writes) may be seen in a different order on different machines
- Reading of x and writing of y are potentially causally
related
- Computation of y may have depended on value of x read by P2 (written by P1); e.g., y = f(x)
- On the other hand, y may not have depended on x, yet potential causality still holds in our formalization!
- Read followed by write in the same process, the two are potentially causally related
- Example: R(x)a … W(y)b (e.g., it may be that y=f(x))
- Read is potentially causally related to write that provided the value read got
- Example: W(x)a … R(x)a
- Independent writes by two process on a variable are not causally related (they are concurrent)
- Example: W(x)a … W(x)b
- Potentially causally related writes must be seen by all processes in the same order
- Concurrent writes may be seen in a different order by different processes
data:image/s3,"s3://crabby-images/ddf66/ddf66e781d6dde0946acebb53e0cd74571a5d16c" alt=""
Definition of FIFO Consistency
- Writes by a single process are seen by all other processes in the order in which they were issued
- Writes from different processes may be seen in a different order by different processes
data:image/s3,"s3://crabby-images/fff75/fff752dd3d0ea1ecfa8702b01a5241632a6ffefe" alt=""
Summary of Consistency Models
data:image/s3,"s3://crabby-images/e5d07/e5d0745729e2f1d764af519a7be3cf7f7a9e3573" alt=""
Eventual Consistency Motivation
- So far, goal was to maintain consistency in presence of concurrent read/write operations
- There are use cases with no (few) updates to the same data, while most operations are reads
- E.g., distributed database, DNS, Web, K/V‐Stores
- Majority of operations are reads
- Writes are mostly done by central authorities (domain owners, Web masters, etc.)
- Updates are keyed and arrive from single point, only
- Few concurrent updates and updates propagate lazily
- If no updates take place for a long time, all replicas will gradually become consistent
Eventual Consistency
- Eventual consistency is desirable for large‐scale distributed systems where high availability is important
- Tends to be cheap to implement
- Constitutes a challenge for environments where strong consistency is important
- How well does eventual consistency work for mobile user? For high load of requests not suitable
- Here, we need another class of consistency, i.e., client-centric consistency models
data:image/s3,"s3://crabby-images/a0817/a081799bd6be9bbab81522b4a112e87fdd3479e3" alt=""
Client‐Centric Consistency Models
- Goal: How can we avoid system‐wide consistency by concentrating on what clients want, instead of what should be maintained by servers.
- Different client‐centric consistency models – Monotonic reads
- Monotonic writes
- Read‐your‐writes
- Write‐follows‐reads
Definition: Monotonic Reads
- A data store provides monotonic‐read consistency if the following condition holds:
- If a process reads the value of a data item x, any successive read operation on x by that process will always return that same value or a more recent value
- Monotonic‐read consistency guarantees that if a process has seen a value of x at time t, it will never see an older version of x at a later time
- The previously seen value by the process must have reached the new site, where the reader wants to read
data:image/s3,"s3://crabby-images/85057/8505719ec6bd208d4d348f86c63d3a5c2506c5d7" alt=""
Monotonic Writes
- Propagation of writes in the correct order to multiple destinations (copies of data store)
- A data store provides monotonic‐write consistency if the following condition holds:
- A write operation by a process on a data item x is completed before any successive write operation on x by the same process
- In other words, a write operation on a copy of data item x is performed only if that copy has been brought up to date
data:image/s3,"s3://crabby-images/3d169/3d1698fd60b59b4650af103eb33017fc52d770be" alt=""
Read Your Writes
- A data store provides read‐your writes consistency if the following condition holds:
- The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process
- A write operation is always completed before a successive read operation by the same process, no matter where that read operation takes place
data:image/s3,"s3://crabby-images/8700a/8700a5dd5d381bebd12a42b358ee9603c71944ad" alt=""
Writes Follow Reads
- A data store is said to provide writes‐follow‐reads consistency if the following condition holds:
- A write operation by a process on a data item x following a previous read operation on x by the same process, is guaranteed to take place on the same or a more recent value of x that was read
- E.g., … R(x) … W(x) (for W, same or more recent value of x)
- Any successive write operation by a process on a data item x will be performed on a copy of x that is up to date with the value most recently read by that process
data:image/s3,"s3://crabby-images/da81c/da81c6b2deb2a63c0201362686d288eb76a98694" alt=""
Replication Techniques Satisfying Sequential Consistency
- • Two main techniques
- Primary Backup
- Active Replication
System Model
- Processes are either clients or replicas
- Clients and replicas exchange messages through point‐to‐point links
- Processes can crash
Passive Replication: Primary Backup
-
Primary
- Receives invocations from clients and sends back the answers (clint talks always to primary)
-
Backup
- Interacts with the primary
- Is used to replace the primary when it crashes
Primary Backup Scenario
Guarantee sequential consistency: Order in which the primary receives clients’ invocations define the order of the operation on the data item(s).
data:image/s3,"s3://crabby-images/355df/355df893f09687b5c73fb79114c1d1989add66ff" alt=""
Primary Backup: Presence of Crash
-
Three scenarios
- Primary fails after the client receives the answer -> continue
- Primary fails before sending update messages -> details needed
- Primary fails after sending update messages and before receiving all the ACK messages –> leader election
- In all cases, a new primary is elected from among the backups
Primary fails after the client receives the answer
- Nothing bad happens since the client has already received the response
- A new primary is elected
data:image/s3,"s3://crabby-images/cf8b1/cf8b1c343601c6911cc2e3241bb5d6311580a31e" alt=""
Primary fails before sending update messages
- Client does not get an answer and resends the requests after a timeout
- The newly elected primary will handle the request as new
data:image/s3,"s3://crabby-images/e73f0/e73f00ad040c9a867372c6795ddd48ef722d47b5" alt=""
Primary fails after sending update messages and before receiving all ACK messages
- When a primary fails, a new primary is elected by the backup replicas
- Client does not get an answer and resends the requests after a timeout
- Since the new primary has already processed the update message, it immediately sends the response to the client without further action (we need a log file for processed requests)
data:image/s3,"s3://crabby-images/5cf4f/5cf4f5d162b7af73f38101787347fb8e3a8768e4" alt=""
Active Replication
- There is no coordinator, all replicas play the same role
- Each replica is deterministic: If all replicas start from the same state and receive the same input, they will produce the same output
- As a matter of fact, clients will receive the same response, one from each replica
data:image/s3,"s3://crabby-images/db992/db992900eba3b8ffb042cf91514faa88c2d04c27" alt=""
Gossip Architecture
- Gossip architecture aims at achieving high availability
- Does not offer strong consistencies guarantees such as sequential consistency
- Updates are propagated in a lazy fashion
- Response received by a client from a replica to its query should not be older than the last response that the client had received from any replicas
- Client and replicas are separate processes
- Replicas periodically exchange ‘gossip’ messages to convey updates to other replicas
- Front end (or client directly) communicates with an individual replica, but it may interact with other replicas, e.g., to tolerate failures
data:image/s3,"s3://crabby-images/22bf0/22bf0b72c4bce9991b5dc4ff4c5ee62f8afc7880" alt=""
Message Types in Gossip Architecture
- Query message: A message from a client to its closet replica to determine the state of the system
- Update message: A message from a client to the closest replica to update the state of the system
- Gossip message: This message contains some updates that a replica thinks that some other replica might not have received (send to a number of other replicas)