6 - Consistency and Replication Flashcards
Why Replicate data?
System Performance: lower latency
System Reliability
Data Consistency
Correctness of data according to a model
Read-write conflict
A read and a write act concurrently on the same data, so different reads may return different values
Write-Write Conflict
Two concurrent write operations on the same data result in different versions
Data centric consistency model
Specifies which results of read and write operations are allowed in the presence of concurrency
Data store model
Distributed storage collection
Read/Write
Read one, write all (datastores)
Consistency
Tight Consistency
Synchronous Replication
Update every replica before the next operation
Problems of Tight Consistency
Replicas must agree on next operation
Requires a lot of communication
Delay before next operation.
Consistent Ordering
Consistency defined in terms of the order in which processes see operations
Processes work on different copies of the same data
Sequentially Consistent
The result of any execution is the same as if the operations of all processes were executed in some sequential order
The operations of each process appear in that sequence in the order it issued them
Causally Consistent
Causally related writes are seen by all processes in the same order; concurrent writes may be seen in different orders
Each process's own operations are seen by all processes in the order it issued them
FIFO Consistent
Writes issued by a single process are seen by all processes in the order they were issued; writes from different processes may be seen in different orders
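The FIFO rule can be checked mechanically. The sketch below is illustrative only: the function name `is_fifo_consistent` and the encoding of a write as a (writer, sequence number) pair are assumptions, not part of the course material.

```python
from typing import Dict, List, Tuple

Write = Tuple[str, int]  # (writer process id, per-writer sequence number)


def is_fifo_consistent(views: Dict[str, List[Write]]) -> bool:
    """Return True if every observer sees each writer's writes in issue order."""
    for observed in views.values():
        last_seen: Dict[str, int] = {}
        for writer, seq in observed:
            # Within one view, a writer's sequence numbers must strictly increase.
            if seq <= last_seen.get(writer, -1):
                return False
            last_seen[writer] = seq
    return True


# Writes from different processes may interleave differently per observer.
views_ok = {
    "P3": [("P1", 0), ("P2", 0), ("P1", 1)],
    "P4": [("P2", 0), ("P1", 0), ("P1", 1)],
}
# P4 sees P1's second write before its first, which violates FIFO consistency.
views_bad = {
    "P3": [("P1", 0), ("P1", 1)],
    "P4": [("P1", 1), ("P1", 0)],
}
```

Here `views_ok` passes even though P3 and P4 disagree on the interleaving of P1 and P2, while `views_bad` fails because one writer's own order is broken.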
Hierarchy of Consistencies
(Top Down)
Tight
Sequential
Causal
FIFO
Eventual Consistency
Weak consistency model for data with no or only rare concurrent writes.
Updates are propagated to the replicas at some point.
If no new updates occur, all replicas eventually become consistent
Critical Section
Processes must acquire a lock to enter a critical section
Continuous Consistency
Divide data into consistency units
Non-uniform consistency model
Client Centric Consistency
Consistency guaranteed for the same client
Types of continuous consistency constraints
Absolute/relative numerical deviation
Relative staleness of data
Order and number of updates
Types of client centric consistency requirements
Monotonic reads/writes
Read Your Writes
Writes Follow Reads
Monotonic Reads
If a process reads item x, any successive read of x by that process returns the same or a more recent value
Monotonic Writes
A write by a process on item x is completed before any successive write on x by the same process
Read Your Writes
The effect of a write on item x will always be seen by a successive read of x by the same process
Writes Follow Reads
A write on item x that follows a previous read of x by the same process takes place on the same or a more recent value of x
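Two of these guarantees, monotonic reads and read your writes, can be sketched with per-item version numbers. The `Replica` and `Session` classes and their method names below are hypothetical; the sketch assumes each item carries an integer version that only grows.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Replica:
    versions: Dict[str, int] = field(default_factory=dict)  # item -> version held


@dataclass
class Session:
    """Per-client record of the oldest version the client may be shown."""
    read_floor: Dict[str, int] = field(default_factory=dict)   # highest version read
    write_floor: Dict[str, int] = field(default_factory=dict)  # highest version written

    def can_read(self, replica: Replica, item: str) -> bool:
        have = replica.versions.get(item, 0)
        # Monotonic reads: never go below a version already read.
        # Read your writes: never go below a version this client wrote.
        need = max(self.read_floor.get(item, 0), self.write_floor.get(item, 0))
        return have >= need

    def note_read(self, replica: Replica, item: str) -> None:
        seen = replica.versions.get(item, 0)
        self.read_floor[item] = max(self.read_floor.get(item, 0), seen)

    def note_write(self, item: str, version: int) -> None:
        self.write_floor[item] = max(self.write_floor.get(item, 0), version)


fresh = Replica({"x": 2})
stale = Replica({"x": 1})
session = Session()
session.note_read(fresh, "x")  # the client has now seen version 2 of x
```

After the read from `fresh`, `session.can_read(stale, "x")` is false: serving the client from the stale replica would break monotonic reads, so that replica must be updated first or the read sent elsewhere.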
Lazy Updates
A server that has not been accessed recently can be brought up to date before it replies (lazy update)
Permanent Replicas
Initial set usually small
At one location or mirrored sites
Server-initiated Replicas
Created ad-hoc
Either at one location or close to client
Clustering
Partition the space into cells and place servers in the cells with the highest demand
Client initiated replicas
Caching of data. Usually on the same LAN.
Can have limited size and become stale
Invalidation Protocol
Propagate only a notification that data was updated, so other copies are marked invalid
Passive Replication
Transfer data from one copy to another
Active Replication
Propagate update operation to other copies
Primary-based protocol: Remote Write
Write request is forwarded to a fixed remote primary, which distributes the update to the backups
Primary-based protocol: Local Write
The replica that receives the write request becomes the new primary
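The difference between the two primary-based protocols can be sketched in a few lines. This is a single-process toy under stated assumptions: `Replica`, `ReplicaGroup`, and both method names are hypothetical, updates are pushed to backups synchronously, and failures are ignored.

```python
from typing import Dict, List


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}


class ReplicaGroup:
    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = replicas
        self.primary = replicas[0]  # assumption: first replica starts as primary

    def _apply_everywhere(self, key: str, value: str) -> None:
        # The primary applies the write first, then pushes it to every backup.
        self.primary.data[key] = value
        for replica in self.replicas:
            if replica is not self.primary:
                replica.data[key] = value

    def remote_write(self, receiver: Replica, key: str, value: str) -> str:
        # Remote write: the receiver only forwards; the primary stays fixed.
        self._apply_everywhere(key, value)
        return self.primary.name

    def local_write(self, receiver: Replica, key: str, value: str) -> str:
        # Local write: the primary role moves to the receiver before the write.
        self.primary = receiver
        self._apply_everywhere(key, value)
        return self.primary.name


a, b, c = Replica("a"), Replica("b"), Replica("c")
group = ReplicaGroup([a, b, c])
handled_by_remote = group.remote_write(b, "x", "1")  # still handled by "a"
handled_by_local = group.local_write(c, "x", "2")    # "c" becomes the primary
```

In both cases every replica ends up with the same value; what changes is which replica orders the write.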
Quorum based protocols
Only a certain number of replicas (the write quorum) are written to
Read a certain number of replicas (the read quorum) and return the latest version
What condition on the quorum sizes prevents read-write conflicts?
nw + nr > n
The sum of the write and read quorums must exceed n, otherwise a read may miss the latest update
What condition on the quorum sizes prevents write-write conflicts?
nw > n/2
The write quorum must exceed n/2, otherwise two concurrent writes can produce the same version number
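Both quorum conditions are simple arithmetic checks. A minimal sketch follows, with the hypothetical helper name `quorum_is_valid` and parameters `n`, `n_r`, `n_w` standing for n, nr, and nw; the sample sizes are made up for illustration.

```python
def quorum_is_valid(n: int, n_r: int, n_w: int) -> bool:
    """Check both quorum constraints for n replicas."""
    prevents_read_write = n_r + n_w > n   # every read set overlaps every write set
    prevents_write_write = 2 * n_w > n    # any two write sets overlap
    return prevents_read_write and prevents_write_write


checks = {
    "n=12, nr=3, nw=10": quorum_is_valid(12, 3, 10),
    "n=12, nr=7, nw=6": quorum_is_valid(12, 7, 6),   # nw is not greater than n/2
    "ROWA: n=12, nr=1, nw=12": quorum_is_valid(12, 1, 12),
}
```

The second configuration fails because two writes could each reach a disjoint set of 6 replicas and assign the same version number.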
Quorum based protocols: Advantages over ROWA
ROWA requires all replicas to be available for a write, but a quorum tolerates up to n - nw unresponsive replicas for writes and n - nr for reads.
When single replicas fail or are expensive to update this may be more efficient
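The availability gain can be made concrete by counting how many replicas may be down before each operation blocks. The helper `tolerated_failures` and the sample sizes are assumptions for illustration; ROWA is treated as the special case nr = 1, nw = n.

```python
from typing import Dict


def tolerated_failures(n: int, n_r: int, n_w: int) -> Dict[str, int]:
    """Number of unresponsive replicas each operation can survive."""
    return {
        "read": n - n_r,   # a read still needs n_r live replicas
        "write": n - n_w,  # a write still needs n_w live replicas
    }


# With ROWA, a single failed replica blocks every write.
rowa = tolerated_failures(n=5, n_r=1, n_w=5)
# With majority quorums, both reads and writes survive two failures.
majority = tolerated_failures(n=5, n_r=3, n_w=3)
```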
ROWA
Read One Write All
What consistency is probably sufficient for a stock market?
Causal.
Changes in a stock's value should be seen consistently, but independent changes can be seen in different orders
What client-centric consistency is most appropriate for a mobile user’s mailbox?
All of them.
The owner should always see the same mailbox.