Chapter 3: Replication, CAP Flashcards

Question 1

Q

Where to Place the Replicas?

Answer

A

Within a cloud data center, replica placement is often done with respect to the network hierarchy
- One replica on another machine
- Guards against individual node failures
- One replica on another rack
- Guards against outages of the rack switch
On a global scale, replicas are often distributed with regard to the client locations
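
The data-center rule above can be sketched as a small selection function. This is only an illustration: the `Node` record, its `name` and `rack` fields, and the reading of "another machine" as "another machine on the same rack" are assumptions, not part of the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    rack: str


def place_replicas(primary: Node, nodes: list[Node]) -> list[Node]:
    """Return the primary plus one same-rack node and one node on another rack."""
    same_rack = [n for n in nodes if n.rack == primary.rack and n != primary]
    other_rack = [n for n in nodes if n.rack != primary.rack]
    placement = [primary]
    if same_rack:
        placement.append(same_rack[0])   # guards against failure of the primary node
    if other_rack:
        placement.append(other_rack[0])  # guards against failure of the rack switch
    return placement


nodes = [Node("a1", "rack-a"), Node("a2", "rack-a"), Node("b1", "rack-b")]
chosen = place_replicas(nodes[0], nodes)
```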

Question 2

Q

Who can create the copies for Replication?

Answer

A

Server-initiated replication
- Copies are created by the server when the popularity of a data item increases
- Mainly used to reduce server load
- Server decides among a set of replica servers
Client-initiated replication
- Also known as client caches
- Replica is created as a result of a client's request (the client caches the server's response)
- Server has no control of cached copy anymore
- Stale replicas handled by expiration date
- Traditional examples: Web proxies
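
A client cache with an expiration date might look roughly like the sketch below. The `ClientCache` class, the `fetch` callback standing in for the origin server, and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time


class ClientCache:
    """Client-side replica store; entries expire after ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # function that asks the origin server
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]          # not expired: served locally, possibly stale
        value = self._fetch(key)     # expired or missing: ask the server again
        self._entries[key] = (value, now + self._ttl)
        return value
```

Because the server has no control over the cached copy, a change at the server only becomes visible to this client once the entry has expired.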

Question 3

Q

What happens to the replicas when the data changes? (three options)

Answer

A

Invalidation protocols
- Inform replica servers that their replica is now invalid
- Good when many updates and few reads
Transferring the modified data among the servers
- Each server immediately receives latest version
- Good when few updates and many reads
Don't send modified data, but modification commands
- Good when commands substantially smaller than data
- Assumes that servers are able to apply commands
- Beneficial when network bandwidth is scarce
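
The three options can be contrasted with a toy replica that accepts each kind of message. The `Replica` class, its method names, and the single `"add"` command are assumptions made for this sketch.

```python
class Replica:
    """A replica holding one integer data item."""

    def __init__(self, value):
        self.value = value
        self.valid = True

    def on_invalidate(self):
        # Option 1: only learn that the local copy is outdated.
        self.valid = False

    def on_data(self, new_value):
        # Option 2: receive the full modified data.
        self.value = new_value
        self.valid = True

    def on_command(self, command, argument):
        # Option 3: receive the operation and re-execute it locally.
        if command == "add":
            self.value += argument
        else:
            raise ValueError(f"unknown command: {command}")


# The primary changes the item from 100 to 105 by executing "add 5".
r1, r2, r3 = Replica(100), Replica(100), Replica(100)
r1.on_invalidate()
r2.on_data(105)
r3.on_command("add", 5)
```

After the update, `r1` still holds the old value but knows it is invalid, while `r2` and `r3` both hold the new value.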

Question 4

Q

Explain push- and pull-based updates

Answer

A

Push-based updates (server-based protocols)
- Server pushes updates to replica servers
- Mostly used in server-initiated replica setups
- Used when high degree of consistency is needed
Pull-based updates (client-based protocols)
- Clients request updates from server
- Often used by client caches
- Good when read-to-update ratio is low

Question 5

Q

What are the issues when comparing push- and pull-based updates?

Answer

A

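State at the server
- Push: must keep track of the replicas and caches to notify; pull: none
Messages sent
- Push: updates (plus a later fetch if only invalidations are pushed); pull: poll and update
Response time at the client
- Push: immediate (or fetch time after an invalidation); pull: fetch time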
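
A rough way to see the difference in server state and message count is the sketch below. `PushServer`, `PullServer`, and the dictionary used as a replica are hypothetical, and counting one poll round trip as a single message is a simplification.

```python
class PushServer:
    def __init__(self):
        self.value = None
        self.subscribers = []     # state the server has to keep
        self.messages_sent = 0

    def subscribe(self, replica):
        self.subscribers.append(replica)

    def write(self, value):
        self.value = value
        for replica in self.subscribers:
            replica["value"] = value   # every update is pushed to every replica
            self.messages_sent += 1


class PullServer:
    def __init__(self):
        self.value = None
        self.polls_answered = 0   # no per-client state is kept

    def write(self, value):
        self.value = value

    def poll(self):
        self.polls_answered += 1
        return self.value


# Workload with a low read-to-update ratio: three writes, one read.
push, pull = PushServer(), PullServer()
replica = {"value": None}
push.subscribe(replica)
for v in ("v1", "v2", "v3"):
    push.write(v)
    pull.write(v)
pulled = pull.poll()
```

With this workload the push server sends three updates, of which only the last is ever read, while the pull server answers a single poll.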

Question 6

Q

Which two Views On Consistency exist?

Answer

A

Data-centric consistency models
- Talk about consistency from a global perspective
- Provide guarantees on how a sequence of read/write operations is perceived by multiple clients
Client-centric consistency models
- Talk about consistency from the client's perspective
- Provide guarantees on how a single client perceives the state of a replicated data item

Question 7

Q

Two groups of Data-Centric Consistency Models

Answer

A

Strong consistency models
- Operations on shared data are synchronized
- Strict consistency (related to time)
- Sequential consistency (what we are used to)
- Causal consistency (maintains only causal relations)
Weak consistency models
- Synchronization only when data is locked/unlocked
- General weak consistency
- Release consistency
- Entry consistency

Question 8

Q

Name the 5 client-centric consistency models

Answer

A

Eventual Consistency
Monotonic Reads
Monotonic Writes
Read Your Writes
Writes Follow Reads

Question 9

Q

Characteristics of Eventual Consistency (Client-Centric)

Answer

A

If no further updates occur, all replicas reach the most recent state at some point in time.
The client can read from any replica.
Most commonly used by big cloud providers.

Question 10

Q

Characteristics of Monotonic Reads (Client-Centric)

Answer

A

The client always reads from a server that has applied all writes the client has previously read, so successive reads never return an older value.
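
One way to picture this guarantee is a session that remembers the highest version it has read and rejects replicas that lag behind it. The `Session` class and the `(version, value)` tuples standing in for replicas are assumptions made for this sketch.

```python
class Session:
    """Tracks what one client has already seen."""

    def __init__(self):
        self.last_read_version = 0

    def read(self, replica):
        version, value = replica
        if version < self.last_read_version:
            # This replica has not yet applied writes the client already read.
            raise RuntimeError("replica is behind this session's previous reads")
        self.last_read_version = version
        return value


session = Session()
first = session.read((2, "b"))      # client now depends on version 2
try:
    session.read((1, "a"))          # an older replica must not be used
    rejected = False
except RuntimeError:
    rejected = True
later = session.read((3, "c"))      # a newer replica is fine
```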

Question 11

Q

Characteristics of Monotonic Writes (Client-Centric)

Answer

A

A client can only write to a server on which all of its previous writes have been completed.

Question 12

Q

Characteristics of Read Your Writes (Client-Centric)

Answer

A

All write operations of a client will always be seen by a successive read operation of the same client.
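
A simple way to enforce this is to route a client's reads to a replica that has already applied the client's latest write. The helper `pick_replica` and the version numbers per replica are hypothetical; real systems may instead wait for the chosen replica to catch up.

```python
def pick_replica(replicas, min_version):
    """Return the name of the first replica that has applied at least min_version."""
    for name, version in replicas.items():
        if version >= min_version:
            return name
    return None  # no replica has caught up yet


replicas = {"eu": 4, "us": 7}    # replica name -> highest applied write version
last_write_version = 6           # version assigned to the client's latest write
target = pick_replica(replicas, last_write_version)
```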

Question 13

Q

Characteristics of Writes Follow Reads (Client-Centric)

Answer

A

A write by a client that follows its own read of a data item is performed only on replicas that already hold the value that was read, or a more recent one.

Question 14

Q

General Remark on Consistency

Answer

A

In general, the stricter the consistency model, the more it impacts the scalability of a system
- More consistency requires more synchronization
- While the data is synchronized, some client requests may not be answered
- Databases of the 80s and 90s put strong emphasis on consistency and lived with limited scalability/availability
- Today's cloud databases often sacrifice consistency in favor of scalability and availability

Question 15

Q

Explain Brewer’s CAP theorem

Answer

A

In a distributed system, it is impossible to provide all three of the following guarantees at the same time:
- Consistency: A read from one node after a write to another node returns something no older than what was written
- Availability: Every non-failing node sends a proper response (no error or timeout)
- Partition tolerance: Keep the promise of either consistency or availability in case of a network partition
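
The choice a node faces during a partition can be sketched as follows. The `Node` class, the `"CP"`/`"AP"` mode labels, and the `Unavailable` exception are illustrative assumptions; the sketch only models a single read.

```python
class Unavailable(Exception):
    pass


class Node:
    def __init__(self, mode):
        self.mode = mode            # "CP" favors consistency, "AP" favors availability
        self.value = "old"
        self.partitioned = False

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Cannot confirm that the local value is the latest one: refuse to answer.
            raise Unavailable("cannot reach the other replicas")
        return self.value           # AP: always answers, but the value may be stale


cp, ap = Node("CP"), Node("AP")
cp.partitioned = ap.partitioned = True   # a write of "new" happened on the other side
ap_answer = ap.read()
try:
    cp.read()
    cp_answered = True
except Unavailable:
    cp_answered = False
```

During the partition the AP node returns a possibly stale value, while the CP node gives up availability instead of risking an outdated answer.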

Question 16

Q

What does ACID stand for in the context of databases?

Answer

A

Atomicity (a transaction completes entirely or not at all)
Consistency (a transaction preserves consistency)
Isolation (concurrent transactions are separated from each other)
Durability (committed changes are permanent)

Question 17

Q

Why Replication and not only Partitioning?

Answer

A

Partitioning helps with scalability but not availability
- Data is only stored in one location
- If that machine goes down, the data is unavailable
Replication can improve both scalability and availability!
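
The availability gain can be estimated with a back-of-the-envelope calculation. The function below assumes that nodes fail independently and that one reachable replica is enough to serve the data; both are simplifying assumptions, and the 99% figure is only an example.

```python
def availability(p_node_up: float, replicas: int) -> float:
    """Probability that at least one of the replicas is up."""
    return 1 - (1 - p_node_up) ** replicas


single = availability(0.99, 1)   # partitioning only: one copy per data item
triple = availability(0.99, 3)   # three replicas of each data item
```

Under these assumptions a single copy is available about 99% of the time, while three replicas push the estimate to roughly 99.9999%.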

Question 18

Q

Characteristics of Strict Consistency (Data-Centric)

Answer

A

Any read operation returns the value stored by the most recent write operation.

Question 19

Q

Characteristics of Sequential Consistency (Data-Centric)

Answer

A

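All clients observe the same single interleaving of all operations, and within it each client's operations appear in the order the client issued them. Every read returns the most recent write in that order.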

Question 20

Q

Characteristics of Causal Consistency (Data-Centric)

Answer

A

Writes that are potentially causally related must be seen by all clients in the same order. Concurrent writes may be seen in a different order by different clients.
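
Whether two writes are causally related or concurrent is often decided with vector clocks. The chapter does not name this technique, so the sketch below is an assumption about how the distinction could be made; the helper names are hypothetical.

```python
def happened_before(a, b):
    """True if the write with vector clock a causally precedes the one with clock b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """True if neither write causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


w1 = (1, 0)   # first write by client 0
w2 = (1, 1)   # write by client 1 after it has seen w1
w3 = (0, 1)   # write by client 1 that has not seen w1
```

Here `w1` must be ordered before `w2` on every replica, whereas `w1` and `w3` are concurrent and may be applied in either order.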