Distributed Systems Flashcards
CAP Theorem
A distributed system can provide only 2 of the following 3 guarantees:
- Consistency
- Availability
- Partition tolerance
What does consistency in the CAP theorem mean?
Every read receives the most recent write or an error
This is different from consistency in ACID. ACID consistency means every transaction brings the database from one valid state to another.
What does availability mean?
Every request receives a non-error response, though not necessarily one containing the most recent write
What is partition tolerance?
The ability of a distributed database to continue processing data even if a network partition causes communication errors between subsystems.
5 Nines Uptime
About 5 minutes downtime per year
3 Nines Uptime
< 9 hours of downtime per year
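The downtime figures above follow from simple arithmetic; a quick sketch, assuming a 365-day year:

```python
# Maximum downtime per year implied by a given number of "nines" of uptime.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def downtime_minutes(nines: int) -> float:
    """Return maximum downtime in minutes/year for `nines` nines of uptime."""
    availability = 1 - 10 ** -nines   # e.g. 5 nines -> 0.99999
    return MINUTES_PER_YEAR * (1 - availability)

print(f"3 nines: {downtime_minutes(3):.1f} min/year")  # 525.6 min, ~8.8 hours
print(f"5 nines: {downtime_minutes(5):.2f} min/year")  # ~5.26 min
```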
System Design
- Requirements clarification
- System interface definition
- Back of the envelope estimation
- Defining the data model
- High level design
- Component design
- Identifying and resolving bottlenecks
Redundancy
- Duplication of data or services with the goal of increased reliability
- Removes single points of failure and provides backups
- Can also enable a shared-nothing architecture where each node operates independently of the others. This makes it easier to add new servers, and the system is more resilient to the failure of any single component (e.g. a database).
Replication
Process of synchronising state between nodes
Active replication - each message goes to each node
Passive replication - leader/follower relationship (eventual consistency). All writes go to the leader but reads are allowed from the followers. If the leader goes down, a follower gets promoted.
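The passive (leader/follower) flow can be sketched minimally; the class and method names here are illustrative, not from any real library:

```python
# Minimal sketch of passive replication: writes go to the leader, which
# propagates them to followers; reads can be served by any follower.

class Node:
    def __init__(self, name: str):
        self.name = name
        self.data: dict = {}

class Leader(Node):
    def __init__(self, name: str, followers: list):
        super().__init__(name)
        self.followers = followers

    def write(self, key, value):
        self.data[key] = value
        # In a real system this replication is asynchronous, so followers
        # may lag the leader -- which is why reads are eventually consistent.
        for follower in self.followers:
            follower.data[key] = value

followers = [Node("f1"), Node("f2")]
leader = Leader("leader", followers)
leader.write("user:1", "alice")

# Once the write has propagated, any follower can serve the read.
print(followers[0].data["user:1"])  # alice
```

If the leader fails, one of the followers is promoted to become the new leader.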
What’s wrong with key % n?
- Not horizontally scalable
- Every time you add a server, the data will need to be repartitioned - all keys will need to be remapped - which almost always involves downtime
- Not load balanced since it assumes uniformly distributed data/behavior
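A quick illustration of the remapping problem, assuming 1,000 integer keys and growth from 4 to 5 servers:

```python
# With key % n assignment, changing n moves almost every key: only keys
# where key % 4 == key % 5 (4 out of every 20 keys) stay put.

keys = range(1000)

def assignments(n: int) -> dict:
    return {k: k % n for k in keys}

before, after = assignments(4), assignments(5)
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved}/1000 keys remapped")  # 800/1000 keys remapped
```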
What is Consistent Hashing?
- Minimizes reorganization when nodes are added or removed
- When the hash table is resized (a node is added or removed from the system), on average only k/n keys need to be remapped, where k is the total number of keys and n is the number of nodes.
- When a node is removed from the system, the objects on that host are redistributed among the remaining hosts.
- When a node is added, it takes objects from a few nodes without touching all nodes
How does Consistent Hashing Work?
Like a typical hash function, consistent hashing maps a key to an integer. Suppose the output of the hash function is in the range [0, 256). Imagine the integers in this range placed on a ring so that the values wrap around.
Here’s how consistent hashing works:
Given a list of cache servers, hash them to integers in the range.
To map a key to a server:
Hash the key to a single integer.
Move clockwise on the ring from that point; the first cache encountered owns the key.
To add a new server, say D, keys that were originally residing at C will be split. Some of them will be shifted to D, while other keys will not be touched.
To remove a cache, or when a cache fails (say A), all keys that were originally mapped to A fall to the next cache B. Only those keys need to move to B; other keys are unaffected.
For load balancing, as noted earlier, real data is essentially randomly distributed and thus may not be uniform, which can leave the keys on the caches unbalanced.
To handle this issue, we add “virtual replicas” for caches. Instead of mapping each cache to a single point on the ring, we map it to multiple points on the ring, i.e. replicas. This way, each cache is associated with multiple portions of the ring.
If the hash function “mixes well,” as the number of replicas increases, the keys will be more balanced.
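The whole scheme can be sketched in a few lines; this is a minimal illustration, assuming MD5 as the hash function and 100 virtual replicas per node (both arbitrary choices):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent-hash ring with virtual replicas for each node."""

    def __init__(self, replicas: int = 100):
        self.replicas = replicas
        self.ring: list = []  # sorted list of (hash point, node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        # Each node is placed at `replicas` points on the ring.
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}:{i}"), node))

    def remove_node(self, node: str) -> None:
        self.ring = [(p, n) for p, n in self.ring if n != node]

    def get_node(self, key: str) -> str:
        # Move clockwise: first ring point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self.ring, (self._hash(key),))
        if idx == len(self.ring):
            idx = 0
        return self.ring[idx][1]

ring = ConsistentHashRing()
for node in ("A", "B", "C"):
    ring.add_node(node)

before = {k: ring.get_node(k) for k in map(str, range(1000))}
ring.add_node("D")
after = {k: ring.get_node(k) for k in before}
moved = sum(1 for k in before if before[k] != after[k])
print(f"{moved}/1000 keys moved")  # only the keys claimed by D move
```

Note that every key that moves lands on the new node D, and the fraction that moves is roughly k/n, unlike the `key % n` scheme where nearly everything is remapped.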