8. The Trouble with Distributed Systems Flashcards

1
Q

Problems of distributed systems - 1. Unreliable network

A

Whenever you try to send a packet over the network, it may be lost, reordered, duplicated or arbitrarily delayed. Likewise, the reply may be lost or delayed, so if you don’t get a reply, you have no idea whether the message got through.
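
A minimal sketch of why a missing reply is ambiguous, assuming a simulated lossy link. The `Server` class, the `send` function, and the `drop_request` / `drop_reply` flags are hypothetical names invented for illustration, not part of any real library.

```python
class Server:
    """Hypothetical server that counts how many requests it has processed."""

    def __init__(self):
        self.processed = 0

    def handle(self, message):
        self.processed += 1
        return "ack"


def send(server, message, drop_request=False, drop_reply=False):
    """Deliver a message over a simulated lossy link.

    Returns the reply, or None when the caller would see only a timeout.
    """
    if drop_request:
        return None  # request lost: the server never sees it
    reply = server.handle(message)
    if drop_reply:
        return None  # reply lost: the server did the work anyway
    return reply


lost_request = Server()
lost_reply = Server()
outcome_a = send(lost_request, "write x=1", drop_request=True)
outcome_b = send(lost_reply, "write x=1", drop_reply=True)
```

Both `outcome_a` and `outcome_b` are `None`, so the sender sees the same thing in each case, yet only `lost_reply` has actually processed the write.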

2
Q

Detecting faults is hard

A

Most distributed algorithms rely on timeouts to determine whether a remote node is still available. However, a timeout can’t distinguish between a network failure and a node failure.
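
A timeout-based detector can be sketched as follows. This is an illustrative assumption, not a production design: the class name `TimeoutFailureDetector`, its methods, and the node names are hypothetical, and timestamps are passed in explicitly so the example stays deterministic.

```python
class TimeoutFailureDetector:
    """Suspects a node once no heartbeat has arrived within the timeout."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_heard = {}  # node name -> time of last heartbeat

    def record_heartbeat(self, node, now):
        self.last_heard[node] = now

    def suspected(self, node, now):
        last = self.last_heard.get(node)
        if last is None:
            return True  # never heard from this node
        return now - last > self.timeout_seconds


detector = TimeoutFailureDetector(timeout_seconds=5.0)
detector.record_heartbeat("crashed-node", now=0.0)
detector.record_heartbeat("partitioned-node", now=0.0)

# Ten seconds later neither node has been heard from again.
crashed_suspected = detector.suspected("crashed-node", now=10.0)
partitioned_suspected = detector.suspected("partitioned-node", now=10.0)
```

The detector returns the same verdict for both nodes. Nothing in its input says whether the node crashed or is merely unreachable.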

3
Q

Tolerating faults is hard

A

There is no global variable, no shared memory, no common knowledge or any other kind of shared state between the machines.

Nodes can’t even agree on what time it is, let alone on anything more profound.

The only way information can flow from one node to another is by sending it over the unreliable network. Major decisions cannot be safely made by a single node, so we require protocols that enlist help from other nodes and try to get a quorum to agree.
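
A majority quorum can be sketched as below, assuming each node reports a yes/no vote and a missing reply is recorded as `None`. The function names and the five-node example are hypothetical.

```python
def quorum_size(cluster_size):
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def quorum_agrees(votes, cluster_size):
    """Return True if a majority of the whole cluster voted yes.

    `votes` maps node name -> True, False, or None (no reply received).
    Missing replies never count towards the quorum.
    """
    yes_votes = sum(1 for vote in votes.values() if vote is True)
    return yes_votes >= quorum_size(cluster_size)


# Five nodes vote on whether to declare another node dead.
votes = {"n1": True, "n2": True, "n3": True, "n4": False, "n5": None}
declared_dead = quorum_agrees(votes, cluster_size=5)
```

Because the threshold is computed from the full cluster size, two disjoint groups of nodes cannot both reach a quorum at the same time.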

4
Q

Problems of distributed systems - 2. Unreliable clocks

A

A node’s clock may be significantly out of sync with other nodes (despite our best efforts to set up NTP), it may suddenly jump forward or back in time, and relying on it is dangerous because you most likely don’t have a good measure of your clock’s error interval.
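
One way to make the error interval explicit is to treat a clock reading as a range rather than a single point. The sketch below assumes the maximum error is known, which in practice it usually is not. `ClockReading`, `reading`, and `definitely_before` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockReading:
    """A timestamp expressed as the range in which the true time must lie."""
    earliest: float
    latest: float


def reading(local_time, max_error):
    """Build a reading from a local timestamp and an assumed error bound."""
    return ClockReading(local_time - max_error, local_time + max_error)


def definitely_before(a, b):
    """True only if the two ranges do not overlap and a comes first."""
    return a.latest < b.earliest


a = reading(100.000, max_error=0.005)
b = reading(100.003, max_error=0.005)  # ranges overlap: order unknown
c = reading(100.020, max_error=0.005)  # clearly later than a

a_before_b = definitely_before(a, b)
b_before_a = definitely_before(b, a)
a_before_c = definitely_before(a, c)
```

For `a` and `b` neither order can be established, so comparing the raw timestamps would give an answer that the clocks do not justify.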

5
Q

Problems of distributed systems - 3. Process pauses

A

A process may pause for a substantial amount of time at any point in its execution (perhaps due to a stop-the-world garbage collector), be declared dead by other nodes, and then come back to life again without realizing that it was paused.
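
The danger can be sketched with a time-limited lease and a simulated clock. `FakeClock` and `Lease` are hypothetical classes, and the pause is modelled simply by advancing the clock between the check and the action.

```python
class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class Lease:
    """A lease that is valid for a fixed duration from when it is granted."""

    def __init__(self, clock, duration):
        self.clock = clock
        self.expires_at = clock.now + duration

    def is_valid(self):
        return self.clock.now < self.expires_at


clock = FakeClock()
lease = Lease(clock, duration=10.0)

valid_when_checked = lease.is_valid()  # the node checks its lease
clock.advance(15.0)                    # simulated pause, e.g. a long GC
valid_when_acting = lease.is_valid()   # the lease expired during the pause
```

A node that acts on the result of the first check without re-validating would proceed as if it still held the lease, because from its own point of view no time has passed.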
