Module 10c - Distributed Commit and Checkpoints Flashcards

1
Q

ACID is a collection of properties that we expect in a database. What does ACID stand for? What does each letter mean?

A

Atomicity - All the updates take effect or none of them do

Consistency - Constraints are preserved

Isolation - Concurrent transactions are unaware of each other

Durability - Updates made by committed transactions are not lost in the event of a failure

2
Q

The distributed commit problem concerns transaction _______ in a ________ environment

A

atomicity

distributed

3
Q

Achieving atomicity is easy in a centralized database, but it is harder in a distributed database. Why?

A

Different components of the database can fail independently during a transaction, which makes it difficult to ensure that the updates take effect at either all of them or none of them

4
Q

The coordinator is a _______ node which is responsible for ensuring ________

A

separate/independent

atomicity

5
Q

In a distributed database system, transaction commitment is initiated by a _________ after the transaction _________ phase, during which each _________ discovers whether it is able to commit the transaction or not

A

coordinator
execution
participant

6
Q

How do participants tell the coordinator whether a transaction should commit or abort?

A

The coordinator asks each participant to vote on whether the transaction should commit or abort globally

7
Q

What happens after the coordinator collects the votes from the participants?

A

The coordinator computes the global decision after collecting votes, and then shares the decision with the participants

8
Q

What must happen to the transaction if ANY of the participants vote to abort?

A

The transaction must abort globally whenever ANY participant votes to abort

9
Q

If there are no failures in the system, and every participant votes to commit, what must happen?

A

The transaction must commit globally whenever all participants vote to commit

10
Q

Two-phase commit (2PC) is a common solution based on two phases.

What happens during phase 1 of 2PC?

A

Phase 1:

Coordinator asks participants whether they are ready to commit. Participants respond with votes

11
Q

Two-phase commit (2PC) is a common solution based on two phases.

What happens during phase 2 of 2PC?

A

Phase 2:
Coordinator examines votes, decides the outcome of the transaction, and shares the decision with the participants. If all participants vote to commit, then the transaction is committed.

Otherwise it is aborted
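
The voting rule can be written as a small function. This is a minimal Python sketch rather than course material: the names `Vote` and `global_decision` are hypothetical, and a missing vote (for example, after a timeout) is assumed to count as an abort.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "vote commit"
    ABORT = "vote abort"


def global_decision(votes, num_participants):
    """Return the global outcome for the votes collected in phase 1.

    Assumption: fewer votes than participants means some vote never
    arrived, which is treated the same as an abort vote.
    """
    everyone_voted = len(votes) == num_participants
    if everyone_voted and all(v is Vote.COMMIT for v in votes):
        return "global commit"
    return "global abort"
```

A single `Vote.ABORT`, or a single missing vote, is enough to abort the whole transaction.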

12
Q

What are the 4 assumptions made with Two-phase commit (2PC)?

A
  • Synchronous processes (can be executed without interruption from start to finish)
  • Bounded communication delays
  • Crash-recovery failures
  • Processes have access to stable storage for logging recovery information
13
Q

What does the state transition diagram look like for the coordinator in Two-Phase commit (2PC)?

A

https://media.discordapp.net/attachments/213179086351106048/1003046928721924207/unknown.png

14
Q

What does the state transition diagram look like for the participant in Two-Phase commit (2PC)?

A

https://media.discordapp.net/attachments/213179086351106048/1003047663886929960/unknown.png

15
Q

In 2PC, what is the set of actions performed by a coordinator that can be used to create a coordinator program?

A
  1. Multicast “vote request” to all participants
  2. Listen for votes from all the participants and record each incoming vote
  3. If a timeout occurs before all votes arrive, multicast “global abort” to all participants and exit the program/process
  4. If all participants sent “vote commit” to the coordinator AND the coordinator votes “commit”, then multicast “global commit” to all participants
  5. Otherwise (if #4 does not hold), multicast “global abort” to all participants
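
The coordinator steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: `inbox` is assumed to be a `queue.Queue` of `(sender, vote)` pairs, `send` is a hypothetical callable that delivers one message to one participant, the timeout is applied per received vote, and logging to stable storage is left out.

```python
import queue


def run_coordinator(participants, inbox, send, own_vote="vote commit", timeout=1.0):
    """Run one round of 2PC as the coordinator and return the global decision."""
    # Step 1: multicast the vote request.
    for p in participants:
        send(p, "vote request")

    # Step 2: collect and record one vote per participant.
    votes = {}
    while len(votes) < len(participants):
        try:
            sender, vote = inbox.get(timeout=timeout)
        except queue.Empty:
            # Step 3: a vote did not arrive in time, so abort globally.
            decision = "global abort"
            break
        votes[sender] = vote
    else:
        # Steps 4 and 5: commit only if every vote, including the
        # coordinator's own, is a commit vote.
        all_commit = all(v == "vote commit" for v in votes.values())
        if all_commit and own_vote == "vote commit":
            decision = "global commit"
        else:
            decision = "global abort"

    # Multicast the decision to all participants.
    for p in participants:
        send(p, decision)
    return decision
```

The `while ... else` form keeps the timeout path (step 3) separate from the normal decision path (steps 4 and 5): the `else` branch runs only when all votes were collected without a timeout.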
16
Q

In 2PC, what is the set of actions performed by a participant that can be used to create a participant program?

Don’t include the daemon process that executes in the participant

A
  1. Wait for a “vote request” message from the coordinator
  2. If a timeout occurs while waiting, then abort and exit the program/process
  3. Vote either commit or abort
  4. If the vote is to abort, then send “vote abort” to the coordinator, abort the transaction, and terminate the program
  5. If the vote is to commit, then send “vote commit” to the coordinator and wait for the coordinator’s global decision. Commit or abort accordingly
  6. If a timeout occurs while waiting for the coordinator’s decision, then multicast “decision request” to the other participants, wait until a decision is received, and commit or abort accordingly. Note that in this case the participant may remain blocked forever while waiting for a decision from the other participants.
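
The participant steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: `inbox` is assumed to be a `queue.Queue` that carries only coordinator messages (first the vote request, then the decision), `send_to_coordinator` and `ask_peers` are hypothetical callables, and logging to stable storage is left out.

```python
import queue


def run_participant(inbox, send_to_coordinator, ask_peers, can_commit, timeout=1.0):
    """Run one round of 2PC as a participant; return "commit" or "abort"."""
    # Steps 1 and 2: wait for the vote request, abort on timeout.
    try:
        inbox.get(timeout=timeout)
    except queue.Empty:
        return "abort"

    # Steps 3 and 4: an abort vote can be acted on immediately.
    if not can_commit:
        send_to_coordinator("vote abort")
        return "abort"

    # Step 5: vote commit, then wait for the global decision.
    send_to_coordinator("vote commit")
    try:
        decision = inbox.get(timeout=timeout)
    except queue.Empty:
        # Step 6: no decision arrived, so ask the other participants.
        # ask_peers() stands in for the "decision request" multicast
        # and may block for as long as no peer knows the outcome.
        decision = ask_peers()

    return "commit" if decision == "global commit" else "abort"
```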
17
Q

In 2PC, the participant executes a daemon process in the background in addition to its core procedure. What is the set of actions executed by this separate process?

A

Run the following in an infinite loop:

  1. Block until a “decision request” arrives from another participant
  2. When unblocked, read the most recently recorded state from the local log
  3. If the local log state is “global commit”, then send “global commit” back to the participant that sent the request
  4. If the local log state is “init” or “global abort”, then send “global abort” back to the participant that sent the request
  5. Go back to step 1
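
The loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `requests` is assumed to be a `queue.Queue` of requester identifiers, `read_log` and `reply` are hypothetical callables, and a `None` entry is used as a shutdown sentinel purely so the loop can be stopped. Sending no reply for any other logged state (such as a ready state) is also an assumption, since the steps above do not cover it.

```python
def decision_for(local_state):
    """Map the most recently logged state to the reply, or None for no reply."""
    if local_state == "global commit":
        return "global commit"
    if local_state in ("init", "global abort"):
        return "global abort"
    # Assumption: in any other state the outcome is unknown, so stay silent.
    return None


def serve_decision_requests(requests, read_log, reply):
    """Answer "decision request" messages until the sentinel None arrives."""
    while True:
        requester = requests.get()         # step 1: block for a request
        if requester is None:              # hypothetical shutdown sentinel
            break
        answer = decision_for(read_log())  # step 2: read the local log
        if answer is not None:             # steps 3 and 4: reply if known
            reply(requester, answer)
```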
18
Q

In 2PC, if a participant does not receive a commit or abort decision from the _______, within a ________ period of time, then it may learn the decision from another ________

A

coordinator
bounded
participant

19
Q

In 2PC, when a participant P tries to learn the decision (to commit or abort) from another participant Q, what action would P take based on Q’s state being:

  1. COMMIT
  2. ABORT
  3. INIT
  4. READY

Note that Q is the remote participant, whereas P is the local participant

A
  1. On commit, P will make a transition to commit
  2. On abort, P will make a transition to abort
  3. On init, P will make a transition to abort
  4. On ready, P will contact another participant

Note: whenever all participants are in state READY and they are asking around to see what the decision should be, the system will wait forever
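
The four cases can be expressed as a short lookup over the peers that P contacts. This is a minimal Python sketch: the function name `learn_decision` is hypothetical, and `peer_states` is assumed to be the list of states reported by the other participants in the order P asks them.

```python
def learn_decision(peer_states):
    """Return "COMMIT" or "ABORT" if some peer settles it, else None."""
    for state in peer_states:
        if state == "COMMIT":
            return "COMMIT"
        if state in ("ABORT", "INIT"):
            return "ABORT"
        # state == "READY": this peer does not know either, ask the next one.
    # Every peer was READY: P stays blocked until someone learns the outcome.
    return None
```

A result of `None` corresponds to the blocking case described in the note: nobody P can reach knows the decision.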

20
Q

In 2PC, if the coordinator crashes, is a participant able to make progress?

How would the participant know what the decision should be to make progress on?

A

Yes, the participant can make progress.

The participant must either have received the decision from the coordinator before the crash, or learn the decision from other participants

21
Q

In 2PC, following a coordinator crash, under what condition is it safe for a transaction to be committed?

A

If all participants voted to commit (all are in READY or COMMIT state), then it is safe to commit

22
Q

In 2PC, following a coordinator crash, under what condition is it safe for a transaction to be aborted?

A

If ANY of the participants are not in the COMMIT or READY state, then it is safe to abort

23
Q

In 2PC, what happens if both a coordinator and a participant crash?

A

A smarter implementation of 2PC is needed (outside the scope of this course)

Generally, the coordinator or the participant needs to wait for the other party to recover, reload its logs, and continue

24
Q

What is a distributed checkpoint?

A

A collection of checkpoints taken by numerous processes.

Allows inter-process communication and coordination

25
Q

What is a distributed snapshot in the context of distributed checkpoints?

A

In a distributed snapshot, if a process P has recorded the receipt of a message, then there should also be a process Q that has recorded the sending of that message

26
Q

Why do we need distributed checkpoints?

A

If a part of a distributed system or the entire distributed system fails, then the checkpoint is used for recovery

27
Q

For distributed checkpoints, in what case is recovery possible?

A

Recovery is possible only if the collection of checkpoints taken by individual processes forms a distributed snapshot

28
Q

What is the most recent distributed snapshot known as?

A

recovery line

We use it in backwards error recovery

29
Q

How do we check if two checkpoints from two processes form an inconsistent collection of checkpoints? (not a distributed snapshot)

A
  • We take the two process-local checkpoints in a distributed checkpoint.
  • On the timeline diagram of the two processes, we connect them using a dashed line
  • If there is a message that is sent on the RIGHT side of the dashed line (after the sender’s checkpoint), but is received on the LEFT side (before the receiver’s checkpoint), then the checkpoints do not form a distributed snapshot

Otherwise, we have a distributed snapshot
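
The dashed-line test can also be done programmatically. The following is a minimal Python sketch under stated assumptions: every event is stamped with its process-local time, `checkpoints` maps each process to the local time of its checkpoint, and the names `Message` and `is_distributed_snapshot` are hypothetical.

```python
from typing import NamedTuple


class Message(NamedTuple):
    sender: str
    send_time: int   # local time at the sender
    receiver: str
    recv_time: int   # local time at the receiver


def is_distributed_snapshot(checkpoints, messages):
    """Return False if a receipt is recorded without the matching send."""
    for m in messages:
        receipt_recorded = m.recv_time <= checkpoints[m.receiver]
        send_recorded = m.send_time <= checkpoints[m.sender]
        if receipt_recorded and not send_recorded:
            return False  # sent right of the line, received left of it
    return True
```

A message that is sent before the sender's checkpoint and received after the receiver's checkpoint is merely in flight and does not violate the condition.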

30
Q

In distributed checkpoints, what is the domino effect?

A

If the most recent checkpoints taken by processes do not provide a recovery line, then successively earlier checkpoints must be considered. This can cascade all the way back until the initial states are reached, making them the most recent distributed snapshot
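
The cascade can be reproduced with a small rollback search. This is a minimal Python sketch under stated assumptions: times are process-local, each process's checkpoint list is sorted and starts with `0` for its initial state, every receive time is at least `1`, and `find_recovery_line` is a hypothetical name.

```python
def find_recovery_line(checkpoints, messages):
    """Roll processes back until no receipt is recorded without its send.

    checkpoints: {process: sorted local checkpoint times, starting with 0}
    messages: list of (sender, send_time, receiver, recv_time) tuples
    """
    # Start from the most recent checkpoint of every process.
    line = {p: times[-1] for p, times in checkpoints.items()}
    changed = True
    while changed:
        changed = False
        for sender, send_time, receiver, recv_time in messages:
            if recv_time <= line[receiver] and send_time > line[sender]:
                # The receipt is recorded but the send is not: move the
                # receiver back to its latest checkpoint before the receipt.
                line[receiver] = max(
                    t for t in checkpoints[receiver] if t < recv_time
                )
                changed = True
    return line


# Each rollback exposes another unmatched receipt on the other process.
example_checkpoints = {"P": [0, 3, 6], "Q": [0, 2, 5]}
example_messages = [
    ("Q", 6, "P", 5),
    ("P", 4, "Q", 4),
    ("Q", 3, "P", 2),
    ("P", 1, "Q", 1),
]
print(find_recovery_line(example_checkpoints, example_messages))
# prints {'P': 0, 'Q': 0}
```

In the example, every checkpoint pair is invalidated by some message, so both processes end up at their initial states.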

31
Q

The “coordinated checkpointing algorithm” is used to deal with the ______ ______ in distributed checkpoints. It ensures that a ______ _____ is created

A
domino effect
recovery line (or distributed snapshot)
32
Q

What happens in the 1st phase of the “coordinated checkpointing algorithm”?

A

Phase 1:
  1. The coordinator sends a CHECKPOINT_REQUEST message to all processes
  2. A process, upon receiving this message, pauses sending new messages (it temporarily queues all outgoing messages)
  3. The process takes a local checkpoint (excluding the messages held in the queue)
  4. The process returns an acknowledgement to the coordinator that the checkpoint has been created
33
Q

What happens in the 2nd phase of the “coordinated checkpointing algorithm”?

A

Phase 2:
  1. The coordinator receives acknowledgements from all the processes that their checkpoints have been created
  2. The coordinator multicasts CHECKPOINT_DONE to all processes
  3. All processes resume sending messages (they release their queued outgoing messages)
34
Q

How does “coordinated checkpointing algorithm” ensure that there will always be a distributed snapshot created?

A

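Each process pauses sending new messages from the moment it receives CHECKPOINT_REQUEST until it receives CHECKPOINT_DONE, and CHECKPOINT_DONE is only sent once every process has taken its checkpoint. As a result, no message can be sent after the sender’s checkpoint and received before the receiver’s checkpoint, so no message crosses the dashed line connecting the checkpoints in the wrong direction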
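
Both phases can be simulated in a single Python process. This is a minimal sketch under stated assumptions, not a definitive implementation: the class `Process`, its fields, and the function `coordinated_checkpoint` are hypothetical, message delivery is modelled as appending to a list, and failures and timeouts are ignored.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.state = 0
        self.paused = False
        self.held = []        # outgoing messages queued while paused
        self.sent = []        # messages actually handed to the network
        self.checkpoint = None

    def send(self, msg):
        # While paused, outgoing messages are queued instead of sent.
        if self.paused:
            self.held.append(msg)
        else:
            self.sent.append(msg)

    def on_checkpoint_request(self):
        # Phase 1: pause sending, checkpoint local state, acknowledge.
        self.paused = True
        self.checkpoint = self.state
        return "ACK"

    def on_checkpoint_done(self):
        # Phase 2: resume sending and release the queued messages.
        self.paused = False
        self.sent.extend(self.held)
        self.held.clear()


def coordinated_checkpoint(processes):
    """Coordinator: request checkpoints, then announce completion."""
    acks = [p.on_checkpoint_request() for p in processes]
    if all(a == "ACK" for a in acks):
        for p in processes:
            p.on_checkpoint_done()
        return True
    return False
```

Any message a process tries to send between its own checkpoint and CHECKPOINT_DONE stays in `held`, so it cannot reach a process that has not checkpointed yet.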