15. Distributed Transactions Flashcards

1
Q

How is distributed locking commonly implemented in databases?

A

Since every node contains data that is independent of any other node’s data, every node can maintain its own local lock table. Coarser-grained locks for entire tables or the whole database can either be given to all nodes containing a partition or be centralized at a predetermined node. This design keeps locking simple: each node performs two-phase locking (2PL) using its local locks, which guarantees serializability between different transactions.

When dealing with locking, deadlock is always a possibility. To determine whether deadlock has occurred in a distributed database, the waits-for graphs for each node must be unioned to find cycles as transactions can be blocked by other transactions executing on different nodes.
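The union-and-check step can be sketched in a few lines. This is only an illustration: the dictionary representation of a waits-for graph (transaction -> set of transactions it waits on) and the names `union_waits_for` and `has_deadlock` are assumptions, not part of any particular database.

```python
from collections import defaultdict

def union_waits_for(local_graphs):
    """Merge per-node waits-for graphs into one global graph."""
    merged = defaultdict(set)
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged[waiter] |= set(holders)
    return dict(merged)

def has_deadlock(graph):
    """Return True if the waits-for graph contains a cycle (depth-first search)."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = defaultdict(int)

    def visit(txn):
        color[txn] = GRAY
        for nxt in graph.get(txn, ()):
            if color[nxt] == GRAY:  # back edge: we returned to the current path
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[txn] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in list(graph))

# Hypothetical example: T1 waits for T2 on node A, T2 waits for T1 on node B.
node_a = {"T1": {"T2"}}
node_b = {"T2": {"T1"}}
print(has_deadlock(node_a), has_deadlock(node_b))       # False False
print(has_deadlock(union_waits_for([node_a, node_b])))  # True
```

Neither local graph has a cycle on its own; the deadlock only becomes visible after the union.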

2
Q

What is consensus in distributed databases?

How is it commonly implemented?

A

In a distributed database, consensus is the idea that all nodes agree on one course of action.

Consensus is commonly implemented through two-phase commit (2PC), which enforces the property that all nodes maintain the same view of the data.

It provides this guarantee by ensuring that a distributed transaction either commits or aborts on all nodes involved. If consensus is not enforced, some nodes may commit the transaction while others abort, causing nodes to have views of data at different points in time.

3
Q

Two-phase commit - 2 main types of nodes

A

1 coordinator and many participants

4
Q

Two-phase commit - names of phases

A
  1. preparation phase
  2. commit/abort phase
5
Q

Two-phase commit - preparation phase, describe steps

A
  1. Coordinator sends a prepare message to the participants, telling them to either prepare for commit or abort
  2. Participants generate a prepare or abort record and flush the record to disk
  3. Participants send a yes vote to the coordinator if the prepare record is flushed, or a no vote if the abort record is flushed
  4. Coordinator generates a commit record if it receives unanimous yes votes or an abort record otherwise, and flushes the record to disk
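The four steps above can be simulated in memory. This is a minimal sketch under stated assumptions: Python lists stand in for logs flushed to disk, direct method calls stand in for network messages, and the names `Participant`, `on_prepare`, and `run_phase_one` are hypothetical.

```python
class Participant:
    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit  # assumption: local decision is known up front
        self.log = []                 # stands in for records flushed to disk

    def on_prepare(self, txn_id, coordinator_id):
        # Step 2: write and flush a prepare or abort record before voting.
        record = "PREPARE" if self.can_commit else "ABORT"
        self.log.append((record, txn_id, coordinator_id))
        # Step 3: vote only after the record is durable.
        return "YES" if self.can_commit else "NO"

def run_phase_one(txn_id, coordinator_id, coordinator_log, participants):
    # Step 1: send prepare to every participant and collect the votes.
    votes = [p.on_prepare(txn_id, coordinator_id) for p in participants]
    # Step 4: commit only on unanimous YES, then flush the decision record.
    decision = "COMMIT" if all(v == "YES" for v in votes) else "ABORT"
    coordinator_log.append((decision, txn_id, [p.name for p in participants]))
    return decision

coordinator_log = []
participants = [Participant("P1", True), Participant("P2", False)]
print(run_phase_one("T1", "C", coordinator_log, participants))  # ABORT
```

The prepare record carries the coordinator ID and the decision record carries the participant IDs, which is what the recovery cases rely on to know whom to contact.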
6
Q

Two-phase commit - commit/abort phase, describe steps

A
  1. Coordinator broadcasts (sends a message to every participant) the result of the commit/abort vote based on flushed record (see preparation phase).
  2. Participants generate a commit or abort record based on the received vote message and flush record to the disk
  3. Participants send an ACK (acknowledgment) message to the coordinator
  4. Coordinator generates an end record once all ACKs are received and flushes the record sometime in the future
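The second phase can be sketched the same way. Again this is only an illustration: lists stand in for on-disk logs, a loop stands in for the broadcast, and `run_phase_two` and its parameters are hypothetical names.

```python
def run_phase_two(decision, txn_id, coordinator_log, participant_logs):
    """Broadcast the decision, collect ACKs, then write the end record."""
    acks = 0
    for log in participant_logs.values():
        # Steps 1-2: the participant receives the decision and flushes
        # a matching commit or abort record.
        log.append((decision, txn_id))
        # Step 3: the participant acknowledges the decision.
        acks += 1
    # Step 4: the end record is written only once every ACK has arrived.
    # It does not need to be flushed immediately.
    if acks == len(participant_logs):
        coordinator_log.append(("END", txn_id))
    return acks

coordinator_log = [("COMMIT", "T1")]
participant_logs = {"P1": [("PREPARE", "T1")], "P2": [("PREPARE", "T1")]}
run_phase_two("COMMIT", "T1", coordinator_log, participant_logs)
print(coordinator_log[-1])  # ('END', 'T1')
```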
7
Q

Depict the scheme of two-phase commit

A
8
Q

Two-phase commit, what will happen if:

Participant is recovering, and sees no prepare record.

A

– This probably means that the participant has not even started 2PC yet – and if it has, it hasn’t yet sent out any vote messages (since votes are sent after flushing the log record to disk).

– Since it has not sent out any vote messages, it aborts the transaction locally. No messages need to be sent out (the participant has no knowledge of the coordinator ID).

9
Q

Two-phase commit, what will happen if:

Participant is recovering, and sees a prepare record.

A

– A lot of things could have happened between logging the prepare record and crashing – for instance, we don’t even know if we managed to send out our YES vote!

– Specifically, we don’t know whether or not the coordinator made a commit decision. So the participant node’s recovery process must ask the coordinator whether a commit happened (“Did the coordinator log a commit?”). The coordinator can be determined from the coordinator ID stored in the prepare log record.

– The coordinator will respond with the commit/abort decision, and the participant resumes 2PC from phase 2.

10
Q

Two-phase commit, what will happen if:

Coordinator is recovering, and sees no commit record.

A

– The coordinator crashed at some point before receiving the votes of all participants and logging a commit decision.

– The coordinator will abort the transaction locally. No messages need to be sent out (the coordinator has no knowledge of the participant IDs involved in the transaction).

– If the coordinator receives an inquiry from a participant about the status of the transaction, respond that the transaction aborted.

11
Q

Two-phase commit, what will happen if:

Coordinator is recovering, and sees a commit record.

A

– We’d like to commit, but we don’t know if we managed to tell the participants.

– So, rerun phase 2 (send out commit messages to participants). The participants can be determined from the participant IDs stored in the commit log record.

12
Q

Two-phase commit, what will happen if:

Participant is recovering, and sees a commit record.

A

– We did all our work for this commit, but the coordinator might still be waiting for our ACK, so send ACK to coordinator. (The coordinator can be determined from the coordinator ID stored in the commit log record.)

13
Q

Two-phase commit, what will happen if:

Coordinator is recovering, and sees an end record.

A

– This means that everybody already finished the transaction and there is no recovery to do.
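The recovery cases for 2PC without presumed abort (participant or coordinator, keyed by the latest log record found for the transaction) can be collected into one lookup. This is a study aid rather than a real recovery manager; the function name `recovery_action` and the action strings are made up for illustration.

```python
def recovery_action(role, last_record):
    """Map (role, latest log record for the transaction) to the recovery step.

    last_record is None when no record for the transaction is found.
    Covers 2PC without presumed abort, commit path only.
    """
    actions = {
        ("participant", None): "abort locally, send nothing",
        ("participant", "PREPARE"): "ask coordinator for the decision, resume phase 2",
        ("participant", "COMMIT"): "send ACK to coordinator",
        ("coordinator", None): "abort locally, answer inquiries with abort",
        ("coordinator", "COMMIT"): "rerun phase 2: resend commit to participants",
        ("coordinator", "END"): "nothing to do",
    }
    return actions[(role, last_record)]

print(recovery_action("participant", "PREPARE"))
print(recovery_action("coordinator", "END"))
```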

14
Q

Two-phase commit with presumed abort - explain

A

It turns out that two-phase commit still works if

  1. Everybody assumes that no log records means abort
  2. abort records never have to be flushed – not in phase 1 or phase 2, not by the participant or the coordinator.

This optimization is called presumed abort
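A small sketch of the two rules, assuming logs are lists of `(record_type, txn_id)` tuples; `must_flush` and `answer_inquiry` are hypothetical helper names, not part of any real system.

```python
def must_flush(record_type, presumed_abort):
    """Return True if the record has to be forced to disk right away."""
    if record_type == "ABORT":
        # Rule 2: with presumed abort, abort records never need flushing.
        return not presumed_abort
    # Prepare and commit records are always flushed; the end record
    # can be flushed lazily at some later point.
    return record_type in {"PREPARE", "COMMIT"}

def answer_inquiry(coordinator_log, txn_id):
    """Coordinator's reply when a participant asks for a transaction's outcome."""
    for record_type, logged_txn in coordinator_log:
        if logged_txn == txn_id and record_type == "COMMIT":
            return "COMMIT"
    # Rule 1: no commit record (including no records at all) means abort.
    return "ABORT"

print(must_flush("ABORT", presumed_abort=True))    # False
print(answer_inquiry([("COMMIT", "T1")], "T2"))    # ABORT
```

The second function also reflects the rule that the recovery decision is commit if and only if the coordinator has logged a commit record.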

15
Q

Two-phase commit, what will happen (with and without presumed abort) if:

Participant is recovering, and sees no phase 1 abort record.

A

– Without presumed abort: This probably means that the participant has not even started 2PC yet – and if it has, it hasn’t yet sent out any vote messages (since votes are sent after flushing the log record to disk).

– With presumed abort: It is possible that the participant decided to abort and sent a “no” vote to the coordinator before the crash.

– With or without presumed abort, the participant aborts the transaction locally. No messages need to be sent out (the participant has no knowledge of the coordinator ID).

16
Q

Two-phase commit, what will happen (with and without presumed abort) if:

Participant is recovering, and sees a phase 1 abort record

A

– Without presumed abort: Abort the transaction locally and send “no” vote to the coordinator. (The coordinator can be determined from the coordinator ID stored in the abort log record.)

– With presumed abort: Abort the transaction locally. No messages need to be sent out! (The coordinator will timeout after not hearing from the participant and presume abort.)

17
Q

Two-phase commit, what will happen (with and without presumed abort) if:

Coordinator is recovering, and sees no abort record

A

– Without presumed abort: The coordinator crashed at some point before reaching a commit/abort decision.

– With presumed abort: It is possible that the coordinator decided to abort and sent out abort messages to the participants before the crash.

– With or without presumed abort, the coordinator will abort the transaction locally. No messages need to be sent out (the coordinator has no knowledge of the participant IDs involved in the transaction).

– If the coordinator receives an inquiry from a participant about the status of the transaction, respond that the transaction aborted.

18
Q

Two-phase commit, what will happen (with and without presumed abort) if:

Coordinator is recovering, and sees an abort record

A

– Without presumed abort: Rerun phase 2 (sending out abort messages to participants). The participants can be determined from the participant IDs in the abort log record.

– With presumed abort: Abort the transaction locally. No messages need to be sent out! (Participants who don’t know the decision will ask the coordinator later.)

19
Q

Two-phase commit, what will happen (with and without presumed abort) if:

Participant is recovering, and sees a phase 2 abort record.

A

– Without presumed abort: Abort the transaction locally, and send back ACK to coordinator. (The coordinator can be determined from the coordinator ID stored in the abort log record.)

– With presumed abort: Abort the transaction locally. No messages need to be sent out! (ACKs only need to be sent back on commit.)

20
Q

When is the 2PC recovery decision to commit?

A

The 2PC recovery decision is commit if and only if the coordinator has logged a commit record.

21
Q

What happens with 2PC if some node fails and doesn’t recover at all?

A

Since 2PC requires unanimous agreement, it will only make progress if all nodes are alive. This is true for the recovery protocol as well – for recovery to finish, all failed nodes must eventually come back alive.

If the coordinator believes a participant is dead, it can respawn the participant on a new node based on the log of the original participant, and ignore the original participant if it does come back online.

However, 2PC struggles to handle scenarios where the coordinator is dead.