Distributed Databases Flashcards

Question

Recovery in DDBMS

Answer 1

Global transactions are fragile: a fault at one node can mean the whole transaction has to be rolled back at every node. If some node has already committed, that rollback cannot happen, which breaks atomicity. Therefore we use distributed commit protocols.

Answer 2

Uses a coordinator that executes at some node and decides whether and when the local transactions can commit. Each node logs locally the messages it sends to and receives from other nodes.

Answer 3

Phase 1 -> Decide whether to commit or abort. Phase 2 -> Carry out that decision: either commit or abort.

Answer 4

After receiving a message from the coordinator, say prepare T, each node decides individually whether it is ready to commit:
- if the node is ready, it goes into a pre-committed state and sends back ready T
- if the node has to abort, it sends back don't commit T to the coordinator and aborts the local transaction

Answer 5

Once phase 1 is successful, meaning every node has replied ready T, the coordinator sends the message commit T to all nodes. If the coordinator receives at least one don't commit T message, or hears nothing at all from some node for some time, it sends the message abort T to all nodes.
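
The coordinator's choice between commit T and abort T can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `coordinator_decision`, the message strings, and the use of `None` to stand for a node that did not answer in time are all made up for the example.

```python
from typing import Dict, Optional

READY = "ready T"
DONT_COMMIT = "don't commit T"


def coordinator_decision(votes: Dict[str, Optional[str]]) -> str:
    """Return the phase 2 message, given the phase 1 replies.

    votes maps a node name to its reply. None stands for a node that
    sent nothing before the timeout (an assumption of this sketch).
    """
    # Commit only if there is at least one node and every node said ready T.
    if votes and all(vote == READY for vote in votes.values()):
        return "commit T"
    # One don't commit T, or one silent node, is enough to abort.
    return "abort T"


print(coordinator_decision({"n1": READY, "n2": READY}))
print(coordinator_decision({"n1": READY, "n2": DONT_COMMIT}))
print(coordinator_decision({"n1": READY, "n2": None}))
```

The first call yields commit T; the other two yield abort T, since a missing reply is treated the same as a don't commit T.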

Answer 6

The coordinator's stages:
- log prepare T
- send prepare T as a message to the nodes
Each node can then reply either don't commit T or ready T:
- if the node decides don't commit, it first logs this and then sends don't commit T to the coordinator, which will eventually instruct the nodes to abort T
- if the node is ready, it logs this and then sends ready T to the coordinator

Answer 7

If some node does not respond, or has responded don't commit T, the coordinator logs abort T and sends it to all nodes. Otherwise, it logs commit T and sends it to all nodes.
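
The log-then-send order in both phases can be sketched as follows. This is a simplified, single-process illustration, not a real protocol implementation: the `Participant` class, the `run_two_phase_commit` function, and the in-memory lists used as logs are hypothetical, and timeouts and crashes are not modelled.

```python
from typing import List, Tuple


class Participant:
    """A node taking part in the commit of transaction T (hypothetical)."""

    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit
        self.log: List[str] = []  # stands in for the node's local log

    def on_prepare(self) -> str:
        reply = "ready T" if self.can_commit else "don't commit T"
        self.log.append(reply)  # log first ...
        return reply            # ... then send the reply

    def on_decision(self, decision: str) -> None:
        self.log.append(decision)  # record the coordinator's decision


def run_two_phase_commit(nodes: List[Participant]) -> Tuple[str, List[str]]:
    coordinator_log: List[str] = []

    # Phase 1: log prepare T, then send it and collect the replies.
    coordinator_log.append("prepare T")
    replies = [node.on_prepare() for node in nodes]

    # Phase 2: log the decision, then send it to every node.
    all_ready = all(reply == "ready T" for reply in replies)
    decision = "commit T" if all_ready else "abort T"
    coordinator_log.append(decision)
    for node in nodes:
        node.on_decision(decision)
    return decision, coordinator_log


nodes = [Participant("n1", True), Participant("n2", False)]
decision, coordinator_log = run_two_phase_commit(nodes)
print(decision, coordinator_log, nodes[0].log, nodes[1].log)
```

Because every message is logged before it is sent, a node that restarts can read its own log to see how far it had got with T.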

Answer 8

The issue with two-phase commit is that the coordinator and some node may both crash while every other node is in the pre-committed state. The surviving nodes cannot tell which decision was made, so whether they commit or abort, they could contradict the crashed node and break atomicity. Therefore, we split phase 2 into two parts:
- Phase 2(a) -> prepare to commit. The coordinator sends the decision to all nodes, and the nodes go into the prepare to commit state.
- Phase 2(b) -> the old phase 2.
Essentially, if the coordinator goes down before sending prepare to commit, the nodes receive nothing, and therefore they know to abort.

Answer 9

The issue with query processing in a distributed database is that the data a query needs may be stored only at other nodes. It therefore has to be requested over the network, which makes queries slow. To reduce the amount of data that has to be transferred, we use joins to help us, specifically a semi-join.
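
One common way a semi-join helps here is to ship only the join-column values to the remote node, let it filter its table, and ship back just the matching rows. A minimal sketch, assuming two sites A and B, two made-up tables R and S of (key, payload) tuples, and plain Python lists in place of real network transfers:

```python
# Hypothetical data: R lives at site A, S lives at site B.
R = [(1, "a"), (2, "b"), (3, "c")]
S = [(2, "x"), (3, "y"), (4, "z"), (5, "w")]


def project_keys(rows):
    """Join-column values only: this is what site A ships to site B."""
    return {row[0] for row in rows}


def semi_join(rows, keys):
    """Rows whose key would find a join partner: computed at site B."""
    return [row for row in rows if row[0] in keys]


def join(left, right):
    """Plain equi-join on the key column: computed at site A."""
    return [(k, a, b) for (k, a) in left for (k2, b) in right if k == k2]


keys_of_R = project_keys(R)          # shipped A -> B
reduced_S = semi_join(S, keys_of_R)  # shipped B -> A, instead of all of S
result = join(R, reduced_S)

print(len(S), "rows in S, but only", len(reduced_S), "shipped")
print(result)
```

The final join gives the same rows as joining R with all of S, while only the rows of S that can actually match are sent between sites.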

Answer 10

⋉ The semi-join operator. It returns the rows of one table that would take part in a join with the other table. Which table's rows are returned depends on whether we use a left or a right semi-join.

Answer 11

R ⋉ S Means that all rows of R that would join with some row in S are returned.

Answer 12

R ⋊ S Means that all rows of S that would join with some row in R are returned.
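
The two directions can be compared on a small example. This is a minimal sketch: the tables, the column names, and the helper functions `left_semi_join` and `right_semi_join` are assumptions made for illustration, with rows held as Python dicts and the join done on a single named column.

```python
def left_semi_join(r, s, key):
    """R ⋉ S: rows of r that match at least one row of s on `key`."""
    s_keys = {row[key] for row in s}
    return [row for row in r if row[key] in s_keys]


def right_semi_join(r, s, key):
    """R ⋊ S: rows of s that match at least one row of r on `key`."""
    return left_semi_join(s, r, key)


# Hypothetical tables.
R = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
S = [
    {"id": 2, "city": "Oslo"},
    {"id": 3, "city": "Rome"},
    {"id": 2, "city": "Lima"},
]

print(left_semi_join(R, S, "id"))
print(right_semi_join(R, S, "id"))
```

Bob's row appears once in R ⋉ S even though it matches two rows of S, and only columns of the returned table are present; that is what distinguishes a semi-join from a full join.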


(36 cards)