Challenges with Relational Database Flashcards
Throw hardware at the problem by adding more memory, adding faster processors, and upgrading disks
Vertical Scaling
What happens when you max out on vertical scaling? And what are the new challenges you face because of that?
Add hardware in the form of additional boxes in a database cluster. Now we have new problems like:
- Data Replication
- Consistency during regular usage and failover times
What is the problem that arises when keeping a caching layer in front of a database?
Consistency problem between updates in cache and updates in database, which is even more amplified in a cluster.
2 main reasons why relational databases were successful?
- SQL
2. Transactions and ACID
What is atomic in ACID?
All or nothing. The transfer requires a debit from one account and a credit to another account. This operation cannot be subdivided; they must both succeed.
What is consistent in ACID?
Data moves from one correct state to another correct state. if a transaction attempts to delete a customer and their order history, it cannot leave order rows that reference the deleted customer’s primary key; this is an inconsistent state
What is isolate in ACID?
Transactions executing concurrently will not become entangled with each other; they each execute in their own space. If two different transactions attempt to modify the same data at the same time, then one of them will have to wait for the other to complete.
What is durable in ACID?
Once a transaction has succeeded, the changes will not be lost.
How do we handle transactions spanning multiple shards of a database?
A transaction manager to orchestrate across the multiple nodes.
What is 2 phase commit?
It is used to achieve consensus in distributed systems with 2 sets of interactions between hosts known as Prepare phase and commit phase.
What kind of operations is suitable for 2 PC?
It is useful only for operations that can complete very quickly
How does 2 phase commit works?
Two-phase commit blocks. clients (“competing consumers”) must wait for a prior transaction to finish before they can access the blocked resource.
The protocol will wait for a node to respond, even if it has died. Possible to avoid waiting forever in this event, because a timeout can be set that allows the transaction coordinator node to decide that the node isn’t going to respond and that it should abort the transaction.
Is there any chance of an infinite loop in 2 PC?
Yes. A node can send a message to the transaction coordinator node agreeing that it’s OK for the coordinator to commit the entire transaction. The node will then wait for the coordinator to send a commit response (or a rollback response if, say, a different node can’t commit); if the coordinator is down in this scenario, that node conceivably will wait forever.
How is the shortcoming of 2 PC handled?
With an idea of compensation. The operation is immediately committed, and then in the event that some error is reported, a new operation is invoked to restore the proper state.
How the database schema gets polluted with complex relationships?
For many-many relation, we have to join the table. you have to create a join table. This pollutes a pristine data model, where you’d prefer to just have students and courses. forces you to create more complex SQL statements to join these tables together. The join statements, in turn, can be slow.