7. Transactions Flashcards

1
Q

Transaction meaning

A

Transaction is to simplify the programming model for applications accessing a database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Transaction isolation level

A
  1. Read committed isolation (weak)
  2. Snapshot isolation (medium)
  3. Serializable isolation (strong)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

ACID - Atomicity (Abortability)

A

No partial failure.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

ACID - Consistency

A

You have certain statements about your data that must always be true.

Atomicity, isolation and durability are properties of the database, whereas consistency is the property of the application.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

ACID - Isolation (Serializability)

A

Concurrently executing transactions are isolated from each other.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

ACID - Durability

A

Promise that once a transaction has committed successfully, any data it has written will not be forgotten.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Race condition - Dirty reads

A

One client reads another client’s writes before they have been committed.

The read committed isolation level and stronger levels precent dirty reads.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Race condition - Dirty writes

A

One client overwrites data that another client has written, but not yet committed.

Almost all transaction implementations prevent dirty writes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Race condition - Read skew (non-repeatable reads) (compared with phantom reads)

A

A client sees different parts of the database at different points in time.

e.g. Alice see 500 in one account, but 400 in the other…

This issue is most commonly prevented with snapshot isolation, which allows a transaction to read from a consistent snapshot at one point in time. It is usually implemented with multi-version concurrency control (MVCC).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Race condition - Lost updates

A

Two clients concurrently perform a read-modify-write cycle. One overwrites the other’s write without incorporating its changes, so data is lost.

Some implementations of snapshot isolation prevent this anomaly automatically, while others require a manual lock (SELECT FOR UPDATE).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Race condition - Write skew

A

A transaction reads something, makes a decision based on the value it saw, and writes the decision to the database. However, by the time the write is made, the premise of the decision is no longer true.

Examples:

  1. Two on-call doctors take sick leave at the same time;
  2. Meeting room booking system;
  3. Multiplayer game;
  4. Claiming a username;
  5. Preventing double-spending;

You can think of write skew as a generalization of the lost update problem. Write skew can occur if two transactions read the same objects, and then update some of those objects (different transactions may update different objects). In the special case where different transactions update the same object, you get a dirty write or lost update anomaly (depend on the timing).

Only serializable isolation prevents this anomaly. If not able to, the second-best option in this case is probably to explicitly lock the rows that the transaction depends on.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Race condition - Phantom reads (compared with read skew)

A

A transaction reads objects that match some search condition. Another client makes a write that affects the results of that search.

Snapshot isolation prevents straightforward phantom reads, but phantoms in the context of write skew require special treatment, such as index-range locks.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Isolation level - Serializable isolation - meaning

A

Serializable isolation means database guarantees that transactions have the same effect as if they run serially.

It is usually regarded as the strongest isolation level.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Isolation level - Read committed isolation - meaning

A

Read committed isolation means:

  1. When reading from database, you will only see data that has been committed (no dirty reads).
  2. When writing to database, you will only overwrite data that has been committed (no dirty writes).

Read committed isolation can handle dirty reads and dirty writes. But can not handle read skew, phantom reads, write skew, lost updates.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Isolation level - Read committed isolation - implementation

A
  1. To prevent dirty write, use row-level write locks.

When a transaction wants to modify a particular object, it must first acquire a lock on that object.

  1. To prevent dirty read, it’s NOT performable to use read lock because one long-running write transaction can force many read-only transactions to wait until it’s completed.

Instead, database remembers both old committed and new value set by transaction that currently holds the write lock. While transaction is ongoing, any other transactions that read the object are simply given the old value.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Isolation level - Snapshot isolation - meaning

A

Each transaction reads from a consistent snapshot of the database.

Key principle: readers never block writers, and writers never block readers.

Snapshot isolation is a boon for the long-running, read-only queries such as backups and analytics.

17
Q

Isolation level - Snapshot isolation - implementation

A

Database needs to potentially keep several different committed versions of an object, because various in-progress transactions may need to see the state of the database at different points in time.

Because it maintains several versions of an object side by side, it’s called multi-version concurrency control (MVCC).

By never updating values in place but instead creating a new version every time a value is changed, the database can provide a consistent snapshot while incurring only a small overhead.

18
Q

Isolation level - Serializable isolation - implementation

A
  1. Literally executing transactions in a serial order;
  2. Two-phase locking, which for several decades was the only viable option;
  3. Optimistic concurrency control techniques such as serializable snapshot isolation.
19
Q

Isolation level - Serializable isolation - implementation - 1. Actual serial execution

A

Execute only one transaction at a time, in serial order, on a single thread.

Why it’s adoptable compared to multi-threaded concurrency?

  1. RAM became cheap enough that for many use cases is now feasible to keep the entire active dataset in memory.
  2. OLTP transactions are usually short and only make a small number of reads and writes.

A few constrains:

  1. Every transaction must be small and fast, because it takes only one slow transaction to stall all transaction processing.
  2. It it limited to use cases where the active dataset can fit in memory.
  3. Write throughput must be low enough to be handled on a single CPU core, or else transactions need to be partitioned without requiring cross-partition coordination.
  4. Cross-partition transactions are possible, but there is a hard limit to the extent to which they can be used.
20
Q

Isolation level - Serializable isolation - implementation - 2. Two-phase locking (2PL)

A

In 2PL, writers don’t just block other writers, they also block readers and vice versa. It protects from lost updates and write skew.

Implementation:

  1. Reader needs to acquire lock in shared mode.
    If a transaction wants to read an object, it must first acquire the lock in shared mode. Several transactions are allowed to hold the lock in shared mode simultaneously, but if another transaction already has an exclusive lock on the object, these transactions must wait.
  2. Writer needs to acquire lock in exclusive mode;
    It a transaction wants to write an object, it must first acquire the lock in exclusive mode. No other transaction may hold the lock at the same time, so if there is any existing lock on the object, the transaction must wait.
  3. If a transaction first reads and then writes an object, it may upgrade its shared lock to an exclusive lock. The upgrade works the same as getting an exclusive lock directly.

When deadlock, one transaction will be aborted.

Performance

Performance affected less by acquiring and releasing locks, but more due to reduces concurrency.

21
Q

Isolation level - Serializable isolation - implementation - 3. Serializable snapshot isolation (SSI)

A

A young but promising approach. It provides full serializability, but has only a small performance penalty compared to snapshot isolation.

Two-phase locking is pessimistic concurrency control mechanism: it is based on the principle that if anything might possibly go wrong, it’s better to wait. It’s like mutual exclusion, which is used to protect data structures in multi-threaded programming.

Serial execution is, in a sense, pessimistic to the extreme: it is essentially equivalent to each transaction having an exclusive lock on the entire database for the duration of the transaction.

By contrast, serializable snapshot isolation is an optimistic concurrency control technique. Instead of blocking if something potentially dangerous happens, transactions continue anyway, in the hope that everything will turn out all right. When a transaction wants to commit, the database checks whether anything bad happened; if so, the transaction is aborted and has to be tried. Only transactions that executed serializably are allowed to commit.