13. Recovery Flashcards
Force/No Force policies - trade-off, goal
The force policy states when a transaction finishes, force all modified data pages to disk before the transaction commits.
Enforces durability but with bad performance (a lot of unnecessary writes).
The no-force polity states to only write back to disk when the page needs to be evicted from the buffer pool.
Performance is great, but enforcing durability is tricky (e.g. redo-logging implementation).
The goal is to implement a no-force policy with good durability.
Steal/No-Steal policies - trade-off, goal
The no-steal policy states that pages cannot be evicted from memory (and thus written to disk) until the transaction commits.
Ensures atomicity (no partial writes until commit), but handcuffs how we can use memory.
The goal is to implement a steal policy, which allows modified pages to be written to disk before a transaction finishes.
Ensuring atomicity becomes complex (e.g. implementing undo-logging to rollback incomplete transactions)
What is a database log?
A log is a sequence of log records that describe the operations that the database has done.
Log record types - describe
UPDATE: for SQL insert/delete/update
fields - {XID, pageID, offset, length, old data, new data}
COMMIT: signifies that a transaction is starting the commit process
ABORT: signifies that a transaction is starting the aborting process
END: signifies that a transaction is finished (usually means that it finished committing or aborting)
How log records are stored?
Log records are stored like DB records - combined into log pages. Log pages need to be operated on in memory but need to be written to disk to be stored permanently.
Write-Ahead Logging (WAL) - 2 rules
- Log records must be written to disk before the corresponding data page gets written to disk. This is how we will achieve atomicity. The intuition for this is that if a data page is written first and then the database crashes we have no way of undoing the operation because we don’t know what operation happened!
- All log records must be written to disk when a transaction commits. This is how we will achieve durability. The intuition is that we need to persistently track what operations a committed transaction has performed. Otherwise, we would have no idea what operations we need to redo. By writing all the logs to disk, we know exactly which operations we need to redo in the event that the database crashes before the modified data pages are written to disk!
What is LSN (for DB logging)?
What are pageLSN and flushedLSN?
Stands for Log Sequence Number. The LSN is a unique increasing number that helps signify the order of the operations (if you see a log record with LSN = 20 then that operation happened after a record with LSN = 10).
The flushedLSN is a counter in RAM that keeps track of the LSN of the last log record that has been flushed to disk.
The pageLSN is a piece of page metadata that stores the LSN of the operation that last modified the page.
Before page i is allowed to be flushed to disk, what inequality must hold? Why?
pageLSNi ___ flushedLSN
≤, This comes from our first rule for WAL - we must flush the corresponding log records before we can flush the data page to disk. A data page is only flushed to disk if the LSN of the last operation to modify it is less than or equal to the flushedLSN. In other words, before page i can be flushed to disk, the log records for all operations that have modified page i must have been flushed to disk.
How to abort a transaction using a write-ahead log?
The first thing we will do is write an ABORT record to the log to signify that we are starting the process.
Then we will start at the last operation in the log for that transaction. We will undo each operation in the transaction and write a CLR record to the log for each undone operation.
A CLR (Compensation Log Record) is a new type of record signifying that we are undoing a specific operation. It is essentially the same thing as an UPDATE record (it stores the previous state and the new state), but it tells us that this write operation happened due to an abort.
Fill in the equality below to enforce the WAL rule that all the logs must be flushed to disk before a transaction T can commit:
flushedLSN __ lastLSN of T
≥
If the flushedLSN is is greater than the last operation of the transaction then we know all of the logs for that transaction are on disk.
What is UNDO logging? Which recovery policy does it implement (if it’s the only recovery mechanism)?
Describe how it works
We undo the effects of all the transactions that have not yet been committed, while not doing so for those that have.
Implements FORCE-STEAL policy
On recovery, we run the recovery manager before accepting any new queries. We scan the log from the end to determine whether each of the transactions is completed or not. The action that we take based on the log record we encounter is as follows:
- COMMIT/ABORT T: mark T as completed
- UPDATE T, X, old, new: if T is not completed, write X=old to disk, else ignore
… proceed until the start of the log or the latest checkpoint (if we have them)
What is REDO logging? Which recovery policy does it implement (if it’s the only recovery mechanism)?
Describe the process
we redo the actions of all the transactions that were committed
NO-STEAL-NO-FORCE policy
On recovery, we run the recovery manager before accepting any new queries. We just read the log from the beginning (or from the latest checkpoint if we have them) and redo all updates of committed transactions.
ARIES Recovery Algorithm - main idea, 3 phases
When a database crashes, the only things it has access to are the logs that made it to disk and the data pages on disk. From this information, it should restore itself so that all committed transactions’ operations have persisted (durability) and all transactions that didn’t finish before the crash are properly undone (atomicity). The recovery algorithm consists of 3 phases that execute in the following order:
- Analysis Phase: reconstructs the Xact Table and the DPT (dirty page table)
- Redo Phase: repeats operations to ensure durability
- Undo Phase: undoes operations from transactions that were running during the crash to ensure atomicity
What is a transaction table in ARIES recovery protocol?
An in-memory data structure that stores information on the active transactions. The transaction table has three fields:
- XID: transaction ID
- status: either running, committing, or aborting
- lastLSN: the LSN of the most recent operation for this transaction
What is a DPT in ARIES recovery protocol?
Dirty Page Table. The DPT keeps track of what pages are dirty (recall from many modules ago that dirty means the page has been modified in memory, but has not been flushed to disk yet). This information will be useful because it will tell us what pages have operations that have not yet made it to disk.
The DPT only has two columns:
- Page ID
- recLSN: the first operation to dirty the page