M362 - Unit 6 Flashcards
Compare the concept of a transaction with that of a coarse-grained atomic action. What are their differences and similarities?
Both consist of several operations that are treated as belonging together, and both give the appearance of indivisibility; in that sense both are atomic. The difference is that a transaction also has the property of failure atomicity: if there is a failure part-way through execution, all the partly done operations are undone so that it is as if nothing had happened. In addition, if the transaction completes, its results are made durable (stored in persistent storage).
What are the desirable properties for transactions?
There are four desirable (ACID) properties: atomicity, consistency, isolation and durability.
What are the transaction states in the transaction state model?
The transaction model specifies the following transaction states: Active, Failed, Partially Committed, Committed and Aborted.
When a transaction is in the Partially Committed state, it has not fully completed. What are the possible next states? Describe the circumstances under which it would progress to each of them.
When a transaction is Partially Committed it can progress to the Committed state in the case where all the data involved in the transaction is written to disk and made durable. On the other hand, if there is a problem in making the data durable, the transaction is said to have Failed. The effects of the transaction are then undone, i.e. rolled back, and only then is the transaction completely finished when it enters the Aborted state.
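The state model can be sketched as a small transition table. This is only an illustration: the names `State`, `TRANSITIONS` and `can_move` are made up for this card, and the Active to Failed edge is an assumption taken from the usual textbook diagram rather than from the answer above.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "Active"
    PARTIALLY_COMMITTED = "Partially Committed"
    COMMITTED = "Committed"
    FAILED = "Failed"
    ABORTED = "Aborted"


# Allowed next states for each state.
# Assumption: an Active transaction may also fail directly.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},      # only after rollback
    State.COMMITTED: set(),             # terminal
    State.ABORTED: set(),               # terminal
}


def can_move(current, nxt):
    """Return True if the model allows moving from current to nxt."""
    return nxt in TRANSITIONS[current]
```

Note that Partially Committed cannot jump straight to Aborted: the transaction passes through Failed first, which is where the rollback happens.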
What is the role of a transaction-processing (TP) system?
The task of the TP system is to manage the correct and efficient execution of transactions.
What is a serial schedule?
A serial schedule is one in which transactions execute strictly one after the other, without any interleaving of operations.
Give the definition of serialisability.
A schedule for a group of transactions is serialisable if, and only if, it produces the same results as if the transactions had executed in some serial order.
When executing a number of transactions, several serial schedules may be possible. These serial schedules do not necessarily all give the same final results for the data objects involved. Explain whether or not you think this is problematic.
This is not problematic. Each serial schedule leaves the data objects in a consistent state, and consistency, rather than one particular final value, is the main concern.
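A small worked example may help here. The two transactions below (`add_ten` and `double`) and the starting value are invented for illustration. The sketch lists the results of every serial order, then shows an interleaving whose result matches none of them and is therefore not serialisable.

```python
from itertools import permutations


def add_ten(x):
    return x + 10


def double(x):
    return x * 2


def serial_results(transactions, start):
    """Collect the final value produced by every serial order."""
    results = set()
    for order in permutations(transactions):
        value = start
        for transaction in order:
            value = transaction(value)
        results.add(value)
    return results


def lost_update(start):
    """Interleaving in which both transactions read before either writes."""
    read_by_t1 = start
    read_by_t2 = start
    value = add_ten(read_by_t1)   # T1 writes its result
    value = double(read_by_t2)    # T2 overwrites it, ignoring T1's write
    return value


allowed = serial_results([add_ten, double], 5)   # {20, 30}
interleaved = lost_update(5)                     # 10
```

Both 20 and 30 are acceptable outcomes because each comes from a serial schedule. The interleaved result, 10, corresponds to no serial order, so that schedule fails the definition of serialisability.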
What are conflicting operations?
Operations conflict if the order in which they are performed affects the result.
How is the notion of conflicting operations used to determine whether a schedule is serialisable?
The notion of conflicting operations lets us determine, for each pair of conflicting operations, which transaction executed its operation first. If every pair of conflicting operations in the schedule is executed in the same transaction order (for example, always T1 before T2), we know that the schedule is equivalent to a serial schedule.
Precedence graphs can be used as a notation for transactions and the conflicting operations of those transactions. Explain under which circumstances such graphs could contain a cycle.
A precedence graph contains a cycle if the schedule it represents is not serialisable.
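The check can be sketched in code. The representation is an assumption made for this card: a schedule is a list of `(transaction, operation, object)` tuples, and two operations conflict when they come from different transactions, touch the same object, and at least one is a write (`"w"`). The function names are hypothetical.

```python
def precedence_edges(schedule):
    """Return edges (Ti, Tj) meaning Ti performed a conflicting operation first."""
    edges = set()
    for i, (t1, op1, obj1) in enumerate(schedule):
        for t2, op2, obj2 in schedule[i + 1:]:
            # Assumed conflict rule: same object, different transactions, one write.
            if t1 != t2 and obj1 == obj2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the precedence graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True          # reached a node already on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


serial = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
interleaved = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
```

For `serial` the only edge is T1 to T2, so there is no cycle. For `interleaved` there are edges in both directions between T1 and T2, which forms a cycle.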
What are the two phases of two-phase locking?
The two phases are acquire and release. In the acquire phase, the transaction gradually gets hold of all the locks it needs, as and when it needs them, but does not release any locks. In the release phase, the transaction acquires no more locks but lets go of the ones it holds once it has completed work on the locked objects.
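The two-phase rule itself can be sketched as follows. This is a simplified illustration: `TwoPhaseTransaction` is a hypothetical class that only tracks which locks a single transaction holds and enforces the rule; it does no real blocking or lock sharing between transactions.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.releasing = False   # becomes True at the first release

    def acquire(self, obj):
        if self.releasing:
            # Acquiring after any release would break two-phase locking.
            raise RuntimeError(f"{self.name}: cannot acquire in the release phase")
        self.held.add(obj)

    def release(self, obj):
        self.releasing = True    # the acquire phase is now over
        self.held.remove(obj)
```

For example, a transaction that acquires `a` and `b`, then releases `a`, is rejected if it later tries to acquire `c`.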
What is a shared lock and why might we want to use shared locks?
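A shared lock is one that several transactions can hold on the same object at the same time, in contrast to an exclusive lock. It is useful when non-conflicting operations (for example, reads) are using the same object, as it is then safe for them to proceed together. Where a shared lock can be used, it increases concurrency.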
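One common way to picture this is a lock compatibility table. The sketch below assumes just two modes, shared (`"S"`) and exclusive (`"X"`); the names `COMPATIBLE` and `can_grant` are invented for illustration.

```python
# (mode already held, mode requested) -> may both be held together?
COMPATIBLE = {
    ("S", "S"): True,    # several readers can share the object
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_modes):
    """A request is granted only if it is compatible with every lock held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)
```

Only the shared/shared combination is compatible, which is exactly where the extra concurrency comes from.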
What is a simple alternative to deadlock detection?
Timeouts.
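As a rough sketch, Python's `threading.Lock.acquire` accepts a `timeout` argument, which is enough to illustrate the idea. The function name `run_with_timeout` and the returned strings are assumptions for this card; a real system would roll back and retry rather than just return a label.

```python
import threading

lock = threading.Lock()


def run_with_timeout(timeout_seconds):
    """Give up, instead of waiting forever, if the lock is not obtained in time."""
    if not lock.acquire(timeout=timeout_seconds):
        # We cannot tell whether this was a deadlock or just a slow holder,
        # so the transaction is simply treated as aborted.
        return "aborted"
    try:
        return "done"
    finally:
        lock.release()
```

The trade-off is that a timeout cannot distinguish a true deadlock from a long wait, so some transactions may be aborted unnecessarily.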
What are cascading aborts and what causes them?
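Cascading aborts occur when one transaction has to abort and this leads to other transactions having to abort. They are caused by a lack of isolation: a transaction that has used the uncommitted results of another transaction must itself abort if that transaction aborts. This is bad for the performance of a system, as much work may have to be repeated.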
What are the advantages and disadvantages of time-stamp ordering (TSO)?
The advantages of TSO are: simple and efficient implementation; concurrency control data is held with individual objects and not centrally; objects are not locked for longer than the duration of an operation and so circular waits and hence deadlocks cannot occur.
The main disadvantages of TSO are: the loss of flexibility, in that it imposes one serial schedule and excludes other possibilities; and greater susceptibility to transaction aborts than other approaches, due to lack of isolation, which may possibly lead to cascading aborts.
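A minimal sketch of the idea follows, using the basic timestamp-ordering rule found in many textbooks (an assumption here; the course text may state the rule differently). Each object keeps the timestamps of its latest read and write, and a request from an older transaction that arrives too late is rejected. `TimestampedObject` and `TooLate` are hypothetical names.

```python
class TooLate(Exception):
    """Raised when a request arrives out of timestamp order."""


class TimestampedObject:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # timestamp of the youngest reader so far
        self.write_ts = 0   # timestamp of the latest writer

    def read(self, ts):
        if ts < self.write_ts:
            raise TooLate("a younger transaction has already written")
        self.read_ts = max(self.read_ts, ts)
        return self.value

    def write(self, ts, value):
        if ts < self.read_ts or ts < self.write_ts:
            raise TooLate("a younger transaction has already used the object")
        self.write_ts = ts
        self.value = value
```

The concurrency control data lives with the object itself and nothing is ever locked, but the rejected transaction must abort, which illustrates the susceptibility to aborts noted above.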
Why is the optimistic concurrency control approach so named?
This approach is called optimistic because it expects success, rather than failure. That is, it works away, confident that things will probably turn out for the best, and only performs a final check at the very end.
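The "work first, check at the end" pattern can be sketched like this. It assumes a version number per item as the validation mechanism, which is one common choice and not necessarily the one the course describes. `VersionedStore` and `OptimisticTransaction` are hypothetical names, and the sketch is single-threaded: in a real system the validate-and-write step in `commit` would itself have to be atomic.

```python
class VersionedStore:
    def __init__(self):
        self.data = {}
        self.versions = {}


class OptimisticTransaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}   # version seen at first read of each key
        self.writes = {}          # private copy of updates

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.read_versions.setdefault(key, self.store.versions.get(key, 0))
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value  # nothing shared is touched yet

    def commit(self):
        # Final check: has anything we read changed since we read it?
        for key, seen in self.read_versions.items():
            if self.store.versions.get(key, 0) != seen:
                return False      # validation failed, caller must retry
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.versions[key] = self.store.versions.get(key, 0) + 1
        return True
```

If two transactions both read the same item and then both try to commit an update to it, the first commit succeeds and the second fails validation.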
Distinguish between volatile memory, persistent storage and stable storage.
Volatile memory holds data that would be lost if the device were switched off (see Section 2), whereas data stored in persistent storage remains even when the device is switched off, or the program that created the data stops executing. Stable storage is more robust still, because data is stored in several places. Therefore, if a disk fails in one location, the data usually survives in a different location.
For crash resilience, what must be done when a transaction aborts?
When a crash occurs, it must be as though any transaction that had not committed had never been invoked. Therefore all the effects of an aborted transaction must be undone before it can be restarted.
Which of the ACID properties of transactions are addressed through the provision of crash resilience?
(Failure) atomicity and durability. Transactions have to be all or nothing, i.e. failure atomic, and therefore crash-resilience mechanisms are put in place to ensure that a half-completed transaction can be undone should a crash occur. A crash-resilience mechanism also ensures that, if the transaction had committed, its results are definitely stored in persistent store, i.e. durability.
What is meant by rolling back?
Rolling back means restoring the persistent store to the state that existed before the start of a transaction that was aborted due to a system crash.
In logging, what might happen if log information is written to the persistent store after the data updated by the atomic operation is written there?
In logging, the information in the log is used to roll back the persistent store to its state at the start of the transaction in which the crash occurred, by using the old values recorded in the log. If a crash occurs after the data updated by the transaction is written to the persistent store but before the log information is written there (i.e. it is not a write-ahead log), then there will be no record of the original state to facilitate rollback.
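The ordering requirement can be sketched with an in-memory undo log. This is only an illustration: `UndoLoggedStore` is a hypothetical class, both the data and the log are plain Python structures rather than persistent storage, and `None` is used as a stand-in for "no previous value".

```python
class UndoLoggedStore:
    def __init__(self):
        self.data = {}
        self.log = []   # entries of (transaction, key, old value)

    def write(self, txn, key, value):
        # Write-ahead rule: record the old value BEFORE changing the data.
        self.log.append((txn, key, self.data.get(key)))
        self.data[key] = value

    def rollback(self, txn):
        # Undo the transaction's updates, newest first, using the old values.
        for logged_txn, key, old in reversed(self.log):
            if logged_txn == txn:
                if old is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old
        self.log = [entry for entry in self.log if entry[0] != txn]
```

If the two statements in `write` were swapped and a crash fell between them, the data would have changed with no old value on record, which is the situation the answer above describes.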
What is shadowing? What essential feature is required to ensure that shadowing is implemented successfully?
Shadowing is where the results of a transaction are built up in a structure that mirrors part of the persistent store but the persistent store is not updated until the transaction commits. Once the transaction has committed, the updated copy replaces the relevant part of the persistent store.
The essential feature of shadowing is that the replacement of the relevant part of the persistent store by the updated copy should occur in a single operation (e.g. by setting a single pointer). This avoids problems that would occur if a crash happened during the replacement.
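A file-based sketch of the same idea: the new state is written to a separate shadow file, and a single rename then makes it the current version. It relies on `os.replace`, which Python documents as atomic on POSIX systems when it succeeds. The function name `commit_shadow` and the JSON format are assumptions made for illustration.

```python
import json
import os
import tempfile


def commit_shadow(path, new_state):
    """Build the new state in a shadow file, then swap it in with one operation."""
    directory = os.path.dirname(os.path.abspath(path))
    # The shadow copy lives in the same directory so the rename stays on one filesystem.
    fd, shadow_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as shadow:
        json.dump(new_state, shadow)
        shadow.flush()
        os.fsync(shadow.fileno())   # make sure the shadow copy is on disk first
    # Single switch-over step: readers see either the old file or the new one.
    os.replace(shadow_path, path)
```

A crash before `os.replace` leaves the old file untouched; a crash after it leaves the new file in place. There is no point at which the store is half-updated.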
What does the term relation refer to in relational database models?
Relation refers to the logical grouping of the data, which is often represented as a table, with rows and columns. Each column contains data of a certain type, and each row, also known as a record, contains the details for an object in the table. The relation is the table as a whole.