Databases Flashcards
What is a transaction?
Transaction: A logical unit of work that must be entirely completed or aborted
List and explain the transaction properties
- Atomicity – Ensures a transaction is treated as a single, logical unit of work. Either all of its operations are completed, or none are. If any part of the transaction fails, the entire transaction is rolled back.
- Consistency - A transaction should bring the database from one consistent state to another. It means that the integrity constraints and business rules must be satisfied before and after the transaction.
- Isolation - A database transaction property in which a data item used by one transaction is not available to other transactions until the first one ends.
- Durability - The transaction property that ensures that once transaction changes are done and committed, they cannot be undone or lost, even in the event of a system failure.
- Serializability - Means that a series of concurrent transactions will yield the same result as if they were executed in Serie (one after another).
Explain the function and use of a transaction log
Transaction log: Records all operations that modify the database.
The database transaction log plays a crucial role in maintaining database concurrency control and integrity
What is concurrency control and what is it’s objective?
Concurrency control: Coordination of the simultaneous execution of transactions in a multiuser database system
Objective: Ensure serialisability of transactions to guarantee data integrity and consistency in a database management system.
List and explain data integrity and consistency problems (3)
- Lost updates: Occurs in two concurrent transactions when: Same data element is updated and one of the updates is lost.
- Uncommitted data: Occurs when 2 transactions are executed concurrently, first transaction is rolled-back after the second transaction has already accessed uncommitted data.
- Inconsistent retrievals: Occurs when a transaction accesses data before and after one or more other transactions finish working with such data.
What is locking methods
Locking methods: Involves placing locks on database objects (e.g., rows, tables) to prevent concurrent transactions from accessing or modifying the same data simultaneously.
Explain different lock types (3)
- Binary lock: Has only two states, locked (1) and unlocked (0). If a data item is locked, no other transaction can use that data item.
- Exclusive lock: Issued when one or more transactions must update (WRITE) a data item. Therefore, an exclusive lock is issued only when a transaction must WRITE (update) a data item and no locks (not shared or exclusive) are currently held on that data item by any other transaction.
- Shared lock: Issued when a transaction must read data from the database and no exclusive locks are held on the data to be read.
Explain 2PL
2PL: A set of rules that manage how transactions acquire and release locks to ensure serializability, though it doesn’t prevent deadlocks.
Explain the 2PL phases
- Growing phase: Transaction acquires all required locks without unlocking any data
- Shrinking phase: - Transaction releases all locks and cannot obtain any new lock
What is the write-ahead log protocol
In concurrency control, a process that ensures transaction logs are written to permanent storage before any database data is actually updated.
Discuss basic database performance tuning concepts
SQL performance tuning – on the client side – that will generate an SQL query to return the correct answer in the least amount of time, using the minimum amount of resources at the server end.
DBMS performance tuning – on the server side – that will properly configure the DBMS environment to respond to clients’ requests in the fastest way possible, while making optimum use of existing resources
Explain how the DBMS processes SQL queries
- Parsing - The DBMS parses the SQL query and chooses the most efficient access/execution plan.
- Execution - The DBMS executes the SQL query using the chosen execution plan.
- Fetching - The DBMS fetches the data and sends the result set back to the client.
Explain the role of indexes in speeding up data access
Indexes: Used to speed up data access by creating ordered sets of values that contain index keys and pointers to actual table rows.
Data sparsity: It’s the variety of unique values in a column. Helps determine whether to use an index.
Low sparsity columns – Less suitable for indexing
High sparsity columns – Benefits from indexing
Differentiate between centralised and distributed DBMS
- Centralized DBMS: Data is stored in a single location, allowing for straightforward management and control.
- Distributed DBMS (DDBMS): Data is distributed across multiple locations, which may enhance reliability and performance by reducing the risk of a single point of failure.
What is a fully distributed DBMS?
- A fully distributed system spreads both data and processing tasks across multiple sites, optimizing resource utilization and improving access speed for users.