DBMS Flashcards
DBMS Architecture
SQL commands
|
Optimizer
|
Access method manager
|
Buffer manager — Concurrency control, Reliability Manager
|
Database (Index files, Data files)
— Reliability Manager
Optimizer
Generates execution strategy for a query
Access Method Manager
Implements execution strategy by performing physical access to data
Buffer Manager
Manages page/block transfer from disk to main memory and vice versa
Concurrency Control
Manages concurrent access to data
Reliability Manager
Guarantees content correctness after crashes
Rollback
The database goes back to the state at the beginning of the transaction, before the error
Commit
Correct end of a transaction
ACID properties
Atomicity
Consistency
Isolation
Durability
Atomicity
Transaction must be completed and not left at intermediate states of execution.
- Redo
- Undo
Consistency
Transaction should not violate integrity constrains enforced by a database schema
- Rollback
- Automatically correct the violation
Isolation
Transactions must not interfere.
- Concurrency control
Durability
Data not lost on failures
- Reliability manager
Fix primitive
Loads pages into the buffer.
If PAGE in buffer:
return Page Address
Else:
If free page available:
Store page in available space
Else:
Search for count=0
if dirty == 0: write on disk
else: store page in that space
Force primitive
Write ops immediately propagated to disk
Flush primitive
Write ops postponed until free buffer
Heap File
New records are inserted at the end. All space is exploited.
Frequent for unclustered (secondary) indices
Ordered Sequential Structures
Records inserted by sort key. Usually free space is left in every block to add transactions in an ordered way.
Frequent for clustered (primary) indices
Primary (clustered) Indices
- Ordered Data structures
- Only one index of this type can be defined
- Clustered B Tree Index, Hash Index
Secondary (Unclustered) Indices
- Unordered Hash data structures
- Multiple different indices can be defined
- B Tree, Bitmap, Hash
B Tree Index Use Cases
Range Queries
Hash Index Use Cases
Equality predicates
Bitmap Index Use
Boolean expressions
Access Paths
- Full table scans
- Index scans
- RowID scans
Join Types
- Nested Loop
- Merge Scan Join
- Hash Join
- Bitmapped Join
Nested Loop Use Cases
Small inner table (<10^3) + Big outer table
Merge Scan Use Cases
Requires sorting.
Useful when we have a previous or following sort operation.
Hash Join
Useful for large DS
Bitmapped Join
Useful for joining multiple tables into a big central table.
Group By Types
- Sort Based: GB sort, GB no sort
- Hash Based: GB hash, GB no hash
Selectivity
Fractions of rows selected from the original dataset
Cardinality
Number of rows
Optimization Exercise Steps
- Writing algebra tree
- Compute cardinalities
- Select access paths without indexes
- Select Join and GB Types
- Create Indexes and new access paths
- Evaluate possible GB anticipations
Big / Small table threshold
10^3
Good / Bad selectivity threshold
1/10
Full Table Scans Use Cases
- No index
- Retrieval of large portion of the original table
- Full table + Filter
Index Scan Types
- Index Unique Scans: Return one rowID. Useful for unique values.
- Index Range Scans: Data returned in ascending order. Avoid sorting for GB with same attribute.
- Index Full Scans: Avoids sorting for GB with same attribute.
- Fast Full Index Scans: Data is not ordered.
- Bitmap Indexes
Bitmap Indexes Use Cases
For queries containing multiple WHERE clauses
RowID Scan Use Cases
Retrieving a single row
Serial Schedule
The actions of each transaction appear in sequence (T0 -> T1 -> T2)
Schedule Equivalence Types
- View Equivalence
- Conflict Equivalence
- 2 phase locking
- Timestamp equivalence
Serializable Schedule
Schedule that is equivalent to an arbitrary SERIAL schedule of the same transactions
View Equivalence / Serializable
Procedure:
1. Detect read-froms and final writes
Equivalence:
- Same reads-from set
- Same final write set
Conflict Equivalent / Serializable
Procedure:
1. Aj(x) is in conflict with Ai(x) if
- (I!=j)
- (x==x)
- RW, WR or WW conflict
- No other W(x) between them
Equivalence:
- Same conflict set
- Each conflict pair is in the same order
Serializable:
1. Create node graph with a node for each transaction
2. Draw arrows for Actions (from first action to last one)
3. If the graph is not cyclic then the schedule is serializable.
2 Phase Locking