Storing Data - Disks and Files Flashcards
What does the physical schema describe?
Files and Indexes used
What is the basic abstraction of data in a DBMS? How are records located?
A file of records; a record ID can locate a record in a file
What is file organization and what are 3 types?
A method of arranging a file of records on external storage
Heap Files, Sorted Files, Indexes
What are the two types of external storage and the differences between them?
Disks: random access at fixed cost; reading consecutive pages is cheaper than reading them in random order
Tapes: can only read pages in sequence; cheaper than disks
What are the 3 components of the architecture?
- File and index layers make calls to buffer manager
- Buffer manager stages pages from external storage to main memory buffer pool
- Disk Manager keeps track of pages used by files, and empty slots on pages
What is a heap file and when is it used?
A file with records in random order; suitable when the typical access is a file scan retrieving all records
What is a sorted file and when is it used?
A file sorted on some value; best if records must be retrieved in some order, or when only a range of records is needed
What is an index?
A data structure used to organize records via trees or hashing
- speeds up searches based on key value; updates are much faster than in sorted files
What is the main advantage of disk over tape?
Random access instead of sequential
In what units are data stored and retrieved on disk?
Blocks or pages
Why doesn’t a DBMS use the OS disk manager?
- portability across operating systems
- OS file size limits can be too small
- OS files cannot span disk devices
What is a buffer pool?
Collection of pages in main memory
What is a frame?
A slot in the buffer pool that holds one page
What is “pinning”?
Incrementing a page's pin count when it is requested, so the page cannot be replaced while in use; unpinning decrements the count
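A minimal sketch of pin counting, assuming a hypothetical `Frame` class (the names are illustrative, not taken from any particular DBMS): each request increments the pin count, each release decrements it, and only frames with a pin count of zero are candidates for replacement.

```python
class Frame:
    """One buffer-pool slot; names here are illustrative only."""

    def __init__(self, page_id):
        self.page_id = page_id
        self.pin_count = 0   # number of current users of the page
        self.dirty = False   # set when the page is modified in memory

    def pin(self):
        self.pin_count += 1

    def unpin(self, dirty=False):
        if self.pin_count == 0:
            raise ValueError("page is not pinned")
        self.pin_count -= 1
        self.dirty = self.dirty or dirty

    def replaceable(self):
        # Only unpinned frames may be chosen by the replacement policy.
        return self.pin_count == 0
```

A dirty frame would have to be written back to disk before its slot is reused; that step is left out of the sketch.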
What is sequential flooding?
Using LRU with repeated sequential scans causes a page fault on every request; it happens when there are fewer buffer frames than pages in the file
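Sequential flooding can be reproduced with a small simulation. The sketch below assumes a plain LRU pool modeled with an `OrderedDict`; the function name `count_misses` and its parameters are made up for illustration.

```python
from collections import OrderedDict

def count_misses(num_frames, num_pages, num_scans):
    """Count page misses for repeated sequential scans under LRU."""
    pool = OrderedDict()  # page ids, ordered from least to most recently used
    misses = 0
    for _ in range(num_scans):
        for page in range(num_pages):
            if page in pool:
                pool.move_to_end(page)        # hit: mark as most recently used
            else:
                misses += 1                   # miss: page must be read from disk
                if len(pool) >= num_frames:
                    pool.popitem(last=False)  # evict the least recently used page
                pool[page] = None
    return misses
```

With 3 frames and a 4-page file, `count_misses(3, 4, 5)` reports a miss on all 20 requests, because LRU always evicts the page that the scan will need next. With 4 frames, only the first scan misses.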
What are the DBMS requirements for buffer management?
1- pin a page in the buffer pool
2- force a page to disk
3- adjust the replacement policy
4- pre-fetch pages
(3 and 4 are based on access patterns)
When would a record be variable length?
When it has a variable-length field, e.g. varchar
How are variable length records structured?
1- delimiters between fields
2- offset-based, using a directory of field offsets
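The offset-based layout could look roughly like this. The exact header format (a 2-byte field count followed by one 2-byte end offset per field) is an assumption made for the sketch, as are the names `encode_record` and `get_field`.

```python
import struct

def encode_record(fields):
    """Pack byte-string fields behind a directory of 2-byte end offsets."""
    header_size = 2 + 2 * len(fields)      # field count + one end offset per field
    offsets, pos = [], header_size
    for f in fields:
        pos += len(f)
        offsets.append(pos)                # offset just past the end of this field
    header = struct.pack(f"<H{len(fields)}H", len(fields), *offsets)
    return header + b"".join(fields)

def get_field(record, i):
    """Read field i directly, without scanning the earlier fields."""
    (n,) = struct.unpack_from("<H", record, 0)
    ends = struct.unpack_from(f"<{n}H", record, 2)
    start = 2 + 2 * n if i == 0 else ends[i - 1]
    return record[start:ends[i]]
```

Compared with delimiters, the directory gives direct access to the i-th field and needs no reserved delimiter byte; an empty field is simply two equal offsets.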
What are the complications of variable-length records?
modifying a field may cause the record to grow; later data on the page needs to shift; the record may no longer fit on its page; a large record may not fit on any single page
What are 2 options for page formats of fixed-length records?
1- packed: all used slots are packed at the top; use the number of records to find the bottom; on delete, need to shift
2- unpacked, bitmap: use a bitmap to find out if a slot is empty or used
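A rough sketch of the unpacked (bitmap) option, assuming an in-memory page with a fixed number of slots; the class name `BitmapPage` is hypothetical.

```python
class BitmapPage:
    """Unpacked page of fixed-length slots; a bitmap marks which slots are in use."""

    def __init__(self, num_slots):
        self.used = [False] * num_slots    # the bitmap: one flag per slot
        self.slots = [None] * num_slots

    def insert(self, record):
        for i, in_use in enumerate(self.used):
            if not in_use:                 # first free slot
                self.used[i] = True
                self.slots[i] = record
                return i                   # slot number stays valid across deletes
        raise RuntimeError("page is full")

    def delete(self, slot):
        self.used[slot] = False            # no shifting of other records
        self.slots[slot] = None
```

Because a delete only clears a bit, the slot numbers of the remaining records do not change, which is what the packed format cannot guarantee.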
What is the page format for variable-length records?
A slot directory holding a pointer (offset) to the record in each slot
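One way to picture the slot directory, as a simplified sketch: here the directory is kept in a Python list next to the page bytes, whereas a real page would store it inside the page itself. `SlottedPage` and its 64-byte default size are assumptions for illustration.

```python
class SlottedPage:
    """Page of variable-length records addressed through a slot directory."""

    def __init__(self, size=64):
        self.data = bytearray(size)
        self.free_start = 0        # records are appended from the front of the page
        self.slots = []            # slot directory: one (offset, length) per record

    def insert(self, record):
        if self.free_start + len(record) > len(self.data):
            raise RuntimeError("not enough free space")
        offset = self.free_start
        self.data[offset:offset + len(record)] = record
        self.free_start += len(record)
        self.slots.append((offset, len(record)))
        return len(self.slots) - 1   # slot number; with the page id it forms a record id

    def read(self, slot):
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])
```

Records can be moved within the page without changing their record ids, since only the directory entry has to be updated.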
How does a heap file work?
- simple file structure that contains records in no particular order
- as file grows/shrinks, disk pages are allocated/deallocated
- keeps track of pages in a file, free space on pages, and records on pages
What are 2 implementations of a heap file, and what are the pros/cons?
- Doubly linked list: header page holds pointers to a doubly linked list of data pages and a doubly linked list of pages with free space; expensive for insertion
- Page directory: holds pointers to all data pages; each directory entry identifies a page and records its free space
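The page-directory approach can be sketched as follows, assuming the directory stores only the number of free bytes per data page and uses a first-fit search; `HeapFileDirectory` is a made-up name.

```python
class HeapFileDirectory:
    """Directory mapping each data page to its free space in bytes."""

    def __init__(self, page_size):
        self.page_size = page_size
        self.free = []                       # free[i] = free bytes on data page i

    def page_for_insert(self, record_size):
        if record_size > self.page_size:
            raise ValueError("record larger than a page")
        for page_id, space in enumerate(self.free):
            if space >= record_size:         # first page with enough room
                self.free[page_id] -= record_size
                return page_id
        self.free.append(self.page_size - record_size)   # allocate a new page
        return len(self.free) - 1
```

Finding room for an insert only reads the (small) directory, whereas the linked-list implementation may have to walk many pages on the free list.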
What is the structure of an index file?
Leaf pages contain data entries and are chained (prev/next)
Non-leaf pages have index entries used to direct searches based on key
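How a non-leaf page directs a search can be illustrated with a binary search over its separator keys. This is a sketch assuming a tree-structured index in which a page with n keys has n + 1 child pointers; `choose_child` is a hypothetical helper.

```python
from bisect import bisect_right

def choose_child(keys, children, search_key):
    """Pick the child pointer to follow in a non-leaf page.

    keys are the sorted separator keys; children has len(keys) + 1 entries,
    and children[i] covers search keys k with keys[i-1] <= k < keys[i].
    """
    return children[bisect_right(keys, search_key)]
```

Repeating this step from the root down leads to a leaf page, where the chained data entries can then be scanned in key order.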
What is a system catalog?
A relation itself that holds metadata, including index structures, relation names, attributes, etc.
What are the two main techniques of RAID?
Data striping: data is partitioned across disks for performance
Redundancy: redundant information is stored for reliability
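Both techniques can be shown in a few lines. The sketch assumes round-robin striping and XOR parity as the form of redundancy (mirroring is another option); the function names are illustrative.

```python
from functools import reduce

def disk_for_block(block_no, num_disks):
    """Round-robin striping: consecutive blocks land on consecutive disks."""
    return block_no % num_disks

def parity(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))
```

If one data block is lost, XOR-ing the surviving data blocks with the parity block reproduces it, so a single disk failure can be tolerated.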
What are 4 RAID levels and the reasons for them?
Level 0: data loss not an issue
Level 3: workloads with mainly large transfer requests of several contiguous blocks
Level 5: good general purpose
Level 6: high reliability