Storing Data - Disks and Files Flashcards

1
Q

What does the physical schema describe?

A

The files and indexes used to store the data

2
Q

What is the basic abstraction of data in a DBMS? How are records located?

A

A file of records; each record has a record ID (rid) that locates it within the file

3
Q

What is file organization and what are 3 types?

A

A method of arranging a file of records on external storage

Heap Files, Sorted Files, Indexes

4
Q

What are the two types of external storage and the differences between them?

A

Disks: random access to any page at a fixed cost; reading several consecutive pages is cheaper than reading the same pages in random order
Tapes: pages can only be read in sequence; cheaper than disks

5
Q

What are the 3 components of the architecture?

A
  1. File and index layers make calls to buffer manager
  2. Buffer manager stages pages from external storage to main memory buffer pool
  3. Disk Manager keeps track of pages used by files, and empty slots on pages
6
Q

What is a heap file and when is it used?

A

A file whose records are in no particular order; suitable when the typical access is a file scan retrieving all records

7
Q

What is a sorted file and when is it used?

A

A file kept sorted on some field; best when records must be retrieved in some order, or when only a range of records is needed

8
Q

What is an index?

A

A data structure that organizes records via trees or hashing
- speeds up searches based on the search key value; updates are much faster than in sorted files

9
Q

What is the main advantage of disk over tape?

A

Random access instead of sequential

10
Q

In what units are data stored and retrieved on disk?

A

Blocks or pages

11
Q

Why doesn’t a DBMS use the OS disk manager?

A
  1. Portability: OS support differs across platforms
  2. OS file size limits may be too small
  3. OS files cannot span disk devices
12
Q

What is a buffer pool?

A

Collection of pages in main memory

13
Q

What is a frame?

A

A slot in the buffer pool that holds one page

14
Q

What is “pinning”?

A

Incrementing a page's pin count to record that a requestor is using it; a page with a pin count above 0 cannot be chosen for replacement
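
As a rough illustration of pin counts, the sketch below models a single buffer frame. The names `Frame`, `pin`, `unpin`, and `replaceable` are hypothetical and are not taken from any particular DBMS.

```python
class Frame:
    """One buffer-pool frame holding a page (hypothetical model)."""

    def __init__(self, page_id):
        self.page_id = page_id
        self.pin_count = 0   # number of requestors currently using the page
        self.dirty = False   # True once the in-memory copy has been modified

    def pin(self):
        # A requestor announces that it is using the page.
        self.pin_count += 1

    def unpin(self, dirty=False):
        # The requestor is done and reports whether it modified the page.
        if self.pin_count == 0:
            raise ValueError("page is not pinned")
        self.pin_count -= 1
        self.dirty = self.dirty or dirty

    def replaceable(self):
        # Only frames with a pin count of 0 are candidates for replacement.
        return self.pin_count == 0
```

A frame pinned twice becomes replaceable only after two matching `unpin` calls.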

15
Q

What is sequential flooding?

A

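Repeated sequential scans under LRU replacement cause a page fault on every request: when the number of buffer frames is smaller than the number of pages in the file, each page is evicted before it is needed again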
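
The effect can be reproduced with a small simulation. This is a minimal sketch of an LRU pool, assuming pages are numbered from 0 and scanned in order; `count_misses` and its parameters are made-up names for illustration.

```python
from collections import OrderedDict


def count_misses(num_frames, num_pages, num_scans):
    """Count page misses for repeated sequential scans under LRU."""
    pool = OrderedDict()  # page ids, ordered from least to most recently used
    misses = 0
    for _ in range(num_scans):
        for page in range(num_pages):
            if page in pool:
                pool.move_to_end(page)        # hit: mark as most recently used
            else:
                misses += 1                   # miss: page must be read from disk
                if len(pool) >= num_frames:
                    pool.popitem(last=False)  # evict the least recently used page
                pool[page] = None
    return misses
```

With 3 frames and a 4-page file, `count_misses(3, 4, 5)` returns 20, so all 20 requests miss. With 4 frames, `count_misses(4, 4, 5)` returns 4, because only the first scan reads from disk.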

16
Q

What are DBMS requirements for buffer management?

A
1- pin a page in the buffer pool
2- force a page to disk
3- adjust the replacement policy
4- pre-fetch pages
Items 3 and 4 are based on access patterns
17
Q

When would a record be variable length?

A

When a field has a variable-length type such as VARCHAR

18
Q

How are variable length records structured?

A
1- delimiters: fields are separated by a special symbol
2- offset-based: a directory of field offsets at the start of the record
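
The offset-based layout can be sketched as follows. The byte layout here (a 2-byte field count followed by one 2-byte end offset per field) is an assumption chosen for illustration, and `encode_record` and `get_field` are hypothetical helpers.

```python
import struct


def encode_record(fields):
    """Pack byte-string fields behind a directory of end offsets."""
    header_size = 2 * (len(fields) + 1)   # field count + one offset per field
    offsets, pos = [], header_size
    for f in fields:
        pos += len(f)
        offsets.append(pos)               # offset just past the end of this field
    header = struct.pack(f"<{len(fields) + 1}H", len(fields), *offsets)
    return header + b"".join(fields)


def get_field(record, i):
    """Fetch field i directly, without scanning the earlier fields."""
    (n,) = struct.unpack_from("<H", record, 0)
    header_size = 2 * (n + 1)
    # Field i starts where field i-1 ends (or right after the header).
    start = header_size if i == 0 else struct.unpack_from("<H", record, 2 * i)[0]
    (end,) = struct.unpack_from("<H", record, 2 * (i + 1))
    return record[start:end]
```

Unlike the delimiter approach, any field can be reached in constant time, and an empty field simply has equal start and end offsets.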

19
Q

What are the complications of variable-length records?

A

Modifying a field may cause the record to grow, so later fields must be shifted; the grown record may no longer fit on its page; a record may even be too large to fit on any single page

20
Q

What are 2 options for page formats of fixed-length records?

A

1- packed: all used slots are packed at the top of the page; the number of records locates the end of the used area; a delete requires shifting records
2- unpacked, bitmap: a bitmap records whether each slot is empty or in use
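
A minimal sketch of the unpacked format, assuming a list of booleans stands in for the bitmap; the class `BitmapPage` and its methods are hypothetical.

```python
class BitmapPage:
    """Unpacked page of fixed-length records with a slot bitmap (hypothetical)."""

    def __init__(self, num_slots, record_size):
        self.record_size = record_size
        self.used = [False] * num_slots            # one flag per slot
        self.data = bytearray(num_slots * record_size)

    def insert(self, record):
        if len(record) != self.record_size:
            raise ValueError("wrong record size")
        slot = self.used.index(False)              # raises ValueError if the page is full
        start = slot * self.record_size
        self.data[start:start + self.record_size] = record
        self.used[slot] = True
        return slot

    def delete(self, slot):
        # Only the flag is cleared; no record moves, so slot numbers stay valid.
        self.used[slot] = False

    def read(self, slot):
        if not self.used[slot]:
            raise KeyError(slot)
        start = slot * self.record_size
        return bytes(self.data[start:start + self.record_size])
```

Deleting a record frees its slot for the next insert without shifting the remaining records, which is the main difference from the packed format.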

21
Q

What is the page format for variable-length records?

A

Slotted page: a slot directory holds a pointer (offset) to the record in each slot
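
One way to picture the slot directory is the sketch below. It keeps the directory in a Python list rather than inside the page bytes, which is a simplification; `SlottedPage` and its methods are hypothetical names.

```python
class SlottedPage:
    """Variable-length records with a slot directory (hypothetical layout)."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.free_start = 0   # records are appended from the front of the page
        self.slots = []       # slot number -> (offset, length), or None if deleted

    def insert(self, record):
        if self.free_start + len(record) > len(self.data):
            raise ValueError("record does not fit on this page")
        offset = self.free_start
        self.data[offset:offset + len(record)] = record
        self.free_start += len(record)
        self.slots.append((offset, len(record)))
        return len(self.slots) - 1   # slot number of the new record

    def read(self, slot):
        entry = self.slots[slot]
        if entry is None:
            raise KeyError(slot)
        offset, length = entry
        return bytes(self.data[offset:offset + length])

    def delete(self, slot):
        # Other records keep their slot numbers, so references to them stay valid.
        self.slots[slot] = None
```

Because readers go through the directory, a record can be moved within the page by updating only its directory entry.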

22
Q

How does a heap file work?

A
  1. simple file structure that contains records in no particular order
  2. as the file grows/shrinks, disk pages are allocated/deallocated
  3. keeps track of the pages in a file, the free space on pages, and the records on pages
23
Q

What are 2 implementations of a heap file, and what are the pros/cons?

A
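  1. Doubly linked list: a header page holds pointers to a doubly linked list of full data pages and a doubly linked list of pages with free space; insertion can be expensive, since finding a page with enough room may mean scanning the free-space list
  2. Page directory: holds pointers to all data pages; each directory entry identifies a page and records its free space, so a page with enough room can be found without reading the data pages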
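
The page-directory idea can be sketched as a map from page id to remaining free bytes. This is an illustrative model only; `PageDirectory`, `allocate_page`, and `find_page` are hypothetical names, and a first-fit search is an assumption.

```python
class PageDirectory:
    """Directory entries map a page id to its free bytes (hypothetical)."""

    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.free_bytes = {}   # page_id -> bytes still free on that page

    def allocate_page(self):
        page_id = len(self.free_bytes)
        self.free_bytes[page_id] = self.page_size
        return page_id

    def find_page(self, record_size):
        """Reserve room for a record and return the chosen page id."""
        if record_size > self.page_size:
            raise ValueError("record larger than a page")
        # Scan the directory only, never the data pages themselves.
        for page_id, free in self.free_bytes.items():
            if free >= record_size:
                break
        else:
            page_id = self.allocate_page()   # no page has room: grow the file
        self.free_bytes[page_id] -= record_size
        return page_id
```

The search touches only directory entries, which is why this scheme tends to make inserts cheaper than walking a linked list of data pages.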
24
Q

What is the structure of an index file?

A

Leaf pages contain data entries and are chained (prev/next)

non-leaf pages have index entries used to direct searches based on key

25
Q

What is a system catalog?

A

A set of relations, itself stored in the database, that holds metadata including index structures, relation names, attributes, etc.

26
Q

What are the two main techniques of RAID?

A

Data striping: data is partitioned across several disks for performance

Redundancy: redundant information allows data to be reconstructed after a disk failure
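
Both techniques can be illustrated in a few lines. The sketch assumes round-robin striping at block granularity and XOR parity as the redundancy scheme (one common choice); the function names are hypothetical.

```python
from functools import reduce


def locate_block(block_no, num_disks):
    """Round-robin striping: logical block -> (disk, block index on that disk)."""
    return block_no % num_disks, block_no // num_disks


def parity(blocks):
    """XOR equal-sized blocks byte by byte to form a parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the one missing block from the survivors and the parity block."""
    return parity(list(surviving_blocks) + [parity_block])
```

Consecutive logical blocks land on different disks, so a large read can proceed in parallel, and any single lost block equals the XOR of the remaining blocks and the parity block.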

27
Q

What are 4 RAID levels and the reasons for them?

A

Level 0: when data loss is not an issue
Level 3: workloads consisting mainly of large transfer requests for several contiguous blocks
Level 5: good general-purpose choice
Level 6: when higher reliability is required