Storage Flashcards
1
Q
Hard Disk
A
- Persistent storage; all data is preserved when the power is turned off
2
Q
Disk Inefficiency
A
- In the time it takes to access the disk, the processor can execute millions of instructions
3
Q
Char vs Varchar
A
- Char: a fixed-length string of exactly n characters, padded with spaces as needed
- Varchar: a variable-length string of up to n characters with no padding
- Values will be truncated if they are too long
4
Q
Fixed-Length Records
A
- Always allocate the maximum possible length
- Use a delimiter if the value is shorter than the max
- Every record has the same length, and fields are in a consistent location
- Potential waste of space
5
Q
Variable-Length Records
A
- Only allocate the space that each record needs
- Option 1: terminate field values with a special delimiter character
- Option 2: precede each field by its length
- Option 3: put offsets and other metadata in a record header (-1 for NULL), and include an offset for the end of the record
6
Q
B-Tree (Order m)
A
- Each node has between m and 2m items
- Each internal node has between m + 1 and 2m + 1 children (except the root node)
- Has a perfect balance
7
Q
Search in B-Trees
A
- Never need to search more than one of the subtrees of a node
8
Q
Insertion in B-Trees
A
- Search for k until you reach a leaf node
- If the leaf node has fewer than 2m items, add the new item to the leaf node, else split up the node, and insert the middle key into the parent
9
Q
B-Tree Efficiency
A
- # of nodes accessed <= tree height + 1
- Tree height <= log_{m}(n) where n is the number of items
- Search and insertion are O(log_{m}(n))
10
Q
B+Tree (Order m)
A
- Data items are only found in the leaf nodes
- Shorter tree requires less disk I/O
- Leaves are linked together to improve efficiency
11
Q
Search in B+Trees
A
- Keep searching until a leaf node is reached, even if the key is seen in an internal node
12
Q
Inserting in B+Trees
A
- Search for k until we reach a leaf node, even if k is an internal key
- If the leaf node has more than 2m items, split the leaf node
- Put the first m items in the left node split, and the remaining m + 1 items in the right node split (including the middle item)
- Copy the key of the middle item into the parent
13
Q
Deletion in B-Trees and B+Trees
A
- Search for the item and remove it
- When a node has fewer than m items, if a sibling node has more than m items, take items from it, otherwise merge it with the sibling
- If the key of the removed item is in an internal node, don’t remove the node
14
Q
Static Hashing
A
- The number of buckets never changes
- Use overflow buckets when full
15
Q
Simple Dynamic Hashing
A
- When the hash table gets too full, double the number of buckets and rehash all existing items