Storage Flashcards

Question 1

Q

Hard Disk

Answer

A

Persistent storage; all data is preserved when the power is turned off

Question 2

Q

Disk Inefficiency

Answer

A

In the time it takes to access the disk, the processor can execute millions of instructions

Question 3

Q

Char vs Varchar

Answer

A

Char: a fixed-length string of exactly n characters, padded with spaces as needed
Varchar: a variable-length string of up to n characters with no padding
Values will be truncated if they are too long

Question 4

Q

Fixed-Length Records

Answer

A

Always allocate the maximum possible length
Use a delimiter if the value is shorter than the max
Every record has the same length, and fields are in a consistent location
Potential waste of space

Question 5

Q

Variable-Length Records

Answer

A

Only allocate the space that each record needs
Option 1: terminate field values with a special delimiter character
Option 2: precede each field by its length
Option 3: put offsets and other metadata in a record header (-1 for NULL), and include an offset for the end of the record

Question 6

Q

B-Tree (Order m)

Answer

A

Each node has between m and 2m items
Each internal node has between m + 1 and 2m + 1 children (except the root node)
Has a perfect balance

Question 7

Q

Search in B-Trees

Answer

A

Never need to search more than one of the subtrees of a node

Question 8

Q

Insertion in B-Trees

Answer

A

Search for k until you reach a leaf node
If the leaf node has fewer than 2m items, add the new item to the leaf node, else split up the node, and insert the middle key into the parent

Question 9

Q

B-Tree Efficiency

Answer

A

# of nodes accessed <= tree height + 1
Tree height <= log_{m}(n) where n is the number of items
Search and insertion are O(log_{m}(n))

Question 10

Q

B+Tree (Order m)

Answer

A

Data items are only found in the leaf nodes
Shorter tree requires less disk I/O
Leaves are linked together to improve efficiency

Question 11

Q

Search in B+Trees

Answer

A

Keep searching until a leaf node is reached, even if the key is seen in an internal node

Question 12

Q

Inserting in B+Trees

Answer

A

Search for k until we reach a leaf node, even if k is an internal key
If the leaf node has more than 2m items, split the leaf node
Put the first m items in the left node split, and the remaining m + 1 items in the right node split (including the middle item)
Copy the key of the middle item into the parent

Question 13

Q

Deletion in B-Trees and B+Trees

Answer

A

Search for the item and remove it
When a node has fewer than m items, if a sibling node has more than m items, take items from it, otherwise merge it with the sibling
If the key of the removed item is in an internal node, don’t remove the node

Question 14

Q

Static Hashing

Answer

A

The number of buckets never changes
Use overflow buckets when full

Question 15

Q

Simple Dynamic Hashing

Answer

A

When the hash table gets too full, double the number of buckets and rehash all existing items

Question 16

Q

Linear Hashing

Answer

Study These Flashcards

A

Treats the hash value as a binary number, using the i rightmost bits of that number
i = ceil(log_{2}(n))
If there is a bucket with the index given by the i rightmost bits put the key there, else use the bucket specified by the rightmost i - 1 bits

Question 17

Q

Linear Hashing - Adding a Bucket

Answer

Study These Flashcards

A

Add only one new bucket
If the new bucket’s binary index is 1xyz, only rehash the bucket with binary index 0xyz

Question 18

Q

Choosing an Index Structure

Answer

Study These Flashcards

A

If the set of frequently accessed pages can fit in the cache, use a B-tree/B+tree
If subsequent queries are for nearby keys, use a B-Tree/B+tree
If the set of frequently accessed pages is very large, use a hash table

Storage Flashcards

(18 cards)