Efficiency, Indexing, Physical Design Flashcards

Question 1

Q

Why should we study efficiency?

Answer

A

If a database and its DBMS do not run fast enough to be useful, the point is missed entirely.

Question 2

Q

Main Memory (RAM)

Answer

A

Volatile, fast, small, and expensive

Question 3

Q

Secondary Memory (DISK)

Answer

A

Permanent, slow, big, and cheap

Question 4

Q

Main Memory Access Time

Answer

A

30 ns (3 x 10^-8 sec)

Question 5

Q

Disk Access Time

Answer

A

10 ms (1 x 10^-2 sec)

Only this cost (disk I/O) is counted when estimating lookup time; main memory access is negligible by comparison.

Question 6

Q

Parts of the Disk

Answer

A

Read/Write Head
Actuator
Arm
Spindle
Track/Cylinder
Sector
Block
Platters

Question 7

Q

Spanned Representation

Answer

A

Where a record is split up between two blocks of disk memory

Question 8

Q

Unspanned Representation

Answer

A

Where a record exists on a single block on disk memory
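The space trade-off between the two representations can be sketched as below. The function name, the 4096-byte block, and the 1500-byte record are hypothetical values chosen for illustration; the sketch assumes fixed-length records no larger than a block and ignores block headers and reserved free space.

```python
def blocks_needed(n_records, record_bytes, block_bytes, spanned):
    """Number of blocks needed to store fixed-length records."""
    if spanned:
        # Records may cross block boundaries, so no space is wasted.
        total_bytes = n_records * record_bytes
        return -(-total_bytes // block_bytes)  # ceiling division
    # Unspanned: only whole records fit; leftover space per block is unused.
    records_per_block = block_bytes // record_bytes
    return -(-n_records // records_per_block)

# Hypothetical file: 100 records of 1500 bytes in 4096-byte blocks.
print(blocks_needed(100, 1500, 4096, spanned=False))
print(blocks_needed(100, 1500, 4096, spanned=True))
```

Under these assumptions the unspanned layout needs more blocks, but every record can be read with a single block access.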

Question 9

Q

Why not fill up a block with data?

Answer

A

We may want to leave space at the end of a block in case we need to insert new records.
Target is 80% filled.

Question 10

Q

File

Answer

A

A series of blocks linked by address pointers.

Question 11

Q

Seek Time

Answer

A

Time it takes to move the read/write head to the track (cylinder) that holds the block. Costs 3-8 ms

Question 12

Q

Rotation Delay

Answer

A

Time it takes for the disk to rotate until the block is under the read/write head. Costs 2-3 ms

Question 13

Q

Transfer Time

Answer

A

Time it takes to transfer the block's data over the data bus to main memory. Costs 0.5-1 ms
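Adding up the three components gives the time for one block read. The sketch below only sums the ranges quoted on the seek, rotation, and transfer cards; the constant and function names are hypothetical.

```python
# Component ranges in milliseconds, taken from the cards above.
SEEK_MS = (3.0, 8.0)
ROTATION_MS = (2.0, 3.0)
TRANSFER_MS = (0.5, 1.0)

def access_time_range_ms():
    """Return (best, worst) total time to read one block, in ms."""
    best = SEEK_MS[0] + ROTATION_MS[0] + TRANSFER_MS[0]
    worst = SEEK_MS[1] + ROTATION_MS[1] + TRANSFER_MS[1]
    return best, worst

best, worst = access_time_range_ms()
print(f"one block read: {best} ms to {worst} ms")
```

The 10 ms disk access figure used elsewhere in these cards falls inside this range.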

Question 14

Q

LRU buffer management strategy

Answer

A

When we run out of buffer space and need to free some, the least recently used buffer page is overwritten.

Excellent for merge joins
Performs badly for nested loop joins, which come back to blocks that have already been evicted
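Why repeated scans hurt under LRU can be seen in a small simulation. This is a minimal sketch, not a real buffer manager: `count_misses`, the block ids, and the buffer sizes are all hypothetical.

```python
from collections import OrderedDict

def count_misses(accesses, buffer_size):
    """Simulate an LRU buffer pool and count block reads from disk."""
    buffer = OrderedDict()  # block id -> None, ordered oldest to newest
    misses = 0
    for block in accesses:
        if block in buffer:
            buffer.move_to_end(block)  # mark as most recently used
        else:
            misses += 1
            if len(buffer) >= buffer_size:
                buffer.popitem(last=False)  # evict least recently used
            buffer[block] = None
    return misses

# Hypothetical nested loop join: the inner relation (5 blocks) is
# rescanned 3 times.
inner_scans = list(range(5)) * 3
print(count_misses(inner_scans, buffer_size=4))  # one frame too few
print(count_misses(inner_scans, buffer_size=5))  # inner relation fits
```

With one frame too few, each block is evicted just before it is needed again, so every access goes to disk. A single sequential pass, as in a merge join, touches each block once and is unaffected.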

Question 15

Q

Heap

Answer

A

Unsorted file of data.

Question 16

Q

Heap Lookup Time

Answer

A

N/2 | N = # data blocks
(ex. 200,000 / 2 * 0.01 s = 1,000 s = 16.7 min)

Sometimes we’re lucky and the target is in the first location.
Sometimes we’re unlucky and it’s in the last location.
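The average-case arithmetic can be reproduced with a few lines. The function name is hypothetical, and the 0.01 s per block read is the disk access time from the earlier card.

```python
BLOCK_READ_S = 0.01  # 10 ms per block read

def avg_linear_scan_seconds(n_blocks, block_read_s=BLOCK_READ_S):
    """Average cost of a linear scan that stops at the matching block."""
    return (n_blocks / 2) * block_read_s

seconds = avg_linear_scan_seconds(200_000)
print(f"{seconds:.0f} s = {seconds / 60:.1f} min")
```

The worst case, where the target is in the last block, costs twice the average.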

Question 17

Q

Sorted Binary Search

Answer

A

Inspect the record halfway through the file and decide whether to search before or after it. Repeat on the remaining half until the record is found.
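A minimal sketch of the procedure at block granularity, counting simulated disk reads. The file layout (a list of non-empty sorted blocks) and the function name are assumptions made for illustration.

```python
def binary_search_blocks(blocks, key):
    """Search a sorted file; return (block index or None, blocks read).

    `blocks` is a list of non-empty sorted lists of keys, in key order.
    """
    lo, hi = 0, len(blocks) - 1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]  # one simulated disk read
        reads += 1
        if key < block[0]:
            hi = mid - 1  # target lies before this block
        elif key > block[-1]:
            lo = mid + 1  # target lies after this block
        else:
            return (mid if key in block else None), reads
    return None, reads

# Hypothetical file: 8 blocks of 4 keys each (keys 0..31).
file_blocks = [list(range(i, i + 4)) for i in range(0, 32, 4)]
print(binary_search_blocks(file_blocks, 22))
```

Each iteration halves the range of candidate blocks, which is where the log2(N) block reads come from.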

Question 18

Q

Sorted Linear Search Time

Answer

A

N/2 | N = # data blocks

ex. 200,000 / 2 * 0.01 s = 1,000 s = 16.7 min

Question 19

Q

Sorted Binary Search Time

Answer

A

log2(N) | N = # data blocks

ex. log2(200,000) * 0.01 s = 18 * 0.01 s = 0.18 s

Question 20

Q

Primary Index

Answer

A

Copy the key of a record into a separate index block, paired with a pointer to the address of the data block that holds that key.

Block design:
K = Key
P = Pointer
+----+----+----+----+----+----+
| K1 | P1 | K2 | P2 | K3 | P3 |
+----+----+----+----+----+----+
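A minimal sketch of building and using such an index, assuming a sparse layout with one (key, pointer) entry per data block and block numbers standing in for disk addresses. The function names and sample keys are hypothetical.

```python
from bisect import bisect_right

def build_sparse_index(blocks):
    """One (key, pointer) entry per data block: first key and block number."""
    return [(block[0], block_no) for block_no, block in enumerate(blocks)]

def lookup(index, blocks, key):
    """Find the block that may hold `key`, then search inside that block."""
    first_keys = [k for k, _ in index]
    pos = bisect_right(first_keys, key) - 1  # last entry with first key <= key
    if pos < 0:
        return None  # key is smaller than every key in the file
    _, block_no = index[pos]
    return block_no if key in blocks[block_no] else None

data_blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
index = build_sparse_index(data_blocks)
print(index)
print(lookup(index, data_blocks, 50))
```

The search runs over the small index rather than the data file, and only one data block is read at the end.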

Question 21

Q

Primary Index Lookup Time

Answer

A

log2(n) + 1 | n = # index blocks

n = index entries / fanout | ex. fanout = 60

Sparse (200,000 index entries, one per data block):
(log2(200,000 / 60) + 1) * 0.01s = (12+1)*0.01s = 0.13s

Dense (4M index entries, one per record):
(log2(4,000,000 / 60) + 1) * 0.01s = (16+1)*0.01s = 0.17s
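The arithmetic on this card can be checked with a short sketch. It assumes the number of index blocks is the number of index entries divided by the fanout, and 0.01 s per block read; the function name is hypothetical. The sketch does not round the logarithm, so intermediate values differ slightly from the card while the totals agree to two decimals.

```python
import math

def index_lookup_seconds(n_entries, fanout=60, block_read_s=0.01):
    """Binary search over the index blocks, plus one data block read."""
    index_blocks = n_entries / fanout
    return (math.log2(index_blocks) + 1) * block_read_s

print(f"sparse: {index_lookup_seconds(200_000):.2f} s")
print(f"dense:  {index_lookup_seconds(4_000_000):.2f} s")
```

The dense index has 20 times as many entries, but the logarithm keeps the lookup only a few block reads slower.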

Question 22

Q

Index Block Calculation

Answer

A

index entries / fanout (for a sparse index: data blocks / fanout)

Question 23

Q

Fanout

Answer

A

The number of (key, pointer) entries that fit in one index block, i.e. how many blocks a single index block can point to.

Question 24

Q

Sparse (Single Level) Index

Answer

A

Index records are not created for every search key value; typically there is one per data block.

Question 25

Q

Dense Index

Answer

A

An index record is created for every search key value in the database. Requires more space.

Question 26

Q

Multi Level Index

Answer

A

An index built on top of another index: each upper level indexes the index blocks of the level below it.

Question 27

Q

Multi Level Lookup Time

Answer

A

log_fanout(n) + 1 | n = # index blocks
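A rough sketch of the block reads this formula implies, with the logarithm rounded up to a whole number of levels. Integer multiplication is used in place of a floating-point logarithm to avoid rounding errors; the function name is hypothetical, and the example sizes reuse the 200,000 and 4,000,000 index entries and fanout of 60 from the single-level lookup card.

```python
def multilevel_lookup_reads(n_index_blocks, fanout=60):
    """Block reads for one lookup: index levels descended plus one data block."""
    levels = 0
    reach = 1  # number of index blocks reachable through `levels` levels
    while reach < n_index_blocks:
        reach *= fanout
        levels += 1
    return levels + 1

sparse_blocks = -(-200_000 // 60)    # ceiling division: 3,334 index blocks
dense_blocks = -(-4_000_000 // 60)   # 66,667 index blocks
print(multilevel_lookup_reads(sparse_blocks) * 0.01, "s")
print(multilevel_lookup_reads(dense_blocks) * 0.01, "s")
```

Because the base of the logarithm is the fanout rather than 2, the number of reads stays in the single digits even for the dense index.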

Question 28

Q

Multi Level Index B-Tree

Answer

A

All data is at the bottom, and all nodes above that are index nodes.

If the tree deteriorates over time, so does the search time.
It’s very important that the distance from the root to the base level never change over time.

Question 29

Q

Static Hashing

Answer

A

A hash function maps each key in the key space to an address in a fixed address space. That address holds a pointer to a linked list of data blocks.
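A minimal sketch of the idea, assuming integer keys, a modulo hash function, and small in-memory lists standing in for disk blocks. The class name, bucket count, and block capacity are hypothetical.

```python
class StaticHashFile:
    """Fixed number of buckets; each bucket is a chain of fixed-size blocks."""

    def __init__(self, n_buckets=4, block_capacity=2):
        self.n_buckets = n_buckets
        self.block_capacity = block_capacity
        # Each bucket starts as a chain holding one empty block.
        self.buckets = [[[]] for _ in range(n_buckets)]

    def _address(self, key):
        return key % self.n_buckets  # simple hash for integer keys

    def insert(self, key):
        chain = self.buckets[self._address(key)]
        if len(chain[-1]) >= self.block_capacity:
            chain.append([])  # link an overflow block
        chain[-1].append(key)

    def lookup(self, key):
        """Return the number of blocks read to find `key`, or None."""
        chain = self.buckets[self._address(key)]
        for reads, block in enumerate(chain, start=1):
            if key in block:
                return reads
        return None

f = StaticHashFile()
for k in (0, 4, 8, 12, 1):
    f.insert(k)
print(f.lookup(12), f.lookup(1), f.lookup(5))
```

Keys 0, 4, 8, and 12 all hash to the same address, so the chain for that bucket grows to two blocks and finding 12 takes two reads. This is the cost of collisions in a fixed address space.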

Question 30

Q

Properties of a Good Hash Function

Answer

A

1) Distribute values uniformly over the address space
2) Fill buckets as much as possible
3) Avoid collisions

Question 31

Q

Properties of Good Static Hashing

Answer

A

The chains of linked data blocks are of uniform length, and the address space is sized so that the chains don’t get too deep.

Question 32

Q

Blocking Factor

Answer

A

Number of records that fit in one block