Basic Model of Locality Flashcards
In the memory hierarchy, what happens as we move from disk to processor?
The memory at each level of the hierarchy gets faster, but it also gets smaller
Describe the structure of the von Neumann model (its parts and their relationships)
The von Neumann model is composed of an infinite but slow memory, a small but fast memory, and a sequential processor, i.e., it's like disk, cache, and processor.
The fast memory has a maximum storage of Z “words”.
The processor can only operate on data stored in the small, fast memory.
Describe the rules of the von Neumann model
1) Local storage rule: The processor can't work with data unless it is stored in fast memory.
2) Block-transfer rule: Transfers of data between slow and fast memory occur in blocks of L words.
Block-transfer implies that when you want to move one word x into fast memory, you must move the entire L-word block it belongs to. Therefore the other data stored in its block affects efficiency (does the block contain what you'll need next, or do you need to bring in another block?). This is where "data alignment" matters.
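A minimal Python sketch of the block-transfer rule (the LRU cache model, variable names, and numbers are my own illustration, not part of the cards): it counts Q for a stream of word addresses, assuming an ideal fast memory of Z words organized into Z/L blocks. Touching words with stride L pays one transfer per word, while unit stride pays one transfer per L words.

    from collections import OrderedDict

    def count_transfers(addresses, Z, L):
        cache = OrderedDict()              # block id -> present, in LRU order
        num_blocks = Z // L                # fast memory holds Z/L blocks
        Q = 0
        for a in addresses:
            b = a // L                     # block that word a belongs to
            if b in cache:
                cache.move_to_end(b)       # hit: no transfer needed
            else:
                Q += 1                     # miss: load the whole L-word block
                cache[b] = True
                if len(cache) > num_blocks:
                    cache.popitem(last=False)   # evict least recently used block
        return Q

    n, Z, L = 1024, 64, 8
    print(count_transfers(range(n), Z, L))           # unit stride: n/L = 128 transfers
    print(count_transfers(range(0, n*L, L), Z, L))   # stride L: 1024 transfers, one per word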
What is data alignment?
Data is aligned when the words we want to load into fast memory together belong to the same block, so a single block transfer brings them all in at once.
What are the costs in the von Neumann model?
1) Work: W(n) = the # of operations
2) Transfers: Q(n; Z, L) = # of L-sized slow-fast transfers (loads and stores)
n = input size
Z = max # of words stored in fast memory
L = # of words in a transfer block
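As a worked example (my illustration, not from the cards): summing an n-word array that starts in slow memory costs W(n) = n - 1 additions and Q(n; Z, L) = ceil(n/L) block loads, since each block is read exactly once and the running sum stays in fast memory (Z only needs to hold one block plus the accumulator, so it doesn't appear).

    import math

    def sum_costs(n, L):
        W = n - 1                # work: n - 1 additions
        Q = math.ceil(n / L)     # transfers: each L-word block is loaded once
        return W, Q

    print(sum_costs(n=1_000_000, L=16))   # (999999, 62500)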
What are our algorithmic design goals for a hierarchical memory system?
1) Work optimality (of course!): W(n) = Theta(W*(n)), where W*(n) is the work of the optimal sequential algorithm. i.e., maintain the work of the best sequential algorithm.
2) High computational intensity: Maximize I(n; Z, L) = W(n) / (L * Q(n; Z, L)), i.e. ops per word transferred to fast memory, i.e. the data reuse of your algorithm.
Do not optimize intensity at the expense of work optimality!
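Continuing the array-sum example (an illustration of the definition, not from the cards): its intensity is I = W / (L * Q) ≈ 1 op per word, i.e., essentially no data reuse, so it will be memory-bound on most machines.

    def intensity(W, L, Q):
        return W / (L * Q)       # ops per word transferred

    W, Q, L = 999_999, 62_500, 16
    print(intensity(W, L, Q))    # ~1.0 op/word: streaming, no reuse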
What is W(n)?
Work as a function of input size
What is Q(n; Z, L)?
Q(n; Z, L) = # of L-sized slow-fast transfers (loads and stores)
What is Z?
Z = max # of words stored in fast memory
What is L?
L = # of words in a transfer block
Let tau be the time per operation. How much time does it take just to do computations (no transfers)?
T_comp = tau*W
Let alpha be the time to transfer one word. How much time does it take just to do transfers (no computations)?
T_mem = alpha * L * Q
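Plugging made-up machine parameters into the array-sum example (tau and alpha are purely illustrative, not from the cards):

    tau   = 1e-9              # seconds per operation (assumed)
    alpha = 1e-8              # seconds per word transferred (assumed)
    W, Q, L = 999_999, 62_500, 16

    T_comp = tau * W          # compute-only time: ~1.0e-3 s
    T_mem  = alpha * L * Q    # transfer-only time: ~1.0e-2 s
    print(T_comp, T_mem)      # memory time dominates here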
Assuming perfect overlap between the time to compute and time to transfer data, what is the minimum execution time for program?
T >= max{T_comp, T_mem} = max{tau * W, alpha * L * Q} = tau * W * max{1, (alpha/tau) / (W/(L*Q))}
- tau * W is the IDEAL running time (assumes data movement is free)
- the max term is the PENALTY you pay for moving data
- Note that W/(L*Q) = ops/word = I (intensity!)
- alpha/tau is "machine balance", which we can replace with B
So the penalty is max{1, B/I} and our lower bound is T >= tau * W * max{1, B/I}.
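A sketch of that lower bound with the same made-up numbers as above (tau and alpha remain assumptions for illustration):

    tau, alpha = 1e-9, 1e-8
    W, Q, L = 999_999, 62_500, 16

    I = W / (L * Q)                 # intensity, ops per word (~1.0)
    B = alpha / tau                 # machine balance, ops per word (10.0)
    penalty = max(1.0, B / I)       # ~10x penalty for data movement
    T_lower = tau * W * penalty     # >= ~1.0e-2 s, matching T_mem above
    print(I, B, penalty, T_lower)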
What is machine balance?
B = alpha/tau is "machine balance": the time to move one word divided by the time per operation. It's a ratio of parameters that depend only on the machine.
It tells you how many operations can execute in the time it takes to move a word of data.
Note that it has units of ops/word, just like intensity (I).
What is ideal running time, assuming data movement is free?
tau*W, i.e. time per operation * number of operations