Metrics and Evaluation Flashcards
Latency
The time it takes to get from starting a task to finishing it.
Throughput
How many tasks we can complete per unit time (e.g., per second).
Throughput = 1 / Latency ?
Not always! Pipelining, for example, can significantly improve throughput over 1 / Latency
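For instance (hypothetical numbers), a 5-stage pipeline at a 1 GHz clock has a 5 ns latency per task, yet once the pipeline is full it completes one task per cycle, so throughput is 5x what 1/Latency would suggest. A minimal Python sketch:

```python
# Hypothetical 5-stage pipeline at a 1 GHz clock.
stages = 5
cycle_time_s = 1e-9                  # seconds per cycle
latency_s = stages * cycle_time_s    # 5 ns from start to finish of one task
throughput = 1.0 / cycle_time_s      # one task completes per cycle when full
# Ratio of actual throughput to the 1/Latency estimate is the stage count:
print(throughput * latency_s)        # ~5.0
```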
Two metrics of performance
Latency and throughput
Speedup
Speedup = N where “X is N times faster than Y”
i.e., N = Speed(X) / Speed(Y) = Throughput(X) / Throughput(Y)
For latency, N = Latency(Y)/Latency(X)
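A Python sketch of both speedup formulas (function names and example numbers are made up for illustration):

```python
def speedup_from_latency(lat_y, lat_x):
    """Speedup of X over Y: the slower latency divided by the faster one."""
    return lat_y / lat_x

def speedup_from_throughput(thr_x, thr_y):
    """Speedup of X over Y: X's throughput divided by Y's."""
    return thr_x / thr_y

# X finishes in 2 s what takes Y 10 s, and does 50 tasks/s vs Y's 10 tasks/s:
print(speedup_from_latency(10.0, 2.0))      # 5.0
print(speedup_from_throughput(50.0, 10.0))  # 5.0
```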
Performance as a function of latency and throughput
Performance is proportional to throughput, and proportional to 1/Latency
What are the types of benchmarks?
- Real applications
- Kernels
- Synthetic benchmarks
- Peak performance
Pros and cons of real applications for benchmarking?
Pro: Most representative
Con: Most difficult to set up; they may require an OS, graphics hardware, input data, etc., which may not all be available (especially early in the design process)
What is a benchmark?
A set of programs and input data agreed upon for performance measurement
What is an application kernel (re: benchmarking)? Pros and cons?
Only the most time-consuming part of an application. Pro: easier to set up and run than the full application while still representative of where the time goes
Con: you may still be missing infrastructure needed to run the kernel, e.g., a compiler
What is a synthetic benchmark?
A program designed to emulate the behavior of an application kernel, but even simpler to set up and run
What is peak performance (re: benchmarking)?
The theoretical maximum number of instructions per second the machine could execute. Real programs rarely come close, so this is really only good for marketing!
When should we use each type of benchmark?
When designing a new machine, synthetic benchmarks. We can run them at the design stage because they have minimal requirements, enabling us to choose the best design. However, they’re not great for reporting performance to others.
Once a prototype is built, we can use kernels!
Once we have a real machine that we are trying to sell someone, we should use real applications.
How do you compute the average of several speedup calculations? How do you NOT compute it?
Do NOT use the arithmetic mean! This is because speedups are ratios.
Say one app gets a speedup of 2 because its throughput goes from 2 to 4, and another gets a speedup of 8 because its throughput goes from 1 to 8. Ideally we would calculate the average speedup directly from these throughputs, as total throughput after over total throughput before:
(4 + 8) / (2 + 1) = 12/3 = 4
Note this is NOT equal to the arithmetic mean of the speedups, (2 + 8) / 2 = 5.
If we only have the average speedups, we can use their geometric mean, i.e., for n speedups, the geometric mean = the nth root of the product of all n speedups. Geometric mean of 2 and 8 is sqrt(2*8) = 4.
We can use it because the ratio of the geometric means of the throughputs equals the geometric mean of the speedups:
sqrt(4*8) / sqrt(2*1) = sqrt(32) / sqrt(2) = 4 = sqrt(2*8)
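The same numbers in a small Python sketch (using the throughputs from the example above):

```python
import math

def geometric_mean(xs):
    """nth root of the product of n values."""
    return math.prod(xs) ** (1.0 / len(xs))

speedups = [2.0, 8.0]
print(geometric_mean(speedups))        # 4.0 -- matches the throughput-based answer
print(sum(speedups) / len(speedups))   # 5.0 -- arithmetic mean overstates it

# Ratio of geometric means of throughputs = geometric mean of speedups:
before, after = [2.0, 1.0], [4.0, 8.0]
print(geometric_mean(after) / geometric_mean(before))  # ~4.0
```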
Iron Law of Performance
CPU Time = # of instructions in program * CPI * clock cycle time, because:
instructions/program * cycles/instruction * seconds/cycle = seconds/program (after canceling cycles and instructions)
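The Iron Law as a one-liner in Python (the example numbers are hypothetical):

```python
def cpu_time(instructions, cpi, cycle_time_s):
    """Iron Law: instr/program * cycles/instr * seconds/cycle = seconds/program."""
    return instructions * cpi * cycle_time_s

# Hypothetical: 1 billion instructions, CPI of 2, 2 GHz clock (0.5 ns/cycle).
print(cpu_time(1e9, 2.0, 0.5e-9))  # ~1.0 second
```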
What determines the values of each component in the Iron Law of Performance/CPU time?
Number of instructions in program: algorithm, compiler, instruction set
CPI: instruction set, processor design
clock cycle time: processor design, circuit design, transistor physics
How does computer architecture affect the Iron Law of Performance/CPU time?
Via instruction set and processor design.
Instruction set: more complex instructions will reduce # of instructions, but increase CPI. Vice versa for simple instructions.
Processor design: We can have processor with short clock cycle at the expense of greater CPI, or longer clock cycle with lower CPI.
A good design balances these tradeoffs.
Iron Law for Unequal Instruction Times (different CPIs for different instruction types)
We need to combine instructions/program and cycles/instruction into:
The sum over all instruction types of (the number of instructions of that type) * (CPI for that instruction type)
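A Python sketch of the per-type sum (the instruction mix, CPIs, and clock rate are made up for illustration):

```python
def cpu_time_mixed(instr_counts, cpis, cycle_time_s):
    """Sum over instruction types of count * CPI, times the clock cycle time."""
    total_cycles = sum(instr_counts[t] * cpis[t] for t in instr_counts)
    return total_cycles * cycle_time_s

# Hypothetical mix on a 2 GHz (0.5 ns/cycle) machine:
counts = {"alu": 600e6, "load": 300e6, "branch": 100e6}
cpis   = {"alu": 1.0,   "load": 3.0,   "branch": 2.0}
# cycles = 600e6*1 + 300e6*3 + 100e6*2 = 1.7e9  ->  ~0.85 s
print(cpu_time_mixed(counts, cpis, 0.5e-9))
```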
Amdahl’s Law
We use this to calculate overall speedup when only some instructions are affected by a change.
overall speedup = 1/ ((1-FRAC_enh) + FRAC_enh/speedup_enh)
where FRAC_enh is the fraction of execution TIME affected by the enhancement. TIME ONLY, not the fraction of instructions.
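A direct Python translation of the formula (the example fraction and speedup are made up):

```python
def amdahl_speedup(frac_enh, speedup_enh):
    """Overall speedup when frac_enh of execution TIME gets speedup_enh."""
    return 1.0 / ((1.0 - frac_enh) + frac_enh / speedup_enh)

# Speed up 50% of execution time by 2x -> only ~1.33x overall.
print(amdahl_speedup(0.5, 2.0))
```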
Implications of Amdahl’s Law
It’s usually better to have a small speedup on a larger percentage of execution time than a large speedup on a small percentage, i.e. MAKE THE COMMON CASE FAST!
To see why, imagine an infinite speedup on a small portion of execution time: overall speedup is still capped at 1 / (1 - FRAC_enh).
Lhadma’s Law
While Amdahl tells us to make the common case fast, we shouldn’t mess up the uncommon case too much!
While the fraction of execution time caps the benefit of any speedup, there is no such floor for slowdowns: slow down even a small portion enough and the program takes arbitrarily long.
What happens if we keep speeding up the same part of program?
Diminishing returns! After each improvement, that portion of the program becomes a smaller percentage of the total, so the overall speedups become smaller.
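A short Python loop showing the effect: a hypothetical 2x speedup is applied repeatedly to a part that starts at 50% of execution time, and each round's overall gain shrinks.

```python
def amdahl_speedup(frac, speedup):
    """Amdahl's Law; frac is the fraction of execution TIME affected."""
    return 1.0 / ((1.0 - frac) + frac / speedup)

frac = 0.5            # the optimized part's share of total execution time
round_speedups = []
for _ in range(3):
    round_speedups.append(amdahl_speedup(frac, 2.0))
    # After each 2x speedup, that part shrinks relative to the rest:
    frac = (frac / 2.0) / ((1.0 - frac) + frac / 2.0)
print(round_speedups)  # overall speedup per round shrinks: ~1.33, ~1.20, ~1.11
```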