Performance Flashcards

1
Q

what is latency?

A

the time it takes to send a request and receive its response (like the time it takes for one customer's order to be taken and filled)

2
Q

what is throughput?

A

the amount of data (or number of requests) that can be successfully processed in a given amount of time (like the number of customers that can be served in a certain amount of time)

3
Q

given the different stages in a server, what is the latency of the server?

A

the sum of the latencies of all stages

4
Q

given the different stages in a server, what is the throughput of the server?

A

the minimum of the throughputs of the stages (the slowest stage limits the whole server)
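The two rules above can be checked with a quick numeric sketch (the stage names and numbers here are hypothetical, not from the cards):

```python
# Per-stage numbers for a hypothetical three-stage server (values are made up).
latency_ms = {"parse": 2.0, "compute": 10.0, "respond": 3.0}
throughput_rps = {"parse": 500, "compute": 100, "respond": 300}

total_latency = sum(latency_ms.values())          # latency = sum of all stages
server_throughput = min(throughput_rps.values())  # throughput = min over stages

print(total_latency)      # 15.0 ms end to end
print(server_throughput)  # 100 req/s — "compute" is the bottleneck
```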

5
Q

what is utilization?

A

percentage of capacity that is being used

6
Q

what is overhead w/ examples?

A

the amount of resources that is “wasted”, i.e. not spent on the application’s own work (usually the resource use at lower layers, e.g. memory allocation, virtualization, the operating system)

7
Q

what is useful work?

A

resources spent on actual work
(usually the use at the current layer in a system ie the application)

8
Q

what is the formula for overhead as percentage?

A

overhead = capacity − useful work
overhead percentage = overhead / useful work (× 100 for a percent)
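Plugging in hypothetical numbers (100 units of capacity, 80 spent on useful work) to see how the formula behaves:

```python
# Hypothetical numbers: 100 units of capacity, 80 spent on useful work.
capacity = 100.0
useful_work = 80.0

overhead = capacity - useful_work    # 20.0 units "wasted"
percentage = overhead / useful_work  # 0.25 -> 25% overhead
print(overhead, percentage)
```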

9
Q

bottleneck

A

the component that restricts throughput

10
Q

if you’re optimizing for throughput what should you focus on?

A

the bottleneck (the stage with the lowest throughput)

11
Q

if you’re optimizing for latency what should you focus on?

A

the component with the highest latency

12
Q

what is the number one rule for optimization and why?

A

don’t prematurely optimize, because we are bad at predicting where performance problems will actually be

13
Q

what is Amdahl’s law?

A

speedup = 1 / ((1 − p) + (p / s)), where p is the fraction of the work that is sped up and s is the speedup of that fraction

14
Q

what is the best case speedup?

A

as s → ∞, speedup → 1 / (1 − p); the unimproved fraction limits the best-case speedup
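Both formulas can be checked numerically (the 90% / 10-worker numbers are made up for illustration):

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the work is sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# 90% of the work parallelized across 10 workers:
print(round(amdahl_speedup(0.9, 10), 2))  # 5.26

# As s grows, the serial 10% dominates; the speedup approaches 1 / (1 - p) = 10:
print(round(amdahl_speedup(0.9, 1_000_000), 2))
```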

15
Q

what are the three solutions for latency?

A

fast-path, parallelize, speculation

16
Q

what is fast-path?

A

reduce latency for the common requests by giving them a shorter path (rare or complex requests still take the slow path)

17
Q

how do we parallelize (concurrency) to help with latency?

A

run independent steps in parallel (if two steps don’t depend on each other’s results, they can run at the same time)

18
Q

what is speculation and what does it cost?

A

predict what work might be needed and do it ahead of time; if the prediction is correct, latency goes down
- trades off work for latency: if the system is running near capacity, you might be trading throughput for latency

19
Q

what are the three solutions for throughput?

A

batching, dallying, concurrency or pipelining

20
Q

what is batching?

A

grouping multiple tasks or inputs together and processing them as a single batch instead of one by one
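A minimal sketch of the grouping step (the `batches` helper is my own illustration, not from the cards):

```python
def batches(items, size):
    """Yield successive groups of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Five tasks processed two at a time instead of one by one:
print(list(batches([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```

Batching helps throughput when each group pays a fixed per-operation cost (a syscall, a disk seek, a network round trip) only once.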

21
Q

what is dallying?

A

deliberately delaying a request in the hope that it can be grouped with later work (enabling batching) or turns out not to be needed at all

22
Q

what is pipelining?

A

give each stage its own threads/memory
connect stages using bounded buffers
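A minimal sketch of a two-stage pipeline with bounded buffers, using Python threads and `queue.Queue` (the stage functions and sentinel protocol are illustrative assumptions):

```python
import queue
import threading

def stage(inbox, outbox, work):
    """One pipeline stage: pull from a bounded buffer, process, push downstream."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel: shut down and pass it downstream
            outbox.put(None)
            return
        outbox.put(work(item))

# Bounded buffers connect the stages; their size limits how far stages drift apart.
q1, q2, q3 = queue.Queue(maxsize=4), queue.Queue(maxsize=4), queue.Queue(maxsize=4)
threading.Thread(target=stage, args=(q1, q2, lambda x: x * 2)).start()
threading.Thread(target=stage, args=(q2, q3, lambda x: x + 1)).start()

for x in [1, 2, 3]:
    q1.put(x)
q1.put(None)                      # signal end of input

results = []
while (item := q3.get()) is not None:
    results.append(item)
print(results)  # [3, 5, 7]
```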

23
Q

what is the working set?

A

the set of distinct items among the last k accesses
(the subset of data that is currently relevant to the program’s execution)
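A minimal sketch of that definition on a made-up access trace (the trace and k are illustrative):

```python
def working_set(trace, k):
    """Distinct items referenced in the last k accesses."""
    return set(trace[-k:])

trace = ["a", "b", "a", "a", "c", "a"]
print(sorted(working_set(trace, 4)))   # ['a', 'c']
print(len(working_set(trace, 4)))      # 2 distinct items — much smaller than k = 4
```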

24
Q

what is the working set hypothesis?

A

for a suitable number of accesses k, the actual working set (the distinct items touched) is much smaller than k

25
Q

what is temporal locality?

A

recently accessed items will be accessed again

26
Q

what is spatial locality?

A

items close to recently accessed items will be accessed soon

27
Q

what are the 3 different associativities in caches, ranked from most to least expensive?

A

— most expensive —
1. fully associative : each item can go in any slot
2. n way associative : each item can go in n locations
3. direct mapped : each item can go in one slot
— least expensive —
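The three placement policies can be illustrated by where a hypothetical 8-slot cache would allow an item to live (the slot count and modulo mapping are my own illustrative assumptions):

```python
CACHE_SLOTS = 8   # hypothetical cache size

def direct_mapped_slot(address):
    """Direct mapped: each item has exactly one legal slot."""
    return address % CACHE_SLOTS

def n_way_slots(address, n):
    """n-way set associative: the set is fixed, any of its n ways is legal."""
    num_sets = CACHE_SLOTS // n
    s = address % num_sets
    return [s * n + way for way in range(n)]

print(direct_mapped_slot(13))    # 5 — one choice
print(n_way_slots(13, 2))        # [2, 3] — two choices
print(list(range(CACHE_SLOTS)))  # fully associative: any of the 8 slots
```

The more slots a lookup must check in parallel, the more hardware it costs, which is why fully associative is the most expensive.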

28
Q

what are the two write policies that come into play when an item is removed from the cache?

A

1. write-back cache
2. write-through cache
29
Q

what is a write-back cache?

A

updates to the cache are not immediately reflected in memory (modified data is written back only when the item is evicted)

30
Q

what is a write-through cache?

A

updates to the cache are immediately reflected in the memory

31
Q

what are the three policies for removing items from the cache?

A

FIFO, LRU, Clock

32
Q

explain FIFO

A

track when each item was added and evict the item that was inserted earliest, regardless of how often it has been accessed

33
Q

explain LRU

A
  • the least recently used item is removed
  • tracks the age of each item
  • age is reset to zero when the item is accessed
  • evicts the item with the highest age
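A minimal sketch matching the bullets above, using Python's `OrderedDict` (the class and its `access` method are my own illustration, not from the cards):

```python
from collections import OrderedDict

class LRUCache:
    """LRU sketch: accessing an item resets its age, the oldest item is evicted."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest first, most recently used last

    def access(self, key):
        if key in self.items:
            self.items.move_to_end(key)         # "age is reset to zero"
        else:
            if len(self.items) >= self.capacity:
                self.items.popitem(last=False)  # evict the least recently used
            self.items[key] = True

cache = LRUCache(2)
for k in ["a", "b", "a", "c"]:   # "c" forces an eviction
    cache.access(k)
print(list(cache.items))  # ['a', 'c'] — "b" was least recently used
```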
34
Q

explain the clock algorithm

A

each item has a reference bit that is set whenever the item is accessed. The items are arranged in a circle with a “clock hand” pointing at one of them. When the cache is full and a new item needs to be added, the hand sweeps around: if the item under the hand has its reference bit set, the bit is cleared and the item is spared; if the bit is already clear, that item is evicted and replaced, making room for the new item.
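A minimal sketch of that sweep (the class, its list-of-slots representation, and the tiny trace are my own illustration):

```python
class ClockCache:
    """Clock eviction sketch: one reference bit per item, a hand that sweeps on eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []   # each slot is [key, reference_bit]
        self.hand = 0

    def access(self, key):
        for slot in self.slots:
            if slot[0] == key:
                slot[1] = 1                  # hit: set the reference bit
                return
        if len(self.slots) < self.capacity:  # miss with free space: just insert
            self.slots.append([key, 1])
            return
        # Miss with a full cache: sweep until an item with a clear bit is found.
        while self.slots[self.hand][1] == 1:
            self.slots[self.hand][1] = 0     # spare it, but clear its bit
            self.hand = (self.hand + 1) % self.capacity
        self.slots[self.hand] = [key, 1]     # evict the victim, install new item
        self.hand = (self.hand + 1) % self.capacity

c = ClockCache(2)
for k in ["a", "b", "a", "c"]:
    c.access(k)
print([s[0] for s in c.slots])  # ['c', 'b'] — the sweep cleared both bits, then evicted "a"
```

Clock approximates LRU with a single bit per item instead of a full age, which is why it's a popular cheap substitute in page replacement.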

35
Q

what does Belady’s anomaly state?

A

a larger cache does not always result in better performance

36
Q

what is a compulsory miss?

A

program has never accessed the data (how many unique items have been introduced?)

37
Q

what is capacity miss?

A

the program is actively using more data than a cache can hold (how many misses would a fully associative cache of my size have)

38
Q

what is a conflict miss?

A

the program has seen the data before, but it was evicted by another item that maps to the same set (a miss that a fully associative cache of the same size would not have)