Papers for Scheduling, Memory Management, IPC, and Sync Flashcards

1
Q

What is the premise of Fedorova’s paper on scheduling for chip multiprocessors?

A

That scheduling can have a large impact on CPU performance and can sometimes even cancel out the benefits of smarter chip designs

2
Q

Where does Fedorova consider the bottleneck to be for CPU performance?

A

Contention for the processor pipeline, because that is the resource that threads compete for when they share a multithreaded processor

3
Q

What types of threads should be scheduled on the same CPU according to Fedorova?

A

Threads that will not compete with each other for the same hardware resources.

4
Q

What is instruction delay latency?

A

The delay between when an instruction starts and when it completes (i.e., how long the thread is blocked by that instruction). It can be used to approximate how heavily a thread uses the processor pipeline

5
Q

How do we approximate the instruction delay latency?

A

Use the mean cycles-per-instruction (CPI), which the hardware already supports measuring. Low CPI == higher pipeline usage; high CPI == lower pipeline usage, because a high-CPI thread spends much of its time blocked
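
A minimal sketch of the calculation (the counter values are made up, not from the paper): CPI is just cycles divided by instructions retired over some interval.

    /* Sketch: compute CPI from two hypothetical hardware-counter deltas
       collected over one scheduling interval. */
    #include <stdio.h>

    int main(void) {
        unsigned long long cycles = 8000000ULL;        /* made-up cycle count       */
        unsigned long long instructions = 2000000ULL;  /* made-up instruction count */

        double cpi = (double)cycles / (double)instructions;  /* CPI = cycles / instructions */

        /* high CPI -> the thread stalls often -> low pipeline usage */
        printf("CPI = %.2f\n", cpi);
        return 0;
    }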

6
Q

What is the outcome of Fedorova paper?

A

That we should schedule threads with high and low CPIs together on the same processor, so that pipeline usage is balanced
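
A toy illustration of the pairing idea (this is not Fedorova’s algorithm, and the CPI values are invented): sort threads by CPI, then co-schedule the lowest-CPI thread with the highest-CPI one so each pairing mixes a pipeline-hungry thread with a frequently blocked one.

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct { int id; double cpi; } thread_info;

    static int by_cpi(const void *a, const void *b) {
        double d = ((const thread_info *)a)->cpi - ((const thread_info *)b)->cpi;
        return (d > 0) - (d < 0);   /* ascending order by CPI */
    }

    int main(void) {
        /* made-up per-thread CPI measurements */
        thread_info t[] = { {0, 6.0}, {1, 1.5}, {2, 3.0}, {3, 1.0} };
        int n = 4;

        qsort(t, n, sizeof t[0], by_cpi);

        /* pair lowest CPI (pipeline-hungry) with highest CPI (often blocked) */
        for (int i = 0; i < n / 2; i++)
            printf("core %d: thread %d (CPI %.1f) + thread %d (CPI %.1f)\n",
                   i, t[i].id, t[i].cpi, t[n - 1 - i].id, t[n - 1 - i].cpi);
        return 0;
    }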

7
Q

What is the test_and_set implementation?

A

while (test_and_set(lock) == BUSY) {}

(+) good latency and low delay, but (-) lots of contention, since every spinning processor keeps issuing atomic operations
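
A minimal sketch of a test_and_set spin lock using C11 atomics (the names lock_t, ts_acquire, and ts_release are illustrative, not from the paper; atomic_exchange plays the role of test_and_set):

    #include <stdatomic.h>
    #include <stdbool.h>

    typedef atomic_bool lock_t;   /* true == BUSY, false == FREE */

    static void ts_acquire(lock_t *lock) {
        /* atomic_exchange sets the flag and returns its old value */
        while (atomic_exchange(lock, true)) {
            /* spin: every iteration is an atomic write, which generates
               coherence traffic even while the lock stays busy */
        }
    }

    static void ts_release(lock_t *lock) {
        atomic_store(lock, false);
    }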

8
Q

What is the test_and_test_and_set implementation?

A

while ((lock == BUSY) || (test_and_set(lock) == BUSY)) {}

This is better because, while the lock is held, we spin on a cached copy of the lock instead of going to memory with test_and_set on every iteration; the atomic operation is only attempted when the cached value shows the lock as free, which gives better contention
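
A sketch of the same idea with C11 atomics (tts_acquire is an illustrative name; the plain read corresponds to the cached lock check, the exchange to test_and_set):

    #include <stdatomic.h>
    #include <stdbool.h>

    typedef atomic_bool lock_t;

    static void tts_acquire(lock_t *lock) {
        for (;;) {
            /* spin on a plain read first: it is served from the local
               cache and does not invalidate other processors' copies */
            while (atomic_load(lock)) { /* lock looks BUSY */ }

            /* the lock looked FREE: now try the actual test_and_set */
            if (!atomic_exchange(lock, true))
                return;   /* old value was FREE, so we now own the lock */
        }
    }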

9
Q

What is the spin lock delay implementation?

A

An improvement on test_and_test_and_set: add a delay in the spin loop before rechecking or retrying the lock. This helps decrease contention, but adds delay between a release and the next acquisition. The delay can be static (a fixed amount, e.g., based on processor ID) or dynamic (e.g., backoff that grows the longer the lock stays busy)
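
A rough sketch of the dynamic variant using exponential backoff (the constants and names are illustrative, not Anderson’s):

    #include <stdatomic.h>
    #include <stdbool.h>

    typedef atomic_bool lock_t;

    static void backoff_acquire(lock_t *lock) {
        unsigned delay = 1;                       /* backoff, in spin iterations */
        while (atomic_exchange(lock, true)) {     /* lock was BUSY */
            for (volatile unsigned i = 0; i < delay; i++) {
                /* waste time before retrying, to reduce contention */
            }
            if (delay < 1024)
                delay *= 2;                       /* grow the delay, with a cap */
        }
    }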

10
Q

What is a queuing lock?

A

Waiting threads are placed in a queue, each spinning on its own array slot. (-) Requires more memory and has higher latency when acquiring a free lock, but (+) has low delay and the best contention behavior, because each processor spins on a separate location
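
A simplified sketch in the spirit of Anderson’s array-based queueing lock (the size N, the padding, and all names are assumptions for illustration; it supports at most N simultaneous contenders):

    #include <stdatomic.h>
    #include <stdbool.h>

    #define N 64   /* assumed maximum number of contending processors */

    typedef struct {
        /* each slot is padded so processors spin in separate cache lines */
        struct { atomic_bool has_lock; char pad[63]; } flags[N];
        atomic_uint queue_last;   /* next ticket to hand out */
    } qlock_t;

    static void qlock_init(qlock_t *l) {
        for (int i = 0; i < N; i++)
            atomic_init(&l->flags[i].has_lock, false);
        atomic_init(&l->flags[0].has_lock, true);   /* first ticket may enter */
        atomic_init(&l->queue_last, 0);
    }

    /* returns the slot index ("ticket"), to be passed to qlock_release */
    static unsigned qlock_acquire(qlock_t *l) {
        unsigned me = atomic_fetch_add(&l->queue_last, 1) % N;
        while (!atomic_load(&l->flags[me].has_lock)) { /* spin on own slot */ }
        return me;
    }

    static void qlock_release(qlock_t *l, unsigned me) {
        atomic_store(&l->flags[me].has_lock, false);          /* reset own slot   */
        atomic_store(&l->flags[(me + 1) % N].has_lock, true); /* hand off the lock */
    }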

11
Q

What are the considerations for spin lock performance according to Anderson?

A
  1. Want to minimize the communication from spinning processors on the memory buses
  2. Want to minimize delay between releasing a lock and a new thread acquiring it
  3. Want to minimize the time it takes to acquire a lock when it is free (don’t want lots of checking to accomplish 1 and 2)
12
Q

What is quiescence?

A

The time it takes, after a lock is released, for all of the waiting processors to each try their test_and_set and for the resulting traffic to die down

13
Q

What is the main problem with test_and_set according to Anderson?

A
  • Lock is released -> all cached copies of the lock are invalidated
  • Each processor tries a test_and_set at the same time -> this creates a queue of test_and_set operations waiting to happen
  • Each test_and_set invalidates every cache again
  • Processors cannot unqueue their pending test_and_set operation
  • This is worse when the critical section is short
14
Q

What are the hardware improvements that Anderson mentions for test_and_set?

A
  1. Provide parallel access to memory locations -> reduces test_and_set overhead
  2. Only invalidate cache if test_and_set changes the value