P3L1: Scheduling Flashcards

1
Q

What is the role of the CPU scheduler?

A

The CPU scheduler decides how and when processes and threads access the CPU, i.e., how and when tasks are selected from the ready queue to run on the CPU.

2
Q

Does the CPU scheduler schedule user tasks, kernel tasks, or both?

A

Both

3
Q

What are four reasons why a task might be added to the ready queue?

A
  1. An I/O operation the task has been waiting on has completed
  2. The task has been woken from a wait
  3. The task was just created, i.e., a new thread is created by forking
  4. The task was ready, ran on the CPU until its timeslice expired, then was interrupted and put back on the ready queue
4
Q

When is the CPU scheduler run?

A

Whenever the CPU becomes idle. This may be because a task completed or because it was interrupted.

5
Q

What’s another name for the ready queue?

A

Run queue

6
Q

What happens when the CPU scheduler selects a task to be scheduled?

A

The task is dispatched to the CPU. This entails:

  1. Context switch, entering the context of new task
  2. Enter user mode
  3. Set the program counter
  4. Go!
7
Q

What are four good metrics for comparing CPU schedulers?

A
  1. Throughput (# of processes executed)
  2. Average job completion time
  3. Average job wait time
  4. CPU utilization (% of CPU resources used)
8
Q

For First-Come-First-Serve scheduling, what data structure should you use for the runqueue?

A

FIFO queue

9
Q

For Shortest-Job-First scheduling, what data structure should you use for the runqueue?

A

Ordered queue or a tree

10
Q

How do you calculate throughput?

A

jobs_completed / time_to_complete_all_jobs

11
Q

How do you calculate average completion time?

A

sum_of_times_to_complete_each_job / jobs_completed

12
Q

How do you calculate average wait time?

A

sum_of_all_wait_times / jobs_completed

Wait time is the time from when a process arrives until it begins executing
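
The three formulas from cards 10–12 can be checked with a small sketch. The job list below is a made-up example workload run First-Come-First-Serve on a single CPU, with all jobs arriving at time 0:

```python
# Hypothetical jobs: (arrival_time, burst_time), run in FCFS order on one CPU.
jobs = [(0, 10), (0, 5), (0, 3)]

time = 0
completion_times = []  # per job: time from arrival until it finishes
wait_times = []        # per job: time from arrival until it starts executing
for arrival, burst in jobs:
    wait_times.append(time - arrival)
    time += burst
    completion_times.append(time - arrival)

throughput = len(jobs) / time                       # jobs_completed / total time
avg_completion = sum(completion_times) / len(jobs)  # sum of completion times / jobs
avg_wait = sum(wait_times) / len(jobs)              # sum of wait times / jobs
print(throughput, avg_completion, avg_wait)
```

Note that running the same three jobs shortest-first instead would shrink both averages, which is the whole argument for Shortest-Job-First.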

13
Q

For Shortest-Job-First scheduling, we need to estimate how long a job will take. What are two heuristics we might use?

A
  1. How long did the job take last time?
  2. How long did the job take, on average, over some previous window of time (say, the last n times the job ran)?
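
Both heuristics can be sketched in a few lines, assuming we keep a per-job history of past runtimes (the history values here are made up):

```python
def estimate_last(history):
    """Heuristic 1: predict the job will take as long as its last run."""
    return history[-1]

def estimate_windowed(history, n=3):
    """Heuristic 2: predict using the average of the last n runs."""
    window = history[-n:]
    return sum(window) / len(window)

history = [8, 6, 7, 5]             # hypothetical past runtimes for one job
print(estimate_last(history))      # 5
print(estimate_windowed(history))  # (6 + 7 + 5) / 3 = 6.0
```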

14
Q

What is “Run to Completion” scheduling?

A

Simply when jobs run to completion, i.e., they cannot be interrupted/preempted by the scheduler

15
Q

What is Preemptive scheduling?

A

When the scheduler can interrupt/preempt a process to run a different process instead.

For example, a long process arrives first and starts executing. A shorter process arrives later. A “Shortest Job First” scheduler will preempt the long process and run the short process.

16
Q

What is Priority Scheduling?

A

Scheduling based on some priority level. OS tasks may be given higher priority than user tasks, or interactive user tasks may be given higher priority than background tasks.

17
Q

What are three data structures we could use to implement a runqueue when using a Priority Scheduler?

A
  1. Ordered queue
  2. Tree
  3. Multiple queues, with each queue assigned a priority level.
18
Q

What is a danger of Priority Scheduling?

A

Starvation: Low priority tasks may never run

19
Q

How do we protect against starvation when using a Priority Scheduler?

A

Priority aging: As tasks wait to execute, their priority levels increase. In this case, priority is a function of both the primary priority measure and the age of the task.
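
A minimal sketch of priority aging; the aging rate and the "higher number = higher priority" convention are assumptions for illustration:

```python
def effective_priority(base_priority, wait_time, aging_rate=0.1):
    # effective priority grows with the time a task has waited on the runqueue
    return base_priority + aging_rate * wait_time

low, high = 1, 5                              # hypothetical base priorities
fresh = effective_priority(high, wait_time=0)
aged = effective_priority(low, wait_time=50)
# after waiting long enough, the low-priority task overtakes: 1 + 0.1*50 = 6 > 5
print(fresh, aged)
```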

20
Q

What is Priority Inversion?

A

Priority Inversion happens when a low-priority job locks a mutex that is needed by a higher-priority task but is then preempted by other tasks of intermediate priority.

Say we have P1, P2, and P3 with priorities ordered like P3 < P2 < P1. Also assume P1 and P3 need the same lock to complete.

If P3 arrives first and gets the lock, then is preempted by P2 arriving, P3 doesn’t get to complete. But it still has the lock! So when P1 arrives and tries to get the lock, it’s blocked. So P2 completes, followed by P3. Only then does P1 get the lock and run.

So we wind up with jobs completing in the order P2, P3, P1 (when we would have liked them to complete in order of priority, P1, P2, P3).

21
Q

How do we prevent Priority Inversion?

A

We can prevent priority inversion by temporarily boosting the priority of the mutex owner so that it doesn't get preempted. This allows it to complete earlier and allows any higher-priority jobs that need the mutex to also complete earlier.
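
This boost is commonly called priority inheritance. A minimal sketch, using the P1/P2/P3 scenario from the previous card (the Task class and helper names are hypothetical):

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority   # the task's own priority
        self.priority = priority        # effective (possibly boosted) priority

def block_on_lock(holder, waiter):
    # the lock holder inherits the waiter's priority if it is higher, so
    # intermediate-priority tasks can no longer preempt it
    holder.priority = max(holder.priority, waiter.priority)

def release_lock(holder):
    holder.priority = holder.base_priority   # drop back to base priority

p3 = Task("P3", priority=1)   # low-priority lock holder
p1 = Task("P1", priority=3)   # high-priority task that needs the lock
block_on_lock(p3, p1)         # P3 now runs at priority 3, ahead of P2
```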

22
Q

Describe three variants of Round Robin Scheduling

A
  • In all Round Robin scheduling systems, a task must be allowed to yield the CPU rather than being required to run to completion.
    1. Vanilla Round Robin: Tasks are scheduled first come first serve, but they don't have to complete before yielding. They may instead yield when waiting, for example, on an I/O operation.
    2. Round Robin with Priority: Tasks are scheduled first come first serve unless a higher-priority task arrives, in which case that task is scheduled instead. This includes preemption.
    3. Round Robin with Interleaving: We give each task a timeslice in which it is allowed to run. When it reaches the end of the timeslice, it is interrupted and the next task starts.
23
Q

What is a timeslice?

A

Maximum amount of uninterrupted time given to a task (on the CPU). Also called “time quantum.”

24
Q

What do timeslices allow us to achieve?

A

Timeslices allow us to interleave CPU-bound tasks.

I/O-bound tasks are always giving up the CPU anyway, but for CPU-bound tasks we need the timeslice mechanism to interleave.

25
Q

What is a CPU-bound task?

A

A task whose completion time is primarily determined by the speed of the CPU. As opposed to IO-bound tasks, which often have to wait on IO processes.

26
Q

What are the pros and cons of Timeslice Scheduling?

A

Pros:

  • Short tasks finish sooner (like Shortest Job First)
  • More responsive
  • We can start long IO operations earlier and complete their corresponding tasks sooner

Cons:

  • Overheads! Interrupting, scheduling, and context switching for a job all take time, and we'll do these more often with timeslice scheduling.

However, as long as the timeslice is significantly longer than the time it takes to context switch, we can keep these overheads small.

27
Q

What two concerns do we need to balance when deciding how long timeslices should be?

A

Starting jobs sooner (more responsive) vs. Overheads (context switching)

28
Q

Do CPU-bound tasks prefer a larger or smaller timeslice? Why?

A

Larger! We’re not as worried about responsiveness with CPU-bound tasks, so reducing wait time isn’t as much of a benefit. Larger timeslices minimize context-switching, which reduces average completion time and increases throughput.

29
Q

Do IO-bound tasks prefer larger or smaller timeslices?

A

IO bound tasks prefer a smaller timeslice. This allows IO operations to begin earlier, increasing throughput and reducing wait times.

30
Q

Describe the runqueue data structure we should use with Timeslice Scheduling

A

We want to use the Multi-Level Feedback Queue. This means:

  • We create multiple ready queues with different time slices, from shortest to longest.
  • Shorter time slices are for more IO bound tasks and longer timeslices for CPU bound tasks.
  • Jobs in the shortest timeslice queue run first!
  • We assign all new jobs to the shortest timeslice queue.
  • If a job uses its entire timeslice, we move it to the next queue, with a longer timeslice
  • If a job repeatedly yields the CPU before using its timeslice, we move it to the shorter timeslice queue

This structure uses the real-time behavior of tasks to discover which are more I/O-bound or CPU-bound. It won its inventor a Turing Award. Kind of a big deal.
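
The feedback mechanism above can be sketched in a few lines, assuming three levels with made-up timeslice lengths:

```python
from collections import deque

timeslices = [8, 16, 32]                  # ms per level, shortest first (assumed)
queues = [deque() for _ in timeslices]

def admit(task):
    queues[0].append(task)                # new jobs start at the shortest timeslice

def pick_next():
    for level, q in enumerate(queues):
        if q:                             # shortest-timeslice queue runs first
            return q.popleft(), level
    return None, None

def on_timeslice_expired(task, level):
    # used the whole slice: likely CPU-bound, demote to a longer timeslice
    queues[min(level + 1, len(queues) - 1)].append(task)

def on_early_yield(task, level):
    # yielded before the slice ran out: likely I/O-bound, promote
    queues[max(level - 1, 0)].append(task)

admit("A"); admit("B")
task, level = pick_next()                 # runs "A" from level 0
on_timeslice_expired(task, level)         # "A" behaved CPU-bound, now at level 1
```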

31
Q

Describe the Linux O(1) scheduler

A
  • A preemptive, priority-based scheduler that is able to select tasks in constant time regardless of task count
  • Tasks are organized into two categories: real-time and timesharing tasks. Real-time tasks are higher priority and timesharing tasks are lower. All user processes are timesharing tasks
  • Higher priority tasks get longer timeslices, lower priority get shorter timeslices
  • Uses SLEEP TIME to determine priority. Longer sleep implies an interactive process, so jobs with longer sleep times get higher priority. Shorter sleep implies a compute-intensive process, so those jobs get lower priority.
  • Maintains two arrays, one of Active Tasks and another of Expired Tasks. Each entry in an array is a list of tasks of the same priority.
  • Tasks are only moved from Active to Expired when they’ve consumed their entire timeslice.
  • Only tasks from Active array are scheduled
  • Once the Active array is empty (all tasks have expired their timeslices), the two arrays are swapped.
32
Q

Why was the Linux O(1) scheduler replaced with the Completely Fair Scheduler?

A

Linux O(1) was replaced with CFS because workloads changed, beginning to include more time-sensitive applications like streaming video. Linux O(1) wouldn't run expired jobs until all active jobs had a chance to execute for their entire timeslice, causing jitter.

It also doesn't make any fairness guarantees.

33
Q

Intuitively describe “fairness” in CPU scheduling

A

Over a given time interval, all tasks should be able to run for an amount of time proportional to their priority.

34
Q

Describe the Linux CFS scheduler

A
  • Proposed to address fairness concerns in Linux O(1)
  • Default scheduler for non-Real Time tasks (Real Time Scheduler handles those)
  • Uses a Red-Black tree for runqueue data structure
  • Tasks are ordered in tree according to “virtual run time”, with lower run time to left and higher to right
  • “Virtual Run Time” passes more quickly for lower priority tasks, more slowly for high priority tasks
35
Q

Describe the algorithm used by Completely Fair Scheduling

A
  • Always pick the leftmost node (shortest virtual runtime)
  • While running, increment vruntime periodically and compare to leftmost vruntime
  • If smaller, keep running. If larger, preempt and place in tree based on vruntime
  • Start over

Notes:

  • vruntime progress rate depends on priority and “niceness”
  • Low-priority vruntime increments faster; high-priority vruntime increments more slowly
  • This means high-priority tasks run longer

Performance

  • Selecting a task is O(1)
  • Adding a task to tree is O(logN)
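
The selection loop above can be sketched with a min-heap standing in for the red-black tree (both give "pick the task with the smallest vruntime"); the weights are made-up stand-ins for priority/niceness:

```python
import heapq

runqueue = []  # heap entries: (vruntime, name, weight)

def add_task(name, vruntime, weight):
    heapq.heappush(runqueue, (vruntime, name, weight))

def run_leftmost(ticks):
    # always pick the task with the smallest vruntime
    vruntime, name, weight = heapq.heappop(runqueue)
    # vruntime advances faster for low-priority (low-weight) tasks,
    # more slowly for high-priority (high-weight) ones
    vruntime += ticks / weight
    heapq.heappush(runqueue, (vruntime, name, weight))
    return name

add_task("low_prio", 0.0, weight=1)
add_task("high_prio", 0.0, weight=4)

# over four scheduling rounds the high-priority task gets most of the CPU
ran = [run_leftmost(ticks=4) for _ in range(4)]
print(ran)
```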
36
Q

What is a Red-Black tree?

A

A self-balancing binary search tree that rebalances itself whenever nodes are added or removed, so that all paths from the root to the leaves are roughly the same length (no path is more than twice as long as any other)

37
Q

What is virtual runtime?

A

The amount of time a job has spent running on the CPU, scaled by its priority: it advances faster for low-priority tasks and more slowly for high-priority ones

38
Q

What was the main reason the Linux O(1) scheduler was replaced with CFS?

A

Interactive tasks could wait unpredictable amounts of time to be scheduled

39
Q

What is cache affinity?

A

This is when we try to keep tasks on the same CPU as much as possible (which maintains a hot cache)

40
Q

How do we achieve cache affinity?

A

We get cache affinity by using a hierarchical scheduler architecture.

  • At the top level, a load balancer divides tasks among the CPUs
  • Then a per-CPU scheduler with a per-CPU runqueue repeatedly schedules tasks for that CPU
41
Q

How does the load balancer in a hierarchical scheduler choose to allocate tasks?

A
  • First uses the length of the runqueues on each CPU
  • Then, if some CPU is idle, checks other CPUs for work that can be moved to the idle one

42
Q

What is Non-Uniform Memory Access?

A

In systems with multiple memory nodes, each memory node might be associated with a particular CPU or cluster of CPUs. All CPUs will be able to access this node, but the associated CPU will be able to access it faster.

43
Q

What is NUMA-aware scheduling?

A

NUMA = Non-Uniform Memory Access

When tasks are bound to the CPU that is closest to the memory node that stores the state of those tasks.

44
Q

What is hyperthreading/simultaneous multithreading (SMT)?

A

CPUs can be built to have multiple sets of registers, each hosting a different thread. The threads still run one at a time, but context-switching between them is very cheap.

Also called “chip multithreading” and “hardware multithreading”.

45
Q

How does a CPU scheduler on a hyperthreading capable machine decide which threads to put on the same CPU?

A

By mixing compute-bound and memory-bound threads!

If we put compute-bound threads together, they have to alternate even though each could ideally run at every cycle, so we waste CPU time. This degrades performance and leaves the memory controller idle.

If we put memory-bound threads together, we wind up with idle cycles (run one thread, then the other, then we wait for the first to finish its memory operation).

46
Q

How to identify CPU-bound vs. Memory-bound threads?

A

We use hardware counters that track cache misses, instructions per cycle (IPC), or power/energy usage (often a few of them combined). This gives us a picture of the thread's resource needs.

For example, repeated last-level cache misses might indicate that a thread’s footprint doesn’t fit in cache (memory-bound)

47
Q

Is cycles per instruction (CPI) helpful in identifying CPU-bound vs memory-bound threads, and therefore for hyperthreading scheduling?

A

Assuming a wide spread of distinct CPI values for different processes, yes. Mixing varying CPI values increases the instructions per cycle our CPU cores are able to execute.

However! As it turns out, real threads don’t have these distinct, widely spread-out values. So no, it doesn’t work.