Midterm Flashcards

1
Q

What are the key roles of an operating system?

A

1) Hide hardware complexity (ie abstraction)
Ex: The OS hides the various storage devices in the system (hard drives, USB drives, etc) and provides a higher-level abstraction known as a file.

2) Manage underlying hardware resources (ie arbitration)
Ex: The OS decides how many and which resources an application is allowed to use: it allocates a certain amount of memory to an application, schedules it to run on the CPU, and controls access to various I/O devices.

3) Provide Isolation and Protection
With multiple applications running concurrently, the OS must make sure that no one application interferes with the execution of another application.
Ex: The OS makes sure that memory allocated to one app is not accessed by another app

2
Q

Can you make the distinction between OS abstractions, mechanisms, and policies?

A

Abstractions - Simplified representations of underlying entities. Typically enable portability.

  • Ex: Files are an abstraction of physical hardware. We interact with the file and have the OS handle the complexities of reading and manipulating the physical storage.
  • Additional examples: sockets (network communication), memory pages (memory management)

Policies - Rules on how a system is maintained and resources are utilized.
-Ex: Using least recently used (LRU) policy to manage memory swapping to disk. Memory that was least recently accessed will be swapped to disk.

Mechanisms - Tools used to implement Policies.
-Ex: Using a queue data structure that tracks the least recently used memory blocks as a means of implementing the LRU policy

3
Q

What does the principle of separation of mechanism and policy mean?

A

The goal of this design principle is to use several flexible mechanisms that can support a range of policies, since different policies make sense in different settings.

Ex: For memory management, some useful policies we might want include ‘least recently used’ (LRU), ‘least frequently used’ (LFU), or completely random. To support these, the OS needs mechanisms that track how frequently and how recently each memory location was accessed. Those mechanisms then allow the OS to support all three policies.
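
As a rough illustration (not from the lectures; the struct and function names are made up), the mechanism below records an access timestamp and an access count per page, while the eviction policy is just a pluggable function that reads that metadata:

#include <stdio.h>
#include <stddef.h>
#include <time.h>

/* Mechanism: per-page metadata the OS updates on every access. */
struct page_info {
    time_t        last_access;   /* enables an LRU policy */
    unsigned long access_count;  /* enables an LFU policy */
};

/* Policy: any function that picks a victim page from that metadata. */
typedef size_t (*evict_policy)(const struct page_info *pages, size_t n);

static size_t evict_lru(const struct page_info *p, size_t n) {
    size_t v = 0;
    for (size_t i = 1; i < n; i++)
        if (p[i].last_access < p[v].last_access) v = i;
    return v;
}

static size_t evict_lfu(const struct page_info *p, size_t n) {
    size_t v = 0;
    for (size_t i = 1; i < n; i++)
        if (p[i].access_count < p[v].access_count) v = i;
    return v;
}

int main(void) {
    struct page_info pages[3] = {
        { .last_access = 100, .access_count = 9 },
        { .last_access =  50, .access_count = 2 },
        { .last_access = 200, .access_count = 7 },
    };
    /* Same mechanism, different policies plugged in: */
    evict_policy policy = evict_lru;
    printf("LRU victim: page %zu\n", policy(pages, 3));
    policy = evict_lfu;
    printf("LFU victim: page %zu\n", policy(pages, 3));
    return 0;
}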

4
Q

What does the principle ‘optimize for the common case’ mean?

A

The goal of this principle is to make sure that the most common use case performs the best. Once the common use case is identified, pick a specific policy that makes sense for it and that can be supported given the underlying mechanisms and abstractions of the OS in use.

5
Q

What happens during a user-kernel mode crossing?

A

A bit in the CPU is either set (privileged kernel mode) or not set (unprivileged user mode) to control access to the hardware. The hardware causes a trap if an application running in unprivileged user mode tries to perform an instruction or access that requires privileged kernel mode. The OS then determines what caused the trap and whether it should grant access or terminate the offending operation.

A user-kernel mode crossing takes CPU cycles to perform because it requires executing additional instructions, and it will most likely hurt data locality by displacing application data from the fast cache. If the desired data is in the cache, the cache is considered to be hot; if the data has to be fetched from memory, the cache is considered to be cold.

6
Q

What are some of the reasons why user-kernel mode crossing happens?

A

Any time an application needs access to underlying hardware.

For example: reading from a file, writing to a file

Occurs in response to a kernel trap or via a system call.

7
Q

What is a kernel trap? Why does it happen? What are the steps that take place during a kernel trap?

A

Occurs when an unprivileged app attempts to perform a privileged operation. The hardware sends a signal to the OS, which takes control at a specific location. The OS then checks what caused the trap and verifies whether it should grant access or terminate the process.

8
Q

What is a system call? How does it happen? What are the steps that take place during a system call?

A

So that apps can leverage the OS’s privileged access to hardware and avoid traps, the OS exposes a system call interface for the app to use if they want the OS to perform certain operations on their behalf.
-Ex: open() call to perform access to a file, send() to perform access to a socket

During a system call (a minimal code example follows this list):

  • The app makes the system call
  • Control passes from the app to the kernel (a user-kernel mode crossing), with the arguments of the call passed along to the kernel
  • The mode bit in the CPU is set so that execution proceeds in privileged kernel mode
  • The kernel executes the system call
  • Once it completes, the mode bit in the CPU is set back to unprivileged user mode
  • Control returns to the app process along with the results of the call
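
As a concrete, minimal sketch using the standard POSIX open()/read()/close() wrappers (the path /etc/hostname is just an example), each call below traps into the kernel, which performs the privileged I/O on the process's behalf and returns the result:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[128];

    int fd = open("/etc/hostname", O_RDONLY);    /* system call: kernel opens the file */
    if (fd < 0) { perror("open"); return 1; }

    ssize_t n = read(fd, buf, sizeof(buf) - 1);  /* system call: kernel copies data into buf */
    if (n < 0) { perror("read"); close(fd); return 1; }
    buf[n] = '\0';

    printf("read %zd bytes: %s", n, buf);
    close(fd);                                   /* system call: kernel releases the descriptor */
    return 0;
}
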
9
Q

Contrast the design decisions and performance tradeoffs among monolithic, modular and microkernel-based OS designs.

A

Monolithic OS: Includes every service required by the apps and the hardware. Ex: it might include both a file system for sequential I/O and a file system for random I/O.
Pros:
-Everything is included
-Compile-time optimizations: Abstraction and all the services are packaged at the same time.
Cons:
-Lacks customization, portability, and manageability: Too much state, too much code that is hard to maintain, debug, and upgrade
-Requires a large memory footprint which impacts performance

Modular OS: Includes basic services and APIs; everything else can be added as a module via an interface. The most common OS design today (eg Linux).
Pros:
-Easier to maintain and upgrade
-Smaller code base -> only include modules that are needed for the use case
-Less resource intensive/better performance -> more memory for the app
Cons: 
-Relying on an interface creates indirection, which can impact performance, though typically not significantly.
-Maintenance can still be an issue -> modules come from different code bases and can be a source of bugs.

Microkernel-based OS: Small, requiring only the basic OS primitives, and generally customized (eg for embedded devices). All other typical OS components run outside of the microkernel. Relies heavily on interprocess interactions.
Pros:
-Very small -> lower overhead and better performance
-Easier to verify -> test that the code behaves exactly as it should
Cons:
-Portability -> typically very customized to underlying hardware
-Software complexity -> many one-off versions for different hardware makes it hard to find common software components
-Frequent inter-process interactions are costly -> the need for frequent user/kernel crossings is expensive

10
Q

Process vs. thread: describe the distinctions. What happens on a process vs. thread context switch?

A

Process: An instance of an executing program; an app that has been loaded into memory and is executing.
-A single-threaded process is represented by its address space.
The address space contains all the virtual-to-physical memory mappings for its code, data, and files.
-The process is also represented by its execution context (eg register values, stack pointer, program counter).
-The OS represents all of this information in a Process Control Block (PCB)

Threads:
-Part of the same virtual address space (and share all the code, data, and files).
-Will potentially execute different instructions, access different portions of the address space, and operate on different portions of the input, all within the same process.
-Each thread needs its own execution context (eg program counter, stack pointer, stack, and thread-specific registers).
-The OS representation of a multi-threaded process is a more complex process control block (PCB) structure that contains:
the info shared among the threads (eg virtual address mappings, code, data, etc), and
separate info about each thread’s execution context.

Process vs Thread context switch:
-Because threads share an address space, a thread context switch does not require recreating the virtual-to-physical address mappings, and the cache is more likely to stay warm. As a result, context switching between threads is much less costly than context switching between processes, which will likely require pulling data back in from memory and running with a cold cache.

11
Q

Describe the states in a lifetime of a process?

A

When a process is created, it enters the “new” state. In this state, the OS determines whether it is OK to admit the process and, if so, allocates and initializes a PCB and some initial resources.

Assuming there are sufficient resources available, the process is admitted and enters the “ready” state: it is ready to start executing but is not yet executing on the CPU. The process waits until the scheduler moves it into the “running” state.

When the scheduler dispatches a “ready” process onto the CPU, it is in the “running” state. Once in the “running” state, a few things can happen:

  • The process is interrupted and moved back into a “ready” state due to a need to execute a context switch.
  • The process is moved into a “waiting” state because it initiated a long operation and is waiting on an event to occur (eg input from a keyboard). Once this event occurs, the process moves into the “ready” state.
  • When a “running” process finishes all the operations of a program, or when it encounters an error, it will exit. It will return an exit code (success or failure) and the process will terminate.
12
Q

Describe the lifetime of a thread?

A

A parent thread can create a child thread using a “fork” call with two parameters: a procedure argument that specifies the procedure the new thread will execute, and the arguments for that procedure. When the new child thread is created, a new thread data structure is created; its program counter points to the first instruction of the procedure, and the arguments are placed on the thread’s stack.

If the parent thread terminates before the child thread, the child thread may be terminated early. To prevent this, the parent thread makes a “join” call with the thread ID of the child thread and is blocked until the child thread completes. Once the child thread completes, any resources that were allocated for its execution are freed, the result of the computation is returned to the parent thread, and the child thread terminates.

If a child thread isn’t joined back with the parent thread and the resources of that child thread haven’t been freed, it’s called a zombie thread.

To avoid problems associated with multiple threads accessing shared data, mutual exclusion is applied so that while one thread is accessing the shared data, the other threads are “blocked” until that thread has completed its operation. For more complicated use cases, condition variables are used as a mechanism to block threads until a specific event occurs; once the event occurs, the threads are woken up to execute their operation.
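
In pthreads terms (the lectures use a generic fork/join notation, so this is only an analogous sketch with made-up names), the parent creates a child with pthread_create() and blocks in pthread_join() until the child returns its result:

#include <pthread.h>
#include <stdio.h>

/* Procedure the child thread executes; its argument is supplied via pthread_create. */
static void *double_it(void *arg) {
    int *value = arg;
    *value *= 2;              /* the "result" of the child's computation */
    return value;             /* handed back to the parent via pthread_join */
}

int main(void) {
    pthread_t child;
    int input = 21;
    void *result;

    pthread_create(&child, NULL, double_it, &input);  /* "fork": procedure + its argument */
    pthread_join(child, &result);                     /* parent blocks until the child completes */

    printf("parent got back %d\n", *(int *)result);   /* prints 42 */
    return 0;
}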

13
Q

Describe all the steps which take place for a process to transition from a waiting (blocked) state to a running (executing on the CPU) state.

A

Typically, a process is in a waiting (blocked) state because it is executing a long operation, such as an I/O operation. The process sits in an I/O waiting queue until the hardware completes the I/O operation and responds. The scheduler then moves the waiting process back into the ready state, and depending on the current load on the system it may actually be scheduled on the CPU right away, at which point it is running again.

14
Q

What are the pros and cons of message-based vs. shared-memory-based IPC?

A

Message-based IPC: The OS establishes a communication channel (eg a shared buffer) and the processes read/write or send/receive messages through it.

  • Pros: The OS manages the communication channel -> processes use the same OS-level APIs/system calls (eg send/recv data)
  • Cons: Overhead -> transferring info requires copying it from process 1 into the channel and then into process 2.

Shared memory-based IPC: The OS creates a shared memory channel and maps it into the address space of each process. Both processes can read/write from this memory with the OS out of the way (a minimal POSIX sketch follows the list below).

  • Pros: No overhead -> OS is out of the way
  • Cons: Need to re-implement code -> the user manages all synchronization, which is error prone. Doesn’t use the standard OS APIs/calls.
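
A minimal POSIX sketch of the shared-memory approach (the name /gios_demo is made up; error handling and the synchronization the two processes would normally need are omitted; older systems may need -lrt when linking):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Create (or open) a named shared memory object and size it. */
    int fd = shm_open("/gios_demo", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, 4096);

    /* Map it into this process's address space; another process that
       shm_open()s and mmap()s the same name sees the same bytes. */
    char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    strcpy(shared, "hello from process 1");   /* plain memory write, no system call needed */
    printf("wrote: %s\n", shared);

    munmap(shared, 4096);
    close(fd);
    /* shm_unlink("/gios_demo") would remove the object once both sides are done. */
    return 0;
}
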
15
Q

What are benefits of multithreading? When is it useful to add more threads, when does adding threads lead to pure overhead? What are the possible sources of overhead associated with multithreading?

A

Multi-core benefits: The first area of benefit is parallelization. Assuming each core has its own thread running the exact same code but with different inputs from an input matrix, the threads execute their work at the same time, resulting in a speed-up. Speed-up can also be achieved if each thread executes a different portion of the program. Another area of benefit is specialization: a specialized thread executes the same type of work and keeps reusing the same data in the cache, creating a hot cache, which leads to performance gains.

Single CPU: For a single CPU, multithreading allows the CPU to switch to a new thread while another thread is idle (eg waiting on I/O). Context switching between threads isn’t as costly as between processes because the address space is shared, so the time to switch is shorter; this pays off whenever the time a thread spends idle exceeds the cost of the two context switches needed to hide it. Overall, this leads to a performance benefit.

It is useful to add threads when there are fewer threads than cores, or when your threads execute operations that cause them to idle often. Otherwise, adding threads is pure overhead, especially if they aren’t used efficiently.

Multithreading overhead can come from not freeing up resources once a thread completes (eg zombie threads) and when managing thread synchronization.

16
Q

Describe the boss-worker multithreading pattern. If you need to improve a performance metric like throughput or response time, what could you do in a boss-worker model? What are the limiting factors in improving performance with this pattern?

A

The boss-worker multithreading pattern involves a boss thread that hands work to a number of worker threads, typically via a queue or buffer. The main limiting factor is that the boss thread must handle the first step of every task, regardless of how many worker threads are available, so overall throughput is bounded by how efficient the boss is.

To improve performance, you want to ensure the boss thread is efficient (a compact pthreads sketch follows this list):

  • Instead of having the boss thread directly track and communicate with a specific thread, you would want to minimize this overhead by simply having the boss thread communicate the work to a queue that the workers would access.
  • Instead of having the boss wait to add more to the queue when it’s full, make sure there is a pool of worker threads ready and available to process the work.
  • Lastly, you could specialize the worker threads to gain data-locality performance benefits. While this would require the boss thread to do a little more work to track the worker thread pools, it would be offset by the performance gains.
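
A compressed pthreads sketch of the pattern (the bounded buffer is simplified, error handling is omitted, and the sizes are arbitrary): the boss only enqueues work items, and the workers pull them from the shared queue:

#include <pthread.h>
#include <stdio.h>

#define QUEUE_SIZE   16
#define NUM_WORKERS  4
#define NUM_TASKS    20

static int queue[QUEUE_SIZE];
static int head, tail, count;                 /* all protected by m */
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;

/* Boss side: drop a task into the queue and move on. */
static void enqueue(int task) {
    pthread_mutex_lock(&m);
    while (count == QUEUE_SIZE)               /* queue full: boss waits */
        pthread_cond_wait(&not_full, &m);
    queue[tail] = task;
    tail = (tail + 1) % QUEUE_SIZE;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&m);
}

/* Worker side: pull the next task and process it; the boss never tracks workers. */
static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&m);
        while (count == 0)
            pthread_cond_wait(&not_empty, &m);
        int task = queue[head];
        head = (head + 1) % QUEUE_SIZE;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&m);

        if (task < 0) break;                  /* sentinel value: shut down */
        printf("worker processed task %d\n", task);
    }
    return NULL;
}

int main(void) {
    pthread_t workers[NUM_WORKERS];
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&workers[i], NULL, worker, NULL);

    for (int t = 0; t < NUM_TASKS; t++) enqueue(t);      /* boss produces the work */
    for (int i = 0; i < NUM_WORKERS; i++) enqueue(-1);   /* one shutdown sentinel per worker */

    for (int i = 0; i < NUM_WORKERS; i++) pthread_join(workers[i], NULL);
    return 0;
}
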
17
Q

Describe the pipelined multithreading pattern. If you need to improve a performance metric like throughput or response time, what could you do in a pipelined model? What are the limiting factors in improving performance with this pattern?

A

The pipeline multithreading pattern is one where a task is executed as a sequence of stages, where each stage represents a subtask and is handled by its own set of threads. There can be multiple tasks in the system concurrently, each at a different pipeline stage.

Throughput is limited by the longest (slowest) stage. To address longer stages, you would keep a pool of threads prepared in advance (similar to the thread pool technique used with the boss-worker pattern) and assign more threads to the longer stages than to the shorter ones. The challenge with this approach is that any change in the input rate or in the available resources may require rebalancing the thread assignment.

18
Q

What’s a simple way to prevent deadlocks? Why?

A

Two threads can deadlock if each is trying to acquire a mutex the other already holds. To prevent this, the simple solution is to maintain a lock order. For example, each thread first acquires mutex A and only then moves on to mutex B, etc. This prevents the situation where the threads hold different mutexes and then wait forever for each other’s mutex.
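
A minimal pthreads sketch of the idea (the two-mutex setup is hypothetical): every thread acquires mutex A before mutex B, so no thread can end up holding one lock while waiting for the other:

#include <pthread.h>

static pthread_mutex_t m_A = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t m_B = PTHREAD_MUTEX_INITIALIZER;

/* Every thread follows the same order: A, then B.
   If one thread instead locked B then A, the two threads could each
   hold one mutex and wait forever for the other. */
static void *thread_fn(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m_A);
    pthread_mutex_lock(&m_B);
    /* ... touch the data protected by both locks ... */
    pthread_mutex_unlock(&m_B);
    pthread_mutex_unlock(&m_A);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, thread_fn, NULL);
    pthread_create(&t2, NULL, thread_fn, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}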

19
Q

What are mutexes?

A

A mutex is a synchronization construct that enforces the principle of mutual exclusion. It acts like a lock used to control access to a shared resource among multiple threads. When a thread locks a mutex, it has exclusive access to the shared resource. When another thread attempts to access the locked resource, it is considered to be “blocked” and is suspended until the mutex owner releases the lock.

20
Q

What are condition variables?

A

A condition variable allows a thread to block until some defined event happens. It is always associated with a particular mutex, and with the data protected by that mutex. When the event happens, the blocked threads are signaled and woken so that they can acquire the mutex and access the shared resource.
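
A small pthreads sketch (the ‘item ready’ event is made up): the waiting thread re-checks the predicate in a while loop after every wakeup, and the signaling thread changes the shared state while holding the mutex:

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t m     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready = PTHREAD_COND_INITIALIZER;
static bool item_ready = false;             /* shared state protected by m */

static void *consumer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    while (!item_ready)                     /* predicate re-checked after each wakeup */
        pthread_cond_wait(&ready, &m);      /* releases m while blocked, reacquires it on wakeup */
    /* ... consume the item ... */
    item_ready = false;
    pthread_mutex_unlock(&m);
    return NULL;
}

static void *producer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    item_ready = true;                      /* the event the consumer is waiting for */
    pthread_cond_signal(&ready);            /* wake one waiting thread */
    pthread_mutex_unlock(&m);
    return NULL;
}

int main(void) {
    pthread_t c, p;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}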

21
Q

What are spurious wake-ups, how do you avoid them, and can you always avoid them?

A

Spurious wake-ups occur when threads are woken up only to immediately block again on another queue, so no useful work gets done. An example is calling broadcast() on a condition variable while the associated mutex is still locked: the woken threads simply move onto the mutex queue. Sometimes spurious wake-ups can be avoided by waking up threads (eg calling broadcast()) only after releasing the mutex. However, if the mutex can’t be released before waking up the threads (eg the shared state still needs to be updated under the lock), then the spurious wake-up can’t be avoided.
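
A before/after sketch in pthreads (generic shared state, not the lecture’s exact example): the second version updates the state under the lock but wakes the waiting threads only after releasing it:

#include <pthread.h>

static pthread_mutex_t m    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int state;                           /* shared state protected by m */

/* Spurious wake-up: waiters are woken while m is still held, so they
   immediately block again on the mutex queue. */
static void wake_inside_lock(void) {
    pthread_mutex_lock(&m);
    state = 0;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&m);
}

/* Avoided: update the shared state under the lock, then wake waiters after
   releasing it (only possible when nothing else must be done while holding m). */
static void wake_after_unlock(void) {
    pthread_mutex_lock(&m);
    state = 0;
    pthread_mutex_unlock(&m);
    pthread_cond_broadcast(&cond);
}

int main(void) {
    wake_inside_lock();
    wake_after_unlock();
    return 0;
}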

22
Q

Do you understand the need for using a while() loop for the predicate check in the critical section entry code examples in the lessons?

A

The while loop handles the situation where multiple threads are trying to access the shared resource and acquire the mutex: between the time a thread is signaled and the time it actually reacquires the mutex, another thread may have changed the shared state, so re-checking the predicate confirms that the condition still holds before the thread proceeds.

23
Q

Can you quickly write the steps/code for entering/exiting a critical section for problems such as reader/writer (reader priority)?

A
//reader code
lock(&m);
waitingReads++;
while(status == -1)        // a writer is currently writing
  wait(&cond_reader, &m);
status++;
waitingReads--;
unlock(&m);

//reader does work

lock(&m);
status--;
if(status == 0)
  signal(&cond_writer);
unlock(&m);

//writer code
lock(&m);
while(status != 0)         // readers or another writer are active
  wait(&cond_writer, &m);
status = -1;
unlock(&m);

//writer does work

lock(&m);
status = 0;
if(waitingReads > 0)       // reader priority: wake any waiting readers first
  broadcast(&cond_reader);
else
  signal(&cond_writer);
unlock(&m);
25
Q

Can you explain the relationship between kernel and user-level threads? Think through a general MxN scenario (as described in the Solaris papers), and the current Linux model. What happens during scheduling, synchronization and signaling in these cases?

A

User level threads are associated with a user level library (like pthreads), while kernel level threads are associated with the kernel threading implementation (like NPTL). For a user thread to run, it must be associated with a kernel thread, which in turn must be scheduled on the CPU.

In the one-to-one model, there is one kernel thread for every user thread. That means that when a user thread is created, a kernel thread is also created. This 1:1 model is the current situation in Linux and is supported at the kernel level by the task struct. The benefit of the approach is that the kernel understands that the process is multithreaded, and it also understands what those threads need. Since the operating system already supports threading mechanisms to manage its thread, the user libraries can benefit directly from the multithreading support available in the kernel.

One downside of this approach is that it is expensive: for every operation we must go to the kernel and pay the cost of a system call. Another downside is that since we are relying on the mechanisms and policies supported by the kernel, we are limited to only those policies and mechanisms. As well, execution of our applications on different operating systems may give different results.

In a so-called many-to-many scenario, there can be one or more user threads scheduled on one or more kernel threads. The kernel is aware that the process is multithreaded since it has assigned multiple kernel level threads to the process. This means that if one kernel level thread blocks on an operation, we can context switch to another, and the process as a whole can proceed. One of the downsides of this model is that it requires extra coordination between the user- and kernel-level thread managers.

Signaling in a many to many scenario comes with complexities. If the kernel thread has a signal enabled but the user thread does not, the user threading library may have to send directed signals back down into the kernel to get the right user level thread to respond to the signal.

26
Q

Can you explain why some of the mechanisms described in the Solaris papers (for configuring the degree of concurrency, for signaling, the use of LWPs…) are not used or necessary in the current threads model in Linux?

A

The mechanisms described in the Solaris paper were all mechanisms for rationing the number of threads that are created at any given point in time. Configuring the degree of concurrency implied that you couldn’t just be automatically given as much as you needed. Having LWPs multiplex over kernel threads indicated that kernel threads needed to be shared. The rationing was based on real concerns about the available memory in a system at any given time.

The native implementation of threads in Linux is the Native POSIX Threads Library (NPTL). This is a 1:1 model, meaning that there is a kernel level task for each user level thread. In NPTL, the kernel sees every user level thread. This is acceptable because kernel trapping has become much cheaper, so user/kernel crossings are much more affordable. Also, modern platforms have more memory, removing the constraint of keeping the number of kernel threads as small as possible.

27
Q

What’s an interrupt?

A

Interrupts are events generated externally by components other than the CPU (I/O devices, timers, other CPUs) that inform the CPU that something has occurred. For example, when a packet arrives on the network, the network card raises an interrupt to notify the CPU.

Which interrupts exist is determined by the hardware platform. Interrupts occur asynchronously.

28
Q

What’s a signal?

A

Signals are events triggered by the CPU and the software running on it. For example, if a process tries to access memory it has not allocated, the operating system may deliver a SIGSEGV (segmentation fault). Signals are operating system specific.

Which signals exist is determined by the OS. Signals can occur both synchronously and asynchronously.

29
Q

What happens during interrupt or signal handling? How does the OS know what to execute in response to an interrupt or signal? Can each process configure its own signal handler? Can each thread have its own signal handler?

A

The interrupt/signal interrupts the execution of the thread that was running on the CPU. Based on the interrupt/signal number, a table is referenced (the interrupt or signal handler table). The table specifies the starting address of the handler routine; the program counter of the thread is set to that handler address and the routine begins. All of this is handled within the context of the thread that was interrupted.

For interrupts, the HW defines the interrupt #s and the OS specifies the handler routines.

For signals, the OS defines the signal #s and the process defines the handler routines. Signal handlers are set per process, so each thread cannot have its own signal handler (though each thread can have its own signal mask).
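
A minimal example of a process installing its own handler routine for a signal (here SIGUSR1, via the standard sigaction() call):

#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler routine: the thread's program counter is pointed here when SIGUSR1 is delivered. */
static void on_sigusr1(int signo) {
    (void)signo;
    /* real handlers should stick to async-signal-safe calls like write() */
    write(STDOUT_FILENO, "got SIGUSR1\n", 12);
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_sigusr1;     /* entry in this process's signal handler table */
    sigaction(SIGUSR1, &sa, NULL);

    raise(SIGUSR1);                 /* deliver the signal to ourselves */
    return 0;
}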

30
Q

What’s the potential issue if an interrupt or signal handler needs to lock a mutex? What’s the workaround described in the Solaris papers?

A

Since handlers are executed within a thread’s stack, there is the potential for a thread to deadlock with itself if it tries to lock a mutex it has already acquired.

One solution is to use masks to disable the interrupts/signals, allowing the thread to finish its execution and release the mutex before the handler runs.
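
A hedged sketch of the masking approach using the standard pthread_sigmask() call (SIGUSR1 stands in for whichever signal's handler also needs the mutex): the thread blocks the signal before taking the mutex and restores the mask after releasing it, so the handler cannot run, and try to relock, in between:

#include <pthread.h>
#include <signal.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

static void critical_section(void) {
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);                  /* the signal whose handler also locks m */

    pthread_sigmask(SIG_BLOCK, &block, &old);    /* mask it: delivery is deferred, not lost */
    pthread_mutex_lock(&m);
    /* ... work on the shared state ... */
    pthread_mutex_unlock(&m);
    pthread_sigmask(SIG_SETMASK, &old, NULL);    /* restore; any pending signal is now delivered */
}

int main(void) {
    critical_section();
    return 0;
}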

The Solaris paper proposed having the interrupts/signals execute as a separate thread that would simply block until the mutex is made available by the original thread.

31
Q

Contrast the pros-and-cons of a multithreaded (MT) and multiprocess (MP) implementation of a webserver, as described in the Flash paper.

A

MP:

  • Pro: Simplicity. Doesn’t require sync code.
  • Con: Larger memory footprint. Context switching is expensive.

MT:

  • Pro: More efficient -> shared address space. Low-cost context switching. Higher likelihood of a hot cache.
  • Con: Complexity -> sync code. Not portable -> requires kernel MT support.
32
Q

What are the benefits of the event-based model described in the Flash paper over MT and MP? What are the limitations? Would you convert the AMPED model into an AMTED (async multi-threaded event-driven) model? How do you think the AMTED version of Flash would compare to the AMPED version of Flash?

A

Benefits: Single address space, single flow of control. Smaller memory requirements. No context switching. No synchronization.

Limitation: Not all kernels support async I/O. Requires the use of helpers to simulate the async I/O behavior.

Yes; the AMTED model would require less memory than the AMPED model.

The AMPED version of Flash includes the use of helpers to simulate async I/O behavior, which exists natively in the AMTED model. Since the outcomes are the same, I imagine they would compare similarly.

33
Q

There are several sets of experimental results from the Flash paper discussed in the lesson. Do you understand the purpose of each set of experiments (what was the question they wanted to answer)? Do you understand why each experiment was structured in a particular way (why they chose the variables to be varied, the workload parameters, the measured metrics…)?

A

Single File Test
This test was a test of “best case”: to show what performance would look like across the servers when the request pattern was optimal. To achieve this, requests were made for a single file, which varied in size. SPED performed better than AMPED, which performs the extra check to see if the file is in memory (not an issue here because it’s one file and it is always going to be in memory). Both performed better than MT/MP, which were slowed by context switching overhead.

Owlnet Trace
This tests workloads that can primarily be served from cache, but not completely. AMPED performs slightly better than SPED here: since some of the work is disk-bound, SPED blocks where AMPED context switches. Both of these approaches are better than the MT/MP models, which have more memory allocated to them (leaving less for the cache) and still incur the overhead of context switching.

CS Trace
This tests workloads that become disk-bound very quickly. In this case, AMPED and MT/MP smoke SPED. Since SPED is single-threaded, with no helpers, it blocks a lot when the workload is disk-bound. AMPED only has as many helper processes as needed to keep the disk busy, so it has a smaller memory footprint and less context switching overhead than MP/MT.

Optimizations
Looking at connection rate as a result of file size across the various combination of optimizations - pathname lookup caching, response header caching, and mapped file caching - shows that the combination containing all optimizations can handle the highest connection rate.

Performance Under WAN
MT/AMPED/SPED all maintained stable performance as more clients were added. The per-process overhead of the MP model caused performance to degrade in that model.

34
Q

If you ran your server from the class project for two different traces: (i) many requests for a single file, and (ii) many random requests across a very large pool of very large files, what do you think would happen as you add more threads to your server? Can you sketch a hypothetical graph?

A

Single file
Adding more threads would likely increase the throughput of the GETFILE server. Files need to be read from disk into memory before they can be served. Once a file is read into memory, it can stay in memory until its contents need to be replaced. More threads would be able to serve more concurrent requests.

Random requests across a large pool of large files
Adding more threads may not have a huge impact on throughput in this scenario. Any given new request is not likely to be able to read from main memory, since the range of inputs is so wide. This means that this application would likely be disk-bound from the get-go. Increasing the number of threads allows for more concurrent requests to be processed, but each thread will likely have to pull content from disk.

I would say that the graph for throughput vs threads in both cases would be logarithmically increasing, with the graph for the single file example rising much more sharply than the graph for the large pool.