Midterm Flashcards by Ruben Alexander

Describe boss-worker multi-threading.

boss worker multithreading
boss worker multithreading

you have one big boss
that calls n worker threads.
the boss sends out the tasks
the workers eat ‘nuff said

shared request queue/buffer.
That’s how we do
the boss adds to the queue
the workers use
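
A minimal Python sketch of the boss-worker pattern from the rhyme above, not taken from the course materials. The pool size, the `None` sentinel, and the squaring "work" are made-up stand-ins.

```python
import queue
import threading

NUM_WORKERS = 3   # hypothetical pool size
SENTINEL = None   # tells a worker to stop

def worker(tasks, results, results_lock):
    """Worker: pull tasks off the shared queue until the sentinel arrives."""
    while True:
        task = tasks.get()
        if task is SENTINEL:
            break
        with results_lock:
            results.append(task * task)  # stand-in for real work

def run_boss(items):
    """Boss: start the workers, add every task to the queue, then shut down."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(tasks, results, results_lock))
        for _ in range(NUM_WORKERS)
    ]
    for w in workers:
        w.start()
    for item in items:
        tasks.put(item)
    for _ in workers:
        tasks.put(SENTINEL)  # one stop marker per worker
    for w in workers:
        w.join()
    return sorted(results)

print(run_boss(range(5)))  # [0, 1, 4, 9, 16]
```

The boss only enqueues work here, which matches the "make the boss do less" idea: the less the boss does per task, the faster it can hand out the next one.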

How do you improve B/W MT throughput or response time?

improve throughput
or response time

i said throughput
or response time

Increase thread count
Increase size of pool
Make the boss do less
so they ain’t no fool

What are limiting factors in improving PLMT?

Bottlenecked by the stage that takes the longest to complete.

Difficult to keep the pipeline balanced over time.

Describe the pipelined multithreading pattern.

PLMTP so many ages

If you need to improve a performance metric like throughput or response time, what could you do in a pipelined model?

PLMTP

work into stages
Get info from the pages
allo threads to each stage
pass it down like sage

What are the limiting factors in improving pipelined MT performance with this pattern?

limiting factors
PLMT T

Bottlenecked by the stage
that takes the longest to complete.
Difficult to keep
the pipeline balanced and unique.

What are the key roles of an operating system?

Key Roles
O S

Hide Hardware Complexity

M-G-M-T

Isolation Protection

For you and me

Can you make distinction between OS abstractions, mechanisms, policies?

Abstractions simplify how applications interact with the underlying resources. For example, reading and manipulating physical storage is simplified into the file.

Examples of abstractions include:

process/thread (application abstractions)
file, socket, memory page (hardware abstractions)

Policies are the rules around how a system is maintained and resources are utilized. For example, a common technique for swapping memory to disk is the least recently used (LRU) policy. This policy states that the memory that was least recently accessed will be swapped to disk. Policies help dictate the rules a system follows to manage and maintain a workable state. Note that policies can be different in different contexts or at different times in a system.

Mechanisms are the verbs/tools that operate on abstractions. COWS: create, open, write, swap memory.

Policies decide how resources such as memory are managed between hardware and software.
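
A small Python sketch of the LRU policy mentioned above, assuming a toy "memory" that holds a fixed number of pages. The class name `LRUPages` and the page labels are hypothetical, and a real kernel tracks recency very differently.

```python
from collections import OrderedDict

class LRUPages:
    """Tracks resident pages; evicts the least recently used one when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # oldest access first, newest last

    def access(self, page):
        """Touch a page; return the evicted page, or None if nothing was evicted."""
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return None
        evicted = None
        if len(self.pages) >= self.capacity:
            evicted, _ = self.pages.popitem(last=False)  # drop the oldest
        self.pages[page] = True
        return evicted

mem = LRUPages(capacity=2)
mem.access("A")
mem.access("B")
mem.access("A")         # A is now more recent than B
print(mem.access("C"))  # B
```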

What does the principle of separation of mechanism and policy mean?

mechanism policy separation
every rhyme saves the nation

how you enforce policy
ain’t stuck to policy plates

the policy is only valid in some contexts states

having a mechanism that only suits a policy is brittle.

make the mechanism a variety of policy dittles

as any one of these policies may be in effect in time.

optimize our mechanisms like a dope rhyme

a little bit in one direction maintain flexibility.

separate mechanism policy 123
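
One way to picture the separation, as a hedged Python sketch: `run_all` is the mechanism (it removes and runs whatever task it is told to), and the policy is a function passed in. The task names, costs, and both policy functions are invented for illustration.

```python
def fifo_policy(ready):
    """Pick the task that has been waiting longest (front of the list)."""
    return 0

def shortest_first_policy(ready):
    """Pick the task with the smallest estimated cost."""
    return min(range(len(ready)), key=lambda i: ready[i][1])

def run_all(tasks, policy):
    """Mechanism: repeatedly remove whichever task the policy selects."""
    ready = list(tasks)
    order = []
    while ready:
        name, _cost = ready.pop(policy(ready))
        order.append(name)
    return order

tasks = [("a", 5), ("b", 1), ("c", 3)]
print(run_all(tasks, fifo_policy))            # ['a', 'b', 'c']
print(run_all(tasks, shortest_first_policy))  # ['b', 'c', 'a']
```

Swapping the policy needs no change to `run_all`, which is the flexibility the rhyme is after.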

What does the principle optimize for the common case mean?

Optimizing for the common case means ensuring that the most frequent path of execution operates as performantly as possible. This is valuable for two reasons:

It’s simpler than trying to optimize across all possible cases.
It leads to the largest performance gains, as you are optimizing the flow that by definition is executed the most often.

A great example of this is discussed in the SunOS paper, when talking about using threads for interrupt handling instead of changing masks before entering/after exiting a mutex:

The additional overhead in taking an interrupt is about 40 SPARC instructions. The savings in the mutex enter/exit path is about 12 instructions. However, mutex operations are much more frequent than interrupts, so there is a net gain in time cost, as long as interrupts don’t block too frequently. The work to convert an interrupt into a “real” thread is performed only when there is lock contention.
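
The trade-off in the quote can be checked with quick arithmetic. The 40 and 12 come from the quote; the operation counts below are hypothetical, picked only to show that the common case dominates.

```python
INTERRUPT_EXTRA = 40  # extra instructions per interrupt (from the quote)
MUTEX_SAVED = 12      # instructions saved per mutex enter/exit (from the quote)

def net_instructions_saved(mutex_ops, interrupts):
    """Positive result means the design wins overall."""
    return MUTEX_SAVED * mutex_ops - INTERRUPT_EXTRA * interrupts

# Hypothetical workload: mutex operations far outnumber interrupts.
print(net_instructions_saved(mutex_ops=1000, interrupts=10))  # 11600
```

With these numbers the design breaks even once there are more than 40/12 (about 3.3) mutex operations per interrupt.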

What happens during a user-kernel mode crossing?

user kernel mode cross
flex and floss

App needs access to HW
read write disk
list sock allo memory
syscall this

What are some of the reasons why user-kernel mode crossing happens?

user kernel cross
how you like that sauce?
user kernel cross app needs

access to hardware
read write disk disk
listen sock allo mem
syscall this

What is a kernel trap? Why does it happen? What are the steps that take place during a kernel trap?

Kernel Trap
bap boom bp boom bap

Unprivileged user performs
Privileged action
Find source, determine what’s allowed and ask them

Return execution to the interrupted
User process ends
and that’s what’s up kid

What is a system call? How does it happen? What are the steps that take place during a system call?

whats a syscall
whats a syscall
whats a syscall
whats a syscall

User lvl app asks 4 privileged action
UserProcess call
Pass control reaction
to os bit 0 sys call completes
return to user lvl app and is unique
exe context switch to user sink
async and sync that's how we drink

Contrast the design decisions and performance tradeoffs among monolithic, modular and microkernel-based OS designs.

Monolithic
Pros
Everything included
Inlining, compile-time optimizations
Cons
No customization
Not too portable/manageable
Large memory footprint (which can impact performance)
Modular

Pros
Maintainability
Smaller footprint
Less resource needs

Cons
All the modularity/indirection can reduce some opportunities for optimization (but eh, not really)
Maintenance can still be an issue as modules from different codebases can be slung together at runtime

Microkernel
Pros
Size
Verifiability (great for embedded devices)
Cons
Bad portability (often customized to underlying hardware)
Harder to find common OS components due to specialized use case
Expensive cost of frequent user/kernel crossing

Process vs. thread, describe the distinctions. What happens on a process vs. thread context switch.

process vs threads
process vs threads
get it in your head
process vs threads

virtual 
addy map execution 
virtual 
context clap addy map
the code 
is init data 
the code
is any heap mode

The execution context
has the stack CPU
registers associated with the process’s exe messenger.

Diff processes
diff virtual address map
and diff execution contexts,
repped by the process control block bap

Diff threads exist within
same process,
share the virtuaddy map
of process diff execution context

As a result, a multiprocess application needs a larger memory footprint than a multithreaded app.

Greater memory needs mean that data will need to be swapped to disk more often, so multithreaded applications will often be more performant than multiprocess applications.

In addition, process-process communication via IPC is more resource intensive than thread-thread communication, which often just consists of reading/writing shared variables.

Since threads share more data than processes, less data needs to be swapped during a context switch. Because of this, thread context switching can be performed more quickly than process context switching. Process context switching involves the indirect cost of going from a hot cache to a cold cache. When a process is swapped out, most of the information the new process needs is in main memory and needs to be brought into the hardware cache. Since threads share more information (they have more locality with one another), a new thread may still be able to benefit from the cache that was used by an older thread.

Describe the states in a lifetime of a process?

lifetime
of the process
time of a lover

new new
ready ready
running running 
waiting waiting
terminated

new state. When a process is created, it starts here. At this point, the operating system initializes the PCB for the process, and the process moves to the

ready state. In this state it is able to be executed, but it is not being executed. Once the process is scheduled and moved on to the CPU it is in the

running state. If the process is then interrupted by the scheduler, it moves back to the ready state. If the process is running, and then makes an I/O request, it will move onto the wait queue for that I/O device and be in

the waiting state. After the request is serviced, the process will move back to the ready state. If a running process exits, it moves to

the terminated state.
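
The transitions described above can be written down as a small table. This is only a study aid in Python; the dictionary and the `follow` helper are hypothetical, and the blocked-on-I/O state is spelled `waiting` here.

```python
# Allowed next states for each process state, as described in the card.
TRANSITIONS = {
    "new": {"ready"},
    "ready": {"running"},
    "running": {"ready", "waiting", "terminated"},
    "waiting": {"ready"},
    "terminated": set(),
}

def follow(path):
    """Return True if every step in path is a legal state transition."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))

print(follow(["new", "ready", "running", "waiting", "ready", "running", "terminated"]))  # True
print(follow(["waiting", "running"]))  # False: must pass through ready first
```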

Describe the lifetime of a thread?

lifetime of a thread
lifetime of a lover

thread created
and it’s initialized true
executing on that CPU

…wait for scheduled mutex
…condition true
…signal received
…Did you get that boo?

threads back to parents
zombie not reclaimed
reaper gets zombie thread
so it’s maintained

A thread can be created (a new execution context can be created and initialized). To create a thread
specify what procedure to run
and what arguments to pass under procedure sun.

Once the thread has been created, it will be a separate entity from the thread that created it, and we will say at this point that the process overall is multithreaded.

During the lifetime of a thread, the thread may be executing on the CPU - similar to how processes may be executing on the CPU at any point in time - or it may be waiting.

Threads may wait for several different reasons. They may be waiting to be scheduled on the CPU (just like processes). They may be waiting to acquire some synchronization construct - like a mutex - or they may be waiting on some condition to become true or some signal to be received - like a condition variable. Different queues may be maintained depending on the context of the waiting.

Finally, at the end of their life, threads can be joined back in to their parents. Through this join, threads can convey the result of the work they have done back to their parents. Alternatively, detached threads cannot be joined back into their parent, so this step does not apply to them.

Threads can also be zombies! A zombie thread is a thread that has completed its work, but whose data has not yet been reclaimed. A zombie thread is doing nothing but taking up space. Zombies live on death row. Every once in a while, a special reaper thread comes and cleans up these zombie threads. If a new thread is requested before a zombie thread is reaped, the allocated data structures for the zombie will be reused for the new thread.
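
A short Python sketch of create and join, assuming Python's `threading` module rather than the pthreads calls used in the course. `count_words` and the `result` dictionary are made up; Python's `join()` does not return a value, so the child leaves its result in shared memory instead.

```python
import threading

def count_words(text, out):
    """Procedure the child thread runs; it leaves its result in out."""
    out["words"] = len(text.split())

result = {}
child = threading.Thread(
    target=count_words,                               # what procedure to run
    args=("threads share one address space", result)  # what arguments to pass
)
child.start()  # the new thread now exists separately from its creator
child.join()   # the parent waits here until the child finishes
print(result["words"])  # 5
```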

STEPS to go from a waiting (blocked) state to a running (executing on the CPU) state.

blocked waiting
ready not scheduled
preempted switching
ready to execute

When a process is in a blocked state, this means that the process is currently waiting for an I/O request to be fulfilled.

The process is sitting on an I/O queue within the kernel that is associated with the device it’s making the request to.

Once the request is fulfilled, the process moves back to a ready state, where it can be executed, although it is not yet scheduled.

The currently executing process must be preempted, and the ready process must be switched in. At this point, the process is executing on the CPU.

What are the pros-and-cons of message-based vs. shared-memory-based IPC.

MPIPC
+ os manage
+ syscall has ops
- overhead
- user kernel cross per msg

SMIPC
+ no kernel
- user managed sync

What are benefits of multithreading? When is it useful to add more threads, when does adding threads lead to pure overhead? What are the possible sources of overhead associated with multithreading?

There are two benefits to multithreading, namely parallelization and concurrency.

Parallelization comes into effect on multi-CPU systems in which separate threads can be executing at the same time on different processors. Parallelizing work allows for the work to be accomplished more quickly.

Concurrency refers to the idea that the CPU can do a little bit of work on one thread, and then switch and do a little bit of work on another thread. Succinctly, concurrency refers to the idea that the execution of tasks can be interwoven with one another.

The primary benefit of concurrency without parallelization - that is, concurrency on single CPU platforms - is the hiding of I/O latency. When a thread is waiting in an I/O queue, that thread is blocked: its execution cannot proceed. In a multithreaded environment, a thread scheduler can detect a blocked thread, and switch in a thread that can accomplish some work. This way, it looks as if progress is always being made (CPU utilization is high) even though one or more threads may be waiting idle for I/O to complete at any given time.

Adding threads makes sense if you have fewer threads than you have cores or if you are performing many I/O requests. Adding threads can lead to pure overhead otherwise.

While new threads require less memory than new processes, they still do require memory. This means that threads cannot be made for free, and in fact thread creation is often quite expensive. In addition, thread synchronization can add more overhead to the system, in terms of the CPU cycles and memory that must be consumed to represent and operate the synchronization constructs.
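
A rough Python sketch of hiding I/O latency with threads. `time.sleep` stands in for a blocking I/O request, and the 0.2 second delay and four requests are arbitrary. Exact timings depend on the machine, so the sketch only compares the two runs.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(request_id, delay=0.2):
    """Stand-in for a blocking I/O request (the delay is made up)."""
    time.sleep(delay)
    return request_id

def timed(fn):
    """Run fn and return (its result, elapsed seconds)."""
    start = time.perf_counter()
    value = fn()
    return value, time.perf_counter() - start

def serial():
    return [fake_io(i) for i in range(4)]

def threaded():
    # While one thread is blocked in fake_io, another can be switched in.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fake_io, range(4)))

serial_result, serial_time = timed(serial)
threaded_result, threaded_time = timed(threaded)
print(serial_result == threaded_result, threaded_time < serial_time)
```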

Describe the pipelined multithreading pattern. If you need to improve a performance metric like throughput or response time, what could you do in a pipelined model? What are the limiting factors in improving performance with this pattern?

pipelined multithreading pattern

BREAK DOWN

work into stages
get info from the pages
pass it down like sages
allo threads to each stage, yes

The last line is how you would improve performance.
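
A minimal Python sketch of the pipeline idea, assuming one thread per stage and a queue between neighbouring stages. The `DONE` marker and the two toy stage functions are hypothetical. Giving a slow stage more than one thread is not shown here.

```python
import queue
import threading

DONE = object()  # marks the end of the stream

def stage(fn, inbox, outbox):
    """Run one pipeline stage: take an item, process it, pass it down."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # tell the next stage to stop too
            break
        outbox.put(fn(item))

def run_pipeline(items, fns):
    """Wire the stages together with queues and collect the final output."""
    queues = [queue.Queue() for _ in range(len(fns) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(fns)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results

print(run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 10]))  # [20, 30, 40]
```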

What are mutexes? What are condition variables? Can you quickly write the steps/code for entering/exiting a critical section for problems such as reader/writer, reader/writer with selective priority (e.g., reader priority vs. writer priority)? What are spurious wake-ups, how do you avoid them, and can you always avoid them? Do you understand the need for using a while() loop for the predicate check in the critical section entry code examples in the lessons?

mumutextex
mumutextex

Mutexes are synch constructs

that enforce the principle
separate you from us

no more than one thread
comes inside
the critical section
at the same time
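
A hedged Python sketch of reader/writer entry and exit code, using `threading.Condition` (a mutex plus a condition variable) in place of the pthreads calls from the lessons. The class `ReadWriteLock` is hypothetical and gives neither readers nor writers priority. The point to notice is the `while` around each `wait()`: the predicate is re-checked after every wake-up, because the state may have changed again before this thread gets the mutex back.

```python
import threading

class ReadWriteLock:
    """Many readers or one writer; state is guarded by a single mutex."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:  # re-check after every wake-up
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()

rw = ReadWriteLock()
shared = {"value": 0}

def writer():
    for _ in range(1000):
        rw.acquire_write()
        shared["value"] += 1  # critical section: one writer at a time
        rw.release_write()

threads = [threading.Thread(target=writer) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared["value"])  # 4000
```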

What’s a simple way to prevent deadlocks? Why?

No deadlocks 
No deadlocks
No deadlocks
Lock and order
Lock and order

Lock and order
Prevent the deadlocks
on your block
This will prevent a cycle
in the wait graph graph

Lock and order
In case two threads 
try to mutex at 
deadlocks also happen
on a mutex that it holds

Lock and order
Lock and order ya’ll
Lock and order ya’ll
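
A small Python sketch of lock ordering, assuming two locks and one agreed order. `LOCK_ORDER`, `with_both`, and the counter are invented for illustration. Because every thread takes the locks in the same order, no cycle can form in the wait graph.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
LOCK_ORDER = [lock_a, lock_b]  # the agreed global order (hypothetical)

def with_both(action):
    """Take every lock in the agreed order, run action, release in reverse."""
    for lock in LOCK_ORDER:
        lock.acquire()
    try:
        return action()
    finally:
        for lock in reversed(LOCK_ORDER):
            lock.release()

counter = {"n": 0}

def bump():
    counter["n"] += 1

def worker():
    for _ in range(1000):
        with_both(bump)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["n"])  # 4000
```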

Can you explain the relationship among kernel vs. user-level threads? Think through a general mxn scenario (as described in the Solaris papers), and in the current Linux model. What happens during scheduling, synchronization and signaling in these cases?

User-level threads are managed by a user-level threading library (like pthreads), while kernel-level threads are managed by the kernel's threading implementation (which libraries like NPTL build on). For a user thread to run, it must be associated with a kernel thread, which in turn must be scheduled on the CPU.

In the one-to-one model, there is one kernel thread for every user thread. That means that when a user thread is created, a kernel thread is also created. This 1:1 model is the current situation in Linux and is supported at the kernel level by the task struct. The benefit of the approach is that the kernel understands that the process is multithreaded, and it also understands what those threads need. Since the operating system already supports threading mechanisms to manage its threads, the user libraries can benefit directly from the multithreading support available in the kernel.

One downside of this approach is that it is expensive: for every operation we must go to the kernel and pay the cost of a system call. Another downside is that since we are relying on the mechanisms and policies supported by the kernel, we are limited to only those policies and mechanisms. As well, execution of our applications on different operating systems may give different results.

In a so-called many-to-many scenario, there can be one or more user threads scheduled on one or more kernel threads. The kernel is aware that the process is multithreaded since it has assigned multiple kernel-level threads to the process. This means that if one kernel-level thread blocks on an operation, we can context switch to another, and the process as a whole can proceed. One of the downsides of this model is that it requires extra coordination between the user- and kernel-level thread managers.

Signaling in a many-to-many scenario comes with complexities. If the kernel thread has a signal enabled but the user thread does not, the user threading library may have to send directed signals back down into the kernel to get the right user-level thread to respond to the signal.

Can you explain why some of the mechanisms described in the Solaris papers (for configuring the degree of concurrency, for signaling, the use of LWP…) are not used or necessary in the current threads model in Linux?

Solaris paper to go
Solaris paper to go ya'll
Solaris paper to go
Solaris paper to go ya'll

```
Even if you got concurrency
doesn't mean you can get just what you need
kernel threads need to be shared
from you to me
```

The native threading implementation in Linux is the Native POSIX Threads Library (NPTL). This is a 1:1 model, meaning that there is a kernel-level task for each user-level thread. In NPTL, the kernel sees every user-level thread. This is acceptable because kernel trapping has become much cheaper, so user/kernel crossings are much more affordable. Also, modern platforms have more memory, which removes the constraint to keep the number of kernel threads as small as possible.

What's an interrupt? What's a signal? What happens during interrupt or signal handling? How does the OS know what to execute in response to an interrupt or signal?

what is an interrupt?
what is a signal?
Tell me what happens when they mingle?

Interrupts are signals HW to CPU
That signal something from me right to you
Signals are delivered from the kernel to the process
signals are OS specific ok boo?

For example, when a user-level application tries to perform an illegal task using the hardware, the kernel is notified via an interrupt. An interrupt is handled on a per-CPU basis, and the operating system maintains an interrupt table, which maps interrupts by number to handling procedures. When the interrupt occurs, the kernel jumps to the associated interrupt handler and executes that code. Which interrupts occur is a function of the platform on which you are running. How those interrupts are handled is a function of the OS on top of the physical system.

Can each process configure their own signal handler?

process? what?
can you handle your handler?

Each process maintains its signal handling table
It's just like kernel-level interrupt table
Each entry contains a reference to the signal
And a reference to the handling code mingle
When a signal comes in, the process jumps to handling code jingle

Can each thread have their own signal handler?

Each process maintains its own signal handling table, which is very similar to the kernel-level interrupt handling table. Each entry contains a reference to the signal and a reference to the handling code. When a signal comes in, the process jumps to the handling code. Threads cannot have their own handler, although they can set their own signal masks to ensure that they can disable signals they don't want to receive.

What's the potential issue if an interrupt or signal handler needs to lock a mutex?

Since handlers are executed within a thread's stack, there is the potential for a thread to deadlock with itself if it tries to lock a mutex it has already acquired. This is because the current stack frame needs the mutex that was acquired in a lower stack frame to be released.
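
The self-deadlock can be imitated in Python without real signals. This is only an analogy: `handler` is an ordinary function called on the same thread that already holds the lock, and it uses a non-blocking acquire so the demo reports the problem instead of hanging.

```python
import threading

mutex = threading.Lock()  # non-reentrant lock

def handler():
    """Pretend signal handler that needs the same mutex."""
    got_it = mutex.acquire(blocking=False)  # non-blocking so the demo cannot hang
    if got_it:
        mutex.release()
    return got_it

mutex.acquire()               # the thread enters its critical section
handler_got_lock = handler()  # the "handler" runs on top of the same stack
mutex.release()               # only now is the mutex free again

print(handler_got_lock)  # False: a blocking acquire here would wait forever
```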

What's the workaround described in the Solaris papers?

Put the signal handlers in another thread. This way, the signal handling code can contend for a mutex like any other thread, which removes the possibility of deadlock.

Another solution is to have threads alter their signal masks before entering and after exiting their critical sections. While this solution requires fewer SPARC instructions than creating a new thread to handle signals, mutex acquisition happens much more frequently than signals, so paying the mask cost on every mutex operation adds up to more overall. This is another example of optimizing for the common case.

Contrast the pros-and-cons of a multithreaded (MT) and multiprocess (MP) implementation of a webserver, as described in the Flash paper

```
MP
+ simplicity
+ no synch
- performance
- large mem footprint
```

```
MT
+ efficient
+ mem efficient
+ performance
- complexity
- kernel mt support
```

Contrast the pros-and-cons of a multithreaded (MT) and multiprocess (MP) implementation of a webserver, as described in the Flash paper.

```
Multiprocess
+ simplicity
+ no sync code
- performance
- mem footprint woah
- context switch, cpu cycle
```

```
Multithreaded
+ efficient
+ mem efficient
- sync code
- complex
- kernel MT support
```

What are the benefits of the event-based model described in the Flash paper over MT and MP? What are the limitations? Would you convert the AMPED model into an AMTED (async multi-threaded event-driven)? How do you think an AMTED version of Flash would compare to the AMPED version of Flash?

```
EBM
+ one thread
+ mem footprint
+ no context switch
- lack of kernel support
```

AMTED > AMPED
+ less mem
+ big cache

FLASH TEST Experiment: Single File Test

Single File Test
+ best case
+ single file with multiple sizes
+ optimal

FLASH TEST Experiment: Owlnet Trace

owlnet | sped + amped > mtmp

FLASH TEST Experiment: CS Trace

cs trace + amped , mtmp smoke sped - disk bound

FLASH TEST Experiment: Optimizations

```
optimize
+ connection rate
+ pathname cache
+ response header cache
+ mapped file cache
```

FLASH TEST Experiment: Performance Under WAN

MT/AMPED/SPED all caused
+ stable performance improvements
+ more clients
- mp performance

If you ran your server from the class project for many requests for a single file, what do you think would happen as you add more threads to your server?

add thread to projects
threads to your head
more threads means more concurrent request
getfile does better in city GIOS

If you ran your server from the class project for two different traces: (i) many requests for a single file, and (ii) many random requests across a very large pool of very large files, what do you think would happen as you add more threads to your server? Can you sketch a hypothetical graph?

no impact from more threads
more threads more disk reads

I would say that the graph for throughput vs threads in both cases would be logarithmically increasing, with the graph for the single file example rising much more sharply than the graph for the large pool.

Midterm Flashcards

(41 cards)