Part 2 Flashcards
Process vs. thread, describe the distinctions. What happens on a process vs. thread context switch.
process
- an instance of a user application/program that is running on the operating system.
- has its own memory space allocated to it; both heap and stack allocations in virtual memory belong to that process.
- consists of its address space, which includes: the code (text), the data available when the process is initialized, the heap, and the stack. As the process executes, dynamic data is added to the heap. The stack is a LIFO (last in, first out) data structure that supports execution when an application needs to jump to another location, execute something, and later return to a prior position.
- during a process context switch, all information about the preempted process that is tracked by the CPU is saved into that process's process control block (PCB). The CPU then loads the information for the other process, and the reverse happens when the switch goes back. It may also require data to be evicted from the processor cache to make room for the other process.
A thread is similar to a process, except that each thread has its own program counter, stack pointer, stack, and thread-specific registers, while sharing the same virtual address space with the other threads of its process. When a context switch occurs, the running thread stores its execution context (its program counter, stack pointer, and register values) in memory, and the new thread's execution context is loaded from memory onto the processor. This is faster than a context switch between processes, because threads don't need the costly virtual-to-physical address mappings to be swapped out or recreated. It also leaves the caches hotter across switches, which is a performance benefit.
Describe the states in the lifetime of a process.
New - First state when a process is created
Ready - Once the OS admits a process it gets a PCB and some initial resources; a process also returns to Ready when it is preempted while Running (context switch) or when the I/O/event it was Waiting on completes
Running - The OS scheduler gives the CPU to a process
Waiting - Entered when the process starts an I/O request (or some other long-running event/operation); the process moves back to Ready once the I/O or event completes
Terminated - The process finishes all operations or encounters an error
lifetime of a thread
Threads are created when a parent process/thread calls a thread creation mechanism. The threads then run concurrently (unless blocked/waiting). They can be signalled or broadcast to in order to check whether they need to keep blocking or can continue executing. The parent process/thread can call a join mechanism to block itself until a child completes, after which the child's result can be returned to the parent.
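As a concrete illustration, here is a minimal create-then-join sketch using POSIX threads; the worker function and its argument are made up for the example.

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical child task: squares its argument and hands the result back. */
static void *worker(void *arg) {
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);              /* returned value is delivered via pthread_join */
}

int main(void) {
    pthread_t tid;
    void *result;

    /* Parent creates the child thread (the thread creation mechanism). */
    pthread_create(&tid, NULL, worker, (void *)(intptr_t)6);

    /* ... parent and child now run concurrently ... */

    /* Parent blocks in join until the child terminates, then reads its result. */
    pthread_join(tid, &result);
    printf("child returned %ld\n", (long)(intptr_t)result);
    return 0;
}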
Describe all the steps which take place for a process to transition from a waiting (blocked) state to a running (executing on the CPU) state.
A waiting process waits until the event or operation that caused the WAITING state finishes. It then transitions to the READY state. Once in the READY state, the process can be scheduled by the scheduler onto a CPU, at which point it enters the RUNNING state.
What are the pros and cons of message-based vs. shared-memory-based IPC?
Message-based IPC uses an OS-provided communication channel that lets processes send messages to each other. This is good because the operating system maintains the channel on behalf of the processes, so the API is standardized and universally available. It can also be bad because it carries significant overhead: every message must be copied from the sending process into the channel in kernel memory (through the OS) and then out to the receiver.
Shared-memory-based IPC is set up when the OS maps a shared memory region into the address spaces of the participating processes. Both/all processes can then access this shared memory as if it were their own. This gets the OS out of the way, which is good for performance, but it can be bad because the OS no longer manages that region: it's up to the processes, which can be bug-prone, and the processes must agree on how to lay out and synchronize the shared memory.
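A minimal sketch of the shared-memory flavor using the POSIX shm_open/mmap calls (on some Linux systems this links with -lrt); the object name and size are arbitrary choices for the example, and real code would still need synchronization (e.g., a semaphore) since the OS no longer mediates access.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHM_NAME "/demo_shm"     /* arbitrary name used only for this sketch */
#define SHM_SIZE 4096

int main(void) {
    /* Ask the OS (once) to create/open a named shared memory object... */
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
    if (fd == -1) { perror("shm_open"); return 1; }
    ftruncate(fd, SHM_SIZE);

    /* ...then map it into this process's address space. A cooperating process
       would shm_open/mmap the same name and see the same bytes. */
    char *buf = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    /* Once the mapping is established the kernel is out of the data path:
       reads and writes are ordinary memory accesses, with no copies through the OS. */
    strcpy(buf, "hello from process A");

    munmap(buf, SHM_SIZE);
    close(fd);
    shm_unlink(SHM_NAME);        /* remove the object when finished */
    return 0;
}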
What are benefits of multithreading? When is it useful to add more threads, when does adding threads lead to pure overhead? What are the possible sources of overhead associated with multithreading?
Multithreading allows parallelization, which helps increase overall performance/throughput and decrease execution time. Multiple threads of an application can be at different points in execution, handling more input at the same time (especially on multi-core systems). Threads can also be assigned long-running or blocking tasks, so the main application or other threads can continue processing information/input while those threads wait on slower devices like I/O. Multithreading also requires less memory than running multiple processes because threads share an address space, which can result in fewer memory swaps.
Depending on the input and tasks of an application it could be beneficial to add more threads. For example, in a pipeline pattern it can make sense to match the number of threads to the number of steps in the pipeline, or to use several threads per step (for the longer/more involved steps). In a boss-worker pattern it can be detrimental to add more threads dynamically if there isn't enough work for those threads to do.
Additional overhead in multithreading can occur when a boss thread has to manage a pool of worker threads: it may not know exactly what each thread is doing or what it did last, so it is hard to tell at execution time which threads may be more or less efficient at certain tasks. The application must also absorb the overhead of keeping the shared buffer synchronized.
Describe the boss-worker multithreading pattern. If you need to improve a performance metric like throughput or response time, what could you do in a boss-worker model? What are the limiting factors in improving performance with this pattern?
The boss-worker pattern is a multithreading pattern where one boss/main thread creates one or more worker threads to complete tasks. This pattern assumes each worker thread fully completes a task, i.e., each worker handles all portions/subtasks of the task it picks up. The boss thread then calls join and waits for the worker threads to finish before ending execution or moving on to another portion of the application logic.
One way to improve the boss-worker pattern is to have the boss simply add work to a shared queue that the worker threads pull from (sketched below). This gives the boss less one-on-one interaction with each thread; it can focus on filling the queue. The queue can be fully populated before the worker threads are created, or the threads can be created first and the boss can then keep adding work to the queue, for even better response time/throughput.
A limiting factor of this pattern is the overhead associated with synchronizing access to the shared buffer; in addition, the pattern ignores locality (the boss doesn't try to take advantage of hot-cache opportunities, e.g., by giving a worker the same kind of task it just handled).
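A minimal sketch of the queue-based boss-worker variant using pthreads; the worker count, queue size, and process_task placeholder are illustrative choices, not anything prescribed by the pattern.

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_WORKERS 4     /* illustrative sizes */
#define QUEUE_SIZE  16
#define NUM_TASKS   32

/* Shared work queue, protected by a mutex plus two condition variables. */
static int queue[QUEUE_SIZE];
static int head = 0, tail = 0, count = 0, done = 0;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;

static void process_task(long worker_id, int task) {     /* placeholder work */
    printf("worker %ld handled task %d\n", worker_id, task);
}

static void *worker(void *arg) {
    long id = (intptr_t)arg;
    for (;;) {
        pthread_mutex_lock(&m);
        while (count == 0 && !done)            /* wait for work to appear */
            pthread_cond_wait(&not_empty, &m);
        if (count == 0 && done) {              /* queue drained, boss finished */
            pthread_mutex_unlock(&m);
            return NULL;
        }
        int task = queue[head];
        head = (head + 1) % QUEUE_SIZE;
        count--;
        pthread_mutex_unlock(&m);
        pthread_cond_signal(&not_full);

        process_task(id, task);                /* each worker handles the whole task */
    }
}

int main(void) {
    pthread_t workers[NUM_WORKERS];
    for (long i = 0; i < NUM_WORKERS; i++)
        pthread_create(&workers[i], NULL, worker, (void *)(intptr_t)i);

    /* The boss only enqueues work; it never hands tasks to specific workers. */
    for (int t = 0; t < NUM_TASKS; t++) {
        pthread_mutex_lock(&m);
        while (count == QUEUE_SIZE)            /* queue full: wait for space */
            pthread_cond_wait(&not_full, &m);
        queue[tail] = t;
        tail = (tail + 1) % QUEUE_SIZE;
        count++;
        pthread_mutex_unlock(&m);
        pthread_cond_signal(&not_empty);
    }

    pthread_mutex_lock(&m);
    done = 1;                                  /* no more work is coming */
    pthread_mutex_unlock(&m);
    pthread_cond_broadcast(&not_empty);

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(workers[i], NULL);
    return 0;
}

The boss's per-task cost is just an enqueue; the actual work, and the contention, is pushed onto the workers and the queue's mutex/condition variables.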
Describe the pipelined multithreading pattern. If you need to improve a performance metric like throughput or response time, what could you do in a pipelined model? What are the limiting factors in improving performance with this pattern?
The pipeline pattern is a multithreading pattern where you have some number of threads, each handling one subtask in a series of subtasks that together accomplish the overall purpose of the application. The first thread(s) pass work on to the second thread(s), which perform a separate subtask that relies on the first being complete, and so on, until the end of the overall process.
Pipeline patterns can be improved when more information is gathered about how long each subtask takes, or at least how long the subtasks take relative to each other. For example, if subtask_1 takes 10 ms and subtask_2 takes 30 ms, then it makes sense to have three times as many subtask_2 threads as subtask_1 threads (see the sketch below).
A limiting factor of this design pattern is the overhead involved in synchronizing the threads between stages, as well as balancing (accurately determining the number of threads that should perform each subtask).
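A minimal two-stage pipeline sketch in pthreads, reusing the same bounded-buffer idea as the boss-worker sketch above; the 1:3 thread ratio and the usleep() calls are illustrative stand-ins for the 10 ms / 30 ms example.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define QSIZE 8
#define ITEMS 12
#define STAGE2_THREADS 3    /* three stage-2 threads per stage-1 thread */

static int q[QSIZE];
static int head = 0, tail = 0, count = 0, done = 0;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;

static void *stage1(void *arg) {
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        usleep(10000);                       /* subtask_1: ~10 ms of "work" */
        pthread_mutex_lock(&m);
        while (count == QSIZE) pthread_cond_wait(&not_full, &m);
        q[tail] = i; tail = (tail + 1) % QSIZE; count++;
        pthread_mutex_unlock(&m);
        pthread_cond_signal(&not_empty);     /* hand the item to stage 2 */
    }
    pthread_mutex_lock(&m);
    done = 1;                                /* stage 1 is finished */
    pthread_mutex_unlock(&m);
    pthread_cond_broadcast(&not_empty);
    return NULL;
}

static void *stage2(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&m);
        while (count == 0 && !done) pthread_cond_wait(&not_empty, &m);
        if (count == 0 && done) { pthread_mutex_unlock(&m); return NULL; }
        int item = q[head]; head = (head + 1) % QSIZE; count--;
        pthread_mutex_unlock(&m);
        pthread_cond_signal(&not_full);
        usleep(30000);                       /* subtask_2: ~30 ms of "work" */
        printf("finished item %d\n", item);
    }
}

int main(void) {
    pthread_t p, c[STAGE2_THREADS];
    pthread_create(&p, NULL, stage1, NULL);
    for (int i = 0; i < STAGE2_THREADS; i++) pthread_create(&c[i], NULL, stage2, NULL);
    pthread_join(p, NULL);
    for (int i = 0; i < STAGE2_THREADS; i++) pthread_join(c[i], NULL);
    return 0;
}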
What are mutexes?
Mutexes are a mechanism employed in multithreaded programs to enforce mutual exclusion among concurrently executing threads. In other words, mutexes protect shared information from being updated simultaneously from different places. Mutexes are used like locks to ensure that access to shared information happens exclusively.
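For example, a minimal pthreads sketch in which a mutex protects a shared counter (the balance/depositor names are made up for the example); without the lock, the two threads' read-modify-write updates could interleave and lose increments.

#include <pthread.h>
#include <stdio.h>

static long balance = 0;
static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread deposits many times; the mutex makes each update atomic. */
static void *depositor(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&balance_lock);    /* enter critical section */
        balance += 1;                         /* only one thread at a time here */
        pthread_mutex_unlock(&balance_lock);  /* exit critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, depositor, NULL);
    pthread_create(&t2, NULL, depositor, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("balance = %ld\n", balance);       /* always 200000 with the mutex */
    return 0;
}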
What are condition variables?
Condition variables are a mechanism that supplements mutexes and allows finer-grained control over the behaviour of multiple executing threads. A condition variable keeps track of the threads that are waiting for a certain condition to become true. Condition variables are used in a multithreading library API via wait() calls; threads can then signal or broadcast on a condition variable to wake other threads waiting on the same condition(s).
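A minimal pthreads sketch; ready and data_ready are illustrative names. pthread_cond_wait atomically releases the mutex while the thread waits and re-acquires it before returning, and the wait sits in a while loop so the condition is re-checked after waking.

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  data_ready = PTHREAD_COND_INITIALIZER;
static int ready = 0;                         /* shared state the condition tracks */

static void *consumer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    while (!ready)                            /* while, not if: re-check after waking */
        pthread_cond_wait(&data_ready, &m);   /* releases m while waiting */
    /* ... safely use the shared data here ... */
    pthread_mutex_unlock(&m);
    return NULL;
}

static void *producer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    ready = 1;                                /* update shared state under the lock */
    pthread_mutex_unlock(&m);
    pthread_cond_signal(&data_ready);         /* or pthread_cond_broadcast to wake all waiters */
    return NULL;
}

int main(void) {
    pthread_t c, p;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(c, NULL);
    pthread_join(p, NULL);
    return 0;
}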
Read/Write Critical Sections
Mutex m;
Condition read_phase, write_phase;
int num_reader = 0, num_writer = 0;

// READER(S)
Lock(m) {                          // critical section enter
    while (num_writer > 0)         // make sure there are no writers; wait if there are
        Wait(m, read_phase);
    num_reader++;
}
// ... read data ... (during this time num_reader > 0)
Lock(m) {                          // critical section exit
    num_reader--;
}
Signal(write_phase);               // signal to a waiting writer

// WRITER
Lock(m) {                          // critical section enter
    while (num_reader > 0 || num_writer > 0)   // wait while anyone is reading or writing
        Wait(m, write_phase);
    num_writer++;
}
// ... write data ... (during this time num_writer == 1)
Lock(m) {                          // critical section exit
    num_writer--;
}
Broadcast(read_phase);             // broadcast to the waiting readers
Signal(write_phase);               // signal to a waiting writer
Read/Write Critical Sections with Priority Read
Mutex m;
Condition read_phase, write_phase;
int num_reader = 0, num_writer = 0;

// READER(S)
Lock(m) {                          // critical section enter
    while (num_writer > 0)         // make sure there are no writers; wait if there are
        Wait(m, read_phase);
    num_reader++;
}
// ... read data ... (during this time num_reader > 0)
Lock(m) {                          // critical section exit
    num_reader--;
    if (num_reader == 0)           // signal a writer only when no readers remain
        Signal(write_phase);
    else
        Broadcast(read_phase);
}

// WRITER
Lock(m) {                          // critical section enter
    while (num_reader > 0 || num_writer > 0)   // wait while anyone is reading or writing
        Wait(m, write_phase);
    num_writer++;
}
// ... write data ... (during this time num_writer == 1)
Lock(m) {                          // critical section exit
    num_writer--;
}
Broadcast(read_phase);             // readers get preference: wake all waiting readers
Signal(write_phase);               // then signal a waiting writer
Read/Write Critical Sections with Priority Write
Similar to Priority Read, except the broadcast/signal decision has to happen inside the mutex lock, since it needs to read shared state (such as a count of writers, including waiting ones). If there is another waiting writer, it gets priority over broadcasting to the waiting readers.
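A possible pthreads sketch of the write-priority version; the waiting_writers counter is an illustrative addition (the notes above only say the decision needs access to shared state).

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  read_phase  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  write_phase = PTHREAD_COND_INITIALIZER;
static int num_reader = 0, num_writer = 0, waiting_writers = 0;

void reader_enter(void) {
    pthread_mutex_lock(&m);
    /* Defer to writers: wait while one is writing OR any writer is waiting. */
    while (num_writer > 0 || waiting_writers > 0)
        pthread_cond_wait(&read_phase, &m);
    num_reader++;
    pthread_mutex_unlock(&m);
}

void reader_exit(void) {
    pthread_mutex_lock(&m);
    num_reader--;
    if (num_reader == 0)
        pthread_cond_signal(&write_phase);    /* last reader lets a writer in */
    pthread_mutex_unlock(&m);
}

void writer_enter(void) {
    pthread_mutex_lock(&m);
    waiting_writers++;
    while (num_reader > 0 || num_writer > 0)
        pthread_cond_wait(&write_phase, &m);
    waiting_writers--;
    num_writer++;
    pthread_mutex_unlock(&m);
}

void writer_exit(void) {
    pthread_mutex_lock(&m);
    num_writer--;
    if (waiting_writers > 0)
        pthread_cond_signal(&write_phase);    /* another waiting writer gets priority */
    else
        pthread_cond_broadcast(&read_phase);  /* otherwise wake the waiting readers */
    pthread_mutex_unlock(&m);
}

Note that both exit paths make their signal/broadcast decision while still holding the mutex, because the decision depends on the shared counters.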
Spurious wake-ups
occur when a thread completes a task and then, before unlocking a mutex, signals/broadcasts to other threads that will need to acquire that same mutex to continue. Since the initial thread still holds the lock, the woken threads can't make any progress: they wake up only to block again. You can avoid this problem by unlocking the mutex in the initial thread first and then making the signal/broadcast calls. This isn't always possible, however: if you need a priority for which threads are notified first, you will probably need to read shared variables (under the lock) to determine which threads to notify via the condition variable(s).
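For instance, the reader-exit path of the first Read/Write example, expressed in pthreads, keeps the signal outside the lock; this ordering is safe here only because the decision to signal doesn't depend on any shared state.

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  write_phase = PTHREAD_COND_INITIALIZER;
static int num_reader = 0;

/* Reader exit, ordered to avoid spurious wake-ups: the mutex is released
   BEFORE signalling, so a woken writer can immediately re-acquire it inside
   its wait instead of waking up just to block on the still-held lock. */
void reader_exit(void) {
    pthread_mutex_lock(&m);
    num_reader--;
    pthread_mutex_unlock(&m);
    pthread_cond_signal(&write_phase);
}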
while() loops in the critical_section_enter of a thread task are required because...
Even if a thread is able to acquire the mutex lock when it wakes, that doesn't mean the shared proxy variable didn't change just before the lock was acquired. Another thread that was also signalled could have acquired the lock first and already updated the shared state; if it did, the while loop makes the newly woken thread re-check the condition and wait again, preserving mutual exclusion (an if would let it proceed on stale state).