Midterm Flashcards

1
Q

What are the key roles of an operating system?

A

Hides hardware complexity
Manages resources
Provides isolation & protection between applications running on the OS

2
Q

Can you make a distinction between OS abstractions, mechanisms, and policies?

A

Abstractions:
simplified entities the OS exposes to applications, hiding the complicated interactions between the OS and the hardware.
Examples include: process, thread, file, socket, memory page.

Mechanisms:
"verbs" that describe actions an operating system can perform on abstractions, like "create", "schedule", "open", "write", "allocate".

Policies:
rules that determine how mechanisms are applied, given the nature of hardware and software interaction.
For example, the operating system can have a policy about how long content can stay in memory instead of only residing on disk. Other examples include Least Recently Used (LRU) and Earliest Deadline First (EDF).

3
Q

What does the principle of separation of mechanism and policy mean?

A

Mechanisms in the operating system should be flexible enough to support multiple policies. For example, the memory management mechanism might use different replacement policies depending on the situation (LRU, LFU, random, etc.).

4
Q

What does the principle "optimize for the common case" mean?

A

Designing the OS based on how it will be used: what the user will be executing and what the workload requirements are.

This is important because it allows the OS to be as effective as possible.

It entails the OS choosing specific mechanisms and policies that match its most common usage.

5
Q

What happens during a user-kernel mode crossing?

A

When a user application/process tries to execute an operation that requires kernel-level permissions, the call (system call) crosses the user-kernel boundary.

The kernel performs the system call and then returns to the user process. While in privileged mode, a mode bit is set on the CPU that allows privileged instructions to be performed; this bit is not set in user mode.

A trap can also occur when a user-level process attempts to perform a privileged operation directly; the operating system then checks whether the calling process should be allowed to do that action or not.

Crossings are slow: every system call affects the hardware cache by switching locality. The application loses part of its data from the cache in favor of whatever the OS needs to bring in to perform its system call.

In a user-kernel mode crossing the mode bit is set in the CPU and control is passed to the kernel.

6
Q

What are some of the reasons why user-kernel mode crossing happens?

A

When a user-level thread/process/application attempts a privileged action while the CPU is not in privileged mode, causing a trap;
OR
when a user-level thread/process uses the system calls the OS provides, which have the operating system perform the privileged action on its behalf;
OR
hardware interrupts; for example, the timer interrupt is what lets the kernel regain control to do scheduling.
(Signals can be thought of as a crossing in the other direction: from kernel to user.)

7
Q

What is a kernel trap? Why does it happen? What are the steps that take place during a kernel trap?

A

An alert to the operating system that an unprivileged user process has attempted to perform a privileged task or access privileged memory addresses. When this occurs the OS determines the source of the trap, decides whether it should be allowed or not, and then returns execution to the interrupted user process.

8
Q

What is a system call? How does it happen? What are the steps that take place during a system call?

A

An operation (belonging to a set the OS makes available to applications) by which an application explicitly invokes a privileged mechanism.

It happens when a user-level application makes a system call telling the OS it would like a privileged action performed.
Steps:

  1. The user process makes a system call.
  2. Control is passed to the operating system, which sets the mode bit to 0 (privileged access). Execution jumps to the kernel routine for that call (along with any arguments from the user process).
  3. The system call completes and returns the result to the original user process, which requires a context switch back to user-level privilege.

There are both synchronous and asynchronous versions of system calls.
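
A minimal sketch (not from the lecture) of steps 1-3 in C: the standard library's write() wrapper places the arguments, executes the trap instruction, and the kernel runs the privileged code before returning to user mode.

#include <string.h>
#include <unistd.h>

int main(void) {
    const char *msg = "hello from user space\n";
    /* One user-kernel crossing: arguments are set up, a trap switches the
     * CPU to privileged mode, the kernel executes its write routine, and
     * control returns here in user mode with the result. */
    ssize_t written = write(STDOUT_FILENO, msg, strlen(msg));
    return written < 0 ? 1 : 0;
}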

9
Q

Monolithic OS

A

The design is large, can be hard to manage, and may not be very portable, but it can be optimized at compile time since it includes everything a system will need.

Downsides are in customization, portability, and manageability due to the large codebase, which can be hard to code, debug, and maintain. The memory footprint can also be huge, which can impact performance.

10
Q

Modular OS

A

The design can be smaller than a monolithic kernel because it is interface-oriented: modules that are required based on the usage and workload of the operating system can be loaded in when necessary and excluded when not.

It is less resource intensive and easier to maintain; however, performance is impacted by the indirection through interfaces, and modules are often sources of bugs that are not directly the fault of the operating system loading them. Given that modules can come from non-kernel authors, this approach can be buggy.

11
Q

Microkernel OS

A

The design has a very small footprint and supports only very basic roles:
memory management,
address spaces,
a location for execution of user processes (execution contexts).

The user level runs the typical operating system components like file systems, device drivers, etc. This requires many more inter-process communication interactions.

It is very small and easier to verify/test (useful for embedded devices).

It is often less portable (very specific to particular devices) and can be slow because of the number of user-kernel boundary crossings required.

12
Q

Process vs. thread, describe the distinctions.

A

A process is any user application that is running on the operating system.
It has its own memory space allocated to it, with both heap and stack allocations in virtual memory.
The process consists of its address space - this includes the code (text), the data that is available when the process is initialized, the heap, and the stack. As the process executes, dynamic data is added to the heap. The stack is a LIFO (last in, first out) data structure that is useful for process execution when an application needs to jump to another location to execute something and later return to a prior position.

A thread is similar to a process except that, when there are multiple threads, each has its own program counter, stack pointer, stack, and thread-specific registers, but shares the same virtual address space with the other threads.

13
Q

What happens on a process vs. thread context switch?

A

During a process context switch, all information about the process being preempted that is tracked by the CPU is saved into the process control block for that process. The CPU then has the information loaded for a different process, and the switch happens in reverse when the original process resumes. It may also require data to be evicted from the processor cache to make room for the other process.

When a thread context switch occurs, the running thread stores its execution context (its program counter, stack pointer, and register values) in memory, and the new thread's execution context is loaded from memory onto the processor. This is faster than a context switch between processes, because threads don't need the costly virtual-to-physical address mappings to be swapped out or recreated. This also leads to hotter caches across switches, which is a performance benefit.

14
Q

Describe the states in the lifetime of a process.

A

New - first state when a process is created
Ready - once the OS admits the process it gets a PCB and some initial resources; a process also returns to Ready when a running process is interrupted (context switch) or when its I/O/event completes
Running - the OS scheduler gives the CPU to the process
Waiting - an I/O event (or some other long-running event/operation) is in progress; the process moves back to Ready after the I/O or event completes
Terminated - the process finishes all operations or encounters an error

15
Q

Describe the lifetime of a thread.

A

A thread is created when a parent process/thread calls a thread creation mechanism.

Threads then run asynchronously (unless blocked/waiting).

Threads can be signalled or broadcast to in order to check whether they need to keep blocking or can continue executing. The parent process/thread can call a join mechanism to block itself until a child completes, after which the child's result is returned to the parent.

16
Q

Describe all the steps which take place for a process to transition from a waiting (blocked) state to a running (executing on the CPU) state.

A

A waiting process waits until the event or operation that caused the WAITING state finishes. It then transitions to the READY state. Once in the READY state the process can be scheduled by the scheduler onto a CPU, at which point it enters the RUNNING state.

17
Q

What are the pros-and-cons of message-based vs. shared-memory-based IPC?

A

Message-based IPC uses an OS-provided communication channel to allow processes to send messages to each other.
Good - the operating system maintains the communication channel for the processes, so the API is more universally implemented.
Bad - it requires a lot of overhead. The processes have to copy information into and out of the channel in kernel memory through the OS (i.e., system calls).

Shared-memory-based IPC is implemented when the OS maps a shared memory region into the address spaces of the processes.
Good - both/all processes can access this shared memory as if it were their own. This gets the OS out of the way, which is good for performance.
Bad - the OS no longer manages that address space; it is up to the processes, which can be bug prone, and the processes must know how to handle the shared memory region.
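
A hedged sketch of shared-memory IPC using POSIX shm_open/mmap (the name "/demo_shm" and the 4 KB size are made up for illustration; error handling is omitted, and older Linux systems may need -lrt when linking):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* System calls are only needed to set up the shared mapping... */
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, 4096);
    char *region = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* ...after that, reads and writes are plain memory accesses: no copies
     * through a kernel channel. Another process that opens and maps the
     * same name sees the same bytes (synchronization is up to the processes). */
    strcpy(region, "hello via shared memory");
    printf("%s\n", region);

    munmap(region, 4096);
    shm_unlink("/demo_shm");
    return 0;
}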

18
Q

What are benefits of multithreading?

A

Multithreading allows parallelization, which helps achieve overall performance/speed increases and/or execution time decreases.
Multiple threads of an application can be at different points in execution, handling more input at the same time (especially on multi-core systems).
Threads can also be assigned long-running or blocking tasks so the main application or other threads can continue processing information/input while those threads wait on slower devices like I/O.
Multithreading requires less memory because threads share an address space. This means the application requires less memory, which could result in fewer memory swaps.

19
Q

When is it useful to add more threads, when does adding threads lead to pure overhead?

A

Depending on the input and tasks of an application it could be beneficial to add more threads.
For example, in a pipeline pattern it could make sense to match the number of threads to the number of steps in the pipeline, or perhaps to use several threads per step (for the longer/more involved steps). In a boss-worker pattern it might be detrimental to add more threads dynamically if there isn't enough work for those threads to do. In general, it is useful to add more threads as long as there is idle CPU time: instead of letting the CPU stay idle, another thread can be context-switched in and start doing useful work. Once there is no idle CPU time left to utilize, adding more threads just adds overhead and slows down processing.

20
Q

What are the possible sources of overhead associated with multithreading?

A

context switching & synchronization between threads

When a boss thread has to manage a pool of worker threads, it may not know exactly what each thread is doing or what it did last, so it is difficult to know at execution time which threads may be more/less efficient at certain tasks.

overhead of keeping the shared buffer synchronized.

21
Q

Describe the boss-worker multithreading pattern

A

One boss/main thread creates one or more child/worker threads to complete tasks.

Each worker thread is assumed to fully complete its task, i.e., each worker handles all portions/subtasks of a task.
The boss thread then calls join and waits for the worker threads before finishing execution or moving on to another portion of the application logic.

22
Q

If you need to improve a performance metric like throughput or response time, what could you do in a boss-worker model?

A

Add work to a queue which the worker threads then pull tasks from. This gives the boss less one-on-one interaction with each thread, so it can focus on filling the queue. The queue can be filled fully before the worker threads are created, or the threads can be created first and the boss can then keep adding work to the queue, for even better response time/throughput.

Another way to improve performance is to exploit locality. This can be done by the boss assigning smaller, specialized sub-tasks to threads: if a thread continually works on the same kind of task, it is more likely to benefit from a hot cache.
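
A hedged pthreads sketch of the boss-worker queue described above (the queue size, worker count, integer "tasks", and lack of a shutdown path are simplifications for illustration):

#include <pthread.h>
#include <stdio.h>

#define QUEUE_SIZE  16
#define NUM_WORKERS 4

int queue[QUEUE_SIZE];
int head = 0, tail = 0, count = 0;
pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;
pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;

void *worker(void *arg) {
    for (;;) {
        pthread_mutex_lock(&qlock);
        while (count == 0)                        /* wait for the boss to enqueue work */
            pthread_cond_wait(&not_empty, &qlock);
        int task = queue[head];
        head = (head + 1) % QUEUE_SIZE;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&qlock);
        printf("worker handling task %d\n", task); /* the actual work goes here */
    }
    return NULL;
}

int main(void) {
    pthread_t workers[NUM_WORKERS];
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&workers[i], NULL, worker, NULL);

    for (int task = 0; task < 100; task++) {      /* the boss only enqueues */
        pthread_mutex_lock(&qlock);
        while (count == QUEUE_SIZE)
            pthread_cond_wait(&not_full, &qlock);
        queue[tail] = task;
        tail = (tail + 1) % QUEUE_SIZE;
        count++;
        pthread_cond_signal(&not_empty);
        pthread_mutex_unlock(&qlock);
    }
    pthread_join(workers[0], NULL);               /* workers loop forever in this sketch */
    return 0;
}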

23
Q

What are the limiting factors in improving performance with the boss/worker pattern?

A

The overhead associated with the synchronization requirements of the shared buffer/queue.

In addition, this pattern ignores locality (the boss does not try to exploit hot-cache opportunities when handing out work).

24
Q

Describe the pipelined multithreading pattern.

A

n threads, where each thread (or group of threads) handles one subtask in a series of tasks that together accomplish the overall purpose of the application.

The first thread(s) pass the work on to the second thread(s), which perform a separate subtask that relies on the first subtask being complete, and so on, until the end of the overall process.

25
Q

If you need to improve a performance metric like throughput or response time, what could you do in a pipelined model?

A

Gather more information about how long each subtask takes, or at the very least how much time each subtask takes relative to the others. For example, if subtask_1 takes 10 ms and subtask_2 takes 30 ms, it makes sense to have three times as many subtask_2 threads as subtask_1 threads.

26
Q

What are the limiting factors in improving performance with pipelining pattern?

A

The overhead involved with synchronizing threads, as well as balancing the pipeline (accurately determining the number of threads that should be performing each subtask).

27
Q

Mutexes

A

A mutex is a mechanism employed in multithreaded programs to enable mutual exclusion within the execution of concurrent threads.

Mutexes protect shared information from being updated simultaneously from different places at the same time. They are used like locks to ensure that access to shared information happens exclusively.
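
A minimal pthreads sketch of a mutex guarding a shared counter (the counter, loop count, and two threads are illustrative):

#include <pthread.h>
#include <stdio.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
long counter = 0;

void *increment(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* enter critical section */
        counter++;                    /* only one thread updates the shared data at a time */
        pthread_mutex_unlock(&lock);  /* exit critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);  /* 200000: no updates were lost */
    return 0;
}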

28
Q

Condition Variables

A

Condition variables are supplemental mechanisms to mutexes that allow more fine-grained control over the behaviour of multiple executing threads.

A condition variable keeps track of which threads are waiting on a certain set of criteria to become true.

They are used via wait() calls in a multithreading library API. Threads can then signal or broadcast on a condition variable to wake other threads that are waiting on the same condition(s).

29
Q

What are spurious wake-ups, how do you avoid them, and can you always avoid them?

A

Spurious wake-ups occur when a thread, before unlocking a mutex, signals/broadcasts to other threads that need to acquire that same mutex to continue. Since the signaling thread still holds the lock, the woken threads cannot make any progress and go right back to waiting.

You can avoid this by unlocking the mutex in the signaling thread first and then making the signal/broadcast calls. This isn't always possible, however: if deciding which threads to notify depends on shared state, you need to read that shared state (under the lock) before signaling on the condition variable(s).
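
A hedged sketch of the two exit orderings (the names m, read_ready, and resource_counter are illustrative, loosely mirroring the lecture's readers/writer example):

#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  read_ready = PTHREAD_COND_INITIALIZER;
int resource_counter = 0;

/* Wasteful: waiters are woken while this thread still holds m, so they
 * wake up, fail to acquire the mutex, and immediately block again. */
void exit_wasteful(void) {
    pthread_mutex_lock(&m);
    resource_counter = 0;
    pthread_cond_broadcast(&read_ready);
    pthread_mutex_unlock(&m);
}

/* Better (when the notification does not depend on shared state): release
 * the mutex first, then notify, so woken threads can actually proceed. */
void exit_better(void) {
    pthread_mutex_lock(&m);
    resource_counter = 0;
    pthread_mutex_unlock(&m);
    pthread_cond_broadcast(&read_ready);
}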

30
Q

Why do we need to use a while() loop for the predicate check in the critical section entry code examples in the lessons?

A

The while() loop in the critical-section-enter code of a thread is required because even after a thread acquires the mutex, the shared (proxy) variable may have been changed just before the lock was acquired. Another thread could have been signalled and acquired the lock first; re-checking the predicate in a while loop ensures the condition still holds before proceeding.
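
A compile-only pthreads fragment illustrating the pattern (items plays the role of the shared proxy variable; all names are illustrative):

#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  items_available = PTHREAD_COND_INITIALIZER;
int items = 0;   /* shared proxy variable */

void consume_one(void) {
    pthread_mutex_lock(&m);
    while (items == 0)                       /* not "if": re-check after every wakeup */
        pthread_cond_wait(&items_available, &m);
    items--;                                 /* safe: the predicate is known to hold here */
    pthread_mutex_unlock(&m);
}

void produce_one(void) {
    pthread_mutex_lock(&m);
    items++;
    pthread_cond_signal(&items_available);   /* wake one waiting consumer */
    pthread_mutex_unlock(&m);
}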

31
Q

What’s a simple way to prevent deadlocks? Why?

A

A simple way (if the application allows it) is to use only one lock/mutex. This works because, as long as the mutex is used correctly with lock/unlock, only one thread holds it at a time and there is no cycle of threads waiting on each other's locks. It is problematic in that it's quite restrictive and limits parallelism.

A more flexible but harder approach is to maintain a lock order. Example: you must lock mutex A before you can lock mutex B.
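
A small sketch of the lock-order rule (the mutex names are illustrative): every code path that needs both mutexes takes them in the same order, so no circular wait can form.

#include <pthread.h>

pthread_mutex_t m_A = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t m_B = PTHREAD_MUTEX_INITIALIZER;

void update_both(void) {
    pthread_mutex_lock(&m_A);   /* rule: always A first...        */
    pthread_mutex_lock(&m_B);   /* ...then B, in every code path  */
    /* ... touch the data protected by both locks ... */
    pthread_mutex_unlock(&m_B);
    pthread_mutex_unlock(&m_A);
}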

32
Q

Explain the relationship between kernel-level and user-level threads. Think through a general MxN scenario (as described in the Solaris papers) and the current Linux model. What happens during scheduling, synchronization, and signaling in these cases?

A

Kernel and user-level threads are essentially abstractions (via data structures). In a general many-to-many scenario kernel threads are assigned light-weight processes which are associated with one or more user-level threads. When a user-level thread is created it returns a thread ID which is not a pointer but instead an index in a table that points to the user-level thread data structure. The thread size is known at compile time which means it can be allocated in contiguous memory which helps improve locality and memory access. The OS/kernel does not know about the existence or properties of each user-level thread.

At the kernel level there are several data structures:

Process - tracks the list of kernel-level threads, signal handlers, the virtual address space, and user credentials.

Light-Weight Process (LWP) - close to a user-level thread but visible to the kernel. It is not needed when the process it is associated with is not running. It keeps track of information regarding the one or more user-level threads running in that process at any given time.

Kernel-level Thread

CPU

Special signals and system calls allow the user and kernel level threads to coordinate. Even in many-to-many scenarios user-level threads can be bound to kernel-level threads (called a “bound” thread). This is similar to the idea that a kernel may “pin” a kernel-thread to a specific CPU.

In Linux user-level threads are always bound to kernel-level threads. This makes handling signals and scheduling much simpler, since each user-level thread maps 1-to-1 to a kernel-level thread.

33
Q

Can you explain why some of the mechanisms described in the Solaris papers (for configuring the degree of concurrency, for signaling, the use of LWPs…) are not used or not necessary in the current threads model in Linux?

A

Linux uses 1-to-1 user-to-kernel threading, so the extra functionality/mechanisms provided by LWPs aren't necessary. Signaling is more straightforward in this design as well: you don't need the additional flag described in the Solaris paper, because the kernel-level thread that is going to deliver a signal corresponds 1-to-1 with the user-level thread and can tell whether it is in a critical section, so it can wait and avoid a potential deadlock.

At the time of the Solaris papers memory was very constrained, and only a few kernel-level thread data structures could be kept in physical memory at a time.

Nowadays it is possible to store a large number of kernel-level thread data structures in physical memory without running out of space. That is why most modern operating systems opt for a 1:1 threading model: the advantages (the kernel knows about all user-level threads and can make smarter scheduling decisions) outweigh the disadvantages (more memory usage).

34
Q

Interrupt

A

An interrupt is an event from some external device (external to the CPU that receives the interrupt) notifying the CPU that something has occurred.

It could be a timer notifying that a timeout has occurred, or an I/O device announcing the arrival of a network packet.

When a device interrupts the CPU it sends a unique, hardware-defined message. That message is looked up in an interrupt handler table and the corresponding handler is called: the PC is moved to the address of the handler code and execution continues from that point. Interrupts happen asynchronously.

35
Q

Signal

A

A signal is an event that comes from the software running on the OS or from the CPU/OS itself. Signals can occur both asynchronously and synchronously.

Similarly to interrupts, when a signal event happens the OS signals a process (instead of a device interrupting the CPU). The process has signal-specific handlers in its signal handler table, indexed by the OS-defined signals; there can also be default OS handlers for signals that a process does not specify.

36
Q

Can each process configure its own signal handler? Can each thread have its own signal handler?

A

Each process can configure its own signal handlers, as the signal handler table is process specific. Threads cannot set their own signal handlers, only their signal mask.

A kernel-level thread (KLT) can call a library wrapper routine that has visibility into all the threads running in the process. It can find a user-level thread (ULT) able to handle the signal even if the ULT currently running on the KLT where the signal arrived has that signal masked.
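
A hedged sketch of the process-wide handler vs. per-thread mask distinction (SIGUSR1 and the empty handler are illustrative):

#include <pthread.h>
#include <signal.h>

void on_sigusr1(int sig) { /* process-wide handler, installed once */ }

void *worker(void *arg) {
    sigset_t block;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    pthread_sigmask(SIG_BLOCK, &block, NULL);  /* only this thread masks SIGUSR1 */
    /* ... work; SIGUSR1 will be delivered to some other, unmasked thread ... */
    return NULL;
}

int main(void) {
    struct sigaction sa;
    sa.sa_handler = on_sigusr1;                /* one handler table per process */
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    pthread_join(t, NULL);
    return 0;
}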

37
Q

What’s the potential issue if an interrupt or signal handler needs to lock a mutex? What’s the workaround described in the Solaris papers?

A

It can cause a deadlock: the handler runs in the context of the thread that already holds the mutex and then tries to lock it again.

The threads library sets/clears a special flag in the thread's structure whenever it enters/exits a critical section. This flag indicates that all signals should be masked while the thread is in a critical section. (Implementing Lightweight Threads, D. Stein and D. Shah, "Signal Safe Critical Sections", page 8.)

Alternatively, the OS can detect that the handler contains a lock request and run it in a separate thread to prevent the risk of deadlock.

38
Q

Contrast the pros-and-cons of a multithreaded (MT) and multiprocess (MP) implementation of a webserver, as described in the Flash paper.

A

An MP implementation of a simple webserver has one main benefit: its simplicity.
Downsides:
it requires a lot of context switching (since certain parts of handling a request involve longer tasks like I/O);
each process has its own address space, which can be good because synchronization matters less, but it means a larger memory footprint the more requests you want to be able to handle;
the separate address spaces also prevent the processes from sharing a cache.

An MT implementation requires that the operating system supports kernel-level threads.
Pros:
a shared address space, for a smaller memory footprint and cheaper context switches.
Cons:
it is not as simple and requires more complicated programming; it also requires synchronization as HTTP requests come in to the server.

39
Q

What are the benefits of the event-based model described in the Flash paper over MT and MP? What are the limitations?

A

An event-based model is essentially a state machine. It is a looping event dispatcher which, based on incoming notifications/events, calls specific event handlers to handle each stage of an HTTP request. The dispatcher lets handlers execute to completion; if a handler needs to block, the dispatcher initiates the blocking operation and continues looping over additional events from other HTTP requests.

The largest benefits of an event-based model:
no context switches, and the entire process happens in the same address space with no synchronization required;
no need for context switches to hide latency (which only pay off when t_idle > 2 * t_ctx_switch). When there is more than one CPU the event-based model still works, because an event-based process can run on each CPU, each handling multiple requests.

Limitations:
cache pollution / loss of locality (each event handler may require different things to be loaded); blocking I/O calls can stall the entire model.

40
Q

Describe the AMPED model.

A

An AMPED model is an improvement on the SPED model.

makes use of helpers that directly deal with the blocking I/O operations which allows the main dispatcher to continue to pick up requests and serve them to handlers.
helpers communicate directly with the dispatcher via pipes.
helpers are processes themselves which handle the blocking operations.

41
Q

How do you think an AMTED version of Flash would compare to the AMPED version of Flash?

A

The AMTED model is similar to the AMPED model, with the difference that the helpers are threads instead of processes. This can be useful if the kernel supports threading;

however, it is necessarily more complicated to program than an AMPED implementation, for the same reason threads are more complicated than processes.

An AMTED model would probably be more efficient than an AMPED model because switching between threads is less costly than switching between processes.

42
Q

There are several sets of experimental results from the Flash paper discussed in the lesson. Do you understand the purpose of each set of experiments (what question they wanted to answer)? Do you understand why each experiment was structured in a particular way (why they chose the variables to be varied, the workload parameters, the measured metric…)?

A

reread

43
Q

If you ran your server from the class project for two different traces: (i) many requests for a single file, and (ii) many random requests across a very large pool of very large files, what do you think would happen as you add more threads to your server? Can you sketch a hypothetical graph?

A

Many requests for a single file:

If all the requests are for a single file you won't see a lot of improvement from multithreading. The file will be cached in memory, so each thread will have almost zero idle time, hence there is no benefit to preempting one thread to schedule another. The only reason to multithread in this case would be if you had multiple processors to work with, and then you would only need as many threads as there are processors.

Remember that stopping one thread's execution to run another only makes sense if the first thread has to wait, i.e., if t_idle > 2 * t_context_switch.

Many random requests across a very large pool of very large files:

This is where you would see dramatic benefits from multithreading compared to a single-threaded implementation. If the cache is cold, threads will sit idle while waiting to retrieve a file from disk. During this time another thread can execute, effectively "hiding" the latency of the file I/O.

44
Q

How is a new process created?

A

Via fork
If we want to create a process that is an exact replica of the calling process

Via fork followed by exec
If we want to create a process that is not an exact replica of the calling process

Via fork or fork followed by exec
Either of the above two options
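
A minimal fork-then-exec sketch in C ("/bin/ls" is just an illustrative program to exec):

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                    /* child is an exact copy of the parent */
    if (pid == 0) {
        /* child: exec replaces this copy with a new program;
         * on success execl never returns */
        execl("/bin/ls", "ls", "-l", (char *)NULL);
        perror("execl");                   /* reached only if exec failed */
        return 1;
    }
    waitpid(pid, NULL, 0);                 /* parent: wait for the child to finish */
    printf("child %d finished\n", (int)pid);
    return 0;
}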

45
Q

benefit of multithreading on 1 CPU

A

hide the latency associated with code that blocks processing (such as a disk I/O request).

46
Q

A shared calendar supports three types of operations for reservations:
read
cancel
enter

Requests for cancellations should have priority over reads, which in turn have priority over new updates (enter). In pseudocode, write the critical section enter/exit code for the read operation.

A

// shared state (all counters start at 0)
int curr_reader = 0, curr_canceller = 0, curr_writer = 0
int waiting_canceller = 0, waiting_reader = 0, waiting_writer = 0
condition cancel_cond, read_cond, write_cond
mutex c_lock

// ----- enter critical section for a read -----
lock(c_lock)
waiting_reader += 1
// cancellations take priority, so a reader waits while any canceller is
// waiting or active (and while an update is in progress)
while (waiting_canceller > 0 || curr_canceller > 0 || curr_writer > 0) {
    wait(c_lock, read_cond)
}
waiting_reader -= 1
curr_reader += 1
unlock(c_lock)

read(calendar)

// ----- exit critical section -----
lock(c_lock)
curr_reader -= 1
// wake waiters in priority order: cancel > read > enter/update
if (waiting_canceller > 0) {
    signal(cancel_cond)
} else if (waiting_reader > 0) {
    broadcast(read_cond)
} else if (waiting_writer > 0) {
    signal(write_cond)
}
unlock(c_lock)

47
Q

If the kernel cannot see user-level signal masks, then how is a signal delivered to a user-level thread (where the signal can be handled)?

A

All signals are intercepted by a handler that the user-level threading library installs. This handler determines which user-level thread, if any, the signal should be delivered to, and then takes the appropriate steps to deliver it.

If all user-level threads have the signal masked, the kernel-level signal mask is updated and the signal remains pending to the process.

48
Q

Process elements

A

list of kernel-level threads
virtual address space
user credentials
signal handlers

49
Q

LWP elements

A

User-level registers
system call args
resource usage info
signal mask
Similar to a ULT but visible to the kernel; not needed when the process is not running

50
Q

Kernel-threads - elements

A

Kernel-level registers
stack pointer
scheduling info
pointers to the associated LWP and process
Information needed even when the process is not running; not swappable

51
Q

CPU - data structure elements

A

current thread
list of kernel-level threads
dispatching and interrupt handling information
on SPARC, a dedicated register holds the current thread

52
Q

An image web server has three stages with average execution times as follows:

Stage 1: read and parse request (10ms)
Stage 2: read and process image (30ms)
Stage 3: send image (20ms)

You have been asked to build a multi-threaded implementation of this server using the pipeline model. Using a pipeline model, answer the following questions:

How many threads will you allocate to each pipeline stage?
What is the expected execution time for 100 requests (in sec)?
What is the average throughput of this system (in req/sec)? Assume there are infinite processing resources (CPUs, memory, etc.).

A

Threads should be allocated as follows:
Stage 1 should have 1 thread.
This 1 thread will parse a new request every 10ms.
Stage 2 should have 3 threads.
The requests parsed by Stage 1 get passed to the threads in Stage 2. Each thread picks up a request and needs 30ms to process the image. Hence, we need 3 threads in order to pick up a new request as soon as Stage 1 passes one on.
Stage 3 should have 2 threads.
This is because Stage 2 will finish an image and pass on a request every 10ms (once the pipeline is filled). Each Stage 3 thread needs 20ms to send an image, so with 2 threads Stage 3 can pick up a new request every 10ms.
The first request will take 60ms. The last stage will then continue delivering the remaining 99 requests at 10ms intervals. So the total is 60ms + (99 * 10ms) = 1050ms = 1.05s.
100 req / 1.05 s = 95.2 req/s
Relevant sections:
P2L2: Threads and Concurrency
Pipeline Pattern
Multithreading Patterns Quiz