11-16 Flashcards
Thread: A new abstraction for __.
a single running process
Multi-threaded program:
(3)…
• A multi-threaded program has more than one point of execution
- Multiple program counters, one for each thread
• They share the same address space
• Each thread has its own stack
• Each thread has its own private set of registers
Context switch between threads:
Each thread has its own __.
- …
program counter and set of registers;
One or more thread control blocks (TCBs) are needed to store the state of each thread
Context switch between threads:
When switching from running one thread (T1) to running another (T2):
(3)…
- The register state of T1 must be saved
- The register state of T2 must be restored
- The address space remains the same
Why Use Threads?:
(2)…
- Parallelism
• Divide a task among several threads
• On a system with multiple processors, threads can work in parallel with each other
- Overlap of I/O with other activities within a single program
• When one thread requests I/O from the system, switch to another thread ready to work
• Similar to multiprogramming across different processes
Parallelism:
(2)…
- Divide a task among several threads
* On a system with multiple processors, threads can work in parallel with each other
Overlap of I/O with other activities within a single program:
(2)…
- When one thread requests I/O from the system, switch to another thread ready to work
- Similar to multiprogramming across different processes
The stack of the relevant thread:
There will be one stack per thread:
A Single-Threaded Address Space:
(3)…
- The code segment: where instructions live
- The heap segment: contains malloc’d data, dynamic data structures (it grows downward)
- The stack segment: contains local variables, arguments to routines, return values, etc. (it grows upward)
Example: Creating a Thread
#include <pthread.h>

void *mythread(void *arg) {
    printf("%s\n", (char *) arg);
    return NULL;
}

int main(int argc, char *argv[]) {
    if (argc != 1) {
        fprintf(stderr, "usage: main\n");
        exit(1);
    }
    pthread_t p1, p2;
    printf("main: begin\n");
    Pthread_create(&p1, NULL, mythread, "A");
    Pthread_create(&p2, NULL, mythread, "B");
    // join waits for the threads to finish
    Pthread_join(p1, NULL);
    Pthread_join(p2, NULL);
    printf("main: end\n");
    return 0;
}
How to create and control threads?:
#include <pthread.h>
int pthread_create(pthread_t* thread,
                   const pthread_attr_t* attr,
                   void* (*start_routine)(void*),
                   void* arg);
• thread: __
• attr: __
- __
• start_routine: __
• arg: __
- a void pointer allows us to __
• return value: __
• thread: Used to interact with this thread
• attr: Used to specify any attributes this thread might have
- Stack size, Scheduling priority, …
• start_routine: the function this thread starts running in
• arg: the argument to be passed to the function (start routine)
- a void pointer allows us to pass in any type of argument
• return value: on success, returns 0; on error, it returns an error number, and the contents of *thread are undefined
Wait for a thread to complete: int pthread_join(pthread_t thread, void **value_ptr);
• thread: __
• value_ptr: __
- Because the pthread_join() routine changes the value, __
• thread: Specify which thread to wait for
• value_ptr: A pointer to the return value
- Because the pthread_join() routine changes the value, you need to pass in a pointer to that value
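• The "dangerous code" card below uses a myarg_t type without showing its definition. A minimal sketch of the usual pattern (type and field names are illustrative, chosen to match the usage on the next card): pack the arguments into a struct and pass its address as the void * arg.

typedef struct {
    int a;
    int b;
} myarg_t;

void *mythread(void *arg) {
    myarg_t *m = (myarg_t *) arg;  // unpack the struct passed by the creator
    printf("%d %d\n", m->a, m->b);
    return NULL;
}

int main(int argc, char *argv[]) {
    pthread_t p;
    myarg_t args = { 10, 20 };     // must stay alive at least until the join below
    Pthread_create(&p, NULL, mythread, &args);
    Pthread_join(p, NULL);
    return 0;
}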
Example: Dangerous code
• Be careful with __ from a thread
(2)…
how values are returned
- Variable r is allocated on the stack
- When the thread returns, r is automatically de-allocated
void *mythread(void *arg) {
    myarg_t *m = (myarg_t *) arg;
    printf("%d %d\n", m->a, m->b);
    myret_t r; // Danger!!!!! Why is this bad?
    r.x = 1;
    r.y = 2;
    return (void *) &r;
}
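• A safe alternative (a sketch, not part of the card itself): return a pointer to heap-allocated memory, which outlives the thread; the joining thread is then responsible for freeing it. Requires <stdlib.h> for malloc/free.

void *mythread(void *arg) {
    myarg_t *m = (myarg_t *) arg;
    printf("%d %d\n", m->a, m->b);
    myret_t *r = malloc(sizeof(*r)); // heap allocation survives the thread's return
    r->x = 1;
    r->y = 2;
    return (void *) r;
}

// In the joiner (illustrative):
// myret_t *rvals;
// Pthread_join(p, (void **) &rvals);
// printf("returned %d %d\n", rvals->x, rvals->y);
// free(rvals);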
Example: Simpler Argument Passing to a Thread
Just passing in a single value:
void *mythread(void *arg) {
    int m = (int) arg;
    printf("%d\n", m);
    return (void *) (arg + 1);
}

int main(int argc, char *argv[]) {
    pthread_t p;
    int rc, m;
    pthread_create(&p, NULL, mythread, (void *) 100);
    pthread_join(p, (void **) &m);
    printf("returned %d\n", m);
    return 0;
}
Threading: Shared Variables
static volatile int counter = 0; // shared global variable

// mythread()
// Simply adds 1 to counter repeatedly, in a loop.
// No, this is not how you would add 10,000,000 to a counter,
// but it shows the problem nicely.
void *mythread(void *arg) {
    printf("%s: begin\n", (char *) arg);
    int i;
    for (i = 0; i < 1e7; i++) {
        counter = counter + 1;
    }
    printf("%s: done\n", (char *) arg);
    return NULL;
}

// main()
// Just launches two threads (pthread_create)
// and then waits for them (pthread_join)
int main(int argc, char *argv[]) {
    pthread_t p1, p2;
    printf("main: begin (counter = %d)\n", counter);
    Pthread_create(&p1, NULL, mythread, "A");
    Pthread_create(&p2, NULL, mythread, "B");
    Pthread_join(p1, NULL); // join waits for the threads to finish
    Pthread_join(p2, NULL);
    printf("main: done with both (counter = %d)\n", counter);
    return 0;
}
Race condition:
Example with two threads and counter = 50
• counter = __
• We expect the result to be __
counter + 1 (runs twice, once for each thread);
52
Critical section:
(3)…
- A piece of code that accesses a shared variable and must not be concurrently executed by more than one thread
- Multiple threads executing critical section can result in a race condition
- Need to support atomicity for critical sections (mutual exclusion)
Locks ensure that any such critical section executes as if it were a single atomic instruction
• An example: the canonical update of a shared variable:
balance = balance + 1;
• Add some code around the critical section:
lock_t mutex; // some globally-allocated lock ‘mutex’
…
lock(&mutex);
balance = balance + 1;
unlock(&mutex);
Locks provide __ to a critical section.
- Interface
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
- Usage (w/o lock initialization and error check)
pthread_mutex_t lock;
pthread_mutex_lock(&lock);
x = x + 1; // or whatever your critical section is
pthread_mutex_unlock(&lock);
- No other thread holds the lock → the thread will acquire the lock and enter the critical section
- If another thread holds the lock → the thread will not return from the call until it has acquired the lock
- No other thread holds the lock → __
* If another thread holds the lock → __
the thread will acquire the lock and enter the critical section;
the thread will not return from the call until it has acquired the lock
Lock Initialization:
All locks must be __.
2 ways:
…
- One way: using PTHREAD_MUTEX_INITIALIZER
- The dynamic way: using pthread_mutex_init()
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
int rc = pthread_mutex_init(&lock, NULL);
assert(rc == 0); // always check success!
Lock Error Checking:
• Always check the __.
return value for errors when calling lock and unlock
---
• An example wrapper
// Use this to keep your code clean but check for failures
// Only use if exiting program is OK upon failure
void Pthread_mutex_lock(pthread_mutex_t *mutex) {
    int rc = pthread_mutex_lock(mutex);
    assert(rc == 0);
}
Condition variables are useful when __.
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_signal(pthread_cond_t *cond);
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex); int pthread_cond_signal(pthread_cond_t *cond);
- pthread_cond_wait
(2) …
- pthread_cond_signal
(1) …
- Put the calling thread to sleep
- Wait for some other thread to signal it;
• Unblock at least one of the threads that are blocked on the condition variable
Condition Variable Example: A thread calling wait routine:
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t init = PTHREAD_COND_INITIALIZER;

pthread_mutex_lock(&lock);
while (initialized == 0)
    pthread_cond_wait(&init, &lock);
pthread_mutex_unlock(&lock);
(2)…
• The wait call releases the lock when putting said caller to sleep
• Before returning after being woken, the wait call re-acquires the lock
—
• A thread calling signal routine
pthread_mutex_lock(&lock);
initialized = 1;
pthread_cond_signal(&init);
pthread_mutex_unlock(&lock);
Threads:
Compiling and Running:
To compile, you must __.
include the header pthread.h
• Explicitly link with the pthreads library, by adding the -pthread flag:
prompt> gcc -o main main.c -Wall -pthread
• For more information:
man -k pthread
Thread API Guidelines:
(8)…
- Keep it simple
• Tricky thread interactions lead to bugs
- Minimize thread interactions
• Each interaction should be carefully thought out and constructed with known design patterns (which we will learn about)
- Initialize locks and condition variables
• Otherwise code can sometimes fail in strange and unpredictable ways
- Check your return codes
• Failure to do so will lead to bizarre and hard-to-understand behavior
- Be careful how you pass arguments and return values
• If you return a reference to a variable on the stack, you are going to have a bad time
- Each thread has its own stack
• To share data between threads, the values must be in the heap or a global variable
- Always use condition variables to signal between threads
• It’s tempting to use a simple flag, but don’t do it
- Use the man pages
• Highly informative with good examples
Locks: The Basic Idea
Lock variable holds the state of the lock:
(2)…
- available (or unlocked or free)
• No thread holds the lock
- acquired (or locked or held)
• Exactly one thread holds the lock and presumably is in a critical section
lock():
(3)…
• Try to acquire the lock
• If no other thread holds the lock, the thread will acquire the lock
• Enter the critical section
- This thread is said to be the owner of the lock
Other threads are prevented from entering the critical section while __.
(2)…
the first thread holds the lock;
- Other threads will block on the call to lock, until the lock is released
- If several threads are waiting on the lock, only one will get it when it is released
Pthread Locks - mutex:
(3)…
• The name that the POSIX library uses for a lock
• Used to provide mutual exclusion between threads
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
Pthread_mutex_lock(&lock); // wrapper for pthread_mutex_lock()
balance = balance + 1;
Pthread_mutex_unlock(&lock);
• We may be using different locks to protect different variables → Increase concurrency (a more fine-grained approach)
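• A small sketch of the fine-grained idea on the last bullet (variable and lock names are illustrative, not from the slides): give each shared variable its own lock, so threads touching different variables do not serialize on one big lock.

pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t count_lock   = PTHREAD_MUTEX_INITIALIZER;
int balance = 0;
int count   = 0;

void deposit(int amount) {
    Pthread_mutex_lock(&balance_lock);   // serializes only with other balance updates
    balance = balance + amount;
    Pthread_mutex_unlock(&balance_lock);
}

void tick(void) {
    Pthread_mutex_lock(&count_lock);     // independent of balance_lock
    count = count + 1;
    Pthread_mutex_unlock(&count_lock);
}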
Building A Lock:
(2)…
- Efficient locks provide mutual exclusion at low cost
* Building a lock needs some help from the hardware and the OS
Evaluating locks – Basic criteria:
(3)…
- Mutual exclusion
• Does the lock work, preventing multiple threads from entering a critical section?
- Fairness
• Does each thread contending for the lock get a fair shot at acquiring it once it is free? (Starvation)
- Performance
• The time overheads added by using the lock
Controlling Interrupts:
Disable Interrupts for __.
critical sections
Controlling Interrupts:
Disable Interrupts for critical sections:
(2)…
• One of the earliest solutions used to provide mutual exclusion
• Invented for single-processor systems
---
void lock() {
    DisableInterrupts();
}
void unlock() {
    EnableInterrupts();
}
Disable Interrupts for critical sections:
• One of the earliest solutions used to provide mutual exclusion
• Invented for single-processor systems
Problems:
(3)…
• Require too much trust in applications
- Greedy (or malicious) program could monopolize the processor
• Do not work on multiprocessors
• Code that masks or unmasks interrupts is executed slowly by modern CPUs
Locks:
Why is hardware support needed?
First attempt: Using a flag denoting whether the lock is held or not
• The code below has problems

typedef struct __lock_t { int flag; } lock_t;

void init(lock_t *mutex) {
    // 0 → lock is available, 1 → held
    mutex->flag = 0;
}

void lock(lock_t *mutex) {
    while (mutex->flag == 1) // TEST the flag
        ;                    // spin-wait (do nothing)
    mutex->flag = 1;         // now SET it!
}

void unlock(lock_t *mutex) {
    mutex->flag = 0;
}
• Problem 1: No Mutual Exclusion (assume flag=0 to begin)
• Problem 2: Spin-waiting wastes time waiting for another thread
• So, we need an atomic instruction supported by hardware
- test-and-set instruction, also known as atomic exchange
Test And Set (Atomic Exchange):
An instruction to support the creation of simple locks
int TestAndSet(int *ptr, int new) {
    int old = *ptr; // fetch old value at ptr
    *ptr = new;     // store 'new' into ptr
    return old;     // return the old value
}
(4)…
• return (test) old value pointed to by the ptr
• Simultaneously update (set) said value to new
• This sequence of operations is performed atomically
• x86_64:
- xchg rax, (mem)
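• A sketch of a spin lock built on TestAndSet: same structure as the flag-based attempt above, but the TEST and the SET now happen as one atomic step, so two threads can no longer both see flag == 0 and both enter.

typedef struct __lock_t { int flag; } lock_t;

void init(lock_t *lock) {
    lock->flag = 0;   // 0 → available, 1 → held
}

void lock(lock_t *lock) {
    while (TestAndSet(&lock->flag, 1) == 1)
        ;             // spin-wait until the old value comes back as 0
}

void unlock(lock_t *lock) {
    lock->flag = 0;
}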
Evaluating Spin Locks:
- Correctness: …
- Fairness: …
- Performance: …
- Correctness: yes
• The spin lock only allows a single thread to enter the critical section
- Fairness: no
• Spin locks don’t provide any fairness guarantees
• Indeed, a thread spinning may spin forever
- Performance:
• For a single CPU, performance overheads can be quite painful
• If the number of threads roughly equals the number of CPUs, spin locks work reasonably well
Compare-And-Swap:
Test whether the value at the address (ptr) is equal to expected
(3)…
• If so, update the memory location pointed to by ptr with the new value
• In either case, return the actual value at that memory location
• x86_64
- cmpxchg
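• The compare-and-swap semantics described above, written as C pseudocode, plus how a spin lock’s lock() could use it (lock_t as in the test-and-set sketch; the real instruction performs these steps atomically):

int CompareAndSwap(int *ptr, int expected, int new) {
    int actual = *ptr;
    if (actual == expected)
        *ptr = new;   // only update if the value is what we expected
    return actual;    // in either case, return what was actually there
}

void lock(lock_t *lock) {
    while (CompareAndSwap(&lock->flag, 0, 1) == 1)
        ; // spin: keep trying to flip flag from 0 to 1
}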
Fetch-And-Add
Atomically increment a value while returning the old value at a particular address
__ can be built with fetch-and-add
• __ → fairness
Ticket lock;
Ensure progress for all threads
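• A sketch of the ticket lock built on fetch-and-add (FetchAndAdd shown as C pseudocode for the atomic instruction): each thread takes a ticket, and the lock serves tickets in order, so every waiter eventually gets a turn.

int FetchAndAdd(int *ptr) {
    int old = *ptr;   // performed atomically by the hardware
    *ptr = old + 1;
    return old;
}

typedef struct __lock_t {
    int ticket;       // next ticket to hand out
    int turn;         // ticket currently allowed to enter
} lock_t;

void lock_init(lock_t *lock) {
    lock->ticket = 0;
    lock->turn   = 0;
}

void lock(lock_t *lock) {
    int myturn = FetchAndAdd(&lock->ticket); // take a ticket
    while (lock->turn != myturn)
        ; // spin until it is my turn
}

void unlock(lock_t *lock) {
    lock->turn = lock->turn + 1; // pass the lock to the next ticket holder
}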
Hardware-based spin locks are __ and they __.
simple;
work
Spinning:
• Hardware-based spin locks are simple and they work
• In some cases, these solutions can be quite inefficient
- Any time a thread gets caught spinning, it __
wastes an entire time slice doing nothing but checking a value
A Simple Approach: Just Yield:
When you are going to spin, __:
give up the CPU to another thread;
- OS system call moves the caller from the running state to the ready state
- The cost of a context switch can be substantial and the starvation problem still exists
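• A minimal sketch of the yield approach (reuses the flag lock from the test-and-set sketch; sched_yield() is the POSIX call that moves the caller from running to ready):

#include <sched.h>

void lock(lock_t *lock) {
    while (TestAndSet(&lock->flag, 1) == 1)
        sched_yield();   // give up the CPU instead of spinning out the time slice
}

void unlock(lock_t *lock) {
    lock->flag = 0;
}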
Using Queues: Sleeping Instead of Spinning:
Queue to __.
• park()
- __
• unpark(threadID)
- __
keep track of which threads are waiting to enter the lock;
• park()
- Put a calling thread to sleep
• unpark(threadID)
- Wake a particular thread as designated by threadID
Wakeup/waiting race:
Thread A releases the lock just before Thread B calls park()
• Thread B could potentially __.
• Solaris solves this problem by adding a third system call:
(2)…
sleep forever;
setpark()
• By calling this routine, a thread can indicate it is about to park
• If the thread happens to be interrupted and the lock is freed before park is actually called, the subsequent park returns immediately instead of sleeping
Two-Phase Locks:
A two-phase lock realizes that spinning can be useful if __.
the lock is about to be released
Two-Phase Locks: A two-phase lock realizes that spinning can be useful if the lock is about to be released.
1. First phase: (2)...
2. Second phase: (2)...
- First phase
• The lock spins for a while, hoping that it can acquire the lock
• If the lock is not acquired during the first spin phase, a second phase is entered
- Second phase
• The caller is put to sleep
• The caller is only woken up when the lock becomes free later
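• A sketch of a two-phase lock under stated assumptions: MAX_SPINS is an illustrative tuning constant, and futex_wait()/futex_wake() stand in for wrappers around the Linux futex system call (they are not standard libc functions). Phase one spins briefly; phase two sleeps until woken.

#define MAX_SPINS 1000

void lock(lock_t *m) {
    int i;
    for (i = 0; i < MAX_SPINS; i++) {    // phase 1: hope the holder is almost done
        if (TestAndSet(&m->flag, 1) == 0)
            return;                      // acquired during the spin phase
    }
    while (TestAndSet(&m->flag, 1) == 1) // phase 2: sleep instead of spinning
        futex_wait(&m->flag, 1);         // sleep only if flag still reads 1
}

void unlock(lock_t *m) {
    m->flag = 0;
    futex_wake(&m->flag);                // wake one sleeping waiter, if any
}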
Condition Variables:
(2)…
• There are many cases where we wish to have coordination between threads
• A thread wishes to check whether a condition is true before continuing its execution
—
• Example:
• A parent thread might wish to check whether a child thread has completed
• This is often called a join()
How to wait for a condition:
__.
Condition variable – an object used to wait for some condition to be true
Condition variable – an object used to wait for some condition to be true:
- Waiting on the condition variable
(2) …
- Signaling on the condition variable
(1) …
- Waiting on the condition variable:
• An explicit queue that threads can put themselves on when some state of execution is not as desired
• The thread is no longer running, freeing up the CPU to run another thread
- Signaling on the condition variable:
• Some other thread, when it changes said state, can wake one of those waiting threads and allow them to continue
The wait() call takes a __ as a parameter
mutex
The wait() call takes a mutex as a parameter (3)...
- The wait() call releases the lock and puts the calling thread to sleep
- When the thread wakes up, it must re-acquire the lock
- It is assumed the thread is holding the lock when signal() is called
Parent waiting for Child: Use a condition variable:
Parent:
(2)…
- Create the child thread and continues running itself
- Call into thr_join() to wait for the child thread to complete
• Acquire the lock
• Check if the child is done
• Put itself to sleep by calling wait()
• Release the lock
Parent waiting for Child: Use a condition variable:
Child:
(2)…
- Print the message “child”
- Call thr_exit() to wake the parent thread
• Grab the lock
• Set the state variable done
• Signal the parent thus waking it
The Importance of the State Variable done:
Can you think of a scenario where we could run into problems?
(3)…
Imagine the case where the child runs immediately
• The child will signal, but there is no thread sleeping on the condition
• When the parent runs, it will call wait and be stuck
• No thread will ever wake it, sad!
volatile int done = 0;

void thr_exit() {
    done = 1;
    Pthread_cond_signal(&c);
}

void thr_join() {
    if (done == 0)
        Pthread_cond_wait(&c);
}
int main(int argc, char *argv[]) {
    printf("parent: begin\n");
    pthread_t p;
    Pthread_create(&p, NULL, child, NULL);
    thr_join();
    printf("parent: end\n");
    return 0;
}
Can you find the bug?
…
The issue here is a race condition
1. The parent calls thr_join()
• The parent checks the value of done
• It will see that it is 0 and try to go to sleep
• Just before it calls pthread_cond_wait to go to sleep, the parent is interrupted and the child runs
2. The child changes the state variable done to 1 and signals
• But no thread is waiting and thus no thread is woken
• When the parent runs again, it sleeps forever, sad!
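• A sketch of the fix these cards point at: keep the state variable done, hold the lock around both the check and the update, and re-check done in a while loop after waking (m, c, and done are the globals assumed by the snippets above).

int done = 0;
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t c  = PTHREAD_COND_INITIALIZER;

void thr_exit() {
    Pthread_mutex_lock(&m);
    done = 1;                      // record that the child has finished
    Pthread_cond_signal(&c);       // wake the parent if it is already waiting
    Pthread_mutex_unlock(&m);
}

void thr_join() {
    Pthread_mutex_lock(&m);
    while (done == 0)              // re-check the state after every wakeup
        Pthread_cond_wait(&c, &m); // releases m while asleep, re-acquires before returning
    Pthread_mutex_unlock(&m);
}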
The Producer / Consumer (Bounded Buffer) Problem:
- Producer
(2) … - Consumer
(1) …
- Producer
• Produces data items
• Wishes to place data items in a buffer
- Consumer
• Grabs data items out of the buffer to consume them in some way
—
Example: Multi-threaded web server
• A producer puts HTTP requests into a work queue
• Consumer threads take requests out of this queue and process them
Producer/Consumer (non-working): • Put: ... • Get: ... • Producer: ... • Consumer: ... • Need...
- Put – Only put data into the buffer when count is zero (i.e., when the buffer is empty)
- Get – Only get data from the buffer when count is one (i.e., when the buffer is full)
- Producer – puts an integer into the shared buffer loops number of times
- Consumer – gets the data out of that shared buffer
- Need synchronization between the producer and consumer
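• A sketch of the non-working single-slot buffer these bullets describe: put(), get(), count, and the producer/consumer loops (loops is an assumed iteration count). With no synchronization, the asserts fire as soon as the timing goes wrong, hence the need for synchronization between the producer and consumer.

int buffer;
int count = 0; // 0 → empty, 1 → full

void put(int value) {
    assert(count == 0);  // only put when the buffer is empty
    count = 1;
    buffer = value;
}

int get() {
    assert(count == 1);  // only get when the buffer is full
    count = 0;
    return buffer;
}

void *producer(void *arg) {
    int i;
    for (i = 0; i < loops; i++)
        put(i);                // no waiting: may trip the assert in put()
    return NULL;
}

void *consumer(void *arg) {
    int i;
    for (i = 0; i < loops; i++)
        printf("%d\n", get()); // no waiting: may trip the assert in get()
    return NULL;
}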
Producer/Consumer: Single CV and If Statement:
1. A single condition variable cond and associated lock mutex
• Works if __.
2. What happens if that is not the case (e.g., 2 consumers, 1 producer)?
(3)…
3. Recheck state (in a while loop) upon returning from wait!
there is one producer and one consumer;
- C1 runs and waits, P1 puts an item in and signals C1
- Before C1 gets to run, C2 sneaks in and consumes the item, setting count to 0
- When C1 runs, no more items left, sad!
Thread Trace: Broken Solution
The problem arises for a simple reason:
…
After the producer woke T_c1, but before T_c1 ever ran, the state of the bounded buffer was changed by T_c2.
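• A sketch of the repaired consumer/producer loops using the while-loop recheck from point 3 above (mutex, cond, put(), get(), count, and loops as assumed on the earlier cards). Re-checking count after wait() means T_c1 simply goes back to sleep if T_c2 already emptied the buffer. With a single condition variable this is still not the complete solution (a consumer can end up waking another consumer), but it eliminates the specific race traced here.

void *producer(void *arg) {
    int i;
    for (i = 0; i < loops; i++) {
        Pthread_mutex_lock(&mutex);
        while (count == 1)                    // while, not if: re-check after waking
            Pthread_cond_wait(&cond, &mutex);
        put(i);
        Pthread_cond_signal(&cond);
        Pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

void *consumer(void *arg) {
    int i;
    for (i = 0; i < loops; i++) {
        Pthread_mutex_lock(&mutex);
        while (count == 0)                    // the buffer may already be empty again
            Pthread_cond_wait(&cond, &mutex);
        int tmp = get();
        Pthread_cond_signal(&cond);
        Pthread_mutex_unlock(&mutex);
        printf("%d\n", tmp);
    }
    return NULL;
}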