Chapter 4 - Pthreads Flashcards
What is a shared memory, from the programmers point of view?
All cores can access all the memory locations.
What is a critical section?
The code that updates a shared resource, that can only be updated by one thread at a time.
What are threads?
An instance of a program running on a processor. (In MPI this is called a process)
Draw a shared-memory system.
Why is a thread a “lighter-weight” process?
It contains just an instance of the running program, not the additional components (memory, descriptors, etc.).
They do have their own stack and local variables
What does a process consist of?
A process is an instance of a running program.
Stack- and heap memory
Descriptors of resources allocated for process (stdin, stdout)
Security info
State info (ready, waiting, register contents, …)
What are POSIX threads?
Standard for unix-like OS’es
It specifies an API for multithreaded programming.
How are pthread files compiled?
Like normal C files
Might need to link pthread library
gcc -o hello hello.c -lpthread
What is the pthread header file?
include <pthread.h></pthread.h>
What are global variables in thread-programs?
Variables that are shared between all threads.
How are threads started in pthread programs?
Threads are started by the program executable - meaning code needs tostart the threads itself.
What is the datastructure of threads?
pthread_t
What function is used to start threads?
pthread_create(
pthread_t* thread,
const pthread_attr_r* attr,
void* (start_routine)(void),
void* args_p
)
attr: Can pass NULL
start_routine: The function to be run by the thread
args_p: arguments to pass to the function being run, thread rank is often passed here.
How should functions being run by threads be declared?
void* thread_function(void* args_p)
Give an example of how threads can be created and how their rank is passed to them
int rank=0;
pthread_t thread = malloc(sizeof pthread_t);
pthread_create(&thread, NULL, &func, (void*) rank)
What kind of parallelism is most often used with pthreads?
SIMD, but there is no technical reason for threads only to run the same programs. It just is most often done.
Threads still do thread-branching within a thread.
What is a main thread?
Thread that runs the main-function
Does threads need to return something?
They do not need to return something useful, but because pthread_create have the return type(void*), the thread should return something. This can be NULL
How is threads stopped?
pthread_join(
thread_t* t,
void** ret_val
)
When join is called by main thread, the program will wait until the corresponding thread has completed.
Join also frees the resources associated with the thread.
ret_val: Used to retrieve return value
What are zombie threads?
Threads that are never joined, and their resources never freed. These wastes resources
What does pthread_detach do?
Used when a program does not need to wait for a thread to finish.
The function call indicates to the htread that its resources should be freed upon termination.
How can error checking be done using pthreads?
The pthread_create function have several return values that indicate different errors
What is a thread pool?
The main threads start all threads it think it will be needing, at the beginning of the program.
The threads sit idle when they finish their work, instead of terminating. When a new requests comes, an idle thread can respond to it.
This reduces thread-creation overhead
What is a race condition?
When multiple threads access the same shared memory, and at least one thread execute an update, the result may be unpredictable and we can get errors.
How can we ensure that a critical section is only entered by one thread at the time?
Flag + busy waiting: Can e.g. keep value of rank that is allowed in critical section
mutexes
semaphores
What is busy-waiting?
A thread repeatedly tests a condition, but does not do any useful work until the condition has been met.
How can busy-waiting be coded?
while(flag != my_rank);
No loop body
Is there a situation where using while loops to busy work might not work?
If a compiler optimizes the code and changes the order of instructions in a way that the critical section is executed before the while-loop checks the condition.
If while loops are used to busy wait, compiler optimizations should be turned off. The compiler does not know about multiple threads being used.
What is a spinlock?
When a thread is spinning on the while-loop condition, eating up CPU resources.
This can happen if the thread that is currently accessing the critical section gets preempted by the CPU and thereby delayed.
Why is busy-waiting not optimal for performance?
Can get spinlocks
Turning off compiler optimizations can hurt performance alot.
Howcan serial execution be faster than using multiple threads and busy-waiting?
Thread creation/joining has overhead. This is not the main factor though.
When using busy waiting, all the threads are alternating between execution and waiting. This can increase the run time alot.
Why does critical sections reduce performance?
Because execution must effectively be serialized in critical sections.
To increase performance, we should try to minimize the number of times a critical section is entered.
What is a mutex?
Mutual exclusion
Guarantees exclusive access to critical section.
A data type consisting of a variable and a couple functions used to achieve this.
pthread_mutex_t
How is a mutex created?
int pthread_mutex_init(
pthread_mutex_t* m,
const pthread_mutexattr_t* attr
)
Attr: Can be NULL
What is a static mutex initialization?
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER
easier initialization method, fine in many cases
What is pthread_mutex_destroy?
Method called when a program finishes using a mutex
pthread_mutex_destroy(
pthread_mutex_t* m
)
How are mutexes used to access critical sections
A thread will lock the mutex
A thread will do the work
A thread unlocks the mutex
What is pthread_mutex_lock?
pthread_mutex_lock(
pthread_mutex_t* m
)
used by threads to lock a mutex
Causes the thread to lock until the critical section is clear.
What is pthread_mutex_unlock
pthread_mutex_unlock(
pthread_mutex_t* m
)
Used by threads to unlock mutexes
Notifies system that the thread has left the critical section.
What is a difference between busy-waiting and using mutexes?
Busy-waiting requires specific conditions, for example specifies what rank can execute the section.
The order of which mutexes are required can be random. Nothing specifies rank. Accesses are scheduled by the system. Not necessarily in the order they called lock().
What is a property that can be useful with busy-waiting?
We know the order in which threads will complete the critical section.
What are semaphores?
Special type of unsigned int.
What is a binary semaphore?
A semaphore that only take on the values 0 and 1
0: Corresponds to locked mutex
1: Correspond to unlocked
How do you use a binary semaphore as a mutex?
Initialize semaphore to 1
Before critical section: sem_wait
After critical section: sem_post
What does sem_wait do?
If semaphore is non-zero, it gets decremented
If semaphore is 0, thread waits until it is not
What does sem_post do?
Increments semaphore
What is the difference between a semaphore and a mutex?
There is no owner of a semaphore
How are semaphores initialized?
int sem_init(
tem_t* s,
int shared,
unsigned init_val
)
shared: Whether semaphore shall be shared between threads or processes.
if we pass constant 0: shared amongst threasds
What does sem_destroy do?
int sem_destroy(sem_t* semaphore)
What header file does semaphores need?
include <semaphore.h></semaphore.h>
What is producer-consumer synchronization?
When one thread cannot proceed until another has finished some work.
Does not necessarily mean involving a critical section.
In what situations can counting semaphores be useful (more values than binary)
For example when we want to restrict the amounts of resources used.
If we only have 4 cores, we can create a semaphore that allows creating maximum 4 threads. This is an alternative to thread pooling.
What is a barrier?
A point of synchronization to make sure all threads are at the same place in the program.
No process can proceed without everyone having reached it.
When can barriers be useful?
Debugging
When timing an application. Synchronize all threads and then start timer. Use the longest time
How can a barrier be implemented using a mutex and busy-waiting?
Use a shared counter protected by mutex.
When counter equals number of threads - threads can leave busy wait loop
while(counter < thread_count);
What are the downsides of using mutex+busy wait to create barriers?
Wastes CPU resources.
If more threads than cores - can significantly reduce performance.
If we cant to reuse the counter to create a second barrier, resetting it to 0 will most likely lead to some threads getting stuck in the first while loop.
This means we need one counter for each barrier
How can barriers be created using semaphores?
Counter to indicate how many threads are in the barrier.
2 semaphores:
- count_sem: protects counter
- barrier_sem: blocks threads that have entered barrier
count_sem is initialized to 1
When a thread has access to the counter - it checks if it is less than thread_count, and possibly increments it, and releases lock (sem_post(&count_sem))
Then it proceeds to block in sem_wait(&barrier_sem)
If the counter is n_thread - 1, the thread is the last to enter barrier. So it can reset the counter to 0 and unlock count_sem calling sem_post(&count_sem). To notify all other threads that they can proceed, it calls sem_post(&barrier_sem) for each thread that are blocked in sem_wait(&barrier_sem).
What pros has semaphores to mutexes when implementing barriers
Does not cause CPU cycles to be wasted.
Counter can be reused because it is reset before other threads are released.
count_sem can be reused because it too gets reset
barrier_sem cannot be reused
What is the code to create barriers using semaphores
int counter; // init to 0
sem_t count_sem; // init to 1
sem_t barrier_sem; // init to 0
void* thread_func() {
…
sem_wait(&count_sem); if(counter === n_threads - 1){ counter = 0; // Reset counter sem_post(&count_sem); // This is the last thread to call post on this semaphore, meaning it gets reset to 0 for(int j = 0; j < n_thread - 1; j++) { sem_post(&barrier_sem); } } else { counter++; sem_post(&count_sem); sem_wait(&barrier_sem); } }
What are condition variables?
A data object that allows a thread to suspend execution until a certain event or condition occurs.
When this event occurs, another thread can signal the thread to “wake up”.
A condition variable is always associated with a mutex.
What is the pseudo code of a construct where condition variables are used?
int mutex;
if condition
signal threads()
else
unlock mutex and block
unlock mutex
What does pthread_cond_signal do?
pthread_cond_signal(
pthread_cond_t *c
)
unblocks one of the blocking threads
What does pthread_cond_broadcast do?
pthread_cond_broadcast(
pthread_cond_t* c
)
unblocks all waiting threads
Semaphores would need a for-loop for this
What does pthread_cond_wait do?
pthread_cond_wait(
pthread_cond_t* c,
pthread_mutex_t m
)
unlocks the mutex and cause executing thread to block until someone else calls signal/broadcast.
What happens when a thread is unblocked when using condition variables?
It reaquires the mutex.
What instructions does pthread_cond_wait in effect implement?
pthread_mutex_unlock(&m)
wait_on_signal(&c)
pthread_mutex_lock(&m)
What is the code to implement barriers using condition variables?
int counter = 0;
pthread_mutex_t m;
pthread_cond_t c;
void func(){
pthread_mutex_lock(&m)
counter++;
if(counter == n){ counter = 0; pthread_cond_broadcast(&c) } else { while(pthread_cond_wait(&c, &m) != 0) } pthread_mutex_unlock(&m);
}
wait() needs to be in a while loop because of spurious wake-up
What is a spurious wake-up?
When events other than cond_signal/cond_broadcast wakes up a thread.
How are cond variables initialized?
pthread_cond_init(
pthread_cond_t* c,
const pthread_condattr_t* a
)
How are cond variables destroyed?
pthread_cond_destroy(
pthread_cond_t* c
)
How can cond variables be initialized with default attributes?
pthread_cond_t c = PTHREAD_COND_INITIALIZER
When is conditional variables preferable to semaphores?
When protected application state cannot be represented by an unsigned integer counter
What are read-write locks?
Provides 2 lock functions.
1: Lock for reading: Allows multiple threads to obtain the lock for reading
2: Lock for writing: Only one thread can obtain the lock when writing.
If a thread want to write, but other threads have read-lock. It blocks
If one thread has write block, and other want read/write-lock, they wait
What are the read-write lock functions syntax?
pthread_rwlock_init(
pthread_rwlock_t* l,
cont pthread_rwlockattr_t* attr
)
pthread_rwlock_rdlock(
pthread_rwlock_t* l
)
pthread_rwlock_wrlock(
pthread_rwlock_t* l
)
pthread_rwlock_unlock(
pthread_rwlock_t* l
)
pthread_rwlock_destroy(
pthread_rwlock_t* l
)
What is the syntax of static init of rw-locks?
pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER
What are the implemented functionality of read-write locks?
Data structure of 2 cond. variables, (readers/writers), and a mutex. The structure also include members that indicates:
How many readers own the lock (currently reading)
How many readers are waiting to obtain the lock
Whether a writer owns the lock
How many writers are waiting
How does the read-write lock functionality work?
What is the cache coherence problem?
If two threads access the same data, one then updates it, and then the next reads it. What value does the second read?
There are 3 possible versions of the variable now, the one in main memory, the one in thread 1 and the one in thread 2.
For the second thread to know the value it has in cache is updated, and therefor invalid, we need to implement invalid bits and policies for memory updates.
Cache coherence can have big performance consequences for multithreaded systems.
What is false sharing?
Two threads with separate caches access different variables that are stored within the same cache line.
If one of the threads update their value, the whole cache line will be invalid.
The other thread needs to fetch the cache line from memory, even though it is not using the same variable.
Threads aren’t sharing a variable, but the behaviour of the threads are as if they were.
When is a block of code thread safe?
When it can be executed by multiple threads without causing problems.
What is oversubscription?
Running more threads/processes than available cores
What resources does a thread have access to?
Its own stack and local variables
Shared heap and data segment
What is the PRAM model?
proposes 4 policies on how to handle multiple threads accessing/writing to shared resources
Common value: allows execution if all assign same value
Arbitrary value: Assigns the value from one of the attempts, can be any of them
Priority: Each thread have ranking, highest ranked assignment writes to value
Reduction: A commutative operation is applied to all attempts, result is stored
What flag is added when compiling pthread programs?
-pthread
What does void in func_name(void) mean?
Exactly zero arguments
What are atomic instructions?
Instructions wired into the CPU and interconnect fabric, in a way that they cannot be interrupted
What is pthread_barrier_t
Behaves like broadcast barrier
pthread_barrier_init(&var, NULL)
pthread_barrier_destroy(&var)
pthread_barrier_wait(&var)
- When every thread reaches wait, var is reset, and threads are resumed
What needs to be included in the file to use pthread barriers?
define _GNU_SOURCE before
#include <pthread.h></pthread.h>