OpenMP Flashcards

1
Q

three primary API components

A

1 compiler directives
2 runtime library routines
3 environment variables

2
Q

ICV thread numbers

A

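OMP_NUM_THREADS environment variable (sets the default thread count ICV)
num_threads(n) clause on the parallel pragma (overrides it for that region)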
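
A minimal sketch of the clause form. The helper name `count_team_members` is made up for illustration; built without OpenMP the pragmas are ignored and it simply returns 1.

```c
/* Hypothetical helper: counts how many threads actually joined the team. */
int count_team_members(int requested) {
    int members = 0;
    /* num_threads is a clause on the parallel directive; it overrides
       OMP_NUM_THREADS for this one region only. */
    #pragma omp parallel num_threads(requested)
    {
        #pragma omp atomic
        members++;
    }
    return members;
}
```

The runtime may hand out fewer threads than requested, so the result is at most `requested`.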

3
Q

sections

A

pragma omp sections, with one pragma omp section per block
- each section is executed once, by one thread of the team
- implicit barrier at the end of the sections region (unless nowait)
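
A small sketch of two independent jobs placed in separate sections. The function `fill_both` and its arrays are hypothetical.

```c
/* Hypothetical example: two independent jobs, one per section. */
void fill_both(int *evens, int *odds, int n) {
    #pragma omp parallel sections
    {
        #pragma omp section
        for (int i = 0; i < n; i++) evens[i] = 2 * i;

        #pragma omp section
        for (int i = 0; i < n; i++) odds[i] = 2 * i + 1;
    }   /* implicit barrier: both arrays are complete here */
}
```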

4
Q

for

A
  • no synchronization at the beginning
  • implicit barrier at the end (nowait removes it)
  • iterations must have no data dependencies between them
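
A sketch of two worksharing loops in one parallel region, where nowait is safe because the loops touch different arrays. `scale_and_offset` is an illustrative name, not from the deck.

```c
/* Hypothetical example: two independent loops in one parallel region. */
void scale_and_offset(double *a, double *b, int n) {
    #pragma omp parallel
    {
        /* nowait drops the barrier; safe because the next loop never touches a[] */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) a[i] = 2.0 * a[i];

        #pragma omp for
        for (int i = 0; i < n; i++) b[i] = b[i] + 1.0;
    }
}
```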
5
Q

schedule

A
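  • used on the for loop: schedule(kind[, chunk])
    1. static: chunks fixed before the loop runs; without a chunk size, one chunk of about n/t per thread; with a chunk size, chunks dealt round robin
    2. dynamic: chunks handed out one by one at runtime as threads become free, default chunk size 1 (good for load balance)
    3. guided: like dynamic, but starts with big chunks that shrink as the remaining work decreases
    4. runtime: kind taken at runtime from OMP_SCHEDULE
    5. auto: compiler/runtime chooses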
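
A sketch of a loop where the schedule kind matters. The workload `triangular_work` is invented for illustration, and the chunk size 4 is an arbitrary choice.

```c
/* Hypothetical workload: iteration i costs about i inner steps, so an even
   static split would overload the thread that gets the highest indices. */
long triangular_work(int n) {
    long total = 0;
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:total)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++) total += 1;
    }
    return total;   /* n * (n - 1) / 2 */
}
```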
6
Q

sharing attributes

A
  1. private
  2. shared
  3. default
  4. firstprivate
  5. lastprivate
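
A sketch that puts several of these clauses on one loop. `sharing_demo` and its variables are hypothetical.

```c
/* Hypothetical example combining the data-sharing clauses. */
int sharing_demo(int n) {
    int offset = 10;   /* firstprivate: each thread gets an initialized copy */
    int last = -1;     /* lastprivate: receives the value of the final iteration */
    int tmp;           /* private: each thread gets an uninitialized copy */

    /* default(none) forces every variable used in the loop to be listed. */
    #pragma omp parallel for default(none) shared(n) \
            firstprivate(offset) private(tmp) lastprivate(last)
    for (int i = 0; i < n; i++) {
        tmp = i + offset;
        last = tmp;
    }
    return last;       /* value written by iteration n - 1 */
}
```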
7
Q

list synchronization pragmas

A

barrier, masked region, single region, critical section, atomic statement, ordered construct

8
Q

barrier

A

pragma omp barrier

synchronizes all threads in the team: no thread continues until all have arrived
threads idle while they wait, so load imbalance gets expensive; use only when needed
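
A sketch of a case where an explicit barrier is needed: the second phase reads values that other threads write in the first phase. `reverse_squares` is an illustrative name.

```c
/* Hypothetical two-phase computation: phase 2 reads what other threads
   wrote in phase 1, so all of phase 1 must finish first. */
void reverse_squares(int *sq, int *out, int n) {
    #pragma omp parallel
    {
        #pragma omp for nowait
        for (int i = 0; i < n; i++) sq[i] = i * i;

        /* Without this barrier a thread could read sq[n-1-i] before it is written. */
        #pragma omp barrier

        #pragma omp for
        for (int i = 0; i < n; i++) out[i] = sq[n - 1 - i];
    }
}
```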

9
Q

masked Construct

A

pragma omp masked [filter(integer-expression)]

  • only the primary thread executes the code (or the thread number given by filter); the others skip it
  • no implied barrier at either end
10
Q

single Construct

A

pragma omp single

  • implicit barrier at the end (unless nowait)
  • exactly one thread executes it, not necessarily the primary thread
  • clauses like private(list), firstprivate(list)
  • typical use: initializing data structures
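
A sketch of the initialization use case: one thread builds a table, and the implicit barrier guarantees it is ready before the team uses it. The function name and `TABLE_SIZE` are made up for this example.

```c
#define TABLE_SIZE 8

/* Hypothetical example: one thread builds a lookup table, all threads use it. */
int sum_of_squares_via_table(void) {
    int table[TABLE_SIZE];
    int sum = 0;
    #pragma omp parallel
    {
        #pragma omp single
        for (int i = 0; i < TABLE_SIZE; i++) table[i] = i * i;
        /* implicit barrier at the end of single: table is ready for everyone */

        #pragma omp for reduction(+:sum)
        for (int i = 0; i < TABLE_SIZE; i++) sum += table[i];
    }
    return sum;
}
```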
11
Q

critical construct

A

pragma omp critical [(name)]

restricts execution of the associated structured block to a single thread at a time
- no implicit barrier; threads wait only to enter the block
- serializes execution and can leave threads waiting, only use when really needed
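
A sketch of a block that needs critical rather than atomic, because two statements must happen together. `collect_negatives` and the name `append` are hypothetical.

```c
/* Hypothetical example: collect the indices of negative values into a shared list. */
int collect_negatives(const int *v, int n, int *idx) {
    int count = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (v[i] < 0) {
            /* Store and increment must not interleave with another thread. */
            #pragma omp critical(append)
            {
                idx[count] = i;
                count++;
            }
        }
    }
    return count;   /* the order of the idx[] entries depends on thread timing */
}
```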

12
Q
A
13
Q

atomic statement

A

pragma omp atomic

ensures a single memory location is updated atomically
- like critical but with less overhead, since it can avoid locking
- covers only one simple operation (e.g. x++ or x += expr)
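
A sketch of a typical atomic update, a shared histogram. `digit_histogram` is an illustrative name and the input is assumed to hold only digits 0..9.

```c
/* Hypothetical example: histogram of digits 0..9 (hist must be zeroed by the caller). */
void digit_histogram(const int *digits, int n, int *hist) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        /* A single read-modify-write of one memory location: atomic suffices. */
        #pragma omp atomic
        hist[digits[i]]++;
    }
}
```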

14
Q

ordered construct

A

pragma omp ordered

block of code that must be executed in sequential order.
- The ordered construct sequentializes and orders the execution of ordered regions while allowing code outside the region to run in parallel.
- clauses: threads (default), simd
- only inside a for loop that carries the ordered clause!
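
A sketch of an ordered region inside a parallel loop: the squaring runs in parallel, the writes to the output happen in iteration order. `ordered_squares` is a made-up name, and schedule(static, 1) is just one reasonable choice.

```c
/* Hypothetical example: compute in parallel, emit results in loop order. */
int ordered_squares(const int *v, int n, int *out) {
    int pos = 0;
    #pragma omp parallel for ordered schedule(static, 1)
    for (int i = 0; i < n; i++) {
        int sq = v[i] * v[i];      /* runs in parallel */

        #pragma omp ordered
        {
            out[pos++] = sq;       /* runs in iteration order 0, 1, 2, ... */
        }
    }
    return pos;
}
```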

15
Q

omp locks

A

kind of like a mutex

omp_lock_t lockvar; declares a simple lock variable
omp_init_lock(&lockvar); initializes a simple lock
omp_destroy_lock(&lockvar); uninitializes a simple lock
omp_set_lock(&lockvar); waits until a simple lock is available and then sets it
omp_unset_lock(&lockvar); unsets the lock
omp_test_lock(&lockvar); tries to set the lock without waiting, returns nonzero on success

nestable locks are also possible (omp_nest_lock_t, omp_init_nest_lock, ...)
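
A sketch of the simple-lock life cycle (init, set, unset, destroy). `deposit_many` is hypothetical, and the serial stand-ins in the `#else` branch are an assumption of this sketch so that it also builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (an assumption of this sketch), used only without OpenMP. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Hypothetical example: a lock protecting a shared balance. */
long deposit_many(int deposits, long amount) {
    long balance = 0;
    omp_lock_t lock;
    omp_init_lock(&lock);

    #pragma omp parallel for
    for (int i = 0; i < deposits; i++) {
        omp_set_lock(&lock);      /* blocks until the lock is free */
        balance += amount;
        omp_unset_lock(&lock);
    }

    omp_destroy_lock(&lock);
    return balance;
}
```

For a single addition like this, atomic or a reduction would be cheaper; the lock form pays off when the protected region is larger or the lock must be passed around.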

16
Q

reduction

A

reduction (operator: list)

+ - * & ^ | && || max min
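
A sketch with two reductions on one loop. `sum_and_max` is an illustrative name; it assumes the array has at least one element.

```c
/* Hypothetical example: sum and maximum of an array in one pass (assumes n >= 1). */
void sum_and_max(const int *v, int n, long *sum_out, int *max_out) {
    long sum = 0;
    int best = v[0];
    /* Each thread works on private copies; they are combined at the end. */
    #pragma omp parallel for reduction(+:sum) reduction(max:best)
    for (int i = 0; i < n; i++) {
        sum += v[i];
        if (v[i] > best) best = v[i];
    }
    *sum_out = sum;
    *max_out = best;
}
```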

17
Q

correctness issues

A

data races: unsynchronized conflicting accesses to shared variables
loop dependencies: prevent parallelization of loops
aliasing: hidden loop dependencies, blocks compiler optimizations

18
Q

types of data races

A
  1. read after write
  2. write after read
  3. write after write
19
Q

types of dependencies

A
  1. true (flow): read after write
  2. anti: write after read
  3. output: write after write
20
Q

types of loop dependencies

A

loop carried: between different iterations
loop independent: within the same iteration
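
A sketch contrasting the two kinds. Both functions are invented for illustration.

```c
/* Loop carried: iteration i reads a[i-1], written by iteration i-1.
   The loop must stay sequential. */
void prefix_sum(int *a, int n) {
    for (int i = 1; i < n; i++) a[i] = a[i] + a[i - 1];
}

/* Loop independent: t is written and read inside the same iteration,
   so the iterations can run in any order. */
void square_plus_one(const int *a, int *b, int n) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int t = a[i] * a[i];
        b[i] = t + 1;
    }
}
```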

21
Q

list loop transformations

A

4: interchange, distribution, fusion, alignment

22
Q

loop interchange

A

swap inner and outer loop
only legal if the swap does not reverse a loop carried dependency
used to get better cache use (in C: walk along a row in the inner loop!)
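
A sketch of the same sum before and after interchange. The tiny fixed sizes and both function names are assumptions for illustration; the cache effect only shows on large arrays.

```c
enum { ROWS = 3, COLS = 4 };

/* Column-first traversal: the inner loop jumps COLS elements each step. */
long sum_column_first(int m[ROWS][COLS]) {
    long s = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            s += m[i][j];
    return s;
}

/* After interchange: the inner loop walks one row, which is contiguous in C. */
long sum_row_first(int m[ROWS][COLS]) {
    long s = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            s += m[i][j];
    return s;
}
```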

23
Q

loop distribution

A

split one loop into several loops over the same range, to isolate a loop carried dep so the remaining loops can run in parallel
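
A sketch of distribution on a loop that mixes a recurrence with independent work. Both function names are hypothetical.

```c
/* Original: one loop mixes a sequential recurrence with independent work. */
void mixed_loop(int *a, int *b, const int *c, int n) {
    for (int i = 1; i < n; i++) {
        a[i] = a[i - 1] + 1;   /* loop carried dependence */
        b[i] = 2 * c[i];       /* independent across iterations */
    }
}

/* Distributed: the recurrence stays serial, the rest becomes parallel. */
void distributed_loops(int *a, int *b, const int *c, int n) {
    for (int i = 1; i < n; i++) a[i] = a[i - 1] + 1;

    #pragma omp parallel for
    for (int i = 1; i < n; i++) b[i] = 2 * c[i];
}
```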

24
Q

loop fusion

A

merge adjacent loops over the same range into one
may create loop carried dep
reduces loop overhead!!
eliminates the need for a barrier between the loops
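
A sketch of fusion in a case where it is safe. The functions are made up for illustration.

```c
/* Before fusion: two worksharing loops, each with its own implicit barrier. */
void two_loops(const int *x, int *y, int *z, int n) {
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++) y[i] = x[i] + 1;
        #pragma omp for
        for (int i = 0; i < n; i++) z[i] = 2 * y[i];
    }
}

/* After fusion: z[i] only needs y[i] from the same iteration, so the
   dependence is loop independent and one loop with one barrier is enough. */
void fused_loop(const int *x, int *y, int *z, int n) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        y[i] = x[i] + 1;
        z[i] = 2 * y[i];
    }
}
```

If the second statement read y[i + 1] instead, fusing would create a loop carried dependence and the fused loop could no longer be parallelized this way.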

25
Q

loop alignment

A

shift statements across iterations so a loop carried dep becomes loop independent; put 1 operation before the loop and 1 after
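
A sketch of alignment on a two-statement loop. Both functions are hypothetical, and the aligned version assumes n >= 2.

```c
/* Original: c[i] reads a[i-1], written one iteration earlier (loop carried). */
void unaligned(int *a, const int *b, int *c, int n) {
    for (int i = 1; i < n; i++) {
        a[i] = b[i] + 1;
        c[i] = 2 * a[i - 1];
    }
}

/* Aligned: the second statement is shifted by one iteration, so the write and
   the read of a[i] share an iteration. One operation is peeled off before the
   loop and one after it. Assumes n >= 2. */
void aligned(int *a, const int *b, int *c, int n) {
    c[1] = 2 * a[0];                   /* peeled: first read */
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++) {
        a[i] = b[i] + 1;
        c[i + 1] = 2 * a[i];
    }
    a[n - 1] = b[n - 1] + 1;           /* peeled: last write */
}
```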

26
Q

why is work sharing bad?

A
  1. load imbalance (both static and dynamic scheduling have drawbacks)
  2. imbalance caused by the machine itself
  3. limited program flexibility

this means we need tasks!!

27
Q

tasks

A

pragma omp task

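- usually created inside a single region (within a parallel region)
- untied: a suspended task can resume on a different thread, may lose cache contents
- if(expr): when false, the task is not deferred and runs immediately
- sharing: default is mostly firstprivate (variables shared in the enclosing context stay shared)
- priority: hint that influences execution order

taskwait: waits for completion of immediate child tasks

taskyield: the current task can be suspended here in favor of other tasks

dependencies: depend(in: ...), depend(out: ...), depend(inout: ...)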

PRO: load balance, simple

task granularity:
1. fine: more overhead, better resource use
2. coarse: schedule fragmentation
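
A sketch of the usual task pattern: one thread in a single region starts a recursive task tree. The functions `fib_task` and `fib` and the cutoff 15 are assumptions chosen for illustration.

```c
/* Hypothetical example: recursive Fibonacci, one task per subproblem. */
static long fib_task(int n) {
    if (n < 2) return n;
    long x, y;
    /* x and y must be shared: a firstprivate copy would lose the result.
       The if clause stops deferring tiny subproblems (coarser granularity). */
    #pragma omp task shared(x) if(n > 15)
    x = fib_task(n - 1);
    #pragma omp task shared(y) if(n > 15)
    y = fib_task(n - 2);
    #pragma omp taskwait           /* wait for the two child tasks */
    return x + y;
}

long fib(int n) {
    long result = 0;
    #pragma omp parallel
    {
        #pragma omp single         /* one thread creates the root of the task tree */
        result = fib_task(n);
    }
    return result;
}
```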

28
Q

omp runtime routines

A

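omp_set_num_threads(N) sets the thread count for following parallel regions
omp_get_max_threads() upper bound on the team size of the next parallel region
omp_get_num_threads() size of the current team
omp_get_thread_num() rank of the calling thread (0 = primary)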
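
A sketch using size and rank together. `mark_ranks` is a made-up helper, and the `#ifdef _OPENMP` guard is an assumption of this sketch so that a serial build still compiles.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: sets seen[rank] = 1 for every thread in the team and
   returns the team size. seen must hold at least `requested` entries. */
int mark_ranks(int requested, int *seen) {
    int size = 1;
#ifdef _OPENMP
    omp_set_num_threads(requested);          /* applies to later parallel regions */
    #pragma omp parallel
    {
        int rank = omp_get_thread_num();     /* 0 .. size - 1 */
        seen[rank] = 1;
        #pragma omp single
        size = omp_get_num_threads();        /* threads actually in this team */
    }
#else
    (void)requested;
    seen[0] = 1;                             /* serial build: a team of one */
#endif
    return size;
}
```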

29
Q

omp ICVs

A

environment variables that set ICVs:
OMP_SCHEDULE
OMP_DYNAMIC
OMP_NESTED (deprecated since OpenMP 5.0, use OMP_MAX_ACTIVE_LEVELS)