Parallel Computing Flashcards
Handling 2D data, or how to scale and zoom an image.
(One should talk about this transformation.) Pay attention to the examples section (Chapter 6, slide 61 onwards).
It acts as an MPI_Send and an MPI_Recv at the same time. How does it do that?
The key is the root argument: the process whose rank equals root acts as the sender, and all other processes are the receivers. In other words, the same function call plays the sending role on one process and the receiving role on all the others.
One-sided communication
The previous communication mechanisms have disadvantages:
1. The sender and receiver must agree on how many and which messages are to be exchanged
2. In certain cases it can increase the amount of communication needed (polling)
In one-sided communication, by contrast, a process accesses remote data without the help of its communication partner. The programmer decides when to open the “window” through which the process can access the data (synchronization) and, that way, it “simulates” shared memory.
A window is the channel through which processes communicate; the particular feature of this method is that it doesn’t target a whole process or its entire memory but a specific memory region. That way, one can read, write etc. in the memory of another process.
What is a window in one-sided communication? Where is it located, and how can it be accessed?
What is the difference between collective and one-sided communication in MPI?
What do origin and target mean in one-sided communication?
What is an epoch in MPI?
What is an exposure epoch, and which functions are required for it?
How does a passive target work, and what should be considered?
False sharing:
Caches are connected and hold copies of certain locations of main memory. However, two caches can hold a copy of the same location, and if one of them changes it, the other would keep working with a stale copy. A way to prevent this is the MESI protocol: when data is modified, all other CPUs are notified and must stop and update their caches.
MESI:
○ Modified:
§ Data was changed recently and exists only in the local cache
§ Data must be written back
○ Exclusive:
§ As Modified, but not yet changed
○ Shared: the line is unmodified and may be present in several caches
○ Invalid: the local copy is stale and must not be used
Look for more information about false sharing and how arrays and cache lines are affected
The problem is that multiple threads access the same cache line without actually sharing data; that triggers unnecessary synchronization (coherence) traffic and depends on the respective cache architecture. We can avoid false sharing via padding (but then the target platform's cache-line size must be known), by reorganizing the data layout, or by using private variables.
Scopes of validity of variables: OpenMP
There are different types of visibility for the variables:
- Shared: variables declared outside the parallel region can be accessed by ALL threads
- Private: variables created WITHIN a thread, so only that thread can see them. Once the thread finishes, the variable is deleted (since the thread's stack is cleaned up).
- Reduction: a copy of the variable is created for each thread; when the threads terminate, an operation is performed on the copies and the result is stored in the original variable. (For the exam: look for more information; the lecture had quite a lot of OpenMP examples.)
Visibilities can be changed by using pragma statements.
A problem we sometimes run into when computing in parallel: one processor reads the input data and, while it is still performing its calculations, another processor updates that value. The value that was read first is now outdated. This is called a race condition.
Determining the size of the thread team: OpenMP
- System level: via the environment variable OMP_NUM_THREADS, set in the shell (e.g. in Bash: export OMP_NUM_THREADS=4). Sometimes preferable because it can be changed after compiling.
- Program level: by calling the function omp_set_num_threads(). This is only possible after including omp.h.
- Pragma level: #pragma omp parallel num_threads(n).
Dependency types:
RAW (read after write), WAR (write after read), WAW (write after write), RAR (read after read), counter
What is the difference between an Image object and a Buffer in OpenCL? Describe the role of the offset in accessing data from Image objects.
What is the difference between Image Objects and Samplers?
Which kind of data is represented by the float4 datatype?
Can you name an example of using images for parallel processing?