Chapter 3 - MPI Flashcards
What are the two types of systems in the world of MIMD?
Distributed memory: The memory associated with a core is only accessible to that core.
Shared memory: Collection of cores connected to a global shared memory.
What is message passing?
Way of communication for distributed memory systems.
One process running on one core communicates through a send() call; another process on another core calls receive() to get the message.
What is MPI?
Message Passing Interface
Library implementation of message passing communication
What are collective communications?
Functions that allow for communication between multiple (more than 2) processes.
What is a rank?
A non-negative identifier of a process running in MPI.
n number of processes -> 0, 1, …, (n-1) ranks
What header must be included to use MPI?
#include <mpi.h>
How is MPI initialized?
MPI_Init(int* argc_p, char*** argv_p);
argc_p and argv_p: pointers to arguments to main, argc and argv
If the program does not use them, NULL can be passed for both
In main() using arguments:
MPI_Init(&argc, &argv);
How do you get the communicator size in MPI?
MPI_Comm_size(MPI_Comm comm_name, int* size)
What is the name of the global communicator?
MPI_COMM_WORLD
Set up by MPI_Init();
How do you get a process’s rank within a communicator?
MPI_Comm_rank(MPI_Comm comm_name, int* rank)
How is an MPI program finalized?
MPI_Finalize(void);
Any resources allocated for MPI are freed.
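Putting these together, a minimal sketch of the usual boilerplate (the program contents are just illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char* argv[]) {
    int comm_size, my_rank;

    MPI_Init(&argc, &argv);                     /* start MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &comm_size);  /* number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);    /* this process's rank */

    printf("Hello from rank %d of %d\n", my_rank, comm_size);

    MPI_Finalize();                             /* free MPI's resources */
    return 0;
}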
What does
mpiexec -n <n> ./program
do when running an MPI program?
mpiexec tells the system to run <n> instances of the program
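For example, assuming the common mpicc compiler wrapper and 4 processes:
mpicc -o program program.c (compile the MPI program)
mpiexec -n 4 ./program (run 4 instances of the program)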
What is a MPI communicator?
A collection of processes that can send messages to each other.
How can MPI produce SPMD programs?
Processes branch to different tasks based on their rank, typically with an if-else:
rank 0 can print, rank 1 can send, rank 2 can receive
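A sketch of this branching (my_rank and the message contents are illustrative; MPI_Send/MPI_Recv are covered in the next cards):

if (my_rank == 0) {
    printf("I am the printing process\n");
} else if (my_rank == 1) {
    int x = 42;
    MPI_Send(&x, 1, MPI_INT, 2, 0, MPI_COMM_WORLD);   /* send x to rank 2 */
} else if (my_rank == 2) {
    int x;
    MPI_Recv(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);   /* receive from rank 1 */
}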
What is the syntax of MPI_Send()?
MPI_Send(
void* buffer,
int message_size,
MPI_Datatype type,
int dest,
int tag,
MPI_Comm communicator
)
buffer: holds the content of the message to be sent
message_size: Number of elements to send from the buffer
type: MPI_CHAR, MPI_DOUBLE, etc.
dest: Destination rank, who is receiving the message
tag: non-negative int, can be used to distinguish messages that are otherwise identical
What is the syntax of MPI_Recv?
int MPI_Recv(
void* msg_buf,
int size,
MPI_Datatype type,
int source,
int tag,
MPI_Comm communicator,
MPI_Status* status
)
msg_buf: Buffer to receive message in
size: Number of elements to receive
type: Types of elements in message
source: Source rank, rank that sent message
tag: Tag should match the tag from the send
communicator: Must match the communicator at the send
status: When not using status MPI_STATUS_IGNORE is passed
What conditions must be met for a message to be successfully sent by process a and received by process b?
dest = b
src = a
comm_a = comm_b
tag_a = tag_b
The triples (buffer_a, size_a, type_a) and (buffer_b, size_b, type_b) must be compatible
Most of the time, if type_a=type_b and size_b >= size_a, the message will be successfully received
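A sketch of a matching pair inside the usual boilerplate, where process a = rank 0 sends one double to process b = rank 1:

double x = 3.14;
if (my_rank == 0) {
    /* dest = 1, tag = 0, communicator = MPI_COMM_WORLD */
    MPI_Send(&x, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
} else if (my_rank == 1) {
    /* source = 0, same tag and communicator, compatible buffer/size/type */
    MPI_Recv(&x, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}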
What is a wildcard argument in MPI communication
If a receiver will be receiving multiple messages from multiple source ranks, and it does not know the order in which they will arrive, it can loop over its MPI_Recv() calls and pass the wildcard argument MPI_ANY_SOURCE, allowing the ranks to send their messages in any order.
Similarly, if a process will receive multiple messages from another process, but with different tags, it can do the same with the wildcard argument MPI_ANY_TAG.
Only receivers can use wildcard arguments.
There is no communicator wildcard argument.
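A sketch of the MPI_ANY_SOURCE pattern, where rank 0 collects one int from every other rank in whatever order the messages arrive (variable names are illustrative):

if (my_rank == 0) {
    int value;
    for (int q = 1; q < comm_size; q++) {
        MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Got %d\n", value);
    }
} else {
    MPI_Send(&my_rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
}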
What is the MPI_Status argument used in MPI_Recv?
A struct with at least the three members:
MPI_SOURCE
MPI_TAG
MPI_ERROR
Before the recv call, declare a status struct and pass its address:
MPI_Status status;
MPI_Recv(…, &status);
These are useful if a process used wildcards and needs to figure out the source or tag of a message; the members can then be examined.
What is MPI_Get_count() used for?
Used to figure out how many elements of the provided type were received in the message
MPI_Get_count(
MPI_Status* status, (in)
MPI_Datatype type, (in)
int* count (out)
)
status: Status struct passed to recv()
type: Type passed in recv
count: Number of elements received in the message
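A sketch combining wildcards, MPI_Status, and MPI_Get_count (the buffer size of 100 is an assumption):

double buf[100];
MPI_Status status;
int count;

MPI_Recv(buf, 100, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);
MPI_Get_count(&status, MPI_DOUBLE, &count);   /* how many doubles actually arrived */
printf("Received %d doubles from rank %d with tag %d\n",
       count, status.MPI_SOURCE, status.MPI_TAG);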
What happens if MPI buffers an MPI_Send message?
MPI puts the complete message into its internal storage.
The MPI_Send() call will return.
The message might not be in transmission yet, but as it is now stored internally, we can use the send_buffer for other purposes if we want to.
What happens if the MPI_Send call blocks?
The process will wait until it can begin transmitting the message.
The MPI_Send() call might not return immediately.
When will MPI_Send block?
It depends on the MPI implementation.
But many implementations have a “cutoff” message size: if the message fits within the cutoff, it will be buffered; if it exceeds the cutoff, the send will block.
Does MPI_Recv block?
Yes, unlike MPI_Send, when MPI_Recv returns we know the message has been fully received.
What does it mean that MPI messages are non-overtaking?
If one process a sends 2 messages to process b, the first message must be available to b before the second one is.
The relative order of messages from different processes is not guaranteed, regardless of which was sent first.
What is a pitfall with MPI_Recv / MPI_Send in the context of blocking?
If the MPI_Recv does not have a matching MPI_Send it will block forever and the program will hang.
The same can happen for a blocking send if it has no matching receiver.
If an MPI_Send is buffered and there is no matching receive, the message will be lost.
What is non-determinism in parallel programs?
When the output of a program varies depending on the order in which processes do their computations.
How can MPI programs implement I/O to avoid non-determinism?
Make processes branch on process rank.
E.g. rank 0 can read input and send it to the remaining ranks.
All ranks can send their output to rank 0 who can print it in rank order.
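A sketch of this pattern, where every rank sends a greeting string to rank 0, which prints them in rank order (needs <string.h>; the buffer size is an assumption):

char msg[100];
if (my_rank != 0) {
    sprintf(msg, "Greetings from rank %d", my_rank);
    MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
} else {
    printf("Greetings from rank 0\n");
    for (int q = 1; q < comm_size; q++) {   /* receive and print in rank order */
        MPI_Recv(msg, 100, MPI_CHAR, q, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("%s\n", msg);
    }
}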
In MPI, what are collective communications?
Communication functions that include all processes in a communicator.
What is point-to-point communication?
One sender and one receiver
(MPI_Send / MPI_Recv)
What is MPI_Reduce?
Implementation of collective communication.
Generalized function that allows different operations on data that is held by all processes in a communicator
Syntax:
MPI_Reduce(
void* input_data_buf,
void* output_data_buf,
int count,
MPI_Datatype type,
MPI_Op operator,
int dest_process,
MPI_Comm comm
)
input_data_buf: local data for the process, this is used in the operation
output_data_buf: buffer to hold the output computation done by the operator
count: Number of elements to do operation on. This allows for e.g. operations on arrays
type: type of the data elements
operator: specifies what operation is to be done on the data
dest_process: the rank that receives the computed result
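A sketch of a global sum, where every process contributes local_x and rank 0 receives the total (local_x is illustrative):

double local_x = my_rank * 1.0;   /* each process's local value */
double total = 0.0;

MPI_Reduce(&local_x, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

if (my_rank == 0)
    printf("Global sum = %f\n", total);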
What are some operators available for MPI_Reduce?
MPI_SUM: Optimized global sum of all the local values
MPI_MAX: Finds the largest value from the processes
What is important to remember when using collective communication?
All processes must call MPI_Reduce
Arguments passed to a collective must be compatible (e.g. dest rank)
out buffer is only used by dest rank. The other ranks still need to pass the out argument, but this can be NULL for the other processes
Where point-to-point communication matches on tags and communicators, collectives match on the communicator and the order in which they were called.
It is illegal to use the same buffer for input and output
What does it mean to alias arguments?
Two arguments are aliased if they refer to the same block of memory.
This is illegal in MPI if one of them is an output or input/output argument
What does MPI_Allreduce do?
Optimized collective function that stores the output of reduce in all processes
MPI_Allreduce(
void* input_data_buf,
void* output_data_buf,
int count,
MPI_Datatype type,
MPI_Op operator,
MPI_Comm comm
)
Identical argument list to MPI_Reduce(), but without the dest rank
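A sketch, identical to the reduce example above except that every rank ends up with the total:

double local_x = my_rank * 1.0;
double total;

MPI_Allreduce(&local_x, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
/* every rank can now use total */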
What is MPI_Bcast?
Collective function that allows a process to send a message to all other processes in a communicator
MPI_Bcast(
void* data_buf,
int count,
MPI_Datatype type,
int source_process,
MPI_Comm comm
)
source_process: The process with rank source_process sends its content of data_buf.
data_buf: Buffer to either send from, or if processes aren’t the source, receive the data in. Acts as both input and output
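A sketch where rank 0 reads a value and broadcasts it to all other ranks (the scanf input is illustrative):

int n;
if (my_rank == 0)
    scanf("%d", &n);   /* only rank 0 reads the input */
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
/* every rank now has n */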
What does MPI_Scatter do?
If a communicator is doing operations on a large vector and each process only works on certain parts of it, it can be expensive for one process to communicate the whole vector to all processes, because each process must then allocate memory for the whole vector even though it only computes on part of it.
MPI_Scatter reads the complete data on one rank and sends each process only the components it needs.
MPI_Scatter(
void* send_buf,
int send_count,
MPI_Datatype sent_type,
void* recv_buf,
int recv_count,
MPI_Datatype recv_type,
int src_process,
MPI_Comm comm
)
The function divides the data referenced in send_buf into comm_size pieces. The first piece goes to rank 0, then rank 1, and so on.
send_count needs to be the amount of data going to each process, so not the complete amount of data in send_buf.
Note that the total amount of data must be divisible by the number of ranks in the communicator.
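A sketch where rank 0 holds the full vector and every process receives its own block (n, local_n, and the even division are assumptions; needs <stdlib.h> for malloc):

int n = 16;                                  /* total vector length */
int local_n = n / comm_size;                 /* block size per process */
double* full_vec = NULL;
double* local_vec = malloc(local_n * sizeof(double));

if (my_rank == 0) {
    full_vec = malloc(n * sizeof(double));   /* only rank 0 holds the whole vector */
    /* ... rank 0 fills full_vec ... */
}

MPI_Scatter(full_vec, local_n, MPI_DOUBLE,
            local_vec, local_n, MPI_DOUBLE, 0, MPI_COMM_WORLD);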
What does MPI_Gather do?
Function that collects the data components from all processes into one process to obtain the complete data, e.g. the complete vector.
MPI_Gather(
void* send_buf,
int send_count,
MPI_Datatype sent_type,
void* recv_buf,
int recv_count,
MPI_Datatype recv_type,
int dest_process,
MPI_Comm comm
)
Same as scatter, but with a destination rank to receive all the data.
Data from rank 0 is stored in the first block of recv_buf, the send_buf of rank 1 is stored in the second block of recv_buf, and so on.
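A sketch gathering the local blocks back onto rank 0, mirroring the scatter example above:

MPI_Gather(local_vec, local_n, MPI_DOUBLE,
           full_vec, local_n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
/* full_vec on rank 0 now holds all blocks in rank order */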
What is MPI_Allgather?
MPI_Allgather(
void* send_buf,
int send_count,
MPI_Datatype sent_type,
void* recv_buf,
int recv_count,
MPI_Datatype recv_type,
MPI_Comm comm
)
The function concatenates each process's send_buf and stores the result in every process's recv_buf
What are derived datatypes in MPI?
Used to represent any collection of data items in memory.
Stores the type of the items, and their relative locations in memory.
Derived datatypes consist of a sequence of basic MPI_Datatypes and a displacement for each type (from the beginning of the type).
What is the syntax of MPI_Type_create_struct?
MPI_Type_create_struct(
int count,
int array_of_blocklengths[],
MPI_Aint array_of_displacements[],
MPI_Datatype array_of_types[],
MPI_Datatype* new_type
)
count: number of elements (blocks) in the type
blocklengths: allows for the possibility that individual elements are arrays; if one element is an array with 5 entries, its blocklength is 5
displacements: each element's displacement from the start of the message
What does the function MPI_Get_address do?
MPI_Get_address(
void* location,
MPI_Aint* address
)
Returns the address of the memory location pointed to by location.
MPI_Aint is used because this is the datatype that is big enough to store an address.
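A sketch that builds a derived type for one double and one int and broadcasts both in a single message; note that MPI_Type_commit must be called before the new type is used (it is part of the standard API, though not covered by these cards):

double a = 1.0;
int    n = 2;
MPI_Aint a_addr, n_addr;
MPI_Datatype pair_t;

int          blocklengths[2] = {1, 1};
MPI_Datatype types[2]        = {MPI_DOUBLE, MPI_INT};
MPI_Aint     displacements[2];

MPI_Get_address(&a, &a_addr);
MPI_Get_address(&n, &n_addr);
displacements[0] = 0;                  /* a is the start of the message */
displacements[1] = n_addr - a_addr;    /* n's offset relative to a */

MPI_Type_create_struct(2, blocklengths, displacements, types, &pair_t);
MPI_Type_commit(&pair_t);              /* must commit before use */

MPI_Bcast(&a, 1, pair_t, 0, MPI_COMM_WORLD);   /* sends/receives both a and n */
MPI_Type_free(&pair_t);                /* release the derived type */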