Chapter 3 - MPI Flashcards
What are the two types of systems in the world of MIMD?
Distributed memory: The memory associated with a core is only accessible to that core.
Shared memory: Collection of cores connected to a global shared memory.
What is message passing?
Way of communication for distributed memory systems.
One process running on one core communicates through a send() call; another process on another core calls receive() to get the message.
What is MPI?
Message Passing Interface
Library implementation of message passing communication
What is collective communication?
Functions that allow for communication between multiple (more than 2) processes.
What is a rank?
A non-negative identifier of a process running in MPI.
n number of processes -> 0, 1, …, (n-1) ranks
What library must be included to use MPI?
#include <mpi.h>
How is MPI initialized?
MPI_Init(int* argc_p, char*** argv_p);
argc_p and argv_p: pointers to arguments to main, argc and argv
If the program does not use the command-line arguments, NULL can be passed for both
In main() using arguments:
MPI_Init(&argc, &argv);
How do you get the communicator size in MPI?
MPI_Comm_size(MPI_Comm comm_name, int* size)
What is the name of the global communicator?
MPI_COMM_WORLD
Set up by MPI_Init();
How do you get a process’s rank within a communicator?
MPI_Comm_rank(MPI_Comm comm_name, int* rank)
How is an MPI program finalized?
MPI_Finalize(void);
Any resource allocated to MPI is freed.
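Example (a minimal sketch, not part of the original cards): a complete MPI program that initializes, queries the communicator size and rank, and finalizes.
#include <mpi.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);                        /* start MPI */

    int comm_sz, my_rank;
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);       /* number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);       /* this process's rank */

    printf("Hello from rank %d of %d\n", my_rank, comm_sz);

    MPI_Finalize();                                /* free MPI's resources */
    return 0;
}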
What does
mpiexec -n <n> ./program
do when running an MPI program?
mpiexec tells the system to launch <n> instances of the program
What is a MPI communicator?
A collection of processes that can send messages to each other.
How can MPI produce SPMD programs?
Processes branch out to do different tasks based on their ranks.
Using if-else on the rank:
e.g. rank 0 prints, rank 1 sends, rank 2 receives
What is the syntax of MPI_Send()?
MPI_Send(
void* buffer,
int message_size,
MPI_Datatype type,
int dest,
int tag,
MPI_Comm communicator
)
buffer: holds the content of the message to be sent
size: Number of elements to send from the buffer
type: MPI_CHAR, MPI_DOUBLE, etc.
dest: Destination rank, who is receiving the message
tag: non-negative int, can be used to distinguish messages that are otherwise identical
What is the syntax of MPI_Recv?
int MPI_Recv(
void* msg_buf,
int size,
MPI_Datatype type,
int source,
int tag,
MPI_Comm communicator,
MPI_Status* status
)
msg_buf: Buffer to receive message in
size: Number of elements to receive
type: Types of elements in message
source: Source rank, rank that sent message
tag: Tag should match the tag from the send
communicator: Must match the communicator at the send
status: When not using status MPI_STATUS_IGNORE is passed
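Example (sketch, assuming MPI is initialized and my_rank was obtained with MPI_Comm_rank): rank 1 sends one int to rank 0.
int x;
if (my_rank == 1) {
    x = 42;
    MPI_Send(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);                     /* dest = 0, tag = 0 */
} else if (my_rank == 0) {
    MPI_Recv(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);  /* src = 1, tag = 0 */
}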
What conditions must be met for an MPI message to be successfully sent by process a and received by process b?
dest = b
src = a
comm_a = comm_b
tag_a = tag_b
buffer_a/buffer_b, size_a/size_b and type_a/type_b must be compatible
Most of the time, if type_a=type_b and size_b >= size_a, the message will be successfully received
What is a wildcard argument in MPI communication?
If a receiver will receive multiple messages from multiple source ranks and does not know the order they will arrive in, it can loop over the MPI_Recv() calls and pass the wildcard argument MPI_ANY_SOURCE to accept the messages in any order.
Similarly, if a process will receive multiple messages from another process but with different tags, it can do the same with the wildcard argument MPI_ANY_TAG.
Only receivers can use wildcard arguments.
There is no communicator wildcard argument.
What is the MPI_Status used in MPI_Recv?
A struct with at least the three members:
MPI_SOURCE
MPI_TAG
MPI_ERROR
Before the recv call, create a status struct:
MPI_Status status;
MPI_Recv(…, &status);
These are useful if a process uses wildcards and needs to figure out the source or tag of a received message; the status members can then be examined.
What is MPI_Get_count() used for?
Used to figure out how many elements of the provided type were received in the message
MPI_Get_count(
MPI_Status* status, (in)
MPI_Datatype type, (in)
int* count (out)
)
status: Status struct passed to recv()
type: Type passed in recv
count: Number of elements received in the message
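Example (sketch): receive one message of at most 100 ints from any source with any tag, then inspect the status and the element count.
int buf[100], count;
MPI_Status status;
MPI_Recv(buf, 100, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
MPI_Get_count(&status, MPI_INT, &count);   /* how many ints actually arrived */
/* sender rank: status.MPI_SOURCE, tag: status.MPI_TAG, elements: count */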
What happens if MPI buffers an MPI_Send message?
MPI puts the complete message into its internal storage.
The MPI_Send() call will return.
The message might not be in transmission yet, but as it is now stored internally, we can use the send_buffer for other purposes if we want to.
What happens if the MPI_Send message blocks?
The process will wait until it can begin transmitting the message.
The MPI_Send() call might not return immediately.
When will MPI_Send block?
It depends on the MPI implementation.
Many implementations have a "cutoff" message size: if the message is smaller than the cutoff it is buffered; if it exceeds the cutoff, MPI_Send will block.
Does MPI_Recv block?
Yes, unlike MPI_Send, when MPI_Recv returns we know the message has been fully received.
What does it mean that MPI messages are non-overtaking?
If one process a sends 2 messages to process b, the first message must be available to b before the second one is.
Messages from different processes are not ordered with respect to each other; it does not matter which was sent first.
What is a pitfall with MPI_Recv / MPI_Send in the context of blocking?
If the MPI_Recv does not have a matching MPI_Send it will block forever and the program will hang.
The same can happen for a blocking send if it has no matching receiver.
If an MPI_Send is buffered and there is no matching receive, the message will be lost
What is non-determinism in parallel programs?
When the output of a program varies depending on the order in which the processes perform their computations.
How can MPI programs implement I/O to avoid non-determinism?
Make processes branch on process rank.
E.g. rank 0 can read input and send it to the remaining ranks.
All ranks can send their output to rank 0 who can print it in rank order.
In MPI, what are collective communications?
Communication functions that include all processes in a communicator.
What is point-to-point communication?
One sender and one receiver
(MPI_Send | MPI_Recv)
What is MPI_Reduce?
Implementation of collective communication.
Generalized function that allows different operations on data that is held by all processes in a communicator
Syntax:
MPI_Reduce(
void* input_data_buf,
void* output_data_buf,
int count,
MPI_Datatype type,
MPI_Op operator,
int dest_process,
MPI_Comm comm
)
input_data_buf: local data for the process, this is used in the operation
output_data_buf: buffer to hold the output computation done by the operator
count: Number of elements to do operation on. This allows for e.g. operations on arrays
type: Type of the elements in the buffers
operator: Specifies what operation is to be done on the data
dest_process: Rank that receives the computed output
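Example (sketch; local_val is a hypothetical per-process value, my_rank assumed already set): sum the local values onto rank 0.
double local_val = 1.0 * my_rank;   /* some per-process value */
double global_sum = 0.0;
MPI_Reduce(&local_val, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
/* only rank 0 holds the valid global_sum afterwards */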
What are some operators available for MPI_Reduce?
MPI_SUM: Optimized global sum of the local values
MPI_MAX: Finds the largest value across the processes
What is important to remember when using collective communication?
All processes must call MPI_Reduce
Arguments passed to a collective must be compatible (e.g. dest rank)
out buffer is only used by dest rank. The other ranks still need to pass the out argument, but this can be NULL for the other processes
Where point-to-point communication matches on tags and communicators, collectives match on the communicator and the order in which they are called.
It is illegal to use the same buffer for input and output
What does it mean to alias arguments?
Two arguments are aliased if they refer to the same block of memory.
This is illegal in MPI if one of them is an output or input/output argument.
What does MPI_Allreduce do?
Optimized collective function that stores the output of reduce in all processes
MPI_Allreduce(
void* input_data_buf,
void* output_data_buf,
int count,
MPI_Datatype type,
MPI_Op operator,
MPI_Comm comm
)
Identical argument list to MPI_Reduce(), but without the dest rank
What is MPI_Bcast (broadcast)?
Collective function that allows a process to send a message to all other processes in a communicator
MPI_Bcast(
void* data_buf,
int count,
MPI_Datatype type,
int source_process,
MPI_Comm comm
)
source_process: The process with rank source_process sends its content of data_buf.
data_buf: Buffer to either send from, or if processes aren’t the source, receive the data in. Acts as both input and output
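Example (sketch): rank 0 obtains a value and broadcasts it; afterwards every rank in the communicator has it.
int n = 0;
if (my_rank == 0)
    n = 100;                                      /* e.g. read on rank 0 */
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);     /* now n == 100 on every rank */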
What does MPI_Scatter do?
If a communicator is doing operations on a large vector and each process only works on certain parts of it, it is expensive for one process to communicate the whole vector to all processes: each process would then have to allocate memory for the whole vector even though it only computes on part of it.
MPI_Scatter reads the complete data on one rank and sends only the needed components to each of the other processes.
MPI_Scatter(
void* send_buf,
int send_count,
MPI_Datatype sent_type,
void* recv_buf,
int recv_count,
MPI_Datatype recv_type,
int src_process,
MPI_Comm comm
)
The function divides the data referenced in send_buf into comm_size pieces. The first piece goes to rank 0, then rank 1, and so on.
send_count needs to be the amount of data going to each process, so not the complete amount of data in send_buf.
Note that the total amount of data must be divisible by the number of ranks in the communicator
What does MPI_Gather do?
Function to collect the data components from all processes into one process to obtain the complete data, e.g. the complete vector.
MPI_Gather(
void* send_buf,
int send_count,
MPI_Datatype sent_type,
void* recv_buf,
int recv_count,
MPI_Datatype recv_type,
int dest_process,
MPI_Comm comm
)
Same as scatter, but with a destination rank to receive all the data.
Data from rank 0 is stored in the first block of recv_buf, the send_buf of rank 1 is stored in the second block of recv_buf, and so on
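Example (sketch, assuming comm_sz == 4 and my_rank already set): scatter a 16-element vector from rank 0, work on the local block, gather the result back.
double x[16];          /* full vector, only meaningful on rank 0 */
double local_x[4];     /* each rank's block */
if (my_rank == 0)
    for (int i = 0; i < 16; i++) x[i] = i;        /* fill the full vector on rank 0 */

MPI_Scatter(x, 4, MPI_DOUBLE, local_x, 4, MPI_DOUBLE, 0, MPI_COMM_WORLD);
for (int i = 0; i < 4; i++)
    local_x[i] *= 2.0;                            /* work on the local block only */
MPI_Gather(local_x, 4, MPI_DOUBLE, x, 4, MPI_DOUBLE, 0, MPI_COMM_WORLD);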
What is MPI_Allgather?
MPI_Allgather(
void* send_buf,
int send_count,
MPI_Datatype sent_type,
void* recv_buf,
int recv_count,
MPI_Datatype recv_type,
MPI_Comm comm
)
The function concatenates each process's send_buf and stores the result in every process's recv_buf
What are derived datatypes in MPI?
Used to represent any collection of data items in memory.
Stores the type of the items, and their relative locations in memory.
Derived datatypes consist of a sequence of basic MPI_Datatypes and a displacement for each type (from the beginning of the type).
What is the syntax of MPI_Type_create_struct?
MPI_Type_create_struct(
int count,
int array_of_blocklengths[],
MPI_Aint array_of_displacements[],
MPI_Datatype array_of_types[],
MPI_Datatype* new_type
)
count: Number of elements in the type
blocklengths: Allows for the possibility that individual elements are arrays; if one element is an array of 5 elements, its blocklength is 5
displacement: Each element's displacement from the start of the message
What does the function MPI_Get_address do?
MPI_Get_address(
void* location,
MPI_Aint* address
)
Returns the address of the memory referenced by location
MPI_Aint is used because this is the datatype that is big enough to store an address.
How can we get displacements of datatype elements using MPI_Get_address?
int a, b, c;
MPI_Aint addr_a, addr_b, addr_c;
MPI_Get_address(&a, &addr_a);
MPI_Get_address(&b, &addr_b);
MPI_Get_address(&c, &addr_c);
array_of_displacements[0] = 0;
array_of_displacements[1] = addr_b - addr_a;
array_of_displacements[2] = addr_c - addr_a;
How are datatypes created in MPI?
MPI_Datatype new_type;
MPI_Type_create_struct(
3,
block_lengths,
displacements,
types,
&new_type
)
Then the type must be committed:
MPI_Type_commit(&new_type)
Finished using the type:
MPI_Type_free(&new_type)
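Worked example (sketch): build a derived type for two doubles and an int, then broadcast all three in one message.
double a, b;
int n;
MPI_Aint addr_a, addr_b, addr_n;
MPI_Get_address(&a, &addr_a);
MPI_Get_address(&b, &addr_b);
MPI_Get_address(&n, &addr_n);

int block_lengths[3] = {1, 1, 1};
MPI_Aint displacements[3] = {0, addr_b - addr_a, addr_n - addr_a};
MPI_Datatype types[3] = {MPI_DOUBLE, MPI_DOUBLE, MPI_INT};

MPI_Datatype input_t;
MPI_Type_create_struct(3, block_lengths, displacements, types, &input_t);
MPI_Type_commit(&input_t);
MPI_Bcast(&a, 1, input_t, 0, MPI_COMM_WORLD);   /* a, b and n travel together */
MPI_Type_free(&input_t);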
What does MPI_Type_commit do?
Commits a MPI datatype that was created using MPI_Type_create_struct
MPI_Type_commit(
MPI_Datatype* new_type
)
What does MPI_Type_free do?
When we are finished using a MPI type that we have created, we can free the storage it has used using:
MPI_Type_free(
MPI_Datatype *used_custom_type
)
What does MPI_Barrier do?
A collective communication function that is used to synchronize processes.
No process will return from calling it until every process in the communicator has started calling it.
Does not guarantee communication has finished, messages can still be in transit.
Written data that is waiting in a buffer will not be flushed by the barrier; it stays where it is.
Pending communication requests will not be cleaned up by a barrier; you must wait on them to complete and finalize them.
MPI_Barrier(MPI_Comm comm)
What is speedup?
Ratio of serial runtime to parallel runtime
S = T_serial / T_parallel
p: number of processes used in the parallel run
What is linear speedup?
When a parallel program running p processes runs p times faster than the serial program.
What is efficiency?
Per process speedup
S = T_serial / T_parallel
E = S / p = T_serial / (p * T_parallel)
How does linear speedup correspond to parallel efficiency?
Linear speedup gives efficiency of p/p = 1
What is MPI_PROC_NULL
An MPI constant used in point-to-point communication as the src/dest rank. When the constant is used, no communication takes place.
What is an unsafe program?
A program that relies on MPI buffering to avoid deadlocks when sends and receives are waiting for each other.
Unsafe programs may hang, crash or deadlock
What is MPI_Ssend?
A synchronous MPI_Send call that is guaranteed to block until the matching receive starts
Same arguments as MPI_Send
How can you check if a program is safe or unsafe?
If the MPI_Send calls are replaced with MPI_Ssend and the program still runs to completion (does not hang), the program is safe.
What can cause a deadlock in MPI programs, and how can this be resolved?
Processes first sending a message and then waiting to receive. This can make them wait for each other in a circle.
A way to solve this is to vary the order in which ranks send and receive.
If half of the ranks send first and then receive, and the other half receive first and then send, there will be no deadlock (see the sketch below).
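Example (sketch, assuming my_rank and comm_sz are set): a ring exchange where every rank sends to the next and receives from the previous one.
int next = (my_rank + 1) % comm_sz;
int prev = (my_rank - 1 + comm_sz) % comm_sz;
int send_val = my_rank, recv_val;

if (my_rank % 2 == 0) {   /* even ranks: send first, then receive */
    MPI_Send(&send_val, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
    MPI_Recv(&recv_val, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
} else {                  /* odd ranks: receive first, then send */
    MPI_Recv(&recv_val, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Send(&send_val, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
}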
What is MPI_Sendrecv?
MPI's combined send-and-receive call for writing safe exchanges.
MPI_Sendrecv(
void* send_buf,
int send_size,
MPI_Datatype send_type,
int dest,
int send_tag,
void* recv_buf,
int recv_size,
MPI_Datatype recv_type,
int src,
int recv_tag,
MPI_Comm comm,
MPI_Status* status
)
Guarantees no deadlock
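Example (sketch): the same ring exchange as in the deadlock card, written with a single call (next, prev, send_val, recv_val as before).
MPI_Sendrecv(&send_val, 1, MPI_INT, next, 0,
             &recv_val, 1, MPI_INT, prev, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);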
What are local and global variables in MPI programs?
Local: Values specific to a process
Global: Values available to all processes
What is parallel overhead, and what causes it in MPI programs?
Overhead due to additional work that is not done in serial programs.
In MPI, this would be the work done in communicating between processes
When is a parallel program scalable?
If you can increase the problem size n so that efficiency doesn't decrease as p is increased.
In Flynn's taxonomy, what type of programs are MPI programs?
SPMD
Can branch by ranks to do different things
What are the 4 types of communication modes?
Standard: Default for MPI_Send
Synchronous: Send function will block until the reception is acknowledged
Buffered: Explicitly manage the memory that’s used for send/recv
Ready: Assume that the receiver has already initiated the receive when the send() call is made
What category of parallel program is MPI?
SPMD
P copies of the same program can do different things because of their identity number
What is a cartesian communicator?
Each rank has a set of coordinates.
Grid structure with 2 neighbours per dimension (up/down, left/right), possibly in 3D
What is non-blocking sending and receive?
The send() and recv() calls return immediately with a request, so execution can continue.
To make sure the communication completed successfully, you must issue a wait-for-completion call on the request
When an MPI program is launched with multiple processes, what is every process allocated?
A full memory space
Stack, heap, data (includes rank), text
How are MPI programs run?
mpirun -np 4 ./my_program
What is a pre-requisite when using collective operations?
All ranks in the communicator MUST participate in the collective operation
What is a memory fence?
An operation that forces all committed work to be completed before continuing
What does MPI_Alltoall do?
Total exchange - from everyone to everyone
Can collective functionality be implemented using point-to-point communication?
Yes, all collective functions can be implemented using normal send/recv
When is internal buffering not faster (when buffering sends)?
When the message exceeds the size at which copying it takes longer than sending the message right away.
Above this message size, MPI_Send will switch to blocking mode
What does MPI_Ssend() do?
Synchronized mode of Send
Does not return until receiver starts receiving
Synchronizes progress between communicating processes
What is MPI_Bsend?
Buffered Send mode
Lets you allocate the buffer yourself, so that it can be one long, contiguous block of memory
Useful when you’re sending a lot of tiny messages at a time. This usually causes tiny buffer allocations and deallocations, which takes time and fragments heap-memory
Buffer must be registered before the send call
MPI_Buffer_attach(buffer, buffer_size)
MPI_Buffer_detach(&buffer, &buffer_size)
What is MPI_Rsend?
Has the liberty to bypass the protocols that establish whether the recipient is ready.
Can be used when the programmer is 100% sure that the receiver has already made the receive call
What is MPI_Isend?
MPI_Isend(
void* buffer,
int count,
MPI_Datatype type,
int dest,
int tag,
MPI_Comm communicator,
MPI_Request *request
)
Returns immediately. The message send is put in the background and carried out later at MPI's own convenience
Program can do something else in the meantime
MPI_Wait(MPI_Request *req, MPI_Status *stat)
is called when you need to make sure the transfer was successful.
MPI_Waitall(n_reqs, array_of_reqs, MPI_STATUSES_IGNORE)
Can be used if multiple messages were sent and you want to wait for all of them at once
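Example (sketch): a non-blocking ring exchange; post the receive and the send, do other work, then wait on both requests (next, prev, send_val, recv_val as in the earlier ring sketch).
MPI_Request reqs[2];
MPI_Irecv(&recv_val, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, &reqs[0]);
MPI_Isend(&send_val, 1, MPI_INT, next, 0, MPI_COMM_WORLD, &reqs[1]);

/* ... useful computation that does not touch send_val or recv_val ... */

MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);   /* both transfers are complete here */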
Why is MPI_Isend useful for performance?
You can overlap computation and communication.
Communication is expensive, but this allows you to do useful work in the meantime.
What are the modes of non-blocking send?
MPI_Isend
MPI_Issend
MPI_Ibsend
MPI_Irsend
MPI_Irecv (the non-blocking receive)
What is persistent communication, and how can it be implemented?
If the same communication pattern is going to be used over and over, MPI can prepare these in advance and you can activate them later.
The request/wait mechanism is the same as for MPI_Isend.
int MPI_Send_init(<usual MPI_Send arguments>, MPI_Request *req)
int MPI_Recv_init(<usual MPI_Recv arguments>, MPI_Request *req)
Triggered:
MPI_Start(MPI_Request *req)
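Example (sketch): persistent requests for a ring exchange repeated every iteration (next, prev, send_val, recv_val as before; MPI_Startall starts several persistent requests at once).
MPI_Request reqs[2];
MPI_Recv_init(&recv_val, 1, MPI_INT, prev, 0, MPI_COMM_WORLD, &reqs[0]);
MPI_Send_init(&send_val, 1, MPI_INT, next, 0, MPI_COMM_WORLD, &reqs[1]);

for (int iter = 0; iter < 100; iter++) {
    MPI_Startall(2, reqs);                        /* activate both requests */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);    /* wait for this round to finish */
}
MPI_Request_free(&reqs[0]);
MPI_Request_free(&reqs[1]);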
What does double MPI_Wtime(void) do?
Returns a number of seconds represented as a double-precision floating-point value
How can MPI_Wtime() be used to measure execution time?
MPI_Barrier(MPI_COMM_WORLD);
double start = MPI_Wtime();
/* ... work ... */
double end = MPI_Wtime();
Elapsed time = end - start (on this rank)
What is bandwidth?
Bytes / seconds
What is inverse bandwidth?
seconds / byte
How much transfer time is added for sending additional bytes
What are vector types?
Types with regular layout
Vector types consist of:
- count
- a block length
- a common stride between the blocks
Stride: Distance between neighbours
How can vector types be created?
MPI_Type_vector(n_elements, blocklength, stride, type, &new_type)
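Example (sketch; dest is a hypothetical receiver rank): a vector type describing one column of a 10x10 row-major array of doubles, i.e. 10 blocks of length 1 with stride 10.
double a[10][10];
int dest = 1;                                     /* hypothetical receiver */
MPI_Datatype col_t;
MPI_Type_vector(10, 1, 10, MPI_DOUBLE, &col_t);   /* count, blocklength, stride */
MPI_Type_commit(&col_t);
MPI_Send(&a[0][2], 1, col_t, dest, 0, MPI_COMM_WORLD);   /* sends column 2 */
MPI_Type_free(&col_t);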
How can types of internal regions of arrays be constructed?
MPI_Type_create_subarray(
int ndims,
const int array_of_sizes[],
const int array_of_subsizes[],
const int array_of_starts[],
int order,
MPI_Datatype old,
MPI_Datatype *new
)
ndims: dimensions in array
array_of_sizes: how big is entire array
array_of_subsizes: How big is our slice of the array
array_of_starts: where is the origin of the slice
Example: create a 4x4 subarray type from the middle of a 6x6 array (element type MPI_DOUBLE chosen as an example):
int sizes[2]    = {6, 6};
int subsizes[2] = {4, 4};
int starts[2]   = {1, 1};
MPI_Datatype subarray_t;
MPI_Type_create_subarray(2, sizes, subsizes, starts,
                         MPI_ORDER_C, MPI_DOUBLE, &subarray_t);
How can you create a type from a contiguous part of memory?
MPI_Type_contiguous(
count, oldtype, newtype
)
What does MPI_Type_indexed do?
Like MPI_Type_create_struct, except that all the blocks have the same type
What is MPI_Group?
An arbitrary set of ranks
How can we create a group from all ranks in a communicator?
MPI_Comm_group(MPI_Comm comm, MPI_Group *group)
What does MPI_Group_incl do?
Create a subgroup from a group.
Include n_members of the ranks in the rank_list
MPI_Group_incl(
MPI_Group old,
int n_members,
const int rank_list[],
MPI_Group *new_group
)
What does MPI_Group_excl do?
MPI_Group_excl(
MPI_Group old,
int n_to_remove,
const int rank_list[],
MPI_Group *new_group
)
removes ranks from a group
What set operations can be done on groups?
MPI_Group_union
MPI_Group_intersection
Why are groups useful?
They can be made into communicators
MPI_Comm_create(
MPI_Comm old_comm,
MPI_Group g,
MPI_Comm *new_comm
)
When a rank within the group calls this, the communicator handle is returned
A rank outside the group gets MPI_COMM_NULL back
What does MPI_Graph_create do?
Create a graph communicator out of another comm
MPI_Graph_create(
old_comm,
int n_nodes,
int indexes[],
int edges[],
int reorder,
&new_comm
)
reorder: Can MPI give new ranks in the new comm
indexes: Used to map ranks to nodes; entry i is the cumulative count of neighbours up to and including rank i, i.e. where rank i's neighbour list ends in the edges array
edges: Flattened list of the neighbour ranks of each rank, in rank order
How are cartesian communicators created?
MPI_Cart_create(
old_comm,
n_dims,
dims[], (number of ranks in each dimension)
period[], (wrap the edges?)
reorder, (may MPI assign new ranks in the new comm)
&new_comm
)
How is the dim-array for cartesian communicators created?
In cartesian comms we want the dimensions to be as close to a square (2D), cube (3D), and so on, as possible
MPI_Dims_create(
n_nodes, (rank count)
n_dims,
int dims[] (result array)
)
How does a rank find its position within a cartesian grid?
MPI_Cart_coords(
cart_comm,
rank, (current rank)
dims, (dims in comm)
coords[] (result array)
)
How are coords structured within a cartesian grid?
{y, x}
y: row index, starting at the top and increasing downward
x: column index, starting at the left and increasing rightward
How does a rank in a cartesian grid find its neighbours?
MPI_Cart_shift(
comm,
dir, (axis to shift)
displacement, (how far to shift)
*rank_src,
*rank_dest
)
rank_src: the rank that ends up in my place after the shift, i.e. the rank I would receive from
rank_dest: the rank whose place I move into, i.e. the rank I would send to
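Example (sketch, assuming my_rank and comm_sz are set): build a 2D periodic grid, find this rank's coordinates, and find its left/right neighbours.
int dims[2] = {0, 0}, periods[2] = {1, 1}, coords[2];
int cart_rank, left, right;
MPI_Comm cart_comm;

MPI_Dims_create(comm_sz, 2, dims);                      /* pick a near-square grid */
MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart_comm);
MPI_Comm_rank(cart_comm, &cart_rank);                   /* ranks may be reordered */
MPI_Cart_coords(cart_comm, cart_rank, 2, coords);       /* my {y, x} position */
MPI_Cart_shift(cart_comm, 1, 1, &left, &right);         /* neighbours along x */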
What is MPI_PROC_NULL?
On non-periodic comms, if a rank's neighbour falls off the grid, it is returned as MPI_PROC_NULL
When this is passed to a communication call where a rank is expected, the matching operation is simply not carried out
How does MPI_IO work?
All ranks can open file at the same time
Each rank sets a view of the file, this is the window where they can write
All ranks can write within their own views at the same time
How are files opened and closed using MPI_IO?
MPI_File_open(
comm,
*filename, (string)
int access_mode, (MPI flags)
MPI_Info info, (can be NULL)
MPI_File *fh (the open file handle)
)
MPI_File_close(MPI_File *fh)
What are the MPI_IO access flags?
MPI_MODE_CREATE: create if not exist
MPI_MODE_WRONLY
MPI_MODE_RDONLY
MPI_MODE_RDWR
MPI_MODE_APPEND: signals that data will be added at the end
What does MPI_File_write_at do?
Allows you to specify the file position for each chunk of data to write
MPI_File_write_at(
MPI_File fh,
MPI_Offset offset, (where in the file to write; different on each rank)
*buf, (data to write)
count,
type,
*status
)
What does MPI_File_set_view do?
Restricts the region of the file a rank will read/write, shaped like an MPI_Datatype
MPI_File_set_view(
MPI_File fh,
MPI_Offset displacement,
MPI_Datatype etype, (elementary type to read/write)
MPI_Datatype file_layout, (what region of the file to access)
char *representation, (e.g. "native")
MPI_Info info
)