Exam Preparation Questions Flashcards
NB: The actual exam questions will be more elaborate.
Flynn’s Taxonomy classifies computer architectures as:
A) Single/multiple CPUs and distributed/shared memory
B) Single/multiple instruction and single/multiple data
C) CPUs/GPUs and distributed/shared memory
B) Single/multiple instruction and single/multiple data
Which system requires parallel computing with message passing to write computer programs?
A) A multicore laptop
B) A NUMA computer node
C) A multi-node computer cluster
C) A multi-node computer cluster
What is the bisection width of this network? (10 connections)
A) 2
B) 4
C) 6
B) 4
What is the advantage of a binary tree topology over a mesh topology?
A) Binary tree has a better diameter
B) Binary tree has more edges per node
C) Binary tree has a better bisection width
A) Binary tree has a better diameter
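The trade-off behind this answer can be checked numerically. Below is a small Python sketch (illustrative; the function names, the choice of a square 2D mesh without wraparound, and a complete binary tree with p leaves are my assumptions, not part of the deck) comparing diameter and bisection width for p nodes:

```python
import math

def mesh_diameter(p):
    # sqrt(p) x sqrt(p) 2D mesh without wraparound: the worst path crosses
    # both dimensions, so diameter = 2 * (sqrt(p) - 1)
    side = math.isqrt(p)
    return 2 * (side - 1)

def mesh_bisection(p):
    # cutting the mesh down the middle severs sqrt(p) links
    return math.isqrt(p)

def tree_diameter(p):
    # complete binary tree with p leaves: worst path is leaf -> root -> leaf,
    # i.e. 2 * log2(p) hops
    return 2 * int(math.log2(p))

def tree_bisection(p):
    # removing a single link next to the root splits the tree in half
    return 1
```

For p = 64 the tree has diameter 12 versus 14 for the mesh, but its bisection width is only 1 versus 8: the tree wins on diameter and loses on bisection width, which is why A is correct here while C is not.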
The communication complexity is …
using Distributed
addprocs(4)
N = 1600
A = ones(N, N)  # assumption: A is an N-by-N matrix (its definition was missing from the snippet)
ftr = @spawnat 2 sum(A)
s = fetch(ftr)
A) O(N)
B) O(N^2)
C) O(N^3)
B) O(N^2)
Which MPI directive would you use to send a copy of a 2D matrix to all processes from one process?
A) MPI_Bcast
B) MPI_Scatter
C) MPI_Allreduce
A) MPI_Bcast
Which one would you use to send a large message to avoid duplicating data?
A) MPI_Bsend
B) MPI_Ssend
C) MPI_Send
B) MPI_Ssend
Each process computes a set of rows of C = A * B. What is the communication complexity in each worker?
A) O(1)
B) O(N^3/P)
C) O(N^2)
C) O(N^2)
When parallelising matrix multiplication: C = A * B, which partition leads to the best computation/communication ratio?
A) Each process computes one row of C
B) Each process computes one column of C
C) Each process computes a set of rows of C
C) Each process computes a set of rows of C
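Why the row-block partition wins can be sketched with a simple cost model (an assumption of mine: one multiply-add costs one time unit, one received matrix entry one word; each worker is assumed to need all of B plus its share of A):

```python
def comp_comm_ratio_one_row(N):
    # one row of C: N^2 multiply-adds; the worker receives one row of A
    # (N words) plus all of B (N^2 words)
    return N**2 / (N**2 + N)

def comp_comm_ratio_row_block(N, P):
    # N/P rows of C: N^3/P multiply-adds; the worker receives N^2/P words
    # of A plus all of B (N^2 words)
    return (N**3 / P) / (N**2 + N**2 / P)
```

The one-row ratio stays below 1 no matter how large N gets, while the row-block ratio grows like N/P, so larger problems amortize the communication cost.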
Can we use latency hiding in the Jacobi method?
A) No, because all computations depend on the previous iteration
B) Yes, because computation of interior values can overlap communication
C) No, because all values depend on neighbouring values
B) Yes, because computation of interior values can overlap communication
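The overlap in answer B can be sketched serially. In a real MPI worker one would post non-blocking ghost exchanges (MPI_Isend/MPI_Irecv), update the interior points while the messages are in flight, then update the boundary points after MPI_Wait. This Python sketch (the 1D stencil and function names are illustrative assumptions) only shows the data split that makes the overlap legal:

```python
def jacobi_step_with_overlap(u, left_ghost, right_ghost):
    # One 1D Jacobi sweep on a worker's slice of the grid.
    n = len(u)
    new = [0.0] * n
    # Interior points read only locally owned data, so in an MPI code this
    # loop can run while the ghost exchange is still in progress.
    for i in range(1, n - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    # Boundary points need the neighbours' ghost values, so they are
    # computed only after the exchange completes.
    new[0] = 0.5 * (left_ghost + u[1])
    new[n - 1] = 0.5 * (u[n - 2] + right_ghost)
    return new
```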
Which of the following suggestions does NOT solve the incorrect behaviour of the parallel Floyd algorithm?
A) Communicate row k using standard sends
B) Communicate row k using synchronous sends
C) Use MPI.Barrier in each iteration
A) Communicate row k using standard sends
Which are the data dependencies for each worker in Gaussian elimination at iteration k>1?
A) Workers with rows > k need the entire row k
B) Workers with rows > k need all rows >= k
C) Workers with rows > k need only part of row k
C) Workers with rows > k need only part of row k
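Answer C follows from what the update actually reads. A minimal Python sketch (function name and list representation are my assumptions), where updating row i at iteration k touches only row_k[k:]:

```python
def eliminate_row(row_i, row_k, k):
    # Gaussian elimination update of row i at iteration k.
    # Only row_k[k:] is read: entries of row k left of column k are already
    # zero by this iteration, so a worker holding row i > k needs just the
    # trailing part of row k, not the whole row.
    m = row_i[k] / row_k[k]
    return row_i[:k] + [row_i[j] - m * row_k[j] for j in range(k, len(row_i))]
```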
What can be done to solve the load imbalance of block partitioning in Gaussian elimination?
A) Use a cyclic distribution of rows
B) Use a replicated workers model
C) Use a 2D block partition
A) Use a cyclic distribution of rows
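The imbalance and its fix can be counted directly: at iteration k only rows below k are updated, so with a block distribution the worker owning the top rows goes idle early. A Python sketch (the counting helper and the small N = 8, P = 2 example are my assumptions) tallies how many row-updates each worker performs over a full elimination:

```python
def updates_per_worker(N, P, owner):
    # Count the row-updates each worker performs across all N iterations
    # of Gaussian elimination; owner(i, N, P) maps row i to its worker.
    counts = [0] * P
    for k in range(N):
        for i in range(k + 1, N):  # rows below the pivot row are updated
            counts[owner(i, N, P)] += 1
    return counts

# Block distribution: worker 0 owns the first N/P rows, worker 1 the rest.
block = updates_per_worker(8, 2, lambda i, N, P: i // (N // P))
# Cyclic distribution: row i belongs to worker i mod P.
cyclic = updates_per_worker(8, 2, lambda i, N, P: i % P)
```

With N = 8 and P = 2 the block split gives 6 vs 22 updates, while the cyclic split gives 12 vs 16: far better balanced, which is why answer A is correct.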
In parallel TSP, there may be load imbalance because
A) It is impossible to divide the search tree evenly
B) We don't know beforehand how long the paths in the search tree are
C) Some workers might be able to do more pruning than others
C) Some workers might be able to do more pruning than others
How to cheat with speedups?
A) Use a fast compiler and a slow network
B) Use a slow compiler and a fast network
C) Use very small problem sizes
B) Use a slow compiler and a fast network