Exam Preparation Questions Flashcards

NB: The actual exam questions will be more elaborate.

1
Q

Flynn’s Taxonomy classifies computer architectures as:
A) Single/multiple CPUs and distributed/shared memory
B) Single/multiple instruction and single/multiple data
C) CPUs/GPUs and distributed/shared memory

A

B) Single/multiple instruction and single/multiple data

2
Q

Which system requires message passing to write parallel programs?
A) A multicore laptop
B) A NUMA computer node
C) A multi-node computer cluster

A

C) A multi-node computer cluster

3
Q

What is the bisection width of this network? (10 connections)

A) 2
B) 4
C) 6

A

B) 4

4
Q

What is the advantage of a binary tree topology over a mesh topology?
A) Binary tree has a better diameter
B) Binary tree has more edges per node
C) Binary tree has a better bisection width

A

A) Binary tree has a better diameter

5
Q

The communication complexity is …

using Distributed
addprocs(4)
N = 1600
A = rand(N, N)  # assumed: A is an N-by-N matrix (its definition is not shown on the card)
ftr = @spawnat 2 sum(A)
s = fetch(ftr)

A) O(N)
B) O(N^2)
C) O(N^3)

A

B) O(N^2)

6
Q

Which MPI directive would you use to send a copy of a 2D matrix to all processes from one process?

A) MPI_Bcast
B) MPI_Scatter
C) MPI_Allreduce

A

A) MPI_Bcast

7
Q

Which MPI send mode would you use for a large message, to avoid duplicating the data?
A) MPI_Bsend
B) MPI_Ssend
C) MPI_Send

A

B) MPI_Ssend

8
Q

Each process computes a set of rows of C = A * B. What is the communication complexity in each worker?
A) O(1)
B) O(N^3/P)
C) O(N^2)

A

C) O(N^2)

9
Q

When parallelising matrix multiplication: C = A * B, which partition leads to the best computation/communication ratio?

A) Each process computes one row of C
B) Each process computes one column of C
C) Each process computes a set of rows of C

A

C) Each process computes a set of rows of C
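A rough count supports this answer: with a block of N/P rows, each worker does O(N^3/P) arithmetic but still moves only O(N^2) data, so the computation/communication ratio grows with N. A minimal sketch of the counting (the helper names are ours, assuming each worker receives all of B plus its block of A and returns its block of C):

```julia
# Rough per-worker counts for the row-block partition of C = A*B,
# assuming each of P workers holds N/P rows of A, receives all of B,
# and sends back its N/P rows of C.
comp_per_worker(N, P) = 2 * N^3 ÷ P          # flops to compute N/P rows of C
comm_per_worker(N, P) = N^2 + 2 * (N^2 ÷ P)  # words moved: all of B, plus A- and C-blocks

N, P = 1600, 4
ratio = comp_per_worker(N, P) / comm_per_worker(N, P)
println(ratio)  # grows like O(N/P): larger N means relatively less communication
```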

10
Q

Can we use latency hiding in the Jacobi method?
A) No, because all computations depend on the previous iteration
B) Yes, because computation of interior values can overlap with communication
C) No, because all values depend on neighbouring values

A

B) Yes, because computation of interior values can overlap with communication

11
Q

Which of the following suggestions does NOT solve the incorrect behaviour of the parallel Floyd algorithm?
A) Communicate row k using standard sends
B) Communicate row k using synchronous sends
C) Use MPI.Barrier in each iteration

A

A) Communicate row k using standard sends

12
Q

Which are the data dependencies for each worker in Gaussian elimination at iteration k>1?
A) Workers with rows > k need the entire row k
B) Workers with rows > k need all rows >= k
C) Workers with rows > k need only part of row k

A

C) Workers with rows > k need only part of row k

13
Q

What can be done to solve the load imbalance of block partitioning in Gaussian elimination?
A) Use a cyclic distribution of rows
B) Use a replicated workers model
C) Use a 2D block partition

A

A) Use a cyclic distribution of rows
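The idea can be sketched in a few lines of Julia (the helper name is hypothetical, not from the lecture code): with P workers, worker w owns rows w, w+P, w+2P, …, so even in late iterations of Gaussian elimination, where only high-numbered rows remain active, every worker still has work.

```julia
# Sketch: cyclic row distribution over P workers.
rows_of_worker(w, P, N) = w:P:N   # worker w owns rows w, w+P, w+2P, ...

N, P = 8, 4
for w in 1:P
    println("worker $w owns rows ", collect(rows_of_worker(w, P, N)))
end
# At iteration k = 5 of Gaussian elimination only rows > 5 are updated,
# yet with the cyclic layout those rows (6, 7, 8) still span 3 workers,
# whereas a block layout (w1: 1-2, w2: 3-4, w3: 5-6, w4: 7-8) would
# leave workers 1 and 2 idle.
```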

14
Q

In parallel TSP, there may be load imbalance because
A) It is impossible to divide the search tree evenly
B) We don’t know beforehand how long the paths in the search tree are
C) Some workers might be able to do more pruning than others

A

C) Some workers might be able to do more pruning than others

15
Q

How to cheat with speedups?
A) Use a fast compiler and a slow network
B) Use a slow compiler and a fast network
C) Use very small problem sizes

A

B) Use a slow compiler and a fast network

16
Q

Which law gives the more pessimistic optimal speedup for a large parallel computation?
A) Amdahl’s law
B) Gustafson’s law
C) Amdahl’s and Gustafson’s laws are equivalent

A

A) Amdahl’s law
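A quick worked comparison shows why (s = 0.05 is an illustrative serial fraction, not from the card):

```julia
# Amdahl: fixed problem size, speedup capped at 1/s.
# Gustafson: problem size scales with p, speedup grows roughly linearly.
amdahl(s, p)    = 1 / (s + (1 - s) / p)
gustafson(s, p) = s + (1 - s) * p

s, p = 0.05, 100
println(amdahl(s, p))     # ≈ 16.8, far below the p = 100 processors used
println(gustafson(s, p))  # ≈ 95, close to linear speedup
```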

17
Q

Which elements of v are needed in CPU 2 to compute this sparse matrix-vector product?

A) The entire vector v
B) v[5] and v[6]
C) v[2] and v[6]

Where:
CPU 1: v1, v2, v3
CPU 2: v4, v5, v6

A

C) v[2] and v[6]

18
Q

Which strategy is used to improve load balance while reducing communication in the POP simulation software?

A) Cyclic partitions
B) Hierarchical block distribution
C) Computation ensembles

A

B) Hierarchical block distribution