Parallel Performance Flashcards
What are the four main causes of performance degradation in parallel computing?
- Starvation – Not enough parallel work to keep processors busy
- Latency – Delay in transferring data between system components
- Overhead – Extra work needed for parallel execution (e.g., thread management)
- Waiting – Processes competing for shared resources
Acronym: SLOW
What is parallel speed-up? How is it defined mathematically?
How much faster the parallel program is compared to the serial version.
S_N = T_0/T_N, where T_0 is the serial execution time and T_N is the parallel execution time on N processors.
What is parallel efficiency? How is it defined mathematically?
Whether speed-up represents efficient use of the resources.
E_N = S_N/N, where S_N is the speed-up on N processors.
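A quick worked example with illustrative numbers (not drawn from any particular code): a program taking T_0 = 100 s serially and T_4 = 30 s on 4 processors gives

```latex
S_4 = \frac{T_0}{T_4} = \frac{100}{30} \approx 3.3,
\qquad
E_4 = \frac{S_4}{4} \approx 0.83
```

i.e., about 83% of the ideal 4-fold speed-up is realised.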
What is strong scaling?
Keeping the total problem size fixed and increasing the number of processors to reduce execution time.
What does Amdahl’s Law state about parallel performance?
The maximum achievable speed-up is limited by the fraction of the program that cannot be parallelised.
What is the formula for Amdahl’s Law?
S_N = 1/(s + p/N), where:
- s is the serial fraction
- p is the parallel fraction (s + p = 1)
- N is the number of processors
What is the theoretical maximum speed-up if infinite processors were available?
S_max = 1/s = 1/(1 − p)
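A sketch with assumed fractions: if 10% of the runtime is inherently serial (s = 0.1, p = 0.9), then on 16 processors

```latex
S_{16} = \frac{1}{0.1 + 0.9/16} = 6.4,
\qquad
S_{\max} = \frac{1}{0.1} = 10
```

so no number of processors can push the speed-up past 10.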
How does Gustafson’s Law challenge Amdahl’s Law?
It assumes that problem size increases with the number of processors, leading to better scaling than predicted by Amdahl’s Law.
What is the formula for Gustafson’s Law?
S_N = s + pN, where:
- s is the serial fraction
- p is the parallel fraction
- N is the number of processors
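Using the same assumed fractions as the Amdahl example above (s = 0.1, p = 0.9), Gustafson's scaled speed-up on 16 processors is

```latex
S_{16} = 0.1 + 0.9 \times 16 = 14.5
```

far above Amdahl's 6.4, because the parallel part of the (now larger) problem dominates the runtime.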
What is weak scaling?
Keeping the workload per processor constant while increasing the number of processors, leading to increased total problem size.
Why are barriers used in parallel programming?
To ensure all threads have finished their work before any of them proceeds past a synchronisation point (e.g., before results are written out).
How can implied barriers be removed to improve performance?
By adding the nowait clause to a worksharing construct (e.g., #pragma omp for nowait), removing unnecessary waiting wherever correctness does not depend on the barrier.
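A minimal sketch of the idea (the arrays and sizes are illustrative, not from the source): the first loop drops its implied barrier with nowait, which is safe because the second loop writes to an independent array, and the barrier at the end of the parallel region still guarantees both loops finish before the output is printed.

```c
#include <stdio.h>

#define N 1000

int main(void) {
    static double a[N], b[N];

    #pragma omp parallel
    {
        /* nowait removes the implied barrier at the end of this
           worksharing loop: threads move straight on to the next
           loop instead of waiting for the slowest one. */
        #pragma omp for nowait
        for (int i = 0; i < N; i++)
            a[i] = 2.0 * i;

        /* Safe despite nowait: this loop touches only b, never a. */
        #pragma omp for
        for (int i = 0; i < N; i++)
            b[i] = (double)i * i;
    } /* implied barrier at the end of the parallel region */

    printf("a[N-1] = %.1f, b[N-1] = %.1f\n", a[N - 1], b[N - 1]);
    return 0;
}
```

Built with an OpenMP flag such as gcc -fopenmp; without it the pragmas are ignored and the code simply runs serially.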
What are the three main loop scheduling strategies in OpenMP?
- schedule(static, chunk) – Assigns equal chunks to threads in a round-robin manner.
- schedule(dynamic, chunk) – Assigns chunks to threads on demand as they finish their work.
- schedule(guided, chunk) – Starts with large chunks that decrease in size towards chunk.
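A sketch showing the three clauses side by side (the chunk size of 4 and the trivial loop body are illustrative; in practice one schedule is chosen per loop):

```c
#include <stdio.h>

#define N 100

int main(void) {
    double sum = 0.0;

    /* static: chunks of 4 iterations dealt out round-robin, fixed at
       loop entry -- lowest overhead, best when iterations cost the same. */
    #pragma omp parallel for schedule(static, 4) reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += i;

    /* dynamic: chunks of 4 handed out on demand as threads finish --
       copes with uneven iteration costs, at higher scheduling overhead. */
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += i;

    /* guided: chunk sizes start large and shrink towards a minimum
       of 4 -- a compromise between static and dynamic. */
    #pragma omp parallel for schedule(guided, 4) reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += i;

    printf("sum = %.0f\n", sum);
    return 0;
}
```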
Why is load balancing important in MPI programs?
To ensure all processors are utilised effectively and prevent idle time due to uneven work distribution.
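One common remedy is a block distribution that differs by at most one item between ranks; a minimal sketch (the item count of 1000 is an assumed placeholder):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    const int total = 1000;   /* illustrative number of work items */
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank gets total/size items; the first (total % size) ranks
       take one extra, so no rank holds more than one item above any
       other -- idle time from uneven distribution is minimised. */
    int base  = total / size;
    int extra = total % size;
    int count = base + (rank < extra ? 1 : 0);
    int start = rank * base + (rank < extra ? rank : extra);

    printf("rank %d handles items [%d, %d)\n", rank, start, start + count);

    MPI_Finalize();
    return 0;
}
```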
What is the role of an interconnect in HPC clusters?
It facilitates communication between compute nodes by carrying MPI messages.
What are common interconnect technologies used in HPC clusters?
- Gigabit Ethernet
- InfiniBand
What are three key factors affecting MPI message transmission time?
- Number of hops between nodes
- Blocking factor of the network
- Other network traffic
How is message transmission time modelled?
t = L + M/B, where:
- L is the latency
- M is the message size
- B is the bandwidth
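A worked example with assumed figures (L = 1 µs, B = 10 GB/s): a 1 MB message costs

```latex
t = L + \frac{M}{B}
  = 1\,\mu\mathrm{s} + \frac{10^{6}\,\mathrm{B}}{10^{10}\,\mathrm{B/s}}
  = 101\,\mu\mathrm{s}
```

which is bandwidth-dominated, whereas a 1 kB message under the same figures costs only 1.1 µs and is latency-dominated.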
When is latency more important than bandwidth? When is bandwidth more important than latency?
- Latency matters more when sending many small messages.
- Bandwidth matters more when sending large messages.
How does communication overhead scale in 2D and 3D domain decomposition?
- 2D: R_2D = 4/N
- 3D: R_3D = 6/N
where R is the ratio of communication to computation and N is the number of grid points along each subdomain edge. As subdomains get smaller, communication overhead increases.
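These ratios follow from a surface-to-volume argument: communication is proportional to a subdomain's boundary and computation to its interior, so for an edge length of N points

```latex
R_{2D} = \frac{4N}{N^{2}} = \frac{4}{N},
\qquad
R_{3D} = \frac{6N^{2}}{N^{3}} = \frac{6}{N}
```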
What are examples of parallel overheads in MPI and OpenMP?
- MPI: Extra code for message passing and process synchronisation.
- OpenMP: Thread management, loop scheduling, and synchronisation overhead.
How does Amdahl’s Law change when considering parallel overheads?
The speed-up formula gains an extra overhead term:
S_N = 1/(s + p/N + n_p v/T_0), where n_p v is the total time spent on parallel overheads, expressed as a fraction of the serial runtime T_0.
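Continuing the assumed numbers from the Amdahl example (s = 0.1, p = 0.9, N = 16), and supposing the overhead term amounts to 2% of the serial runtime:

```latex
S_{16} = \frac{1}{0.1 + 0.9/16 + 0.02} \approx 5.7
```

down from 6.4 with no overhead.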
What is super-linear speed-up, and when does it occur?
When the measured speed-up exceeds N. It typically occurs because splitting the problem across processors shrinks each one's working set until it fits in cache, so each processor runs faster than the serial version did.