Scaling Flashcards

1
Q

What is speedup?

A

The ratio of the time it takes to run a program without parallelism to the time it takes to run in parallel

2
Q

What is scalability?

A

A measure of how much speedup the program gets as one adds more processors/cores.

3
Q

How do you calculate speedup?

A

• t1: Time on one core
  • Could be either a sequential implementation or the p = 1 version of the parallel code
• tp: Time on p cores

• Speedup: Sp = t1 / tp

• Ideal speedup is when Sp = p
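As a minimal sketch (the timings are made-up numbers, not measurements), the speedup calculation is just a ratio:

```python
# Hypothetical timings: t1 is the runtime on one core,
# tp the runtime on p cores.
t1 = 120.0   # seconds on 1 core
p = 8
tp = 18.0    # seconds on 8 cores

sp = t1 / tp  # speedup
print(f"speedup: {sp:.2f} (ideal would be {p})")
# → speedup: 6.67 (ideal would be 8)
```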

4
Q

How do you calculate parallel efficiency?

A

• Parallel efficiency: ep = Sp / p (usually expressed as a percentage)

where Sp is the speedup and p is the number of cores
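Using the same kind of made-up timings as above, efficiency follows directly from the speedup:

```python
def parallel_efficiency(t1, tp, p):
    """e_p = s_p / p, where s_p = t1 / tp is the speedup."""
    return (t1 / tp) / p

# Hypothetical measurements: 120 s on 1 core, 18 s on 8 cores.
e = parallel_efficiency(120.0, 18.0, 8)
print(f"{e:.0%}")  # → 83%
```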

5
Q

How do you create a strong scaling graph?

A
  • Keep problem size fixed
  • Measures the fastest we can run a given problem
  • Report time to solution for different numbers of cores/nodes
  • Ideal: Linear reduction in runtime (halving when the core count doubles)
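A sketch of the data behind such a graph, using made-up measurements alongside the ideal curve t1 / p:

```python
t1 = 64.0                                # hypothetical 1-core runtime (s)
cores = [1, 2, 4, 8, 16]
measured = [64.0, 33.0, 17.5, 9.5, 5.5]  # made-up measurements (s)
ideal = [t1 / p for p in cores]          # ideal: linear reduction

for p, tm, ti in zip(cores, measured, ideal):
    print(f"{p:2d} cores: measured {tm:5.1f} s, ideal {ti:5.1f} s")
```

Plotting `measured` against `ideal` over `cores` gives the strong scaling graph; the gap between the two curves shows where scaling breaks down.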
6
Q

How do you create a weak scaling graph?

A
  • Problem size increases with the number of cores/nodes
  • Measures how time to solution changes as the problem size and the compute resources increase together
  • Report time to solution
  • Ideal: Constant runtime
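The analogous sketch for weak scaling, again with made-up timings; the problem size grows in proportion to the core count, so the ideal runtime stays flat:

```python
base_size = 1_000_000                   # hypothetical work per core
cores = [1, 2, 4, 8]
sizes = [base_size * p for p in cores]  # problem grows with p
measured = [42.0, 43.1, 44.6, 47.0]     # made-up measurements (s)

for p, n, tm in zip(cores, sizes, measured):
    print(f"{p} cores, n = {n:>9}: {tm:.1f} s (ideal: 42.0 s)")
```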
7
Q

What is Amdahl’s Law?

A

• A program can never run faster than the time spent in the parts that do not run in parallel (the serial portions of the program), no matter how many processing elements we have.

8
Q

How do you use Amdahl’s law to calculate maximum possible speedup?

A

• α: Fraction of the original program that is inherently serial (in terms of runtime)

• tp = α * t1 + ((1 − α) * t1) / p

(where t1 is the runtime on a single core/processor and tp is the runtime on p processors)

• Sp = t1 / tp = 1 / (α + (1 − α) / p)

(Refer to examples)

Therefore, as p grows very large, the maximum possible speedup approaches 1 / α
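A small sketch of the formula (the 5% serial fraction is just an example value):

```python
def amdahl_speedup(alpha, p):
    """Predicted speedup on p cores when a fraction alpha of the
    single-core runtime is inherently serial (Amdahl's law)."""
    return 1.0 / (alpha + (1.0 - alpha) / p)

alpha = 0.05                        # example: 5% serial code
print(amdahl_speedup(alpha, 16))    # ≈ 9.14
print(amdahl_speedup(alpha, 1024))  # approaches, but never reaches, 1/alpha
print(1 / alpha)                    # maximum possible speedup: 20.0
```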

9
Q

What is the problem with Amdahl’s law?

A

• Amdahl’s law is pessimistic: as you scale to more cores/processors, the serial part of the program inevitably starts to dominate the runtime

10
Q

What is Gustafson’s law?

A

• Speedup = α + p (1 − α) = p − α (p − 1)

where p is the number of processors and α is the fraction of the original program that is inherently serial (in terms of runtime)
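A sketch of the formula, using the same example 5% serial fraction as before:

```python
def gustafson_speedup(alpha, p):
    """Scaled speedup on p cores for serial fraction alpha
    (Gustafson's law)."""
    return alpha + p * (1.0 - alpha)  # equivalently p - alpha * (p - 1)

alpha = 0.05                           # example serial fraction
print(gustafson_speedup(alpha, 16))    # → 15.25
print(gustafson_speedup(alpha, 1024))  # keeps growing with p,
                                       # unlike Amdahl's 1/alpha ceiling
```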

11
Q

Why is Gustafson’s law optimistic?

A

Gustafson’s law is optimistic because it allows the problem size to increase as we add more cores/processors, so that a larger problem is solved in the same time

12
Q

Why is Gustafson’s law more realistic than Amdahl’s law?

A

This is more realistic because bigger computers are used to solve bigger problems in the amount of time the user is willing to wait for a solution – constant runtime.

13
Q

What kinds of scaling does Amdahl’s law describe?

A

Strong scaling

14
Q

What kind of scaling does Gustafson’s law describe?

A

Weak scaling

15
Q

Why might time spent in the serial portion of code not be constant?

A

Communication

• e.g. we want to calculate the total of something across many processes
• Every process could send its subtotal to the master node (scales linearly with the number of nodes/processes)
• Alternatively, each node could sum its value with a neighbour’s, those pairs combine with neighbouring pairs, and so on up a tree (scales as O(log n))
• Either way, the time is definitely not constant
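The two strategies can be simulated sequentially as a sketch; the function names here are just for illustration, and the counters count communication steps, not wall-clock time:

```python
def linear_reduce(subtotals):
    """Every process sends its subtotal to a master: O(n) steps."""
    total, steps = 0, 0
    for x in subtotals:
        total += x
        steps += 1
    return total, steps

def tree_reduce(subtotals):
    """Pairwise (tree) reduction: O(log n) communication rounds,
    since all pairs combine in parallel within one round."""
    values, rounds = list(subtotals), 0
    while len(values) > 1:
        values = [values[i] + values[i + 1] if i + 1 < len(values)
                  else values[i]
                  for i in range(0, len(values), 2)]
        rounds += 1
    return values[0], rounds

subtotals = list(range(16))
print(linear_reduce(subtotals))  # → (120, 16)
print(tree_reduce(subtotals))    # → (120, 4)
```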

16
Q

What is domain decomposition?

A
  • Used for data parallelism
  • Split the data, do same set of operations
  • So equally splitting data to each core should mean no load imbalance
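A sketch of an even split, assuming simple contiguous chunking (the `split` helper is hypothetical, not a standard routine):

```python
def split(data, parts):
    """Divide data into `parts` contiguous chunks whose sizes
    differ by at most one element."""
    q, r = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + q + (1 if i < r else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

data = list(range(10))
chunks = split(data, 4)
print(chunks)  # → [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]

# Each "core" applies the same operation to its own chunk:
partials = [sum(x * x for x in chunk) for chunk in chunks]
print(sum(partials) == sum(x * x for x in data))  # → True
```

Because the chunk sizes differ by at most one element, each core gets an essentially equal share of the work.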