11 - Performance & Optimisation Flashcards by Andrew Smith

What is implicit in the goal of parallelism?

Optimisation

What are 2 ways to decide which parallelisation strategy is better?

Use theoretical measures

Measure the performance and compare

What is theoretical performance?

Span/step and work complexity

What is the most critical part when parallelising code?

The theoretical performance.

It can give you huge speedup gains

What is latency?

The time it takes to complete a single task

What is throughput?

The rate at which tasks can be completed
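
The difference between latency and throughput can be sketched with a small calculation. This is only an illustration: the worker counts and per-task times are made-up numbers, and `tasks_done` and `throughput` are hypothetical helper names, not part of the deck.

```python
def tasks_done(workers: int, latency_s: int, elapsed_s: int) -> int:
    """Tasks finished after elapsed_s seconds, assuming each worker
    processes one task at a time and every task takes latency_s seconds."""
    return workers * (elapsed_s // latency_s)


def throughput(tasks: int, elapsed_s: int) -> float:
    """Tasks completed per second."""
    return tasks / elapsed_s


ELAPSED = 8  # seconds observed (assumed)

# One fast worker: low latency (1 s per task).
fast = tasks_done(workers=1, latency_s=1, elapsed_s=ELAPSED)

# Eight slow workers: higher latency (4 s per task), but they run in parallel.
slow = tasks_done(workers=8, latency_s=4, elapsed_s=ELAPSED)

print(throughput(fast, ELAPSED))  # 1.0 task/s at 1 s latency
print(throughput(slow, ELAPSED))  # 2.0 tasks/s at 4 s latency
```

Under these assumptions the slow workers have worse latency but better throughput, which is the trade-off behind the CPU and GPU cards further down.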

What is better, higher or lower latency?

Lower

What is better, higher or lower throughput?

Higher

What does optimising for latency minimise?

Time, at the expense of power

What does optimising for throughput maximise?

The quantity of tasks processed per unit of time

What is optimised for low latency computations, CPU or GPU?

CPU

What is optimised for data-parallel and high throughput computations?

GPU

What has the larger cache? CPU or GPU?

CPU

What is speedup?

Compares the time T for solving the identical problem on one processor versus on p processors

What is the ideal speedup?

Linear

What is the formula for speedup?

S = Ts / Tp

What is efficiency?

Measures the utilisation of hardware resources (return on hardware investment)
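
As a rough worked example of the two metrics, the sketch below computes speedup as S = Ts / Tp and efficiency as E = S / p. The efficiency formula is the commonly used definition and is an assumption here, since the deck does not state it. The timings are invented for illustration.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """S = Ts / Tp: how many times faster the parallel run is."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """E = S / p (assumed definition): fraction of the ideal linear
    speedup actually achieved on p processors."""
    return speedup(t_serial, t_parallel) / p


# Hypothetical measurements: 120 s on one processor, 20 s on 8 processors.
t_s, t_p, p = 120.0, 20.0, 8

print(speedup(t_s, t_p))        # 6.0
print(efficiency(t_s, t_p, p))  # 0.75
```

With these numbers the ideal (linear) speedup would be 8 and the ideal efficiency 1, so the run reaches 75% of the ideal.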

What are 2 parallelisation metrics?

Speedup and efficiency

What is the ideal efficiency?

1 (or 100%)

What are 3 sources of performance loss?

Non-parallelisable computation (there is always a small part that is serial)

Overhead (extra effort for communication between processors)

Under-utilisation (idle processors, slow memory)

What is Amdahl’s Law?

The improvement to be gained from using a faster mode of execution is limited by the fraction of time that this mode is used

What is the formula for Amdahl’s law?

S = 1 / ((1 - f) + f/p)

What does Amdahl’s Law provide?

A theoretical upper limit on parallel speedup assuming that there are no costs for parallelism
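
To see why the law gives an upper limit, here is a minimal sketch that evaluates S = 1 / ((1 - f) + f/p), with f as the parallel fraction and p as the number of processors. The function names and the choice of f = 0.9 are illustrative assumptions.

```python
def amdahl_speedup(f: float, p: int) -> float:
    """Amdahl's law: f is the parallelisable fraction (0-1),
    p is the number of processors."""
    return 1.0 / ((1.0 - f) + f / p)


def amdahl_limit(f: float) -> float:
    """Speedup as p grows without bound: 1 / (1 - f)."""
    if f == 1.0:
        return float("inf")  # fully parallel code has no serial bottleneck
    return 1.0 / (1.0 - f)


f = 0.9  # assume 90% of the run time can be parallelised
for p in (1, 2, 8, 64, 1024):
    print(p, round(amdahl_speedup(f, p), 2))

print("limit:", round(amdahl_limit(f), 2))
```

However many processors are added, the speedup for f = 0.9 stays below about 10, because the 10% serial part still has to run on one processor. Real programs fall short of even this bound once parallelisation overheads are included.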

What overheads can be an extra cost of parallelisation?

Communication (shared memory)

Synchronisation (waiting for work to complete, barriers)

Computation (assignment of tasks to processors)

Memory requirements (parallel algorithms typically require more memory)

What is under-utilisation?

Every piece of hardware has maximum capabilities. Not utilising hardware resources to their full potential decreases efficiency.

What is load imbalance?

Uneven distribution of work to processors
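
The cost of load imbalance can be illustrated with a simplified model: assume each processor's run time equals the work units assigned to it and that there is no communication overhead. The parallel run then finishes only when the most heavily loaded processor does. The work distributions below are invented.

```python
def parallel_time(loads: list[float]) -> float:
    """The run ends when the busiest processor finishes."""
    return max(loads)


def load_efficiency(loads: list[float]) -> float:
    """Total work divided by (processors x parallel time), i.e. the
    fraction of available processor time spent doing useful work."""
    return sum(loads) / (len(loads) * parallel_time(loads))


balanced = [25, 25, 25, 25]    # 100 work units spread evenly
imbalanced = [70, 10, 10, 10]  # same 100 units, unevenly distributed

for name, loads in (("balanced", balanced), ("imbalanced", imbalanced)):
    s = sum(loads) / parallel_time(loads)  # speedup over one processor
    print(name, round(s, 2), round(load_efficiency(loads), 2))
```

In this model the balanced case reaches a speedup of 4 on 4 processors, while the imbalanced case is held to roughly 1.43 because three processors sit idle for most of the run.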

What is power consumption proportional to?

The cube of processor frequency f
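
A quick calculation shows what the cubic relationship implies. The sketch assumes power scales exactly with the cube of frequency and that work splits perfectly across cores, both of which are idealisations. The name `relative_power` is hypothetical.

```python
def relative_power(freq_ratio: float, cores: int = 1) -> float:
    """Power relative to one core at the original frequency,
    assuming power is proportional to frequency cubed per core."""
    return cores * freq_ratio ** 3


# One core at half the original frequency.
print(relative_power(0.5))           # 0.125

# Two cores at half frequency: same aggregate clock rate as one full-speed
# core (if the work parallelises perfectly), at a quarter of the power.
print(relative_power(0.5, cores=2))  # 0.25
```

This is one way to see why several slower cores can be more power-efficient than one fast core, provided the workload parallelises well.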

What are 4 strategies to gain the best performance and optimisation?

Selection of the right algorithm

Following basic principles for writing efficient code

Architecture-specific optimisation

Micro-level optimisation

What is (1-f), f and p in Amdahl's law?

```
(1-f) = fraction of serial code (0-1)
f = fraction of parallel code (0-1)
p = number of processors
```

(29 cards)