Accelerators and GPUs Flashcards

2
Q

Why do CPUs need to balance multiple factors in their design?

A

CPUs must deliver acceptable performance for a wide range of applications while balancing functionality, performance, energy efficiency, and cost.

3
Q

What is the purpose of an accelerator in computing?

A

An accelerator works alongside a CPU to provide increased performance for specific workloads, with different design tradeoffs.

4
Q

Give an example of an early accelerator used alongside CPUs.

A

x87 floating point co-processors were used to accelerate floating point operations before floating point units were integrated into the main processor.

5
Q

What were GPUs originally designed for?

A

GPUs were specialised for image generation, requiring many matrix-vector operations.

6
Q

Why are GPUs highly parallel?

A

They contain a large number of floating point units and support a large number of processing threads.

7
Q

How does GPU memory differ from CPU memory?

A

GPUs use specialised graphics memory (such as GDDR or HBM) that provides much higher bandwidth than typical CPU DRAM.

8
Q

What was a key hardware evolution in GPUs?

A

GPUs evolved from fixed-function rendering pipelines to programmable unified shaders and double-precision arithmetic.

9
Q

Why are GPUs useful as accelerators in HPC?

A

They offer good floating point performance and high memory bandwidth, making them suitable for computationally expensive tasks.

10
Q

Why must some operations still be handled by a CPU in a GPU-accelerated system?

A

The CPU is responsible for tasks such as running the operating system and handling input/output operations.

11
Q

How are GPUs typically connected to CPUs?

A

Via the PCI Express (PCI-e) interface.

12
Q

What is a drawback of PCI-e connectivity for GPUs?

A

It has relatively high latency, which can impact performance.

13
Q

How can the performance impact of PCI-e latency be mitigated?

A

By minimising the transfer of data between the host CPU and the GPU.

14
Q

What technology provides better GPU connectivity than PCI-e? What makes it better?

A

NVIDIA’s NVLink technology, which offers a better data rate.

15
Q

Why is energy efficiency important in HPC system design?

A

Power consumption and cooling requirements significantly impact overall system performance and cost.

16
Q

What is the Green500 list?

A

A companion to the Top500 list that ranks HPC systems by energy efficiency (GFlops/Watt).

17
Q

What is the exascale era in computing?

A

The era of building exascale supercomputers, which perform at least one exaflop (10^18 floating point operations per second).

18
Q

Name the first exascale supercomputer.

A

Frontier (AMD EPYC CPUs and AMD Instinct GPUs).

19
Q

Why is hardware diversity increasing in HPC?

A

Companies beyond Intel and NVIDIA, such as AMD and ARM, are entering the market with competitive CPUs and GPUs.

20
Q

Why is portability an important consideration in modern HPC?

A

The increasing variety of CPU and GPU architectures means software must be able to run across different hardware platforms.

21
Q

What is the NVIDIA Pascal architecture?

A

A GPU architecture designed for HPC applications.

22
Q

How many Streaming Multiprocessors (SMs) can a Pascal GPU have?

A

Up to 60 SMs, though some may be disabled due to manufacturing defects.

23
Q

What is a Graphics Processing Cluster (GPC)?

A

A block of 10 Streaming Multiprocessors, functioning like an independent GPU within a Pascal GPU.

24
Q

How much high-bandwidth memory does a Pascal GPU have?

A

16GB of HBM2.
25
Q

What is the memory bandwidth of a Pascal GPU?

A

720GB/s.

26
Q

What is the function of the "GigaThreadEngine" in Pascal GPUs?

A

It schedules threads and handles context switching.

27
Q

How does a Pascal GPU organise workloads?

A

Workloads are divided into thread blocks (up to 1024 threads), which are further subdivided into warps of 32 threads.
28
Q

Why must all threads in a warp execute the same instruction?

A

Warps follow a Single Instruction Multiple Data (SIMD) model.

29
Q

Why must data movement be considered in GPU programming?

A

The CPU and GPU have separate memory, and transferring data between them can be slow.

30
Q

Why should small tasks remain on the CPU instead of being offloaded to the GPU?

A

The overhead of transferring data to the GPU can outweigh the benefits of parallel execution for small tasks.

31
Q

How do GPUs hide memory latency?

A

By oversubscribing the cores with many more threads than can run at once, so other warps can execute while some wait on memory.
32
Q

Why is branching inefficient in GPU programming?

A

GPUs are optimised for data-parallel workloads, and branch divergence within a warp can cause performance degradation.

33
Q

What is Single Instruction Multiple Data (SIMD)?

A

A parallel processing model where each thread performs the same operation on different data items.
34
Q

Why do GPUs not have the same branch prediction as CPUs?

A

Their architecture prioritises parallel execution over complex control flow handling.

35
Q

Name some programming models used for GPU programming.

A

1. OpenMP
2. OpenCL
3. OpenACC
4. CUDA
5. HIP

36
Q

Which GPU programming model is specific to NVIDIA?

A

CUDA
37
Q

What is the `target` construct in OpenMP 4.0?

A

A directive that allows offloading computations to accelerators like GPUs.
38
Q

What does the `teams distribute` directive in OpenMP do?

A

It creates a league of teams on the accelerator and distributes loop iterations across them; combined with `parallel for`, the iterations are further shared among the threads within each team.

39
Q

What does the `map` clause do in OpenMP target regions?

A

It specifies which data should be transferred between CPU and GPU memory, and in which direction (`to`, `from`, or `tofrom`).
40
Q

Why is OpenMP useful for GPU programming?

A

It allows existing C, C++, and Fortran code to be parallelised with minimal changes.