Terms Flashcards

1
Q

What model of parallelization does CUDA employ?

A

SIMT - Single Instruction, Multiple Thread

2
Q

What does a fundamental computing unit consist of?

A

An ALU (Arithmetic Logic Unit) and an FPU (Floating Point Unit).

3
Q

What is a fundamental computing unit called?

A

A core

4
Q

What is a group of fundamental computing units called?

A

A streaming multiprocessor (SM)

5
Q

What is a subtask of a computing task called?

A

Thread

6
Q

What are computing subtasks organized into?

A

Blocks

7
Q

What are groups of computing subtasks organized into?

A

Warps (on the hardware, the threads of a block execute in warps of 32)

8
Q

How does the GPU “hide latency”?

A

While a warp waits on data, the SM switches to a different warp that is ready to run.

9
Q

What is the essential software construct in CUDA called?

A

A kernel
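
A minimal sketch (the kernel name is illustrative):

    // A kernel: a function marked __global__ that the host launches
    // and the device executes once per thread, in parallel.
    __global__ void myKernel(float *data)
    {
        // per-thread work goes here
    }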

10
Q

How does CUDA tell each thread which part of the computation to do, and how does this method relate to what would be done in serial code?

A

It assigns built-in index variables (threadIdx, blockIdx, blockDim) to each thread; the thread's computed global index plays the role of a loop index in serial code.
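
For example, a sketch (function and variable names are illustrative):

    // Serial version: for (int i = 0; i < n; i++) y[i] = x[i] + y[i];
    __global__ void add(const float *x, float *y, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // plays the role of the loop index i
        if (i < n)                                      // guard: the grid may have extra threads
            y[i] = x[i] + y[i];
    }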

11
Q

How would you specify a kernel launch function with 10 blocks of 100 threads per block?

A

myKernel<<<10, 100>>>(args); (the kernel itself is declared as __global__ void myKernel(args))
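
Sketched in full (kernel name and arguments are illustrative):

    __global__ void myKernel(float *data);   // kernel declaration

    // In host code: 10 blocks x 100 threads per block = 1000 threads total.
    myKernel<<<10, 100>>>(d_data);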

12
Q

How do you specify a function should be called from the host and executed on the device? What is it called when this is launched from the device instead of the host?

A

__global__, dynamic parallelism
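
A sketch of dynamic parallelism (names are illustrative; assumes compute capability 3.5+ and compilation with -rdc=true):

    __global__ void childKernel(int *data)
    {
        // ... per-thread work ...
    }

    __global__ void parentKernel(int *data)
    {
        if (threadIdx.x == 0)
            childKernel<<<1, 32>>>(data);  // launched from the device, not the host
    }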

13
Q

How do you specify a function should be called from the host and executed on the host?

A

__host__

14
Q

How do you specify a function should be called from the device?

A

__device__

15
Q

How do you specify a function should be compiled so it can be called on the host and device?

A

__host__ __device__
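
The three qualifiers side by side, as a sketch (function names are illustrative):

    __host__ void hostOnly(void) { }           // called and executed on the host (the default)

    __device__ float deviceHelper(float x)     // callable only from device code
    {
        return 2.0f * x;
    }

    __host__ __device__ float square(float x)  // compiled for both host and device
    {
        return x * x;
    }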

16
Q

What function allocates device memory?

A

cudaMalloc

17
Q

What function copies memory from the host to the device?

A

cudaMemcpy

18
Q

What function frees memory from the device?

A

cudaFree
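
A sketch of the typical allocate/copy/free lifecycle tying cards 16-18 together (variable names are illustrative):

    int n = 1024;
    size_t bytes = n * sizeof(float);     // size_t expresses an amount of memory
    float *h_x = (float *)malloc(bytes);  // host buffer
    float *d_x = NULL;                    // device buffer

    cudaMalloc(&d_x, bytes);                              // allocate device memory
    cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);  // host -> device
    // ... launch kernels that use d_x ...
    cudaMemcpy(h_x, d_x, bytes, cudaMemcpyDeviceToHost);  // device -> host
    cudaFree(d_x);                                        // free device memory
    free(h_x);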

19
Q

What function synchronizes threads within a block?

A

__syncthreads()
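
A common use, sketched as a block-level sum (assumes a block size of 256; names are illustrative):

    __global__ void sumBlock(const float *x, float *blockSums)
    {
        __shared__ float s[256];
        int tid = threadIdx.x;
        s[tid] = x[blockIdx.x * blockDim.x + tid];
        __syncthreads();                   // all writes to s must land before any reads

        for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
            if (tid < stride)
                s[tid] += s[tid + stride];
            __syncthreads();               // synchronize between reduction steps
        }
        if (tid == 0)
            blockSums[blockIdx.x] = s[0];
    }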

20
Q

What function synchronizes all threads in a grid?

A

cudaDeviceSynchronize
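
Called from the host, e.g. (a sketch; names are illustrative):

    myKernel<<<blocks, threads>>>(d_x);  // launches return to the host immediately
    cudaDeviceSynchronize();             // host blocks until all launched work finishes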

21
Q

What type of operations prevent conflicts when multiple threads access the same variable?

A

Atomic operations, e.g. atomicAdd
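
For example (a sketch; a plain bins[values[i]]++ here would race):

    __global__ void histogram(const int *values, int *bins, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            atomicAdd(&bins[values[i]], 1);  // serialized read-modify-write, no lost updates
    }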

22
Q

What type indicates an amount of memory?

A

size_t

23
Q

What type is used for GPU errors?

A

cudaError_t
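
Typical host-side usage, as a sketch (assumes <cstdio> is included; d_x and bytes are illustrative):

    cudaError_t err = cudaMalloc(&d_x, bytes);  // every runtime call returns a cudaError_t
    if (err != cudaSuccess)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));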

24
Q

What type represents a vector of three unsigned integers? What’s the alternative when specifying grid and block sizes?

A

uint3, dim3
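
A sketch of dim3 in a launch configuration (kernel name and args are illustrative):

    dim3 grid(16, 16);    // 16 x 16 x 1 blocks (unset components default to 1)
    dim3 block(8, 8, 4);  // 8 x 8 x 4 threads per block
    myKernel<<<grid, block>>>(args);
    // Inside the kernel, the built-ins threadIdx and blockIdx have type uint3.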