Terms Flashcards

Question 1

Q

What model of parallelization does CUDA employ?

Answer

A

SIMT - Single Instruction, Multiple Thread

Question 2

Q

What does a fundamental computing unit consist of?

Answer

A

An ALU (arithmetic logic unit) and an FPU (floating-point unit).

Question 3

Q

What is a fundamental computing unit called?

Question 4

Q

What is a group of fundamental computing units called?

Answer

A

A streaming multiprocessor (SM)

Question 5

Q

What is a subtask of a computing task called?

Question 6

Q

What are computing subtasks organized into?

Question 7

Q

What are groups of computing subtasks organized into?

Question 8

Q

How does the GPU “hide latency”?

Answer

A

When a warp stalls waiting for data, the SM switches to another warp that is ready to run.

Question 9

Q

What is the essential software construct in CUDA called?

Question 10

Q

How does CUDA tell each thread which part of the computation to do, and how does this method relate to what would be done in serial code?

Answer

A

It gives each thread built-in index variables (threadIdx, blockIdx, blockDim), which play the role of the loop index in serial code.
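
A minimal sketch of the loop-index analogy, assuming a simple element-wise scaling task. The names scaleSerial, scaleKernel, and compareScale are hypothetical, and managed memory (cudaMallocManaged) is used only to keep the example short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Serial version: the loop index i selects the element to work on.
void scaleSerial(float *data, int n, float factor) {
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// CUDA version: each thread computes its own i from the built-in index variables.
__global__ void scaleKernel(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {  // guard: the grid may hold more threads than elements
        data[i] *= factor;
    }
}

// Runs both versions on the same input and returns the number of mismatches
// (or -1 if the allocation fails).
int compareScale(int n) {
    float *parallel = nullptr;
    if (cudaMallocManaged(&parallel, n * sizeof(float)) != cudaSuccess) return -1;
    float *serial = new float[n];
    for (int i = 0; i < n; ++i) {
        serial[i] = parallel[i] = static_cast<float>(i);
    }

    scaleSerial(serial, n, 2.0f);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up so every element is covered
    scaleKernel<<<blocks, threads>>>(parallel, n, 2.0f);
    cudaDeviceSynchronize();  // wait for the kernel before reading on the host

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (serial[i] != parallel[i]) ++mismatches;
    }
    cudaFree(parallel);
    delete[] serial;
    return mismatches;
}

int main() {
    printf("mismatches: %d\n", compareScale(1000));
    return 0;
}
```

The bounds check in the kernel stands in for the loop condition i < n, since the grid size is rounded up to a whole number of blocks.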

Question 11

Q

How would you specify a kernel launch function with 10 blocks of 100 threads per block?

Answer

A

myKernel<<<10, 100>>>(args); (the kernel itself is declared as __global__ void myKernel(...))
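
As an illustration of the execution configuration, here is a sketch that launches 10 blocks of 100 threads each with the triple-angle-bracket syntax. The kernel recordIndex and the helper launchTenByHundred are hypothetical names chosen for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes its global index into out.
__global__ void recordIndex(int *out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = i;
}

// Launches 10 blocks of 100 threads and returns how many slots were
// filled with the expected index (or -1 if the allocation fails).
int launchTenByHundred() {
    const int blocks = 10;
    const int threads = 100;
    const int n = blocks * threads;

    int *out = nullptr;
    if (cudaMallocManaged(&out, n * sizeof(int)) != cudaSuccess) return -1;
    for (int i = 0; i < n; ++i) out[i] = -1;

    recordIndex<<<blocks, threads>>>(out);  // <<<number of blocks, threads per block>>>
    cudaDeviceSynchronize();                // wait for the kernel to finish

    int filled = 0;
    for (int i = 0; i < n; ++i) {
        if (out[i] == i) ++filled;
    }
    cudaFree(out);
    return filled;
}

int main() {
    printf("slots filled: %d\n", launchTenByHundred());
    return 0;
}
```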

Question 12

Q

How do you specify a function should be called from the host and executed on the device? What is it called when this is launched from the device instead of the host?

Answer

A

__global__, dynamic parallelism

Question 13

Q

How do you specify a function should be called from the host and executed on the host?

Question 14

Q

How do you specify a function should be called from the device?

Answer

A

__device__

Question 15

Q

How do you specify a function should be compiled so it can be called on the host and device?

Answer

A

__host__ __device__
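
The function qualifiers from the last few cards can be seen side by side in a short sketch. The function names square, addOne, compute, and computeOnDevice are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Compiled for both host and device, so either side may call it.
__host__ __device__ float square(float x) { return x * x; }

// Device-only helper: callable from kernels and other device functions.
__device__ float addOne(float x) { return x + 1.0f; }

// Kernel: launched from the host, executed on the device.
__global__ void compute(float *out, float x) {
    *out = addOne(square(x));
}

// Host wrapper that runs the kernel with a single thread and returns its result
// (or -1.0f if the allocation fails).
float computeOnDevice(float x) {
    float *out = nullptr;
    if (cudaMallocManaged(&out, sizeof(float)) != cudaSuccess) return -1.0f;
    compute<<<1, 1>>>(out, x);
    cudaDeviceSynchronize();
    float result = *out;
    cudaFree(out);
    return result;
}

int main() {
    printf("host:   square(3) = %f\n", square(3.0f));
    printf("device: square(3) + 1 = %f\n", computeOnDevice(3.0f));
    return 0;
}
```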

Question 16

Q

What function allocates device memory?

Answer

A

cudaMalloc

Question 17

Q

What function copies memory from the host to the device?

Answer

A

cudaMemcpy
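
A typical round trip through device memory might look like the sketch below: allocate, copy in, run a kernel, copy back, release. The kernel addOneKernel and the helper roundTripSum are hypothetical names, and return codes are mostly ignored to keep it short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1 to every element.
__global__ void addOneKernel(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Host -> device, compute, device -> host. Returns the sum of the results
// (or -1 if the allocation fails).
int roundTripSum() {
    const int n = 8;
    int host[n] = {0, 1, 2, 3, 4, 5, 6, 7};
    int *dev = nullptr;
    size_t bytes = n * sizeof(int);

    if (cudaMalloc(&dev, bytes) != cudaSuccess) return -1;  // allocate device memory
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);   // copy host -> device
    addOneKernel<<<1, n>>>(dev, n);
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);   // copy device -> host; runs after the kernel
    cudaFree(dev);                                          // release device memory

    int sum = 0;
    for (int i = 0; i < n; ++i) sum += host[i];
    return sum;
}

int main() {
    printf("sum after round trip: %d\n", roundTripSum());
    return 0;
}
```

The last argument of cudaMemcpy sets the copy direction, so the same function handles both transfers.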

Question 18

Q

What function frees memory from the device?

Question 19

Q

What function synchronizes threads within a block?

Answer

A

__syncthreads()
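
To show why a block-level barrier matters, here is a sketch that reverses an array through shared memory. In CUDA C++ the barrier is spelled __syncthreads(). The names BLOCK_SIZE, reverseInBlock, and reverseDemo are hypothetical, and a single block of 64 threads is assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 64  // hypothetical block size for this sketch

// Reverses BLOCK_SIZE elements in place using a shared-memory staging buffer.
__global__ void reverseInBlock(int *data) {
    __shared__ int tile[BLOCK_SIZE];
    int t = threadIdx.x;
    tile[t] = data[t];
    __syncthreads();  // every write to tile must finish before any thread reads it
    data[t] = tile[BLOCK_SIZE - 1 - t];
}

// Returns the number of elements that ended up in the reversed position
// (or -1 if the allocation fails).
int reverseDemo() {
    int *data = nullptr;
    if (cudaMallocManaged(&data, BLOCK_SIZE * sizeof(int)) != cudaSuccess) return -1;
    for (int i = 0; i < BLOCK_SIZE; ++i) data[i] = i;

    reverseInBlock<<<1, BLOCK_SIZE>>>(data);
    cudaDeviceSynchronize();

    int correct = 0;
    for (int i = 0; i < BLOCK_SIZE; ++i) {
        if (data[i] == BLOCK_SIZE - 1 - i) ++correct;
    }
    cudaFree(data);
    return correct;
}

int main() {
    printf("correctly reversed: %d of %d\n", reverseDemo(), BLOCK_SIZE);
    return 0;
}
```

Each thread reads a slot written by a different thread, possibly in another warp, so without the barrier a read could see a stale value.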

Question 20

Q

What function synchronizes all threads in a grid?

Answer

A

cudaDeviceSynchronize (called from the host; it blocks until all previously launched device work has finished)

Question 21

Q

What type of operations prevent conflicts from multiple threads accessing a variable?

Answer

A

Atomic operations, such as atomicAdd
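
A small sketch of an atomic update, assuming every thread increments one shared counter. The names countAll and countThreads are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Every thread adds 1 to the same counter.
__global__ void countAll(int *counter) {
    atomicAdd(counter, 1);  // read-modify-write performed as one indivisible step
}

// Launches blocks * threads threads and returns the final counter value
// (or -1 if the allocation fails).
int countThreads(int blocks, int threads) {
    int *counter = nullptr;
    if (cudaMallocManaged(&counter, sizeof(int)) != cudaSuccess) return -1;
    *counter = 0;

    countAll<<<blocks, threads>>>(counter);
    cudaDeviceSynchronize();

    int result = *counter;
    cudaFree(counter);
    return result;
}

int main() {
    printf("threads counted: %d\n", countThreads(10, 100));
    return 0;
}
```

With a plain `*counter += 1` in place of atomicAdd, concurrent threads could overwrite each other's updates and the total would usually come out too low.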

Question 22

Q

What type indicates an amount of memory?

Question 23

Q

What type is used for GPU errors?

Answer

A

cudaError_t
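
Most CUDA runtime calls return a cudaError_t, and one common pattern is to route each call through a small checker. The helper checkCuda below is a hypothetical name, not part of the CUDA API; cudaSuccess and cudaGetErrorString are.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a readable message on failure and reports success.
bool checkCuda(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    float *dev = nullptr;
    if (!checkCuda(cudaMalloc(&dev, 1024 * sizeof(float)), "cudaMalloc")) return 1;
    if (!checkCuda(cudaFree(dev), "cudaFree")) return 1;
    printf("all CUDA calls succeeded\n");
    return 0;
}
```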

Question 24

Q

What type signature means an unsigned vector with 3 components? What’s the alternative when specifying grid and block sizes?

Answer

A

uint3, dim3
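
A sketch of dim3 in a two-dimensional launch, assuming a small width by height array. The names fill2D and fillGrid are hypothetical. Inside the kernel, threadIdx and blockIdx are uint3 values, while blockDim and gridDim are dim3.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread handles one (x, y) cell and writes its flattened index.
__global__ void fill2D(int *out, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        out[y * width + x] = y * width + x;
    }
}

// Returns how many cells hold the expected value (or -1 if the allocation fails).
int fillGrid(int width, int height) {
    int n = width * height;
    int *out = nullptr;
    if (cudaMallocManaged(&out, n * sizeof(int)) != cudaSuccess) return -1;
    for (int i = 0; i < n; ++i) out[i] = -1;

    dim3 block(16, 16);  // 256 threads per block; the unspecified z component defaults to 1
    dim3 grid((width + block.x - 1) / block.x,    // round up in x
              (height + block.y - 1) / block.y);  // round up in y
    fill2D<<<grid, block>>>(out, width, height);
    cudaDeviceSynchronize();

    int correct = 0;
    for (int i = 0; i < n; ++i) {
        if (out[i] == i) ++correct;
    }
    cudaFree(out);
    return correct;
}

int main() {
    printf("cells filled: %d\n", fillGrid(40, 30));
    return 0;
}
```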