CUDA Flashcards
What is heterogeneous computing?
Computing that uses both the CPU and the GPU.
What is the Host?
CPU and its memory
What is the Device?
GPU and its memory
How do we declare a kernel function that is to be run on the Device?
__global__
What does the keyword __global__ indicate?
A function that runs on the device and is called from host code.
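For example, a minimal kernel declaration might look like the following sketch (the kernel name and body are illustrative, not part of any specific API):

    __global__ void addOne(int *data) {
        // Runs on the device; each thread increments one element.
        data[threadIdx.x] += 1;
    }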
What does it mean when we describe launching a kernel?
The CPU places the kernel launch into the GPU's stream. Execution on the CPU continues without waiting for the kernel to complete.
How does the CPU launch a kernel on the device?
Triple angle brackets mark a call from the host to the device.
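A sketch of a launch from host code, assuming the addOne kernel above and a device pointer d_data allocated earlier (names and sizes are illustrative). The launch returns immediately; cudaDeviceSynchronize() can be used when the host needs to wait for the kernel to finish:

    addOne<<<1, 256>>>(d_data);   // launch 1 block of 256 threads
    cudaDeviceSynchronize();      // optional: block the CPU until the kernel completes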
Device Pointers
Point to GPU memory.
May be passed to/from host code.
May not be dereferenced in host code.
Host Pointers
Point to CPU memory.
May be passed to/from device code.
May not be dereferenced in device code.
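A brief sketch of the distinction between host and device pointers (variable names are illustrative):

    int N = 1024;
    int *h_ptr = (int *)malloc(N * sizeof(int));    // host pointer: may be dereferenced on the CPU
    int *d_ptr;
    cudaMalloc((void **)&d_ptr, N * sizeof(int));   // device pointer: the memory lives on the GPU
    // *d_ptr = 0;   // not allowed in host code: d_ptr may not be dereferenced on the CPU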
What is used to allocate memory on the GPGPU device?
cudaMalloc()
What is used to copy memory from the CPU Host to the GPGPU?
cudaMemcpy() with the cudaMemcpyHostToDevice direction flag.
What is used to copy memory from the GPGPU to the CPU?
cudaMemcpy() with the cudaMemcpyDeviceToHost direction flag.
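A sketch of the typical host-side round trip, assuming the addOne kernel sketched above and an array of N ints (all names are illustrative):

    int N = 256;
    size_t bytes = N * sizeof(int);
    int *h_data = (int *)malloc(bytes);              // host buffer (fill with input data here)
    int *d_data;
    cudaMalloc((void **)&d_data, bytes);             // device buffer
    cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);   // Host -> Device
    addOne<<<1, N>>>(d_data);                        // device pointer passed as a kernel argument
    cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);   // Device -> Host
    cudaFree(d_data);
    free(h_data);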
Where are Device pointers stored?
Stored on the Host and passed to kernels as arguments when they are launched.
What is a thread block?
A group of threads executed together.
How are threads in different thread blocks synchronized?
By ending the current kernel and launching another one; threads in different blocks cannot synchronize within a single kernel launch.
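A sketch of this pattern (kernel names are illustrative): the kernel launch boundary acts as a grid-wide synchronization point, so the second kernel sees all writes made by the first.

    stepOne<<<nBlocks, nThreads>>>(d_data);
    // All blocks of stepOne complete before stepTwo begins (same stream).
    stepTwo<<<nBlocks, nThreads>>>(d_data);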
How can threads within a thread block synchronize?
__syncthreads();
What threads can access the memory that is declared using the __shared__ keyword?
Threads in the same block.
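A sketch of declaring block-shared memory inside a kernel (the name and size are illustrative):

    __global__ void kernelWithShared(int *out) {
        __shared__ int tile[256];               // one copy per thread block, visible to all its threads
        tile[threadIdx.x] = (int)threadIdx.x;   // assumes the block has at most 256 threads
        __syncthreads();
        out[blockIdx.x * blockDim.x + threadIdx.x] = tile[threadIdx.x];
    }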
What purpose does the variable blockIdx.x have in CUDA?
Access block index.
What purpose does the variable threadIdx.x have in CUDA?
Access thread index within block.
What purpose does the variable blockDim.x have in CUDA?
Get the number of threads per block.
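These three variables are commonly combined to compute a thread's global index, as in this sketch (names are illustrative):

    __global__ void scale(float *x, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
        if (i < n)                                       // guard against extra threads in the last block
            x[i] *= 2.0f;
    }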
For a kernel launch, two parameters are given inside the <<< >>>. What is the purpose of these parameters?
<<<nBlocks, nThreadsPerBlock>>>: the number of thread blocks and the number of threads per block.
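A sketch of choosing these parameters on the host for N elements, assuming the scale kernel and a device pointer d_x from the sketch above (values are illustrative):

    int N = 100000;
    int threadsPerBlock = 256;
    int nBlocks = (N + threadsPerBlock - 1) / threadsPerBlock;   // round up so every element is covered
    scale<<<nBlocks, threadsPerBlock>>>(d_x, N);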
What does __syncthreads() do? If a GPGPU executes in a SIMD mode, why do we need this call?
Used as a barrier to prevent data hazards.
Threads are divided into small groups called warps. While the threads within a warp execute in a SIMD manner and are implicitly synchronized, a thread block contains multiple warps, and those warps may not remain synchronized with one another.
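A sketch of the kind of data hazard __syncthreads() prevents: each thread reads a value written by a different thread, which may belong to a different warp (names are illustrative):

    __global__ void shiftLeft(const int *in, int *out) {
        __shared__ int tile[256];                        // assumes blockDim.x == 256
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        tile[threadIdx.x] = in[i];
        __syncthreads();   // without this barrier, another warp may not have written its element yet
        int next = (threadIdx.x + 1) % blockDim.x;
        out[i] = tile[next];
    }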