CUDA Flashcards

1
Q

What is heterogeneous computing?

A

Computing that uses both the CPU and the GPU.

2
Q

What is the Host?

A

CPU and its memory

3
Q

What is the Device?

A

GPU and its memory

4
Q

How do we declare a kernel function that is to be run on the Device?

A

__global__

5
Q

What does the keyword __global__ indicate?

A

A function that runs on the device and is called from host code.
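
A minimal sketch (the kernel name is illustrative):

    __global__ void mykernel(void) {
        // device code: runs on the GPU, but the launch comes from host code
    }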

6
Q

What does it mean to launch a kernel?

A

The CPU places the kernel into the GPGPU stream. Execution on the CPU continues without waiting for the kernel to complete; kernel launches are asynchronous.

7
Q

How does the CPU launch a kernel on the device?

A

Triple angle brackets mark a call from the host to the device.
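
For example (the kernel name is illustrative); as noted above, the launch returns immediately:

    mykernel<<<1, 1>>>();      // launch 1 block of 1 thread on the device
    cudaDeviceSynchronize();   // optionally block the host until the kernel finishes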

8
Q

Device Pointers

A

Point to GPU memory.
May be passed to/from host code.
May not be dereferenced in host code.

9
Q

Host Pointers

A

Point to CPU memory.
May be passed to/from device code.
May not be dereferenced in device code.
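
A minimal sketch of both kinds (the h_/d_ prefixes are just a common naming idiom):

    size_t size = 100 * sizeof(int);
    int *h_a = (int *)malloc(size);    // host pointer: dereference only on the host
    int *d_a;                          // device pointer: the variable lives on the host...
    cudaMalloc((void **)&d_a, size);   // ...but the address it holds is GPU memory
    // *d_a in host code is invalid; pass d_a to a kernel and dereference it there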

10
Q

What is used to allocate memory on the GPGPU device?

A

cudaMalloc()
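
For example (the size and name are illustrative):

    int *d_a;
    cudaMalloc((void **)&d_a, 100 * sizeof(int));   // allocate 100 ints on the device
    cudaFree(d_a);                                  // later: release the allocation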

11
Q

What is used to copy memory from the CPU Host to the GPGPU?

A

cudaMemcpy() with the cudaMemcpyHostToDevice direction flag.
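
For example (the variable names are illustrative):

    cudaMemcpy(d_a, h_a, size, cudaMemcpyHostToDevice);   // dst, src, bytes, direction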

12
Q

What is used to copy memory from the GPGPU to the CPU?

A

cudaMemcpy() with the cudaMemcpyDeviceToHost direction flag.
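
For example (the variable names are illustrative):

    cudaMemcpy(h_c, d_c, size, cudaMemcpyDeviceToHost);   // dst, src, bytes, direction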

13
Q

Where are Device pointers stored?

A

Stored on the Host and passed to the kernels when they execute.
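
A sketch of the typical flow (kernel and variable names are illustrative):

    int *d_a;                          // host variable holding a device address
    cudaMalloc((void **)&d_a, size);
    mykernel<<<1, 128>>>(d_a);         // the device pointer is passed as a kernel argument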

14
Q

What is a thread block?

A

A group of threads executed together.

15
Q

How are threads in different thread blocks synchronized?

A

They cannot be synchronized within a kernel. End the kernel and launch another; the implicit barrier between kernel launches synchronizes all blocks.
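
A sketch, assuming two illustrative kernels step1 and step2 launched on the same (default) stream:

    step1<<<nBlocks, nThreads>>>(d_data);
    step2<<<nBlocks, nThreads>>>(d_data);   // kernels on one stream run in order, so every
                                            // block of step1 finishes before step2 starts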

16
Q

How can threads within a thread block synchronize?

A

__syncthreads();

17
Q

Which threads can access memory declared with the __shared__ keyword?

A

Threads in the same block.
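
A minimal sketch (the kernel is illustrative):

    __global__ void stage(float *d_in) {
        __shared__ float tile[128];             // one copy per block; visible only to
        tile[threadIdx.x] = d_in[threadIdx.x];  // the threads of that same block
    }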

18
Q

What purpose does the variable blockIdx.x have in CUDA?

A

Accesses the block's index within the grid.

19
Q

What purpose does the variable threadIdx.x have in CUDA?

A

Accesses the thread's index within its block.

20
Q

What purpose does the variable blockDim.x have in CUDA?

A

Gets the number of threads per block.
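
The three variables combine into the usual global-index idiom:

    int i = blockIdx.x * blockDim.x + threadIdx.x;   // unique index across the whole grid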

21
Q

For a kernel launch, two parameters are given inside the <<<...>>>. What is the purpose of these parameters?

A

<<<nBlocks, nThreadsPerBlock>>>: the number of thread blocks in the grid and the number of threads per block.
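
For example, covering N elements with 256-thread blocks (N and the round-up idiom are illustrative):

    int nThreads = 256;
    int nBlocks  = (N + nThreads - 1) / nThreads;   // round up so every element gets a thread
    add<<<nBlocks, nThreads>>>(d_a, d_b, d_c);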

22
Q

What does __syncthreads() do? If a GPGPU executes in a SIMD mode, why do we need this call?

A

Used as a barrier to prevent data hazards.

Threads are divided into small groups called warps. While the threads within a warp execute in a SIMD manner and are implicitly synchronized, a thread block typically spans several warps, and those warps may not remain synchronized with one another.
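
A sketch of the hazard, assuming a 64-thread block (two warps):

    __global__ void reverse64(int *d) {
        __shared__ int s[64];
        int t = threadIdx.x;
        s[t] = d[t];
        __syncthreads();    // without this barrier, a thread in one warp may read
                            // s[63 - t] before the thread in the other warp writes it
        d[t] = s[63 - t];
    }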