Lecture 9 Flashcards
GPU?
Graphics Processing Unit
Very fast processors designed to perform the same computation (a shader) on collections of data (vertices, pixels).
Uses data parallelism (with origins in SIMD).
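A minimal sketch of data parallelism in plain Python, assuming a hypothetical brighten function standing in for a shader. The same computation is applied independently to every element, which is what lets a GPU process them in parallel. Here it runs sequentially and only illustrates the pattern.

```python
# Hypothetical "shader": the same computation applied to each pixel.
def brighten(pixel, amount=10):
    # Clamp to the 0-255 range of an 8-bit channel.
    return min(255, pixel + amount)

pixels = [0, 100, 250]

# Each element is processed independently of the others, so a GPU
# could handle them all at once; here map() just walks the list.
result = list(map(brighten, pixels))
print(result)
```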
What’s the top-level software view?
At the top level, we have a master process which runs on the CPU and performs the following steps:
1. Initialises compute device.
2. Defines problem domain.
3. Allocates memory in host and on device.
4. Copies data from host to device memory.
5. Launches execution of the “kernel” on the device. (This is when GPU-specific code gets executed.)
6. Copies data from device memory to host.
7. Repeats steps 4-6 as needed.
8. De-allocates all memory and terminates.
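The steps above can be sketched with a toy simulation in plain Python. ToyDevice and all of its methods are hypothetical names invented for illustration, not part of any real GPU API; the "device memory" is just a dictionary, and the kernel runs sequentially.

```python
class ToyDevice:
    """Hypothetical stand-in for a compute device with its own memory."""

    def __init__(self):
        self.memory = {}  # device-side buffers, keyed by name

    def allocate(self, name, size):
        self.memory[name] = [0] * size

    def copy_from_host(self, name, host_data):
        self.memory[name][:] = host_data

    def launch(self, kernel, in_name, out_name):
        # Run the kernel once per element (sequentially in this sketch).
        src, dst = self.memory[in_name], self.memory[out_name]
        for i in range(len(src)):
            kernel(i, src, dst)

    def copy_to_host(self, name):
        return list(self.memory[name])

    def free(self):
        self.memory.clear()


def square(i, src, dst):
    # "Kernel": the work done for one element.
    dst[i] = src[i] * src[i]


device = ToyDevice()                    # 1. initialise compute device
n = 4                                   # 2. define problem domain
host_in = [1, 2, 3, 4]                  # 3. allocate on host ...
device.allocate("in", n)                #    ... and on device
device.allocate("out", n)
device.copy_from_host("in", host_in)    # 4. copy host -> device
device.launch(square, "in", "out")      # 5. launch kernel on device
host_out = device.copy_to_host("out")   # 6. copy device -> host
device.free()                           # 8. de-allocate device memory
print(host_out)
```

Step 7 would simply repeat the copy, launch, copy sequence with new input data before the final clean-up.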
What is heterogeneity?
The existence of diverse components or resources within a system.
A modern platform includes:
- one or more CPUs
- one or more GPUs
- optional accelerators
OpenCL?
Open Computing Language (OpenCL) is a framework for parallel programming that lets software developers write code that can be executed across heterogeneous computing devices such as CPUs, GPUs, and other accelerators.
The OpenCL view:
The host code sends commands to the devices to:
- transfer data between host memory and device memories.
- execute device code.
The OpenCL execution model:
The application runs on a host, which submits work to devices:
- work-item: the basic unit of work on an OpenCL device.
- kernel: the code for a work-item (basically a C function).
- program: collection of kernels and other functions (analogous to a dynamic library).
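A rough sketch of how these three terms relate, written in plain Python rather than OpenCL C. The names vec_add, scale, program, and enqueue are hypothetical; the gid argument stands in for the index that a real OpenCL kernel would obtain from get_global_id(0).

```python
# Kernel: the code run by one work-item. The `gid` argument plays
# the role of the work-item's global index.
def vec_add(gid, a, b, out):
    out[gid] = a[gid] + b[gid]

def scale(gid, a, factor, out):
    out[gid] = a[gid] * factor

# Program: a named collection of kernels, looked up like library symbols.
program = {"vec_add": vec_add, "scale": scale}

def enqueue(kernel_name, global_size, *args):
    # One kernel invocation per work-item. A real device would run many
    # work-items at once; this loop runs them one after another.
    kernel = program[kernel_name]
    for gid in range(global_size):
        kernel(gid, *args)

a, b = [1, 2, 3], [10, 20, 30]
total = [0] * 3
enqueue("vec_add", 3, a, b, total)
print(total)
```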
Discuss device, kernel, program, command queue, and context in terms of the OpenCL card analogy.
Device: OpenCL devices correspond to the players. Just as a player receives cards from the dealer, a device receives kernels from the host. In code, a device is represented by a cl_device_id.
Kernel: OpenCL kernels correspond to the cards. A host application distributes kernels to devices in much the same way a dealer distributes cards to players. In code, a kernel is represented by a cl_kernel.
Program: An OpenCL program is like a deck of cards. In the same way that a dealer selects cards from a deck, the host selects kernels from a program. In code, a program is represented by a cl_program.
Command queue: An OpenCL command queue is like a player’s hand. Each player receives cards as part of a hand, and each device receives kernels through a command queue. In code, a command queue is represented by a cl_command_queue.
Context: An OpenCL context corresponds to the card table. Just as a card table makes it possible for players to transfer cards to one another, an OpenCL context allows devices to receive kernels and transfer data. In code, a context is represented by a cl_context.
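The analogy can be mirrored in a small toy model. This is plain Python with hypothetical Device, Context, and CommandQueue classes that only imitate the roles of cl_device_id, cl_context, and cl_command_queue; it does not call the OpenCL API, and kernels are ordinary Python functions.

```python
from collections import deque

class Device:
    """Player: executes whatever kernels arrive in its queue."""
    def __init__(self, name):
        self.name = name

class Context:
    """Card table: groups the devices that can share kernels and data."""
    def __init__(self, devices):
        self.devices = list(devices)

class CommandQueue:
    """Player's hand: kernels waiting to run on one device of a context."""
    def __init__(self, context, device):
        if device not in context.devices:
            raise ValueError("device is not part of this context")
        self.device = device
        self.pending = deque()

    def enqueue(self, kernel, *args):
        self.pending.append((kernel, args))

    def finish(self):
        # Run queued kernels in order and collect their results.
        results = []
        while self.pending:
            kernel, args = self.pending.popleft()
            results.append(kernel(*args))
        return results

# Program: the deck from which the host (dealer) picks kernels (cards).
program = {"double": lambda x: 2 * x, "negate": lambda x: -x}

gpu = Device("gpu0")
context = Context([gpu])
queue = CommandQueue(context, gpu)
queue.enqueue(program["double"], 21)
queue.enqueue(program["negate"], 5)
print(queue.finish())
```

Tying each queue to one device within one context reflects the flashcard's point that kernels reach a device only through its command queue.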