Parallel Architectures Flashcards
What are three classifications of parallel computing?
Low-level parallelism, single instruction multiple data (SIMD), and multiple instruction multiple data (MIMD)
Define low-level parallelism
Parallelism inside a single calculation: on an abacus, for instance, you can add with a serial adding algorithm (one digit column at a time) or a parallel adding algorithm (all columns at once).
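A minimal C sketch of the same idea (my analogy, not from the flashcards): a 64-bit ALU adds all bit positions in a single step, where a serial algorithm would loop over them one at a time, much like moving abacus beads column by column.

```c
#include <stdint.h>
#include <stdio.h>

/* Serial addition: process one bit position per iteration,
 * carrying as we go, like moving one abacus column at a time. */
uint64_t add_serial(uint64_t a, uint64_t b) {
    uint64_t sum = 0, carry = 0;
    for (int i = 0; i < 64; i++) {
        uint64_t ai = (a >> i) & 1, bi = (b >> i) & 1;
        uint64_t s = ai ^ bi ^ carry;
        carry = (ai & bi) | (carry & (ai ^ bi));
        sum |= s << i;
    }
    return sum;
}

int main(void) {
    uint64_t a = 123456789, b = 987654321;
    /* Low-level parallelism: the ALU handles all 64 bit
     * positions of `a + b` in a single machine instruction. */
    printf("%llu == %llu\n",
           (unsigned long long)add_serial(a, b),
           (unsigned long long)(a + b));
    return 0;
}
```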
Define single instruction multiple data
Represents multiple data items as arrays and operates on those arrays with single instructions such as ‘‘VECTOR-ADD’’
What did CISC SIMD extensions change compared to plain SIMD?
They added specialised long-word vector registers to the CPU, which are loaded up with data before a single instruction operates on every element at once
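A sketch using x86 SSE intrinsics as one concrete instance (the flashcards name no specific ISA): data is loaded into 128-bit vector registers, then a single instruction adds four floats at once.

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float c[4];

    /* Load the data into 128-bit XMM vector registers... */
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);

    /* ...then a single "vector add" instruction (addps) adds
     * all four float lanes simultaneously. */
    __m128 vc = _mm_add_ps(va, vb);
    _mm_storeu_ps(c, vc);

    for (int i = 0; i < 4; i++)
        printf("%.1f ", c[i]);   /* 11.0 22.0 33.0 44.0 */
    printf("\n");
    return 0;
}
```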
How does SIMD operate inside the GPU?
It uses DMA transfers to main RAM, smaller caches, short pipelines of about two stages, and thousands of ALUs; ultimately it is optimised for data-parallel, high-throughput computations
How does SIMD operate inside the CPU?
It is optimised for low latency: large caches give quick access to data and pipelines are complex (around 30 stages in total), but there are fewer ALUs
What’s a work-group?
A group of processing elements (PEs) within a compute unit (CU), all executing the same instruction (the ‘‘single instruction’’ half of SIMD)
What’s a work item?
A single processing element executing that instruction on one datum (the ‘‘multiple data’’ half of SIMD)
What’s a kernel?
A program of SIMD instructions given to a work-group
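To make the trio concrete, here is a minimal OpenCL C kernel sketch (the name vector_add and its parameters are illustrative, not from the flashcards): the kernel is handed to a work-group, and each work-item runs the same code on its own datum.

```c
/* OpenCL C kernel: every work-item in the work-group executes
 * this same instruction stream (single instruction)... */
__kernel void vector_add(__global const float *a,
                         __global const float *b,
                         __global float *c)
{
    /* ...but each work-item gets a unique global id, so each
     * PE operates on its own datum (multiple data). */
    int i = get_global_id(0);
    c[i] = a[i] + b[i];
}
```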
How is the GPU programmed?
Through OpenCL C: kernel code is compiled into the GPU's own assembly language and machine code
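A condensed host-side sketch of that flow, with error checks omitted for brevity (only standard OpenCL 1.x API calls are used; kernel_src holds the vector_add source from above): the kernel source is handed to the runtime as a string and compiled to the device's own machine code at run time.

```c
/* Compile with: gcc host.c -lOpenCL */
#define CL_TARGET_OPENCL_VERSION 120
#include <CL/cl.h>
#include <stdio.h>

static const char *kernel_src =
    "__kernel void vector_add(__global const float *a,"
    "                         __global const float *b,"
    "                         __global float *c) {"
    "    int i = get_global_id(0);"
    "    c[i] = a[i] + b[i];"
    "}";

int main(void) {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

    /* Here the runtime compiles the OpenCL C source into the
     * GPU's own assembly language and machine code. */
    cl_program prog =
        clCreateProgramWithSource(ctx, 1, &kernel_src, NULL, NULL);
    clBuildProgram(prog, 1, &device, NULL, NULL, NULL);

    /* The built kernel can now be given to work-groups via
     * clSetKernelArg and clEnqueueNDRangeKernel (omitted here). */
    cl_kernel kernel = clCreateKernel(prog, "vector_add", NULL);
    printf("kernel built: %p\n", (void *)kernel);

    clReleaseKernel(kernel);
    clReleaseProgram(prog);
    clReleaseContext(ctx);
    return 0;
}
```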
What exactly is MIMD?
Multiple instruction multiple data: several independent instruction streams operating on several data streams at once. It is related to the vector idea of architecture known as ‘‘Very Long Instruction Words’’ (VLIW), where one wide instruction packs several operations
Do all CPUs have the same address space?
Yes
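A minimal MIMD sketch using POSIX threads (a choice of mine; the flashcards name no threading API), which also illustrates the shared address space: two threads execute different instruction streams on different data, yet both read and write the same memory.

```c
/* Compile with: gcc mimd.c -pthread */
#include <pthread.h>
#include <stdio.h>

/* Single shared address space: both threads see this variable. */
static int shared = 0;

/* Two different instruction streams (MIMD), each on its own data. */
static void *summer(void *arg) {
    int *n = arg, total = 0;
    for (int i = 1; i <= *n; i++) total += i;
    shared = total;            /* write through the shared space */
    return NULL;
}

static void *printer(void *arg) {
    printf("printer thread: %s\n", (const char *)arg);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    int n = 100;
    pthread_create(&t1, NULL, summer, &n);
    pthread_create(&t2, NULL, printer, "different code, different data");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared = %d\n", shared);   /* 5050, visible to main too */
    return 0;
}
```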
What is cache coherency?
Keeping the caches of different cores consistent: some cache levels are shared and others are private, so getting cache writes right is tricky, e.g. private L1 caches need to notify each other of changes and refresh stale copies
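To illustrate why coherency matters to software (a hypothetical example, not from the flashcards): in the sketch below, each thread's counter is padded onto its own 64-byte cache line; without the padding, both counters would share one line and every write would force the coherence protocol to shuttle that line between the cores' L1 caches (false sharing).

```c
/* Compile with: gcc coherency.c -pthread */
#include <pthread.h>
#include <stdio.h>

#define ITERS 10000000L

/* The padding keeps each counter on its own 64-byte cache line,
 * avoiding the coherence traffic that false sharing would cause. */
struct padded { volatile long count; char pad[64 - sizeof(long)]; };
static struct padded counters[2];

static void *worker(void *arg) {
    struct padded *c = arg;
    for (long i = 0; i < ITERS; i++)
        c->count++;            /* each write must be seen coherently */
    return NULL;
}

int main(void) {
    pthread_t t[2];
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &counters[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    printf("%ld %ld\n", counters[0].count, counters[1].count);
    return 0;
}
```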
What does NUMA (non-uniform memory access) do?
- Single address space shared by processors
- But access times differ: memory near a processor is faster to reach than memory on a remote node
- Used in supercomputers
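A small sketch using Linux's libnuma (one concrete NUMA API, chosen by me; the flashcards name none): memory is allocated on a specific node and stays addressable from every CPU in the single shared address space, just with differing access times.

```c
/* Compile with: gcc numa_demo.c -lnuma */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this machine\n");
        return 1;
    }
    printf("NUMA nodes: %d\n", numa_max_node() + 1);

    /* Allocate 1 MiB physically placed on node 0. Any CPU can
     * still address it (single address space), but CPUs on other
     * nodes pay a higher access latency. */
    size_t size = 1 << 20;
    char *buf = numa_alloc_onnode(size, 0);
    if (!buf) return 1;
    memset(buf, 0, size);

    numa_free(buf, size);
    return 0;
}
```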
What's a ‘Blue Gene’ supercomputer?
It is based on custom chips, its task is protein-folding problems, and it has a very hierarchical architecture