SIMD Flashcards

1
Q

What is SIMD?

A

Single Instruction, Multiple Data: several execution units sit next to each other and apply the same operation to different data elements.

good use of the transistor budget
less power
ideally twice the performance

2
Q

What are the vector sizes in SIMD?

A

SSE: 128 bits
AVX: 256 bits
AVX-512: 512 bits

3
Q

Intrinsic functions: data types and naming

A

data types: __m256 (8 x 32-bit float), __m256d (4 x 64-bit double), __m256i (integers of any width)

A function name is built from three parts, e.g. _mm256_add_pd:
vector length: _mm512, _mm256
function: add, mul, sub, load, store
type and precision: pd (packed double), ps (packed single), ss (scalar single)

loadu: unaligned
load: aligned
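
The naming scheme above can be sketched in C as follows. This is a minimal illustration, not a tuned kernel: `add_floats` is a hypothetical helper name, and the AVX path is only compiled when the compiler defines `__AVX__` (for example with `-mavx` on GCC or Clang); otherwise the scalar loop does all the work.

```c
#include <stddef.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for n floats. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#ifdef __AVX__
    /* Main vector loop: 8 floats (256 bits) per iteration. */
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);  /* loadu: no alignment required */
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#endif
    /* Scalar remainder loop (the whole loop when AVX is not enabled). */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

The unaligned variants (`loadu`, `storeu`) are used so the sketch works for any pointers; the aligned `load`/`store` forms would require 32-byte aligned addresses.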

4
Q

Vectorisation efficiency

A

VE = N / vl

N: trip count (how many times you run the vectorised loop)
vl: vector length

5
Q

SIMD intrinsics: cons

A

low level
error prone

That's why we prefer compiler-generated SIMD code.

6
Q

When does autovectorisation happen, and when does it fail?

A

happens:
  • if possible
  • if beneficial

fails for:
- data dependencies
- alignment problems
- mixed data types
- function calls in the loop

7
Q

Can we parallelise or vectorise a loop with a loop-carried dependency?

A

parallelise: no
vectorise: yes, if the dependency distance is at least the vector length
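
A small C sketch of why the distance matters. It does not use real SIMD instructions: `shift_add_chunked` only emulates a vector unit by reading `VL` inputs before writing `VL` results. The names `VL`, `shift_add_scalar` and `shift_add_chunked` are hypothetical, chosen for this illustration.

```c
#include <stddef.h>
#include <string.h>

enum { VL = 4 };  /* hypothetical vector length, in elements */

/* Reference: scalar loop with a loop-carried dependency of distance d. */
void shift_add_scalar(int *a, size_t n, size_t d)
{
    for (size_t i = d; i < n; i++)
        a[i] = a[i - d] + 1;
}

/* Emulated vector loop: all VL reads happen before any of the VL writes. */
void shift_add_chunked(int *a, size_t n, size_t d)
{
    size_t i = d;
    for (; i + VL <= n; i += VL) {
        int lanes[VL];
        for (size_t k = 0; k < VL; k++)      /* "vector load" and add */
            lanes[k] = a[i + k - d] + 1;
        memcpy(a + i, lanes, sizeof lanes);  /* "vector store" */
    }
    for (; i < n; i++)                       /* scalar remainder */
        a[i] = a[i - d] + 1;
}
```

With d = 4 (equal to `VL`) both functions produce the same array. With d = 2 the chunked version reads elements that the same chunk has not written yet, so its result differs from the scalar loop.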

8
Q

OpenMP SIMD

A

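#pragma omp simd

applies to the for loop right after it; can be combined with worksharing as #pragma omp for simd

  • clauses: private, lastprivate and reduction (firstprivate only on the combined for simd)
  • safelen(l): safe length to vectorise (max vector length)
  • linear(list[:linear-step])
  • aligned(list[:alignment])

for loops: simd chunks
schedule(simd:static, 5) uses a chunk size of roughly 5, rounded up to a multiple of the vector length, so chunks favour alignment and need no remainder loops

- remainder loop: leftover iterations at the end
- peel loop: iterations at the beginning (until the data is aligned)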
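
A minimal C sketch of two of the clauses above. The function names `dot` and `shift_add` are hypothetical. The pragmas take effect only when OpenMP SIMD support is enabled (for example `-fopenmp` or `-fopenmp-simd` on GCC and Clang); without it they are ignored and the loops run as ordinary scalar code with the same result.

```c
#include <stddef.h>

/* reduction(+:sum): each SIMD lane keeps a partial sum, combined at the end. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Loop-carried dependency of distance 8: safelen(8) limits the compiler
   to vectors of at most 8 iterations, which keeps the result correct. */
void shift_add(float *a, size_t n)
{
    #pragma omp simd safelen(8)
    for (size_t i = 8; i < n; i++)
        a[i] = a[i - 8] + 1.0f;
}
```

Note that a floating-point reduction may add the terms in a different order than the scalar loop, so results can differ in the last bits for general inputs.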
