Week 9: SIMD Flashcards

Question 1

Q

Why do I care about SIMD?

Answer

A

It’s becoming the new standard. There’s only so much more speed you can get out of superscalar execution, and we’ve pretty much hit that limit, so further gains have to come from processing several data elements per instruction.

Question 2

Q

What’s the biggest challenge to overcome that arises with SIMD?

Answer

A

Memory traffic: SIMD pulls lots and lots of data streams from memory at once, so memory has to keep up with the arithmetic.

Question 3

Q

Where’s SIMD best?

Answer

A

Dot products, matrix multiply, dealing with arrays, especially big arrays.
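As a rough sketch of why array work fits SIMD so well, here is a dot product written with SSE intrinsics. The function name `dot_sse` is made up for illustration, and the code assumes the length is a multiple of 4 so that no leftover elements need separate handling.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Dot product of two float arrays; n is assumed to be a multiple of 4. */
float dot_sse(const float *a, const float *b, size_t n)
{
    __m128 acc = _mm_setzero_ps();            /* four running partial sums */
    for (size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);      /* load 4 floats (no alignment needed) */
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));  /* 4 multiplies, 4 adds */
    }
    float lanes[4];
    _mm_storeu_ps(lanes, acc);                /* spill the partial sums */
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
}
```

Each loop iteration handles four elements, which is where the speedup on big arrays comes from.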

Question 4

Q

How does Amdahl’s law come into play when it comes to SIMD?

Answer

A

I can only see improvement in proportion to the fraction of the program that I improve. Not every program, and not every part of a program, allows for speedup via SIMD.
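A small sketch of the arithmetic behind Amdahl's law, assuming a fraction f of the run time is sped up by a factor s (for four single-precision lanes the ideal s would be 4). The function name `amdahl_speedup` is made up for illustration.

```c
/* Overall speedup when a fraction f (0..1) of the run time
   is sped up by a factor s and the rest is unchanged. */
double amdahl_speedup(double f, double s)
{
    return 1.0 / ((1.0 - f) + f / s);
}
```

For example, if only half of the program vectorizes (f = 0.5) with s = 4, the overall speedup is 1 / (0.5 + 0.125) = 1.6, not 4.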

Question 5

Q

How much can an SIMD register hold? How many of them are there?

Answer

A

4 single-precision floats (32 bits, 4 bytes each) or 2 double-precision floats (64 bits, 8 bytes each), so 128 bits per register. There are 16 SIMD registers (xmm0 to xmm15 on x86-64).
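The capacity can be checked from C, assuming an x86 compiler that provides the SSE2 header. The constant names below are made up for illustration.

```c
#include <emmintrin.h>  /* SSE2: defines __m128 (floats) and __m128d (doubles) */

/* Number of elements of each type that fit in one 128-bit register. */
enum {
    FLOATS_PER_REG  = sizeof(__m128)  / sizeof(float),   /* 16 bytes / 4 bytes */
    DOUBLES_PER_REG = sizeof(__m128d) / sizeof(double)   /* 16 bytes / 8 bytes */
};
```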

Question 6

Q

How do you tell the names of SIMD assembly instructions apart?

Answer

A

addss (scalar single precision: operates on the lowest element only)

addps (packed single precision: operates on all four elements)
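The difference shows up through the intrinsics that correspond to these two instructions, `_mm_add_ps` and `_mm_add_ss`. This is a minimal sketch; the wrapper names `add_packed` and `add_scalar` are made up for illustration.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Packed add (addps): all four lanes are added. */
void add_packed(const float a[4], const float b[4], float out[4])
{
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* Scalar add (addss): only lane 0 is added; lanes 1..3 are copied from a. */
void add_scalar(const float a[4], const float b[4], float out[4])
{
    _mm_storeu_ps(out, _mm_add_ss(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, the packed version gives {11, 22, 33, 44} and the scalar version gives {11, 2, 3, 4}.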

Question 7

Q

What are the four options available to take advantage of SIMD?

Answer

A

Write directly in assembly
Use C libraries created for it
Use C intrinsics
Compiler vectorization options

Question 8

Q

Why is memory alignment so important in SIMD? On what value should data be aligned?

Answer

A

We want to get all of the data in one read from memory and store it in one write. If we start adding cycles to each load/store, we lose the gains we were looking for by doing everything in parallel. Plus, the assembly instructions for moving aligned data (movaps) are typically faster than the unaligned ones (movups).

Align on 128 bits (16 bytes)
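A minimal sketch of an aligned load and store with intrinsics. The function name `scale4_aligned` is made up for illustration; `_mm_load_ps` and `_mm_store_ps` require a 16-byte-aligned address, which the assert checks in debug builds.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Multiply four floats in place by k. p must be 16-byte aligned:
   the aligned load/store intrinsics fault on unaligned addresses. */
void scale4_aligned(float *p, float k)
{
    assert(((uintptr_t)p % 16) == 0);
    __m128 v = _mm_load_ps(p);                        /* one aligned 128-bit load */
    _mm_store_ps(p, _mm_mul_ps(v, _mm_set1_ps(k)));   /* one aligned 128-bit store */
}
```

One way to get suitably aligned storage in C11 is to declare the array as `_Alignas(16) float x[4];`.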

Question 9

Q

What should the memory bus size be to make this SIMD worth our while? Why?

Answer

A

128 bits. If it takes 4 cycles to get one register’s worth of data from memory, the parallel arithmetic doesn’t help us.

Question 10

Q

What does using intrinsics buy us?

Answer

A

The power of assembly-like commands without getting bogged down in details such as individual registers.

Question 11

Q

How are MSB / LSB represented?

Answer

A

Backwards

LSB -> MSB: elements are written lowest first, so the leftmost element is the low one, and that is the one affected by scalar instructions
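The ordering can be seen with `_mm_set_ps`, which takes its arguments highest element first. This is only a sketch of that one intrinsic's convention; the function names are made up for illustration.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* _mm_set_ps lists the high element first, so the LAST argument
   ends up in lane 0, which is stored at the lowest address. */
void store_set_ps(float out[4])
{
    _mm_storeu_ps(out, _mm_set_ps(3.0f, 2.0f, 1.0f, 0.0f));
}

/* Lane 0 is the element that scalar operations such as addss work on;
   _mm_cvtss_f32 extracts it. */
float low_lane(void)
{
    return _mm_cvtss_f32(_mm_set_ps(3.0f, 2.0f, 1.0f, 0.0f));
}
```

After `store_set_ps(out)`, the array reads {0, 1, 2, 3} in memory order, and `low_lane()` returns 0. The companion intrinsic `_mm_setr_ps` takes the same values in memory order instead.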

Question 12

Q

What are the 3 main issues of SIMD to footstomp?

Answer

A

Must use memory alignment
Must explicitly say when to load / store
Must handle overhead of shuffles.
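To illustrate the shuffle overhead, here is one common way to add up the four lanes of a register. It is a sketch, not the only approach, and the function name `hsum_ps` is made up for illustration. Note that the shuffles do no arithmetic of their own; they only move data into position.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Sum the four lanes (v0, v1, v2, v3) of v using two shuffle + add steps. */
float hsum_ps(__m128 v)
{
    /* Swap neighbours: shuf = (v1, v0, v3, v2). */
    __m128 shuf = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
    /* sums = (v0+v1, v1+v0, v2+v3, v3+v2). */
    __m128 sums = _mm_add_ps(v, shuf);
    /* Move the upper pair down: lane 0 of shuf now holds v2+v3. */
    shuf = _mm_movehl_ps(shuf, sums);
    /* Lane 0 = (v0+v1) + (v2+v3). */
    sums = _mm_add_ss(sums, shuf);
    return _mm_cvtss_f32(sums);
}
```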

Question 13

Q

Why is explicitly specifying loads/stores so important with SIMD?

Answer

A

If my code isn’t efficient with regard to memory accesses, I could be using SIMD and still end up doing more loads and stores than necessary, negating the benefits of SIMD in the first place.

Question 14

Q

How did fast math make a difference on the homework?

Answer

A

y = y + a[i]

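With fast math, the loop indexes by 4. Without fast math, it still indexes by 1. This is because the compiler takes more liberty when it’s told it’s allowed to via fast math: it may reorder the floating-point additions into four partial sums, which strict floating-point rules forbid because the result can change.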
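A plain-C sketch of the two summation orders, to show why the compiler needs permission. This is not the homework code; the function names are made up for illustration, and the second version assumes the length is a multiple of 4.

```c
#include <stddef.h>

/* Strict order: one add per element, exactly as the source loop is written. */
float sum_in_order(const float *a, size_t n)
{
    float y = 0.0f;
    for (size_t i = 0; i < n; i++)
        y = y + a[i];
    return y;
}

/* Reassociated order: four partial sums, indexing by 4, combined at the
   end. This mirrors what a 4-lane vectorized loop computes. */
float sum_by_four(const float *a, size_t n)
{
    float y0 = 0.0f, y1 = 0.0f, y2 = 0.0f, y3 = 0.0f;
    for (size_t i = 0; i < n; i += 4) {
        y0 += a[i];
        y1 += a[i + 1];
        y2 += a[i + 2];
        y3 += a[i + 3];
    }
    return (y0 + y1) + (y2 + y3);
}
```

For well-behaved data the two agree, but for an input such as {1e8f, 1.0f, -1e8f, 1.0f} the rounding depends on the order of the additions, so the results can differ.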