Compiler ILP Flashcards
How can the compiler help us improve IPC?
- Shortening “dependence chains”: strings of instructions in which each one has a RAW (read-after-write) dependence on the one before it, so they must execute one after another
- Getting around the hardware’s limited instruction window. The hardware can only examine a limited number of instructions at a time, so it may miss independent instructions that are far apart in the program and could execute in parallel. The compiler can help by moving these instructions closer together.
Tree Height Reduction
This improves the performance of associative operations (like addition). Say we have a sum of four values being stored in R8:
R8 = R1 + R2 + R3 + R4
The instructions might be…
R8 = R1 + R2
R8 = R8 + R3
R8 = R8 + R4
which creates a dependence chain: each add needs the result of the one before it.
The compiler can reorganize this like so…
R8 = (R1 + R2) + (R3 + R4)
or, as instructions,
R5 = R1 + R2
R6 = R3 + R4
R8 = R5 + R6
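which can execute in two cycles instead of three (assuming one cycle per add), because the first two adds are independent of each other and can run in parallel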
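The same reassociation can be sketched at the source level. This is only an illustration: the function names are made up for this card, and unsigned values are used so that both orderings always give the same result (wrapping arithmetic is associative).

```c
/* Serial form: every add needs the result of the previous add,
   so the three adds form one dependence chain. */
unsigned sum_chain(unsigned r1, unsigned r2, unsigned r3, unsigned r4) {
    unsigned r8 = r1 + r2;
    r8 = r8 + r3;
    r8 = r8 + r4;
    return r8;
}

/* Balanced form: r5 and r6 do not depend on each other,
   so the longest chain is two adds instead of three. */
unsigned sum_tree(unsigned r1, unsigned r2, unsigned r3, unsigned r4) {
    unsigned r5 = r1 + r2;
    unsigned r6 = r3 + r4;
    return r5 + r6;
}
```

Note that floating-point addition is not exactly associative, so a compiler generally only reassociates floating-point sums when explicitly allowed to.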
Instruction Scheduling
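Reordering instructions so that independent instructions fill the slots where the processor would otherwise stall. Moving an instruction can require small fixes to keep the program correct; for example, if an instruction is moved past a pointer update, its address offset may need adjusting to compensate.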
How does If-Conversion help Scheduling?
Without if-conversion, the scheduler can only move instructions within a single block: the code before the branch, the code after the branch, or one of the two branch paths. If-conversion replaces the branch with predicated instructions, so all of those blocks become one straight-line block and instructions can be moved anywhere within it.
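A rough source-level picture of if-conversion, as a sketch only. The function names are hypothetical, and whether a real compiler emits a conditional move or predicated instructions for the second version depends on the compiler and the target.

```c
/* Branching form: control flow splits into two separate blocks,
   and instructions cannot be moved freely between them. */
int abs_diff_branch(int a, int b) {
    int r;
    if (a > b) {
        r = a - b;
    } else {
        r = b - a;
    }
    return r;
}

/* If-converted form: both sides are computed in one straight-line
   block, then a predicate selects which result to keep. */
int abs_diff_select(int a, int b) {
    int p = a > b;      /* predicate */
    int t = a - b;      /* result of the "then" path */
    int f = b - a;      /* result of the "else" path */
    return p ? t : f;   /* select; no change in control flow is needed */
}
```

In the second version the three computations are independent of each other, so a scheduler is free to interleave them with surrounding code.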
Loop Unrolling
Doing more than one iteration’s worth of work per pass through the loop. Unrolling once means each pass does two iterations (one extra); unrolling twice means three, and so on.
This helps because the compiler can now schedule instructions across the whole unrolled body (two, three, or however many iterations at a time).
It also reduces “looping overhead”: the counter update and branch execute once per unrolled pass instead of once per original iteration.
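A minimal sketch of a loop unrolled once, written by hand in C. The function names are made up for this card. The second loop in the unrolled version is one common way to handle an iteration count that is not a multiple of two; it is an addition for illustration, not something these notes cover.

```c
#include <stddef.h>

/* Original loop: one element, one counter update, one branch per pass. */
long sum_rolled(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Unrolled once: two elements per pass, so the counter update and
   branch run about half as often. */
long sum_unrolled_once(const int *a, size_t n) {
    long s = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s += a[i];
        s += a[i + 1];
    }
    /* Cleanup loop: picks up the last element when n is odd. */
    for (; i < n; i++) {
        s += a[i];
    }
    return s;
}
```

Both functions return the same sum for any n, including 0 and odd values.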
Downside of Loop Unrolling
- Code bloat
- What if the number of iterations is unknown (as in a while loop)?
- What if the number of iterations is known (as in a for loop) but isn’t a multiple of the iterations per unrolled pass?
There are ways around these problems, but they are too advanced for this course
How does function call inlining 1) reduce the number of instructions and 2) improve CPI?
This eliminates the overheads of function calls, like
- Function call itself
- Function return
- Preparing parameters
This reduces the number of instructions!
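It also helps scheduling work better, in the same way as loop unrolling: the function body becomes part of the caller’s code, so the compiler can schedule the two together.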
This improves CPI!
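As a sketch, here is what inlining a small function looks like when done by hand. The names are hypothetical; in practice the compiler performs this transformation itself when it decides it is worthwhile.

```c
/* A small helper function. */
static int square(int x) {
    return x * x;
}

/* Before inlining: each use pays for parameter setup, a call, and a return. */
int sum_squares_call(int a, int b) {
    return square(a) + square(b);
}

/* After inlining: the helper's body is substituted at each call site,
   leaving only the two multiplies and the add to schedule. */
int sum_squares_inlined(int a, int b) {
    return a * a + b * b;
}
```

The inlined version has no call overhead, and its two multiplies are visible to the scheduler as independent instructions in the caller.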
Do you get a greater benefit from using function call inlining on small or large functions?
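Small! For a small function, the call overhead is a large fraction of the total work, so removing it matters more.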
What is the downside of function call inlining? How do we minimize this?
- Like loop unrolling, code bloat
Minimize by inlining only small functions