final Flashcards

Question 1

Q

Fixed-overhead encoding

Answer

A

[mask][instruction list]
Mask is 8 bits, where 1 represents active instruction and 0 represents a nop
[10100111][A][B][C][D][E] => [A][nop][B][nop][nop][C][D][E]
+simple, but (-) has higher overhead

Question 2

Q

Templated-based encoding

Answer

A

○ Have templates to represent common patterns
○ (+) Relatively low overhead
○ (+) no penalty for short instruction
○ (-) limited set of templates, so still need some nops
○ Example:
§ [temp0][A][B][C]
§ Temp0 = (0->0). (1->2). (2->5). Chain.
§ Expands to:
§ [A][nop][B][nop][nop][C]
§ “Map instruction 0 to location 0”

Question 3

Q

How does VLIW handle branching?

Answer

A

Two-Step Branching -
○ Separate comparison from branch instruction (comparison results are stored in a branch-condition register)
○ Between compare and branch, can fill with other instructions
○ Branch instruction requires 2 delays slots

Question 4

Q

how does VLIW handle predication?

Answer

A

Partial predication:
○ Execute all individual operations on both sides of the branch
○ Select the final based on a condition
○ Slct R3 = B1, R4, R3
If B1, then R3 = R4, otherwise R3 = R3

Question 5

Q

How does VLIW handle speculation?

Answer

A

○ Control speculation: move instruction out of branch condition
Data speculation: check if LD/ST are to the same address after loading and storing

Question 6

Q

Compiler front-end steps

Answer

A

Scanner - reads tokens
- Parser - gets tokens from scanner and checks syntax (tokens are in right place)
Semantic Analysis - does the sentence make sense? (“aaa” + 10)

Question 7

Q

register allocation steps

Answer

A

Web: Determine live ranges for each value
- Interference: Determine overlapping ranges => graph
- Spill cost (how many definitions/uses exist in that range): Compute the benefit of keeping each web in a register (vs memory)
- Allocation: Decide which webs get registers
- Spilling and splitting: split webs if needed
- Assignment: assign actual registers to webs
  Code generation

Question 8

Q

webs

Answer

A

○ All reachable uses and definitions are placed in the same web (i.e. if a def/use case for that variable is not reachable by the web, it is placed in a different web)

Question 9

Q

interference graph

Answer

A

○ Each node in the graph represents a variable/web
○ Each edge represents two variables that are live at the same time (i.e. A-B means that A and B are live at the same time)

Question 10

Q

graph coloring heuristic

Answer

A

Remove nodes that have degree < N and push onto a stack
For remaining nodes (degree >= N): repeatedly remove and push onto stack the node with the least spill cost
○ ==> result is a stack with the nodes ordered from highest to lowest spill cost to nodes with deg < N
○ This enables us to prioritize coloring nodes with the highest spill code first
To color:
- Pop a node off the top of the stack and color it differently than its neighbors that are already colored
- If no color available, node is assigned to memory

Question 11

Q

choosing which web to split

Answer

A

○ Has interference degree >= N

○ Has minimal spill cost

Question 12

Q

register coalescing

Answer

A

if r0 and r1 do not interfere, combine their webs and assign them one register
(-) might create a web that is not colorable

Question 13

Q

register targeting

Answer

A

○ Some variables need to be in specific registers at a certain time (args to function, return value, etc)
○ Pre-color those webs to eliminate copy instructions

Question 14

Q

Presplitting of webs

Answer

A

○ Some variables have large “dead” regions (i.e. regions between def and use where the variable is not used)
○ ==> Can presplit the web into two live ranges

Question 15

Q

Interprocedural register allocation

Answer

A

○ Store registers across procedural boundaries is expensive
○ Can customize procedural call by cloning the function to save loading and storing a register
§ Each clone function is only available for that call site

Question 16

Q

what is differential register allocation?

Answer

A

Instead of using instruction bits to id a register, use less bits to id the difference between the last and next regsister (i.e. r1, r2 => 1, r2, r5 => 3)

Question 17

Q

adjacency graph

Answer

A

used in differential register allocation. captures the order in which registers are accessed, An edge from r1 -> r2 with weight 2, means that r2 is accessed after r1 2 times

Question 18

Q

adjacency graph: Edge coverage

Answer

A

if an edge in the adjacency graph can be encoded with the given number of bits

Question 19

Q

adjacency graph: cost

Answer

A

the sums of the weights of the uncovered edges

Question 20

Q

differential remapping algorithm

Answer

A

○ Start with initial register allocation and compute cost
○ Iteratively swap a pair of registers and recompute the cost
§ Make sure to do all possible swappings
§ Pick the new remapping with the lowest cost
Repeat step 2 until there is no more cost reduction

Question 21

Q

what is the Simple Offset Problem?

Answer

A

How to most efficiently assign variables on the stack using auto-increment/auto-decrement modes
- Assumes one address register

Question 22

Q

access graph

Answer

A

used for simple offset problem. Undirected, weighted graph. Each node is a variable. Each edge represents an access a–(3)–b (a and b are accessed after one another 3 times, direction a/b b/a doesnt matter)

Question 23

Q

access graph: weight

Answer

A

sum of all the edges

Question 24

Q

access graph: cover

Answer

A

all nodes are connected, but no cycles, each node can have at most 2 neighbors

Question 25

Q

access graph: cover cost

Answer

A

weight - cost of cover

Question 26

Q

General Offset Assignment Problem:

Answer

A

Generalization of the simple offset problem
○ Have multiple address registers
Given N address registers, split the access graph into N covers

Question 27

Q

Relax Fixed Evaluation order

Answer

A

SOA and GOA assume a fixed-evaluation order => can we transform the access sequence to improve our assignments by swapping commutative and associative ops?

Question 28

Q

Least Cost Access Sequence (LCAS):

Answer

A

Must keep applying commutative and associative transforms on the access sequence until you get the least cost access sequence

Question 29

Q

pruning

Answer

A

used for finding the least cost access sequence: trying all possible combos is exhausting:
○ Start with first statement, generate all possible swaps
○ Prune the ones with max cost
○ Add second statement to subset
○ Prune the ones with max cost
Repeat

Question 30

Q

graph model for LD/ST parallelization

Answer

A

Each load/store is a node
- Edge between nodes means they can be merged into a parallel LD/ST instruction
  ○ They can be merged if their operands are in different memories
- Pick maximum number of edges that are disjoint (i.e. not sharing a node)

Question 31

Q

Moveable Boundary Problem

Answer

A

How can we rearrange the program order to get more parallelized LD/STs?

Question 32

Q

Pseudo-fixed boundary

Answer

A

○ Move stores as early as possible (assume other instructions are fixed): STORES CAN ONLY BE MOVED UP
Move loads as late as possible (assume other instructions are fixed): LOADS CAN ONLY BE MOVED DOWN

Question 33

Q

what is variable duplication?

Answer

A

copy a variable from one bank to another to improve parallelization

Question 34

Q

what is rematerialization?

Answer

A

recompute the value that you want duplicated and then store it into the other memory bank (instead of ld/st-ing)

Question 35

Q

What is the Dual-Bank Register Assignment Problem?

Answer

A

both operands must come from different banks, but this can lead to problems with other instructions with the same constraints

Question 36

Q

Dual-Bank Register Assignment is for what kind of instruction

Question 37

Q

Register Conflict Graph

Answer

A

For bank conflicts
- Conflict edge -
○ Edge between two registers indicating that they must be in different registers (i.e. both registers appear as source registers in an ALU instruction)
- Edge must exist in interference graph

Question 38

Q

Register conflict subgraph

Answer

A

Subgraph of the interference graph that only shows the conflict edges

Question 39

Q

what is a Bipartite Graph?

Answer

A

Graph that can be divided into two sets such that:

○ There is no edge between nodes in a single set

Question 40

Q

what is the No conflict law?

Answer

A

A RCG is conflictless (and thus can be allocated) if it is bipartite and A graph is bipartite if there are no odd cycles in RCG

Question 41

Q

how to detect odd cycles

Answer

A

Breadth-First Hierarchy:
○ Node level: the shortest path length from a node to the root
○ Parallel edge - if two nodes have the same node level and there is an edge between them, then this indicates an odd cycle
○ Can detect cycles of length 2k+1, where k is the node level

Question 42

Q

Live Range Splitting

Answer

A

method to split odd cycles: Split a live range into multiple variables to remove an edge from the graph