Superscalar, VLIW, OoO Flashcards

Question

What is branch prediction (BTP: branch target prediction)?

Answer 1

Predicts whether a branch will be taken. It is a hardware structure that uses previous branch history.

Answer 2

2-bit counter, 2-level adaptive predictor

Answer 3

Local branch predictor: only considers the current branch. Each branch has its own 2-bit saturating counter. 2 bits give 4 states (strongly not taken, weakly not taken, weakly taken, strongly taken).
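
A minimal sketch of the local predictor described above. The class names, the table size, the initial counter state, and the indexing by low PC bits (assuming 4-byte instructions) are illustrative assumptions, not taken from the flashcard.

```python
class TwoBitCounter:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state=1):
        self.state = state  # assumed start: weakly not taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


class LocalPredictor:
    """One counter per table slot, selected by the branch address only."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.table = [TwoBitCounter() for _ in range(1 << index_bits)]

    def _index(self, pc):
        return (pc >> 2) & self.mask  # assumes 4-byte aligned instructions

    def predict(self, pc):
        return self.table[self._index(pc)].predict()

    def update(self, pc, taken):
        self.table[self._index(pc)].update(taken)
```

Because the counter needs two wrong outcomes in a row to flip from a strong state, a single loop exit does not disturb the prediction for the next loop entry.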

Answer 4

Looks at global history to find correlation between branches. Keeps an n-bit branch history register. The outcome of each branch that passes is recorded as either a 1 or a 0 (taken or not taken). How far back the history reaches depends on how many bits the branch history register has. The value in the branch history register is used to index into a pattern history table. Each entry in the pattern history table holds a 2-bit saturating counter.
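
The branch history register and pattern history table described above could be sketched roughly as follows. The class name, the default history length, and the initial counter value are assumptions made for illustration.

```python
class GlobalTwoLevelPredictor:
    """Branch history register (BHR) indexing a pattern history table (PHT)."""

    def __init__(self, history_bits=4):
        self.history_bits = history_bits
        self.bhr = 0  # last `history_bits` outcomes, newest in the low bit
        self.pht = [1] * (1 << history_bits)  # 2-bit counters, assumed weakly not taken

    def predict(self):
        return self.pht[self.bhr] >= 2

    def update(self, taken):
        # Train the counter selected by the current history ...
        counter = self.pht[self.bhr]
        self.pht[self.bhr] = min(3, counter + 1) if taken else max(0, counter - 1)
        # ... then shift the new outcome into the history register.
        mask = (1 << self.history_bits) - 1
        self.bhr = ((self.bhr << 1) | int(taken)) & mask
```

After a short warm-up, a strictly alternating taken/not-taken pattern is predicted correctly, because each history value ends up with its own trained counter.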

Answer 5

Combines local and two-level predictors. Keeps an n-bit branch history. Takes the XOR of the branch history and an n-bit hash of the current branch address. The result is used to index into a table of saturating counters. Local component: the branch address. Global component: the branch history.
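
A small sketch of the XOR indexing described above. The hash used here (low PC bits above the byte offset) and the class and parameter names are assumptions chosen for illustration. Real designs may hash the address differently.

```python
class XorIndexedPredictor:
    """Indexes one counter table with (hash of branch address) XOR (branch history)."""

    def __init__(self, bits=8):
        self.mask = (1 << bits) - 1
        self.bhr = 0  # n-bit global branch history
        self.counters = [1] * (1 << bits)  # 2-bit counters, assumed weakly not taken

    def _index(self, pc):
        pc_hash = (pc >> 2) & self.mask  # hypothetical n-bit hash of the address
        return pc_hash ^ self.bhr

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        c = self.counters[i]
        self.counters[i] = min(3, c + 1) if taken else max(0, c - 1)
        self.bhr = ((self.bhr << 1) | int(taken)) & self.mask
```

The XOR lets the same branch use different counters under different histories while keeping a single table.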

Answer 6

If we want to increase the number of bits in the branch history register, we also need to increase the number of bits in the hash, and therefore the size of the pattern history table (the table doubles for every added bit). It also takes time to fill the bits of the branch history register.

Answer 7

Use more than one type of predictor, e.g. one global and one local predictor. A selector learns over time whether the local or the global predictor gives the best predictions. The selector is indexed by the branch address. It provides a control signal to a mux that decides whether the local or the global prediction should be used.
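
The selector described above can be sketched as a table of 2-bit counters. The encoding (low values prefer local, high values prefer global), the initial preference, and all names are assumptions. The two component predictions are passed in so the sketch stays independent of how they are produced.

```python
class TournamentSelector:
    """Per-branch chooser between a local and a global prediction."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Assumed encoding: 0-1 select local, 2-3 select global.
        self.choice = [1] * (1 << index_bits)

    def _index(self, pc):
        return (pc >> 2) & self.mask  # assumes 4-byte aligned instructions

    def select(self, pc, local_pred, global_pred):
        # Plays the role of the mux: the counter is the control signal.
        return global_pred if self.choice[self._index(pc)] >= 2 else local_pred

    def update(self, pc, local_pred, global_pred, taken):
        if local_pred == global_pred:
            return  # both right or both wrong: nothing to learn
        i = self._index(pc)
        if global_pred == taken:
            self.choice[i] = min(3, self.choice[i] + 1)
        else:
            self.choice[i] = max(0, self.choice[i] - 1)
```

Training only when the two predictors disagree keeps the selector from drifting on branches where the choice does not matter.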

Answer 8

Implies huge tables. History takes a long time to learn.

Answer 9

Tries to solve the problem of having large tables. Instead, it hashes into the tables and uses the prediction based on the longest matching history. There is a local predictor with 3-bit saturating counters, plus additional history-based predictors. The history length increases from one history-based table to the next. These tables do not store all of the bits, or else they would be very large. First, we take the prediction from the local predictor. Then we check whether we have a hit in the table with the shortest history. If so, we choose this prediction instead. Then we check all the other tables, and in the end choose the prediction from the hitting table with the longest history.
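
A simplified sketch of the lookup order described above: start from the base prediction and let each hitting table with a longer history override it. The hash functions, tag widths, the 3-bit counter threshold, and the dict standing in for a fixed-size array are all assumptions. Allocation and replacement policy are left out.

```python
def fold(value, bits):
    """XOR-fold an integer down to `bits` bits (hypothetical hash)."""
    mask = (1 << bits) - 1
    out = 0
    while value:
        out ^= value & mask
        value >>= bits
    return out


class TaggedTable:
    """History-based table that stores only a short tag, not the full history."""

    def __init__(self, history_len, index_bits=4, tag_bits=6):
        self.history_len = history_len
        self.index_bits = index_bits
        self.tag_bits = tag_bits
        self.entries = {}  # index -> (tag, 3-bit counter)

    def _key(self, pc, history):
        h = history & ((1 << self.history_len) - 1)  # keep this table's history length
        index = fold((pc >> 2) ^ h, self.index_bits)
        tag = fold((pc >> 2) ^ (h << 1), self.tag_bits)
        return index, tag

    def lookup(self, pc, history):
        index, tag = self._key(pc, history)
        entry = self.entries.get(index)
        if entry is not None and entry[0] == tag:
            return entry[1] >= 4  # assumed: upper half of the 3-bit range means taken
        return None  # miss

    def install(self, pc, history, taken):
        index, tag = self._key(pc, history)
        self.entries[index] = (tag, 4 if taken else 3)


def predict(base_prediction, tables, pc, history):
    """Return the prediction of the hitting table with the longest history."""
    prediction = base_prediction
    for table in sorted(tables, key=lambda t: t.history_len):
        hit = table.lookup(pc, history)
        if hit is not None:
            prediction = hit
    return prediction
```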

Answer 10

In parallel with branch prediction, we need to predict the target of the branch. The BTB contains the addresses of predicted targets. The PC of the current branch is used to look up the BTB: we check whether the first field of a BTB entry matches the address of the current branch. In case of a hit, the second field contains the predicted target address. The BTB functions as a cache for target addresses. The first time a branch address is looked up in the table, there will be a miss. In this case the target address is calculated in the pipeline and then stored in the BTB. This way, the next time the branch address is looked up, we can retrieve the target address without needing to do a calculation.
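
The miss-then-fill behaviour described above can be sketched as a small cache. This sketch assumes a direct-mapped organisation indexed by low PC bits, which is one possible design. The names and sizes are illustrative.

```python
class BranchTargetBuffer:
    """Cache of (branch address, predicted target) pairs."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.entries = [None] * (1 << index_bits)

    def _index(self, pc):
        return (pc >> 2) & self.mask  # assumes 4-byte aligned instructions

    def lookup(self, pc):
        entry = self.entries[self._index(pc)]
        # First field: branch address (acts as the tag). Second field: target.
        if entry is not None and entry[0] == pc:
            return entry[1]
        return None  # miss: the pipeline has to compute the target

    def insert(self, pc, target):
        # Called once the pipeline has computed the real target.
        self.entries[self._index(pc)] = (pc, target)
```

Comparing the stored branch address on lookup prevents a different branch that maps to the same slot from reusing the wrong target.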

Answer 11

A function (a branch target) can be called from multiple addresses in a program. This makes it difficult to predict where the program should return to. The RAS is a small stack that pushes the address of the next instruction (PC + 4) on a call, and pops it from the stack on return.
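
A minimal sketch of the push-on-call, pop-on-return behaviour described above. The fixed depth and the policy of dropping the oldest entry on overflow are assumptions, as are the method names.

```python
class ReturnAddressStack:
    """Small hardware-style stack of predicted return addresses."""

    def __init__(self, depth=8):
        self.depth = depth
        self.stack = []

    def on_call(self, call_pc):
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # assumed overflow policy: forget the oldest entry
        self.stack.append(call_pc + 4)  # address of the instruction after the call

    def on_return(self):
        # Predicted return address, or None if the stack is empty.
        return self.stack.pop() if self.stack else None
```

Nested calls unwind in the right order because the most recent call site is always on top.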

Answer 12

Instructions on the wrong execution path need to be flushed from the ROB.

Answer 13

If the same memory location is read or written by different instructions. These instructions can use different registers to calculate the target memory address and still end up with the same memory address.

Answer 14

WAW and WAR hazards are eliminated because Tomasulo's algorithm ensures that memory is updated in order. For loads: do not allow a load to access memory if any older ROB entry occupied by a store has a destination field that matches the value of the address field of the load. Maintain program order for the computation of a load's effective address with respect to all earlier stores. This ensures that a load cannot access a memory location written to by an earlier store until that write has finished.
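
The load check described above can be sketched as a scan over the older ROB entries. The `RobEntry` fields and the function name are hypothetical. Treating a store whose address is not yet known as a possible conflict is a conservative assumption added here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    is_store: bool
    address: Optional[int] = None  # None models an address not yet computed


def load_may_issue(rob, load_index):
    """rob is ordered oldest first; entries leave the list when they commit."""
    load_addr = rob[load_index].address
    for entry in rob[:load_index]:  # only older instructions matter
        if not entry.is_store:
            continue
        # Block on a matching address, or (assumption) on an unknown one.
        if entry.address is None or entry.address == load_addr:
            return False
    return True
```

Once the conflicting store commits and leaves the ROB, the same check lets the load proceed.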

Answer 15

Execution has finished. The result is written on the Common Data Bus, which instructions in the issue buffers watch to see if their operands are ready. The reservation station is marked as available.

Answer 16

Update registers with the results from the reorder buffer. This happens when the instruction is at the head of the ROB (it is the oldest instruction). The result can also be written back to memory. The instruction is then removed from the ROB.
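
A rough sketch of in-order commit from the head of the ROB as described above. The entry layout, the method names, and the dict used as a register file are assumptions. Memory writes for stores are left out.

```python
from collections import deque


class ReorderBuffer:
    """Instructions enter at the tail and commit from the head, in program order."""

    def __init__(self):
        self.entries = deque()  # left end is the head (oldest instruction)

    def allocate(self, dest_reg):
        entry = {"dest": dest_reg, "value": None, "done": False}
        self.entries.append(entry)
        return entry

    def commit(self, regfile):
        """Retire finished instructions from the head; return how many retired."""
        committed = 0
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            regfile[entry["dest"]] = entry["value"]  # architectural state is updated here
            committed += 1
        return committed
```

A younger instruction that finishes early waits in the buffer until every older instruction has committed.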