Branch prediction and superscalar Flashcards
How can extra transistors be used to speed up processors?
By putting multiple CPU cores on a chip
What is symmetric multiprocessing?
All CPU cores on a chip are equal: they share the same main memory, and the operating system can schedule any task on any core.
How many processor cores and threads does Skylake have?
The largest Skylake server chips (Skylake-SP Xeon) have up to 28 processor cores and 56 threads
What is the branch delay slot?
MIPS32 has a feature called the branch delay slot: the instruction immediately after a branch is executed regardless of whether the branch is taken.
There are 2 things that can be placed in this slot
- An instruction that has to be executed anyway, regardless of the branch outcome
- A nop instruction.
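The delay slot behaviour can be illustrated with a minimal sketch of a toy interpreter. The instruction names ("add", "beqz", "nop"), the tuple format, and the function `run` are assumptions made up for this illustration; this is not real MIPS encoding.

```python
def run(program):
    """Run a toy program that has MIPS-style branch delay slots.

    Hypothetical instruction format (not real MIPS):
      ("add", reg, value)    adds value to reg (registers start at 0)
      ("beqz", reg, target)  branches to index target if reg is 0
      ("nop",)               does nothing
    """
    regs = {}
    trace = []             # indices of the instructions actually executed
    pc = 0
    pending_target = None  # target of a taken branch, applied after the delay slot
    while pc < len(program):
        instr = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction sits in the delay slot: it runs, then we jump.
            next_pc = pending_target
            pending_target = None
        if instr[0] == "add":
            regs[instr[1]] = regs.get(instr[1], 0) + instr[2]
        elif instr[0] == "beqz" and regs.get(instr[1], 0) == 0:
            pending_target = instr[2]
        pc = next_pc
    return regs, trace


program = [
    ("beqz", "t0", 3),   # 0: t0 is 0, so the branch is taken
    ("add", "t1", 1),    # 1: delay slot, executed anyway
    ("add", "t2", 100),  # 2: skipped by the branch
    ("add", "t3", 5),    # 3: branch target
]
regs, trace = run(program)
print(trace)  # [0, 1, 3]
print(regs)   # {'t1': 1, 't3': 5}
```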
What are the different ways to do branch prediction?
- Randomly decide whether the branch will be taken or not
- Predict the same outcome as the last time the branch was executed
- Give hints to the assembler, which can then mark the branch as likely to be taken or not.
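As a rough illustration of the "same as last time" strategy, the sketch below counts mispredictions for one branch. The function name `count_mispredictions`, the initial guess of "not taken", and the loop-shaped branch pattern are assumptions chosen for the example.

```python
def count_mispredictions(outcomes, initial_guess=False):
    """Predict each outcome as 'same as last time' and count the misses."""
    last = initial_guess
    misses = 0
    for taken in outcomes:
        if last != taken:
            misses += 1
        last = taken  # remember what actually happened
    return misses


# A loop branch: taken 9 times, then not taken once on exit; the loop runs 3 times.
loop_branch = ([True] * 9 + [False]) * 3
print(count_mispredictions(loop_branch))  # 6 misses out of 30 branches
```

With this pattern the predictor misses twice per loop: once on loop exit and once on the first iteration of the next run.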
What is refined branch prediction?
A state with 4 values, from 0 to 3, is maintained for the branch (a 2-bit saturating counter).
Every time the branch is taken, the state is incremented (it stays at 3 if it is already 3).
Every time the branch is not taken, the state is decremented (it stays at 0 if it is already 0).
If the state is 2 or 3, the processor predicts that the branch will be taken; if the state is 0 or 1, it predicts not taken.
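The 4-state scheme can be sketched as follows. The class name `TwoBitPredictor`, the helper `count_two_bit_misses`, the starting state of 0, and the loop-shaped branch pattern are assumptions for illustration only.

```python
class TwoBitPredictor:
    """4-state predictor: states 0 and 1 predict not taken, 2 and 3 predict taken."""

    def __init__(self, state=0):
        self.state = state  # always kept in the range 0..3

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)  # saturate at 3
        else:
            self.state = max(0, self.state - 1)  # saturate at 0


def count_two_bit_misses(outcomes):
    """Count how many outcomes the predictor gets wrong."""
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# A loop branch: taken 9 times, then not taken once on exit; the loop runs 3 times.
loop_branch = ([True] * 9 + [False]) * 3
print(count_two_bit_misses(loop_branch))  # 5 misses out of 30 branches
```

After the two warm-up misses, a single not-taken outcome at loop exit only moves the state from 3 to 2, so the next run of the loop is still predicted as taken.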
What is a scalar processor?
A scalar processor is a processor that can execute at most one instruction per cycle.
What is a superscalar processor?
A processor that can execute more than one instruction at a time.
A processor has different execution units such as an adder, a multiplier, an FPU, a load/store unit etc.
Every instruction uses only one of these units.
So 2 instructions that do not use the same unit can be executed simultaneously by the processor. Such processors are called superscalar processors.
How does a compiler help with a superscalar processor?
The compiler can help a superscalar processor by placing instructions that require different execution units one after the other, e.g. an integer operation following a floating point operation.
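A minimal sketch of why the ordering matters, assuming a toy in-order machine that issues up to 2 consecutive instructions per cycle, has exactly one unit of each kind, and ignores data dependencies. The function `cycles_needed` and the unit labels "int" and "fp" are made up for this example.

```python
def cycles_needed(units, width=2):
    """Count issue cycles for a list of instructions, given by the unit each needs.

    Toy model: instructions issue in program order, at most `width` per cycle,
    and two instructions that need the same unit cannot issue together.
    """
    cycles = 0
    i = 0
    while i < len(units):
        used = set()
        issued = 0
        # Keep issuing consecutive instructions until a unit conflict or the width limit.
        while i < len(units) and issued < width and units[i] not in used:
            used.add(units[i])
            i += 1
            issued += 1
        cycles += 1
    return cycles


print(cycles_needed(["int", "int", "fp", "fp"]))  # 3 cycles
print(cycles_needed(["int", "fp", "int", "fp"]))  # 2 cycles
```

The same four instructions finish issuing one cycle earlier when integer and floating point operations alternate.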
What is out-of-order execution, and how is it carried out in general?
When the processor tries to execute 2 instructions simultaneously and there is a hazard that cannot be resolved (structural or data), it can pick another instruction to execute instead. This is known as out-of-order execution, because the instructions are not executed in the order in which they appear in the program.
The general idea behind this is:
- Instructions are fetched a few at a time.
- Once they are decoded, they are sent to a queue; there is one queue for each execution unit.
- Once they are executed, the MEM and WB stages need to be done in the original program order.
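The steps above can be sketched with a toy model. It assumes one queue per execution unit, all units starting at time 0, no data dependencies, and made-up latencies; the function `simulate` and the instruction tuples are hypothetical.

```python
from collections import defaultdict


def simulate(program):
    """program: list of (name, unit, latency) tuples in program order.

    Returns (completion_order, commit_order), where commit_order is a list of
    (name, time) pairs in program order.
    """
    # Dispatch: after decode, each instruction goes to the queue of its unit.
    queues = defaultdict(list)
    for index, (name, unit, latency) in enumerate(program):
        queues[unit].append((index, latency))

    # Execute: each unit works through its own queue independently.
    finish_time = {}
    for unit, queue in queues.items():
        clock = 0
        for index, latency in queue:
            clock += latency
            finish_time[index] = clock

    by_finish = sorted(finish_time, key=lambda i: (finish_time[i], i))
    completion_order = [program[i][0] for i in by_finish]

    # Commit: results are written back strictly in program order, so an
    # instruction waits for every earlier instruction to finish first.
    commit_order = []
    commit_time = 0
    for index, (name, unit, latency) in enumerate(program):
        commit_time = max(commit_time, finish_time[index])
        commit_order.append((name, commit_time))
    return completion_order, commit_order


program = [("load", "mem", 3), ("add", "alu", 1), ("mul", "mul", 2)]
completion, commit = simulate(program)
print(completion)  # ['add', 'mul', 'load']
print(commit)      # [('load', 3), ('add', 3), ('mul', 3)]
```

The add and mul finish before the slow load, but their results are only committed once the load, which comes first in the program, has finished.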
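What is hyper-threading?
Instructions from 2 different threads have few dependencies on each other, so one processor core can execute instructions from 2 threads or processes at the same time and keep more of its execution units busy. This is called hyper-threading (Intel's name for simultaneous multithreading). It’s like having 2 virtual cores.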