Chapter 4 - The Processor Flashcards

1
Q

What factors determine CPU performance?

A

Instruction count, cycles per instruction (CPI), and clock cycle time. CPU time is the product of the three.
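
As a rough illustration of how the three factors combine, the sketch below multiplies them to get CPU time. The function name and the sample numbers are made up for this example.

```python
def cpu_time_seconds(instruction_count, cpi, cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s

# Hypothetical program: 1 million instructions, CPI of 2, 1 ns clock (1 GHz).
baseline = cpu_time_seconds(1_000_000, 2.0, 1e-9)

# Halving the CPI (for example through a better pipeline) halves the CPU time.
improved = cpu_time_seconds(1_000_000, 1.0, 1e-9)
print(baseline, improved)
```

Improving any one factor helps only if the other two do not get worse by the same amount.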

2
Q

Explain the function of combinational and sequential elements in a processor.

A

Combinational Elements: These operate on data, and the output is a function only of the current inputs (there is no stored state). Examples include:

AND Gate: Y = A & B
Adder: Y = A + B
Multiplexer: Y = S ? I1 : I0
ALU (Arithmetic/Logic Unit): Y = F(A, B), where F is the function performed (e.g., add, subtract).

Sequential Elements: These store information and update their output based on clock signals. Examples include:
Register: Stores data and updates the value on the clock edge.
Register with Write Control: Updates only on the clock edge when the write control input is 1.
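
A minimal software sketch of the difference, assuming simple integer values: the combinational elements are pure functions, while the register keeps state that changes only on a simulated clock edge. The names `mux`, `alu`, `Register`, and `clock_edge` are illustrative, not taken from any hardware library.

```python
def mux(s, i0, i1):
    """Multiplexer: Y = S ? I1 : I0."""
    return i1 if s else i0

def alu(op, a, b):
    """ALU: Y = F(A, B) for a few example functions."""
    ops = {"add": a + b, "sub": a - b, "and": a & b, "or": a | b}
    return ops[op]

class Register:
    """Register with write control: state changes only on a clock edge."""
    def __init__(self, value=0):
        self.value = value

    def clock_edge(self, d, write=1):
        # The stored value is updated only when the write control is 1.
        if write:
            self.value = d
        return self.value

r = Register()
r.clock_edge(alu("add", 2, 3))   # r.value becomes 5
r.clock_edge(99, write=0)        # write disabled: r.value stays 5
```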

3
Q

What are the components of a basic MIPS datapath?

A

Instruction Memory: For fetching instructions.
Register File: For reading and writing register values.
ALU: For performing arithmetic and logical operations.
Data Memory: For load and store operations.
Multiplexers: For selecting inputs based on the instruction type.
Control Unit: For generating control signals based on the instruction type.

4
Q

Explain the concept of pipelining in processors.

A

Pipelining improves processor performance by overlapping the execution of multiple instructions. Each instruction is divided into stages, and different stages of multiple instructions are processed simultaneously. The stages typically include:

Instruction Fetch (IF)
Instruction Decode (ID)
Execute (EX)
Memory Access (MEM)
Write Back (WB)
Pipelining increases instruction throughput but can introduce hazards that need to be managed.
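
To see where the throughput gain comes from, the sketch below counts cycles for an ideal five-stage pipeline with no hazards and compares it with running each instruction to completion before starting the next. It assumes every stage takes one cycle; the function names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipelined_cycles(n, k=len(STAGES)):
    """Cycles to finish n instructions when a new one starts every cycle."""
    return k + (n - 1) if n > 0 else 0

def sequential_cycles(n, k=len(STAGES)):
    """Cycles if each instruction finished all k stages before the next began."""
    return n * k

for n in (1, 5, 1000):
    print(n, sequential_cycles(n), pipelined_cycles(n))
```

For a single instruction there is no gain; as the number of instructions grows, the ratio approaches the number of stages. Hazards reduce this ideal speedup.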

5
Q

What are the types of hazards in pipelining, and how can they be mitigated?

A

Structural Hazards: Occur when two instructions require the same hardware resource simultaneously. Mitigated by duplicating resources or using separate instruction and data memories.

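Data Hazards: Occur when instructions depend on the results of previous instructions that are still in the pipeline. Mitigated by forwarding (bypassing) data from one pipeline stage to another and, where forwarding is not enough, by stalling or by reordering code.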

Control Hazards: Occur due to branch instructions that affect the flow of control. Mitigated by branch prediction, stalling, or using techniques like delayed branching.

6
Q

What are the techniques used for branch prediction in pipelined processors?

A

Static Prediction: Based on typical branch behavior, such as predicting backward branches (loops) as taken and forward branches (if statements) as not taken.
Dynamic Prediction: Based on the actual runtime behavior of branches, using hardware to track the history of branch outcomes and predict future behavior based on trends.
These techniques aim to minimize the performance penalty of control hazards by guessing the branch outcome.

7
Q

Explain the necessity of pipeline registers between stages in a pipelined datapath.

A

Pipeline registers are essential in a pipelined datapath because they hold the intermediate data and control information between the stages of the pipeline. Without these registers, it would be impossible to maintain the flow of multiple instructions simultaneously, as the data required for each stage would not be preserved across clock cycles. These registers ensure that each stage of the pipeline can operate independently and concurrently, thus improving the overall throughput of the processor. For example, in the MIPS pipeline, registers are placed between the Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Memory Access (MEM), and Write Back (WB) stages to hold relevant data such as instruction codes, operand values, and computed addresses.

8
Q

What are data hazards in pipelined processors and how can they be resolved using forwarding?

A

A data hazard arises if an instruction depends on the result of a previous instruction that has not yet completed its execution. There are three types of data hazards:

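RAW (Read After Write): A later instruction tries to read a register before an earlier instruction has written it.
WAR (Write After Read): A later instruction writes to a register before an earlier instruction has read it.
WAW (Write After Write): Two instructions write to the same register, and the writes could complete out of order.
In the classic five-stage, in-order MIPS pipeline only RAW hazards arise, because registers are read in ID and written in WB in program order.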

Forwarding (or data bypassing) resolves RAW hazards by routing the output of an instruction directly to the input that needs it, without waiting for it to be written back to the register file.

8
Q

Describe the cycle-by-cycle flow of instructions through the pipelined datapath and explain the concept of a “single-clock-cycle” pipeline diagram.

A

In each clock cycle, every instruction in the pipeline advances one stage and a new instruction enters the Fetch stage. Multiple instructions are therefore processed simultaneously in different stages, which increases instruction throughput.

A “single-clock-cycle” pipeline diagram shows the state of the pipeline in one clock cycle, indicating which resources are being used by which instruction at that moment. For instance, if we have three instructions (I1, I2, I3) in different stages, the diagram might show I1 in the Execute stage, I2 in the Decode stage, and I3 in the Fetch stage.
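
The I1/I2/I3 example can be reproduced with a small sketch that takes a snapshot of an ideal five-stage pipeline in one cycle. It assumes no stalls and that instruction i (counting from 0) enters IF in cycle i + 1; the `snapshot` helper is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def snapshot(instructions, cycle):
    """Map each occupied stage to the instruction in it during a 1-based cycle."""
    occupied = {}
    for i, name in enumerate(instructions):
        pos = cycle - 1 - i  # instruction i enters IF in cycle i + 1
        if 0 <= pos < len(STAGES):
            occupied[STAGES[pos]] = name
    return occupied

print(snapshot(["I1", "I2", "I3"], 3))
# prints {'EX': 'I1', 'ID': 'I2', 'IF': 'I3'}
```

Calling `snapshot` for every cycle in turn gives the multiple-clock-cycle view of the same program.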

9
Q

Explain the steps involved in executing a load instruction in a pipelined processor.

A

Instruction Fetch (IF): The load instruction is fetched from the instruction memory using the Program Counter (PC).
Instruction Decode (ID): The instruction is decoded to determine the source register and the offset for the address calculation. The base address is read from the register file.
Execute (EX): The effective address for the load is calculated by adding the base address and the offset.
Memory Access (MEM): The calculated address is used to access the data memory, and the data is read from the memory location.
Write Back (WB): The data read from memory is written back to the destination register in the register file.
This sequence ensures that the load instruction correctly retrieves data from memory and places it into the specified register.

10
Q

Describe how load-use data hazards are detected and managed.

A

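A load-use data hazard occurs when an instruction tries to use the data loaded by the immediately preceding load instruction before that data is available. To detect this hazard, the processor checks whether the instruction in the ID stage reads a register that a load currently in the EX stage is about to write.

When the hazard is detected, the pipeline stalls for one cycle by:

Inserting a bubble (nop) into the EX stage.
Preventing the update of the PC and the IF/ID register, so the using instruction is decoded again and the following instruction is fetched again.
After the one-cycle stall, the load has completed its memory access and its data can be forwarded correctly to the using instruction.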
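
The detection check can be sketched as a single boolean condition. The parameter names below mirror the usual textbook field names (ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, IF/ID.RegisterRt); the function itself is an illustrative assumption, with registers given as plain integers.

```python
def load_use_hazard(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID reads the register a load in EX will write."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)

# lw  $2, 20($1)   is in EX (MemRead = 1, destination rt = 2)
# and $4, $2, $5   is in ID (rs = 2, rt = 5)
stall = load_use_hazard(id_ex_mem_read=1, id_ex_rt=2, if_id_rs=2, if_id_rt=5)
print(stall)  # prints True
```

When the result is true, the control logic holds the PC and IF/ID and sends a bubble into EX.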

11
Q

Explain how branch hazards affect the pipeline and describe the role of dynamic branch prediction in mitigating branch penalties.

A

Branch hazards occur because the outcome of a branch instruction is not known until it is in the EX or MEM stage, leading to uncertainty about which instruction to fetch next. If the branch is taken, instructions that were fetched based on the assumption that the branch was not taken must be flushed from the pipeline, causing a stall and reducing performance.

Dynamic branch prediction mitigates branch penalties by predicting the outcome of branches based on historical data stored in a branch prediction buffer. This buffer is indexed by the addresses of recent branch instructions and stores whether the branches were taken or not. When a branch instruction is encountered, the processor checks the prediction buffer and speculatively fetches the next instruction based on the predicted outcome. If the prediction is correct, the pipeline continues without interruption. If the prediction is incorrect, the pipeline is flushed, and the correct instructions are fetched, updating the prediction buffer for future accuracy.

12
Q

Discuss the role and design of a branch target buffer in modern processors.

A

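A branch target buffer (BTB) is a cache of the target addresses of recently executed branches, indexed by the PC when an instruction is fetched. Without it, even a correctly predicted taken branch pays a one-cycle penalty while the target address is calculated. On a BTB hit for a branch that is predicted taken, the target instruction can be fetched immediately, which can reduce the penalty for correctly predicted branches to zero cycles.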
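
A minimal sketch of the lookup, assuming an unbounded table keyed by branch PC and 4-byte instructions. A real BTB is a small tagged cache with limited capacity; the class and function names here are hypothetical.

```python
class BranchTargetBuffer:
    """Toy BTB: maps the PC of a branch to its last known target address."""
    def __init__(self):
        self.targets = {}

    def lookup(self, pc):
        return self.targets.get(pc)  # None on a miss

    def update(self, pc, target):
        self.targets[pc] = target

def next_fetch_pc(btb, pc, predicted_taken):
    """Choose the next fetch address during IF."""
    target = btb.lookup(pc)
    if predicted_taken and target is not None:
        return target      # hit: fetch the target immediately
    return pc + 4          # miss or predicted not taken: fall through

btb = BranchTargetBuffer()
first = next_fetch_pc(btb, 0x40, True)    # miss, so fetch falls through
btb.update(0x40, 0x10)                    # branch resolved: remember its target
second = next_fetch_pc(btb, 0x40, True)   # hit, so fetch goes to the target
```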

12
Q

Explain the concept of an exception in a pipeline. How is it similar to a control hazard?

A

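An exception is an unexpected event that requires special handling and changes the normal flow of instructions.

This is similar to a control hazard because both disrupt the normal flow of instructions: the instructions already fetched after the offending instruction must be flushed, and fetching is redirected to a new address (the handler).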

13
Q

What are restartable exceptions?

A

With a restartable exception, the pipeline can flush the faulting instruction, let the handler run, and then return to that instruction, which is refetched and re-executed from scratch.

14
Q

In a pipelined processor, how are multiple exceptions handled, especially when they occur simultaneously?

A
  1. Deal with the exception from the earliest instruction (in program order) first.
  2. Flush the instructions that follow the one causing the earliest exception.
  3. Let earlier instructions complete, so that exceptions are precise.
15
Q

What is speculation and how does it handle exceptions?

A

Speculation guesses the outcome of certain operations, to allow execution to proceed without waiting for these operations to complete. If the guess is correct, the operation completes normally; if incorrect, the system re-executes the correct operations.

Static Speculation (Compiler-based): The compiler may include instructions to defer exceptions until it is certain that the speculative instruction’s results are needed.
Dynamic Speculation (Hardware-based): The hardware buffers any exception raised by a speculative instruction until that instruction completes. If the speculation turns out to be wrong, the instruction is flushed without committing its results and the buffered exception is discarded.

16
Q

Provide an example of loop unrolling in the context of ILP and explain its benefits.

A

Loop unrolling involves replicating the loop body multiple times to expose more parallelism and reduce loop-control overhead. For example, consider a loop that processes elements of an array:
Loop: (Original)
lw $t0, 0($s1)
addu $t0, $t0, $s2
sw $t0, 0($s1)
addi $s1, $s1, -4
bne $s1, $zero, Loop

Loop: (Unrolled four times; assumes the iteration count is a multiple of 4)
lw $t0, 0($s1)
lw $t1, -4($s1)
lw $t2, -8($s1)
lw $t3, -12($s1)
addu $t0, $t0, $s2
addu $t1, $t1, $s2
addu $t2, $t2, $s2
addu $t3, $t3, $s2
sw $t0, 0($s1)
sw $t1, -4($s1)
sw $t2, -8($s1)
sw $t3, -12($s1)
addi $s1, $s1, -16
bne $s1, $zero, Loop

Benefits:
1. Reduces loop-control instructions, increasing instruction throughput.
2. Exposes more parallelism by having more independent instructions that can be executed simultaneously.
3. Can improve pipeline efficiency and reduce stalls due to dependencies within the loop body.

16
Q

What does MIPS stand for?

A

Microprocessor without Interlocked Pipelined Stages

17
Q

Why use code scheduling?

A

To reorder code so that the result of a load is not used by the very next instruction, which avoids a load-use stall.

17
Q

What is stall on branch?

A

Wait until branch outcome determined before fetching next instruction

18
Q

Can longer pipelines readily determine branch outcome early?

A

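No. In longer pipelines the branch outcome is resolved later, so the stall penalty becomes unacceptable and the outcome has to be predicted instead.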

18
Q

When do you stall?

A

With branch prediction, only when the prediction is wrong.

19
Q

Why do we need registers between stages?

A

To hold information produced in previous cycle

20
Q

What is load-use hazard detection and what does it do?

A

It checks for a load-use hazard while the using instruction is being decoded in the ID stage. If one is detected, the pipeline stalls and a bubble is inserted.

21
Q

Why use stalls?

A

Stalls reduce performance but are required to get correct results

22
Q

How does a 2-bit predictor improve upon the accuracy of branch prediction compared to a 1-bit predictor? Illustrate your explanation with a state diagram.

A

A 2-bit predictor uses a 2-bit saturating counter to track the history of branch outcomes, providing four states: Strongly Taken, Weakly Taken, Weakly Not Taken, and Strongly Not Taken. The prediction changes only after two consecutive mispredictions, which improves accuracy in loop constructs: the single not-taken outcome at a loop exit does not flip the prediction for the next time the loop runs. The state diagram is as follows:

Strongly Taken (11) <-> Weakly Taken (10) <-> Weakly Not Taken (01) <-> Strongly Not Taken (00)

Transitions:
A not-taken outcome moves one state to the right, toward Strongly Not Taken (00), where the counter saturates.
A taken outcome moves one state to the left, toward Strongly Taken (11), where the counter saturates.
States 11 and 10 predict taken; states 01 and 00 predict not taken.
For example, starting from Strongly Taken (11), one misprediction moves to Weakly Taken (10), which still predicts taken. A second consecutive misprediction moves to Weakly Not Taken (01), and the prediction flips.
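
The state machine can be sketched as a saturating counter from 0 (Strongly Not Taken) to 3 (Strongly Taken). This is the common saturating-counter variant; the function names and the sample outcome sequence are assumptions for illustration.

```python
def predict(counter):
    """States 2 (10) and 3 (11) predict taken; 0 (00) and 1 (01) predict not taken."""
    return counter >= 2

def update(counter, taken):
    """Move one state toward 3 on taken, toward 0 on not taken, saturating at the ends."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)

def count_mispredictions(outcomes, counter=3):
    misses = 0
    for taken in outcomes:
        if predict(counter) != taken:
            misses += 1
        counter = update(counter, taken)
    return misses

# A loop branch that is taken three times and then falls through, run twice.
outcomes = [True, True, True, False] * 2
print(count_mispredictions(outcomes))  # prints 2
```

Only the loop exit is mispredicted each time, because one not-taken outcome moves the counter from 11 to 10 and the prediction stays "taken".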

22
Q

Explain the shortcomings of a 1-bit branch predictor when handling nested loop branches. Use the following pseudo-code as a reference:

outer:


inner:


beq …, …, inner

beq …, …, outer

A

The 1-bit predictor has a significant shortcoming with nested loops because it remembers only the last outcome of a branch. In the given code, the inner loop branch is taken on every iteration except the last. On the last iteration the predictor still predicts taken, so it mispredicts, and the stored bit flips to not taken. When the outer loop re-enters the inner loop, the branch on the first iteration is then mispredicted as not taken. This gives two mispredictions for every execution of the inner loop: one at the end (predicted taken but actually not taken) and one at the start of the next execution (predicted not taken but actually taken).
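
A small sketch that counts the mispredictions of a 1-bit predictor on an inner-loop branch. The outcome sequence (taken three times, then not taken, for two runs of the inner loop) and the initial prediction of "taken" are assumptions chosen for illustration.

```python
def one_bit_mispredictions(outcomes, last_taken=True):
    """A 1-bit predictor predicts whatever the branch did last time."""
    misses = 0
    for taken in outcomes:
        if last_taken != taken:
            misses += 1
        last_taken = taken  # remember only the most recent outcome
    return misses

# Inner-loop branch: taken three times, then not taken, for two runs of the inner loop.
outcomes = [True, True, True, False] * 2
print(one_bit_mispredictions(outcomes))  # prints 3
```

The first run mispredicts only the exit. The second run mispredicts both its first iteration and its exit, which is the steady-state cost of two mispredictions per run.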

23
Q

Differentiate between exceptions and interrupts. Provide examples for each and explain how they are handled differently in a processor.

A

Exception:

An exception is an unexpected event arising within the CPU, such as an undefined opcode, overflow, or syscall.
Example: An attempt to execute an undefined opcode triggers an exception.
Handling: In MIPS, exceptions are managed by the System Control Coprocessor (CP0). The processor saves the PC of the offending instruction in the Exception Program Counter (EPC) and the cause of the exception in the Cause register. The processor then jumps to a predefined handler address (e.g., 8000 0180).
Interrupt:

An interrupt is an unexpected event from an external I/O controller.
Example: An external device signaling the completion of an I/O operation triggers an interrupt.
Handling: The PC of the interrupted instruction is saved in the same way as for an exception. An alternative dispatch mechanism is vectored interrupts, where the handler address is determined by the cause rather than being a single fixed address. For instance, each cause has its own handler address (e.g., C000 0000 for undefined opcode, C000 0020 for overflow).

24
Q

Explain the mechanism of handling exceptions in MIPS architecture, focusing on the role of the System Control Coprocessor (CP0), the Exception Program Counter (EPC), and the Cause register.

A

In the MIPS architecture, exceptions are managed by the System Control Coprocessor (CP0). When an exception occurs:
The PC of the offending or interrupted instruction is saved in the Exception Program Counter (EPC).
The Cause register saves an indication of the problem (e.g., 0 for undefined opcode, 1 for overflow).
The processor then jumps to the exception handler located at a fixed address (e.g., 8000 0180). The handler can then take appropriate action based on the type of exception and resume normal execution.
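
The sequence above can be sketched as a tiny model of the two CP0 registers. The class, the constant names, and the sample PC value are hypothetical; the cause codes and handler address are the ones quoted in this card.

```python
HANDLER_ADDRESS = 0x8000_0180     # fixed handler address from the card
CAUSE_UNDEFINED_OPCODE = 0        # cause codes as given in the card
CAUSE_OVERFLOW = 1

class CP0:
    """Toy System Control Coprocessor holding only EPC and Cause."""
    def __init__(self):
        self.epc = 0
        self.cause = 0

def raise_exception(cp0, pc, cause):
    """Save the offending PC and the cause, then return the next fetch address."""
    cp0.epc = pc
    cp0.cause = cause
    return HANDLER_ADDRESS

cp0 = CP0()
next_pc = raise_exception(cp0, pc=0x0040_0024, cause=CAUSE_OVERFLOW)
print(hex(next_pc), hex(cp0.epc), cp0.cause)
```

The handler reads Cause to decide what to do and uses EPC to return to the program if the exception is restartable.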

25
Q

Describe vectored interrupts and their advantage over traditional interrupt handling mechanisms. Provide an example.

A

Vectored interrupts determine the handler address based on the cause of the interrupt, allowing for more efficient and faster response times. Instead of jumping to a single handler address and then determining the cause, the processor can directly jump to the appropriate handler.

Example:
Undefined opcode: handler address C000 0000.
Overflow: handler address C000 0020.

Advantage: This reduces the overhead and latency involved in handling interrupts, as the processor does not need to perform additional checks to determine the correct handler.
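
A minimal sketch of vectored dispatch, assuming a lookup table keyed by cause. The cause names are hypothetical labels; the two addresses are the ones listed in the example above.

```python
# Handler addresses per cause, taken from the example in this card.
VECTOR_TABLE = {
    "undefined_opcode": 0xC000_0000,
    "overflow": 0xC000_0020,
}

def handler_address(cause):
    """Vectored dispatch: the cause selects the handler address directly."""
    return VECTOR_TABLE[cause]

print(hex(handler_address("overflow")))  # prints 0xc0000020
```

With a single fixed handler, the equivalent of this table lookup happens in software after the jump, by inspecting the Cause register.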

26
Q

[IMP]
Explain the purpose of the forwarding unit in a processor pipeline. How does it help mitigate data hazards? Provide an example with the following instructions:
ADD R1, R2, R3
SUB R4, R1, R5

A

The forwarding unit helps mitigate data hazards by allowing the pipeline to use the result of an instruction before it has been written back to the register file. This bypassing technique reduces stalls and improves performance.
In the given example:
The ADD instruction writes the result to R1.
The SUB instruction needs the value of R1 in its EX stage, one cycle after ADD computes it.
Without forwarding, the SUB instruction would stall until R1 has been written to the register file.
With forwarding, the result of the ADD instruction is forwarded directly from the ALU output (held in the EX/MEM pipeline register) to the input of the ALU for the SUB instruction, preventing the stall.

27
Q

Describe the role of the hazard detection unit in a processor pipeline. What are its primary functions, and how does it handle data hazards?

A

The hazard detection unit’s primary role is to identify and manage data hazards in the pipeline to ensure correct instruction execution. Its functions include:
Detecting read-after-write (RAW) hazards that forwarding alone cannot resolve, such as a load-use hazard.
Stalling the pipeline by holding the PC and the IF/ID register.
Inserting a bubble (a NOP, or no operation) into the EX stage.

28
Q

Given the following MIPS instructions, calculate the data paths needed for the forwarding unit:
1. ADD R1, R2, R3
2. SUB R4, R1, R5
3. AND R6, R1, R7
4. OR R8, R4, R9

A

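Instruction 1: ADD R1, R2, R3
No forwarding needed as it is the first instruction.
Instruction 2: SUB R4, R1, R5
Forwarding needed from ADD’s ALU result to SUB’s ALU input for R1. ADD is one instruction ahead, so its result is in the EX/MEM register.
Forwarding Path: EX/MEM -> ALU input (EX stage)
Instruction 3: AND R6, R1, R7
Forwarding needed from ADD’s ALU result to AND’s ALU input for R1. ADD is two instructions ahead, so its result is in the MEM/WB register.
Forwarding Path: MEM/WB -> ALU input (EX stage)
Instruction 4: OR R8, R4, R9
Forwarding needed from SUB’s ALU result to OR’s ALU input for R4. SUB is two instructions ahead, so its result is in the MEM/WB register.
Forwarding Path: MEM/WB -> ALU input (EX stage)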
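
The paths can be checked with a sketch of the forwarding unit's comparison logic. It assumes R-type instructions only, no stalls, and that every instruction writes its destination register; the tuple encoding and the function names are illustrative assumptions.

```python
def forward_source(src, ex_mem, mem_wb):
    """Pick where an ALU operand comes from.

    ex_mem and mem_wb are (reg_write, rd) pairs for the two pipeline registers.
    The EX/MEM check comes first because it holds the more recent result.
    """
    ex_write, ex_rd = ex_mem
    mem_write, mem_rd = mem_wb
    if ex_write and ex_rd != 0 and ex_rd == src:
        return "EX/MEM"
    if mem_write and mem_rd != 0 and mem_rd == src:
        return "MEM/WB"
    return "register file"

def forwarding_plan(program):
    """program: list of (rd, rs, rt) register numbers, executed with no stalls."""
    plan = []
    for i, (_, rs, rt) in enumerate(program):
        # While instruction i is in EX, i-1 is in MEM and i-2 is in WB.
        ex_mem = (True, program[i - 1][0]) if i >= 1 else (False, 0)
        mem_wb = (True, program[i - 2][0]) if i >= 2 else (False, 0)
        plan.append((forward_source(rs, ex_mem, mem_wb),
                     forward_source(rt, ex_mem, mem_wb)))
    return plan

# ADD R1,R2,R3 / SUB R4,R1,R5 / AND R6,R1,R7 / OR R8,R4,R9
program = [(1, 2, 3), (4, 1, 5), (6, 1, 7), (8, 4, 9)]
for step in forwarding_plan(program):
    print(step)
```

The first element of each pair is the source of the first operand: the register file for ADD, EX/MEM for SUB, and MEM/WB for both AND and OR.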

28
Q

For the given instruction sequence, identify the stages where data hazards occur and explain how they are resolved:
1. LW R1, 0(R2)
2. ADD R3, R1, R4
3. SUB R5, R3, R6
4. AND R7, R5, R8

A

Instruction 1: LW R1, 0(R2)
Loads value into R1.
Instruction 2: ADD R3, R1, R4
Data hazard: Needs R1 in its EX stage, but the LW instruction only has the value at the end of its MEM stage (load-use hazard).
Resolution: Stall for one cycle, then forward the loaded value from the MEM/WB pipeline register to ADD’s ALU input.
Instruction 3: SUB R5, R3, R6
Data hazard: Needs R3 in its EX stage, which is computed by the ADD instruction.
Resolution: Forward from the EX/MEM pipeline register (ADD’s result) to SUB’s ALU input.
Instruction 4: AND R7, R5, R8
Data hazard: Needs R5 in its EX stage, which is computed by the SUB instruction.
Resolution: Forward from the EX/MEM pipeline register (SUB’s result) to AND’s ALU input.

29
Q

Explain how the hazard detection unit calculates when to insert stalls in the pipeline. Provide a specific example with calculations for the following instruction sequence:
1. LW R2, 0(R3)
2. ADD R4, R2, R5
3. SW R4, 4(R6)
4. SUB R7, R4, R8

A

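Instruction 1: LW R2, 0(R3)
Loads value into R2.
Instruction 2: ADD R4, R2, R5
Data hazard: Needs R2 which is loaded by the LW instruction.
Calculation:
While ADD is in ID and LW is in EX, the unit sees that LW reads memory and that its destination (R2) matches one of ADD’s source registers.
LW produces R2 at the end of the MEM stage.
ADD needs R2 at the beginning of its EX stage.
Insert one stall cycle, then forward R2 from the MEM/WB pipeline register to ADD’s ALU input. There is no need to wait for the register file write.
Instruction 3: SW R4, 4(R6)
Needs R4, the value to be stored, which is produced by the ADD instruction.
Forwarding from the EX/MEM pipeline register (ADD’s result) while SW is in its EX stage, so that the value is carried on to SW’s MEM stage.
Instruction 4: SUB R7, R4, R8
Needs R4 which is produced by the ADD instruction.
ADD is two instructions ahead, so forwarding is from the MEM/WB pipeline register to SUB’s EX stage.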
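
The timing can be checked with a short sketch that computes the cycle in which each instruction reaches EX. It assumes full forwarding, so the only stall is a one-cycle bubble when an instruction uses the result of the load immediately before it. The tuple encoding of the program is an assumption for illustration.

```python
def ex_cycles(program):
    """program: list of (op, dest, sources). Returns the EX cycle of each instruction."""
    cycles = []
    for i, (op, dest, sources) in enumerate(program):
        # The first instruction is in EX in cycle 3 (IF = 1, ID = 2).
        ex = 3 if i == 0 else cycles[-1] + 1
        if i > 0:
            prev_op, prev_dest, _ = program[i - 1]
            if prev_op == "lw" and prev_dest in sources:
                ex += 1  # load-use hazard: one bubble
        cycles.append(ex)
    return cycles

program = [
    ("lw", 2, [3]),         # LW  R2, 0(R3)
    ("add", 4, [2, 5]),     # ADD R4, R2, R5
    ("sw", None, [4, 6]),   # SW  R4, 4(R6)
    ("sub", 7, [4, 8]),     # SUB R7, R4, R8
]
print(ex_cycles(program))  # prints [3, 5, 6, 7]
```

LW finishes MEM in cycle 4, so ADD can receive R2 from MEM/WB in cycle 5. ADD is in MEM in cycle 6 (EX/MEM feeds SW) and in WB in cycle 7 (MEM/WB feeds SUB).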