Unit 5 Flashcards

1
Q

For every instruction, the first two steps are identical. Name them.

A

Send the program counter (PC) to the memory that contains the code and fetch the instruction from that memory.

Read one or two registers, using fields of the instruction to select the registers to read. For the LDUR and CBZ instructions, we need to read only one register, but most other instructions require reading two registers.

2
Q

3rd step for memory-reference instructions

A

Use ALU for address calculation

3
Q

3rd step for arithmetic-logic instructions

A

Use ALU for operation execution

4
Q

Every instruction must first be fetched from memory, based on the value of the blank.

A

program counter

5
Q

Every instruction reads one or two blank.

A

registers

6
Q

3rd step for branch instructions

A

use ALU for comparison

7
Q

The blank stores the program that is to be executed.

A

instruction memory

8
Q

The blank stores the data needed by the running programs.

A

data memory

9
Q

The blank commands the datapath according to the instructions of the program by setting the control lines for each of the major functional units.

A

control unit

10
Q

The blank elements include the instruction and data memories, the register file, the ALU, and adders.

A

datapath

11
Q

An operational element, such as an AND gate or an ALU.

A

Combinational element

12
Q

A memory element, such as a register or a memory.

A

State element

13
Q

A blank element has at least two inputs and one output.

A

state

14
Q

The elements that operate on data values are all blank, which means that their outputs depend only on the current inputs.

A

combinational

15
Q

A sequential element is another name for a _____ element.

A

state

16
Q

A clock input is present on a _____ element.

A

state

17
Q

An ALU is a _____ element.

A

combinational

18
Q

The approach used to determine when data is valid and stable relative to the clock.

A

Clocking methodology

19
Q

A clocking scheme in which all state changes occur on a clock edge.

A

Edge-triggered clocking

20
Q

A signal used for multiplexor selection or for directing the operation of a functional unit; contrasts with a data signal, which contains information that is operated on by a functional unit.

A

Control signal

21
Q

The signal is logically high or true.

A

Asserted

22
Q

The signal is logically low or false.

A

Deasserted

23
Q

An blank allows us to read the contents of a register, send the value through some combinational logic, and write that register in the same clock cycle.

A

edge-triggered methodology

24
Q

For the 64-bit LEGv8 architecture, nearly all of these state and logic elements will have inputs and outputs that are blank, since that is the width of most of the data handled by the processor.

A

64 bits wide

25
Q

A rising clock edge refers to the clock changing from blank.

A

0 to 1

26
Q

A unit used to operate on or hold data within a processor. In the LEGv8 implementation, the blank include the instruction and data memories, the register file, the ALU, and adders.

A

Datapath element

27
Q

The register containing the address of the instruction in the program being executed.

A

Program counter (PC)

28
Q

To execute any instruction, we must start by blank.

A

fetching the instruction from memory

29
Q

A state element that consists of a set of registers that can be read and written by supplying a register number to be accessed.

A

Register file

30
Q

To increase the size of a data item by replicating the high-order sign bit of the original data item in the high-order bits of the larger, destination data item.

A

Sign-extend
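
A minimal sketch of sign extension on integers, assuming fields are held as unsigned Python ints. The function name `sign_extend` and its parameters are hypothetical, chosen for illustration only.

```python
def sign_extend(value: int, from_bits: int, to_bits: int = 64) -> int:
    """Widen a from_bits-wide field to to_bits by replicating its sign bit."""
    value &= (1 << from_bits) - 1            # keep only the original field
    sign_bit = 1 << (from_bits - 1)          # high-order bit of the narrow field
    if value & sign_bit:
        # Negative value: fill every bit above the field with 1s.
        upper_ones = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper_ones
    return value
```

For example, a 9-bit field holding `0x100` (the most negative 9-bit value) widens to `0xFFFFFFFFFFFFFF00`, while `0x0FF` is unchanged because its sign bit is 0.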

31
Q

There are two details in the definition of branch instructions (see COD Chapter 2 (Instructions: Language of the Computer)) to which we must pay attention:

A

The instruction set architecture specifies that the base for the branch address calculation is the address of the branch instruction.

The architecture also states that the offset field is shifted left 2 bits so that it is a word offset; this shift increases the effective range of the offset field by a factor of 4.

32
Q

The address specified in a branch, which becomes the new program counter (PC) if the branch is taken. In the LEGv8 architecture, the blank is given by the sum of the offset field of the instruction and the address of the branch.

A

Branch target address
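
The branch address arithmetic described in the cards above can be sketched as follows. This is an illustration only: the function names are hypothetical, the 19-bit default matches the compare-and-branch-on-zero offset, and addresses are assumed to wrap at 64 bits.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit address space


def branch_target(pc: int, offset_field: int, field_bits: int = 19) -> int:
    """Branch target = address of the branch + (sign-extended offset << 2)."""
    if offset_field & (1 << (field_bits - 1)):   # sign bit set: negative offset
        offset_field -= 1 << field_bits
    return (pc + (offset_field << 2)) & MASK64   # shift left 2: word offset


def next_pc(pc: int, taken: bool, offset_field: int) -> int:
    """A taken branch goes to the target; an untaken branch goes to PC + 4."""
    return branch_target(pc, offset_field) if taken else (pc + 4) & MASK64
```

With a branch at `0x1000` and an offset field of 3, the target is `0x100C`; if the branch is not taken, the next PC is `0x1004`.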

33
Q

A branch where the branch condition is satisfied and the program counter (PC) becomes the branch target. All unconditional branches are taken branches.

A

Branch taken

34
Q

A branch where the branch condition is false and the program counter (PC) becomes the address of the instruction that sequentially follows the branch.

A

Branch not taken (or untaken branch)

35
Q

This simplest blank will attempt to execute all instructions in one clock cycle.

A

datapath

36
Q

The normal case of fetching the next instruction from memory requires blank, not PC + 1.

A

PC + 4

37
Q

The blank must be able to take inputs and generate a write signal for each state element, the selector control for each multiplexor, and the ALU control.

A

control unit

38
Q

For load register and store register instructions, we use the ALU to compute the memory address by blank.

A

addition

39
Q

For the R-type instructions, the ALU needs to perform one of the four actions (blank, blank, blank, or blank), depending on the value of the 11-bit opcode field in the instruction.

A

AND, OR, subtract, or add

40
Q

If the instruction is STUR, then ALUOp should be _____

A

00

41
Q

If the instruction is STUR, then the ALU’s four control inputs should be _____.

A

0010

42
Q

For LDUR and STUR instructions, the ALU function _____.

A

is the same

43
Q

If the instruction is ORR, then ALUOp should be _____.

A

10

44
Q

If the instruction is ORR, then as well as examining the ALUOp bits, the ALU control will also examine _____.

A

instruction’s opcode field (Instruction[31:21])

45
Q

If the instruction is ORR, then the ALU control will (after examining the ALUOp and opcode bits) output _____.

A

0001
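
The ALU control cards above (ALUOp 00 for STUR giving 0010, ALUOp 10 plus the opcode for ORR giving 0001) can be sketched as a small lookup. This is a hedged illustration: the function name is hypothetical, and the 11-bit opcode values and the AND and subtract control codes are assumed from the textbook's LEGv8 tables rather than taken from these cards.

```python
# Assumed from the textbook's LEGv8 encoding, not stated on the cards:
# 11-bit opcode -> 4-bit ALU control input.
R_TYPE_ALU_CONTROL = {
    0b10001011000: 0b0010,  # ADD -> add
    0b11001011000: 0b0110,  # SUB -> subtract
    0b10001010000: 0b0000,  # AND -> AND
    0b10101010000: 0b0001,  # ORR -> OR
}


def alu_control(alu_op: int, opcode: int = 0) -> int:
    """Return the 4-bit ALU control value for a 2-bit ALUOp and an opcode."""
    if alu_op == 0b00:
        # LDUR and STUR: the ALU adds to form the memory address.
        return 0b0010
    if alu_op == 0b10:
        # R-type: the opcode field (Instruction[31:21]) selects the operation.
        return R_TYPE_ALU_CONTROL[opcode]
    raise ValueError("ALUOp value not covered by this sketch")
```

Note that the opcode is a don't-care input when ALUOp is 00, which is why the sketch ignores it in that case.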

46
Q

From logic, a representation of a logical operation by listing all the values of the inputs and then in each case showing what the resulting outputs should be.

A

Truth table

47
Q

An element of a logical function in which the output does not depend on the values of all the inputs. Don’t-care terms may be specified in different ways.

A

Don’t-care term

48
Q

The blank, which as we saw in COD Chapter 2 (Instructions: Language of the Computer), is between 6 and 11 bits wide and found in bits 31:26 to 31:21.

A

opcode field

49
Q

The blank is always in bit positions 9:5 (Rn) for both R-type instructions and for the base register for load and store instructions.

A

first register operand

50
Q

The blank is in one of two places. It is in bit positions 20:16 (Rm) for R-type instructions and it is in bit positions 4:0 (Rt) for the register to be written by load. That is also the field that specifies the register to be tested for zero for compare and branch on zero. Thus, we will need to add a multiplexor to select which field of the instruction is used to indicate the register number to be read.

A

other register operand

51
Q

Another operand can also be a 19-bit offset for blank or a 9-bit offset for load and store.

A

compare and branch on zero

52
Q

The blank for R-type instructions (Rd) and for loads (Rt) is in bit positions 4:0.

A

destination register
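
The bit positions named in the last few cards can be sketched as a field extractor. This is illustrative only: the helper names and the `use_rt` flag (standing in for the multiplexor select) are hypothetical.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Return bits hi:lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_fields(word: int) -> dict:
    """Pull out the fields at the bit positions given on the cards."""
    return {
        "opcode": bits(word, 31, 21),   # 11-bit opcode for R-type
        "Rm": bits(word, 20, 16),       # second operand for R-type
        "Rn": bits(word, 9, 5),         # first operand / base register
        "Rd_or_Rt": bits(word, 4, 0),   # destination (Rd) or Rt
    }


def second_read_register(word: int, use_rt: bool) -> int:
    """Model the multiplexor that picks Rt (4:0) or Rm (20:16)."""
    fields = decode_fields(word)
    return fields["Rd_or_Rt"] if use_rt else fields["Rm"]
```

For a word built with Rm = 21, Rn = 20, and Rd = 9, `second_read_register(word, use_rt=False)` returns 21 and `second_read_register(word, use_rt=True)` returns 9.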

53
Q

The field that denotes the operation and format of an instruction.

A

Opcode

54
Q

Also called single clock cycle implementation. An implementation in which an instruction is executed in one clock cycle. While easy to understand, it is too slow to be practical.

A

Single-cycle implementation

55
Q

In contrast, an unconditional branch instruction always branches, so the blank is not used.

A

ALU

56
Q

An implementation technique in which multiple instructions are overlapped in execution, much like an assembly line.

A

Pipelining

57
Q

LEGv8 instructions classically take what five steps?

A

Fetch instruction from memory.
Read registers and decode the instruction.
Execute the operation or calculate an address.
Access an operand in data memory (if necessary).
Write the result into a register (if necessary).

58
Q

When a planned instruction cannot execute in the proper clock cycle because the hardware does not support the combination of instructions that are set to execute.

A

Structural hazard

59
Q

Also called a pipeline data hazard. When a planned instruction cannot execute in the proper clock cycle because data that is needed to execute the instruction are not yet available.

A

Data hazard

60
Q

Also called bypassing. A method of resolving a data hazard by retrieving the missing data element from internal buffers rather than waiting for it to arrive from programmer-visible registers or memory.

A

Forwarding

61
Q

A specific form of data hazard in which the data being loaded by a load instruction has not yet become available when it is needed by another instruction.

A

Load-use data hazard
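
A rough sketch of how a load-use hazard might be detected between two adjacent instructions. The function name and parameters are hypothetical, and the sketch assumes a five-stage pipeline in which every other data hazard is already handled by forwarding.

```python
def needs_load_use_stall(prev_is_load: bool, prev_dest: int,
                         next_sources: tuple) -> bool:
    """True when the next instruction reads the register a load is still fetching.

    prev_is_load: whether the instruction just ahead in the pipeline is a load
    prev_dest:    the register number that load will write
    next_sources: register numbers the following instruction reads
    """
    return prev_is_load and prev_dest in next_sources
```

For example, a load into register 1 followed immediately by an add that reads registers 1 and 4 gives `True`, so one bubble would be inserted; if the add reads registers 5 and 4 instead, no stall is needed.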

62
Q

Also called bubble. A stall initiated in order to resolve a hazard.

A

Pipeline stall

63
Q

Also called branch hazard. When the proper instruction cannot execute in the proper pipeline clock cycle because the instruction that was fetched is not the one that is needed; that is, the flow of instruction addresses is not what the pipeline expected.

A

Control hazard

64
Q

A method of resolving a branch hazard that assumes a given outcome for the conditional branch and proceeds from that assumption rather than waiting to ascertain the actual outcome.

A

Branch prediction

65
Q

Blank increases the number of simultaneously executing instructions and the rate at which instructions are started and completed.

A

Pipelining

66
Q

Pipelining does not reduce the time it takes to complete an individual instruction, also called the blank.

A

latency
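
The difference between throughput and latency can be sketched with simple arithmetic. This assumes perfectly balanced stages, no hazards or stalls, and a non-pipelined design whose clock cycle spans all stages; the function names are hypothetical.

```python
def unpipelined_time(n_instructions: int, n_stages: int, stage_time: int) -> int:
    """Each instruction runs start to finish before the next one begins."""
    return n_instructions * n_stages * stage_time


def pipelined_time(n_instructions: int, n_stages: int, stage_time: int) -> int:
    """The first instruction fills the pipeline; each later one adds one stage time."""
    return (n_stages + n_instructions - 1) * stage_time


def instruction_latency(n_stages: int, stage_time: int) -> int:
    """Time for one instruction to pass through every stage (same in both designs)."""
    return n_stages * stage_time
```

With 5 stages of 200 time units each, three instructions take 3000 units without pipelining and 1400 with it, yet each individual instruction still takes 1000 units from fetch to write back.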

67
Q

Blank and blank help make a computer fast while still getting the right answers.

A

Branch prediction and forwarding

68
Q

The number of stages in a pipeline or the number of stages between two instructions during execution.

A

Latency (pipeline)

69
Q

The division of an instruction into five stages means a five-stage pipeline, which in turn means that up to five instructions will be in execution during any single clock cycle. Thus, we must separate the datapath into five pieces, with each piece named corresponding to a stage of instruction execution. Name them.

A

IF: Instruction fetch
ID: Instruction decode and register file read
EX: Execution or address calculation
MEM: Data memory access
WB: Write back

70
Q

During instruction fetch, the next instruction is fetched from blank. The right half of IM is shaded to depict that the memory is read.

A

instruction memory (IM)

71
Q

During instruction decode, the instruction’s fields are converted into datapath control signals, and simultaneously the blank is read. For simplicity, just the register file is used to depict this stage. The right half is shaded to depict the read (vs. write).

A

register file (Reg)

72
Q

During execute, the blank is used to perform the instruction’s operation or to compute an address, or an adder is used for branches.

A

ALU

73
Q

During data memory access, the blank may be read (for a load instruction) or written (for a store instruction). For load, the right half is shaded, indicating read. (For store, the left half would be shaded).

A

data memory (DM)

74
Q

During write back, the blank may be written by certain instructions (like R-type instructions). The left half is shaded to indicate write (vs. read). Although two Reg icons appear in the stylized depictions, only one register file exists.

A

register file (Reg)

75
Q

An instruction that does no operation to change state.

A

nop

76
Q

Although the compiler generally relies upon the hardware to resolve hazards and thereby ensure correct execution, the compiler must understand the blank to achieve the best performance. Otherwise, unexpected stalls will reduce the performance of the compiled code.

A

pipeline

77
Q

To discard instructions in a pipeline, usually due to an unexpected event.

A

Flush

78
Q

Prediction of branches at runtime using runtime information.

A

Dynamic branch prediction

79
Q

Also called branch history table. A small memory that is indexed by the lower portion of the address of the branch instruction and that contains one or more bits indicating whether the branch was recently taken or not.

A

Branch prediction buffer
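
A minimal sketch of a one-bit branch prediction buffer. The class name, the default of 16 entries, and the choice to drop the two low address bits (assuming 4-byte instructions) are all illustrative assumptions.

```python
class BranchPredictionBuffer:
    """Small table of 'was this branch taken last time?' bits, with no tags."""

    def __init__(self, entries: int = 16):
        self.entries = entries
        self.taken_bits = [False] * entries   # start by predicting not taken

    def _index(self, pc: int) -> int:
        # Index with the lower portion of the branch address.
        return (pc >> 2) % self.entries

    def predict(self, pc: int) -> bool:
        return self.taken_bits[self._index(pc)]

    def update(self, pc: int, taken: bool) -> None:
        # Record the actual outcome once the branch resolves.
        self.taken_bits[self._index(pc)] = taken
```

Because the table has no tags, two branches whose addresses share the same low bits use the same entry, so one can overwrite the other's history.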

80
Q

A structure that caches the destination PC or destination instruction for a branch. It is usually organized as a cache with tags, making it more costly than a simple prediction buffer.

A

Branch target buffer

81
Q

A branch predictor that combines local behavior of a particular branch and global information about the behavior of some recent number of executed branches.

A

Correlating predictor

82
Q

A branch predictor with multiple predictions for each branch and a selection mechanism that chooses which predictor to enable for a given branch.

A

Tournament branch predictor

83
Q

Also called interrupt. An unscheduled event that disrupts program execution; used to detect overflow.

A

Exception

84
Q

An exception that comes from outside of the processor. (Some architectures use the term interrupt for all exceptions.)

A

Interrupt

85
Q

An interrupt for which the address to which control is transferred is determined by the cause of the exception.

A

Vectored interrupt

86
Q

A 64-bit register used to hold the address of the affected instruction. (Such a register is needed even when exceptions are vectored.)

A

ELR

87
Q

A register used to record the cause of the exception. In the LEGv8 architecture, this register is 32 bits, although some bits are currently unused. Assume there is a field that encodes the three possible exception sources mentioned above, with 8 representing an undefined instruction, 10 representing arithmetic overflow or underflow, and 12 representing hardware malfunction.

A

ESR

88
Q

Also called imprecise exception. Interrupts or exceptions in pipelined computers that are not associated with the exact instruction that was the cause of the interrupt or exception.

A

Imprecise interrupt

89
Q

Also called precise exception. An interrupt or exception that is always associated with the correct instruction in pipelined computers.

A

Precise interrupt

90
Q

The parallelism among instructions.

A

Instruction-level parallelism

91
Q

A scheme whereby multiple instructions are launched in one clock cycle.

A

Multiple issue

92
Q

An approach to implementing a multiple-issue processor where many decisions are made by the compiler before execution.

A

Static multiple issue

93
Q

An approach to implementing a multiple-issue processor where many decisions are made during execution by the processor.

A

Dynamic multiple issue

94
Q

Two primary and distinct responsibilities must be dealt with in a multiple-issue pipeline:

A

Packaging instructions into issue slots: how does the processor determine how many instructions and which instructions can be issued in a given clock cycle? In most static issue processors, this process is at least partially handled by the compiler; in dynamic issue designs, it is normally dealt with at runtime by the processor, although the compiler will often have already tried to help improve the issue rate by placing the instructions in a beneficial order.
Dealing with data and control hazards: in static issue processors, the compiler handles some or all of the consequences of data and control hazards statically. In contrast, most dynamic issue processors attempt to alleviate at least some classes of hazards using hardware techniques operating at execution time.

95
Q

The positions from which instructions could issue in a given clock cycle; by analogy, these correspond to positions at the starting blocks for a sprint.

A

Issue slots

96
Q

An approach whereby the compiler or processor guesses the outcome of an instruction to remove it as a dependence in executing other instructions.

A

Speculation

97
Q

The set of instructions that issues together in one clock cycle; the packet may be determined statically by the compiler or dynamically by the processor.

A

Issue packet

98
Q

A style of instruction set architecture that launches many operations that are defined to be independent in a single wide instruction, typically with many separate opcode fields.

A

Very Long Instruction Word (VLIW)

99
Q

Number of clock cycles between a load instruction and an instruction that can use the result of the load without stalling the pipeline.

A

Use latency

100
Q

A technique to get more performance from loops that access arrays, in which multiple copies of the loop body are made and instructions from different iterations are scheduled together.

A

Loop unrolling
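
To illustrate the transformation, here is a loop and a version unrolled by a factor of four. This is only a sketch of the shape of the code a compiler might produce; Python itself gains little from it, and the function names are hypothetical.

```python
def add_scalar(values, s):
    """Original loop: one element per iteration."""
    out = list(values)
    for i in range(len(out)):
        out[i] += s
    return out


def add_scalar_unrolled4(values, s):
    """Same computation with four copies of the loop body per iteration."""
    out = list(values)
    n = len(out)
    i = 0
    while i + 4 <= n:
        # Four independent updates that a scheduler could overlap.
        out[i] += s
        out[i + 1] += s
        out[i + 2] += s
        out[i + 3] += s
        i += 4
    while i < n:          # cleanup for lengths not divisible by four
        out[i] += s
        i += 1
    return out
```

The unrolled body exposes four independent operations per trip and runs the loop-control overhead (index update and branch) a quarter as often.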

101
Q

The renaming of registers by the compiler or hardware to remove antidependences.

A

Register renaming

102
Q

Also called name dependence. An ordering forced by the reuse of a name, typically a register, rather than by a true dependence that carries a value between two instructions.

A

Antidependence

103
Q

An advanced pipelining technique that enables the processor to execute more than one instruction per clock cycle by selecting them during execution.

A

Superscalar

104
Q

Hardware support for reordering the order of instruction execution to avoid stalls.

A

Dynamic pipeline scheduling

105
Q

The unit in a dynamic or out-of-order execution pipeline that decides when it is safe to release the result of an operation to programmer-visible registers and memory.

A

Commit unit

106
Q

A buffer within a functional unit that holds the operands and the operation.

A

Reservation station

107
Q

The buffer that holds results in a dynamically scheduled processor until it is safe to store the results to memory or a register.

A

Reorder buffer

108
Q

A situation in pipelined execution when an instruction blocked from executing does not cause the following instructions to wait.

A

Out-of-order execution

109
Q

A commit in which the results of pipelined execution are written to the programmer visible state in the same order that instructions are fetched.

A

In-order commit

110
Q

Both pipelining and multiple-issue execution increase peak instruction throughput and attempt to exploit blank.

A

instruction-level parallelism (ILP)

111
Q

The organization of the processor, including the major functional units, their interconnection, and control.

A

Microarchitecture

112
Q

The registers of a processor that are visible to the instruction set; for example, in LEGv8, these are the 32 integer and 32 floating-point registers.

A

Architectural registers

113
Q

Verilog can describe processors for simulation or with the intention that the Verilog specification be blank.

A

synthesized

114
Q

The inherent execution time for an instruction.

A

Instruction latency

115
Q

The first supercomputer.

A

CDC 6600

116
Q

An approach that uses dynamic hazard detection, generalized forwarding, and reservation stations.

A

Tomasulo’s algorithm

117
Q

A computer that used a four-stage pipeline to overlap fetch, decode, and execute.

A

The IBM 7030, also known as Stretch

118
Q

The computer that introduced Tomasulo’s algorithm.

A

IBM 360/91

119
Q

Out-of-order instruction commits led to this unpopular situation.

A

imprecise interrupts

120
Q

The original design for a superscalar processor was a two-issue machine called _________.

A

Cheetah

121
Q

_________ is a static multiple issue approach that was used in processors found in Cydrome and Multiflow mini-supercomputers.

A

VLIW

122
Q

_________ is a more compiler-intensive approach to multiple issue that removed many VLIW drawbacks.

A

EPIC

123
Q

_________ is an approach that uses aggressive loop unrolling and path prediction and has been used to exploit higher levels of ILP.

A

Trace scheduling