computer architecture pt2 Flashcards
FDE cycle
CPU sends value of PC to be copied to MAR
PC increments
get instruction identified by MAR and copy into MDR
move instruction from MDR to IR
move instruction from IR to CU for decoding:
-sends operation to ALU
-put address of data to be operated on in a register
-send address of data from register to MAR
-read data and place in MDR
-move data from MDR to an accumulator in the ALU
-complete operation and store result in an accumulator in the ALU
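The register transfers in the cycle above can be sketched as a toy simulation. The (operation, address) instruction format and the LOAD/ADD/HALT operations are assumptions made up for illustration; they are not a real instruction set.

```python
# Toy fetch-decode-execute loop. The (op, address) instruction format
# and the LOAD/ADD/HALT operations are hypothetical, for illustration only.
def run(memory):
    pc, acc = 0, 0
    while True:
        mar = pc                 # fetch: copy PC to MAR
        pc += 1                  # PC increments
        mdr = memory[mar]        # instruction at MAR is copied into MDR
        ir = mdr                 # MDR -> IR
        op, address = ir         # decode: control unit splits the instruction
        if op == "HALT":
            return acc
        mar = address            # execute: operand address goes to MAR
        mdr = memory[mar]        # data is read into MDR
        if op == "LOAD":
            acc = mdr            # MDR -> accumulator
        elif op == "ADD":
            acc = acc + mdr      # ALU result stays in the accumulator
        else:
            raise ValueError(f"unknown operation {op!r}")

# Instructions at addresses 0-2, data at addresses 4 and 5.
program = [("LOAD", 4), ("ADD", 5), ("HALT", 0), None, 7, 35]
result = run(program)
```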
uses of alternative addressing modes
- addressing large amount of memory with few bits
- use indexes to loop or examine a table or array
- address registers (bc faster than addressing memory)
- relocate data or programs in memory
direct addressing
the address used is the address holding the data to be operated on
immediate addressing
the value in the instruction's address field is itself the data to be used (e.g. for constants like 1)
indirect addressing *
the address used is the address of the memory location holding the address of the data item to be operated on
register indirect addressing
the instruction names a register, and that register holds the memory address of the data item to be operated on
indexed addressing @
an index (stored in a register) is added to the address in the instruction to obtain the address of the data
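A minimal sketch of how the addressing modes differ when locating the operand. The memory contents, register names and the `operand` helper are hypothetical; register indirect is taken in its usual sense, where the named register holds the address of the data.

```python
# Hypothetical machine state used only for this illustration.
memory = {10: 99, 20: 10, 30: 5, 33: 7}
registers = {"r1": 30, "x": 3}   # "x" plays the role of the index register

def operand(mode, field):
    """Return the data an instruction would operate on under each mode."""
    if mode == "immediate":
        return field                           # field is the data itself
    if mode == "direct":
        return memory[field]                   # field is the data's address
    if mode == "indirect":
        return memory[memory[field]]           # field -> address -> data
    if mode == "register_indirect":
        return memory[registers[field]]        # register holds the data's address
    if mode == "indexed":
        return memory[field + registers["x"]]  # field + index register
    raise ValueError(f"unknown mode {mode!r}")
```

For example, `operand("indirect", 20)` follows location 20 to address 10 and returns the value stored there.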
impure code
when instructions are changed during execution
changed instructions have to be reset before being run again else program will run differently
full instruction structure
opcode + addressing mode + address 1 + address 2
absolute addressing
the address read is the address that should be gone to
relative addressing
the address read in the instruction is an offset to the current instruction address (i.e. program counter value)
base offset addressing
the address read in the instruction is offset by the current value in a special base register
type of instructions
Single operand manipulation: negating, inc/decrementing, setting reg = 0
Arithmetic: floating point, add/sub/div/multiplication
Program control: branching, call, jumps, return
Boolean logic: shift/rotate, bit manipulation (lets the programmer design their own flags to control program flow), including set/test instructions
Data movement: between registers, between register and memory, between memory locations
Stack: pop/push in a LIFO structure
Single instruction, multiple data (SIMD): multimedia, a single instruction applied to a large amount of data e.g. pixels
MIPS principles
- simplicity favours regularity
- make the common case fast
- smaller is faster
- good design demands good compromises
1.simplicity favours regularity
consistent instruction format with the same number of operands (2 sources + 1 destination): easier to encode + handle in hardware
- make the common case fast
only includes simple, commonly used instructions
for complex operations, use more instructions
results in simpler, smaller, faster hardware
(RISC)
- smaller is faster
few registers used
- good design demands good compromises
more instruction formats = more flexibility
number of instruction formats kept small to adhere to principles 1 and 3 (simplicity favours regularity, smaller is faster)
other formats may appear in assembler but transformed in machine code to fit format
3 instruction formats
- R type: register operands
- I type: immediate operand
- J type: jumping
r-type instruction formats
opcode (6 bits, 0 for r-type), rs, rt (source registers, 5 bits each), rd (destination register, 5 bits), shamt (shift amount, 5 bits, 0 if not a shift operation), funct (function, selects the operation to be done, 6 bits)
register field values: shift bits in register 17 left 5 places and put the result in register 16 (sll $s0, $s1, 5)
opcode = 0 rs = 0 rt = 17 rd = 16 shamt = 5 funct = 0
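The field values above can be packed into a 32 bit machine word. This is a small sketch; the helper name `encode_r` is made up for illustration.

```python
def encode_r(rs, rt, rd, shamt, funct, opcode=0):
    """Pack R-type fields: 6 + 5 + 5 + 5 + 5 + 6 bits, opcode in the top bits."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# Shift register 17 left by 5 places and put the result in register 16.
word = encode_r(rs=0, rt=17, rd=16, shamt=5, funct=0)
print(f"{word:#010x}")  # 0x00118140
```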
i-type instruction formats
opcode (6 bits), rs, rt (register operands, 5 bits each), imm (16 bit two's complement immediate)
rt is used as a destination for some instructions e.g. addi
addi $s0, $s1, 5
(immediate type) add the value in register 17, and the number 5, and place in register 16
lw $t2, 32($0)
load word at address 32+ contents of register 0, and store in register 10
sw $s1, 4($t1)
store word in register 17 at address 4+ contents of register 9
li $s0, 5
aka ori $s0, $0, 5
(immediate type) load immediate 5 and store in register 16
in machine code the register order is swapped: rs ($0) is encoded before rt ($s0)
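A sketch of I-type encoding for the examples above. The helper name `encode_i` is hypothetical; the opcode values 8 (addi) and 13 (ori) are the standard MIPS32 ones. Note that rs is packed before rt, even though rt is written first in assembly.

```python
def encode_i(opcode, rs, rt, imm):
    """Pack I-type fields: 6 bit opcode, 5 bit rs, 5 bit rt, 16 bit immediate."""
    # Masking keeps negative immediates in 16 bit two's complement form.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

ADDI, ORI = 0x08, 0x0D

addi_word = encode_i(ADDI, rs=17, rt=16, imm=5)  # addi $s0, $s1, 5
ori_word = encode_i(ORI, rs=0, rt=16, imm=5)     # ori  $s0, $0, 5  (li $s0, 5)
print(f"{addi_word:#010x} {ori_word:#010x}")     # 0x22300005 0x34100005
```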
j-type
6 bit opcode, 26 bit address
j name jumps to the label name using this format. to jump to an address contained in a register, use the r-type jr instead e.g. jr $ra
types of addressing the operands
register only: add $s0, $s1, $s2
immediate: addi $s0, $s1, 5 (5+ contents of s1, store in s0: s1 = rs, s0 = rt)
base address: address of operand = base address + signed immediate e.g. lw $1, 0($2)
PC-relative (branch a given distance from the current position): beq $t0, $0, 3 (if the two registers are equal, branch by an offset of 3 instructions, counted from the instruction after the branch)
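A short sketch of the PC-relative calculation, assuming the standard MIPS32 rule that the offset counts instructions (4 bytes each) from the instruction after the branch. The function name and the example address are made up for illustration.

```python
def branch_target(pc, offset):
    """Address reached by a taken branch located at address pc."""
    return pc + 4 + (offset << 2)   # offset is in words, so multiply by 4

target = branch_target(0x00400000, 3)
print(f"{target:#010x}")  # 0x00400010
```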
loading 32 bit words with only 16 bit immediates
first load the upper half, then OR in the lower half
lui $s0, 0xFEDC
ori $s0, $s0, 0x8765
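The effect of the two instructions on the destination register can be checked with plain integer arithmetic. The function names below just mirror the mnemonics; this is a sketch of the arithmetic, not a full simulator.

```python
def lui(imm16):
    """Load upper immediate: place the 16 bit value in the top half, zero the bottom."""
    return (imm16 & 0xFFFF) << 16

def ori(value, imm16):
    """OR a zero-extended 16 bit immediate into the register value."""
    return value | (imm16 & 0xFFFF)

s0 = lui(0xFEDC)       # s0 = 0xFEDC0000
s0 = ori(s0, 0x8765)   # s0 = 0xFEDC8765
```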
how to make OS calls
put the system call number in register $v0 (the call performed depends on the contents of $v0), then use the assembly instruction syscall
multiplication MIPS
mul: 32 bit result, no overflow checking
mult: two 32 bit values multiplied giving a 64 bit result (mult $s0, $s1) stored in the special registers hi (upper 32 bits) and lo (lower 32 bits)
mul $v0, $s3, $t0 gives the same result as mult $s3, $t0 followed by mflo $v0
division MIPS
div $s0, $s1
quotient in lo
remainder in hi
move from lo/hi special registers
mflo $s2
mfhi $s2
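A sketch of what ends up in hi and lo after mult and div. The function names mirror the mnemonics; signed division truncating toward zero is assumed here, and division by zero is not handled.

```python
MASK32 = 0xFFFFFFFF

def mult(a, b):
    """Return (hi, lo): the upper and lower 32 bits of the 64 bit product."""
    product = (a * b) & 0xFFFFFFFFFFFFFFFF
    return product >> 32, product & MASK32

def div(a, b):
    """Return (hi, lo): remainder in hi, quotient in lo."""
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient            # truncate toward zero (assumption)
    remainder = a - quotient * b
    return remainder & MASK32, quotient & MASK32

hi, lo = mult(0x10000, 0x10000)   # product needs 33 bits: hi = 1, lo = 0
rem, quo = div(17, 5)             # rem = 2 (mfhi), quo = 3 (mflo)
```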
MIPS: jal instructions
jump and link:
e.g. jal adder, jumps to the portion labeled “adder:”, and puts the return address (the address of the instruction after the jal) in $ra
return using jr $ra
caller vs callee
caller: passes arguments to callee using $a0-$a3 and jumps to callee using jal
callee: performs function + returns result using $v0-$v1, returns to point of call using jr $ra, shouldn't overwrite registers needed by caller e.g. $s0-$s7, $ra, $sp. if it uses these registers, their values are saved on the stack and restored afterwards
stack
dynamically sized chunk of memory with $sp containing the address of the top of the stack
pushing to stack
move stack pointer down one word (4 bytes) and write value
addi $sp, $sp, -4
sw $s0, 0($sp)
popping from stack
read the value, then move the pointer back up one word (4 bytes)
lw $s0, 0($sp)
addi $sp, $sp, 4
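The order of operations matters: a push moves the pointer before writing, and a pop reads before moving the pointer back. A minimal sketch, where the `Stack` class and the starting value of the stack pointer are assumptions for illustration:

```python
class Stack:
    def __init__(self, top=0x7FFFEFFC):   # assumed initial $sp value
        self.sp = top
        self.memory = {}

    def push(self, value):
        self.sp -= 4                   # addi $sp, $sp, -4
        self.memory[self.sp] = value   # sw   $s0, 0($sp)

    def pop(self):
        value = self.memory[self.sp]   # lw   $s0, 0($sp)
        self.sp += 4                   # addi $sp, $sp, 4
        return value

stack = Stack()
stack.push(1)
stack.push(2)
first, second = stack.pop(), stack.pop()   # last in, first out: 2 then 1
```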
$ra when calling recursive procedures
push value of $ra onto stack before calling a function so can reinstate afterwards
conventional orders of caller
- arguments in $a0-$a3
- save any needed registers: $ra, maybe $t0-$t9
- jal callee
- restore registers
- check $v0 for result
conventional orders of callee
- save registers that may be disturbed $s0-$s7
- perform function
- result in $v0
- restore registers
- jr $ra
architecture
consists of the instruction set + implies an architectural state:
- value of PC
- state of all architecture defined registers
microarchitecture
how to implement an architecture in hardware. consists of:
- datapath: functional blocks and registers
- control: control signals
MIPS
- approx 80 instructions
- 32 general purpose registers $0-$31
- super pipelined: each instruction is broken down into a sequence of ‘micro’ instructions
- RISC
- $0 is special and always contains 0
program counter
32 bit register: input PC’ = address of the next instruction, output PC = address of the current instruction
instruction memory
one read port
read port: one 32 bit read address input, A. the 32 bit data (instruction) at that address is read and placed on RD
register file
two read ports, one write port.
read ports: 2 input addresses, A1 and A2, of 5 bits each (each addressing one of 32 registers). the 32 bit values read from these registers are placed on the RD1 and RD2 outputs
write port: CLK, WD- 32 bit data to be written, A3 5 bit destination address, WE- write enable: if 1 then written to specified register on a rising edge of clock
data memory
one read port one write port
read port: 32 bit address, A
write port: 32 bit write data, WD
WE: write enable. if 1, WD is written into address A on the rising clock edge. if 0, the data at address A is read and placed on the RD output (32 bit)
all reads in MIPS state elements are ‘combinational’
change in address results in change in read data (after a delay). writes are only on a rising clock edge
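The difference between combinational reads and clocked writes can be sketched with a small register file model. The class and method names are hypothetical; the signal names in the comments follow the ones used in these cards.

```python
class RegisterFile:
    def __init__(self):
        self.regs = [0] * 32

    def read(self, a1, a2):
        """Combinational read: the outputs RD1, RD2 simply follow A1, A2."""
        return self.regs[a1], self.regs[a2]

    def rising_edge(self, a3, wd, we):
        """Clocked write: WD goes into register A3 only if WE is 1."""
        if we and a3 != 0:               # $0 always contains 0
            self.regs[a3] = wd & 0xFFFFFFFF

rf = RegisterFile()
rf.rising_edge(a3=16, wd=5, we=1)    # written on the clock edge
rf.rising_edge(a3=17, wd=9, we=0)    # write enable low: nothing changes
rd1, rd2 = rf.read(16, 17)
```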
further registers
32 in ‘floating point unit’: each hold a single precision floating point value. pairs store double precision numbers
mul.s $f0, $f1, $f2 multiply f2 and f1 and store in f0
mul.d $f0, $f2, $f4: operands are doubles stored in $f2,$f3 and $f4,$f5, result is stored in $f0 and $f1.
also exception and interrupt handling registers
mtc1 $s0, $f0
mfc1 $s0, $f0
move between CPU and FPU
mtc1: move to coprocessor 1 (CPU register $s0 to FPU register $f0)
mfc1: move from coprocessor 1 (FPU register $f0 to CPU register $s0)
lwc1 $f0, 4($s0)
swc1 $f0, 4($s0)
load word and store word for FPU registers
the base register for the memory address ($s0) is in the CPU
l.d $f0, 4($s0)
s.d $f0, 4($s0)
load double and store double precision numbers
casting:
cvt.s.w $f0,$f0
converts from int (w) to single FP
Single precision (32-bit) floating point numbers have
1 sign bit
8 bit exponent
23 bit mantissa
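The three fields can be pulled out of a value's bit pattern with the standard `struct` module. The helper name `fields` is made up for illustration.

```python
import struct

def fields(x):
    """Split a single precision float into (sign, exponent, mantissa) bit fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]   # raw 32 bit pattern
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF         # 23 bits, leading 1 is implicit
    return sign, exponent, mantissa

one = fields(1.0)       # (0, 127, 0)
neg = fields(-0.75)     # (1, 126, 0x400000)
```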
casting errors
if the number needs more significant bits than the 23 bit mantissa provides, the value is rounded and precision is lost
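A small demonstration of the precision loss, using Python's `struct` module to round to single precision in the same way an int to single conversion would. The helper name `to_single` is hypothetical, and the default round-to-nearest mode is assumed.

```python
import struct

def to_single(n):
    """Round an integer to the nearest single precision value."""
    return struct.unpack(">f", struct.pack(">f", float(n)))[0]

exact = to_single(16777216)     # 2**24 fits in the mantissa exactly
rounded = to_single(16777217)   # 2**24 + 1 needs one more bit, so it is rounded
```

Here `rounded` comes back as 16777216.0, so the two different integers become the same float.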
floating point branches: Logical comparisons
equality: c.eq.s, c.eq.d
less than: c.lt.s, c.lt.d
less than or equal: c.le.s, c.le.d (ge = greater than or equal)
result of logical comparison stored in fpcond
fp branches: conditional branch
bc1f: branches if fpcond is FALSE
bc1t: branches if fpcond is TRUE
MMIO
memory mapped input output: devices connected to same address bus as main memory and are each allocated address space (input devices read like any other memory, output devices written to like any other memory). CPU doesn’t see difference between IO and memory
port-mapped IO
have a separate address space for IO and use special IO instructions to access IO
MARS: keyboard allocated 2 memory addresses
0xffff0000 control: bit 0 = ready bit, 1 if unread input from keyboard
bit 1 = interrupt bit, if 1 then keyboard triggers an interrupt
0xffff0004 data: contains the ASCII code of the last key pressed. on read it triggers the ready bit to be reset
MARS: display output
0xffff0008 control: bit 0 = ready bit, 1 if display is ready to take another input
bit 1 = interrupt bit, if 1 then display ready triggers an interrupt
0xffff000c data: set with the ASCII code of the character
on write: triggers the ready bit to be reset
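Polling a memory mapped device looks like ordinary loads from fixed addresses. The sketch below models only the keyboard's two registers; the `Keyboard` class, its `key_press` method and the `read_char` helper are hypothetical stand-ins for the simulated hardware and the program reading it.

```python
KEYBOARD_CONTROL = 0xFFFF0000
KEYBOARD_DATA = 0xFFFF0004

class Keyboard:
    def __init__(self):
        self.ready = 0
        self.data = 0

    def key_press(self, char):
        self.data = ord(char)   # ASCII code of the last key pressed
        self.ready = 1          # unread input is available

    def load_word(self, address):
        if address == KEYBOARD_CONTROL:
            return self.ready           # bit 0 = ready bit
        if address == KEYBOARD_DATA:
            self.ready = 0              # reading the data resets the ready bit
            return self.data
        raise ValueError(f"unmapped address {address:#x}")

def read_char(device):
    # Busy-wait until the ready bit is set, then read the data register.
    while (device.load_word(KEYBOARD_CONTROL) & 1) == 0:
        pass
    return chr(device.load_word(KEYBOARD_DATA))

keyboard = Keyboard()
keyboard.key_press("a")
char = read_char(keyboard)
```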
exception
anything that triggers an unscheduled function call in the middle of a user program. caused by hardware = interrupt, caused by software = trap e.g. undefined instruction
when an exception occurs
processor records the cause of the exception, jumps to the exception handler, and then returns to the program
special registers used when exception occurs
EPC: stores value of program counter
status: records whether user or kernel code was being executed
cause: records cause of exception
BadVAddr: stores the bad address that caused an exception
-these are stored in coprocessor 0
mfc0 $t0, EPC: moves contents of EPC to $t0
flow during exception
processor stores cause and exception PC in cause and EPC
processor jumps to exception handler
exception handler:
saves registers on stack
reads cause from cause register mfc0 $t0, Cause
handles exception
restores registers
returns to program
mfc0 $k0, EPC
jr $k0
-must save all registers so registers can be reinstated and program run as if nothing had happened