.131 WK 4-6 Flashcards
Architecture - Qiang Ni
Define ISA
interface between hardware + software - set of commands a processor can understand/execute
ALU provides…
registers (store operands + results), status flags (overflow, zero or negative)
ALU implements
arithmetic + logic operations
How to add 2 bits?
half adder aka S = A XOR B (sum) and C = A . B (carry)
How to add 3 bits?
full adder aka S = Cin XOR (A XOR B), Cout = A . B + Cin . (A XOR B)
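A minimal Python sketch of the two adders above, treating each bit as an int 0 or 1. The function names are illustrative only, not from the lecture.

```python
def half_adder(a, b):
    """Add two bits; return (sum, carry)."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Add three bits; return (sum, carry_out)."""
    s = cin ^ (a ^ b)
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


# 1 + 1 = binary 10: sum bit 0, carry bit 1
print(half_adder(1, 1))     # (0, 1)
# 1 + 1 + 1 = binary 11: sum bit 1, carry bit 1
print(full_adder(1, 1, 1))  # (1, 1)
```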
What is the problem with ripple-carry adders and how to solve this?
slow as each stage has to wait for the carry bit from the previous (less significant) stage
can solve with carry select adders
How does a carry select adder work?
split problem: add lower n/2 bits as usual, and add upper n/2 bits twice in parallel using 2 adders (one with Cin = 0, the other with Cin = 1) - the carry out of the lower half then selects which upper result to use
Benefit of a carry select adder?
effectively doubles the speed - can keep splitting as long as cost/space allows
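A rough sketch of the idea in Python, assuming bits are held in lists with the least significant bit first. In software the two upper-half additions run one after the other, so this only shows the structure (two candidate results, then a selection), not the speed gain. All names are illustrative.

```python
def full_adder(a, b, cin):
    """Add three bits; return (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Chain full adders; each stage waits for the carry of the one before."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry


def carry_select_add(a_bits, b_bits):
    """Lower half added once; upper half added for Cin = 0 and Cin = 1."""
    half = len(a_bits) // 2
    low_sum, low_carry = ripple_carry_add(a_bits[:half], b_bits[:half], 0)
    upper_if_0 = ripple_carry_add(a_bits[half:], b_bits[half:], 0)
    upper_if_1 = ripple_carry_add(a_bits[half:], b_bits[half:], 1)
    # The real carry out of the lower half acts as the multiplexer select.
    high_sum, cout = upper_if_1 if low_carry else upper_if_0
    return low_sum + high_sum, cout


def to_bits(x, n):
    """Integer to list of n bits, least significant first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """List of bits (least significant first) back to an integer."""
    return sum(b << i for i, b in enumerate(bits))
```

For example, `carry_select_add(to_bits(200, 8), to_bits(100, 8))` gives the low 8 bits of 300 plus a carry out of 1.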
What is a status flag?
bits organised into a special register called the flags register - can be zero, negative or overflow
How does an ALU identify an overflow?
inputs’ sign bits are the same but the result has a different sign
Cin ⊕ Cout = 1 (carry into the sign bit vs carry out of the sign bit) → arithmetic overflow error
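Both overflow checks can be tried out in Python. This sketch assumes two's-complement numbers of a fixed width (8 bits by default) and takes Cin/Cout to mean the carry into and out of the sign-bit position; the function names are made up for illustration.

```python
def overflow_by_sign(a, b, n=8):
    """Overflow if both inputs share a sign bit but the result's differs."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    a, b = a & mask, b & mask
    result = (a + b) & mask
    return bool(~(a ^ b) & (a ^ result) & sign)


def overflow_by_carry(a, b, n=8):
    """Overflow if carry into the sign bit XOR carry out of it is 1."""
    mask = (1 << n) - 1
    low_mask = (1 << (n - 1)) - 1
    a, b = a & mask, b & mask
    carry_in = ((a & low_mask) + (b & low_mask)) >> (n - 1)
    carry_out = (a + b) >> n
    return bool(carry_in ^ carry_out)


# 127 + 1 does not fit in signed 8 bits
print(overflow_by_sign(0x7F, 0x01), overflow_by_carry(0x7F, 0x01))  # True True
# -1 + 1 = 0 is fine
print(overflow_by_sign(0xFF, 0x01), overflow_by_carry(0xFF, 0x01))  # False False
```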
How does the ALU perform simple multiplication + division?
Can be done with repeated addition/subtraction (slow)
Bit shifting (only works for powers of 2)
How does bit shifting work?
shifting left/right by n places multiplies/divides the number by 2^n
e.g. shift left by 1 = x 2
shifts may be arithmetic, logical, rotate or rotate through carry
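A small Python sketch of three of the shift types, assuming a fixed 8-bit word (Python ints are unbounded, so the width is enforced with a mask). Rotate through carry is left out. Function names are illustrative.

```python
def logical_shift_right(x, n, width=8):
    """Shift right, filling vacated high bits with 0."""
    mask = (1 << width) - 1
    return (x & mask) >> n


def arithmetic_shift_right(x, n, width=8):
    """Shift right, filling vacated high bits with copies of the sign bit."""
    mask = (1 << width) - 1
    x &= mask
    shifted = x >> n
    if x >> (width - 1):                      # sign bit set
        shifted |= (mask << (width - n)) & mask
    return shifted


def rotate_left(x, n, width=8):
    """Bits pushed out on the left re-enter on the right."""
    mask = (1 << width) - 1
    n %= width
    x &= mask
    return ((x << n) | (x >> (width - n))) & mask


print(13 << 3)                                   # 104 (13 x 2^3)
print(13 >> 2)                                   # 3 (13 // 2^2)
print(hex(logical_shift_right(0xF0, 2)))         # 0x3c
print(hex(arithmetic_shift_right(0xF0, 2)))      # 0xfc (-16 / 4 = -4)
print(hex(rotate_left(0x91, 1)))                 # 0x23
```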
What are the types of volatile memory?
dynamic and static
Dynamic volatile memory is…
used for main memory
slower but cheaper than static
Static volatile memory is…
used for registers + caches
fast so relatively expensive
Main memory
each location holds 1 unit of info
identified by address (usually linear 0..n) which typically maps to multiple memory chips - address decoder maps linear addresses to specific location in a specific memory chip
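A toy address decoder in Python. The chip size of 1024 locations is a made-up assumption; with a power-of-2 chip size, the high address bits pick the chip and the low bits pick the location inside it.

```python
OFFSET_BITS = 10                 # hypothetical: 2^10 = 1024 locations per chip
CHIP_SIZE = 1 << OFFSET_BITS


def decode(address):
    """Map a linear address to (chip index, offset within that chip)."""
    chip = address >> OFFSET_BITS
    offset = address & (CHIP_SIZE - 1)
    return chip, offset


print(decode(2050))  # (2, 2)
```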
What ways can machine architecture organise multi-byte words in memory?
big-endian - location (byte) with the lowest memory address holds the most significant byte
little-endian - location (byte) with the lowest memory address holds the least significant byte
What does big-endian mean?
multi-byte words are organised so the location with the lowest memory address holds the most-significant byte
What does little-endian mean?
multi-byte words are organised so the location with the lowest memory address holds the least-significant byte
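Python's built-in `int.to_bytes` can show both byte orders for the same 32-bit word. Index 0 of each result stands in for the lowest memory address; the value 0x12345678 is just an example.

```python
value = 0x12345678

big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

# Index 0 plays the role of the lowest memory address.
print([hex(b) for b in big])     # ['0x12', '0x34', '0x56', '0x78']
print([hex(b) for b in little])  # ['0x78', '0x56', '0x34', '0x12']
```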
How does static memory store bits
stored bits are organised into multi-bit storage slots called registers
use networks of logic components (NAND gates) to build storage for individual data bits
What does combinatorial logic entail?
its outputs are purely a function of its current inputs
What does sequential logic entail?
its outputs are a function of its inputs AND its current outputs
involves feedback of the outputs to the inputs which is the hook on which we hang memory
set-reset (S-R) flip-flop
remembers its current state where Q0 = current + Q = next
high pulse on S -> Q=1(set)
high pulse on R -> Q=0 (reset)
(both S and R can’t be 1)
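One possible way to model this in Python is a cross-coupled pair of NAND gates, with S and R inverted first so that a high pulse sets/resets. This particular wiring is an assumption for illustration and may differ from the circuit drawn in the lecture. The feedback is simulated by re-evaluating the gates a few times until the outputs settle.

```python
def nand(a, b):
    return 1 - (a & b)


def sr_flip_flop(s, r, q, q_bar):
    """Return the settled (Q, not-Q) given inputs S, R and the current state."""
    if s and r:
        raise ValueError("S and R must not both be 1")
    s_n, r_n = nand(s, s), nand(r, r)   # NAND used as an inverter
    for _ in range(4):                  # let the feedback loop settle
        q = nand(s_n, q_bar)
        q_bar = nand(r_n, q)
    return q, q_bar


state = (0, 1)                          # start with Q = 0
state = sr_flip_flop(1, 0, *state)      # pulse S: Q becomes 1
state = sr_flip_flop(0, 0, *state)      # no pulse: Q is remembered
print(state)                            # (1, 0)
state = sr_flip_flop(0, 1, *state)      # pulse R: Q becomes 0
print(state)                            # (0, 1)
```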
Limitations of S-R flip-flop
- has distinct set + reset inputs rather than a single input (could set state if 1 and reset if 0)
- no way of telling the f-f exactly when it should store input data (would like a latch signal to supervise)
Clocked D-type flip-flop
D = 1, L = 1 → Q = 1
Q0 = 1 but D = 0, L= 0 → Q = 1
D = 0, L = 1 → Q = 0 (to store a 0, set D = 0 while the latch is high)
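A behavioural sketch of the D-type in Python, assuming it is built on top of an S-R stage with S = D AND L and R = (NOT D) AND L. That construction is a common one but is an assumption here, not taken from the notes.

```python
def d_flip_flop(d, latch, q):
    """Return the next Q given data D, latch L and the current Q."""
    s = d & latch          # set only when D = 1 and the latch is high
    r = (1 - d) & latch    # reset only when D = 0 and the latch is high
    if s:
        return 1
    if r:
        return 0
    return q               # latch low: keep the stored bit


print(d_flip_flop(1, 1, 0))  # 1  (D = 1, L = 1 stores 1)
print(d_flip_flop(0, 0, 1))  # 1  (L = 0, so D is ignored)
print(d_flip_flop(0, 1, 1))  # 0  (D = 0, L = 1 stores 0)
```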
D-type flip-flop limitation
need output enable to have closer control of when existing data leaves + new data arrives
Master-slave flip-flop
slave is only readable when output enable is high
if master latch = high → data signal is stored in master but slave is untouched
if master latch = low → data moves to slave
can implement registers using multiple master-slave flip-flops
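A behavioural Python sketch of the master-slave behaviour described above, plus a register built from several of them. Class and method names are illustrative, and `None` is used here to stand for "output not driven" when output enable is low.

```python
class MasterSlaveFlipFlop:
    def __init__(self):
        self.master = 0
        self.slave = 0

    def clock(self, d, latch):
        if latch:
            self.master = d           # latch high: master stores D, slave untouched
        else:
            self.slave = self.master  # latch low: data moves to the slave

    def read(self, output_enable):
        return self.slave if output_enable else None


class Register:
    """One master-slave flip-flop per bit."""

    def __init__(self, width):
        self.cells = [MasterSlaveFlipFlop() for _ in range(width)]

    def clock(self, bits, latch):
        for cell, bit in zip(self.cells, bits):
            cell.clock(bit, latch)

    def read(self, output_enable):
        return [cell.read(output_enable) for cell in self.cells]


reg = Register(4)
reg.clock([1, 0, 1, 1], latch=1)
print(reg.read(output_enable=1))  # [0, 0, 0, 0]  (still the old contents)
reg.clock([0, 0, 0, 0], latch=0)
print(reg.read(output_enable=1))  # [1, 0, 1, 1]  (new data has moved to the slaves)
```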
What is a bus?
bundles of wires that connect elements of the von Neumann (VN) architecture
one wire per bit
What do the different buses do?
address bus - runs between CU and main memory to tell memory to access a specific address
data bus - runs between CU and main memory to send data
control bus - carries control signals (e.g. read/write) between CU and other components
some processors may have internal + external buses (external may be narrower to reduce external pins + therefore cost)
What is bus width?
for data bus: number of bits that can be read from/written to memory at once
for address bus: determines the amount of addressable memory (n wires → 2^n addresses)
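A quick Python check of what bus width implies, assuming one wire per bit and byte-sized (8-bit) units on the data bus. Function names are illustrative.

```python
def addressable_locations(address_bus_width):
    """n address wires can name 2^n distinct locations."""
    return 2 ** address_bus_width


def bytes_per_transfer(data_bus_width):
    """Number of whole bytes moved in one data-bus transfer."""
    return data_bus_width // 8


print(addressable_locations(16))  # 65536
print(addressable_locations(32))  # 4294967296
print(bytes_per_transfer(32))     # 4
```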
Why do buses need output enable?
bus wires are shared so we must ensure there is only one active output at a given time
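A tiny Python model of a shared bus, assuming each attached module is described by a (value, output_enable) pair. It raises an error if more than one module tries to drive the bus; `None` stands for an idle bus. Names are illustrative.

```python
def bus_value(drivers):
    """drivers: list of (value, output_enable) pairs, one per attached module."""
    active = [value for value, output_enable in drivers if output_enable]
    if len(active) > 1:
        raise RuntimeError("bus contention: more than one active output")
    return active[0] if active else None


print(bus_value([(0x2A, 0), (0x17, 1), (0xFF, 0)]))  # 23 (only the second module drives)
```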
Control Unit
“little program” running inside the processor that endlessly executes the fetch-decode-execute cycle - controls sequences + other architectural modules using their respective control lines (e.g. latch, output enable, function select, carry + shift L/R)
CU is driven by clock ticks/pulses
How can CU fetch-ex loops be implemented?
as a FSM (hard-wired sequential logic - built directly in terms of NAND gates)
OR
as microcode (sequence of micro-instructions in a micro-memory)
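To make the FSM idea concrete, here is a toy fetch-decode-execute loop in Python with one state per phase. The three-instruction set (LOAD, ADD, HALT) and the single accumulator are invented for this sketch; a real CU would drive control lines rather than Python variables.

```python
def run(program):
    """program: list of (opcode, operand) pairs; returns the final accumulator."""
    pc, acc = 0, 0
    state, ir, op, operand = "FETCH", None, None, None
    while state != "HALTED":
        if state == "FETCH":
            ir = program[pc]          # read the instruction at the program counter
            pc += 1
            state = "DECODE"
        elif state == "DECODE":
            op, operand = ir          # split into opcode and operand
            state = "EXECUTE"
        elif state == "EXECUTE":
            if op == "LOAD":
                acc = operand
            elif op == "ADD":
                acc += operand
            elif op == "HALT":
                state = "HALTED"
                continue
            else:
                raise ValueError(f"unknown opcode {op!r}")
            state = "FETCH"
    return acc


print(run([("LOAD", 2), ("ADD", 3), ("HALT", 0)]))  # 5
```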
What are the pros + cons of each CU fetch-ex loop implementation?
FSM - high performance but expensive + hard to evolve
microcode - more flexible but slightly lower performance
What is pipelining?
exploit inherent parallelism inside CU to speed up fetch-ex cycle
if split cycle into n stages → get up to n x speedup (ideal case, no hazards)
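A back-of-envelope Python estimate of the speedup, using the usual textbook idealisation (an assumption, not from the notes): every stage takes one cycle, there are no hazards, and the pipeline needs n cycles to fill before completing one instruction per cycle.

```python
def pipeline_speedup(stages, instructions):
    """Ratio of unpipelined to pipelined cycle counts."""
    unpipelined = stages * instructions
    pipelined = stages + instructions - 1   # fill the pipe, then 1 per cycle
    return unpipelined / pipelined


print(pipeline_speedup(5, 1))                 # 1.0 (a single instruction gains nothing)
print(round(pipeline_speedup(5, 1000), 2))    # 4.98 (approaches 5 for long runs)
```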
What are the hazards associated with pipelining?
- control hazards: occur when a control-transfer instruc. changes the flow of execution
- data hazards: occur when instruc. n depends on a result from previous instruc. or when two parts of the pipeline need access to the same data
- structural hazards: occur when two parts of the pipeline need access to the same piece of hardware
What happens when we encounter a pipeline hazard?
may cause pipeline to “stall” (wait) or force us to “flush” partly executed instructions before it can continue
might considerably reduce speedup
I/O devices
- input = keyboard, mouse, track ball, touch screen, camera, environmental sensor
- output = display, printer, speaker, environmental actuator
- input + output = network interfaces (ethernet, wifi, bluetooth, etc.), disks, audio cards, MIDI devices
The I/O system enables…
attachment of I/O devices to the processor
Challenges of I/O device connection
- speed-gap challenge: I/O devices are often mechanical so run orders of magnitude slower than the CPU
- device diversity challenge: differences like data-access modes (read-only/write-only/read-and-write, access by individual byte/block and access randomly/sequentially), device specific operations + I/O protocols
What is a device driver?
software plug-ins inside the OS that abstract over device diversity by grouping sets of similar types of devices
What do device drivers do?
- register device with the OS + initialise it
- initiate data transfers to/from a device
- monitor status events from a device
- manage device/system shutdowns so OS doesn’t stop till all unwritten data is stored + device is left in a safe state
Two-fold classification (types) of device + device driver
character devices - send + receive 1 byte at a time - e.g. keyboard
block devices - send + receive multi-byte block at a time - e.g. hard disk
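A toy Python contrast between the two device classes, with in-memory bytes standing in for real hardware. The class names, method names and the block size of 4 are all made up for illustration.

```python
import io


class CharDevice:
    """Delivers data one byte at a time, in order (e.g. a keyboard)."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def read(self):
        return self._stream.read(1)


class BlockDevice:
    """Delivers a whole multi-byte block per request (e.g. a hard disk)."""

    def __init__(self, data, block_size=4):
        self._data = data
        self.block_size = block_size

    def read_block(self, index):
        start = index * self.block_size
        return self._data[start:start + self.block_size]


keyboard = CharDevice(b"hi")
print(keyboard.read(), keyboard.read())   # b'h' b'i'

disk = BlockDevice(b"ABCDEFGH")
print(disk.read_block(1))                 # b'EFGH'
```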
Two-fold classification of processor support for I/O
isolated I/O - processor provides dedicated physical pins for the connection of I/O devices + dedicated instructions for I/O operations
memory-mapped I/O = devices sit within the CPU’s linear memory address space
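A toy Python model contrasting the two approaches. The address split at 0xFF00, the 256-entry port space and all names are assumptions chosen for the sketch. With memory-mapped I/O the ordinary load/store path reaches the device; with isolated I/O separate operations and a separate address space are used.

```python
class ToyMachine:
    DEVICE_BASE = 0xFF00                 # hypothetical start of device registers

    def __init__(self):
        self.ram = bytearray(self.DEVICE_BASE)
        self.device_regs = bytearray(0x100)   # memory-mapped device registers
        self.ports = bytearray(256)           # isolated I/O port space

    # Memory-mapped I/O: the same store/load reaches RAM or a device,
    # depending only on the address.
    def store(self, address, value):
        if address >= self.DEVICE_BASE:
            self.device_regs[address - self.DEVICE_BASE] = value
        else:
            self.ram[address] = value

    def load(self, address):
        if address >= self.DEVICE_BASE:
            return self.device_regs[address - self.DEVICE_BASE]
        return self.ram[address]

    # Isolated I/O: dedicated operations with their own, smaller address space.
    def io_out(self, port, value):
        self.ports[port] = value

    def io_in(self, port):
        return self.ports[port]


m = ToyMachine()
m.store(0x0010, 7)       # ordinary memory write
m.store(0xFF02, 9)       # same operation, but lands in a device register
m.io_out(0x02, 5)        # dedicated I/O operation on port 2
print(m.load(0x0010), m.load(0xFF02), m.io_in(0x02))  # 7 9 5
```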
Pros + cons of memory-mapped I/O classification
simple, flexible programming model
but adds complexity to devices - need to understand larger addresses + work at memory speeds
Pros + cons of isolated I/O classification
suited to simple devices
though having fixed set of special I/O instruc. doesn’t help w/ device diversity
What is an example of isolated I/O classification?
Intel x86 instructions
- dedicated IN + OUT
- port addresses are 8 bits in the immediate form or 16 bits via the DX register (narrower than main memory addresses)