Computer Architecture Flashcards
What is a clock?
A clock drives the steps in the processor - everything happens on the clock “edge” as systems are synchronous
What does the Program Counter do?
Contains the address of the instruction to run
Increments after each instruction
What does the Instruction Register do?
Contains the instruction most recently fetched
What does the Memory address register do?
Contains the address of a location in memory
What does the Memory buffer register do?
Contains a word of data to be written to memory or the word most recently read
What happens in the fetch part of the fetch-execute cycle?
- PC contains address of next instruction
- Address moved to MAR
- Address placed on address bus
- Control unit requests memory read
- Result placed on data bus, copied to MBR, then to IR
- PC incremented by 1
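Worked example (an illustrative Python sketch, not part of the original deck): the fetch steps above can be modelled with a dictionary of registers. The memory contents and the instruction strings are assumptions made up for the example, and the buses are just local variables.

```python
def fetch(cpu, memory):
    """Perform one fetch, following the steps on the card in order."""
    cpu["MAR"] = cpu["PC"]          # address of the next instruction moved to MAR
    address_bus = cpu["MAR"]        # address placed on the address bus
    data_bus = memory[address_bus]  # control unit requests a memory read
    cpu["MBR"] = data_bus           # result on the data bus copied to MBR
    cpu["IR"] = cpu["MBR"]          # ...and then to IR
    cpu["PC"] += 1                  # PC incremented by 1
    return cpu


# Hypothetical program held in a word-addressed memory.
memory = ["LOAD 5", "ADD 6", "STORE 7"]
cpu = {"PC": 0, "MAR": None, "MBR": None, "IR": None}
fetch(cpu, memory)
print(cpu)
```

After one call the IR holds the instruction that was at the old PC, and the PC points at the next one.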
What happens in an indirect fetch cycle?
The address in the instruction points to a memory location that holds the address of the operand, not the operand itself. That location is read first, and the value fetched is then used as the address of the data required
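Worked example (an illustrative Python sketch; the addresses and values are made up): indirect addressing costs two memory reads, one for the pointer and one for the data.

```python
def read_indirect(memory, address_field):
    """Return the operand reached through one level of indirection."""
    pointer = memory[address_field]  # first read: fetch the operand's address
    return memory[pointer]           # second read: fetch the operand itself


# Hypothetical memory: location 10 holds the address 42, location 42 holds the data.
memory = {10: 42, 42: 7}
operand = read_indirect(memory, 10)
```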
Advantages of Von Neumann?
Simple - data and instructions stored in a single memory space
Cost-Effective - Smaller number of components
Disadvantages of Von Neumann
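Bottleneck - Shared bus, so an instruction and data cannot be fetched at the same time
Memory Corruption - Instructions and data share the same memory space, so a faulty program can overwrite its own instructions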
Advantages of Harvard
Faster processing - Two buses
Improved Security - Instructions and data are not stored in the same location, so data writes cannot overwrite instructions
Efficient use of resources
Disadvantages of Harvard
Complexity - intricate design
Higher cost
Less Flexibility
RISC characteristics
- Fixed-length instructions, each designed to execute in a single clock cycle
- Pipelines to achieve one-instruction-per-one-clock-cycle
- Simple control logic to increase clock speed
- Operations performed on internal registers (load store instructions access external memory only)
CISC characteristics
- Binary compatibility (old binary code on newer systems)
- Complex control logic
- Use of micro-code
- Variable length instructions to save program memory
- Small internal register sets compared with RISC
- Complex addressing modes, operands can reside in external memory or internal registers
What is pipelining?
Pipelines overlap operations to aim to complete an instruction every clock cycle
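Worked example (an illustrative Python sketch under idealised assumptions: k stages of one cycle each, no stalls or branches; the function names are made up): with overlap, the first instruction takes k cycles and every later one completes one cycle after the previous.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction runs all stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """Fill the pipeline once, then complete one instruction per cycle."""
    return n_stages + (n_instructions - 1)


def speedup(n_instructions, n_stages):
    return cycles_unpipelined(n_instructions, n_stages) / cycles_pipelined(
        n_instructions, n_stages
    )
```

For 100 instructions on a 5-stage pipeline this gives 104 cycles instead of 500, and the speedup approaches the number of stages as the instruction count grows.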
What are the different ways of dealing with branches in a pipeline?
- Multiple streams
- Prefetch Branch Target
- Loop buffer
- Branch prediction
- Delayed branching
How does multiple streams work in pipelining?
- Have two pipelines
- Prefetch each branch into a separate, appropriate pipeline
- Discard the stream for the branch that was not taken
Disadvantages of Multiple streams
- Leads to bus & register contention
- Multiple branches lead to further pipelines being needed
How does Prefetch Branch Target work?
- Target of branch is prefetched in addition to instructions following branch
- Keep target until branch is executed
How does Loop Buffer work?
- Stores all the instructions of a loop in a buffer in the CPU
- Optimises the process of jumping away from the previous instruction
- Check buffer before fetching from memory
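Worked example (an illustrative Python sketch; the class name, the capacity and the first-in-first-out replacement are assumptions, not how any specific CPU does it): the fetch stage checks a small buffer first and only goes to memory on a miss.

```python
class LoopBuffer:
    """Hypothetical small buffer of recently fetched instructions."""

    def __init__(self, memory, capacity=4):
        self.memory = memory
        self.capacity = capacity
        self.buffer = {}  # address -> instruction, kept in insertion order
        self.hits = 0
        self.misses = 0

    def fetch(self, address):
        if address in self.buffer:  # check buffer before fetching from memory
            self.hits += 1
            return self.buffer[address]
        self.misses += 1
        instruction = self.memory[address]
        if len(self.buffer) >= self.capacity:
            oldest = next(iter(self.buffer))  # evict the oldest entry
            del self.buffer[oldest]
        self.buffer[address] = instruction
        return instruction


# Hypothetical three-instruction loop executed three times.
loop = LoopBuffer(["CMP", "ADD", "JNZ 0"])
for _ in range(3):
    for address in (0, 1, 2):
        loop.fetch(address)
```

Only the first pass through the loop goes to memory; later iterations are served from the buffer, which is why the technique suits small loops.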
How does Static Branch Prediction work?
- Predicts one side (jump or not jump)
- If predicting no jump, always fetch the next instruction
- If predicting jump, always fetch the target instruction
How would Branch Prediction be improved?
- Predict using the opcode
- Produce statistics on the likelihood of jumping
- Can learn from the outcomes of previous branches
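Worked example (an illustrative Python sketch; the 2-bit saturating counter is one common learning scheme chosen for illustration, not something the card specifies): the predictor keeps a small counter per branch and nudges it towards the observed outcome.

```python
class TwoBitPredictor:
    """Saturating counter: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counter = 0  # start at "strongly not taken" (an assumption)

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_correct(outcomes):
    """Run the predictor over a list of actual outcomes (True = taken)."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct
```

For a loop branch taken nine times and then not taken once, the predictor is wrong twice while warming up and once on the loop exit.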
What is a superscalar processor?
A processor that completes more than one instruction per clock cycle
What is In-Order Issue, Completion?
- Issue instructions in the order they occur
- May fetch >1 instruction
Disadvantages of In-order Issue, Completion
- Not very efficient
- Instructions must stall if necessary
What is True Data Dependency?
The processor can fetch and decode two instructions simultaneously, but it cannot execute the second until the first has produced its result, because the second depends on that result (read after write)
What is Procedural Dependency?
Cannot execute instructions after a branch in parallel with instructions before a branch, preventing simultaneous fetches
What is Resource Conflict?
Two or more instructions requiring access to the same resource at the same time
What is Out-of-Order Issue, Completion?
- Decouple decode pipeline from execution pipeline
- Can continue to fetch and decode until this pipeline is full
- When a functional unit becomes available, an instruction can be executed
What is an Antidependency?
A later instruction writes to a register whose old value an earlier instruction still needs to read (write after read)
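Worked example (an illustrative Python sketch; the `Instr` tuple and register names are made up): true data dependencies and antidependencies can both be spotted by comparing the registers two instructions read and write.

```python
from collections import namedtuple

# dest: register written; srcs: tuple of registers read.
Instr = namedtuple("Instr", ["dest", "srcs"])


def classify(first, second):
    """List the dependencies of `second` on the earlier instruction `first`."""
    kinds = []
    if first.dest in second.srcs:
        kinds.append("true data dependency")  # second reads what first writes
    if second.dest in first.srcs:
        kinds.append("antidependency")        # second overwrites what first reads
    return kinds


# R1 = R2 + R3 followed by R4 = R1 + R5: the second needs the first's result.
raw = classify(Instr("R1", ("R2", "R3")), Instr("R4", ("R1", "R5")))
# R1 = R2 + R3 followed by R2 = R5 + R6: the second changes a value the first reads.
war = classify(Instr("R1", ("R2", "R3")), Instr("R2", ("R5", "R6")))
```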
What is an Instruction Set?
The structure of a computer that a machine language programmer must understand to write a correct program for that machine
What is an opcode?
The operation code specifies the operation that has to be performed
What is the operand reference?
Where the input data for the operation is located; the operand is read from there
What is the result reference?
Where the instruction output gets put
What does the ALU do?
It does the arithmetic and logic operations in the CPU. It can also shift or rotate bits
What is the benefit of more addresses per instruction?
- More complex (more powerful) instructions
- More registers
- Register-to-Register operations are quicker
- Fewer instructions per program
What is the benefit of fewer addresses per instruction?
- Less complex instructions
- More instructions per program
- Faster fetch/execution of instructions
What is the importance of data alignment?
Reading misaligned data may need multiple memory reads and shifts, which will negatively affect performance
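Worked example (an illustrative Python sketch, assuming memory is read one aligned 4-byte word at a time; the function name is made up): counting how many word reads an access needs shows the cost of misalignment.

```python
def word_reads(address, size, word_size=4):
    """Number of aligned word reads needed to cover `size` bytes at `address`."""
    first_word = address // word_size
    last_word = (address + size - 1) // word_size
    return last_word - first_word + 1


aligned = word_reads(4, 4)     # bytes 4..7 sit in a single word
misaligned = word_reads(6, 4)  # bytes 6..9 straddle two words
```

The misaligned access needs two reads, and the two halves then have to be shifted and merged.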
What is Endianness?
The order in which the bytes of a word are stored in memory (and, in some contexts, the order of the bits in a byte)
How are bytes ordered in big endian?
Most significant byte in the lowest numerical address
How are bytes stored in little endian?
The least significant byte in the lowest numerical address
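Worked example (an illustrative Python sketch; the value 0x12345678 is arbitrary): `int.to_bytes` lays out the same 32-bit value in both byte orders, with index 0 standing for the lowest address.

```python
value = 0x12345678

big = value.to_bytes(4, "big")        # most significant byte at the lowest address
little = value.to_bytes(4, "little")  # least significant byte at the lowest address

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value
```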
What are characteristics of big endian?
- Memory dumps read left to right (easy for western audiences)
- Big endian machines store character strings and integers in the same order
- Has to perform an extra operation (addition) when converting a 32-bit address to a 16-bit address
What are the 5 main instruction set architectures?
- Accumulator based
- Stack based
- Register-memory based
- Register-register
- Memory-Memory
How does Accumulator ISA work?
A value is loaded into the accumulator and another is added directly from memory with the result stored in accumulator
How does Stack based ISA work?
- Both operands pushed onto the stack
- The operation takes both operands off the stack and leaves the result on top
- The result is popped off the stack
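Worked example (an illustrative Python sketch; the tuple-based instruction format and the variable names A, B, C are made up): computing C = A + B on a stack machine, where ADD names no operands.

```python
def run_stack(program, memory):
    """Interpret a tiny stack-machine program; returns the updated memory."""
    stack = []
    for op, *args in program:
        if op == "PUSH":    # push a value from memory onto the stack
            stack.append(memory[args[0]])
        elif op == "ADD":   # operands are implicit: the top two stack entries
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "POP":   # pop the result into memory
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = [("PUSH", "A"), ("PUSH", "B"), ("ADD",), ("POP", "C")]
result = run_stack(program, {"A": 2, "B": 3})
```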
How does Register-memory ISA work?
- One operand is loaded from memory into a register
- The add reads the other operand directly from memory, and the result is stored in a register
LOAD R3, A
ADD R1, R3, B
STORE R1, X
How does Register-Register ISA work?
- Operands are loaded from memory to registers
- Add uses the operands stored in registers
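Worked example (an illustrative Python sketch; the instruction tuples and register names are made up): computing C = A + B on a register-register (load/store) machine, where only LOAD and STORE touch memory.

```python
def run_load_store(program, memory):
    """Interpret a tiny load/store program; returns the updated memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":     # LOAD Rd, addr: memory -> register
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "ADD":    # ADD Rd, Rs1, Rs2: registers only
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "STORE":  # STORE Rs, addr: register -> memory
            rs, addr = args
            memory[addr] = regs[rs]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


program = [
    ("LOAD", "R1", "A"),
    ("LOAD", "R2", "B"),
    ("ADD", "R3", "R1", "R2"),
    ("STORE", "R3", "C"),
]
result = run_load_store(program, {"A": 2, "B": 3})
```

The same sum takes four instructions here, which illustrates the higher instruction count of this style in exchange for simple, uniform instructions.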
Advantages of Accumulator ISA
- Short Instructions
- One implicit operand, one explicit
Disadvantages of Accumulator ISA
- Single temporary storage location
- High memory traffic
Advantages of Stack ISA
- Simple model
- Short instructions
- Implicit operands
Disadvantages of Stack ISA
- The stack cannot be randomly accessed
- Stack becomes a bottleneck
Advantages of Register ISA
- Easy code generation
- Clever compiler optimisations
- Fast access to temporary values
Disadvantages of Register ISA
- Operands must be named
- Longer instructions
Advantages of Register-Memory
- Simple code generation
- Data can be accessed directly
Disadvantages of Register-Memory
- Operands are not treated equally (one is a register, the other is in memory), so functionally commutative operations behave non-commutatively
- Instructions require a variable number of cycles
Advantages of Register-Register
- Fixed-size instructions
- Simple code generation
- (Most) instructions require a similar, known number of cycles
- Fast
Disadvantages of Register-Register
High instruction count
Advantages of Memory-Memory
Produces compact code
Disadvantages of Memory-Memory
- Large variation in instruction size
- Large variation in execution time per instruction
- Memory access is the bottleneck
- No longer used
What is Memory Connection?
- Consists of N words of equal length with unique address
- Memory receives addresses and control signals (Read, Write, Timing), and receives or sends data
What is a CPU connection?
- Reads instruction and data
- Sends control signals to other units
- Receives and acts on interrupts
What is a shared bus?
A common communication pathway
What does the address bus do?
Identifies the source or destination of data
What does the Control bus do?
Control and timing of information
What are the typical control lines?
- Memory read/write signal
- I/O Port read/write signal
- Transfer Acknowledgement
- Bus request/grant
- Interrupt request/acknowledgement
- Clock signals
- Reset
What issues are caused by a single bus?
- Propagation delays, different devices may work at different speeds
Why is Timing important with interconnects and buses?
- Coordination of events on bus
- Normally Synchronous as events are determined by clock signals
What is PCIe (Peripheral Component Interconnect Express)?
A serial bus made up of “lanes”, each running at GByte/s speeds that depend on the PCIe version. Devices that need more bandwidth, such as a GPU card, use more lanes
How does a module use a bus?
- Obtain the use of the bus
- Transfer data and/or requests
- Synchronise and/or acknowledge
What does PCIe do?
High-speed serial computer expansion bus standard. It is like a network with layers and addressing
What is SAS (Serial attached SCSI)?
- Very fast serial SCSI, compatible with latest SATA
- Very flexible due to the layers of the protocol
What are characteristics of USB?
- Ideal for low-speed to high-speed I/O devices
- Expandable, as its simple design and configuration allow up to 127 devices
What are the elements of USB hardware?
- Assumes a root hub connected to the main bus
- Cable contains four wires
- Data is transmitted with a 0 encoded as a voltage transition and a 1 as the absence of one (NRZI encoding)
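Worked example (an illustrative Python sketch of the transition rule on the card; line levels are simplified to 0 and 1, and the idle starting level is an assumption): a 0 bit flips the line level and a 1 bit leaves it unchanged.

```python
def encode_transitions(bits, initial_level=1):
    """Turn data bits into line levels: 0 = transition, 1 = no transition."""
    level = initial_level
    levels = []
    for bit in bits:
        if bit == 0:
            level ^= 1  # a 0 is sent as a change of level
        levels.append(level)
    return levels


def decode_transitions(levels, initial_level=1):
    """Recover data bits by comparing each level with the previous one."""
    bits = []
    previous = initial_level
    for level in levels:
        bits.append(0 if level != previous else 1)
        previous = level
    return bits


levels = encode_transitions([1, 0, 0, 1, 0])
```

Because the information is in the changes rather than the levels, the receiver only has to detect whether the line changed between bit times.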
What are the four kinds of USB frames?
- Control
- Isochronous (For real time devices where data should be sent/received at precise intervals)
- Bulk
- Interrupts (Used for regular polling of devices)
What do motherboard interconnects do?
- High speed links to chipset from one or more CPU packages
- Links are very similar to PCIe
What is a chiplet?
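A small chip (die) designed to be combined with other chiplets in a single package, so that lots of chips together form one larger processor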
What is a chipset?
A set of electronic components on one or more integrated circuits that manages data flow
What is UPI (Ultra Path Interconnect)?
- Intel’s proprietary high speed link
- Point to point link between CPU chips and to chipset
- Handles cache-coherency
- Around 20GiB/s per link
What is BIOS (basic input/output system)?
- Firmware on the motherboard
- Hardware initialisation
- Booting
Where is BIOS stored?
In Flash memory
What does BIOS do?
- Used for I/O functions in MS-DOS to help standardise PCs
- finds a boot loader on disk/CD/USB
- Loads first sector of disk into RAM
What is UEFI (Unified Extensible Firmware Interface) ?
Replaces the old BIOS and connects an OS to its firmware
What is Flynn’s Taxonomy?
Classification of computer architectures
What is SISD?
Single Instruction Single Data
What is MISD?
Multiple Instruction Single Data
What is SIMD?
Single Instruction Multiple Data
What is MIMD?
Multiple Instruction Multiple Data
What does SIMD need?
Special hardware, e.g. Streaming SIMD Extensions (SSE), and special software
What is SSE used for?
- Image processing
- Video processing
- Array/vector processing
- Text processing
- General speed-up
What is SIMD used for?
CUDA and GPU processing
What is SMP (Symmetric Multiprocessors)?
- A MIMD System that has multiple CPUs share main memory and I/O
- The hardware manages contention, and performance increases, especially for multi-user and multi-threaded workloads
How does a Typical SMP system work?
- Each processor has its own L1 and L2 cache
- Connected by a system bus, crossbar switch or other interconnect
- Main memory, I/O, etc are also connected to the interconnect
What is Heterogeneous Multi-processing?
- Combine big performance cores with little energy efficient cores
- “Big” cores only used when performance is necessary, “little” cores used for most tasks
- Needs operating system support to fully leverage
What is Simultaneous multithreading (SMT) / Hyper-threading?
- Hardware multi-threading on superscalar CPUs
- Executes instructions from multiple threads at the same time using redundant execution units in the processor
What is Data parallelism?
Split the data to make independent parallel tasks
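Worked example (an illustrative Python sketch; the function names and the sum-of-squares task are made up): the data is split into chunks, each chunk is processed independently, and the partial results are combined. Threads are used here only to keep the example simple; in CPython they do not run CPU-bound work truly in parallel, so this shows the structure rather than a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split a list into at most n_chunks contiguous pieces."""
    if not data:
        return []
    size = -(-len(data) // n_chunks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The independent task applied to each chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)  # combine the partial results


total = parallel_sum_of_squares(list(range(10)))
```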