Semester 1 Flashcards

Question

What are registers?

Answer 1

- Fast local memory on the CPU
- Very small storage locations used to hold data temporarily
- They have very high read/write speeds

Answer 2

External (peripheral) devices that can be categorised into 3 groups:
- **Secondary storage devices** e.g. hard disk
- **Input devices** e.g. a keyboard/sensor
- **Output devices** e.g. a speaker/actuator

Each peripheral also has a **device driver** that provides a software interface for the device

Answer 3

- A series of **parallel wires** that **connect internal components** of a computer system, allowing signals to be passed through them

Answer 4

- The number of parallel wires in a bus has a **direct relationship** to the number of bits that can be transferred.

Answer 5

Unidirectional (away from the CPU)
- Transports memory addresses
- Bigger width = larger range of addresses, increasing the computer's amount of addressable memory
- Each extra wire doubles the range: 1 wire = 2^1 addresses, n wires = 2^n addresses

Answer 6

Bidirectional: main memory <-> processor
- Sends data and instructions
- Bigger width = larger volume of data transfer
- 1 wire = 1 bit

Answer 7

Bidirectional: main memory <-> CPU (processor)
- Carries control signals to regulate operations
- Higher clock speed (a control signal) = more instructions per second + higher temperature/power consumption
- Can carry the clock and memory read/write signals

Answer 8

- All data/instructions are stored in the main memory
- Instructions are sent to the processor along the system bus to be executed
- Data sent to/from the processor is sent along the system bus
- Any input/output is performed by I/O devices, with the data travelling from them to the CPU/main memory

Answer 9

- The main difference from the von Neumann architecture is that it has **separate buses for data and instructions**, making it more efficient and faster

Answer 10

- It uses other parts of the machine to do its job

Answer 11

- machine code instructions stored in main memory are fetched and executed serially by a processor that performs arithmetic and logical operations.

Answer 12

FETCH
1. The content of the PC is copied into the MAR (PC -> MAR)
2. The content of the MAR is sent to main memory along the address bus (MAR -> address bus -> MM)
3. The instruction at that address is sent from MM to the MBR/MDR along the data bus (MM -> data bus -> MBR)
4. Simultaneously, the program counter is incremented by 1 so that it points to the next instruction (PC + 1)
5. The content of the MBR is copied to the Current Instruction Register (MBR -> CIR)

DECODE
1. The content of the CIR is decoded by the control unit
2. The decoded instruction is split into opcode + operand

EXECUTE
1. Any data required by the instruction that isn't present in registers is fetched
2. The instruction is carried out
3. Results of any calculations are stored in general purpose registers, main memory or an accumulator (e.g. from the ALU for arithmetic calculations)

Answer 13

- Doing multiple parts of the FDE cycle in parallel so that computations can be achieved at a faster rate (e.g. superscalar architectures to make processes parallel)
- Parallel processing

Answer 14

- Relatively small-capacity set of locations that sit close to the processor, used to store the most frequently used instructions and data
- More cache = more instructions can be queued and carried out
- L1 cache is the smallest and fastest
- L2 cache is shared by cores, and is larger but slower
- L3 and the newer L4 are slow but large, and sit on or near the processor
- L1 is closest to the CPU
- Cache can be cleared to increase the speed

Answer 15

Source: Where the voltage enters the transistors.

Answer 16

Drain: Where the voltage leaves the transistors.

Answer 17

The terminal that controls the flow

Answer 18

- **n-MOS**: **Gate** driven **positive** **allows** current to flow between Source and Drain (works). Gate driven negative isolates Source and Drain (stops).
- **p-MOS**: **Gate driven negative allows** current to flow between Source and Drain (works). Gate driven positive isolates Source and Drain (stops). The **p-MOS symbol has a little circle on the gate**.

Answer 19

- +V = 1 = High voltage
- -V = 0 = Low voltage

Answer 20

CMOS Inverter has the same functionality as a NOT gate.

Answer 21

2 p-MOS and 2 n-MOS

Answer 22

2^N where N is the number of inputs. e.g. 3 inputs would be 2^3 = 8 possible combinations in a truth table
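
As a quick check of the 2^N rule, here is a minimal Python sketch that enumerates every input combination. The function name `truth_table_rows` is made up for illustration.

```python
from itertools import product

def truth_table_rows(n_inputs):
    """Return every combination of n_inputs binary inputs, in counting order."""
    return list(product([0, 1], repeat=n_inputs))

rows = truth_table_rows(3)
print(len(rows))   # number of rows for 3 inputs
print(rows[:3])    # first few rows
```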

Answer 23

With more than 2 inputs, the output is 1 if the number of 1s is odd; otherwise it is 0.

Answer 24

You can make an XOR gate with just 4 NAND gates (arranged 1, then 2, then 1 again)
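
One common way to wire the four NAND gates can be sketched in Python as below. The helper names `nand` and `xor_from_nand` are illustrative, and the wiring is the standard textbook arrangement, assumed here because the card's diagram is not included.

```python
def nand(a, b):
    """2-input NAND on single bits (0 or 1)."""
    return 0 if (a and b) else 1

def xor_from_nand(a, b):
    """XOR built from 4 NAND gates in a 1-2-1 arrangement."""
    n1 = nand(a, b)        # level 1: one gate
    n2 = nand(a, n1)       # level 2: two gates
    n3 = nand(b, n1)
    return nand(n2, n3)    # level 3: one gate

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_from_nand(a, b))
```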

Answer 25

1. Using AND gates, we can draw the outputs of M by focusing on the rows of the truth table in which M = 1, drawing 1 AND gate for each 1 output
2. NOT (a line over the character) symbolises 0 as the input, NOT 1.
3. The truth table consists of only 1s or 0s
4. Output of 1 = 0, 2 = 0, 3 = 0 and 4 = 1, and gate 8 is an OR gate, meaning the output of M in this case is 1.

Answer 26

We can use a NAND gate as a NOT gate if both values put into the NAND gate are the same.

Answer 27

We can use a NOR gate as a NOT gate if both values put into the NOR gate are the same.

Answer 28

Your second NAND can essentially function as a NOT gate, taking in two of the same value to change the NAND into an AND.

Answer 29

1. The first two NOR gates become NOT gates by both taking in the same input; NOT + NOR creates an AND gate as the end result.
2. You can do something similar but replace all NOR gates with NAND gates to create an OR gate.
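
The constructions on the last few cards can be checked with a short Python sketch. All function names here are illustrative, not from the course material.

```python
def nand(a, b):
    return 1 - (a & b)

def nor(a, b):
    return 1 - (a | b)

def not_from_nand(a):
    return nand(a, a)            # same value on both inputs gives NOT

def not_from_nor(a):
    return nor(a, a)             # same value on both inputs gives NOT

def and_from_nand(a, b):
    return not_from_nand(nand(a, b))   # the second NAND acts as a NOT

def and_from_nor(a, b):
    return nor(not_from_nor(a), not_from_nor(b))     # NOT, NOT, then NOR

def or_from_nand(a, b):
    return nand(not_from_nand(a), not_from_nand(b))  # same shape using NAND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_from_nand(a, b), and_from_nor(a, b), or_from_nand(a, b))
```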

Answer 30

Combines multiple input signals into a single common output. Always consists of at least one NOT gate, AND gate and OR gate.

Answer 31

If the select signal is 0, the output will always be the value of D0

Answer 32

- It is an 8-input multiplexer that can compute the majority vote function (outputs 1 when most inputs are 1)

Answer 33

- Read ABC as a binary number to decide which D input to select, then output the value on that input. e.g. if ABC = 111 = 7, it chooses D7 and, as shown in the diagram, 1 is output
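
The selection rule can be sketched in Python as follows. This assumes A is the most significant select bit, and the data inputs in `MAJORITY_D` are wired so that Di = 1 whenever at least two bits of i are 1; both are assumptions, since the diagram is not reproduced here, and the names `mux8` and `MAJORITY_D` are illustrative.

```python
def mux8(d, a, b, c):
    """8-input multiplexer: ABC, read as a binary number, selects one of D0..D7."""
    select = (a << 2) | (b << 1) | c
    return d[select]

# Assumed wiring for the majority vote function: D3, D5, D6 and D7 tied to 1.
MAJORITY_D = [0, 0, 0, 1, 0, 1, 1, 1]

print(mux8(MAJORITY_D, 1, 1, 1))  # ABC = 7 selects D7
print(mux8(MAJORITY_D, 1, 0, 0))  # ABC = 4 selects D4
```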

Answer 34

- A decoder converts coded inputs into coded outputs where the input and output codes are different - Large number of outputs, 1 output should be 1, the rest should be 0.

Answer 35

An encoder takes all data inputs one at a time and converts them to a single output

Answer 36

- Input signal D is connected to all outputs

1. Reads the inputs ABC as a binary number
2. Selects the output with that number
3. e.g. if A = 1, B = 1 and C = 1, this is 111 = 7, so the system would output a 1 where F7 is seen
4. aka 10000000
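
A minimal Python sketch of this routing, assuming A is the most significant select bit and the outputs are written F7 down to F0. The function names are illustrative.

```python
def route_to_output(d, a, b, c):
    """Send input d to the output chosen by ABC; every other output stays 0."""
    select = (a << 2) | (b << 1) | c
    outputs = [0] * 8          # outputs[i] is Fi
    outputs[select] = d
    return outputs

def as_string(outputs):
    """Write the outputs with F7 on the left and F0 on the right."""
    return "".join(str(bit) for bit in reversed(outputs))

print(as_string(route_to_output(1, 1, 1, 1)))  # ABC = 111 drives F7
```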

Answer 37

Decimal system, digits 0-9

Each position has a place value/weight: 1 = 10^0, 10 = 10^1, 100 = 10^2

Answer 38

Notation: to denote the base of a number with the base as a subscript (often omitted for base 10)

Answer 39

Starting at the lowest place value (position 0, the rightmost digit), multiply each digit by 2^position, then add the results together

Answer 40

Starting at the lowest place value (position 0, the rightmost digit), multiply each digit by 8^position, then add the results together

Answer 41

Starting at the lowest place value (position 0, the rightmost digit), convert each hex digit to its decimal value, multiply it by 16^position, then add the results together
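
The same place-value method covers binary, octal and hex, as this Python sketch shows. The function name `to_decimal` is made up for illustration.

```python
DIGITS = "0123456789ABCDEF"

def to_decimal(number, base):
    """Convert a digit string in the given base to a decimal integer."""
    total = 0
    # Position 0 is the rightmost digit, so walk the string in reverse.
    for position, char in enumerate(reversed(number.upper())):
        value = DIGITS.index(char)      # converts hex digits A-F to 10-15
        if value >= base:
            raise ValueError(f"digit {char!r} is not valid in base {base}")
        total += value * base ** position
    return total

print(to_decimal("1011", 2))
print(to_decimal("17", 8))
print(to_decimal("2F", 16))
```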

Answer 42

1. For hex: split from right to left in groups of 4, then convert each nibble to its hex value
2. For octal: split from right to left in groups of 3, then convert each group into octal (using denary conversion), which you can then write out
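
The grouping method can be sketched in Python like this. The name `regroup` is illustrative; the input is padded with leading zeros so that the groups line up from the right.

```python
def regroup(bits, group_size):
    """Convert a binary string to hex (group_size=4) or octal (group_size=3)."""
    digits = "0123456789ABCDEF"
    # Pad on the left so the groups are counted from the right-hand end.
    padding = (-len(bits)) % group_size
    bits = "0" * padding + bits
    groups = [bits[i:i + group_size] for i in range(0, len(bits), group_size)]
    return "".join(digits[int(group, 2)] for group in groups)

print(regroup("101111", 4))  # hex
print(regroup("101111", 3))  # octal
```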

Answer 43

- 1 + 1 = 0, carry 1
- 1 + 0 or 0 + 1 = 1
- 1 + 1 + 1 = 1, carry 1

Answer 44

1. Decimal to binary conversion can also be done by working down the binary place values, e.g. if you're trying to put 280 in binary, you can't take away 512, so that is a 0, whereas 256 fits, so that is a 1; you subtract it and continue down.
2. See image
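
The subtraction method can be sketched in Python as below, using the card's own example of 280. The name `to_binary` and the fixed `width` parameter are assumptions made for illustration.

```python
def to_binary(n, width):
    """Convert n to a binary string by working down the place values."""
    if n < 0 or n >= 2 ** width:
        raise ValueError("value does not fit in the given width")
    bits = []
    for place in range(width - 1, -1, -1):
        weight = 2 ** place
        if n >= weight:
            bits.append("1")   # the weight fits, so take it away
            n -= weight
        else:
            bits.append("0")   # too big to take away
    return "".join(bits)

print(to_binary(280, 10))  # starts at 512, which cannot be taken away
```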

Answer 45

- An AND gate and an XOR gate
- Takes two inputs to add
- Sum is the XOR gate; this is the lowest place value when adding
- Carry is the AND gate; this is carried to the next adder for the next digit addition

Answer 46

It takes 3 inputs instead of 2: it also takes in a carry from the previous addition, and it outputs a carry as well.

Answer 47

The delay while signals propagate through the gates. In a full adder, the timing of the carry and of the inputs can be considered separately because they are put in separately.

Answer 48

Shift a binary value up or down a place

Answer 49

If each level of logic gates adds a delay of 1t, the total delay is 3t, as it is made with 4 NAND gates arranged in 3 levels.

Answer 50

AND, AND, XOR, XOR (second one for the sum), + OR for the carry
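
The gate list above (two AND, two XOR, one OR) is what you get from chaining two half adders, which can be sketched in Python as follows. The function names are illustrative.

```python
def half_adder(a, b):
    """Return (sum, carry): sum is the XOR gate, carry is the AND gate."""
    return a ^ b, a & b

def full_adder(a, b, carry_in):
    """Two half adders plus an OR gate to combine the two carries."""
    partial_sum, carry1 = half_adder(a, b)             # first XOR and AND
    total, carry2 = half_adder(partial_sum, carry_in)  # second XOR and AND
    return total, carry1 | carry2                      # OR gate for the carry

print(full_adder(1, 1, 1))  # (sum, carry out)
```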

Answer 51

Between two strings of equal length, it is the number of positions at which corresponding symbols are different. In other terms, it measures the minimum number of substitutions required to change one string into the other.
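
A minimal Python sketch of this definition; the function name `hamming_distance` is illustrative.

```python
def hamming_distance(x, y):
    """Count the positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must be the same length")
    return sum(cx != cy for cx, cy in zip(x, y))

print(hamming_distance("10110", "10011"))
```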

Answer 52

hamming distance >= d+1

Answer 53

hamming distance >= 2d + 1
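
Turning the two inequalities around gives the capability of a code with a known minimum Hamming distance. This is a small Python sketch with illustrative function names.

```python
def max_detectable_errors(distance):
    """Largest d with distance >= d + 1."""
    return distance - 1

def max_correctable_errors(distance):
    """Largest d with distance >= 2d + 1."""
    return (distance - 1) // 2

for distance in (2, 3, 6):
    print(distance, max_detectable_errors(distance), max_correctable_errors(distance))
```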

Answer 54

Even parity = Even number of 1’s (after adding the parity bit)

Answer 55

Odd parity = Odd number of 1’s (after adding the parity bit)
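
Both parity schemes can be sketched in Python as below. The function name `parity_bit` is illustrative; the bit is computed from the count of 1s in the data, which is equivalent to XOR-ing all the data bits together.

```python
def parity_bit(data, even=True):
    """Return the parity bit to append to a string of data bits."""
    ones = data.count("1")
    bit = ones % 2                     # 1 when the data has an odd number of 1s
    return bit if even else 1 - bit    # odd parity is the inverse

data = "1011"
print(data + str(parity_bit(data, even=True)))   # even number of 1s overall
print(data + str(parity_bit(data, even=False)))  # odd number of 1s overall
```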

Answer 56

The number of check bits as a percentage of the word size, e.g. if the word size was 4 and there were 3 check bits, the percent overhead would be 3/4 = 75%

Answer 57

- Check each circle for its individual parity bits and then look for errors when crossing over

Answer 58

6 XOR gates | 1 LESS

Answer 59

The error is in data bit D1

Footnote: Treat each circle as checking its own parity; often the error is the odd crossover out

Answer 60

There is no error

Answer 61

Data bit D2

Footnote: Treat each circle as checking its own parity; often the error is the odd crossover out

Answer 62

- Even parity: XOR gate
- Odd parity: XNOR gate

Answer 63

S = Sender, R = Receiver

For each row, you start at the position matching the row's number, e.g. the first x on row 4 starts at position 4; the row then prints 4 x's, skips 4 spaces, and repeats. Rows continue until the length of the Receiver has been mapped.

Answer 64

Contains a clock which syncs up transfer, and positive feedback is received by using previously calculated data and pulling it from memory.

Answer 65

- A is the data sent out
- B is the data from A but with a time delay
- C is the data gathered when both A AND B are on their rising edge

Reminder: clock signals have a rising edge and a falling edge

Answer 66

- (a) is a NOR latch in state 0
- (b) is a NOR latch in state 1

Answer 67

The system is vulnerable. The outputs need to be opposite each other to be stable (e.g. outputting 1 and 0)

Answer 68

- Controls the system and ensures it is more synced.
- D is connected to both AND gates, but its inverse is connected to the bottom AND gate, meaning Q and Qbar cannot both be high.

Answer 69

The inclusion of the AND gates means that in no scenario can both values be 1, preventing the vulnerability. However, there is a small time delay for taking the negation of D, which may allow a small time frame in which both Q and Qbar are high.

Answer 70

D-type flip-flop. It is adjusted to make the time delay essentially negligible, as it is very small.

Answer 71

- Master and slave (two D-type flip flops) are never on at the same time. - It requires a symmetric clock at high speed.

Answer 72

Each flip flop covers 1-bit from a register.

Answer 73

- Clear empties the signal
- Large triangle: amplifies the signal to ensure it stays distinguishable between high and low voltage as it travels and potentially weakens

Answer 74

A complete word

Answer 75

Any high-level language has to be defined in terms of microinstructions, which in turn have to be supported by a microarchitecture.

Answer 76

- Having fewer microinstructions - Adding more hardware

Answer 77

Latches separate parts of the circuit for us (allowing sections to be pipelined)

Answer 78

If a particular storage location is referenced, it is likely that nearby memory locations will be referenced in the near future

Answer 79

Tag = 2^16 (half) Line + word + byte = 2^16

Answer 80

Cache miss: Indicated by a comparator, the memory value has never been fetched before

Answer 81

Value has been fetched into the cache previously, quicker execution

Answer 82

**If a byte is present in cache, it can only be in one place**

e.g. 64 Kbyte cache:
- Cache contains 2K lines/entries
- Each line contains 32 bytes of data
- Each line also contains a 16 bit TAG and 1 valid bit
- The valid bit is 1 if there is real data in that line
- The TAG contains the 16 most significant bits of the actual address of the data contained in that line

Answer 83

1. CPU transmits the 32 bit address of X
2. X 5-15 selects the cache line
3. Cache is initially empty, so the cache line's valid bit is 0
4. Comparator indicates a cache miss (no pre-fetched information)
5. (Due to the cache miss) The whole address is propagated to main memory, which outputs the requested data word.
6. The requested data word is read by the CPU.
7. The requested data word is also written to the cache line selected by X 5-15, in the data word selected by X 2-4
8. Cache reads 7 more words from main memory to complete the data part of the cache line.
9. X 16-31 is written to the tag part of the cache line selected by X 5-15
10. The valid bit of the cache line selected by X 5-15 is set to 1 to indicate that a tag and 8 words of data have been written.
11. If the same address is read again, main memory plays no part in the read, as the valid bit is now 1 due to the fetched data, so a cache HIT occurs instead.
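
The bit fields used in these steps can be pulled out of an address with shifts and masks, as in this Python sketch. Treating bits 0-1 as the byte within a word is an assumption (it follows from 32 bytes per line split into 8 words), and the function name `split_address` is illustrative.

```python
def split_address(address):
    """Split a 32 bit address into (tag, line, word, byte) fields."""
    byte = address & 0x3             # X 0-1: byte within the word (assumed)
    word = (address >> 2) & 0x7      # X 2-4: one of 8 words in the line
    line = (address >> 5) & 0x7FF    # X 5-15: one of 2K cache lines
    tag = (address >> 16) & 0xFFFF   # X 16-31: stored in the tag part
    return tag, line, word, byte

tag, line, word, byte = split_address(0x1234ABCD)
print(hex(tag), line, word, byte)
```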

Answer 84

Number of lines (n) = Cache size (in bytes) / Line size

n = 32 x 2^20 bytes / 64 bytes = 524288 = 2^19

Answer 85

1. 2^20 bytes
2. 2^10 bytes
3. 8 bits

Answer 86

Cache size (in bytes)/ Line size(in bytes)

Answer 87

last 11 bits - 01101001 110

Answer 88

Log2(No of rows)

Answer 89

Log2(512) = 9
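
The sizing arithmetic on the preceding cards can be checked with a short Python sketch. The function names are illustrative, and `index_bits` assumes the count is a power of two.

```python
def number_of_lines(cache_size_bytes, line_size_bytes):
    """Number of lines = cache size (in bytes) / line size (in bytes)."""
    return cache_size_bytes // line_size_bytes

def index_bits(count):
    """log2(count) for a power-of-two count of lines or rows."""
    if count <= 0 or count & (count - 1):
        raise ValueError("count must be a power of two")
    return count.bit_length() - 1

lines = number_of_lines(32 * 2**20, 64)
print(lines, index_bits(lines))
print(index_bits(512))
```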

Answer 90

How data is moving

Answer 91

Allows more than one instruction to be fetched and stored, so more than one action can be performed in a cycle.

Answer 92

Instruction set for a microarchitecture.

Answer 93

A small operative architecture that performs logical instructions i.e. FDE cycle

Answer 94

16 registers, 12 of them general purpose (all 32 bits)
- R13 = Stack pointer
- R14 = Link register
- R15 = Program Counter
- CPSR = Current Program Status Register, allows for specific tasks, it is read only

Answer 95

Point to the next instruction?

Answer 96

Current Program Status register, allows for specific tasks, it is read only

Answer 97

- N: Negative result from the ALU
- Z: Zero result from the ALU
- C: ALU operation Carry out
- V: ALU operation oVerflowed

Answer 98

– A line of data from a given location in main memory always maps onto the same cache line

Answer 99

- A line of data from a given location in main memory always maps onto the same cache line
- This can result in thrashing, where data moves to and from memory frequently, limiting CPU performance

Answer 100

* A set-associative cache replicates every line a fixed number of times
* Data fetched from main memory could be in more than one place in cache, so a fixed number of places must be searched
* Greatly reduces thrashing and improves performance

Answer 101

- Line: a block of contiguous words from main memory
- Offset: lower bits identify a word within a line
- Way: a subdivision of cache where a Line is stored
- Tag: top bits of the 64-bit address tell the cache where the Line came from in main memory
- Index: middle bits determine in which line of the cache the address can be found
- Set: cache lines from all Ways sharing a particular Index
- 7/8. Tag and RAM
* CPU Read1: fetch Line from main memory and store in Way
* CPU Read2: search Indexed Set for Tag

Answer 102

Consider an instruction fetch:
1. There is a cache lookup in the L1 data cache
2. If it is found in the L1 cache, the data is read from the L1 cache and returned to the core

(**The remaining steps only apply if it isn't in the L1 cache**)

3. If it isn't found in the L1 cache, but IS found in the L2 cache, the cache line is loaded into the L1 cache from the L2 cache and the data is returned to the core
4. If it is not in either L1 or L2, the data is loaded into both of these caches from MM or the L3 cache and supplied to the core

Answer 103

- Exists to synchronise the MM with the cache (in case the cache is more recent)

1. A write updates the L1 data cache only and marks the cache line as dirty
2. The write to the L2 system is delayed until the line is evicted (its way is needed for different data)

Answer 104

1. A write updates both the L1 data cache and the L2 system immediately
2. This does not mark the cache line as dirty

Answer 105

- Maintains duplicate copies of L1 data cache tags from all cores
- The SCU monitors the line fetch memory requests and transfers between cores if dirty.

Answer 106

- Those used by the user or the compiler/assembler

Answer 107

- Those used by the actual memory system

Answer 108

- Translation table with page table entries to convert virtual to physical addresses.

Answer 109

- The memory management unit (MMU) uses the most significant bits of the virtual addresses of code/data to index into a translation table containing the physical addresses
- Translation is carried out automatically in hardware
- Transparent to the application
- The MMU also controls memory access permissions, ordering and cache policies.

Answer 110

It executes a microprogram

Answer 111

Memory accesses

Answer 112

- Memory that you can't see, read or write
- It is the instruction set of the computer that specifies what is written there: high to low level instruction
- It is used for holding microprograms
- Contains the MIR (micro instruction register)

Answer 113

Microprogram counter. It is similar to the program counter, which points to the next instruction. This does the same job but internally for the CPU, pointing to the next microinstruction of the microprogram

Answer 114

- 0 0 : A AND B
- 0 1 : NOT B
- 1 0 : A OR B
- 1 1 : A PLUS B

Answer 115

- IF = Instruction Fetch Unit
- ID = Instruction Decode Unit
- EX = Execute Unit
- MEM = Memory Unit
- WB = Writeback Unit

Answer 116

True - Log2(64) = 6

Answer 117

- It needs an address whose width in bits is log2 of the size of the memory in bytes
- e.g. if memory is 64 bytes, that is 2^6 bytes and it therefore requires an address of 6 bits
- The top two bits signify the part of the memory, e.g. if they are 10, you want access to part 2 of the memory
- Next, the line is specified
- The last bit specifies which byte

Answer 118

2^32 bytes = 4 gigabytes

Answer 119

a) i b) ii c) iv d) iii

Answer 120

d-1 errors = 5

Answer 121

D) C latch E) A latch F) B latch

Answer 122

t = 5 for each full adder (it says each will CHANGE at t = 5, not be applied at t = 5), + 1 for the OR gates in between: (5 + 1) + (5 + 1) + 5 = 17. It is 17, not 18, because the final OR isn't used.

Answer 123

- Look-ahead carry adds **time delay per bit** (**8 bits, 8 delay**)
- Ripple carry **adds time delay per gate for the first cycle**, but each cycle after the first only counts as 1 addition to the time delay (you must also count the connecting OR gates in the diagram as adding a unit), e.g. **8 bits, 20 delay** for the sum (6 + 14, where 14 = 7 x 2: 7 OR gates and 7 ripples)

Answer 124

1st adder 5 units, 20 + 5 = 25 (applied), then each following adder adds 2 units of time delay by default assumption (1 for each of s1, s2 etc., 1 for the OR gates connecting the ripple). 2 x 7 = 14, 14 + 25 = 39

Answer 125

100 + 16 bits = 116. The two units of time delay for gates don't matter, as LAC only considers bits.

Answer 126

- carry out = 5t
- sum = 6t

Answer 127

- **Ripple carry**: the first cycle is the normal amount (e.g. 6 for sum, 5 for carry), but then +1 time delay for each following cycle (+ the time delay of any gates, e.g. OR gates connecting the ripple)
- **Look-ahead**: just adds the number of bits to the time delay

Answer 128

CISC: Complex Instruction Set Computing (e.g. AMD/Intel)
- Memory in the 70s was expensive.
- Trying to keep programs short (limit the number of microinstructions).
- CPU designers built more functionality into individual machine instructions, a trend that later became known as CISC.
- The CISC instruction set was implemented in firmware by large microprogram stores.
- However, a lot of compilers ignored the most complex instructions for reasons of portability.

Answer 129

RISC: Reduced Instruction Set Computing (e.g. ARM, RISC-V)
- Memory became less expensive.
- Performance became more important than short programs.
- Designers noticed that only 20% of instructions were run 80% of the time, so they focused their effort on making that 20% run very quickly, using techniques such as pipelining.

Answer 130

ARM:
- Initially stood for Acorn RISC Machine, April 1985, ARM1
- ARM2 came later, added multiplication hardware.
- 1990s, new company ARM, Advanced RISC Machines

ARM business model:
- They licensed their intellectual property.
- Sold rights to their designs to semiconductor companies.
- The ARM company now licenses IP blocks such as ALU, CPU and memory.
- Have specification documents to define how compliant products must behave

Answer 131

1. **Application profile** aimed at high performance processes capable of running fully featured operating systems.
2. **Real-time profile** defines an architecture aimed at systems that require deterministic timing and low interrupt latency.
3. **Microcontroller profile** defines an architecture aimed at low-cost systems, where low latency interrupt processing is vital.

Answer 132

System on Chip (SoC):
- Most basic computer entity, billions of transistors, libraries of blocks
- Semiconductor companies license ARM blocks and add other parts to create a SoC, and usually import an operating system.
- SoCs using IP blocks reduce time to market significantly.

Answer 133

The piece of hardware is hidden, running software to perform specific tasks, such as in a TV set-top box or MP3 player.

Answer 134

Cortex-A53 is used to build hardware for the Raspberry Pi.

Answer 135

- Armv8 provides a 64 bit instruction set
- Armv7 provides a 32 bit instruction set; Armv8 can also support this, however

Answer 136

1. Quad (4 processors) cortex architecture
2. USB connector
3. Memory/RAM
4. Card Reader
5. DC to UD converter

Answer 137

1. L1 Data Cache
2. L1 Instruction Cache
3. CPU, processor
4. 4 cores
5. Shared unified L2 cache

Answer 138

1. 32 KB instruction cache, A-B associative; 32 KB = 2^5 x 2^10 bytes = 2^15 bytes
2. Instruction of 1 byte size fetched and put here, 4 instructions (8 bit each)
3. Instructions are decoded here
4. Instructions are broken up into atomic instructions: micro-ops (4 uOps). These are scheduled in another queue.
5. For numerical operations, up to 2 uOps can execute as there are two places
6. There are two for F-/Neon
7. There are 3 for loading/storing
8. This is for Boolean operations/other
9. Registers for results
10. 32 KB data cache.
11. Translation lookaside buffer: translation for virtual -> physical addresses
12. Both cache types come from the larger L2 cache
13. Program counters, point to next instruction
14. Scheduling thread for instructions
15. Branch predictors guess whether a conditional branch will be taken or not.
16. Retirement order buffer, makes execution more performant by stitching out-of-order instructions back together

Answer 139

The translation table used by the MMU is stored in main memory
1. The MMU maintains an L1 cache of recently accessed page translations in a Translation Lookaside Buffer (TLB)
2. Each TLB entry contains not just physical and virtual addresses, but also attributes such as memory type, cache policies, access permissions etc.
3. If the TLB does not contain a valid translation for the virtual address issued by a core (a TLB miss), a translation table lookup is performed using the Table Walk Unit

Answer 140

* The MMU uses the most significant bits of the Virtual Addresses of code and data to index entries in a translation table which contains the Physical Addresses
* The translation is carried out automatically in hardware and is transparent to the application
* In addition to address translation, the MMU controls memory access permissions, memory ordering, and cache policies for each region of physical memory

Answer 141

- The Cortex-A53 processor protects against soft errors that result in a cache RAM bitcell temporarily holding the incorrect value
- The Cortex-A53 CPU cache protection support has a minimal performance impact when no errors are present
- When an error is detected, the access that caused the error is stalled while the correction takes place
- If the error cannot be corrected (failed memory), that way is never used again and the data is fetched from the next level cache or from main memory

Answer 142

- Some RAMs have Single Error Detect (SED) capability (Hamming distance of 2)

Answer 143

- Single Error Correct, Double Error Detect (SECDED) capability (Hamming distance of 3)

Answer 144

- All g = t + 1
- All p = t + 3

Answer 145

- S1: instruction fetch
- S2: instruction decode
- S3: operand fetch
- S4: instruction execute
- S5: write back

Answer 146

- 2^0 = 1
- 2^1 = 2
- 2^2 = 4
- 2^3 = 8
- 2^4 = 16
- 2^5 = 32
- 2^6 = 64
- 2^7 = 128

(192 cards)