Architecture II Flashcards

Question

Non-bus networks? | What is bandwidth?

Answer 1

- point to point - mesh networks - bandwidth = lane speed * number of lanes - how much data can be transferred in a given time
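
The bandwidth formula on this card can be sketched as a small calculation. The function names and the 4-lane, 2 Gbit/s figures are illustrative assumptions, not values from the cards.

```python
def bandwidth_bits_per_s(lane_speed_bits_per_s, lanes):
    """Aggregate bandwidth = lane speed * number of lanes."""
    return lane_speed_bits_per_s * lanes


def transfer_time_s(payload_bits, bandwidth):
    """Time needed to move a payload at a given bandwidth."""
    return payload_bits / bandwidth


# Hypothetical link: 4 lanes at 2 Gbit/s each.
bw = bandwidth_bits_per_s(2e9, 4)
print(bw)                          # aggregate bits per second
print(transfer_time_s(16e9, bw))   # seconds to move 16 Gbit
```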

Answer 2

- size of addressable locations - width of address bus - 32 bit systems can address 2^32 locations, or 4GB of memory if each location holds one byte
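
A quick check of the 2^32 figure, as a sketch. It assumes byte-addressable memory (one byte per location), which the card does not state explicitly; `addressable_bytes` is a made-up helper name.

```python
GIB = 2 ** 30  # bytes in one binary gigabyte


def addressable_bytes(address_bits, bytes_per_location=1):
    """Memory reachable with an address bus of the given width."""
    return (2 ** address_bits) * bytes_per_location


print(addressable_bytes(32) // GIB)  # 4
print(addressable_bytes(16))         # 65536
```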

Answer 3

- MAR holds address of read/write to go on address line - MBR holds data to go on/retrieved from data line - control unit connects them temporarily to the bus and adds the control line signal

Answer 4

- system bus connects the CPU to main memory on the system board - expansion bus allows the CPU to communicate with peripheral devices

Answer 5

- any external device attaches to a computer by a link to an I/O module - appears to the CPU and assembly programmer as an area of read/writable memory just like main RAM - functions include; device communication, control and timing, data buffering and error detection

Answer 6

- transmit commands to the module/device | - send data to the module/device

Answer 7

- read data from the module/device | - read status information from the module/device

Answer 8

- must be able to co-ordinate the flow of data between the internal resources and external devices - devices may be slow so module manages them independent of the CPU - this allows the CPU to go and do other things at the same time - another form of parallelism

Answer 9

- data coming from main memory is sent to a module in a rapid burst - data is then buffered in the module and sent to the peripheral device at its own data rate, because the transfer rate of external devices is much slower than internal

Answer 10

- report errors to the cpu - mechanical or electrical malfunctions reported by device - unintentional changes to bit pattern as it is transmitted from devices to IO module

Answer 11

- IO modules are hardware connected to the bus - device driver is a piece of software which takes sole responsibility for all communications with the IO modules or with one, or many, devices connected to it

Answer 12

1) CPU asks IO module to check status of device 2) IO module returns device status 3) if ready, CPU requests transfer of data 4) IO module obtains a unit of data from the device 5) Data transferred from IO module to cpu

Answer 13

- cpu is executing a program and encounters an instruction relating to IO operation - cpu executes that instruction by issuing a command to appropriate IO module - IO module performs requested action based on IO command - IO module sets appropriate bits in IO status register - CPU periodically checks status of IO module until it finds that operation is complete
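
The polling loop described above can be sketched with a simulated module. `FakeIOModule` and its methods are hypothetical stand-ins, not a real driver API; the point is only that the CPU spins on the status check until the operation completes.

```python
class FakeIOModule:
    """Hypothetical I/O module that becomes ready after a fixed number of status checks."""

    def __init__(self, data, polls_until_ready):
        self._data = data
        self._remaining = polls_until_ready

    def start_read(self):
        """CPU issues the I/O command; the simulated device starts working."""

    def status_ready(self):
        """Stand-in for reading the ready bit of the status register."""
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def read_data(self):
        return self._data


def programmed_io_read(module):
    """Busy-wait until the module reports ready, then fetch the data."""
    module.start_read()
    wasted_polls = 0
    while not module.status_ready():
        wasted_polls += 1  # CPU does no useful work here
    return module.read_data(), wasted_polls


data, wasted = programmed_io_read(FakeIOModule(0x41, polls_until_ready=3))
print(data, wasted)
```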

Answer 14

- simple - cpu has direct control - very little hardware support - cpu must periodically poll the module to check status - ties up cpu for long period with no useful work - cpu slowed to the speed of the peripheral

Answer 15

- cpu issues read command - IO module gets data from peripheral whilst CPU does other work - IO module interrupts the CPU - CPU requests data - IO module transfers the data
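
A toy simulation of the same sequence, with the interrupt modelled as a callback. All names are hypothetical, and real interrupts are delivered by hardware rather than a function call; the sketch only shows that the CPU keeps doing other work until it is notified.

```python
class InterruptingIOModule:
    """Hypothetical module that calls a handler (the 'interrupt') when its data is ready."""

    def __init__(self, data, cycles_needed, on_interrupt):
        self.data = data
        self.cycles_left = cycles_needed  # assumed to be >= 1
        self.on_interrupt = on_interrupt

    def tick(self):
        """Advance the peripheral by one cycle, independently of the CPU."""
        if self.cycles_left > 0:
            self.cycles_left -= 1
            if self.cycles_left == 0:
                self.on_interrupt(self)


def run_cpu(data=0x41, cycles_needed=3):
    received = []

    def handler(module):
        # Interrupt service routine: CPU requests the data from the module.
        received.append(module.data)

    module = InterruptingIOModule(data, cycles_needed, handler)
    useful_work = 0
    while not received:
        useful_work += 1  # CPU does other work instead of polling
        module.tick()
    return received[0], useful_work


print(run_cpu())
```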

Answer 16

- fast - efficient - tricky to write - Done by hardware manufacturer - still requires active interaction of cpu for data transfer between memory and IO module

Answer 17

- requires DMA controller on system bus - takes IO communication over from cpu - cpu grants authority to read from or write to memory - relieves cpu to do other things - DMA sends interrupt when task is complete - cpu is only involved at beginning and end of transfer - used for large data movement

Answer 18

- if many devices are connected to the bus - data transfer demand approaches capacity of bus - bus becomes bottleneck - the solution is to use multiple buses laid out in a hierarchy

Answer 19

- Northbridge = fast, connects direct to CPU, RAM, PCIe, GPU - Southbridge = slower, all other IO modules, migrated onto single chip - both are speed bottlenecks

Answer 20

- used to be that each device needed its own module, which required its own IRQ line, which required its own IO level driver - painful to set up - today a standard lower level bus has a single IO module (USB etc), which has a single IRQ and a single IO level driver - higher level drivers then talk to devices directly over this - eg; USB sound card has no direct line to CPU

Answer 21

- peripheral component interconnect express - general purpose for internal peripheral devices (graphics, networking, sound, video capture) - replacing old versions - 1 to 32 lane versions - speed versions (roughly 2-16 Gbit/s per lane) - not a bus, actually a mesh network

Answer 22

- ancient/classic point-to-point comms between 2 devices - not a bus - speed = 75-115200b/s, often 9600b/s

Answer 23

- standard for connecting peripheral devices - not a bus - replaced a variety of earlier interfaces - different versions and speeds; USB1: 12 Mbps - USB3: 5 Gbps - USB hubs, master and slave layout

Answer 24

- USB has 4 physical wires, 5v - bits represented by differential voltage between 2 wires - these wires carry the same information mirrored high-low - this makes the signal largely immune to external electrical fields - especially when the wires are twisted together
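
Why the mirrored pair resists interference can be shown with a small sketch: noise that hits both wires equally cancels when the receiver takes the difference. The voltage levels and noise values are illustrative assumptions, not real USB signalling levels.

```python
def transmit_differential(bits, high=1.0, low=0.0):
    """Encode each bit as a mirrored pair of voltages (d_plus, d_minus)."""
    return [(high, low) if bit else (low, high) for bit in bits]


def add_common_noise(pairs, noise):
    """External field adds the same voltage to both wires of the pair."""
    return [(p + n, m + n) for (p, m), n in zip(pairs, noise)]


def receive_differential(pairs):
    """Decode from the sign of the difference; common noise cancels out."""
    return [1 if p - m > 0 else 0 for p, m in pairs]


bits = [1, 0, 1, 1, 0]
noise = [0.7, -0.4, 2.0, -1.5, 0.3]  # arbitrary interference values
recovered = receive_differential(add_common_noise(transmit_differential(bits), noise))
print(recovered == bits)
```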

Answer 25

- FireWire - competitor to USB - unlike USB is an actual bus - single channel shared by all devices, messages identified by device addresses and controlled for collisions - can daisy chain whereas USB cannot - no fixed master, any device can take master role

Answer 26

- old, 1980’s - bus for mass storage devices (hard discs) - pioneered moving logic and work from cpu into IO module - reliable but expensive (used today in servers)

Answer 27

- competitor to SCSI - serial ATA - cheaper and simpler - used in computer systems

Answer 28

- inter-integrated circuit bus for connecting chips together - common in robotics - two wires (data and clock) - multi-master, multi-slave - master: generates clock, initiates comms - slave: replies to master - basic message collision avoidance: only talk if the bus is free

Answer 29

FTDI chip - standard way to access I2C from user code (for robotics) - run serial port over USB in FTDI, convert to I2C

Answer 30

- a vehicle bus is a specialised internal communications network that interconnects components inside a vehicle (train, ship, car etc) - CAN is a true serial bus

Answer 31

- a true bus - multiple PC’s in local area all writing and reading one bus - each message is a frame (contains MAC address) - must avoid collisions - bus is public so everyone can see information on it - modern networks build non-bus features on top of basic bus (mesh and star topologies)

Answer 32

- extremely fast data connection for super computers - connect both within and between computers - connect via Northbridge and PCIe - wire can go out of the box, like ethernet - blurs line between bus and network - violates most of networking theory for speed - not really a bus (mesh network)

Answer 33

- same as ethernet but faster - challenging infiniband for high performance computing - connected via PCIe/Northbridge as opposed to classic ethernet connecting via southbridge - 10GBs ethernet in development

Answer 34

- replacement of parallel with fast, simple serial buses - replacement of buses with mesh network to reduce congestion bottleneck - everything racing to move up from southbridge>Northbridge>CPU silicon

Answer 35

- an entire computer system built into a larger product or another device - work well with small amounts of ram and low processing power - user does not have to deal with complexity, they only deal with the interface - built for a specific task - very time critical (real time OS)

Answer 36

- Automation - consumer electronics - medical electronics - telecom - remotes - military - cars - industrial controls

Answer 37

- average home has 40-50 | - cars have over 60

Answer 38

- keyboards = convert matrix of on/off presses to ASCII-like serial code, then send via USB - optical mouse = optical flow implemented in digital electronics

Answer 39

- focused on one application, only the hardware we need for the app - efficiency is very important - need to be fast, reliable, low power and price and small in size

Answer 40

- GP = use hardware, OS and software developed by different companies (ubuntu) - AS = hardware and software are designed together and are matched

Answer 41

- Ubicomp developed by Mark Weiser, Xerox PARC in 1988 - influential cultural movement which applies embedded systems to human interaction - natural interfaces, information visualisation, artistic, architectural and political interactions

Answer 42

- 2010’s opposite of Ubicomp - people are freaked out by losing control - so present attention-focusing, tactile interfaces - if you turn on a light - be fully at one with it

Answer 43

- sensors (input) - actuators (output) - signals - analogue/digital conversion and vice versa

Answer 44

- capture physical/chemical analogue quantity and convert to analogue electrical quantity - pressure, force, heat, flow, chemical, sound, vibration, light etc

Answer 45

- put things into action or motion | - radio signals, lights, buzzers, motors, speakers etc

Answer 46

- a controller gives motor commands - driver is a power amplifier (takes low power commands from controller and converts them to high power currents) - motor converts electrical to mechanical power - encoder senses the motor state and feeds this back to controller

Answer 47

- marketing term, just means any processor chip sold for embedded use - typically some system on the chip including; CPU, memory, analog/digital conversion and I/O devices

Answer 48

- everything in one chip, small devices, can fit almost anywhere - lower performance than desktop - cheap - low power consumption meaning less heat and longer battery life - reliable

Answer 49

- much smaller capacities than in PC’s - fast SRAM, volatile, used to store temp data - EEPROM, non-volatile

Answer 50

- timer = measures precisely time intervals or elapsed time, register incremented for every machine cycle - counter = counts number of times particular event, or process occurred with respect to a clock signal. Used to count events happening outside micro-controller - watchdog timer = auto reset of the MCU in case of failure

Answer 51

- 8 bit, 16 bit, 32 bit

Answer 52

- transfer data to/from pins of the micro-controller | - digital ports for reading/writing binary values (controlling lights or receiving input from switches)

Answer 53

- external devices (sensors, extra memory) - other systems (host pc) - serial communication - synchronous buses - radio frequency

Answer 54

- open source hardware embedded system | - makes IO and AVR (type of MC) programming easy

Answer 55

- no external memory or bus, small program and memory all on chip - 32 reg - IO goes straight into chip

Answer 56

- brand name of microchip technology corporation in the US - used in large series of microcontrollers of different powers, very popular with engineers - similar to Atmel AVR

Answer 57

- digital signal processors - specialised architectures for handling real time signals - features include; fixed point arithmetic representations and Harvard arch - long pipelines for multiple ALU operations, little branching, branchless loops

Answer 58

- sequence of continuous valued data

Answer 59

- pi = single core ARM11, 32 bit RISC 1GHz, GPU, 1/2GB RAM, USB, HDMI - pi2 = ARM Cortex-A7 quad core, 900MHz, GPU, 1GB DRAM, HDMI, Ethernet

Answer 60

- programmable logic controllers - used in industry to control factories - must be indestructible, reliable and simple

Answer 61

- PLA’s - standardised, mass producible logic array - initially everything connected to everything else - customer blows fuses at selected points to produce the circuit they want - fuses can only be blown once so not re-programmable

Answer 62

- FPGA - evolution of PLA - simple repeated blocks containing a few gates each, standard compute components - potential to connect everything to everything else via interconnects - interconnect states as firmware - reprogrammable/reusable - used for embedded devices and prototyping chips

Answer 63

- theoretical model of computation, not actually implemented - proposed 100 years after analytical engine and does same thing - helped redefine real numbers in pure maths - mis-interpreted as a practical computer design, inflicted SERIAL thinking on us for most of the century

Answer 64

- smaller transistor, shorter critical path, quicker charging/discharging capacitor and faster clock

Answer 65

- P = C * V^2 * F | - power, capacitance, voltage, clock frequency
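
The power relation on this card can be sketched numerically. The capacitance, voltage and frequency figures are made-up example values; the point is that power scales with the square of voltage and only linearly with clock frequency.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """P = C * V^2 * F (switching power of the circuit)."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


base = dynamic_power(1e-9, 1.0, 1e9)
half_voltage = dynamic_power(1e-9, 0.5, 1e9)
double_clock = dynamic_power(1e-9, 1.0, 2e9)

print(half_voltage / base)  # halving V cuts power to a quarter
print(double_clock / base)  # doubling F doubles power
```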

Answer 66

- transistor count still rising but clock rate flattening sharply - if the trend had continued the way it was, the power and heat output would eventually reach the level of the sun's surface

Answer 67

- Single instruction multiple data - eg; add 1 to every member of an array - multiple instruction multiple data - eg; processors all doing different things at the same time

Answer 68

- ripple carry is like column addition with carry digit of sum being taken to next column - carry save would be like teams doing the same column addition
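
The ripple carry half of this card can be sketched as a chain of full adders, each passing its carry to the next column. This is a software illustration of the column-addition idea under my own helper names, not a hardware description; carry save is not covered here.

```python
def full_adder(a, b, carry_in):
    """One column: add two bits plus the carry from the previous column."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_carry_add(x_bits, y_bits):
    """Add two equal-length bit lists (least significant bit first)."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        total, carry = full_adder(a, b, carry)  # carry ripples to the next column
        out.append(total)
    return out, carry


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


bits, carry = ripple_carry_add(to_bits(6, 4), to_bits(7, 4))
print(from_bits(bits), carry)  # 13 0
```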

Answer 69

- let user write a serial program then optimise at runtime in hardware, optimisations are very local in program space and time - can do multi stage pipeline, eager execution (execute both branches until one reaches a decision) or out of order execution

Answer 70

- use long instruction words (64 bit) to represent multiple data items at once (4 * 16 bit numbers or 8 * 8 bit numbers) - operate on these arrays with single instructions - cray supercomputers 1960’s-70’s
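
A sketch of the packing idea: four 16-bit numbers stored in one 64-bit word and added lane by lane. Python emulates the lanes in a loop, whereas real SIMD hardware would do this with one instruction; the helper names are assumptions for illustration.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit values into one 64-bit word, lane 0 in the low bits."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(a, b):
    """Lane-wise add; each lane wraps around and carries never cross lanes."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


result = packed_add(pack([1, 2, 3, 65535]), pack([10, 20, 30, 1]))
print(unpack(result))  # [11, 22, 33, 0]
```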

Answer 71

- complex CISC SIMD and dedicated registers | - moved from supercomputers to consumer PC’s

Answer 72

- add specialised very long vector registers in CPU - load them up with data - special instructions and ALU hardware to run SIMD instructions on them all - different from SIMD instructions which pack the data into a single word

Answer 73

- CPU = optimised for low-latency computations with large cache and control unit, complex pipelines, fewer ALU’s - GPU = optimised for data-parallel and high throughput computations, DMA transfer to main RAM, smaller cache and pipelines but thousands of ALU’s

Answer 74

- one host pc - may have multiple compute devices (multiple cards), made of multiple compute units, containing multiple processing elements and maybe some shared registers and cache

Answer 75

- a work group is a group of PE’s within a CU all executing the same instruction, a work item is a processing element executing it on one datum - kernel is a program of SIMD instructions given to a work group

Answer 76

- Very Long Instruction Word - similar to vector arch, multiple instructions on multiple data all in a single word - eg; a single assembly instruction may contain information to add an integer in reg A to reg B, store the result in C, increment register G and bit-shift register H

Answer 77

- multiple cores on one chip - share same address space - some cache levels are shared, some are not, tricky to get cache writes right

Answer 78

- NUMA arch are single address spaces shared by processors - access times differ according to processor and address - used in supercomputers

Answer 79

-classic supercomputer based on custom chips - 2018 3rd most powerful known computer - arch is quite boring data centre style, 5k standard Intel Xeons = 240k cores with Nvidia GPU cards - could mean an end of supercomputing as a specialist field

Answer 80

- message passing interface - standard function library for processes to send and receive discrete messages to one another - without caring how they are transported (TCP/IP, shared memory...) - found in HPC (high performance computing)

Answer 81

- single program multiple data - hundreds of maintained, identical PC’s - no shared memory, little or no communication between nodes - connected by standard Ethernet, shared network discs - good for data science

Answer 82

- lots of identical managed racked PC’s | - similar to data centre

Answer 83

- like grids but much weaker organisation - lots of consumer grade computers working together, not identical and may be unreliable and untrustworthy - no shared memory and little communication between nodes - connected via public internet - due to unreliability users must consider sending out multiple copies of jobs in case of computer failure

Answer 84

- many cloud computing projects all evolved to use similar structures - split task into chunks (each running same code on different data) - map each chunk to a worker computer - workers send back results (smaller than data), reduce these results into a single result - software tools such as Hadoop aid in this
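
The split, map and reduce steps can be sketched in plain Python. This runs the "workers" one after another in a single process and does not use Hadoop or any cluster API; the sum-of-squares job and all function names are arbitrary examples.

```python
from functools import reduce


def split_into_chunks(data, n_workers):
    """Split the task into roughly equal chunks, one per worker."""
    size = max(1, -(-len(data) // n_workers))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def map_worker(chunk):
    """Same code run on different data; returns a result smaller than the chunk."""
    return sum(x * x for x in chunk)


def map_reduce(data, n_workers):
    partials = [map_worker(chunk) for chunk in split_into_chunks(data, n_workers)]
    return reduce(lambda a, b: a + b, partials, 0)  # combine into a single result


print(map_reduce(list(range(10)), n_workers=3))  # 285
```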

Answer 85

- HPC = simulating physical things in 2D or 3D structure space, structure is replicated in the equations, use MPI to pass messages around it - Hadoop = data science, zillions of similar data records, little or no interaction between them, map-reduce jobs divide the work between them

Answer 86

- HPC = experts and professionals with years of training, specialist expensive hardware locked in rooms - hadoop = hackers, cheap consumer hardware and map-reduce

Answer 87

- Spiking Neural Network Architecture - helps simulate the neural networks of the brain - 1 million cores but slow unreliable messages between chips

Answer 88

- Robot operating system - message passing software framework - like MPI but higher level and therefore slower - nodes are arbitrary programs that can run on any computer

Answer 89

- engineers never got hung up on Turing machines and serial programs - instead they work with digital and circuits in which every object is alive - No CPU’s or instructions

Answer 90

- latency = time to solution (execution time, response time) | - throughput = tasks per unit of time (bandwidth)

Answer 91

- speeds throughput, number of registers - CPU time is the time CPU spends computing a program and does not include time spent waiting for IO - CPU time also means time spent executing lines of code - performance = 1/execution time

Answer 92

- execution time on system B/execution time on system A = n - One system will be n times faster than the other - can also do; performance of system A/performance on system B = n - To work out %; system A is x% faster than B, x = (n-1)*100
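
A worked instance of the comparison on this card, as a sketch. The 10 s and 15 s execution times are invented for illustration.

```python
def times_faster(time_a, time_b):
    """n = execution time on B / execution time on A (A is n times faster than B)."""
    return time_b / time_a


def percent_faster(n):
    """x = (n - 1) * 100."""
    return (n - 1) * 100


n = times_faster(time_a=10.0, time_b=15.0)
print(n)                  # 1.5
print(percent_faster(n))  # 50.0
```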

Answer 93

- CPU time = (time/cycle)*(cycles/instruction)*(instructions/task) - increase clock rate or reduce number of cycles needed for a task
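
The CPU time equation can be checked with a short calculation. Here time/cycle is written as 1/clock rate; the instruction count, CPI and clock rate are example values, not measurements.

```python
def cpu_time_s(instructions, cycles_per_instruction, clock_hz):
    """CPU time = (instructions/task) * (cycles/instruction) * (time/cycle)."""
    return instructions * cycles_per_instruction / clock_hz


baseline = cpu_time_s(2e9, 1.5, 3e9)
faster_clock = cpu_time_s(2e9, 1.5, 6e9)   # increase clock rate
fewer_cycles = cpu_time_s(2e9, 0.75, 3e9)  # reduce cycles needed

print(baseline, faster_clock, fewer_cycles)  # 1.0 0.5 0.5
```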

Answer 94

- Instructions per second = sockets*(cores/socket)*clock*(instructions/cycle) - used to measure non-numerical performance (database, servers, word processing etc)

Answer 95

- floating point operations per second - sockets*(cores/socket)*(cycles/second)*(FLOPS/cycle) - used to measure scientific numerical computing performance
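
The peak-rate product has the same shape for FLOPS and for instructions per second, so one sketch covers both. The machine description (2 sockets, 8 cores, 2.5 GHz, 16 floating point operations per cycle) is hypothetical.

```python
def peak_rate(sockets, cores_per_socket, clock_hz, ops_per_cycle):
    """sockets * (cores/socket) * (cycles/second) * (operations/cycle)."""
    return sockets * cores_per_socket * clock_hz * ops_per_cycle


peak_flops = peak_rate(sockets=2, cores_per_socket=8, clock_hz=2.5e9, ops_per_cycle=16)
print(peak_flops / 1e9, "GFLOPS (theoretical peak)")
```

This is an upper bound; real programs rarely keep every core issuing the maximum number of operations on every cycle.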

Answer 96

- availability = (total elapsed time - sum of downtime)/total elapsed time - reliability = (total elapsed time - sum of downtime)/number of failures
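
Both formulas can be sketched from a log of outages. The reliability figure as defined on this card is what is commonly called mean time between failures. The 1000-hour period and the outage durations are invented example data.

```python
def availability(total_elapsed, downtimes):
    """Fraction of elapsed time the system was up."""
    return (total_elapsed - sum(downtimes)) / total_elapsed


def reliability(total_elapsed, downtimes):
    """Uptime per failure (mean time between failures), one failure per outage."""
    if not downtimes:
        raise ValueError("no failures recorded, so the ratio is undefined")
    return (total_elapsed - sum(downtimes)) / len(downtimes)


outages_h = [10, 30, 10]  # hypothetical outage durations in hours
print(availability(1000, outages_h))
print(reliability(1000, outages_h))
```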

Answer 97

- formula allows us to work out the potential speedup of a program when using multiple processors rather than 1 - S = 1/((1-f)+(f/k)) - fraction f is parallelizable - (1-f) is serial - k is the speedup of the optimised part of the task - S = overall speedup - when f (the parallel fraction) is small the use of parallel processors has little effect
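
The formula on this card (Amdahl's law) can be evaluated directly to see the effect of a small parallel fraction. The values of f and k below are arbitrary examples.

```python
def amdahl_speedup(f, k):
    """S = 1 / ((1 - f) + f / k) for parallel fraction f and speedup k of that part."""
    return 1.0 / ((1.0 - f) + f / k)


print(amdahl_speedup(0.9, 10))    # roughly 5.26: large parallel fraction helps
print(amdahl_speedup(0.1, 10))    # roughly 1.10: small parallel fraction barely helps
print(amdahl_speedup(0.9, 1e12))  # approaches 1 / (1 - f) = 10 as k grows
```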

Answer 98

- designed to mimic a particular type of workload on a component or system - compares difference between other systems or different hardwares

Answer 99

- synthetic benchmarks are special test programs created to impose workload on component, useful for testing individual components - real world or application benchmarks are based on real world applications and give a much better measure of real world performance

Answer 100

- task type - load conditions = normal/peak loads, stress conditions - time = length of tests - stats - user = gamer, scientist, network admin etc

Answer 101

- WHETSTONE = emphasises math/trig function speed - LINPACK = linear algebra package - DHRYSTONE = string and integer manipulations

Answer 102

- System performance evaluation cooperative - companies that have agreed on a set of real test programs and data - supposed to represent typical application CPU use - valuable indicator of performance

Answer 103

- 1980’s comp builders ruled and pretty much everyone had same CPU’s - since 1990 CPU makers rule with OS defining the platform

Answer 104

- power up = cpu hardwired to start with program counter - BIOS - Bootloader = first user facing program (eg; for OS menus)

Answer 105

- unified extensible firmware interface - replaces BIOS conventions - secure boot, boot from larger hard discs (2TB +) - CPU independent

Answer 106

- most successful arch so far at 40 years old - family of CISC arch’s with 16, 32 and 64 bit versions - began in 1968 but first proper one in 1978

Answer 107

- visual display unit - memory mapped, write data direct to video RAM - IO module = VDU chip, reads video RAM, interprets data as current mode - device

Answer 108

- PCIe bus connection to Northbridge - complex IO module - read and execute complex commands sent to its addresses - they are church complete computers themselves

Answer 109

- poses a huge security threat to many manufacturers' CPUs - could allow user processes to access each other’s data (to steal passwords, emails or company data) - caused by complex unintended interaction between; VM, cache, CPU kernel mode switch, speculative execution, race condition and indirect addressing


(133 cards)