Introduction to computer design Flashcards
What is Moore’s law?
The number of transistors on a chip doubles roughly every 18 months.
However, this is now slowing down.
Define the internet of things/embedded systems
Computers that are integrated as components in larger systems. They run one or a few dedicated applications and are not programmable by the end user.
(cars, washing machine, …)
Power, performance and cost set the constraints for embedded systems
What are personal mobile devices?
Wireless devices with multimedia user interfaces.
Like embedded systems, they are power, performance and cost constrained
(phones, tablets, …)
What are desktop computers?
General purpose running a variety of software.
Limited by power, but less so because they can be plugged into an outlet.
Performance constrained
What are servers?
Network based computers.
Have high throughput and need to be reliable.
High capacity.
Energy constrained.
What is one way to improve computer performance?
Exploiting parallelism
What two types of parallelism are there?
Data-level parallelism: The same operation is applied to multiple data items
Task-level parallelism: The work is divided into tasks that can run mostly in parallel
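A minimal Python sketch (my own illustration, not from the course material) contrasting the two, assuming a hypothetical pixel-processing workload; Python threads will not actually speed up CPU-bound work because of the GIL, so this is conceptual only.

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(8))  # stand-in for some pixels

# Data-level parallelism: the SAME operation applied to many data items,
# so the items can be processed in parallel.
def brighten(pixel):
    return pixel + 1

with ThreadPoolExecutor() as pool:
    brightened = list(pool.map(brighten, data))

# Task-level parallelism: DIFFERENT, mostly independent tasks run in parallel.
def histogram(pixels):
    return {p: pixels.count(p) for p in set(pixels)}

def total(pixels):
    return sum(pixels)

with ThreadPoolExecutor() as pool:
    h = pool.submit(histogram, data)
    s = pool.submit(total, data)
    print(brightened, h.result(), s.result())
```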
What are the sources of parallelism in computers (5)?
Instruction-level parallelism
Thread-level parallelism
Request-level parallelism
Memory-level parallelism
Vector processing (SIMD - single instruction multiple data)
What is instruction-level parallelism?
Observing that if instructions are independent of each other, they can be executed in parallel.
What is vector processing / SIMD (single instruction, multiple data)?
The same instruction operates on multiple data elements.
What is thread level parallelism?
Software is organized as different threads that can run in parallel.
Threads can communicate with each other.
What type of parallelism can be implemented in request-response systems?
Request-level parallelism
What is request-level parallelism?
In these systems, such as servers, requests tend to be independent of each other.
Request-level parallelism is when the system processes requests in parallel.
What type of parallelism can be implemented for load and store requests?
Memory-level parallelism.
What is Memory-level parallelism?
Processor memory requests can be issued in parallel to hide memory latencies
Name the categories in Flynn’s taxonomy
SISD: Single instruction, single data (sequential)
SIMD: Single instruction, multiple data (vector architectures, GPUs)
MISD: Multiple instructions, single data (not commonly used, very impractical, more theoretical)
MIMD: Multiple instructions, multiple data
What are the two categories of MIMD systems?
Tightly-coupled: Multi-cores, many-cores
Loosely-coupled: Clusters, data centers
Define computer architecture
The instruction set architecture (ISA).
The set of instructions that can be executed by the hardware.
Today's objective of computer architecture is to design the organization and the hardware to meet objectives and functional requirements.
Final definition: Computer architecture covers the ISA, the organization and the hardware
What does the ISA define (7)?
Operand types (registers, memory locations)
Memory addressing (is the memory byte addressable…?)
Addressing modes (registers, immediate, displacement)
Types and sizes of operands (word, char, double word)
Operations
Control flow (conditional branches, unconditional jumps)
Encoding (are all instructions the same size, or variable length to not waste bits?)
What is computer organization?
High-level aspects of a computer’s design.
Memory system, interconnects - how components are connected to each other, CPU design
What is computer hardware?
Logic design and packaging of the computer.
Name the levels of a computer
Application software: Software in high-level language
System software: Compiler (translates HLL to machine code), OS (provides service code, I/O, scheduling, share resources)
Hardware: RAM, CPU, I/O
What are the levels of program code
High-level language:
- Level of abstraction closer to problem domain
- Readable by humans (productivity)
Assembly language:
- Textual representation of instructions
Hardware representation:
- Stream of bits
- Encoded instructions and data
What does a CPU need to be able to do, to compute any function?
Arithmetic, conditional branches and memory access
What implements arithmetic in the CPU?
Arithmetic is implemented in the datapath
How is conditional branching implemented in the CPU?
Control path selects next instruction based on branch outcome. Enables the parts of the datapath needed to execute a given instruction.
How is memory access implemented in the CPU?
Because of memory latency, memory caches are implemented.
Why did the single-core processor performance growth decrease around 2003?
The power wall was reached: you could no longer scale up the clock frequency and transistor activity without increasing power consumption beyond what could be handled
What is bandwidth?
The amount of data that can be transferred per unit of time.
What prevents clock frequency from continuing to scale?
Clock frequency affects power consumption, and around 2003 the power ceiling was reached, so frequencies could not scale any further.
Define the quantitative approach of designing CPUs
Take advantage of parallelism: This is now the way of improving performance
Leverage locality: Utilize temporal and spatial locality in programs (the same data is reused again and again, e.g. in loops; programs often access data that is close together, e.g. arrays); see the sketch after this list
Focus on the common case: There are many trade-offs in computer architecture. Favouring the frequent case over the infrequent one gives better overall performance (Amdahl's law)
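A minimal sketch (my own example, not from the course material) of the two kinds of locality; in a language like C the array would be contiguous in memory, which is what makes spatial locality pay off in hardware.

```python
a = list(range(1024))

# Spatial locality: consecutive elements live next to each other, so fetching
# the cache line that holds a[i] also brings in a[i+1], a[i+2], ...
total = 0
for x in a:
    total += x

# Temporal locality: `total` and the loop code are reused on every iteration,
# so they stay in registers / the cache while the loop runs.
print(total)
```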
What is Amdahl’s law?
A common mistake is to improve one aspect of a computer and expect a proportional improvement in overall performance. This is not the case.
Massive improvement of an infrequent case gives negligible overall improvement.
T_improved = (T_affected/Improvement_factor) + T_unaffected
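A worked example with made-up numbers: a 100 s program in which 80 s can be sped up by a factor of 4.

```python
t_affected = 80.0
t_unaffected = 20.0
improvement_factor = 4.0

t_improved = t_affected / improvement_factor + t_unaffected
print(t_improved)                                 # 40.0 s
print((t_affected + t_unaffected) / t_improved)   # overall speedup: 2.5x, not 4x
```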
What 4 aspects can affect performance?
Algorithm: Determines number of operations executed
Programming language, compiler, architecture: Number of machine instructions executed per operation
Processor and memory system: How fast instructions are executed
I/O systems: Can affect performance. Determines how fast I/O operations are executed.
What metrics are often used to measure performance in computer systems?
Time: response, turnaround and execution time
Throughput
What is turn-around time?
The time from issuing a command to its completion
What is response time?
Time from issuing a command to first response
What is execution time?
The time the processor is busy executing the program.
Does not include the time spent waiting for a command to be executed.
What are the two categories you can divide execution time into?
User execution time: Time spent executing user code
System execution time: Time spent in OS to service the application
What is throughput
Amount of work done per unit time.
When replacing the CPU with a faster one, how is response time and throughput affected?
Both will improve: response time decreases and throughput increases.
When adding more CPUs, how is response time and throughput affected?
Response time won’t be affected, but throughput will increase.
How can execution time be measured?
Elapsed time / wall clock time: Total turnaround time, including all aspects; this determines the system performance.
CPU time: Time spent processing a given job (user/system CPU time). Caveat: different programs are affected differently by CPU and system performance.
What is benchmarking?
A way of measuring computer performance.
When benchmarking, you should use real applications. Programs consist of phases, and the relative importance of each phase creates bias: different computer architecture techniques pay off in different phases, and recreating this bias synthetically is difficult. Therefore, you should use real applications with real input data.
You should also use a collection of real applications, because a computer is general purpose and you want to measure how it performs across all of them.
Use realistic applications that push the computer to its limits.
What are benchmark suites?
Tools used when doing benchmarking.
The suites attempt to be representative of the workloads of different domains.
What are some benchmarking approaches that you should be careful with, and why?
Kernels, toy programs and synthetic benchmarks.
They do not accurately recreate the behaviour of real programs.
It is easy to cheat if the benchmarks are simple.
How does CPU clocking work?
The operation of the hardware is governed by a constant-rate clock.
The clock frequency is the number of clock cycles per unit of time (cycles per second).
What is a clock frequency?
Cycles per second
What is the clock period?
Duration of a clock cycle.
Time between two consecutive rising edges.
What is often done at the rising edge of the clock?
Update the state (registers, flip flops)
How can you define CPU time using CPU clock?
CPU time = cycles * cycle time
= cycles / clock rate
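A worked example with made-up numbers (10 billion cycles on a 2 GHz clock):

```python
cycles = 10e9
clock_rate = 2e9                 # 2 GHz = 2 * 10^9 cycles per second
cycle_time = 1 / clock_rate      # 0.5 ns per cycle

cpu_time = cycles * cycle_time   # identical to cycles / clock_rate
print(cpu_time)                  # 5.0 seconds
```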
How can you improve performance when looking at CPU clocking?
Increasing the clock rate or reducing the number of clock cycles
Define clock cycles in terms of instructions
Clock cycles = instruction count * cycles per instruction (CPI)
Define CPU time in terms of instruction time
CPU time = Instruction count * CPI (cycle per instruction) * clock cycle time
= (instruction count * CPI) / clock rate
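A worked example with made-up numbers (1 billion instructions, an average CPI of 2, a 2 GHz clock):

```python
instruction_count = 1e9
cpi = 2.0
clock_rate = 2e9

cpu_time = instruction_count * cpi / clock_rate
print(cpu_time)                  # 1.0 second
```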
What determines the instruction count?
Program, ISA and compiler
What determines the average cycles per instruction (CPI)?
CPU hardware.
Different instructions take different numbers of cycles. When measuring performance, we therefore need to find an average to use.
What is the weighted average CPI?
CPI is weighted by instruction count:
Clock cycles = sum from i = 1 to n of (CPI_i * instruction count_i)
CPI = clock cycles / instruction count = sum from i = 1 to n of (CPI_i * (count_i / count))
where count_i / count is the relative frequency of instruction class i
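A worked example with made-up instruction classes and counts, computing the weighted average CPI:

```python
classes = [
    # (CPI_i, count_i)
    (1, 500e6),   # ALU operations
    (2, 300e6),   # loads/stores
    (3, 200e6),   # branches
]

cycles = sum(cpi_i * count_i for cpi_i, count_i in classes)
count = sum(count_i for _, count_i in classes)

avg_cpi = cycles / count         # each CPI_i weighted by count_i / count
print(avg_cpi)                   # 1.7
```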
What is the Iron Law and what does it measure?
The Iron Law states that performance depends on:
- Algorithm: affects instruction count (IC), possibly CPI
- Programming language: IC, CPI
- Compiler: IC, CPI
- ISA: IC, CPI, clock cycle time
CPU time = (instructions / program) * (cycles / instruction) * (seconds / cycle)
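A small sketch with made-up numbers showing how the three factors multiply, e.g. a hypothetical compiler change that removes instructions but slightly raises CPI:

```python
def cpu_time(instruction_count, cpi, cycle_time):
    # (instructions / program) * (cycles / instruction) * (seconds / cycle)
    return instruction_count * cpi * cycle_time

base      = cpu_time(1.0e9, 2.0, 0.5e-9)   # 1.00 s
optimized = cpu_time(0.8e9, 2.2, 0.5e-9)   # 0.88 s: fewer instructions, higher CPI
print(base / optimized)                    # speedup of about 1.14x
```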