CSO3 PERFORMANCE Flashcards
PROGRAM PERFORMANCE
performance-cost trade-off is important
- initial program vs compilation vs way of execution
- high level -> machine code -> hardware
- hardware-software connection
-processor and memory types (static/dynamic ram, cache, hard drive)
about the program:
-algorithm: number of operations executed
-programming language/compiler/architecture: machine instructions per operation
-processor and memory: how fast instructions execute
-i/o system (including OS): how fast i/o operations execute
DEFINITIONS
RESPONSE TIME (elapsed time/wall clock time/latency/execution time):
-> concerns a single task
-time from the start of the task to its completion
-e.g. wait time for a query
-includes memory accesses, OS overhead, CPU and i/o activity
THROUGHPUT
-> work completed per time frame (tasks, or instructions)
-execution rate
-relevant when a large number of tasks or instructions is processed
CPU TIME
-> CPU time = user CPU time (spent in the program) + system CPU time (spent in the OS on the program's behalf) <= elapsed time
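A rough way to see CPU time <= elapsed time in practice, sketched in Python. time.process_time() reports the CPU time of the current process (its own code plus OS work done on its behalf) and time.perf_counter() reports wall-clock time. The helper name measure and both workloads are made up for illustration.

```python
import time

def measure(work):
    """Run work() and return (cpu_seconds, elapsed_seconds)."""
    cpu_start = time.process_time()    # user + system CPU time of this process
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    work()
    elapsed = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return cpu, elapsed

def busy():
    # CPU-bound work: almost all of the elapsed time is CPU time
    sum(i * i for i in range(200_000))

def waiting():
    # sleeping uses almost no CPU, but elapsed time still passes
    time.sleep(0.1)

for name, work in (("busy", busy), ("waiting", waiting)):
    cpu, elapsed = measure(work)
    print(f"{name}: cpu={cpu:.3f}s elapsed={elapsed:.3f}s")
```

For the waiting workload the CPU time should come out far below the elapsed time, since the process is idle while it sleeps.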
PERFORMANCE: 1/exec time
SPEEDUP:
-b is n times faster than a
speedup(b over a) = exectime(a)/exectime(b) = n
-> exectime(b) = exectime(a)/n
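The speedup definition as a small Python sketch. The function name speedup and the two timings are hypothetical, chosen only to make the arithmetic visible.

```python
def speedup(exec_time_a: float, exec_time_b: float) -> float:
    """How many times faster b is than a: exectime(a) / exectime(b)."""
    if exec_time_a <= 0 or exec_time_b <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_a / exec_time_b

# hypothetical timings: a takes 10 s, b takes 4 s
n = speedup(10.0, 4.0)
print(n)         # 2.5 -> b is 2.5 times faster than a
print(10.0 / n)  # 4.0 -> exectime(b) recovered as exectime(a)/n
```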
PERFORMANCE
decreasing response time usually increases throughput; the reverse does not always hold
-faster CPU -> dec response time, inc throughput
-more CPUs to multitask -> inc throughput; response time of a single task drops only if tasks were waiting in a queue
COMPUTATION OFFLOADING
cloud, fog, edge computing
when should we offload computation from the device to a server:
- when it lessens response time
- when it minimizes device power consumption
CPU EXECUTION TIME
CPU time = clock cycles * clock cycle time = clock cycles/clock frequency
improve performance by:
-using fewer clock cycles (change the instructions executed)
-increasing the clock frequency
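A quick numeric check of the two improvement options, sketched in Python. The cycle counts and clock frequencies are invented for the example.

```python
def cpu_time(clock_cycles: float, clock_hz: float) -> float:
    """CPU time in seconds = clock cycles / clock frequency."""
    return clock_cycles / clock_hz

# hypothetical program: 20 billion cycles on a 2 GHz CPU
baseline = cpu_time(20e9, 2e9)
fewer_cycles = cpu_time(15e9, 2e9)   # same clock, fewer cycles
faster_clock = cpu_time(20e9, 4e9)   # same cycles, doubled clock frequency

print(baseline)      # 10.0
print(fewer_cycles)  # 7.5
print(faster_clock)  # 5.0
```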
INSTRUCTIONS AND CPI
(cycles per instruction)
clock cycles = instruction count * cpi
cpu time = instruction count * cpi * clock cycle time = (instruction count * cpi)/clock frequency
-instruction count: depends on program, ISA and compiler
-cpi: depends on CPU hardware
cpi is different for each type of instruction
-clock cycles = sum{i=1..n}(cpi(i) * instruction count(i))
weighted average cpi: cpi = clock cycles/instruction count = sum{i=1..n}(cpi(i) * relative frequency of type-i instructions)
relative frequency: type-i count/instruction count
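A small Python sketch of the weighted average CPI, computed both ways (total cycles over instruction count, and per-type CPI times relative frequency). The instruction classes, their CPIs and their counts are hypothetical.

```python
# hypothetical instruction mix: class -> (CPI of the class, executed count)
mix = {
    "alu":        (1, 500),
    "load_store": (2, 300),
    "branch":     (3, 200),
}

def clock_cycles(mix):
    """Sum over instruction types of CPI(i) * count(i)."""
    return sum(cpi * count for cpi, count in mix.values())

def average_cpi(mix):
    """Weighted average CPI = clock cycles / instruction count."""
    instruction_count = sum(count for _, count in mix.values())
    return clock_cycles(mix) / instruction_count

def average_cpi_from_frequencies(mix):
    """Same value via sum of CPI(i) * relative frequency(i)."""
    total = sum(count for _, count in mix.values())
    return sum(cpi * (count / total) for cpi, count in mix.values())

print(clock_cycles(mix))  # 1700
print(average_cpi(mix))   # 1.7
```

Both functions give the same average up to floating-point rounding, which is why the notes can use either form.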
PERFORMANCE FORMULAS
secs/program = instructions/program * cycles/instruction (average cpi) * seconds/cycle
-instructions != lines of code
 instructions == executed/dynamic instructions (example: the body of an if whose condition is never met is never executed, so it does not count)
-seconds/cycle == clock period == 1/clock frequency
AVERAGE CPI
is different for every program (it depends on the instruction mix)
CPI ave = sum{i=1..n}(CPIi * ICi)
-ICi = fraction (%/100) of executed instructions that are of type i
WHAT PLAYS A ROLE IN PERFORMANCE
CPU time = instruction count * CPI * clock cycle time
instruction count is affected by: algorithm, programming language, compiler, ISA
CPI is affected by: algorithm, programming language, compiler, ISA, CPU hardware
clock cycle time is affected by: ISA, hardware
EFFECT OF PARTIAL SPEEDUP ON TOTAL TIME
t(total after) = t(unaffected part) + t(affected part)/speedup of that part
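The formula above with numbers, as a Python sketch. The function name and the 100 s program are hypothetical.

```python
def time_after_improvement(t_unaffected, t_affected, part_speedup):
    """Total time once only the affected part runs part_speedup times faster."""
    return t_unaffected + t_affected / part_speedup

# hypothetical program: 100 s in total, 80 s of it in the part we speed up 4x
before = 20.0 + 80.0
after = time_after_improvement(20.0, 80.0, 4.0)

print(after)           # 40.0
print(before / after)  # 2.5 -> a 4x partial speedup gives only 2.5x overall
```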
AMDAHL’S LAW
theoretical speedup when a part of the system is improved:
S -> total speedup
p -> fraction of the original execution time spent in the improved part (0 to 1)
s -> speedup of the improved part
S = 1/(1-p + p/s)
explanation:
t-> initial exec time
S = time before speedup / time after speedup
  = (t(1-p) + tp) / (t(1-p) + (tp)/s)
  = (1-p+p) / (1-p + p/s)
  = 1 / (1-p + p/s)
for a change to n processors, replace s with n in the formula (assumes the parallel part scales perfectly)
-note: as n tends to infinity, S tends to 1/(1-p)
-conclusion 1:
the total speedup is limited by the part that is not improved
-conclusion 2:
multiprocessor speedup is limited by the serial part
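A Python sketch of Amdahl's law that shows the limit for growing n. The function name amdahl_speedup and the choice p = 0.9 are assumptions made for the example.

```python
def amdahl_speedup(p, s):
    """Total speedup S when a fraction p of the time is sped up by a factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)

p = 0.9  # hypothetical: 90% of the run time can be parallelised
for n in (2, 8, 64, 1024):
    # replace s with the number of processors n
    print(n, round(amdahl_speedup(p, n), 2))

# upper bound as n tends to infinity: 1/(1-p)
print("limit:", round(1.0 / (1.0 - p), 2))
```

However large n gets, the printed speedup stays below 1/(1-p), which is the serial-part limit from conclusion 2.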
POWER CONSUMPTION AND PERFORMANCE
P(ower) = C(apacitive load) * V(oltage)^2 * clock F(requency)
P = C * V^2 * F (dynamic power, i.e. power spent on switching)
power is the rate of energy consumption: energy/time
optimization options:
once the voltage (V) cannot be lowered further and the extra heat of a higher clock frequency (F) cannot be dissipated,
we optimize with multicore and parallel programming instead (the work is split across cores rather than raising F)
power consumption and heat dissipation are the overall limiting factors for performance (esp. heat, which can damage the processor)
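A Python sketch of how the three factors combine. The 15% reductions are illustrative numbers, and only the ratio between two designs is meaningful here, so the units are left out.

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """Dynamic power = C * V^2 * F (relative units)."""
    return capacitive_load * voltage ** 2 * frequency

# hypothetical redesign: C, V and F are each reduced to 85% of the old value
old = dynamic_power(1.0, 1.0, 1.0)
new = dynamic_power(0.85, 0.85, 0.85)
print(round(new / old, 2))  # 0.52 -> about half the power

# voltage alone matters most because it is squared
only_voltage = dynamic_power(1.0, 0.85, 1.0)
print(round(only_voltage / old, 2))  # 0.72
```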
MULTICORE MICROPROCESSORS
instruction-level parallelism vs parallel programming
instruction-level parallelism is exploited by the hardware (and compiler) without the programmer’s involvement
parallel programming needs explicit work by the programmer for:
-scheduling
-load balancing
-optimized communication and synchronization between cores
COMPUTATIONAL POWER
MIPS -> million instructions per second
vague: there are multiple types of instructions with varying processing needs
MFLOPS -> million floating-point operations per second
usually clearer (floating-point operations are some of the more demanding ones)
! MIPS PERFORMANCE !
million instructions per second
does not take into account the different ISAs and the differing complexity of instructions
MIPS = instruction count/(exec time * 10^6)
     = clock frequency/(cpi * 10^6)
but cpi varies between programs, even on the same CPU
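A Python sketch of why MIPS can mislead: two hypothetical versions of the same program on the same CPU, where the one with the higher MIPS rating is the slower one. The clock frequency, the per-class CPIs and the instruction counts are all invented for illustration.

```python
def mips(instruction_count, exec_time_s):
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (exec_time_s * 1e6)

def mips_from_clock(clock_hz, cpi):
    """MIPS = clock frequency / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)

CLOCK_HZ = 500e6                # hypothetical 500 MHz CPU
CPI = {"A": 1, "B": 2, "C": 3}  # hypothetical CPI per instruction class

# executed instruction counts for two hypothetical compilations of one program
program_1 = {"A": 5e6, "B": 1e6, "C": 1e6}
program_2 = {"A": 10e6, "B": 1e6, "C": 1e6}

def report(counts):
    """Return (execution time in seconds, MIPS rating) for an instruction mix."""
    instructions = sum(counts.values())
    cycles = sum(CPI[kind] * n for kind, n in counts.items())
    exec_time = cycles / CLOCK_HZ
    return exec_time, mips(instructions, exec_time)

t1, m1 = report(program_1)
t2, m2 = report(program_2)
print(f"program 1: {t1:.3f} s, {m1:.0f} MIPS")
print(f"program 2: {t2:.3f} s, {m2:.0f} MIPS")
```

Program 2 executes more of the cheap class-A instructions, so its MIPS figure is higher even though it takes longer to finish. Execution time stays the reliable measure.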