Performance Modelling - Week 2 Flashcards
Performance Issues
Often some performance target needs to be met:
- Response time (completion within some time)
Might need to meet expected user demand:
- Is there sufficient storage? Network bandwidth?
Not a good idea to build a distributed system and then realise that it fails to meet demand or is too slow: trial and error is a bad idea!
Some solutions trade performance for something else:
XML is good for interoperability, but it increases the size of communication messages!
What is a performance model?
A model is a representation of a system that captures the most relevant characteristics of the system under study
It’s used to analyse the behaviour of a system under different circumstances
An abstraction of the reality that allows us to study the performance-related characteristics of a system.
Performance models are used to answer what-if questions as opposed to direct changes in the production environment.
Don’t confuse performance modelling with performance tuning.
Performance tuning
For a given application, tries to identify performance bottlenecks and improve performance
Performance Modelling approaches
Analytical Model
- Derive formulas; sometimes simple, usually not
- There is a branch of mathematics dealing with the properties of systems with queues, it is out of scope for this course.
Simulation
- Monte-Carlo simulations (random numbers are used)
- Discrete-event simulations (events over time are considered)
Monte-carlo simulations
Write software that describes the system we want to study, and generate random numbers to see how the system behaves
Analytical model examples
Minimum possible http transaction time:
t = RTT + t(request) + t(siteprocessing) + t(reply), where RTT is the round-trip time, t(request) = request_size/bandwidth, t(siteprocessing) is the server processing time, and t(reply) = reply_size/bandwidth
transfer_time = latency + size/bandwidth
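The two formulas above can be combined in a short sketch. The numbers plugged in below (50 ms RTT, a 1 kB request, a 100 kB reply, a 10 Mbit/s link, 5 ms of processing) are illustrative assumptions, not values from the notes:

```python
def transfer_time(latency_s, size_bits, bandwidth_bps):
    """transfer_time = latency + size/bandwidth"""
    return latency_s + size_bits / bandwidth_bps

def min_http_time(rtt_s, request_bits, reply_bits, bandwidth_bps, processing_s):
    """Minimum possible HTTP transaction time:
    t = RTT + t(request) + t(siteprocessing) + t(reply)"""
    return (rtt_s
            + request_bits / bandwidth_bps   # t(request)
            + processing_s                   # t(siteprocessing)
            + reply_bits / bandwidth_bps)    # t(reply)

# 50 ms RTT, 1 kB request, 100 kB reply, 10 Mbit/s link, 5 ms processing
t = min_http_time(0.050, 1_000 * 8, 100_000 * 8, 10_000_000, 0.005)
print(f"{t:.4f} s")  # 0.1358 s
```

Note that the reply transfer (80 ms here) dominates: for large payloads, bandwidth matters more than latency.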
Parallel computation
Amdahl’s Law: the running time, tp, of a programme running on p CPUs, where a fraction 1-x of the work is inherently sequential (so a fraction x can be parallelised) and the running time on one machine is ts:
tp = ts*((1-x) + x/p)
A more realistic version is max(critical_path_workload, total_workload/p)
Transfer time formula
transfer_time = latency + size/bandwidth
Amdahl’s Law
the running time, tp, of a programme running on p CPUs, where a fraction 1-x of the work is inherently sequential (so a fraction x can be parallelised) and the running time on one machine is ts:
tp = ts*((1-x) + x/p)
A more realistic version is max(critical_path_workload, total_workload/p)
There is some overhead to divide the request between machines
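Both versions of the formula are easy to sketch in code. The example values (a 100 s sequential run, 90% parallelisable, 10 CPUs) are assumptions chosen for illustration:

```python
def amdahl_time(ts, x, p):
    """Amdahl's Law: tp = ts*((1-x) + x/p), where x is the
    parallelisable fraction and 1-x is inherently sequential."""
    return ts * ((1 - x) + x / p)

def realistic_time(critical_path_workload, total_workload, p):
    """The more realistic bound: we can never beat the critical path,
    nor perfect load balancing of the total workload over p machines."""
    return max(critical_path_workload, total_workload / p)

print(amdahl_time(100, 0.9, 10))       # ~19 s, not 10 s: the 10% sequential part limits speedup
print(realistic_time(10, 100, 4))      # 25 s: load balancing dominates
print(realistic_time(10, 100, 50))     # 10 s: critical path dominates
```

Note how adding CPUs beyond a point stops helping: even with p → ∞, Amdahl's Law gives tp → ts*(1-x).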
Queuing Theory
- Based on applied probability theory
- It studies the behaviour of systems with service counters, to which customers arrive and join a waiting line (queue).
Parameters of the problem:
- Arrival rate
- Number of servers
- Processing time
- …
Key Question: What is the average waiting time?
Interested in steady-state solutions
Little’s Law
Steady state: the arrival rate equals the departure rate of completed work
In the steady state:
Avg_no_of_customers_in_system = arrival_rate x avg_time_in_system
Example:
An arrival rate of 3 requests/sec x 1 sec spent in the system = 3 requests in the system at any point in time. Steady state implies a departure rate of 3 requests/sec; with each request taking 1 sec to serve, that requires more than one server.
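The example above is just a multiplication; a minimal sketch of it (function names are my own):

```python
def customers_in_system(arrival_rate, avg_time_in_system):
    """Little's Law in steady state: L = lambda x W."""
    return arrival_rate * avg_time_in_system

# 3 requests/sec arriving, each spending 1 sec in the system:
L = customers_in_system(3.0, 1.0)
print(L)  # 3.0 requests in the system at any point in time
```

Since one server handling one request at a time completes at most 1 request/sec, sustaining a 3 requests/sec departure rate needs at least 3 servers.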
Monte-carlo methods
Calculate results based on random sampling
1. Generate random input
2. Simulate what follows and check the result
Useful for computing a result after a large number of samples.
Example: estimate the value of pi using random numbers.
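The classic pi example can be sketched as follows: sample random points in the unit square and count how many fall inside the quarter circle of radius 1; that fraction approaches pi/4.

```python
import random

def estimate_pi(samples, seed=0):
    """Monte Carlo estimate of pi via random sampling in the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    inside = sum(1 for _ in range(samples)
                 if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4 * inside / samples

print(estimate_pi(100_000))  # roughly 3.14; accuracy improves with more samples
```

This illustrates the general pattern from the notes: generate random input, simulate what follows, and compute a result over a large number of samples.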
Discrete event simulation
Events may be randomly generated
- The simulation proceeds as a chronological sequence of events
- Implies some notion of time in the simulation
- Many computer games are discrete event simulations
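A minimal discrete-event sketch of a single-server queue, combining the queueing-theory parameters above with random event generation. The exponential arrival/service assumptions and parameter values are my own illustrative choices:

```python
import heapq
import random

def simulate_queue(arrival_rate, service_rate, n_customers, seed=0):
    """Tiny discrete-event simulation: arrival events are processed in
    chronological order from a priority queue; returns the average wait."""
    rng = random.Random(seed)
    events = []            # (time, label) pairs, ordered by time
    t = 0.0
    for _ in range(n_customers):
        t += rng.expovariate(arrival_rate)      # random inter-arrival times
        heapq.heappush(events, (t, "arrival"))
    server_free_at = 0.0
    total_wait = 0.0
    while events:                               # advance simulated time event by event
        time, _ = heapq.heappop(events)
        start = max(time, server_free_at)       # queue if the server is busy
        total_wait += start - time
        server_free_at = start + rng.expovariate(service_rate)
    return total_wait / n_customers

w = simulate_queue(arrival_rate=3.0, service_rate=5.0, n_customers=10_000)
print(f"average wait: {w:.3f} s")
```

Note the two ingredients the notes mention: a chronological sequence of events and an explicit notion of simulated time (the event timestamps).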
Performance modelling
useful to understand performance related properties of distributed systems.
Performance related properties may relate to:
- response time
- throughput
- storage requirements
- CPU Capacity
- etc…
Common approaches:
- Analytical modelling
- Simulation
Performance Targets
Response Time
- When many machines are involved: speedup
Throughput (work per unit of time)
Resource utilisation (think of a cloud provider)
Energy Consumption
- Relates to CPU frequency, application profile
Minimize cost (when using cloud resources)
Minimize number of failures (fault tolerance)
Trade computation with communication
- If we compute everything on 1 computer there is no communication; if we split the load with another computer, computation per machine is reduced but communication increases. This is a typical trade-off.
Communication vs Computation
Given:
Workload: 24 GB
Processing time: 1 second per GB
Transfer time: 4 seconds per GB
We’d need 24 seconds to run on one machine.
If we split the work across more machines we may save some time, but the network is unpredictable, may consume more energy, and is prone to failure. There is a trade-off!
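Plugging the numbers above into a simple cost model makes the trade-off concrete. The model below assumes one machine holds the data, ships an equal share to each of the other p-1 machines over a single link, and then all compute in parallel; this cost model is my own simplifying assumption:

```python
def time_on_p_machines(workload_gb, compute_s_per_gb, transfer_s_per_gb, p):
    """Total time = time to ship (p-1)/p of the data out over one link,
    plus the parallel compute time for each machine's 1/p share."""
    shipped_gb = workload_gb * (p - 1) / p
    transfer = shipped_gb * transfer_s_per_gb
    compute = (workload_gb / p) * compute_s_per_gb
    return transfer + compute

for p in (1, 2, 4):
    print(p, time_on_p_machines(24, 1, 4, p))
# p=1: 24 s (no communication)
# p=2: 60 s (48 s transfer + 12 s compute) - worse!
```

With transfer 4x slower than compute, splitting the work never pays off under this model, which is exactly why it has been argued that algorithms should communicate as little as possible.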
It has been argued that there is a need “to design algorithms that communicate as little as possible”.