Performance Modelling - Week 2 Flashcards

1
Q

Performance Issues

A

Often some performance target needs to be met:
- Response time (completion within some time)

Might need to meet expected user demand:
- Is there sufficient storage? Network bandwidth?

Not a good idea to build distributed system and then realise that it fails to meet demand or is too slow: Trial and error is a bad idea!

Some solutions trade performance for something else:
XML is good for inter-operability but it increase the size of communication messages!

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is a performance model?

A

A model is a representation of a system that captures the most relevant characteristics of the system under study

It’s used to analyse the behaviour of a system under different circumstances

An abstraction of the reality that allows us to study the performance-related characteristics of a system.

Performance models are used to answer what-if questions as opposed to direct changes in the production environment.

Don’t confuse performance modelling with performance tuning.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Performance tuning

A

For a given application tries to identify performance bottlenecks and improve performance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Performance Modelling approaches

A

Analytical Model
- Derive formulas, sometimes simple, usual it is not
- There is a branch of mathematics dealing with the properties of systems with queues, it is out of scope for this course.

Simulation
- Monte-Carlo simulations (random numbers are used)
- Discrete-event simulations (events over time are considered)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Monte-carlo simulations

A

write software which describes the system we want to look at, and generate random numbers to see how the system behaves

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Analytical model examples

A

Minimum possible http transaction time:
t = RTT + t(request) + t(siteprocessing) + t(reply) where RTT is round trip time, t(request) is requestsize/bandwidth, t(siteprocessing) is the time for processing and t(reply) is replysize/bandwidth

transfer_time = latency + size/bandwidth

Parallel computation
Amdahl’s Law: the running time, tp, of a programme running on p CPUs, which has an inherently sequential part 1-x and when running on one machine runs in time ts:
tp = ts*(1-x + x/p)

A more realistic version is max(critical_path_workload, total_workload/p)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Transfer time formula

A

transfer_time = latency + size/bandwidth

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Amdahl’s Law

A

the running time, tp, of a programme running on p CPUs, which has an inherently sequential part 1-x and when running on one machine runs in time ts:
tp = ts*(1-x + x/p)

A more realistic version is max(critical_path_workload, total_workload/p)

There is some overhead to divide the request between machines

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Queuing Theory

A
  • Based on applied probability theory
  • it studies the behaviour of systems with service counters to which customers arrive and join a waiting line (queue).
    Parameters of the problem:
    • Arrival rate
    • Number of servers
    • Processing time

Key Question: What is the average waiting time?
Interested in steady-state solutions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Little’s Law

A

Steady State - arrival rate is similar to departure rate of completed work

In the steady state:
Avg_no_of_customers_in = arrival_rate x service_time

Example:
arrival rate: 3 seconds /sec x 1 sec = 3 request in the system at any point in time. Steady-state implies 3 requests / sec departure rate, hence, more than one server

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Monte-carlo methods

A

Calculate results based on random sampling
1. Generate random input
2. Simulate what follows and check the result

Useful for computing a result after a large number of samples.
Example: estimate the value of pi using random numbers.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Discrete event simulation

A

Events are randomly generated
- The simulation proceeds as a chronological sequence of events
- Implies some notion of time in the simulation
- Many computer games are discrete event simulations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Performance modelling

A

useful to understand performance related properties of distributed systems.

Performance related properties may relate to:
- response time
- throughput
- storage requirements
- CPU Capacity
- etc…

Common approaches:
- Analytical modelling
- Simulation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Performance Targets

A

Response Time
- When many machines are involved: speedup

Throughput (work per unit of time)

Resource utilisation (think of a cloud provider)

Energy Consumption
- Relates to CPU frequency, application profile

Minimize cost (when using cloud resources)

Minimize number of failures (fault tolerance)

Trade computation with communication
- If we compute everything on 1 computer there’s no communication, if we split the load with another computer computation is reduced but communication is increased. This is a typical trade-off.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Communication vs Computation

A

Given:
Workload - 24gb
Processing time - 1 second for 1 gb
Transfer time - 4 sec for 1 gb

We’d need 24 seconds to run on one machine.

If we split up to more machines we may save some time but the network will be unpredictable, may consume more energy and be prone to failure. There is a trade-off!

It has been argues that there is a need “to design algorithms that communicate as little as possible”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly