Benchmarking Flashcards
What are the differences between a real and a synthetic workload?
Real workload:
List of requests observed during normal operation of a real system.
This is uncontrollable and harder to reproduce.
Synthetic workload:
List of requests used for controlled performance testing.
This is repeatable and controllable, and it should represent the real workload.
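The repeatability of a synthetic workload can be illustrated with a seeded generator. This is only a minimal sketch: the function name `synthetic_workload`, the read/write operations, and the parameters `read_ratio` and `n_keys` are assumptions made for the example, not part of the flashcards.

```python
import random

def synthetic_workload(n_requests, seed, read_ratio=0.8, n_keys=1000):
    """Generate a repeatable list of (operation, key) requests.

    All parameters are hypothetical knobs chosen for illustration.
    """
    rng = random.Random(seed)  # private generator: same seed, same workload
    workload = []
    for _ in range(n_requests):
        op = "read" if rng.random() < read_ratio else "write"
        key = rng.randrange(n_keys)
        workload.append((op, key))
    return workload

run_a = synthetic_workload(5, seed=42)
run_b = synthetic_workload(5, seed=42)
print(run_a == run_b)  # same seed, so both runs issue identical requests
```

Changing `read_ratio` or `n_keys` is what makes the workload controllable; keeping the seed fixed is what makes it repeatable.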
What should be kept in mind when selecting a workload?
- Services exercised
- Level of detail
- Representativeness
- Timeliness
- Loading level
- Repeatability
- ...
What is the definition of a unidirectional effect?
An effect that only increases, or only decreases, as the level of a factor increases. If an effect is not unidirectional, it is impossible to retrieve consistent information about that factor.
What is the main difference between a simple design, varying one factor at a time, and a full factorial design?
With a simple design we cannot capture interactions between factors, so the design implicitly assumes that factors do not interact, which is unrealistic. A full factorial design, on the other hand, runs an experiment for every possible combination of levels across all factors.
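The difference in experiment count can be sketched as follows. The factor names and levels are hypothetical, and the simple design here takes the first level of each factor as its baseline, which is one common convention rather than the only one.

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "cpu": ["slow", "fast"],
    "memory_gb": [4, 8, 16],
    "disk": ["hdd", "ssd"],
}

def full_factorial(factors):
    """One run per combination of levels across all factors."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

def simple_design(factors):
    """Baseline run, then vary one factor at a time from the baseline."""
    baseline = {name: levels[0] for name, levels in factors.items()}
    runs = [baseline]
    for name, levels in factors.items():
        for level in levels[1:]:
            runs.append({**baseline, name: level})
    return runs

print(len(full_factorial(factors)))  # 2 * 3 * 2 = 12 runs
print(len(simple_design(factors)))   # 1 + (1 + 2 + 1) = 5 runs
```

The simple design never runs, for example, a fast CPU together with an SSD, so any interaction between those two factors stays invisible.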
What is the main goal of performing a 2^(k-p) Fractional Factorial Design?
The main goal is to reduce the number of experiments required by a Full Factorial Design. To do so, the experiments are run with several effects confounded. This can be used to test whether the interactions of some factors are indeed negligible without having to run all the combinations. The main drawback is that confounded effects cannot be separated, so not every effect can be determined individually.
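As a minimal sketch of the idea, the sign table of a fractional design with k = 3 factors and p = 1 can be built from a full factorial in A and B, assigning the third factor through the generator C = AB. The choice of generator and the function name are assumptions made for illustration.

```python
from itertools import product

def fractional_design_2_3_1():
    """Sign table for a 2^(3-1) design, assuming the generator C = AB."""
    rows = []
    for a, b in product([-1, 1], repeat=2):  # full factorial in A and B
        c = a * b                            # C shares its column with AB
        rows.append((a, b, c))
    return rows

for row in fractional_design_2_3_1():
    print(row)
```

This uses 4 experiments instead of the 8 a full 2^3 design would need. The price is that the column for C is identical to the column for the AB interaction, so the two effects cannot be told apart.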
How should confoundings be chosen?
Ideally, we should confound significant effects with insignificant ones, so that each confounded estimate can be attributed to the significant effect.
What are the most common mistakes when designing plots?
▪ Excess information
▪ Multiple scales
▪ Using symbols in place of text
▪ Poor scales
▪ Using lines incorrectly
▪ Non-zero origins
▪ Three quarters rule: Highest point should be ¾ of scale (or more)
▪ Two related measures on the same graph
▪ Omitting confidence intervals
▪ Histogram cell size
▪ CDF vs histogram to compare several data sets
How can we measure the impact of a factor in a given system after computing the sign table?
Once the effects have been computed from the sign table, the impact of each factor and of each interaction is quantified by the proportion of the total variance in the response that it explains.
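For a design with two factors at two levels each, this computation can be sketched as below. The response values are made-up numbers used only to show the arithmetic, and the function name and run ordering are assumptions of the example.

```python
def allocate_variation(y):
    """Fraction of total variation explained by A, B and AB in a 2^2 design.

    y holds the responses in the run order (A, B) = (-1,-1), (1,-1), (-1,1), (1,1).
    """
    sign_a = [-1, 1, -1, 1]
    sign_b = [-1, -1, 1, 1]
    sign_ab = [a * b for a, b in zip(sign_a, sign_b)]
    n = len(y)
    # Each effect is the sign column dotted with the responses, divided by n.
    effects = {
        name: sum(s * yi for s, yi in zip(signs, y)) / n
        for name, signs in (("A", sign_a), ("B", sign_b), ("AB", sign_ab))
    }
    # Total variation: n times the sum of squared effects.
    sst = n * sum(q * q for q in effects.values())
    if sst == 0:
        raise ValueError("responses are all equal: no variation to allocate")
    return {name: n * q * q / sst for name, q in effects.items()}

# Hypothetical responses, for illustration only.
shares = allocate_variation([15, 45, 25, 75])
for name, share in shares.items():
    print(f"{name}: {share:.1%}")  # A: 76.2%, B: 19.0%, AB: 4.8%
```

With these numbers, factor A explains most of the variation, and the AB interaction explains very little.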
What is a test workload?
List of requests used to analyse the performance of a SUT. Can be either a Real Workload or a Synthetic Workload.
A real workload typically cannot be repeated and is therefore generally not suitable for use as a test workload.
What are application benchmarks used for?
Application benchmarks, or macro-benchmarks, are used to evaluate the performance of a System Under Test (SUT) as a whole.
What are exercisers used for?
Exercisers, or micro-benchmarks, are used to evaluate a specific Component Under Test (CUT).
What is a benchmark model composed of?
Metrics -> Criteria used to evaluate the performance of the system
Factors -> Parameters that are varied in the performance study
Levels -> Values taken by each factor
What is the goal of designing a proper experiment?
Run the smallest number of experiments that still allows for strong conclusions.
What is a systematic service characterization?
- Identify service provided by major subsystem
- List factors affecting performance
- List metrics that quantify demands and performance
- Identify workload provided to that service
Describe the 3 averaging techniques used to characterize workload parameters.
Mean -> More affected by outliers than median or mode
Median -> Sort the observations in increasing order, take the observation in the middle of the series. More resistant to outliers
Mode -> Plot a histogram of the observations. Choose the midpoint of the bucket where the histogram peaks. For categorical variables, the mode is the most frequently occurring category.
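The effect of an outlier on the three averages can be checked with Python's standard `statistics` module. The observations are hypothetical. `statistics.mode` returns the most frequent value directly, which matches the categorical definition above rather than the histogram-bucket procedure.

```python
import statistics

# Hypothetical service times; 100 is a deliberate outlier.
observations = [2, 3, 3, 4, 5, 100]

print(statistics.mean(observations))    # 19.5, pulled up by the outlier
print(statistics.median(observations))  # 3.5, average of the two middle values
print(statistics.mode(observations))    # 3, the most frequent value

# For a categorical variable, the mode is the most frequent category.
request_types = ["read", "read", "write", "read", "delete"]
print(statistics.mode(request_types))   # read
```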