Workshop 2b - measures of dispersion Flashcards
Spread
A term for how much data values differ from each other, and for how much data values differ from the measure of center.
How to interpret spread
Use a dot plot. When all the dots fall on the same value, there is no variation. When the dots are spread over a wide range of values, the variation is large.
Measures of spread
Range
Mean absolute deviation
Variance
Standard deviation
Range
How much the values differ from each other. The difference between the highest value and the lowest value.
R = Maximum - Minimum
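As a quick illustration of the formula, here is a minimal Python sketch. The numbers in `values` are made up for the example.

```python
# Made-up example data (an assumption, not from the workshop)
values = [4, 8, 15, 16, 23, 42]

# Range = maximum - minimum
data_range = max(values) - min(values)
print(data_range)  # 38
```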
Quartiles
A way to get better insight into how your data is distributed.
Step 1 - Put all values in order
Step 2 - Find the median (Q2). It splits the ordered data into a lower half and an upper half.
Step 3 - Now find the median of the lower half (Q1) and of the upper half (Q3).
IQR = Q3 - Q1
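The quartile steps can be sketched in Python as follows. This is one possible convention, assumed here: with an odd number of values the median itself is left out of both halves. Other textbooks and software use slightly different rules, so results can differ a little. The helper name `quartiles` and the data are made up for the example.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)          # Step 1: put all values in order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the median position
    upper = ordered[n - half:]        # values above the median position
    # Step 2: median of everything; Step 3: median of each half
    return median(lower), median(ordered), median(upper)

# Made-up example data
q1, q2, q3 = quartiles([7, 1, 15, 3, 11, 5, 13, 9])
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 4.0 8.0 12.0 8.0
```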
Variance
Mean squared deviation. It has drawbacks: its unit is the square of the original unit, and extreme values weigh heavily.
When the spread is high, the variance is also high.
1. Calculate the mean
2. Calculate the distance between the values and the mean. Value - mean. Answer can be negative
3. To get rid of the minus signs, you square each deviation
4. The variance is the mean squared deviation.
Variance = sum of squared deviations / number of observations
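The variance calculation can be followed step by step in a short Python sketch, squaring each deviation before averaging. The data are made up for the example. Note that it divides by the number of observations, as on this card; many textbooks divide by n - 1 when the data are a sample.

```python
# Made-up example data
values = [2, 4, 4, 4, 5, 5, 7, 9]

# 1. Calculate the mean
mean = sum(values) / len(values)               # 5.0

# 2. Distance between each value and the mean (can be negative)
deviations = [v - mean for v in values]

# 3. Square each deviation to get rid of the minus signs
squared_deviations = [d ** 2 for d in deviations]

# 4. The variance is the mean of the squared deviations
variance = sum(squared_deviations) / len(values)
print(variance)  # 4.0
```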
Standard deviation
Square root of the variance.
- Calculate the sum of squares
- Calculate the variance
- Calculate the standard deviation.
Standard deviation is the ‘standard’ of spread
Squared differences are mathematically convenient to work with in optimisation
Squared differences give more emphasis to extreme values
It is easy to interpret because the unit of the standard deviation is the same as the unit of the original variable.
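The three steps listed for the standard deviation (sum of squares, variance, square root) might look like this in Python. The data are made up, and the variance divides by the number of observations, which is an assumption that matches the variance card.

```python
import math

# Made-up example data
values = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(values) / len(values)

# Sum of squares: squared deviations from the mean, added up
sum_of_squares = sum((v - mean) ** 2 for v in values)   # 32.0

# Variance: sum of squares divided by the number of observations
variance = sum_of_squares / len(values)                 # 4.0

# Standard deviation: square root of the variance,
# expressed in the same unit as the original values
std_dev = math.sqrt(variance)
print(std_dev)  # 2.0
```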
Coefficient of variation
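You might want to compare the variation in two different things, for example variables with different units or very different means. Can only be used for variables measured on a ratio scale.
Dividing the standard deviation by the mean provides a solution.
Coefficient of variation = standard deviation / mean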
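A minimal Python sketch of this comparison, assuming the coefficient of variation is the standard deviation divided by the mean. The helper name `coefficient_of_variation` and both data sets are made up for illustration; `statistics.pstdev` divides by the number of observations.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (unitless)."""
    return statistics.pstdev(values) / statistics.mean(values)

# Made-up ratio-scale data in two different units
heights_cm = [160, 170, 180]
weights_kg = [50, 70, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

# The two results have no unit, so they can be compared directly
print(round(cv_height, 3), round(cv_weight, 3))  # 0.048 0.233
```

In this made-up example the weights vary more relative to their mean than the heights do, even though the two variables are measured in different units.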