Measures of Dispersion. Flashcards
The mean can be misleading without some knowledge of _________.
Spread (dispersion).
What is the simplest way to measure the variation among a set of values?
The Range.
What is the range?
The distance between the top and bottom values in a data set (subtract the bottom value from the top value).
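A minimal Python sketch of the range calculation. The data and the function name `value_range` are hypothetical, chosen only for illustration.

```python
def value_range(values):
    """Return the top value minus the bottom value."""
    return max(values) - min(values)


scores = [4, 8, 15, 16, 23, 42]   # hypothetical data
print(value_range(scores))        # 42 - 4 = 38
```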
What is the most important measure of dispersion?
Standard Deviation.
What are the pros of finding the range?
- Includes extreme values
- Easy to calculate.
What are the cons of the range?
- Can be misleading, as it can be distorted by extreme values
- Not representative of any features of the values between the extremes.
What is distribution?
The way the values in a data set are spread out.
Why are the interquartile and semi-interquartile ranges better than the range?
They are more representative of the distribution of values between the extremes.
The interquartile and semi-interquartile ranges focus specifically on what?
On the central grouping of values in a set.
The ________ range represents the distance between the 2 values that cut off the bottom and top 25% of values.
Interquartile.
0.25 x (N+1) - what is this the formula for?
To find the position of Q1 in the ordered data (the value cutting off the bottom 25%).
0.75 x (N+1) - what is this the formula for?
To find the position of Q3 in the ordered data (the value cutting off the top 25%).
After the quartile positions have been identified, what is done?
The values at those positions are read off, then Q3 - Q1 = interquartile range.
What is the formula to find out the interquartile range?
Q3 - Q1
What is the semi-interquartile range?
Half of the interquartile range.
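A rough Python sketch of the position formulas and the two ranges. The data and the function name `quartile_value` are hypothetical. Interpolating between neighbouring values when a position is not a whole number is an assumption, since the cards do not say how to handle fractional positions.

```python
def quartile_value(values, fraction):
    """Return the value at position fraction * (N + 1) in the ordered data.

    Positions count from 1. Interpolating between neighbours for a
    fractional position is an assumption, not something the cards specify.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Keep the position inside the data for very small N.
    position = min(max(fraction * (n + 1), 1), n)
    lower = int(position)              # whole part of the position
    remainder = position - lower       # fractional part of the position
    upper = min(lower + 1, n)
    low_value = ordered[lower - 1]
    high_value = ordered[upper - 1]
    return low_value + remainder * (high_value - low_value)


scores = [7, 2, 13, 5, 10, 4, 8]       # hypothetical data, N = 7
q1 = quartile_value(scores, 0.25)      # position 0.25 x 8 = 2
q3 = quartile_value(scores, 0.75)      # position 0.75 x 8 = 6
iqr = q3 - q1                          # interquartile range
semi_iqr = iqr / 2                     # semi-interquartile range
print(q1, q3, iqr, semi_iqr)           # 4.0 10.0 6.0 3.0
```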
Q1 and Q3 are the values cutting off the bottom and top ____ of values.
25%.
Name a pro of the interquartile (and semi-interquartile) range.
It is representative of the central grouping of values in a dataset.
Name a con of the interquartile (and semi-interquartile) range.
It takes no account of extreme values.
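To see the pro and the con together, here is a small Python sketch comparing the range and the interquartile range on two made-up data sets that differ only in their top value. It uses the standard library's `statistics.quantiles` with `method="exclusive"`, which places the quartiles using (N+1)-based positions. The function name `spread_summary` is hypothetical.

```python
import statistics


def spread_summary(values):
    """Return the range and interquartile range of a data set."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {"range": max(values) - min(values), "iqr": q3 - q1}


typical = [2, 4, 5, 7, 8, 10, 13]          # hypothetical data
with_outlier = [2, 4, 5, 7, 8, 10, 130]    # same data, one extreme value

print(spread_summary(typical))       # {'range': 11, 'iqr': 6.0}
print(spread_summary(with_outlier))  # {'range': 128, 'iqr': 6.0}
```

The extreme value inflates the range but leaves the interquartile range unchanged.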
Name this:
The difference between a number in a data set and the mean.
Mean Deviation.
If a value is above the mean, its mean deviation is __________.
Positive.
If a value is below the mean, its mean deviation is __________.
Negative.
How is the mean deviation calculated?
x - x bar
i.e. number in data set - mean.
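A short Python sketch of the x - x bar calculation, showing which deviations come out positive and which negative. The data and the function name `deviations_from_mean` are hypothetical.

```python
def deviations_from_mean(values):
    """Return x - mean for every x in the data set."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]


scores = [3, 3, 5, 7, 7]   # hypothetical data, mean = 5
for x, d in zip(scores, deviations_from_mean(scores)):
    sign = "positive" if d > 0 else "negative" if d < 0 else "zero"
    print(x, d, sign)
```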
What is standard deviation?
The square root of the average of the squared deviation values in a dataset, summarising dispersion in terms of deviation from the mean.
_____ _______ is the square root of the variance.
Standard Deviation.
What is standard deviation known to be?
The most powerful way to measure spread.
In standard deviation, how do we get rid of the negatives (some mean deviations are negative)?
We square each mean deviation.
Standard Deviation calculates the ______ amount by which scores differ from the _______.
Average, mean.
What is the most accurate way of measuring dispersion from the mean?
Standard Deviation.
What are the steps of standard deviation?
- Find the mean
- Take each deviation value (x - x bar)
- Square each value
- Add all the squares up
- Divide by N-1
- Take the square root of the result.
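The steps above can be sketched in Python as follows. The data and the function name `sample_standard_deviation` are hypothetical; the calculation divides by N-1, as in the steps.

```python
import math


def sample_standard_deviation(values):
    """Follow the card's steps, dividing by N - 1."""
    n = len(values)
    mean = sum(values) / n                      # find the mean
    deviations = [x - mean for x in values]     # x - x bar for each value
    squares = [d ** 2 for d in deviations]      # square each deviation
    total = sum(squares)                        # add all squares up
    variance = total / (n - 1)                  # divide by N - 1
    return math.sqrt(variance)                  # square root of the result


scores = [3, 3, 5, 7, 7]   # hypothetical data, mean = 5
# Squared deviations: 4, 4, 0, 4, 4 -> sum 16 -> 16 / 4 = 4 -> sqrt = 2
print(sample_standard_deviation(scores))        # 2.0
```

The standard library's `statistics.stdev` performs the same N-1 calculation and can be used to check a result worked out by hand.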
What are the pros of standard deviation?
- Takes exact account of all values
- The most sensitive measure
What are the cons of standard deviation?
- Hassle to work out
- It can still be distorted by extreme values.
Name the 3 dispersion measures.
- The Range
- The interquartile and semi-interquartile range
- Standard Deviation.
What is the problem with mean deviation?
Added together, the deviations come to 0, because the negative deviations cancel out the positive ones.
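A quick Python check of this cancelling-out, using made-up data:

```python
scores = [3, 3, 5, 7, 7]                    # hypothetical data, mean = 5
mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]     # [-2.0, -2.0, 0.0, 2.0, 2.0]
total = sum(deviations)
print(total)                                # 0.0
```

With data whose mean is not a round number, floating-point rounding may leave a tiny value very close to 0 rather than exactly 0.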
What data is standard deviation most appropriate for?
Interval data and ratio data.
The standard deviation is associated with the mean. But what do we need to take into account?
Skew of data.
What measure of dispersion should be used for ordinal data?
Interquartile Range and Semi-Interquartile Range.
If data are nominal, what does not apply at all?
Dispersion (we don’t measure dispersion with nominal data).