Location & Spread Flashcards
How do we represent the mean?
x̄ (x bar)
How do we calculate the mean?
The sum of the x values divided by n (Σx / n), or for a frequency table the sum of f × x divided by the sum of f (Σfx / Σf).
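A minimal sketch of both versions of the mean in Python. The function names and the sample data are made up for illustration.

```python
def mean_from_list(values):
    """Mean of raw data: sum of x divided by n."""
    return sum(values) / len(values)


def mean_from_frequencies(xs, fs):
    """Mean from a frequency table: sum of f*x divided by sum of f."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


# Illustrative data: the same five values, listed and as a frequency table.
raw = [2, 3, 3, 5, 7]
print(mean_from_list(raw))                                # 4.0
print(mean_from_frequencies([2, 3, 5, 7], [1, 2, 1, 1]))  # 4.0
```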
How do you find the position of LQ (Q1) within listed/grouped data?
n/4
How do you find the position of the median within listed/grouped data?
n/2
How do you find the position of UQ (Q3) within listed/grouped data?
3n/4
When we find the position of a quartile within listed data, what do we do when the answer is a decimal?
Round up to the next integer
When we find the position of a quartile within listed data, what do we do when the answer is whole?
Find the midpoint with the next number
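The round-up / midpoint rule for listed data can be sketched as follows. The function name and the sample lists are illustrative assumptions, and q is 1, 2 or 3 for Q1, the median and Q3.

```python
import math


def quartile_listed(values, q):
    """Quartile of listed data: position q*n/4, counted from 1.

    Whole position   -> midpoint of that value and the next one.
    Decimal position -> round up to the next integer position.
    """
    data = sorted(values)
    n = len(data)
    if (q * n) % 4 == 0:
        k = q * n // 4
        return (data[k - 1] + data[k]) / 2
    k = math.ceil(q * n / 4)
    return data[k - 1]


eight = [1, 3, 4, 6, 8, 9, 11, 15]   # n = 8, so every position is whole
seven = [1, 3, 4, 6, 8, 9, 11]       # n = 7, so every position is a decimal
print([quartile_listed(eight, q) for q in (1, 2, 3)])  # [3.5, 7.0, 10.0]
print([quartile_listed(seven, q) for q in (1, 2, 3)])  # [3, 6, 9]
```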
How do we find percentiles of grouped data e.g. the 57th percentile?
Use the position 0.57 × n
What is a decile?
One of the nine values that split the ordered data into ten equal parts, each holding 10% of the data
How do we use linear interpolation to find the median?
- Find the true class limits and class width (since the data is most often rounded)
- Find the total frequency of the data and divide by 2 to find the position of your median
- Identify which group your median is in and calculate how far into the group it lies, as a fraction of that group's frequency.
- Multiply by the class width
- Add it to the lower bound of the class.
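The steps above can be sketched in Python. This is only an illustration: the function names, the (lower, upper, frequency) layout and the class data are assumptions, and the class limits are taken to be the true limits already.

```python
def interpolate_position(classes, position):
    """Linear interpolation in grouped data.

    classes:  list of (lower_bound, upper_bound, frequency), ascending.
    position: target position, e.g. n/2 for the median.
    """
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= position:
            # How far into this class the position lies, as a fraction
            fraction = (position - cumulative) / freq
            # Multiply by the class width and add to the lower bound
            return lower + fraction * (upper - lower)
        cumulative += freq
    raise ValueError("position exceeds the total frequency")


def grouped_median(classes):
    n = sum(freq for _, _, freq in classes)
    return interpolate_position(classes, n / 2)


# Illustrative classes: 0-10 (f=5), 10-20 (f=10), 20-30 (f=5), so n = 20
classes = [(0, 10, 5), (10, 20, 10), (20, 30, 5)]
print(grouped_median(classes))  # 15.0
```

The same helper works for any quartile or percentile by changing the position, for example `interpolate_position(classes, 0.57 * 20)` for the 57th percentile of this data.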
How do we find the interquartile range?
Q3-Q1
What is the advantage of using the interquartile range?
It ignores extremes
How do we find the interpercentile range?
Highest percentile-lowest percentile
What is variance?
A measure of spread that takes all values into account. It is the average squared distance from the mean.
What are the 2 formulas for calculating variance? (Use first formula)
The sum of the squared values divided by n, minus the square of the mean: Σx²/n − x̄²
OR
‘The mean of the squares minus the square of the mean’ (MSMSM)
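A short check that 'the mean of the squares minus the square of the mean' gives the same answer as the definition of variance as the average squared distance from the mean. The function names and the data set are illustrative.

```python
def variance_definition(values):
    """Average squared distance from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_msmsm(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2


data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data with mean 5
print(variance_definition(data))  # 4.0
print(variance_msmsm(data))       # 4.0
```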
What is standard deviation?
The square root of the variance. It indicates a typical distance of the values from the mean.
How do we calculate the standard deviation?
Take the square root of the variance.
What happens if you add a value to your data set that is within one standard deviation of the mean? i.e. is the number within the range of mean ± s.d.
The standard deviation will decrease.
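A quick numerical illustration of this card, using the population standard deviation from Python's statistics module. The helper name and the data are made up for the example.

```python
import statistics


def sd_before_and_after(values, new_value):
    """Return (old_sd, new_sd) using the population standard deviation."""
    old_sd = statistics.pstdev(values)
    new_sd = statistics.pstdev(list(values) + [new_value])
    return old_sd, new_sd


data = [2, 4, 4, 4, 5, 5, 7, 9]          # illustrative data: mean 5, s.d. 2
inside = sd_before_and_after(data, 6)    # 6 lies within 5 ± 2
outside = sd_before_and_after(data, 12)  # 12 lies outside 5 ± 2
print(inside[1] < inside[0])    # True: the s.d. decreased
print(outside[1] > outside[0])  # True: the s.d. increased
```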
What is Sxx?
The sum of the squared deviations from the mean: Sxx = Σ(x − x̄)² = Σx² − (Σx)²/n
Therefore we can use the expression Sxx/n to find variance
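A minimal sketch, assuming the usual definition Sxx = Σx² − (Σx)²/n, which is the sum of the squared deviations from the mean. The function name and data are illustrative.

```python
def sxx(values):
    """Sxx = sum of x squared minus (sum of x) squared over n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n


data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data
print(sxx(data))                  # 32.0
print(sxx(data) / len(data))      # 4.0, the variance
```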
What is coding?
Applying the same transformation to every data value (e.g. y = ax + b) so the data is easier to process. Depending on the transformation, this may or may not change the mean, standard deviation etc.
What happens to the mean if you have coded your data to be y = ax + b?
It’s affected by both the a and b components: ȳ = ax̄ + b
What happens to the standard deviation if you have coded your data to be y = ax + b?
It is multiplied by |a|: σ_y = |a|σ_x (it is not affected by the b component)
How are measures of location (e.g. mean) and spread (e.g. standard deviation) affected by coding?
Measures of location are affected by all parts of coding.
Measures of spread are only affected by multiplicative parts of coding.
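A small check of these coding rules for y = ax + b, using Python's statistics module. The values a = 3 and b = 10, the helper name and the data are arbitrary choices for illustration.

```python
import statistics


def code_data(values, a, b):
    """Apply the coding y = a*x + b to every value."""
    return [a * x + b for x in values]


x = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data: mean 5, s.d. 2
y = code_data(x, 3, 10)

# Location: the mean picks up both the multiplier and the shift.
print(statistics.fmean(y), 3 * statistics.fmean(x) + 10)
# Spread: the standard deviation picks up only the multiplier.
print(statistics.pstdev(y), 3 * statistics.pstdev(x))
```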
What is assumed when we use midpoints to calculate the mean?
The data is uniformly distributed within each class.
What are the advantages and disadvantages of using the mode?
+Useful for non-numerical data
+Not usually affected by outliers
-Doesn’t use all the data
-May not be representative if it has a low frequency
What are the advantages and disadvantages of using the median?
+Not affected by outliers
+Not significantly affected by errors
-Doesn’t make use of all the data
What are the advantages and disadvantages of using the mean?
+When the data set is very large, a few extreme values have negligible impact
-When the data set is very small, a few extreme values have a large impact
What are the advantages and disadvantages of using the range?
+Reflects the full data set
-Distorted by outliers
What are the advantages and disadvantages of using the IQR?
+Not distorted by outliers
-Doesn’t reflect all the data
What are the advantages and disadvantages of using standard deviation?
+When the data set is very large, a few outliers have negligible impact
-When the data set is very small, a few outliers have a large impact