Chapter 4: Numerical Descriptive Techniques Flashcards
Measures of central location
Arithmetic mean (mean, average)
Median
Mode
Arithmetic mean
Aka mean or average
Sum of all observations / total number of observations
Essentially same calculation for sample and population
(Average function in Excel)
Usually first selection of central location but can be sensitive to extreme outlier values
Only appropriate for interval data
Median
Observation that falls in the middle of a list of observations placed in order
If even number of observations then median determined by averaging two middle observations
Same calculation for sample and population
Median function in Excel
Often a better measure of center than the mean if there are a small number of extreme outlier observations
50% of observations are above and 50% below
Useful for interval AND ordinal data
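A minimal sketch of how an outlier pulls the mean but barely moves the median. The data values are hypothetical and chosen only for illustration; Python's standard `statistics` module stands in for the Excel functions named on the cards.

```python
from statistics import mean, median

# Hypothetical observations: five typical values, then the same list
# with one extreme outlier appended.
values = [10, 12, 13, 15, 16]
with_outlier = values + [500]

# Odd count: the median is the single middle observation.
print(mean(values), median(values))

# Even count: the median averages the two middle observations (13 and 15).
# The mean jumps to about 94.3 while the median only moves to 14.
print(mean(with_outlier), median(with_outlier))
```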
Mode
The observation (or observations) that occur with the greatest frequency
Sample and population calculated the same way
For larger samples and populations modal class may make more sense than a single mode value
Not great for small samples, potentially not unique
Mode function in Excel
- If there are multiple modes, Excel returns only the smallest one without indicating that others exist
Can be used for any type of data (interval, ordinal, nominal)
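A short sketch of finding modes, including the non-unique case and nominal data. The lists are hypothetical. Note that Python's `statistics.mode` returns the first mode it encounters, which may differ from the spreadsheet behavior described on the card; `multimode` lists every mode.

```python
from statistics import mode, multimode

# Hypothetical interval data with two modes (3 and 1 each occur twice).
sizes = [3, 1, 3, 2, 1]

# Hypothetical nominal data: the mode is the only usable measure of center.
colors = ["red", "blue", "red", "green"]

print(multimode(sizes))   # every value tied for the highest frequency
print(mode(colors))       # the single most frequent category
```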
Using Excel to calculate multiple statistics
Data > Data Analysis > Descriptive Statistics > select the input range > check Summary statistics
Measures of variability
Range
Variance
Standard deviation
Coefficient of variation
Range
= largest observation - smallest observation
No information about observations in between
Variance
Average of the squared deviations from the mean
- calculate mean
- find the difference (deviation) of each observation from the mean
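- square each deviation and sum the squares
- divide that sum by 1 less than the number of observations (n - 1) for a sample; this corrects for using the sample mean in place of the population mean (for a population, divide by the number of observations)
- the result is the variance (s^2 for a sample), expressed in squared units of the data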
Excel: use VAR function
Mostly useful for comparing multiple sets of data
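The steps on this card can be sketched as follows, assuming the data are a sample (so the divisor is n - 1). The function name `sample_variance` and the data are hypothetical; the result is cross-checked against `statistics.variance`.

```python
from math import sqrt
from statistics import variance

def sample_variance(observations):
    """Sample variance computed step by step (divisor n - 1)."""
    n = len(observations)
    mean = sum(observations) / n                     # step 1: mean
    deviations = [x - mean for x in observations]    # step 2: deviations
    sum_of_squares = sum(d ** 2 for d in deviations) # step 3: square and sum
    return sum_of_squares / (n - 1)                  # step 4: divide by n - 1

data = [4, 8, 6, 5, 7]           # hypothetical sample, mean = 6
s2 = sample_variance(data)       # squared deviations sum to 10, so 10 / 4
s = sqrt(s2)                     # standard deviation, in the original units

print(s2, s)
print(variance(data))            # library value for comparison
```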
Shortcut method for variance
s^2 = (1/(n-1)) x (sum of squared observations - (sum of all observations)^2 / number of observations)
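A minimal sketch of the shortcut, taking it to mean: the sum of the squared observations, minus the square of the sum divided by n, all divided by n - 1. The function name and data are hypothetical. The point of the shortcut is that the mean and the individual deviations never have to be computed.

```python
def shortcut_variance(observations):
    """Sample variance from the two running totals sum(x) and sum(x^2)."""
    n = len(observations)
    sum_x = sum(observations)
    sum_x_squared = sum(x * x for x in observations)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)

data = [4, 8, 6, 5, 7]   # hypothetical sample
# sum of squares = 190, (sum)^2 / n = 900 / 5 = 180, so (190 - 180) / 4
print(shortcut_variance(data))
```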
Standard deviation
Roughly the typical distance of an observation from the mean
Square root of the variance
Measure of consistency
Empirical rule for interpreting standard deviation
If histogram of observations is bell shaped (symmetrical and unimodal) then:
- approx 68% of all observations fall within one standard deviation of the mean
- approx 95% of all observations fall within two standard deviations of the mean
- approx 99.7% of all observations fall within three standard deviations of the mean
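The three percentages can be checked against an exactly bell-shaped (normal) distribution. This is a sketch using `statistics.NormalDist`; the helper name `proportion_within` is hypothetical, and real data that are only approximately bell shaped will match these figures only approximately.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```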
Chebysheff’s theorem
The proportion of observations in any sample or population that lie within k standard deviations of the mean is at least:
1 - (1/k^2) for k>1
Provides the lower bound of proportions in an interval
Can be used when the empirical rule does not apply (non bell shaped histograms)
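A small sketch of the bound, compared with a deliberately skewed data set. The function names and the data are hypothetical. The observed proportion should never fall below the bound, though it is often well above it.

```python
from statistics import mean, stdev

def chebysheff_lower_bound(k):
    """Minimum proportion within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def observed_proportion(observations, k):
    """Share of observations that actually fall within k standard deviations."""
    center, spread = mean(observations), stdev(observations)
    inside = [x for x in observations if abs(x - center) <= k * spread]
    return len(inside) / len(observations)

skewed = [1, 1, 1, 2, 2, 3, 4, 20]   # hypothetical, clearly not bell shaped

print(chebysheff_lower_bound(2))      # at least 75% within 2 standard deviations
print(observed_proportion(skewed, 2)) # actual share for this data set
```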
Coefficient of variation
The standard deviation of the observations divided by the mean
Indicates if standard deviation is large or small given the observation set
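A brief sketch of why the coefficient of variation is useful: two hypothetical samples with the same standard deviation but very different means. The function name is an assumption, and the sample standard deviation is used.

```python
from statistics import mean, stdev

def coefficient_of_variation(observations):
    """Sample standard deviation divided by the sample mean."""
    return stdev(observations) / mean(observations)

small_scale = [4, 8, 6, 5, 7]             # hypothetical, mean = 6
large_scale = [104, 108, 106, 105, 107]   # same spread, mean = 106

# Identical standard deviations, but the spread is large relative to a
# mean of 6 and small relative to a mean of 106.
print(stdev(small_scale), stdev(large_scale))
print(coefficient_of_variation(small_scale))
print(coefficient_of_variation(large_scale))
```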
Measures of relative standing
Provide information about the position of particular values relative to the entire data set.
Percentile
Quartiles
(Quintiles, deciles)
Interquartile range
Percentile
The Pth percentile is the value for which P% of the observations are less than that value and (100 - P)% are greater
Use to describe a single set of interval or ordinal data to communicate relative standing
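A sketch of locating quartiles and a percentile with `statistics.quantiles`. Several conventions exist for interpolating between observations; this uses the function's default "exclusive" method, which is an assumption and may give slightly different values than a textbook or Excel. The scores are hypothetical.

```python
from statistics import quantiles

# Hypothetical set of 11 exam scores (the function sorts the data itself).
scores = [55, 60, 62, 68, 70, 71, 75, 80, 84, 88, 95]

# n=4 returns the three cut points: 25th, 50th and 75th percentiles.
q1, q2, q3 = quantiles(scores, n=4)
interquartile_range = q3 - q1

# n=100 returns 99 cut points; index 89 is the 90th percentile.
p90 = quantiles(scores, n=100)[89]

print(q1, q2, q3, interquartile_range)
print(p90)
```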