Module 1: Measures of Central Tendency, Dispersion, and Relative Standing Flashcards
Measures of central tendency
Indicator of the CENTER of the data; describes the typical participant or value
Measures:
- MEAN: The arithmetic average (sum of the values divided by the number of values); Interval* or ratio* level data; If outliers are present, report the median and IQR instead
- MEDIAN: The value that cuts the data in half (50th percentile); Ordinal*, interval, or ratio level data (preferred over the mean for interval or ratio data when outliers are present); Always report with the IQR
- MODE: The most frequently occurring value; Nominal*, ordinal, interval, or ratio level data
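As an illustration of the three measures, here is a minimal Python sketch using the standard `statistics` module. The `scores` list is a hypothetical data set invented for this example. It also shows why the median is preferred when an outlier is present.

```python
import statistics

# Hypothetical exam scores (illustrative values, not from the flashcards).
scores = [70, 75, 75, 80, 85, 90, 95]

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequently occurring value

# Adding one extreme value moves the mean far more than the median.
with_outlier = scores + [300]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

print(mean_score, median_score, mode_score)
print(mean_with_outlier, median_with_outlier)
```

In this made-up sample the outlier raises the mean from about 81.4 to 108.75, while the median only moves from 80 to 82.5.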
Central tendency comparisons: Normal distribution
In a normal distribution, the mean, median, and mode are EQUAL
Symmetry: the two halves of the distribution, folded over in the middle, are identical
Kurtosis (peakedness relative to the normal distribution):
- Leptokurtic: very peaked
- Normal
- Platykurtic: flattened distribution
Central tendency: Skewed distribution
In a skewed distribution, the mean is pulled “off center” in the direction of the skew
Positive skew (“left hill”, long tail to the right): Mean > Median > Mode
Negative skew (“right hill”, long tail to the left): Mean < Median < Mode
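The ordering of the three measures under skew can be checked on small data sets. The two lists below are hypothetical samples constructed for illustration (the second is a mirror image of the first). Real skewed data will usually, but not always, follow the same ordering.

```python
import statistics

# Hypothetical positively skewed sample: most values are low, long tail to the right.
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]
# Mirror image of the sample above: most values are high, long tail to the left.
negative_skew = sorted(20 - x for x in positive_skew)

def center(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.mode(values))

pos_mean, pos_median, pos_mode = center(positive_skew)
neg_mean, neg_median, neg_mode = center(negative_skew)

print(pos_mean, pos_median, pos_mode)  # mean > median > mode
print(neg_mean, neg_median, neg_mode)  # mean < median < mode
```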
Measures of variability/dispersion
The spread of the data in a distribution; two distributions with the same mean could have different dispersions
Reported using 4 measures:
- Range: the difference between the highest (maximum) and lowest (minimum) value in the distribution
- Interquartile range (IQR)
- Standard deviation (SD)
- Variance
Variability:
- HETEROGENEOUS distribution = High variability
- HOMOGENEOUS distribution = Low variability
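To illustrate that two distributions with the same mean can have very different dispersions, here is a small sketch. Both lists are hypothetical values chosen so that the means match.

```python
import statistics

# Two hypothetical samples with the same mean but different spread.
homogeneous = [48, 49, 50, 51, 52]    # low variability
heterogeneous = [10, 30, 50, 70, 90]  # high variability

def value_range(values):
    """Range = highest (maximum) value minus lowest (minimum) value."""
    return max(values) - min(values)

mean_homogeneous = statistics.mean(homogeneous)
mean_heterogeneous = statistics.mean(heterogeneous)
range_homogeneous = value_range(homogeneous)
range_heterogeneous = value_range(heterogeneous)

print(mean_homogeneous, range_homogeneous)      # 50 4
print(mean_heterogeneous, range_heterogeneous)  # 50 80
```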
Interquartile range (IQR)
Always reported with median
Based on quartiles:
- Lower quartile (Q1): Point below which 25% of scores lie
- Upper quartile (Q3): Point below which 75% of scores lie
IQR = Q3 - Q1
e.g. Weights (lbs.):
110 120 130 | 130 140 150 | 150 165 170 | 170 180 190
Q1 = 130, Q3 = 170, therefore IQR = 170 - 130 = 40
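The weights example can be reproduced with a short sketch. It assumes the "median of each half" convention for quartiles, which matches the grouping shown above. Other quartile conventions exist and can give slightly different values on other data sets. The helper name `quartiles_by_halves` is made up for this example.

```python
import statistics

weights = [110, 120, 130, 130, 140, 150, 150, 165, 170, 170, 180, 190]

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two values. For an odd number of values the middle
    value is left out of both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return statistics.median(lower_half), statistics.median(upper_half)

q1, q3 = quartiles_by_halves(weights)
iqr = q3 - q1
print(q1, q3, iqr)  # 130.0 170.0 40.0
```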
Standard deviation
An index that conveys how much, on average, scores in a distribution vary from the mean (not to be confused with the standard error, SD/√n, which conveys the degree of error of the sample mean)
Based on deviation scores (shows how far away from the mean—either above or below—a value is situated); calculated by subtracting the mean from each individual score (x’ = x - x̄)
Computing SD (never negative; 0 only when all scores are identical):
- Sample: s = Sqrt[∑(x - x̄)^2 / (n - 1)]
- Population: σ = Sqrt[∑(x - μ)^2 / n]
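The two SD formulas can be written out step by step, as in the sketch below. The `data` list is a hypothetical sample, and the function names are made up for this example. The standard library's `statistics.stdev` (sample) and `statistics.pstdev` (population) are used only as a cross-check.

```python
import math
import statistics

# Hypothetical scores (illustrative values only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

def sum_of_squared_deviations(values):
    """Sum of (x - mean)^2 over all values."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

def sample_sd(values):
    """s = sqrt( sum of squared deviations / (n - 1) )."""
    return math.sqrt(sum_of_squared_deviations(values) / (len(values) - 1))

def population_sd(values):
    """sigma = sqrt( sum of squared deviations / n )."""
    return math.sqrt(sum_of_squared_deviations(values) / len(values))

s = sample_sd(data)
sigma = population_sd(data)
print(s, sigma)

# Cross-check against the standard library implementations.
assert math.isclose(s, statistics.stdev(data))
assert math.isclose(sigma, statistics.pstdev(data))
```

Dividing by n - 1 rather than n makes the sample SD slightly larger than the population SD computed from the same numbers.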
Advantages:
- Takes all data into account in describing variability
- More stable measure of variability than the range or IQR
- Helpful in interpreting individual scores when data are distributed approximately normally
Disadvantages:
- Can be influenced by extreme scores/outliers
- Not as “intuitive” or as easy to interpret as the range
Variance
An important variability concept in inferential statistics, but rarely reported as a descriptive statistic
Not easily interpreted because it is not in units of original data; it is in units squared
Variance = SD^2
Computing variance:
s^2 (sample) = ∑(x - x̄)^2 / (n - 1)
σ^2 (population) = ∑(x - μ)^2/n
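A brief sketch of the relationship Variance = SD^2, using the standard library's `statistics.variance` (sample, divides by n - 1) and `statistics.pvariance` (population, divides by n). The `data` list is a hypothetical sample.

```python
import math
import statistics

# Hypothetical scores (illustrative values only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_variance = statistics.variance(data)       # sum of squared deviations / (n - 1)
population_variance = statistics.pvariance(data)  # sum of squared deviations / n

sample_sd = statistics.stdev(data)
population_sd = statistics.pstdev(data)

# Variance is the SD squared, so it is expressed in squared units of the data.
assert math.isclose(sample_variance, sample_sd ** 2)
assert math.isclose(population_variance, population_sd ** 2)

print(sample_variance, population_variance)
```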
Normal distribution
Bell-shaped symmetric curve; the mean, median, and mode have the same value
The area under the curve = 1
- 68% of the values lie within 1 SD of the mean
- 95% within 2 SD of the mean
- 99.7% within 3 SD of the mean
In a STANDARD normal distribution, the mean always equals 0 and the SD always equals 1
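The percentages quoted for the normal curve are rounded figures. A minimal sketch, assuming Python 3.8+ (where `statistics.NormalDist` is available), computes the share of values within k SD of the mean from the standard normal curve. The helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of the area under the curve within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1 = share_within(1)
within_2 = share_within(2)
within_3 = share_within(3)

for k, share in ((1, within_1), (2, within_2), (3, within_3)):
    print(f"within {k} SD: {share:.2%}")
```

The results are roughly 68.3%, 95.4%, and 99.7%.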
Relative standing
Measures that can be used to compare values from different data sets, or to compare values within the same data set
While central tendency and variability indexes describe a distribution, there are descriptive statistics that tell us the relative standing or position of a score in a distribution
Two types:
- Standard score
- Percentile rank
Standard score
An index of relative standing of raw scores/values; each value is standardized using mean and SD of the distribution
Z-score = (x - x̄) / SD
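A short sketch of how a z-score lets values from different data sets be compared. The two exams and their means and SDs are hypothetical numbers invented for this example.

```python
def z_score(x, mean, sd):
    """Standard score: how many SDs the value x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical exam A: score 85, class mean 70, SD 10.
z_exam_a = z_score(85, mean=70, sd=10)
# Hypothetical exam B: score 60, class mean 50, SD 5.
z_exam_b = z_score(60, mean=50, sd=5)

print(z_exam_a, z_exam_b)  # 1.5 2.0
```

Although the raw score on exam B is lower, its relative standing is higher: 2 SD above the mean versus 1.5 SD on exam A.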
Properties of the Z-score standard normal distribution:
- Symmetrical
- Mean = 0 and SD = 1
- Mean, median, and mode are equal
- Probability ranges from 0 to 1
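These properties can be illustrated with a small sketch: standardizing every value in a data set gives z-scores with mean 0 and SD 1, and, if the data are assumed to be normally distributed, the standard normal curve converts a z-score into a probability between 0 and 1 (a percentile rank). The `data` list is hypothetical, the population SD is used for standardizing, and `statistics.NormalDist` requires Python 3.8+.

```python
import statistics
from statistics import NormalDist

# Hypothetical scores (illustrative values only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
sd = statistics.pstdev(data)  # population SD, an assumption made for this example

# Standardize each value: z = (x - mean) / SD.
z_scores = [(x - mean) / sd for x in data]

z_mean = statistics.mean(z_scores)
z_sd = statistics.pstdev(z_scores)

# Under the normality assumption, the cumulative probability below a z-score
# is its percentile rank expressed as a proportion between 0 and 1.
proportion_below_mean = NormalDist().cdf(0)
proportion_below_z1 = NormalDist().cdf(1)

print(z_mean, z_sd)
print(proportion_below_mean, round(proportion_below_z1, 4))
```

A z-score of 0 sits at the 50th percentile, and a z-score of 1 at roughly the 84th.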