Chapter 2 Flashcards
Definition of a Measure of Location
A measure of location describes the position of a data value in a data set.
Examples include the mean, median, mode, percentiles, and deciles.
Mean (x̄) Formula
The mean is the sum of all data values divided by the number of data values.
Formula for the Mean:
x̄ = (∑x) / n
If data is in a frequency table:
x̄ = (∑fx) / (∑f)
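The two mean formulas above can be sketched in Python as follows. This is a minimal illustration, not a prescribed method: the function names and the shoe-size data are made up for the example, and the frequency table is assumed to be a list of (value, frequency) pairs.

```python
def mean_raw(values):
    """Mean of raw data: sum of x divided by n."""
    return sum(values) / len(values)


def mean_from_frequency_table(table):
    """Mean from (value, frequency) pairs: sum of f*x divided by sum of f."""
    total_fx = sum(x * f for x, f in table)
    total_f = sum(f for _, f in table)
    return total_fx / total_f


# Hypothetical data: the same shoe sizes listed one by one, then as a table.
sizes = [4, 5, 5, 6, 6, 6, 7]
size_table = [(4, 1), (5, 2), (6, 3), (7, 1)]

print(mean_raw(sizes))                        # 39 / 7
print(mean_from_frequency_table(size_table))  # same value, from the table
```

Both calls work from the same total (∑x = ∑fx = 39) and the same count (n = ∑f = 7), so they agree.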
Median
The middle value when data is ordered.
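For discrete data, the median is the (n + 1)/2th value. If (n + 1)/2 is not a whole number, take the midpoint of the two values on either side.
For grouped data, estimate the median using interpolation.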
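The discrete-data rule can be sketched in Python. The function name and data are hypothetical, and the sketch assumes that when (n + 1)/2 falls halfway between two positions, the median is the midpoint of the two neighbouring values.

```python
def median_discrete(values):
    """Median of discrete data using the (n + 1)/2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position in the ordered list

    if position.is_integer():
        return ordered[int(position) - 1]

    # Assumption: a position such as 3.5 means halfway between
    # the 3rd and 4th values.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


print(median_discrete([7, 1, 3, 9, 5]))  # odd n: the 3rd ordered value
print(median_discrete([1, 3, 5, 7]))     # even n: midpoint of 2nd and 3rd
```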
Mode/Modal Class
The mode is the most frequently occurring value.
If data is grouped, the modal class is the class interval with the highest frequency.
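A short Python sketch of both ideas, using made-up data and hypothetical function names. It assumes grouped data is stored as ((lower, upper), frequency) pairs with classes of equal width, and it returns every mode when several values tie.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


def modal_class(grouped):
    """Return the class interval with the highest frequency.

    grouped: list of ((lower, upper), frequency) pairs.
    """
    interval, _ = max(grouped, key=lambda row: row[1])
    return interval


print(modes(["red", "blue", "red", "green"]))
print(modes([1, 1, 2, 2, 3]))  # two modes (bimodal)
print(modal_class([((0, 10), 3), ((10, 20), 8), ((20, 30), 5)]))
```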
Choosing the Best Measure
Mean: Uses every data value, so it is affected by outliers.
Median: Good for skewed data, because it is not affected by outliers.
Mode: Works for categorical data, but there may be no mode or more than one.
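The effect of an outlier on the mean and the median can be seen in a small Python sketch. The salary figures are invented for illustration; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical salaries in thousands; the second list swaps in one extreme value.
salaries = [21, 22, 23, 24, 25]
salaries_with_outlier = [21, 22, 23, 24, 250]

print(mean(salaries), median(salaries))
print(mean(salaries_with_outlier), median(salaries_with_outlier))
```

The single extreme value pulls the mean from 23 up to 68, while the median stays at 23.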
Percentiles and Quartiles
Percentiles split data into 100 equal parts.
Quartiles split data into 4 equal parts:
Lower quartile (Q₁): the (n/4)th value
Median (Q₂): the (n/2)th value
Upper quartile (Q₃): the (3n/4)th value
Finding Quartiles for Discrete Data
For Q₁, work out n/4 (for Q₃, use 3n/4 instead).
If the result is a whole number, take the midpoint of this data point and the next.
If the result is not a whole number, round up and take that data point.
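The rule above can be sketched in Python. The function name and data are hypothetical, and the sketch assumes the same rule is applied to 2n/4 for the median and 3n/4 for the upper quartile.

```python
import math


def quartile_discrete(values, q):
    """Quartile q (1, 2 or 3) of discrete data using the q*n/4 rule."""
    if q not in (1, 2, 3):
        raise ValueError("q must be 1, 2 or 3")

    ordered = sorted(values)
    position = q * len(ordered) / 4  # 1-based position

    if position.is_integer():
        # Whole number: midpoint of this data point and the next.
        k = int(position)
        return (ordered[k - 1] + ordered[k]) / 2

    # Not a whole number: round up and take that data point.
    return ordered[math.ceil(position) - 1]


eight_values = [2, 4, 5, 7, 8, 10, 11, 13]  # n/4 = 2, a whole number
seven_values = [2, 4, 5, 7, 8, 10, 11]      # n/4 = 1.75, round up to 2

print([quartile_discrete(eight_values, q) for q in (1, 2, 3)])
print([quartile_discrete(seven_values, q) for q in (1, 2, 3)])
```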
Finding Quartiles for Grouped Data
Use interpolation assuming data is evenly distributed within classes.
Interpolation formula
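Estimated value = L + (( (m × n / k) - F ) / f ) × c
Where:
L = lower boundary of the class containing the percentile or quartile
n = total frequency (the final cumulative frequency)
k = the number of equal parts the data is split into (k = 2 for the median, k = 4 for quartiles, k = 100 for percentiles)
m = which of those values is wanted (e.g., m = 1 for Q₁, m = 3 for Q₃, m = 90 for the 90th percentile), so m × n / k is the position of the value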
F = cumulative frequency before the class
f = frequency of the class
c = class width (upper boundary - lower boundary)
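The interpolation formula can be sketched in Python as below. This is an illustration under stated assumptions: the function name and the grouped data are made up, classes are given as (lower boundary, upper boundary, frequency) in ascending order, and the target position is written as fraction × n (0.25 for Q₁, 0.5 for the median, 0.75 for Q₃, 0.9 for the 90th percentile).

```python
def interpolate_grouped(classes, fraction):
    """Estimate the value below which `fraction` of the data lies.

    classes: list of (lower_boundary, upper_boundary, frequency), ascending.
    """
    n = sum(freq for _, _, freq in classes)
    target = fraction * n  # position of the required value
    cumulative = 0         # F: cumulative frequency before the class

    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            width = upper - lower  # c: class width
            return lower + (target - cumulative) / freq * width
        cumulative += freq

    raise ValueError("fraction must be between 0 and 1")


# Hypothetical grouped data with total frequency 40.
classes = [(0, 10, 5), (10, 20, 10), (20, 30, 15), (30, 40, 10)]

print(interpolate_grouped(classes, 0.25))  # Q1: position 10, in class 10-20
print(interpolate_grouped(classes, 0.5))   # median: position 20, in class 20-30
print(interpolate_grouped(classes, 0.75))  # Q3: position 30, in class 20-30
```

For Q₁ here, L = 10, F = 5, f = 10 and c = 10, so the estimate is 10 + ((10 - 5) / 10) × 10 = 15.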
Interquartile Range (IQR)
Interquartile Range (IQR) Formula:
IQR = Q₃ - Q₁
- Measures the spread of the middle 50% of the data.
- Not affected by extreme values (outliers).
- Helps compare variability between data sets.
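A small Python sketch comparing the IQR with the range when one extreme value is present. The data and helper names are hypothetical, and the quartiles are found with the discrete-data rule from the earlier card (midpoint if the position is a whole number, otherwise round up), applied to n/4 and 3n/4.

```python
import math


def quartile(ordered, q):
    """Quartile q of already-sorted discrete data using the q*n/4 rule."""
    position = q * len(ordered) / 4
    if position.is_integer():
        k = int(position)
        return (ordered[k - 1] + ordered[k]) / 2
    return ordered[math.ceil(position) - 1]


def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    ordered = sorted(values)
    return quartile(ordered, 3) - quartile(ordered, 1)


def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


typical = [12, 14, 15, 15, 16, 18, 19, 21]
with_outlier = [12, 14, 15, 15, 16, 18, 19, 95]  # 21 replaced by an extreme value

print(iqr(typical), value_range(typical))
print(iqr(with_outlier), value_range(with_outlier))
```

The range jumps from 9 to 83 when the extreme value is introduced, while the IQR stays at 4.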
Interpercentile Range
The difference between two given percentiles (e.g., 10th to 90th).
Often preferred over the range since it excludes extremes.
Variance (σ²)
Variance Formula (Raw Data):
σ² = (∑x² / n) - (∑x / n)²
Standard Deviation (Raw Data):
σ = √σ²
Variance (Grouped Data):
σ² = (∑fx² / ∑f) - (∑fx / ∑f)²
Standard Deviation (Grouped Data):
σ = √σ²
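The variance and standard deviation formulas can be sketched in Python. Function names and data are hypothetical; for grouped data, x in each (x, f) pair is assumed to be the value or the class midpoint.

```python
import math


def variance_raw(values):
    """Variance of raw data: (sum of x^2) / n minus (mean)^2."""
    n = len(values)
    return sum(x * x for x in values) / n - (sum(values) / n) ** 2


def variance_grouped(table):
    """Variance from (x, f) pairs: (sum of f*x^2) / (sum of f) minus (mean)^2."""
    total_f = sum(f for _, f in table)
    total_fx = sum(f * x for x, f in table)
    total_fx2 = sum(f * x * x for x, f in table)
    return total_fx2 / total_f - (total_fx / total_f) ** 2


# Hypothetical data: the same eight values as a list and as a frequency table.
data = [2, 4, 4, 4, 5, 5, 7, 9]
table = [(2, 1), (4, 3), (5, 2), (7, 1), (9, 1)]

print(variance_raw(data), math.sqrt(variance_raw(data)))
print(variance_grouped(table), math.sqrt(variance_grouped(table)))
```

Here ∑x² = 232 and the mean is 5, so σ² = 232/8 - 5² = 4 and σ = 2 from either form.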
Coding for Simplified Calculations
Coding transforms data for easier calculations:
y = (x - a) / b
New mean:
ȳ = (x̄ - a) / b
New standard deviation:
σᵧ = σₓ / b
Where:
a and b are constants, with b positive.
Subtracting a shifts the mean but does not change the standard deviation. Dividing by b scales both the mean and the standard deviation.
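A Python sketch that checks the coding results numerically. The data and the constants a = 1000 and b = 10 are chosen only for illustration, and the helper names are hypothetical.

```python
import math


def mean(values):
    """Mean: sum of x divided by n."""
    return sum(values) / len(values)


def std_dev(values):
    """Standard deviation: square root of (sum of x^2) / n minus (mean)^2."""
    n = len(values)
    return math.sqrt(sum(v * v for v in values) / n - mean(values) ** 2)


a, b = 1000, 10                    # hypothetical coding constants
x = [1010, 1020, 1030, 1040, 1050]
y = [(v - a) / b for v in x]       # coded data: 1.0, 2.0, 3.0, 4.0, 5.0

# Decode: the mean needs both a and b, the standard deviation only b.
print(mean(x), a + b * mean(y))
print(std_dev(x), b * std_dev(y))
```

The coded values are much smaller to work with, and rearranging the card's formulas recovers the originals: x̄ = a + bȳ and σₓ = bσᵧ.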