Ch. 5: Graphs and Stats Flashcards
Central Tendency
Used to summarize data with 1 number
1) Mean
2) Median
3) Mode
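As a quick illustration of the three measures listed above, here is a minimal sketch using Python's standard `statistics` module. The `scores` list is made-up data, not from the chapter.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
scores = [2, 3, 3, 5, 7]

mean = statistics.mean(scores)      # sum / count = 20 / 5
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

print(mean, median, mode)
```

Each measure reduces the whole list to one number, but they can disagree: here the mean is 4 while the median and mode are both 3.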
Measure of variability
Summarizes how spread out the data is
1) IQR (Interquartile Range Q3-Q1)
2) Range (Max-Min)
3) Sum of squared deviations: sum of (x-mean) squared
4) Sample variance: (sum of squared deviations)/(total # observations - 1)
Percentile
The position relative (compared to) others
Provides info about values relative to the entire data set
Suppose you scored in the 60th percentile on the GMAT. That means 60% of the other scores were below yours, while 40% of scores were above yours.
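The GMAT example can be sketched in code. This assumes the simple definition used on the card (the percentage of scores strictly below yours); other textbooks and software use slightly different conventions. The function name `percentile_rank` and the score list are hypothetical.

```python
def percentile_rank(scores, value):
    """Percentage of scores strictly below `value` (one common convention)."""
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)

# Hypothetical set of ten test scores.
scores = [400, 450, 500, 550, 600, 650, 700, 720, 740, 760]

# Six of the ten scores are below 700, so 700 sits at the 60th percentile.
print(percentile_rank(scores, 700))
```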
Box Plots
Min, 25th percentile (Q1), Median (50th percentile, Q2), 75th percentile (Q3), Max
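The five values a box plot draws can be computed directly. This is a sketch with made-up data; it relies on `statistics.quantiles` with its default "exclusive" method, and other quartile conventions can give slightly different Q1 and Q3.

```python
import statistics

# Hypothetical data set of 11 observations.
data = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)

five = (min(data), q1, q2, q3, max(data))
print(five)
```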
Interquartile Range
=Q3-Q1
Insensitive to outliers
Measures spread of middle 50%
Values that are far apart increase variability
Range
=largest observation-smallest observation (Max-Min)
The simplest measure of variability
Sensitive to outliers
Better to use all the data and not just two points
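To see why the IQR is called insensitive to outliers while the range is sensitive, the sketch below computes both before and after swapping in one extreme value. The data and the helper names `iqr` and `data_range` are illustrative assumptions; quartiles use the default method of `statistics.quantiles`.

```python
import statistics

def iqr(data):
    """Interquartile range: Q3 - Q1."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

def data_range(data):
    """Range: largest observation minus smallest observation."""
    return max(data) - min(data)

clean = list(range(1, 12))          # 1, 2, ..., 11
with_outlier = clean[:-1] + [100]   # replace 11 with an extreme value

print(iqr(clean), data_range(clean))                # IQR and range, no outlier
print(iqr(with_outlier), data_range(with_outlier))  # same IQR, much larger range
```

The single outlier changes the range from 10 to 99 but leaves the IQR at 6, because Q1 and Q3 depend only on the middle of the data.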
Sum of squared deviations
=sum of (x-mean) squared
Deviation=how far it is from the mean
Shows how much the data moves around versus staying in one place
Sample variance
=(sum of (x-mean) squared)/(total # observations - 1)
(Sum of squared deviations)/(total # observations - 1)
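The two cards above can be worked through step by step. This is a minimal sketch on made-up data, with the result cross-checked against `statistics.variance`, which also divides by n - 1.

```python
import statistics

# Hypothetical data set.
data = [2, 4, 4, 6, 9]

n = len(data)
mean = sum(data) / n                              # 25 / 5 = 5.0

# Deviation = how far each value is from the mean.
deviations = [x - mean for x in data]             # -3, -1, -1, 1, 4

# Sum of squared deviations.
sum_sq_dev = sum(d ** 2 for d in deviations)      # 9 + 1 + 1 + 1 + 16 = 28.0

# Sample variance divides by n - 1, not n.
sample_variance = sum_sq_dev / (n - 1)            # 28 / 4 = 7.0

print(sum_sq_dev, sample_variance)
```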
Standard deviation
=square root of variance (radical s squared)
Shows how tightly clustered the data is
Used in: 1) Empirical rule 2) Coefficient of variation 3) Z-score
Empirical rule
Only for bell shaped histograms
68% fall within 1 SD
lower=mean-SD
upper=mean+SD
95% fall within 2 SD
lower=mean-2SD
upper=mean+2SD
99.7% fall within 3 SD
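The lower and upper bounds for each band follow the same pattern, mean minus or plus k SDs. Here is a short sketch; the function name `empirical_bounds` and the example mean of 100 and SD of 15 are assumptions for illustration, and the percentages only apply when the histogram is roughly bell shaped.

```python
def empirical_bounds(mean, sd):
    """Return (lower, upper) bounds for the 68%, 95%, and 99.7% bands."""
    labels = {1: "68%", 2: "95%", 3: "99.7%"}
    return {labels[k]: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

# Hypothetical bell-shaped data with mean 100 and SD 15.
bounds = empirical_bounds(100, 15)
for label, (lower, upper) in bounds.items():
    print(label, lower, upper)
```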
Coefficient of variation (c of v)
=SD/mean
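A minimal sketch of the coefficient of variation on made-up data, using the sample standard deviation from Python's `statistics` module:

```python
import statistics

# Hypothetical data set.
data = [2, 4, 4, 6, 9]

sd = statistics.stdev(data)    # sample SD: square root of the sample variance
mean = statistics.mean(data)

cv = sd / mean                 # spread expressed relative to the mean
print(round(cv, 3))
```

Because it divides by the mean, the result has no units, which makes it possible to compare the spread of data sets measured on different scales.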
Z score
= (x-mean)/SD
Shows how many SDs a value is from the mean
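The z-score formula translates directly into code. The function name `z_score` and the example numbers (mean 100, SD 15) are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# A value of 130 is two SDs above a mean of 100 when the SD is 15.
print(z_score(130, 100, 15))

# A negative z-score means the value is below the mean.
print(z_score(85, 100, 15))
```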