lecture 9 Flashcards
describe z score heights example
xbar = 72inch s = 33.5inch
for x = 80
z score = 80-72/3.5 ~ 2.29
so height is 2.29 s.ds above avg
for x = 68.5
z score = 68.5-72/3.5 = -1
= height is one standard dev below avg
actual scale of obs does not matter
Unit less - always on same scale
how do empirical rule and z score relate to each other
if have roughly symmetric mound shape distribution =
~68% of z scores will lie between -1 and 1
~95% of z scores will lie between -2 and 2
~99.7% of z scores will lie between -3 and 3
expect repulses if empirical rule holds, should be in these ranges if symmetric
what are box plots
graphical tool for describing quantitative data and is useful for assessing centre, spread, skewedness and outliers
describe boxplots - sample quartiles
generalization of median
1st sample quartile = number ql, such that 25% of obs are =< ql — LOWER quartile
2nd sample quartile = median m, such that 50% of obs are =< m
3rd sample quartile = number qu, such that 75% of obs are =< qu — UPPER quartile
describe boxplots - sample quantiles
in general for 0=<p=<1, the pth sample quantile is the number qp such that (100xp)% of obs are =<qp
= ql=q0.25
m=q0.5
qu=q0.75
describe boxplots - sample interquartile range gen
IQR = distance between lower and upper quartiles
IQR=qu-ql
better than must range since that can be affected by outliers
more informative summary of spread
less sensitive to outliers than variance or s.d = more robus
describe basic components of boxplots
vary slightly between softwares but basic components =
median line
box (lower to upper quartile = IQR) or lower and upper hinge (of box)
fences and whiskers
name the 2 forms of fences of boxplots
inner fences
outer fences
how to calculate inner fences
upper inner fence = qu+1.5 x IQR
lower inner fence = upper inner fence = ql-1.5 x IQR
basically upper quartile and up a bit and lower quartile and down a bit
how to calculate outer fences
upper inner fence = qu+3 x IQR
lower inner fence = upper inner fence = ql-3x IQR
just goes out further
does not appear on plot
describe boxplot construction
whiskers = placed at points less extreme than inner fences
extend to most extreme data point which is more than 1/5 times length of box away from box
line that extends from box to data point = last data point before hitting inner fence
how to deal with outliers on boxplot
outliers that fall outside of whiskers plotted
marked by standard point symbols
describe boxplots - graphs
capture where high proportion of data lie
can plot vertically or horizontally
show outliers
show inner fences
hinges
median
Whiskers
Summarizes using quantiles (lower and upper quartiles, median)
good for skewed data
how to interpret boxplots
examine length of box to asses variability in sample = IQR is measure of variability
Identification of outliers is straightforward
Especially useful for comparing 2 samples - features of samples
how do boxplots show skewed data
median halfway through upper and lower quartile = symmetry
if not in middle = skewed
if one whiskers extends further than other