Unit #1 Review Flashcards
What is standard deviation
Roughly the typical distance of the data values from the mean (the square root of the variance)
What is the empirical rule
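In a normal distribution, about 68% of the data falls within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3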
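The percentages on this card can be checked numerically. This is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The three printed shares come out close to 68%, 95%, and 99.7%.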
What is variance
Average squared distance to the mean
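A short sketch of how variance and standard deviation relate, using made-up data. It assumes the population version (divide by n); the sample version used in most class work divides by n - 1 instead.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration

mean = statistics.fmean(data)

# Variance: average squared distance to the mean (population version, divide by n)
variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance, back in the original units
std_dev = variance ** 0.5

print(mean, variance, std_dev)
```

For the sample versions, `statistics.variance(data)` and `statistics.stdev(data)` divide by n - 1.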
What is a z score
The number of standard deviations away from the mean
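As a quick illustration, a z score can be computed by subtracting the mean and dividing by the standard deviation. The function name and the numbers below are made up for this sketch.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

# Hypothetical test with mean 70 and standard deviation 10
z = z_score(85, 70, 10)
print(z)  # 1.5, i.e. 1.5 standard deviations above the mean
```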
What measures of center do we have
Mean, Median, Mode
What measures of spread do we have
Standard deviation, variance, range, IQR
What center and spread for unimodal/symmetric data
Mean and standard deviation
What center and spread for skewed data or outliers
Median and IQR
How do you describe distributions (Histograms)
Shape, center, spread, strange
How do you describe the shape of a histogram
Modes and symmetry
What is visual display for quantitative data
Histogram, box-and-whisker plot, dot plot, time plot, stem-and-leaf plot, ogive, normal probability plot
What is visual display for categorical data
Segmented bar, pie, bar, mosaic
What is the IQR
Interquartile range, the width of the middle 50% of the data, Q3-Q1
What are the outliers
Any value more than 1.5 IQR below Q1 or more than 1.5 IQR above Q3
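The 1.5 IQR rule can be sketched in a few lines. The data are made up, and the quartiles come from `statistics.quantiles` with its default method; different tools use slightly different quartile conventions, so results on other data may differ a little from a calculator.

```python
import statistics

data = [1, 3, 4, 5, 6, 7, 8, 9, 30]  # made-up values; 30 looks suspicious

q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Fences: anything outside these is flagged as an outlier
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(q1, q3, iqr, lower_fence, upper_fence, outliers)
```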
What percent of the data is contained in the IQR
50%
Norm CDF inputs
LO, HI, mu, sigma = percent (area between LO and HI)
Inv Norm inputs
Percentile (area to the left, as a decimal), mu, sigma = value
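For checking calculator answers, the two commands can be imitated with Python's standard-library `statistics.NormalDist`. This is only a sketch: the function names `normal_cdf` and `inv_norm` are made up here and are not the calculator's own.

```python
from statistics import NormalDist

def normal_cdf(lo, hi, mu=0.0, sigma=1.0):
    """Area under the normal curve between lo and hi."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(hi) - dist.cdf(lo)

def inv_norm(area_left, mu=0.0, sigma=1.0):
    """Value that has the given area (a decimal between 0 and 1) to its left."""
    return NormalDist(mu, sigma).inv_cdf(area_left)

# Hypothetical scores with mean 100 and standard deviation 15
print(normal_cdf(85, 115, 100, 15))  # share within 1 SD of the mean
print(inv_norm(0.90, 100, 15))       # the 90th percentile score
```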
What is bivariate data
Two variables, when you measure two variables from each subject
How can you think about independence and association
They are opposites: independent means no relationship; associated means there is a relationship
Example of independent
Categorical: Gender and pizza preference
Quantitative: Height and IQ
Example of associated
Categorical: Gender and video game playing
Quantitative: study time and test score
With categorical bivariate, what would independence look like
Similar/same percent distribution across the groups
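A small sketch of what "same percent distribution" means, using an invented two-way table. The group names, categories, and counts are all made up for illustration.

```python
# Invented counts: rows are groups, columns are pizza preferences
counts = {
    "group A": {"pepperoni": 30, "cheese": 20},
    "group B": {"pepperoni": 60, "cheese": 40},
}

def row_percents(row):
    """Convert one group's counts to percents of that group's total."""
    total = sum(row.values())
    return {category: 100 * count / total for category, count in row.items()}

for group, row in counts.items():
    print(group, row_percents(row))
```

Both groups split 60% / 40%, so in this invented table knowing the group tells you nothing about the preference, which is what independence looks like.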
With quantitative bivariate, what would independence look like
The regression slope would be about zero (no trend in the scatterplot)
How do you describe scatterplots
Direction, form, strength, strange
What does the r value tell you
Direction and strength of the linear relationship between the two variables
What does R^2 tell you
The percent of variability in Y explained by the model with X
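To see r and R^2 side by side, here is a minimal sketch on made-up data. It computes r from the sums of squares, fits the least-squares line, and compares R^2 computed as "1 minus leftover variability over total variability" with r squared.

```python
x = [1, 2, 3, 4, 5]  # made-up explanatory values
y = [2, 4, 5, 4, 5]  # made-up response values
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)  # total variability in y

r = sxy / (sxx * syy) ** 0.5  # sign gives direction, size gives strength

# Least-squares line
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Variability left over after using the line
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / syy

print(round(r, 3), round(r_squared, 3), round(r ** 2, 3))
```

For a least-squares line with one explanatory variable, the last two printed numbers agree.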
What is Sy/Sx
Multiplied by r, this gives the slope of the regression line: slope = r(Sy/Sx)
What point is on every regression line
(X-bar, Y-bar), the point of means
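The two cards above fit together: the slope is r times Sy/Sx, and choosing the intercept so the line passes through (X-bar, Y-bar) finishes the model. A sketch on made-up data, with r computed as the average product of z scores (dividing by n - 1):

```python
import statistics

x = [1, 2, 3, 4, 5]  # made-up explanatory values
y = [2, 4, 5, 4, 5]  # made-up response values
n = len(x)

x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)  # sample standard deviations

# Correlation: sum of z-score products divided by n - 1
r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y) for xi, yi in zip(x, y)) / (n - 1)

slope = r * s_y / s_x
intercept = y_bar - slope * x_bar  # forces the line through (x_bar, y_bar)

predicted_at_x_bar = intercept + slope * x_bar
print(slope, intercept, predicted_at_x_bar, y_bar)
```

Plugging X-bar into the line returns Y-bar (up to rounding), which is why that point is on every regression line.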
Slope in context
The model predicts that for each additional unit of X, Y increases/decreases by (slope) units on average
Y intercept in context
When there is no X stuff (X = 0), the model predicts this much Y stuff
What is a residual
Vertical distance from a point to the regression line. It is how far off the true value is from the model's prediction (actual minus predicted)
What do you want the residual plot to look like
Random, no pattern
What does standard deviation of residuals tell you
It is the average, or typical, residual. It is about how far off you expect the predictions to be
How do you find outliers in regression
They don’t follow the FLOW
What is the average of all the residuals
ZERO
What does S tell us in a regression output
The typical distance from the model. How far off the model is on average. Expected amount the model will be off by
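The residual cards above can be tied together in one sketch on made-up data: compute each residual as actual minus predicted, check that the residuals average to zero, and compute S. This assumes the usual regression-output definition of S, which divides the sum of squared residuals by n - 2 before taking the square root.

```python
x = [1, 2, 3, 4, 5]  # made-up explanatory values
y = [2, 4, 5, 4, 5]  # made-up response values
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares line
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

# Residual = actual y minus predicted y
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / n

# S: typical size of a residual (n - 2 in the denominator)
s = (sum(e ** 2 for e in residuals) / (n - 2)) ** 0.5

print([round(e, 2) for e in residuals], round(mean_residual, 10), round(s, 3))
```

Plotting `residuals` against `x` gives the residual plot; a good linear fit shows no pattern there.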
Difference between scatter plot and residual plot on the calculator
Scatter plot: L1 vs. L2
Residual plot: L1 vs. RESID