3 Flashcards
Linear regression model
uses an explanatory variable to predict the response variable
Explanatory variable
x
Response variable
y
Predicted response variable
ŷ (y with a hat on top)
Predicted response variable equation
ŷ = a + bx
Constant
a
Slope
b
Extrapolation
occurs when a linear model is used to predict a response value for a value of the explanatory variable that lies beyond the interval of x-values used to determine the regression line
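A minimal sketch of how an extrapolation check might look in code. The helper name `predict`, the coefficients, and the x-range are all hypothetical values chosen for illustration.

```python
def predict(a, b, x_new, x_min, x_max):
    """Return (y_hat, is_extrapolation) for the line y_hat = a + b*x.

    x_min and x_max are the smallest and largest x-values that were
    used to fit the line.
    """
    y_hat = a + b * x_new
    is_extrapolation = not (x_min <= x_new <= x_max)
    return y_hat, is_extrapolation


# Hypothetical line fitted on x-values between 1 and 5
a, b = 2.2, 0.6

inside = predict(a, b, 3, x_min=1, x_max=5)    # x = 3 is within the data
outside = predict(a, b, 20, x_min=1, x_max=5)  # x = 20 is far beyond the data

print("x = 3: ", inside)
print("x = 20:", outside)
```

The line still produces a number for x = 20, but the flag marks it as an extrapolation, so the prediction should not be trusted.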
Least-Squares Regression Line
The line of best fit
Makes the sum of squared residuals as small as possible
Used to predict y for a given value of x
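To see what "sum of squared residuals as small as possible" means, the sketch below fits the LSRL to a small made-up data set and compares its sum of squared residuals with that of a few nearby lines. The data and the helper name `sse` are assumptions for illustration.

```python
from statistics import mean

# Made-up data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def sse(a, b):
    """Sum of squared residuals for the line y_hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Least-squares slope and intercept
x_bar, y_bar = mean(x), mean(y)
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

best = sse(a, b)

# Sum of squared residuals for lines with a slightly different a and/or b
nearby = [sse(a + da, b + db)
          for da in (-0.5, 0, 0.5)
          for db in (-0.2, 0, 0.2)
          if (da, db) != (0, 0)]

print("LSRL sum of squared residuals:", round(best, 4))
print("Smallest among nearby lines:  ", round(min(nearby), 4))
```

Every nearby line gives a larger sum of squared residuals than the LSRL.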
Slope (b)
the amount by which y is predicted to change when x increases by 1
Every least-squares regression line passes through one point, and that is
(x̄, ȳ), the point of the means of x and y
When the data are entered in L1 and L2, how can you calculate the LSRL equation?
STAT -> CALC -> 8: LinReg(a+bx)
Identify the slope of the regression line found in example and explain what it means in context
As (x context) increases by 1 unit, the predicted (y context) increases (or decreases, if b is negative) by (b) (y units)
Identify the y intercept of the regression line found in example and explain what it means in context
When the (x context) is 0, the predicted (y context) is (a) (y units)
Slope of regression line formula
b = r(s_y / s_x)
r^2
coefficient of determination. The proportion of variation in the response variable that is EXPLAINED by the explanatory variable
What are the coefficients of the least squares regression model
y intercept (a) and estimated slope (b)
y intercept formula
a = ȳ - b·x̄
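The slope and y-intercept formulas can be worked through by hand or in code. The sketch below uses a small made-up data set; the variable names are illustrative, and r is computed from its usual definition since the cards do not give a formula for it.

```python
from statistics import mean, stdev

# Made-up data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations

# Correlation r (standard definition, assumed here)
r = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / ((n - 1) * s_x * s_y))

b = r * s_y / s_x      # slope
a = y_bar - b * x_bar  # y-intercept

print(f"r = {r:.4f}, b = {b:.4f}, a = {a:.4f}")

# The line passes through the point of means (x_bar, y_bar)
print("Prediction at x_bar:", round(a + b * x_bar, 4), "and y_bar:", y_bar)
```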
Standard deviation of residuals
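gives the typical size of a prediction error: roughly how far the observed y-values fall from the values predicted by the LSRL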
r^2 formula
r^2 = 1 - Σ(y - ŷ)^2 / Σ(y - ȳ)^2
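A short sketch of the r^2 formula applied to a small made-up data set. The data and variable names are assumptions for illustration.

```python
from statistics import mean

# Made-up data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Fit the LSRL
x_bar, y_bar = mean(x), mean(y)
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

# Numerator: squared residuals from the regression line
ss_residual = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
# Denominator: squared deviations from the mean of y
ss_total = sum((yi - y_bar) ** 2 for yi in y)

r_squared = 1 - ss_residual / ss_total
print(f"r^2 = {r_squared:.2f}")
```

For this data, 60% of the variation in y is explained by x.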
r^2 is expressed as
a percentage and does not have units
r^2 and S both tell us how well the linear model fits our data so
always make note of both when analyzing data
The standard deviation of the residuals is measured in
the units of the response variable
Interpret the coefficient of determination
(r^2)% of the variation in (y context) is explained by (x context).
Interpret the SD of the residuals
When the LSRL with (x context) is used to predict (y context), the predictions are typically off by about (S) (response units)
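The cards do not give a formula for S. A common one, assumed here, is S = sqrt(Σ(y - ŷ)^2 / (n - 2)). The sketch below applies it to a small made-up data set.

```python
import math
from statistics import mean

# Made-up data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Fit the LSRL
x_bar, y_bar = mean(x), mean(y)
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Standard deviation of the residuals (assumed formula, n - 2 in the denominator)
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
print(f"S = {s:.3f} (in the units of y)")
```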
Constant
a = the y-intercept
Influential observation
A point that, if removed, dramatically changes the regression results (the slope, the y-intercept, or the correlation r)
High leverage point
A point that has a substantially larger or smaller x-value than the other observations
An influential point may have a small residual but
still have a great effect on the regression line
Most extreme x values will not necessarily have the largest residuals but
usually have the most impact on the regression line and correlation
Extreme values in the x or y direction usually affect the slope in a similar way but
x direction outliers tend to change the correlation more drastically
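One way to see the effect of an extreme x-value is to fit the line with and without it. The sketch below uses made-up data where the last point has a much larger x than the rest; the helper name `fit` is hypothetical.

```python
import math
from statistics import mean


def fit(x, y):
    """Return (a, b, r) for the least-squares line y_hat = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b = sxy / sxx
    a = y_bar - b * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


# Made-up data; the last point (15, 3) is a high leverage point
x = [1, 2, 3, 4, 5, 15]
y = [2, 4, 5, 4, 5, 3]

a_with, b_with, r_with = fit(x, y)
a_without, b_without, r_without = fit(x[:-1], y[:-1])

print(f"With the point:    slope = {b_with:.3f}, r = {r_with:.3f}")
print(f"Without the point: slope = {b_without:.3f}, r = {r_without:.3f}")
```

Removing the single extreme point changes the slope from slightly negative to clearly positive and makes the correlation much stronger, so in this data set the high leverage point is also influential.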
Residual
the leftover vertical variation in the response variable (y): the vertical distance between an observed point and the LSRL
Every observation will have a residual
Residual plot
A scatterplot of the residuals plotted against the explanatory variable. Used to determine whether a linear model is appropriate for a data set
Residual formula
residual = y - ŷ (observed y minus predicted y)
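A small sketch of the residual calculation on made-up data. Pairing each residual with its x-value gives the points that would be drawn on a residual plot. The data and variable names are assumptions for illustration.

```python
from statistics import mean

# Made-up data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Fit the LSRL
x_bar, y_bar = mean(x), mean(y)
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

# residual = observed y - predicted y, one per observation
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# (x, residual) pairs: the points of a residual plot
for xi, e in zip(x, residuals):
    print(f"x = {xi}: residual = {e:+.2f}")

# Residuals from an LSRL sum to zero (up to rounding)
print("Sum of residuals:", round(sum(residuals), 10))
```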