Unit 2 Flashcards
A good regression line makes the residuals ____
as small as possible
Least-squares regression line meaning
the line that makes the sum of the squared residuals as small as possible
##Footnote
Simply adding all the residuals and aiming for a sum of 0 does not work as a measure of fit, because the positive and negative residuals cancel each other out. Even badly fitting lines can get a sum of 0, so to avoid this flaw we square the residuals before adding; the line with the smallest sum of squared residuals is the LSRL.
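A minimal sketch of the cancellation problem, using made-up data (the x and y values below are assumptions for illustration only). Every line through the point (x̄, ȳ) has residuals that sum to 0, so only the sum of squared residuals tells the lines apart:

```python
# Hypothetical data, invented only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

def residuals(slope):
    """Residuals (actual y - predicted y) for a line through (x_bar, y_bar)."""
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def sum_residuals(slope):
    return sum(residuals(slope))

def sum_squared_residuals(slope):
    return sum(e ** 2 for e in residuals(slope))

# Three candidate slopes: two poor choices and the least-squares slope (0.6 for this data).
for slope in (-2.0, 0.0, 0.6):
    print(slope, round(sum_residuals(slope), 10), round(sum_squared_residuals(slope), 2))
```

All three lines have residuals that add to (essentially) 0, but the sum of squared residuals is smallest for the slope 0.6.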
Formula for calculating the slope of the LSRL…
m = r * (Sy / Sx)
##Footnote
where “r” is the correlation coefficient, “Sy” is the standard deviation of the y values, and “Sx” is the standard deviation of the x values.
Regression to the mean meaning…
“Level out”, find “true” values, “regress to average”
Formula to calculate the y-intercept in the LSRL
b = ȳ - m * x̄
##Footnote
where “b” is the y-intercept, “ȳ” is the mean of the y-values, “m” is the slope, and “x̄” is the mean of the x-values.
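A short sketch of both LSRL formulas together, m = r * (Sy / Sx) and b = ȳ - m * x̄. The data values and the helper names `correlation` and `lsrl` are assumptions made up for this example:

```python
from statistics import mean, stdev

# Hypothetical data, invented only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def correlation(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    n = len(xs)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

def lsrl(xs, ys):
    """Return (m, b) using m = r * (Sy / Sx) and b = y_bar - m * x_bar."""
    r = correlation(xs, ys)
    m = r * (stdev(ys) / stdev(xs))
    b = mean(ys) - m * mean(xs)
    return m, b

m, b = lsrl(xs, ys)
print(f"predicted y = {b:.2f} + {m:.2f}x")
```

For this data the slope works out to 0.6 and the y-intercept to 2.2.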
residual plot meaning
a scatterplot that displays the residuals on the vertical axis and the values of the explanatory variable (or the predicted values) on the horizontal axis
what is the purpose of a residual plot?
to help show whether the regression line is an appropriate fit, i.e. whether a line matches the form of the relationship
How do you interpret a residual plot?
How do you know from the residual plot when a line is a good “fit” for a relationship?
By the lack of a leftover curved pattern; the residual plot shows random scatter
If the residual plot has a leftover curved pattern, what should we do?
Consider using a regression model with a different form, i.e. not linear
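A small sketch of what a "leftover curved pattern" looks like in numbers, assuming made-up data that follows y = x^2 exactly. Fitting a straight line to curved data leaves residuals that run positive, negative, positive:

```python
# Hypothetical curved data (y = x^2), invented only for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Least-squares slope and intercept for a straight-line model.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Residual = actual y - predicted y.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The U-shaped run of residuals is the curved pattern that suggests a non-linear model.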
Coefficient of determination r^2 definition…
measures the proportion (or percentage) of variation in the response variable that is accounted for (or explained) by the explanatory variable in the linear model
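One way to see r^2 as a "proportion of variation explained" is to compare the squared residuals from the LSRL with the squared deviations from ȳ. The sketch below assumes made-up data and uses the standard identity r^2 = 1 - SSE / SST, where the names `sse` and `sst` are labels chosen for this example:

```python
# Hypothetical data, invented only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# sse: variation left over after using the line; sst: total variation around y_bar.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - y_bar) ** 2 for y in ys)

r_squared = 1 - sse / sst
print(f"r^2 = {r_squared:.2f}")
```

Here about 60% of the variation in y is accounted for by the linear model with x.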
the standard deviation of the residuals s, definition…
the typical distance between actual y-values and predicted y-values
Interpreting s: points to include
- mention the actual value of the response variable w/ context
- the value of s, i.e. how far away the response variable typically is
- mention the predicted value w/ context
Interpreting s: The actual ____ is typically about ____ away from the value predicted by the LSRL w/ x = ____
The actual (response variable) is typically about (s) away from the value predicted by the LSRL w/ x = (explanatory variable)
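A minimal sketch of computing s, assuming made-up data and the usual convention of dividing the sum of squared residuals by n - 2 before taking the square root:

```python
import math

# Hypothetical data, invented only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Residual = actual y - predicted y.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Standard deviation of the residuals (n - 2 in the denominator).
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
print(f"s = {s:.2f}")
```

Read with the template above: the actual y is typically about 0.89 away from the value predicted by the LSRL w/ x = the explanatory variable.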
high leverage point
points with high leverage in regression have much larger or much smaller x-values than the other points in the data set
outlier
in regression, a point that does not follow the pattern of the data and has a large residual
influential point
in regression, any point that, if removed, substantially changes the slope, y-intercept, correlation, coefficient of determination, or standard deviation of the residuals
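A quick way to check for influence is to fit the line with and without the suspect point and compare. The sketch below assumes made-up data with one added high-leverage point at x = 15; `fit_line` is a hypothetical helper that uses the sums-of-products form of the least-squares slope, which is algebraically the same as r * (Sy / Sx):

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

# Hypothetical data, invented only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

# Same data plus one point with a much larger x-value that breaks the pattern.
xs_with = xs + [15.0]
ys_with = ys + [2.0]

slope_without, intercept_without = fit_line(xs, ys)
slope_with, intercept_with = fit_line(xs_with, ys_with)

print(f"without the point: slope = {slope_without:.2f}")
print(f"with the point:    slope = {slope_with:.2f}")
```

In this example the single added point flips the slope from positive to negative, so it would count as influential.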
Which model of the relationship between x and y is the best, judging by the residual plots?
the one which…
- has the most random scatter
- if more than one has random scatter, choose the model w/ the largest coefficient of determination r^2