Agresti chapter 3 Flashcards
Regression line
Predicts the value for the response variable y as a straight-line function of the value x of the explanatory variable.
Predicted value of y
y-hat
Equation for the regression line (aka prediction equation)
y-hat = a + bx.
a denotes the y-intercept.
b denotes the slope.
Residuals
Prediction error: the actual value y minus the predicted value y-hat (residual = y - y-hat). Its absolute value is the vertical distance between the point in the plot and the regression line.
Residual sum of squares (RSS)
RSS = sum of (residual)^2 = sum of (y - y-hat)^2.
The better the line, the smaller the residuals and the smaller the RSS. The regression line is the line with the smallest RSS; choosing the line this way is called the least squares method.
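A minimal Python sketch of the RSS comparison described above. The data values, the function name rss, and the alternative line are made up for illustration and do not come from the textbook.

```python
# Made-up illustrative data (an assumption, not from the textbook).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def rss(a, b, xs, ys):
    """Residual sum of squares for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# The least squares line for this data is y-hat = 2.2 + 0.6x.
rss_least_squares = rss(2.2, 0.6, xs, ys)

# An arbitrary alternative line, y-hat = 1 + x, for comparison.
rss_alternative = rss(1.0, 1.0, xs, ys)

print(rss_least_squares, rss_alternative)
```

For this data the least squares line gives an RSS of about 2.4, while the alternative line gives 4, so the least squares line fits better by this criterion.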
Slope formula
Slope = b = r(Sy / Sx), where r is the correlation between x and y, Sx is the SD of x, and Sy is the SD of y. (x-bar and y-bar, the means of x and y, appear in the y-intercept formula.)
Y-intercept formula
Y-intercept = a = y-bar - b(x-bar)
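A short Python sketch of the slope and y-intercept formulas, using only the standard library. The data are made up for illustration, and r is computed by hand from its definition with sample standard deviations; treat this as one possible way to check the formulas, not a reference implementation.

```python
from statistics import mean, stdev

# Made-up illustrative data (an assumption, not from the textbook).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample SDs (divide by n - 1)

# Correlation r: sum of products of deviations, scaled by (n - 1) * s_x * s_y.
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

b = r * (s_y / s_x)    # slope
a = y_bar - b * x_bar  # y-intercept

print(f"y-hat = {a:.2f} + {b:.2f}x")
```

With these values the result is a slope of about 0.6 and a y-intercept of about 2.2.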
Influential observation
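An observation whose x value is relatively high or low compared to the rest of the data and that falls far from the trend the rest of the data follow. Removing it can change the regression line substantially.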
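A hedged Python sketch of how one such observation can pull the regression line. The data, the added point (15, 2), and the helper name least_squares are all hypothetical choices for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx        # algebraically the same as r * (Sy / Sx)
    a = y_bar - b * x_bar
    return a, b

# Made-up illustrative data (an assumption, not from the textbook).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Hypothetical extra point: x far above the rest, y far below the trend.
xs_with = xs + [15]
ys_with = ys + [2]

a_without, b_without = least_squares(xs, ys)
a_with, b_with = least_squares(xs_with, ys_with)

print(f"without the point: slope = {b_without:.3f}")
print(f"with the point:    slope = {b_with:.3f}")
```

In this example the single added point turns a positive slope into a negative one, which is what makes it influential.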
Lurking variable
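A third variable, usually unobserved, that influences the association between the explanatory and response variables. It is one reason correlation does not imply causation. Different from a confounding variable, which is measured in the study and is associated with both the response variable and the explanatory variable.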