Week 8 (Simple Linear Regression) Flashcards
Regression
-Uses correlation to predict values of one variable from another
-Prediction is done by finding a regression line that best represents the data
Regression terminology
-X axis: Predictor / explanatory / independent variable (IV)
-Y axis: Outcome / criterion / dependent variable (DV)
Simple regression equation
Y = b0 + b1 X + e
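b0 = intercept (the predicted value of Y when X = 0)
b1 = slope of the line (the change in predicted Y for a one-unit increase in X)
e = error (residual)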
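A minimal Python sketch of how b0 and b1 can be obtained, assuming the line is fitted by ordinary least squares (the usual meaning of the "best" line, although the cards do not name the method). The data values and the helper name fit_line are hypothetical, chosen only for illustration.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y = b0 + b1*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: how X and Y vary together, divided by how much X varies.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = s_xy / s_xx
    # Intercept: forces the line through the point (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Made-up example data (not from the course material).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(round(b0, 3), round(b1, 3))
```

For these made-up numbers the fitted line is roughly Y = 2.2 + 0.6 X.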
Error in regression equation
Residual or prediction error
-The difference between the observed value of the outcome variable and what the model predicts (e = Yobs - Ypred)
Residual sum of squares
-Residuals can be positive or negative
-If we add the residuals, the positive ones will cancel out the negative ones, so we square them before we add them up
-We refer to this total as the sum of squared residuals or residual sum of squares (denoted by SSr)
-SSr is a gauge of how well the model (line) fits the data: the smaller the SSr, the better the fit.
-Tells us how much ERROR there is in the model
-BUT doesn’t tell us whether the model is better than nothing (the line of best fit can still be a lousy fit)
-We need to compare the model against a baseline to see whether it improves our prediction
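The cancelling-out problem and the squaring step can be sketched in Python as follows. The data and the coefficients b0 = 2.2 and b1 = 0.6 are hypothetical; they are assumed to be the least-squares line for these made-up values.

```python
# Made-up data and an assumed fitted line (illustration only).
x = [1, 2, 3, 4, 5]
y_obs = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

# Residual for each case: e = Yobs - Ypred.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y_obs)]

# Positive and negative residuals cancel, so the plain sum is (almost) zero.
sum_of_residuals = sum(residuals)

# Squaring first removes the signs, giving the residual sum of squares (SSr).
ss_r = sum(e ** 2 for e in residuals)

print(round(sum_of_residuals, 6), round(ss_r, 3))
```

Here the raw residuals sum to about 0 while SSr is about 2.4, which is why the squared version is the one used as the measure of error.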
Total Sum of Squares (SSt)
-Using the mean of observed Y as a baseline model
-Assuming no relationship between Y and X
-The sum of squared differences between the Yobs and the sample mean
-In this baseline model the only prediction is the mean, so its residual sum of squares equals SSt (SSt = SSr for the baseline only)
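A short Python sketch of the baseline model, using made-up outcome values (hypothetical, for illustration only). Every case is predicted to be the sample mean, so the error of this baseline is the total sum of squares.

```python
# Made-up observed outcomes (illustration only).
y_obs = [2, 4, 5, 4, 5]

# Baseline model: ignore X and predict the sample mean for every case.
mean_y = sum(y_obs) / len(y_obs)
baseline_pred = [mean_y] * len(y_obs)

# SSt: squared differences between each observed Y and the sample mean.
ss_t = sum((yi - mean_y) ** 2 for yi in y_obs)

# Residual sum of squares of the baseline model is the same quantity.
baseline_ss_r = sum((yi - pi) ** 2 for yi, pi in zip(y_obs, baseline_pred))

print(ss_t, baseline_ss_r)
```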
Model sum of squares
SSt = SSm + SSr (total = model + residual), so SSm = SSt - SSr
-Sum of squared differences between Ypred and the sample mean of Y.
-It represents the improvement from the baseline model to the regression line
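The decomposition can be checked numerically. The sketch below uses made-up data and an assumed least-squares line (b0 = 2.2, b1 = 0.6, both hypothetical); the identity total = model + residual is expected to hold only when the line is a least-squares fit that includes an intercept.

```python
# Made-up data and an assumed least-squares line (illustration only).
x = [1, 2, 3, 4, 5]
y_obs = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

mean_y = sum(y_obs) / len(y_obs)
y_pred = [b0 + b1 * xi for xi in x]

# Total: observed values vs. the mean (baseline error).
ss_total = sum((yi - mean_y) ** 2 for yi in y_obs)
# Model: predicted values vs. the mean (improvement over the baseline).
ss_model = sum((pi - mean_y) ** 2 for pi in y_pred)
# Residual: observed values vs. predicted values (error left over).
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y_obs, y_pred))

print(round(ss_total, 3), round(ss_model, 3), round(ss_residual, 3))
```

With these numbers the total of about 6 splits into about 3.6 for the model and about 2.4 for the residual.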
R squared
R squared = model sum of squares / total sum of squares: the variance in the outcome explained by the model, divided by the total variance in the outcome variable to be explained
-This provides the proportion of variance accounted for by the model
-R squared ranges between 0 and 1. The higher the value, the better the model fits the data.
-Interpret R squared as a percentage
-E.g., if you have an R squared of .69, 69% of the variance in the outcome variable is explained by the model.
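A small worked example in Python of computing R squared and reading it as a percentage. The observed and predicted values are made up (the predictions are assumed to come from a least-squares line), so the result illustrates the calculation only.

```python
# Made-up observed outcomes and assumed model predictions (illustration only).
y_obs = [2, 4, 5, 4, 5]
y_pred = [2.8, 3.4, 4.0, 4.6, 5.2]

mean_y = sum(y_obs) / len(y_obs)

# Variance explained by the model (model sum of squares).
ss_model = sum((pi - mean_y) ** 2 for pi in y_pred)
# Total variance to be explained (total sum of squares).
ss_total = sum((yi - mean_y) ** 2 for yi in y_obs)

r_squared = ss_model / ss_total

# Report as a percentage of variance explained.
print(f"R squared = {r_squared:.2f} ({r_squared:.0%} of variance explained)")
```

For these values R squared is about .60, so roughly 60% of the variance in the outcome is explained by the model.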