Week 8 (Simple Linear Regression) Flashcards

1
Q

Regression

A

-Uses correlation to predict values of one variable from another
-Prediction is done by finding a regression line that best represents the data

2
Q

Regression terminology

A

-X axis: Predictor / explanatory / independent variable (IV)
-Y axis: Outcome / criterion / dependent variable (DV)

3
Q

Simple regression equation

A

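Y = b0 + b1 X + e
b0 = intercept (the predicted value of Y when X = 0)
b1 = slope of the line (the change in predicted Y for a one-unit increase in X)
e = error (residual)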
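
A minimal sketch of how b0 and b1 could be estimated by least squares, in plain Python. The data (hours studied and exam score) and the function name fit_line are made up for illustration and are not part of the course material.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y = b0 + b1 * X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: how X and Y vary together, divided by how much X varies.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the least-squares line passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: hours studied (X) and exam score (Y).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

b0, b1 = fit_line(hours, score)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

For this made-up data the slope is about 0.6 and the intercept about 2.2, so each extra hour is associated with a predicted increase of roughly 0.6 points.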

4
Q

Error in regression equation

A

Residual or prediction error
-The difference between the observed value of the outcome variable and what the model predicts (e = Yobs - Ypred)

5
Q

Residual sum of squares

A

-Residuals can be positive or negative
-If we add the residuals, the positive ones will cancel out the negative ones, so we square them before we add them up
-We refer to this total as the sum of squared residuals or residual sum of squares (denoted by SSr)
-SSr is a gauge of how well the model (line) fits the data: the smaller the SSr, the better the fit.
-Tells us how much ERROR there is in the model
-BUT doesn’t tell us whether the model is better than nothing (the line of best fit can still be a lousy fit)
-We need to compare the model against a baseline to see whether it improves our prediction
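
The points above can be sketched in plain Python. The data are made up, and b0 = 2.2, b1 = 0.6 are assumed to be the least-squares coefficients for that data; the second line (b0 = 2.0, b1 = 0.5) is an arbitrary comparison line chosen only to show a worse fit.

```python
def residuals(x, y, b0, b1):
    """Residuals e = Yobs - Ypred for the line Ypred = b0 + b1 * X."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def sum_of_squares(values):
    """Square each value, then add them up."""
    return sum(v ** 2 for v in values)


# Hypothetical data: hours studied (X) and exam score (Y).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

# Assumed least-squares line for this data.
res = residuals(hours, score, 2.2, 0.6)
ss_r = sum_of_squares(res)

# An arbitrary alternative line, for comparison only.
ss_r_other = sum_of_squares(residuals(hours, score, 2.0, 0.5))

print("sum of raw residuals:", round(sum(res), 10))  # positives and negatives cancel
print("SSr (least-squares line):", round(ss_r, 4))
print("SSr (alternative line):", round(ss_r_other, 4))
```

The raw residuals of the least-squares line sum to (approximately) zero, which is why they are squared first. The alternative line gives a larger SSr, meaning a poorer fit.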

6
Q

Total Sum of Squares (SSt)

A

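-Uses the mean of the observed Y values as a baseline model
-The baseline assumes no relationship between Y and X
-SSt is the sum of squared differences between each Yobs and the sample mean of Y
-For this baseline model, the residual sum of squares equals SSt (SSt = SSr), because the mean is the prediction for every case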

7
Q

Model sum of squares

A

SSt = SSm + SSr (total = model + residual)
-SSm is the sum of squared differences between each Ypred and the sample mean of Y
-It represents the improvement from the baseline model (the mean) to the regression line
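
A small sketch checking the decomposition numerically in plain Python. The data are made up, and b0 = 2.2, b1 = 0.6 are assumed to be the least-squares coefficients for that data; the identity holds for the least-squares line, not for an arbitrary line.

```python
# Hypothetical data: hours studied (X) and exam score (Y).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

# Assumed least-squares coefficients for this data.
b0, b1 = 2.2, 0.6

mean_y = sum(score) / len(score)
predicted = [b0 + b1 * x for x in hours]

# Total: observed values vs the mean (the baseline model).
ss_t = sum((y - mean_y) ** 2 for y in score)
# Model: predicted values vs the mean (improvement over the baseline).
ss_m = sum((p - mean_y) ** 2 for p in predicted)
# Residual: observed values vs predicted values (error left over).
ss_r = sum((y - p) ** 2 for y, p in zip(score, predicted))

print("SSt:", round(ss_t, 4))
print("SSm:", round(ss_m, 4))
print("SSr:", round(ss_r, 4))
print("SSm + SSr:", round(ss_m + ss_r, 4))
```

Up to floating-point rounding, SSm + SSr matches SSt for this data.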

8
Q

R squared

A

R squared = SSm / SSt: the variance in the outcome explained by the model divided by the total variance in the outcome variable to be explained
-This gives the proportion of variance accounted for by the model
-R squared ranges between 0 and 1. The higher the value, the better the model fits the data.
-Interpret R squared as a percentage
-E.g., if you have an R squared of .69, 69% of the variance in the outcome variable is explained by the model.
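
As an illustration, R squared could be computed from observed and predicted values like this in plain Python. The function name r_squared, the data, and the coefficients (b0 = 2.2, b1 = 0.6, assumed to be the least-squares fit for this data) are all hypothetical.

```python
def r_squared(observed, predicted):
    """Proportion of variance in the outcome explained by the model (SSm / SSt)."""
    mean_y = sum(observed) / len(observed)
    ss_t = sum((y - mean_y) ** 2 for y in observed)
    ss_m = sum((p - mean_y) ** 2 for p in predicted)
    return ss_m / ss_t


# Hypothetical data: hours studied (X) and exam score (Y).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

# Predictions from the assumed least-squares line Y = 2.2 + 0.6 * X.
predicted = [2.2 + 0.6 * x for x in hours]

r2 = r_squared(score, predicted)
print(f"R squared = {r2:.2f}, i.e. {r2:.0%} of the variance explained")
```

For this made-up data R squared is about .60, so about 60% of the variance in exam score is explained by hours studied.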
