8: Regression Flashcards
When is linear regression used?
When the relationship between x and y can be described with a straight line. It allows us to estimate how much y will change for a given change in x.
What is the variable being predicted known as (y)?
Outcome, dependent, criterion.
What is the variable being used to predict known as (x)?
Predictor, independent, explanatory.
What are the assumptions of linear regression?
Normally distributed data, a linear relationship between x and y, and no outliers. Regression is also sensitive to restriction of range.
What are the three stages of linear regression?
- Analysing the relationship.
- Proposing a model to explain the relationship.
- Evaluating the model.
Regression equation
y = bx + a, where a is the intercept (the value of y when x = 0) and b is the slope (the change in y for a one-unit change in x).
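As a minimal sketch of how b and a can be estimated, the snippet below uses ordinary least squares, the usual fitting method, although the cards do not name one. The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (b, a) for y = b*x + a using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y)
    a = mean_y - b * mean_x
    return b, a

# Made-up illustrative data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b, a = fit_line(xs, ys)
print(f"y = {b:.2f}x + {a:.2f}")  # prints "y = 0.60x + 2.20"
```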
What are the two models, in relation to goodness of fit?
The simple model (the mean of y) and the improved model (the regression line).
What is total variance and how is it calculated?
The variation in y left unexplained when the mean of y is used as the model (the simple model). Calculate the difference between each data point and the mean, square each difference, and add them together (SST, the total sum of squares).
What is residual variance and how is it calculated?
The variation in y left unexplained by the regression model. Calculate the difference between each predicted value of y and the actual value, square each difference, and add them together (SSR, the residual sum of squares).
What does the difference between SST and SSR represent?
The improvement in prediction from using the regression model rather than the simple model (the model sum of squares, SSM). An ANOVA can be used to test whether this improvement is significant.
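The three sums of squares can be worked through on a small example. This is a sketch on made-up data, assuming a slope of 0.6 and an intercept of 2.2, which are the least-squares values for these particular numbers.

```python
# Made-up illustrative data; b and a are the least-squares values for it
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b, a = 0.6, 2.2

mean_y = sum(ys) / len(ys)
predicted = [b * x + a for x in xs]

# SST: squared distances from the mean (error of the simple model)
sst = sum((y - mean_y) ** 2 for y in ys)
# SSR: squared distances from the regression line (residual error)
ssr = sum((y - p) ** 2 for y, p in zip(ys, predicted))
# SSM: improvement gained by using the regression line instead of the mean
ssm = sst - ssr

print(f"SST = {sst:.1f}, SSR = {ssr:.1f}, SSM = {ssm:.1f}")
```

Here SST is 6.0 and SSR is 2.4, so the regression line removes 3.6 of the 6.0 units of squared error left by the mean.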
F =
MSM / MSR (the mean squares for the model divided by the mean squares for the residuals).
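A short sketch of the F ratio, assuming the standard degrees of freedom, which the cards do not state: k for the model and n - k - 1 for the residuals, with n observations and k predictors. The function name `f_ratio` and the input values are hypothetical.

```python
def f_ratio(ssm, ssr, n, k):
    """F = MSM / MSR for n observations and k predictors."""
    msm = ssm / k            # model mean square: SSM / model df
    msr = ssr / (n - k - 1)  # residual mean square: SSR / residual df
    return msm / msr

# Hypothetical sums of squares from a one-predictor model with 5 observations
f = f_ratio(ssm=3.6, ssr=2.4, n=5, k=1)
print(f"F(1, 3) = {f:.2f}")  # prints "F(1, 3) = 4.50"
```

A larger F means the model explains more variation per degree of freedom than it leaves unexplained.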
What does R represent?
The strength of the relationship between x and y. R^2 is the proportion of the variance in y explained by the model.
What is adjusted R^2?
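R^2 adjusted to account for degrees of freedom (the sample size and the number of predictors), so that adding predictors does not automatically inflate it.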
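As a sketch, R^2 and adjusted R^2 can be computed from the sums of squares. The adjustment below is one commonly used formula, 1 - (1 - R^2)(n - 1)/(n - k - 1), and is an assumption here because the cards do not give one. The function names and input values are hypothetical.

```python
import math

def r_squared(sst, ssr):
    """Proportion of total variation explained by the model."""
    return 1 - ssr / sst

def adjusted_r_squared(r2, n, k):
    """Penalise R^2 for k predictors given n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical sums of squares from a one-predictor model with 5 observations
r2 = r_squared(sst=6.0, ssr=2.4)
adj = adjusted_r_squared(r2, n=5, k=1)
# The square root gives the magnitude of R only; its sign follows the slope
r = math.sqrt(r2)
print(f"R = {r:.3f}, R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}")
```

With so few observations the adjustment is large: R^2 is 0.600 but adjusted R^2 is about 0.467.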
What is multiple regression and when is it used?
It allows us to assess the influence of several predictor variables on the outcome variable (y). Each predictor has its own slope, and the predictors' contributions are added together.
What is the regression equation for multiple regression?
y = (b1 × x1) + (b2 × x2) + (b3 × x3) + a.
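To illustrate how the equation produces a prediction, here is a minimal sketch. The function name `predict` and all coefficient and predictor values are hypothetical, not fitted from data.

```python
def predict(a, slopes, values):
    """Apply y = (b1 * x1) + (b2 * x2) + ... + a for one observation."""
    if len(slopes) != len(values):
        raise ValueError("need one slope per predictor value")
    # Each predictor contributes its slope times its value
    return a + sum(b * x for b, x in zip(slopes, values))

# Hypothetical intercept, slopes (b1, b2, b3) and predictor values (x1, x2, x3)
y_hat = predict(a=1.0, slopes=[0.5, 2.0, -1.0], values=[4, 3, 2])
print(y_hat)  # prints 7.0
```

The three contributions are 2.0, 6.0 and -2.0, which sum with the intercept to 7.0.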