Multiple Linear Regression Flashcards
How does multiple linear regression differ from simple linear regression?
▪️Includes more predictors/IVs
▪️Fits a regression plane (or hyperplane) instead of a line
y = β0 + β1x1 + β2x2
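A minimal sketch (Python, with made-up data; statsmodels/NumPy assumed available) of fitting such a model by ordinary least squares:

```python
# Minimal sketch: fitting y = β0 + β1*x1 + β2*x2 by ordinary least squares.
# The data below are made up purely for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.5, size=100)

X = sm.add_constant(np.column_stack([x1, x2]))  # adds the intercept column (β0)
model = sm.OLS(y, X).fit()
print(model.params)     # estimates of β0, β1, β2 (partial regression coefficients)
print(model.summary())  # includes t-tests of H0: βi = 0 and R-squared
```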
When do you use multiple linear regression?
To study the relationship between one DV and two or more IVs simultaneously
What is the partial regression coefficient (βi)?
The amount Y will change for each unit increase in IV xi whilst holding all other variables constant
What is the null hypothesis for a multiple linear regression?
Holding all other variables constant, there is no linear association between y and xi (i.e. βi = 0)
What is a confounding variable?
Any variable that may distort the observed association between an explanatory variable and the outcome
Has an effect on both the explanatory variable and the outcome
What happens if you don’t take a confounder into account?
It introduces bias into the estimation of β1
How many confounders can you consider in a multiple regression model?
Usually one IV for each 10 observations
E.g. if n = 100, you can consider 10 IVs (the main exposure plus 9 confounders)
What is R-squared?
The coefficient of determination
Measure of how well regression line/hyperplane approximates real data points (goodness of fit)
What does an R-squared of 0 indicate?
Poor fit - the regression line would be perfectly horizontal (at the mean of Y), explaining none of the variance in Y
What does an R-squared of 1 indicate?
Perfect fit
What is R-squared in a simple linear regression?
Pearson’s coefficient squared (r-squared)
How do you interpret R-squared?
The proportion of variance in the DV that is “explained” by the IVs in the model
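A small sketch (made-up data, plain NumPy) showing R-squared computed as 1 − SS_res/SS_tot, i.e. the proportion of variance explained:

```python
# Sketch: R-squared = 1 - SS_res / SS_tot, the proportion of variance in Y
# explained by the IVs. Made-up data, plain NumPy least squares.
import numpy as np

rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=100)

X = np.column_stack([np.ones(100), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients
y_hat = X @ beta

ss_res = np.sum((y - y_hat) ** 2)     # variation left unexplained by the model
ss_tot = np.sum((y - y.mean()) ** 2)  # total variation in Y
print(1 - ss_res / ss_tot)            # R-squared
```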
How is R-squared interpreted in the context of prediction analysis?
How well the model will be able to predict values of Y based on observed values of the IVs
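A short sketch (illustrative data; the OLS fit is the same kind as in the earlier sketch) of using a fitted model to predict Y for new observations:

```python
# Sketch: a fitted model predicts Y for new observed values of the IVs.
# Illustrative data only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=100)
model = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

new_X = sm.add_constant(np.array([[0.5, -1.0], [1.2, 0.3]]), has_constant="add")
print(model.predict(new_X))  # predicted Y for two new (x1, x2) observations
```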
What does R-squared NOT indicate?
▪️The IVs CAUSE changes in the DV
▪️Correct type of regression was used
▪️Most appropriate IVs were chosen
▪️There’s enough data for a solid conclusion
What is adjusted R-squared?
Modified to adjust for the number of IVs in the model.
R-squared increases whenever a new IV is added, regardless of how informative it is; adjusted R-squared only increases if the new IV improves the model more than would be expected by chance
What statistic is the better indicator for model selection?
Adjusted R-squared
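A sketch of the standard adjusted R-squared formula, assuming n observations and p IVs:

```python
# Sketch of the standard adjusted R-squared formula:
# adj_R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1), with n observations and p IVs.
def adjusted_r_squared(r2: float, n: int, p: int) -> float:
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(adjusted_r_squared(0.40, n=100, p=2))   # ~0.388
print(adjusted_r_squared(0.40, n=100, p=20))  # ~0.248 - more IVs, bigger penalty
```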
What do we assume about the relationship of variables for multiple regression inference?
It’s linear
What does a partial residual plot show?
The net relationship between X1 and Y where the influence of other variables is partialled out
What do we assume about the residuals/error terms for multiple regression inference?
They’re approximately normally distributed
How do we plot a partial residual plot?
Residuals of the DV against residuals of each IV separately (each set of residuals obtained by regressing on the remaining IVs)
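A sketch of that procedure for x1 (made-up data; statsmodels/matplotlib assumed available):

```python
# Sketch: regress Y on the other IVs, regress x1 on the other IVs, then plot
# one set of residuals against the other. The slope of this scatter equals
# the partial coefficient β1.
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=100)

others = sm.add_constant(x2)
resid_y = sm.OLS(y, others).fit().resid    # part of Y not explained by x2
resid_x1 = sm.OLS(x1, others).fit().resid  # part of x1 not explained by x2

plt.scatter(resid_x1, resid_y)
plt.xlabel("x1 residuals")
plt.ylabel("Y residuals")
plt.title("Partial (added-variable) plot for x1")
plt.show()
```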
What is the residual error?
Variation in Y not explained by the predictors
How can we see whether the error terms are normally distributed?
▪️Histogram
▪️Normal P-P plot - residuals plotted against a theoretical normal distribution
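A sketch of both checks on stand-in residuals (statsmodels/matplotlib assumed available; in practice you would use the model's own residuals):

```python
# Sketch of the two normality checks on stand-in residuals.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.gofplots import ProbPlot

rng = np.random.default_rng(4)
residuals = rng.normal(size=200)  # stand-in for the model's residuals

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
ax1.hist(residuals, bins=20)                        # should look roughly bell-shaped
ax1.set_title("Histogram of residuals")
ProbPlot(residuals).ppplot(line="45", ax=ax2)       # points should hug the 45° line
ax2.set_title("Normal P-P plot")
plt.show()
```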
What is homoscedasticity?
▪️Stability in variance of residuals
▪️A scatterplot of standardised residuals and standardised predicted values shows no pattern
▪️Error values have same variance irrespective of x
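A sketch of that scatterplot check on made-up data:

```python
# Sketch: standardised predicted values (ZPRED) vs standardised residuals
# (ZRESID); a patternless cloud centred on zero is consistent with
# homoscedasticity.
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
x1, x2 = rng.normal(size=100), rng.normal(size=100)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=100)

fit = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
z_pred = (fit.fittedvalues - fit.fittedvalues.mean()) / fit.fittedvalues.std()
z_resid = fit.resid / fit.resid.std()

plt.scatter(z_pred, z_resid)
plt.axhline(0)
plt.xlabel("Standardised predicted values (ZPRED)")
plt.ylabel("Standardised residuals (ZRESID)")
plt.show()
```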
Observations used for regression models must be ___________
Independent
(can’t use repeated measurements, paired data or matched data)
What assumptions do we have to check for when interpreting multiple linear regression?
▪️Linearity
▪️Normal distribution of residuals
▪️Homoscedasticity (ZPred vs Zresid)
R-squared is a coefficient for what?
Measuring how well the regression line/hyperplane approximates the real data points