Multiple Linear Regression Flashcards

1
Q

How does multiple linear regression differ from simple linear regression?

A

▪️Includes more predictors/IVs
▪️Fits a regression plane or hyperplane to the data (instead of a line)

y = β0 + β1x1 + β2x2
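
As a rough illustration of the equation above, the sketch below fits a regression plane by ordinary least squares with NumPy. The data are hypothetical and noise-free, generated from y = 1 + 2x1 + 3x2, so the fit should recover those coefficients; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical, noise-free data generated from y = 1 + 2*x1 + 3*x2
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 0.0])
y = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix: a column of ones for the intercept, then one column per IV
X = np.column_stack([np.ones_like(x1), x1, x2])

# Ordinary least squares estimate of (β0, β1, β2)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)
```

With a single IV the design matrix has two columns and the same call fits a line; each extra IV adds one column.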

2
Q

When do you use multiple linear regression?

A

To study the relationship between one DV and two or more IVs simultaneously

3
Q

What is the partial regression coefficient (βi)?

A

The amount Y will change for each one-unit increase in IV xi whilst holding all other variables constant
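
A small sketch of this interpretation, assuming hypothetical coefficients (β0, β1, β2) = (1, 2, 3): raising x1 by one unit while x2 stays fixed changes the predicted Y by exactly β1.

```python
# Hypothetical coefficients (β0, β1, β2); not taken from any real dataset
beta0, beta1, beta2 = 1.0, 2.0, 3.0

def predict(x1, x2):
    """Predicted Y from the two-IV model y = β0 + β1*x1 + β2*x2."""
    return beta0 + beta1 * x1 + beta2 * x2

# Increase x1 by one unit while holding x2 constant at 7
diff = predict(5.0, 7.0) - predict(4.0, 7.0)
print(diff)  # 2.0, i.e. β1
```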

4
Q

What is the null hypothesis for a multiple linear regression?

A

Holding all other variables constant, there is no linear association between y and x1 (H0: β1 = 0)

5
Q

What is a confounding variable?

A

Any variable that may distort the observed association between an explanatory variable and the outcome

It is associated with both the explanatory variable and the outcome

6
Q

What happens if you don’t take a confounder into account?

A

You introduce bias in the estimate of β1
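
The bias can be seen in a small constructed example. The sketch below uses hypothetical, noise-free data in which a confounder z affects both x1 and y, with a true β1 of 2. Leaving z out of the model inflates the estimated slope for x1; including it recovers the true value. All names and numbers are assumptions made for illustration.

```python
import numpy as np

# Hypothetical confounder z that drives both the IV (x1) and the outcome (y)
z = np.arange(8, dtype=float)
e = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])
x1 = z + e
y = 1.0 + 2.0 * x1 + 3.0 * z  # true β1 is 2

ones = np.ones_like(x1)

# Model that ignores the confounder: y ~ x1
naive, *_ = np.linalg.lstsq(np.column_stack([ones, x1]), y, rcond=None)

# Model that adjusts for the confounder: y ~ x1 + z
adjusted, *_ = np.linalg.lstsq(np.column_stack([ones, x1, z]), y, rcond=None)

print("slope for x1 without z:", naive[1])   # biased upwards
print("slope for x1 with z:   ", adjusted[1])  # close to the true value 2
```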

7
Q

How many confounders can you consider in a multiple regression model?

A

Rule of thumb: one IV for every 10 observations

E.g. if n=100, you can consider 10 IVs: the IV of interest plus 9 confounders

8
Q

What is R-squared?

A

The coefficient of determination

Measure of how well regression line/hyperplane approximates real data points (goodness of fit)

9
Q

What does an R-squared of 0 indicate?

A

Poor fit: the model explains none of the variance in Y, so the regression line would be perfectly horizontal (at the mean of Y)

10
Q

What does an R-squared of 1 indicate?

A

Perfect fit

11
Q

What is R-squared in a simple linear regression?

A

Pearson’s coefficient squared (r-squared)
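
A quick numerical check of this identity, using a small hypothetical dataset: the squared Pearson correlation between x and y matches the R-squared of the fitted straight line.

```python
import numpy as np

# Hypothetical data for a simple (one-IV) regression
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

# Pearson's correlation coefficient r
r = np.corrcoef(x, y)[0, 1]

# Fit y = intercept + slope * x and compute R-squared from the residuals
slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x
r_squared = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

print(r ** 2, r_squared)  # the two values agree up to rounding error
```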

12
Q

How do you interpret R-squared?

A

The proportion of variance in the DV that is “explained” by the IVs in the model
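
One common way to compute this proportion is 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that definition; the function name and data are illustrative. It also reproduces the two extreme cases: predicting the mean of Y for every observation gives 0, and predicting every observation exactly gives 1.

```python
def r_squared(observed, predicted):
    """R-squared as 1 - SS_residual / SS_total."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_y) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]

flat = r_squared(y, [2.5] * 4)  # horizontal line at the mean of y -> 0.0
perfect = r_squared(y, y)       # every point predicted exactly -> 1.0
print(flat, perfect)
```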

13
Q

How is R-squared interpreted in the context of prediction analysis?

A

How well the model will be able to predict values of Y based on the observed values of the IVs

14
Q

What does R-squared NOT indicate?

A

▪️The IVs CAUSE changes in the DV
▪️Correct type of regression was used
▪️Most appropriate IVs were chosen
▪️There’s enough data for a solid conclusion

15
Q

What is adjusted R-squared?

A

R-squared modified to adjust for the number of IVs in the model.

Plain R-squared never decreases when a new IV is added, regardless of how informative it is
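
The card does not give a formula. A commonly used one, assumed here, is adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of IVs. The sketch shows that if an extra IV leaves R-squared unchanged, the adjusted value goes down.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p IVs (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical case: same R-squared of 0.50, n = 21, with 2 and then 3 IVs
two_ivs = adjusted_r_squared(0.50, n=21, p=2)
three_ivs = adjusted_r_squared(0.50, n=21, p=3)
print(two_ivs, three_ivs)  # the second value is smaller
```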

16
Q

What statistic is the better indicator for model selection?

A

Adjusted R-squared

17
Q

What do we assume about the relationship of variables for multiple regression inference?

A

It’s linear

18
Q

What does a partial residual plot show?

A

The net relationship between X1 and Y where the influence of other variables is partialled out

19
Q

What do we assume about the residuals/error terms for multiple regression inference?

A

They’re approximately normally distributed

20
Q

How do we plot a partial residual plot?

A

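For each IV separately: the residuals of the DV (after regressing it on the other IVs) against the residuals of that IV (after regressing it on the other IVs)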
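
A minimal sketch of the two sets of residuals behind such a plot, using hypothetical data and an illustrative helper `ols_residuals`. A useful property, checked at the end, is that the slope of the line through these residuals equals the coefficient of that IV in the full multiple regression.

```python
import numpy as np

def ols_residuals(target, predictors):
    """Residuals from regressing target on predictors plus an intercept."""
    X = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

# Hypothetical data with two IVs
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
y = np.array([3.0, 4.5, 6.0, 9.5, 10.0, 13.5, 14.0, 18.5])

# Full model y ~ x1 + x2, for comparison
X_full = np.column_stack([np.ones_like(x1), x1, x2])
beta, *_ = np.linalg.lstsq(X_full, y, rcond=None)

# Points for the plot of x1: both variables with x2 partialled out
ry = ols_residuals(y, x2)    # vertical axis
rx = ols_residuals(x1, x2)   # horizontal axis

# Slope of the line through the plotted points (no intercept needed,
# because both sets of residuals have mean zero)
slope = np.sum(rx * ry) / np.sum(rx ** 2)
print(slope, beta[1])  # the two values agree up to rounding error
```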

21
Q

What is the residual error?

A

Variation in Y not explained by the predictors in the model

22
Q

How can we see whether the error terms are normally distributed?

A

▪️Histogram
▪️Normal P-P plot - plotted against theoretical normal distribution

23
Q

What is homoscedasticity?

A

▪️Stability in variance of residuals
▪️A scatterplot of standardised residuals and standardised predicted values shows no pattern
▪️Error values have same variance irrespective of x

24
Q

Observations used for regression models must be ___________

A

Independent

(can’t use repeated measurements, paired data or matched data)

25
Q

What assumptions do we have to check for when interpreting multiple linear regression?

A

▪️Linearity
▪️Normal distribution of residuals
▪️Homoscedasticity (ZPred vs Zresid)
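
The ZPred and ZResid values for the homoscedasticity check can be sketched as below. This assumes one common standardisation: predicted values are converted to z-scores, and residuals are divided by the residual standard error. The data are hypothetical. Plotting `zresid` against `zpred` with any plotting tool should show no funnel or curve if the assumption holds.

```python
import numpy as np

# Hypothetical data with two IVs
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9, 15.2, 15.8])

# Fit the model and get predicted values and residuals
X = np.column_stack([np.ones_like(x1), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
predicted = X @ beta
residuals = y - predicted

n, k = X.shape  # k counts the intercept plus the IVs

# Standardised predicted values: z-scores of the predictions
zpred = (predicted - predicted.mean()) / predicted.std(ddof=1)

# Standardised residuals: residuals divided by the residual standard error
zresid = residuals / np.sqrt(np.sum(residuals ** 2) / (n - k))

for zp, zr in zip(zpred, zresid):
    print(f"{zp:7.3f} {zr:7.3f}")
```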

26
Q

R-squared is a coefficient for what?

A

Measuring how well the regression line/hyperplane approximates the real data points