Lecture 39 - Multiple Linear Regression Flashcards

1
Q

What is the general idea of multiple linear regression as opposed to simple linear regression?

A
Simple linear regression allows us to assess the effect of a single
explanatory variable (x) on a response variable (y).

Multiple linear regression uses several explanatory variables (multiple x’s) to explain the response.

2
Q

What is the model in multiple linear regression given by?

A

Y = β0 + β1 x1 + ... + βk xk + e

k denotes the number of explanatory variables;
β0, β1, ..., βk are parameters (regression coefficients);
e is an error term following a N(0, σ_e^2) distribution.
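
To make the model concrete, here is a minimal Python sketch that evaluates the deterministic part and then adds a normally distributed error. The coefficient values, the x values, and the function names are hypothetical, chosen only for illustration.

```python
import random

def mean_response(x, beta):
    """Deterministic part: beta[0] + beta[1]*x[0] + ... + beta[k]*x[k-1]."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def simulate_response(x, beta, sigma_e, rng):
    """Draw one Y = mean response + e, with e ~ N(0, sigma_e^2)."""
    return mean_response(x, beta) + rng.gauss(0.0, sigma_e)

# Hypothetical example with k = 2 explanatory variables.
beta = [1.0, 2.0, -0.5]  # beta0, beta1, beta2 (assumed values)
x = [3.0, 4.0]           # x1, x2 (assumed values)
rng = random.Random(0)   # seeded so the draw is reproducible

print(mean_response(x, beta))                # 1 + 2*3 - 0.5*4 = 5.0
print(simulate_response(x, beta, 1.0, rng))  # 5.0 plus one random error
```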

3
Q

What do you take off the multiple linear regression equation to get the mean response as predicted by the explanatory variables (the x’s)? What is this known as?

A
  • Take off the e (error term), which has mean zero.

- The result is known as the conditional mean of Y given the fixed values of the predictor variables.
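
Written out as a formula, dropping the error term works because e has mean zero. This is a sketch in LaTeX using the same symbols as the model equation:

```latex
E(Y \mid x_1, \dots, x_k)
  = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + E(e)
  = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k ,
\qquad \text{since } E(e) = 0 .
```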

4
Q

Why is being able to do multiple linear regression important, i.e. what are its applications?

A
  • Adjusting for the effect of confounding variables.
  • Establishing which variables are important in explaining the values of the response variable and which are just extra noise.
  • Predicting values of the response variable.
  • Describing the strength of the association between the response variable and the explanatory variables.
5
Q

How do we visualize multiple regression?

A

With two explanatory variables the model exists in 3D: a plane is fitted so that it passes as close as possible to the data points.

6
Q

How is the fitted multiple regression model different to the model at the population level?

A

At the population level the model has parameters (β0, β1, ..., βk). In reality we are unlikely to know the actual parameter values, so we estimate them from the data and put hats on them (β̂0, β̂1, ..., β̂k) in the fitted model.

7
Q

How do we measure the overall quality of predictions with our model?

A

Through the residual sum of squares (RSS): the sum of the squared differences between the actual responses and the predictions from the model.
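
As a small illustration, the RSS can be computed directly from the observed responses and the fitted values. The numbers below are made up for the example.

```python
def rss(y, y_hat):
    """Residual sum of squares: sum of squared (actual - predicted)."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))

# Hypothetical observed responses and model predictions.
y = [3.0, 5.0, 7.0]
y_hat = [2.5, 5.5, 7.0]

print(rss(y, y_hat))  # 0.25 + 0.25 + 0.0 = 0.5
```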

8
Q

What are the least squares estimates?

A

The values of the parameters that minimize the RSS (the sum of squared differences between the predictions from the model and the actual values).
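
One way to find these values is to solve the normal equations (X'X) b = X'y. The sketch below does this in plain Python with Gaussian elimination. It is an illustration under stated assumptions, not the method the lecture or R necessarily uses: the data are hypothetical and noise-free, and the helper names `solve` and `least_squares` are invented here.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def least_squares(rows, y):
    """rows: list of [x1, ..., xk]. Returns [b0_hat, b1_hat, ..., bk_hat]."""
    X = [[1.0] + [float(v) for v in r] for r in rows]  # add intercept column
    p = len(X[0])
    XtX = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)] for a in range(p)]
    Xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(p)]
    return solve(XtX, Xty)

# Hypothetical data generated without noise from Y = 1 + 2*x1 + 3*x2,
# so the least squares estimates should recover those coefficients.
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

beta_hat = least_squares(rows, y)
print([round(b, 6) for b in beta_hat])  # approximately [1.0, 2.0, 3.0]
```

In practice a statistics package does this fitting for you; the point of the sketch is only that the estimates are the coefficient values giving the smallest RSS.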

9
Q

How do you use the residual sum of squares to estimate error variance?

A

Estimated error variance = RSS / (n - k - 1)

n = number of observations
k = number of predictors
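
A minimal sketch of this calculation, with made-up values for the RSS, the sample size n, and the number of predictors k:

```python
def error_variance(rss_value, n, k):
    """Estimate the error variance as RSS / (n - k - 1)."""
    df = n - k - 1  # residual degrees of freedom
    if df <= 0:
        raise ValueError("need more observations than parameters")
    return rss_value / df

# Hypothetical numbers: RSS = 12 from n = 15 observations and k = 2 predictors.
print(error_variance(12.0, 15, 2))  # 12 / (15 - 2 - 1) = 1.0
```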

10
Q

What happens as you add variables to the multiple linear regression?

A

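The amount of error (RSS) reduces, because you explain more of the variation in the response.

However, you can reach a point where ‘too much’ of the data is being explained (overfitting): the model starts to fit the noise in the sample rather than the underlying relationship.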

11
Q

How do you read the output from R to tell which explanatory variables are valuable in the model?

A

Look at the coefficient table. Each row after the intercept corresponds to a different explanatory variable: the Estimate column gives its estimated coefficient, and the p-value indicates whether the variable is important to the model (p-value less than 0.05) or whether it can or should be removed.

12
Q

What happens in R if there is a missing value?

A
  • By default, any observation (row) with a missing value is omitted from model fitting.

- This is a huge problem in lots of data sets.
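
The effect of this row-wise omission can be sketched in Python. This only mimics the behaviour described above and is not R code; the data values and the use of `None` to mark a missing value are assumptions for the example.

```python
def complete_cases(rows):
    """Keep only the rows that have no missing (None) values."""
    return [r for r in rows if all(v is not None for v in r)]

# Hypothetical rows: (height, father's height, age); None marks a missing value.
data = [
    (170.0, 175.0, 20),
    (165.0, None, 22),    # dropped: one predictor is missing
    (180.0, 182.0, None), # dropped: one predictor is missing
    (158.0, 160.0, 19),
]

kept = complete_cases(data)
print(len(data), len(kept))  # 4 2
```

Note how a single missing value discards the whole row, so half of this small data set is lost before the model is fitted.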

13
Q

Substitute the parameter estimates from slide 747 into the multiple linear regression equation to make a prediction for Y.

A

Answers in slide

14
Q

How do you interpret the coefficients for multiple regression?

A

- The intercept β0 is the predicted value of the response when all explanatory variables are zero.
- Each other coefficient is specific to its associated explanatory variable.
For example, β2 is the change in the mean response when variable x2 is increased by one unit and all other explanatory variables remain unchanged.
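
This interpretation can be checked numerically. In the sketch below the coefficient estimates are hypothetical: raising x2 by one unit while holding x1 fixed changes the predicted mean by exactly the x2 coefficient, and setting every x to zero returns the intercept.

```python
def predict_mean(x, beta_hat):
    """Predicted mean response: b0 + b1*x1 + ... + bk*xk."""
    return beta_hat[0] + sum(b * xi for b, xi in zip(beta_hat[1:], x))

# Hypothetical estimates: intercept, coefficient of x1, coefficient of x2.
beta_hat = [10.0, 1.5, -2.0]

base = predict_mean([4.0, 3.0], beta_hat)    # x1 = 4, x2 = 3
bumped = predict_mean([4.0, 4.0], beta_hat)  # x2 raised by one unit, x1 unchanged

print(bumped - base)                         # -2.0, the coefficient of x2
print(predict_mean([0.0, 0.0], beta_hat))    # 10.0, the intercept
```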

15
Q

Interpret β̂1 for the example on slide 752.

A

The interpretation of β̂1 is that male students are estimated to be 15.08 cm taller than female students on average, having adjusted for father’s height and age (i.e. with the other variables held the same).

16
Q

How do the estimate of the mean response and the prediction of a new response differ in multiple linear regression, and how do their intervals differ?

A

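While the estimate of the mean response and the prediction for a new response are identical, a 95% confidence interval for the mean response will be much narrower than the prediction interval for a new response, because the prediction interval must also allow for the error term of the new observation.
The reasoning is exactly the same as for simple linear regression.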
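
For reference, the two intervals can be written in the standard matrix form. The notation is an assumption not taken from the slides: X is the design matrix including a column of ones, x_0 = (1, x_{01}, ..., x_{0k})^T holds the new predictor values, s^2 = RSS/(n - k - 1), and t is the 97.5% point of the t distribution with n - k - 1 degrees of freedom.

```latex
\text{95\% CI for the mean response:}\quad
\hat{y}_0 \pm t_{n-k-1,\,0.975}\; s \sqrt{x_0^{T} (X^{T} X)^{-1} x_0}

\text{95\% prediction interval:}\quad
\hat{y}_0 \pm t_{n-k-1,\,0.975}\; s \sqrt{1 + x_0^{T} (X^{T} X)^{-1} x_0}
```

The extra 1 under the square root is the contribution of the new observation's own error term, which is why the prediction interval is always the wider of the two.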

17
Q

Calculate a 95% prediction interval for the example on slide 755.

A

Answer on slide