Lecture 39 - Multiple Linear Regression Flashcards
What is the general idea of multiple linear regression as opposed to simple linear regression?
Simple linear regression allows us to assess the effect of a single explanatory variable (x) on a response variable (y).
Multiple linear regression is used when you have several explanatory variables (x's) to explain your outcome (y).
What is the model in multiple linear regression given by?
Y = β0 + β1x1 + ... + βkxk + e
k denotes the number of explanatory variables;
β0, β1, ..., βk are parameters (regression coefficients);
e is an error term following a N(0, σe²) distribution.
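A minimal R sketch of fitting such a model (simulated data; the variable names x1, x2 and the data frame d are hypothetical, not from the lecture):

    # Simulate data where y depends on two explanatory variables plus noise
    set.seed(1)
    n  <- 100
    x1 <- rnorm(n)
    x2 <- rnorm(n)
    y  <- 2 + 1.5 * x1 - 0.8 * x2 + rnorm(n)   # true β0 = 2, β1 = 1.5, β2 = -0.8
    d  <- data.frame(y, x1, x2)

    # Fit Y = β0 + β1*x1 + β2*x2 + e by least squares
    fit <- lm(y ~ x1 + x2, data = d)
    coef(fit)   # the estimated coefficients β̂0, β̂1, β̂2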
What do you remove from the multiple linear regression equation to get the mean response as predicted by the explanatory variables? What is this known as?
- Remove the error term e.
- What remains is the conditional mean of Y given the fixed values of the predictor variables.
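In symbols, the mean response (regression function) is:

    E[Y | x1, ..., xk] = β0 + β1x1 + ... + βkxk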
Why is being able to do multiple linear regression important, i.e. what are its applications?
- Adjusting for the effect of confounding variables.
- Establishing which variables are important in explaining the values of the response variable and which are just extra noise.
- Predicting values of the response variable.
- Describing the strength of the association between the response variable and the explanatory variables.
How do we visualize multiple regression?
With two explanatory variables the data exist in 3D, and a plane is fitted to pass through the points as closely as possible.
How is the fitted multiple regression model different to the model at the population level?
At the population level the model has parameters; in reality we are unlikely to know their actual values, so instead we estimate them from the data and put hats on the symbols (e.g. β̂0, β̂1) to denote estimates.
How do we measure the overall quality of predictions with our model?
Through the residual sum of squares (RSS), which adds up the squared differences between the model's predictions and the actual responses.
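In symbols, RSS = Σ (yi − ŷi)², summed over all n observations. For the hypothetical fit sketched earlier it can be computed as:

    rss <- sum(residuals(fit)^2)   # squared gaps between actual and fitted y
    rss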
What are the least squares estimates?
The parameter values that minimize the RSS (the discrepancy between the model's predictions and the actual values).
How do you use the residual sum of squares to estimate error variance?
σ̂e² = RSS / (n − k − 1)
n = number of observations; k = number of predictors
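For the hypothetical fit sketched earlier (n = 100, k = 2), the estimate is:

    sigma2_hat <- sum(residuals(fit)^2) / df.residual(fit)   # RSS / (n - k - 1)
    sigma2_hat
    sigma(fit)^2   # the same value, via R's residual standard error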
What happens as you add variables to the multiple linear regression?
- The RSS decreases (you explain more of the variation).
- However, you can reach a point of overfitting, where 'too much' of the data, including its random noise, is being explained.
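To illustrate with the hypothetical fit from earlier: adding a predictor that is pure noise still shrinks the RSS a little, even though it explains nothing real:

    d$noise <- rnorm(n)                       # a variable unrelated to y
    fit2 <- lm(y ~ x1 + x2 + noise, data = d)
    sum(residuals(fit)^2)    # RSS with the genuine predictors
    sum(residuals(fit2)^2)   # slightly smaller, despite 'noise' being useless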
How do you read the output from R to tell which explanatory variables are valuable in the model?
Look at the Estimate column and the rows after the intercept: each row is a different explanatory variable, with its p-value showing whether the variable is important to the model (less than 0.05) or whether it can/should be removed.
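For the hypothetical fit sketched earlier, the table in question comes from summary():

    summary(fit)
    # In the Coefficients table:
    #   Estimate   - the fitted β̂ for each row (intercept first)
    #   Std. Error - the uncertainty of that estimate
    #   t value    - Estimate divided by Std. Error
    #   Pr(>|t|)   - the p-value; below 0.05 suggests the variable is worth keeping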
What happens in R if there is a missing value?
- Observations containing a missing value are omitted from model fitting by default.
- This is a huge problem in lots of data sets.
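A small sketch of this default behaviour, reusing the hypothetical data frame from earlier:

    d2 <- d                                  # copy of the simulated data
    d2$x1[3] <- NA                           # introduce one missing value
    fit_na <- lm(y ~ x1 + x2, data = d2)     # the incomplete row is dropped
    nobs(fit_na)                             # 99 observations used instead of 100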
Substitute the parameter estimates from slide 747 into the multiple linear regression equation in order to make a prediction for Y.
Answers are on the slide.
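The slide's specific estimates are not reproduced here, but for the hypothetical fit from earlier the mechanics look like this (predictor values chosen arbitrarily):

    b <- coef(fit)
    b[1] + b[2] * 0.5 + b[3] * (-1)                        # β̂0 + β̂1(0.5) + β̂2(-1) by hand
    predict(fit, newdata = data.frame(x1 = 0.5, x2 = -1))  # the same prediction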
How do you interpret the coefficients for multiple regression?
-The intercept β0 is the predicted value of the response when all explanatory variables are zero.
-Other coefficients are specific to the associated explanatory variable.
For example, β2 is the change in the mean response when variable x2 is increased by one unit and all other explanatory variables remain unchanged.
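For instance (numbers hypothetical): if β̂2 = −0.8, increasing x2 by one unit while holding every other explanatory variable fixed changes the predicted mean response by −0.8 units.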
Interpret β̂1 for the example on slide 752?
The interpretation of β̂1 is that male students are estimated to be 15.08 cm taller than female students on average, having adjusted for father's height and age (all other variables remain the same).
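The slide's data are not shown here, but a self-contained R sketch (simulated data; all variable names are hypothetical) of how such a coefficient arises for a binary explanatory variable:

    # Hypothetical student data: height explained by sex, father's height and age
    set.seed(2)
    n <- 200
    students <- data.frame(
      sex           = factor(sample(c("Female", "Male"), n, replace = TRUE)),
      father_height = rnorm(n, mean = 175, sd = 7),
      age           = sample(18:25, n, replace = TRUE)
    )
    students$height <- 110 + 15 * (students$sex == "Male") +
      0.3 * students$father_height + 0.5 * students$age + rnorm(n, sd = 5)

    # R codes the factor as a 0/1 dummy, named sexMale here
    fit_h <- lm(height ~ sex + father_height + age, data = students)
    coef(fit_h)["sexMale"]   # estimated male-female height gap, adjusting
                             # for father's height and age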