Multiple Linear Regression Flashcards by Tammy Okhiria

Write the regression equation.

y = b(0) + b(1)x + e

Why will the estimated value of y not be perfect?

Because there are residuals, or errors, which is what ‘e’ stands for.

What is a regression line used for?

To make predictions about the value of the dependent variable based on a value of the predictor.
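
As a minimal sketch of how a regression line is used for prediction, the snippet below fits y = b(0) + b(1)x by least squares and predicts y for a new x. The data (`hours`, `score`) and the helper name `fit_line` are made up for illustration and are not part of the original cards.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # slope: change in y per one-unit change in x
    b0 = y_bar - b1 * x_bar   # intercept: predicted y when x = 0
    return b0, b1

# Hypothetical data: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

b0, b1 = fit_line(hours, score)
predicted = b0 + b1 * 6  # prediction for a new value of the predictor

# Residuals (the 'e' term): observed y minus predicted y for each point.
residuals = [y - (b0 + b1 * x) for x, y in zip(hours, score)]
print(b0, b1, predicted)
```

The residuals are generally non-zero, which is why the predictions are not perfect; with an intercept in the model they sum to zero.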

What happens to R^2 when more variables are added?

R^2 never decreases, and almost always increases, even when the new variables have no predictive power.
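
The effect can be checked numerically. The sketch below compares R^2 for a one-predictor model with R^2 after adding an unrelated second predictor. It assumes the standard identity for R^2 with two predictors expressed through pairwise correlations; the data and the names `corr`, `r2_one`, and `r2_two` are hypothetical.

```python
from math import sqrt

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / sqrt(saa * sbb)

def r2_one(y, x):
    """R^2 of a simple regression equals the squared correlation."""
    return corr(y, x) ** 2

def r2_two(y, x1, x2):
    """R^2 of a two-predictor regression, from pairwise correlations."""
    r1, r2, r12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)

# Hypothetical data: x1 drives y, 'noise' is an unrelated column.
x1 = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
noise = [3, 1, 4, 1, 5, 9]

r2_small = r2_one(y, x1)
r2_big = r2_two(y, x1, noise)
print(r2_small, r2_big)  # the second value is never smaller than the first
```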

How do we know if two models are nested?

If one model contains all the terms of the other, and at least one additional term.
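
A small sketch of this check, treating each model as a list of term names: the smaller model is nested in the larger one when its terms form a proper subset. The term names and the function `is_nested` are illustrative assumptions.

```python
def is_nested(smaller_terms, larger_terms):
    """True if the larger model has every term of the smaller one plus at least one more."""
    return set(smaller_terms) < set(larger_terms)  # strict (proper) subset

# Hypothetical models described by their predictor names.
model_a = ["age", "income"]
model_b = ["age", "income", "education"]
model_c = ["age", "height"]

print(is_nested(model_a, model_b))  # model_b adds "education" to model_a
print(is_nested(model_a, model_c))  # model_c lacks "income", so not nested
```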

What is b(0)?

The intercept: the predicted value of the y variable when the x variable = 0.

What is b(1)?

The slope: the amount of change in variable y for a one-unit change in variable x.

Write the multiple regression equation.

y = b(0) + b(1)x(1) + b(2)x(2) + … + b(n)x(n) + e

What is a fully specified model?

A model in which we have accounted for all factors that determine variation in the dependent variable (y).

Why can’t we usually have a fully specified model?

We cannot measure all the factors that affect y.

What is the relationship between the t and p values?

As the absolute t value increases, the p value decreases.

What is the t value for significance at the 0.05 level?

+/- 1.96 (two-tailed, for large samples)
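
The link between t and p can be sketched numerically. The snippet below uses the normal approximation to the t distribution, which is an assumption that is reasonable only for large samples (an exact p value would use the t distribution with the model's degrees of freedom). The function name `two_sided_p` is hypothetical.

```python
from math import erfc, sqrt

def two_sided_p(t):
    """Two-sided p value for a test statistic t, using the normal approximation."""
    return erfc(abs(t) / sqrt(2))

# Larger |t| gives a smaller p value; |t| = 1.96 sits at roughly p = 0.05.
for t in (1.0, 1.96, 3.0):
    print(t, two_sided_p(t))
```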

What does rejecting the null hypothesis mean?

Our relationship is not likely to have occurred by chance
Our relationship is likely to be reflected in the population

What do we do when we want to add a categorical variable such as sex to the model?

We create a variable that takes the values “0” and “1” for men and women, respectively.

What do we do if there are multiple categories?

We create multiple dummy variables, where one category is the reference and is left out of the model.
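
A minimal sketch of this coding scheme: each non-reference category gets its own 0/1 column, and the reference category is the row where all of those columns are 0. The variable `region`, its levels, and the helper `dummy_code` are made up for illustration.

```python
def dummy_code(values, reference):
    """Return one dict of 0/1 dummy variables per observation, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return [{f"is_{lvl}": int(v == lvl) for lvl in levels} for v in values]

# Hypothetical categorical variable with three categories.
region = ["north", "south", "east", "south"]
rows = dummy_code(region, reference="north")

for value, row in zip(region, rows):
    print(value, row)  # "north" rows have every dummy equal to 0
```

With k categories this produces k - 1 dummy variables, so the coefficients are read as differences from the reference category.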

What does R^2 mean?

How much of the variability in the dependent variable is explained by the model. For example, an R-squared value of 0.66 means that 66% of the variance in the y variable is explained by the x variables.

How do we interpret the p value?

If the p value for the overall model (the F-test) is less than the alpha value of 0.05, then we know that our model has at least one significant independent variable.

How does the forward stepwise selection method work?

Begins with no variables and introduces variables one by one.
At each step, adds the variable that increases R^2 the most.
Continues this procedure until none of the remaining variables explains a significant amount of the additional variability in y.
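
The procedure can be sketched in plain Python. This is a simplified illustration, not a statistical package's method: instead of a significance test, it stops when the best remaining variable raises R^2 by less than a fixed threshold (`min_gain`, an assumption). The data and the helper names `solve`, `r_squared`, and `forward_select` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    coef = solve(xtx, xty)  # normal equations: (X'X) b = X'y
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in design]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def forward_select(predictors, y, min_gain=0.01):
    """Add, one at a time, the predictor that raises R^2 the most; stop on a small gain."""
    chosen, current = [], 0.0
    remaining = dict(predictors)
    while remaining:
        gains = {name: r_squared([predictors[c] for c in chosen] + [col], y) - current
                 for name, col in remaining.items()}
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:  # simplified stand-in for a significance test
            break
        chosen.append(best)
        current += gains[best]
        del remaining[best]
    return chosen, current

# Hypothetical data: y depends on x1 and x2 only; x3 is irrelevant.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, -1, 1, -1, 1, -1, 1, -1],
    "x3": [0, 1, 0, 0, 1, 0, 1, 1],
}
y = [3 * a + 2 * b for a, b in zip(predictors["x1"], predictors["x2"])]

chosen, final_r2 = forward_select(predictors, y)
print(chosen, final_r2)
```

A fuller implementation would replace the `min_gain` rule with a test of whether the added variable explains a significant amount of additional variability.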

How does the backwards stepwise selection method work?

Starts with all variables in the model.
Drops the variables that contribute least to R^2.
The process continues until every remaining variable explains a significant proportion of the variability in y.

How do we compare coefficients that are measured in different units?

We standardise the coefficients into beta coefficients.

How do we interpret beta coefficients?

As “a one standard deviation unit increase in x leads to a ___ standard deviation unit increase/decrease in y.”
This way, we can compare continuous IVs to see which has the largest association.
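
As a sketch of why standardising helps, the snippet below computes a beta coefficient for a single predictor as the raw slope multiplied by sd(x) / sd(y). Changing the unit of x changes the raw slope but leaves beta unchanged. The data (`height_cm`, `weight_kg`) and the function names are hypothetical.

```python
from statistics import mean, stdev

def slope(xs, ys):
    """Unstandardised least-squares slope of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def beta(xs, ys):
    """Standardised (beta) coefficient: slope rescaled to standard deviation units."""
    return slope(xs, ys) * stdev(xs) / stdev(ys)

# Hypothetical data: the same heights in centimetres and in metres.
height_cm = [150, 160, 170, 180, 190]
height_m = [h / 100 for h in height_cm]
weight_kg = [52, 58, 69, 75, 86]

print(slope(height_cm, weight_kg), slope(height_m, weight_kg))  # differ by a factor of 100
print(beta(height_cm, weight_kg), beta(height_m, weight_kg))    # the same value
```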

Why should we not standardise categorical variables?

Because a 0/1 dummy variable can only take the values 0 and 1, so a one standard deviation increase has no meaningful interpretation.

What does the relaimpo package do?

Provides measures of relative importance for each of the predictors in the model by entering regression variables in all possible orders, and then averaging the changes in the R^2.
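
The averaging idea can be sketched by hand for the two-predictor case, where there are only two possible orders of entry. This is not the relaimpo API (relaimpo is an R package); it is a hand-rolled Python illustration that assumes the standard identity for two-predictor R^2 in terms of pairwise correlations. The data and function names are hypothetical.

```python
from math import sqrt

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / sqrt(saa * sbb)

def importance_two_predictors(y, x1, x2):
    """Average each predictor's R^2 gain over both orders of entry."""
    r1, r2, r12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    r2_full = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    # x1 entered first gains r1^2; entered second it gains r2_full - r2^2.
    share_x1 = 0.5 * (r1 ** 2 + (r2_full - r2 ** 2))
    share_x2 = 0.5 * (r2 ** 2 + (r2_full - r1 ** 2))
    return share_x1, share_x2, r2_full

# Hypothetical data with two correlated predictors.
y = [3, 5, 4, 8, 9, 11]
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]

share_x1, share_x2, r2_full = importance_two_predictors(y, x1, x2)
print(share_x1, share_x2, r2_full)  # the two shares add up to the full R^2
```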

What does adjusted R^2 do?

The adjusted R^2 adjusts for the number of variables we have included in our model, so it does not automatically increase when more variables are added.
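
A short sketch of the usual adjustment formula, 1 - (1 - R^2)(n - 1) / (n - p - 1), where n is the number of observations and p the number of predictors. The input values below are made up for illustration, and the function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same raw R^2 and sample size, different numbers of predictors (hypothetical values).
few = adjusted_r2(0.66, n=50, p=3)
many = adjusted_r2(0.66, n=50, p=10)
print(few, many)  # the model with more predictors is penalised more
```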

How do we control for multicollinearity through plots?

Observe the correlation between the predictors in the model by plotting them against each other.
If the correlation between any two predictors is strong, then one of them needs to be removed from the model.

What is Variance Inflation Factor (VIF)?

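A measure of how much the variance of a coefficient estimate is inflated by correlation among the predictors: VIF = 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the other predictors.
A VIF larger than 5 or 10 indicates serious problems with collinearity.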
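
As a sketch, VIF for a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on the others. With exactly two predictors that R^2 is the squared correlation between them, which is the simplified case shown here. The data and function names are hypothetical.

```python
from math import sqrt

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / sqrt(saa * sbb)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    return 1 / (1 - corr(x1, x2) ** 2)

# Hypothetical predictors: x2 nearly duplicates x1, x3 is uncorrelated with x1.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4.1, 5.9, 8.2, 9.9]
x3 = [2, -1, -2, -1, 2]

print(vif_two_predictors(x1, x2))  # far above the 5 or 10 rule of thumb
print(vif_two_predictors(x1, x3))  # no inflation when the predictors are uncorrelated
```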

Why is multicollinearity a problem?

If a predictor is strongly related to some other input, then we are simply adding redundant information to the model.
It can be difficult to separate the effects of the multicollinear predictors.

Multiple Linear Regression Flashcards

(27 cards)