W8&9 - Multiple Linear Regression Flashcards
What does simple linear regression do?
Quantifies the variance (R^2) in the DV that can be explained by the variance in the IV.
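A minimal sketch of this idea in Python with statsmodels; the data and variable names below are invented for illustration and are not from the course.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical IV (x) and DV (y), invented purely for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(size=100)

# Fit DV ~ IV; add_constant() supplies the intercept term
model = sm.OLS(y, sm.add_constant(x)).fit()
print(model.rsquared)  # R^2: proportion of variance in the DV explained by the IV
```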
What does multiple linear regression test?
Which combination of several IVs explains the variance in the DV.
Can IVs overlap in multiple linear regression?
YES
They can be correlated with one another and still jointly explain the variance in the DV.
What is unexplained variance known as, especially when looking at a graph?
Residuals
How many people should you have per variable in the final model for multiple linear regression and why?
At least 10 per variable (e.g. a final model with 4 IVs needs at least 40 participants).
Otherwise the regression coefficients can become unreliable.
What is the forced entry approach in SPSS with regard to multiple linear regression?
Variables can still be in the model even if they’re not significant.
List the types of multiple regression model building processes
Stepwise/forward
Hierarchical
Forced entry
Define stepwise
Data driven + SPSS selects which variables are entered
Define hierarchical
Researcher decides the order in which variables are entered
Define forced entry
All predictors are entered into 1 model simultaneously
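A hedged sketch of forced entry outside SPSS, using statsmodels’ formula interface; the dataset and column names (outcome, age, stress, sleep) are made up. All predictors go into one model at once and stay there regardless of significance.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: one DV ("outcome") and three IVs (all names invented)
rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "stress": rng.normal(0, 1, n),
    "sleep": rng.normal(7, 1, n),
})
df["outcome"] = 0.5 * df["stress"] - 0.3 * df["sleep"] + rng.normal(0, 1, n)

# Forced entry: every predictor is entered simultaneously and remains in the
# model whether or not its coefficient turns out to be significant
model = smf.ols("outcome ~ age + stress + sleep", data=df).fit()
print(model.summary())
```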
Which of the multiple regression model-building processes doesn’t determine the unique variance that each IV adds to the model?
Forced entry
What does the stepwise approach identify first?
The IV that explains the most variance in the DV; this is entered at step 1 of the model.
What does the stepwise approach do once it has identified the IV that explains most of the variance in the DV?
Looks for the IV that explains the most of the remaining unexplained variance; it is included in the model provided it explains a SIG amount of that remaining variance.
This is repeated until there are no IVs left that explain further variance w/ a p<0.05.
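A minimal forward-selection sketch that mimics the stepwise logic described above (add the candidate IV with the most explanatory power, keep it only if p < 0.05, repeat). It is an illustration with invented column names, not SPSS’s exact algorithm.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def forward_stepwise(df, dv, ivs, alpha=0.05):
    """Greedy forward selection: at each step add the candidate IV with the
    smallest p-value, provided that p-value is below alpha."""
    selected, remaining = [], list(ivs)
    while remaining:
        pvals = {}
        for iv in remaining:
            formula = f"{dv} ~ " + " + ".join(selected + [iv])
            pvals[iv] = smf.ols(formula, data=df).fit().pvalues[iv]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break  # no remaining IV explains a SIG amount of further variance
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical usage with made-up variables
rng = np.random.default_rng(2)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["x1", "x2", "x3"])
df["y"] = 1.5 * df["x1"] + 0.8 * df["x2"] + rng.normal(size=200)
print(forward_stepwise(df, "y", ["x1", "x2", "x3"]))  # likely ['x1', 'x2']
```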
When is stepwise used?
When data driven
Not theory driven
And when there are so many variables that you couldn’t possibly know which ones are most predictive.
Define partial correlation
How strongly each variable correlates with what remains of the DV once the variation that the other variables can explain has been removed.
What does a higher partial correlation do to the p value?
Makes it smaller
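One way to see what a partial correlation is doing, sketched by hand: regress both the IV of interest and the DV on the other variables, then correlate the two sets of residuals. The column names below are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def partial_corr(df, x, y, controls):
    """Correlation between x and y after removing the variation that the
    control variables can explain in each of them (correlating residuals)."""
    Z = sm.add_constant(df[controls])
    res_x = sm.OLS(df[x], Z).fit().resid
    res_y = sm.OLS(df[y], Z).fit().resid
    return np.corrcoef(res_x, res_y)[0, 1]

# Hypothetical usage
rng = np.random.default_rng(3)
df = pd.DataFrame(rng.normal(size=(150, 3)), columns=["iv1", "iv2", "dv"])
print(partial_corr(df, "iv1", "dv", controls=["iv2"]))
```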
Hierarchical regression analysis
Researcher decides the order in which variables are entered
- Order should be based on previous research or a plausible theory.
What is normally entered in step 1 of the hierarchical regression analysis?
Known confounders
e.g. age, gender + ethnicity
What is normally entered in step 2 of the hierarchical regression analysis?
Known predictors
What is normally entered in step 3 of the hierarchical regression analysis?
Test variables
When would a hierarchical regression analysis be used over stepwise or forced entry?
When a ‘new’ variable of interest needs to be tested to see whether it explains further variation in the DV.
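A hedged sketch of hierarchical entry outside SPSS: fit the steps in the researcher-chosen order (confounders, then known predictors, then the test variable) and inspect the change in R^2 at each step. The dataset and names are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: confounders (age, gender), a known predictor (stress)
# and a 'new' test variable (screen_time); "wellbeing" is the DV
rng = np.random.default_rng(4)
n = 300
df = pd.DataFrame({
    "age": rng.normal(40, 10, n),
    "gender": rng.integers(0, 2, n),
    "stress": rng.normal(0, 1, n),
    "screen_time": rng.normal(4, 1, n),
})
df["wellbeing"] = -0.4 * df["stress"] - 0.2 * df["screen_time"] + rng.normal(0, 1, n)

steps = [
    "wellbeing ~ age + gender",                         # step 1: confounders
    "wellbeing ~ age + gender + stress",                # step 2: known predictors
    "wellbeing ~ age + gender + stress + screen_time",  # step 3: test variable
]
prev_r2 = 0.0
for i, formula in enumerate(steps, start=1):
    fit = smf.ols(formula, data=df).fit()
    print(f"Step {i}: R^2 = {fit.rsquared:.3f}, change in R^2 = {fit.rsquared - prev_r2:.3f}")
    prev_r2 = fit.rsquared
```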
List the assumptions for multiple regression analysis
No multi-collinearity between predictors
Homoscedasticity of residuals
Linearity of residuals
Normality of residuals
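A quick sketch of how the residual assumptions might be checked in Python; the model below is fitted on invented data just so there are residuals to inspect, and these are common diagnostic tests rather than necessarily the ones used in the course.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan

# Hypothetical fitted model, only so that we have residuals to look at
rng = np.random.default_rng(5)
X = sm.add_constant(rng.normal(size=(200, 3)))
y = X @ np.array([1.0, 0.5, -0.3, 0.2]) + rng.normal(size=200)
model = sm.OLS(y, X).fit()
resid = model.resid

# Normality of residuals: Shapiro-Wilk (p > 0.05 suggests normality is plausible)
print(stats.shapiro(resid))

# Homoscedasticity of residuals: Breusch-Pagan test against the predictors
lm_stat, lm_p, f_stat, f_p = het_breuschpagan(resid, X)
print(lm_p)  # p > 0.05 suggests constant residual variance is plausible

# Linearity is usually eyeballed on a plot of residuals vs fitted values
# (e.g. plt.scatter(model.fittedvalues, resid) with matplotlib)
```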
When does multi-collinearity occur?
When at least 2 predictors are highly correlated w/ each other in the final model.
What can multi-collinearity lead to?
Unreliable regression b-coefficients for the predictors.
What does the VIF (variance inflation factor) tell us?
How much the SE of the b-coefficients has been inflated.
We don’t want this, as it widens the CI, meaning there is less chance of showing the coefficient to be stat sig different from 0.
What if VIF > 5 (r > 0.90)?
Maybe re-run regression after removing 1 of the ‘highly correlated’ variables (the least sig)
What if VIF < 3.33?
Assume there is no or low multi-collinearity
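A small sketch of computing VIFs with statsmodels (variable names are made up; x2 is deliberately built to be almost collinear with x1 so one VIF comes out large, matching the rule of thumb above).

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical predictors; x2 is constructed to be highly correlated with x1
rng = np.random.default_rng(6)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# VIF per predictor (the constant column is skipped)
for i, name in enumerate(X.columns):
    if name == "const":
        continue
    print(name, round(variance_inflation_factor(X.values, i), 2))
```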