Model Selection in Multiple Regression Flashcards
What is an underfitted model?
A model that is too simple to capture the genuine relationships in the data
What is the danger with underfitted and overfitted models?
Poor predictive abilities
What type of variables do we want to include in our model?
- Variables that have a genuine relationship with the response
- Variables that offer a sufficient amount of new information about the response
What happens when collinear variables are fitted together in a model?
- The resulting model is unstable
- The coefficient estimates for these variables often have inflated standard errors
What can be used to detect Collinearity?
Variance Inflation Factors (VIFs)
Write the equation for VIFs
VIF_p = 1/(1 - R_p^2), where R_p^2 is the R^2 obtained by regressing predictor p on the other predictors
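The VIF formula can be checked by hand in the simplest case. The sketch below assumes a model with only two predictors, where the R^2 from regressing one predictor on the other equals their squared Pearson correlation. The function names and data values are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2), valid as written only for the two-predictor case."""
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)

# Hypothetical data: x2_collinear tracks x1 closely, x2_unrelated does not.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_collinear = [1.1, 1.9, 3.2, 3.9, 5.1]
x2_unrelated = [2.0, -1.0, -2.0, -1.0, 2.0]

vif_high = vif_two_predictors(x1, x2_collinear)
vif_low = vif_two_predictors(x1, x2_unrelated)
print(f"VIF with a near-copy of x1:       {vif_high:.1f}")
print(f"VIF with an uncorrelated partner: {vif_low:.1f}")
```

A VIF near 1 means the predictor shares almost no information with the other one. A large VIF means its standard error is being inflated by collinearity.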
What R command gives the p values for all predictors assuming that the term is the last in the model?
Anova() from the car package (not base R's anova(), which tests terms sequentially)
What can nested and non-nested models be compared with?
Information-based fit criteria
Give examples of information-based fit criteria
AIC or BIC
Describe Occam’s Razor in this context
When comparing models of equal explanatory power, one should choose the simplest
What is the AIC statistic?
A measure of fit that is penalized for the number of parameters estimated in the model: AIC = -2(log-likelihood) + 2P
What does a smaller AIC value signal?
A better model
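As a rough illustration of how a smaller AIC picks out the preferred model, the sketch below uses the standard definition AIC = -2(log-likelihood) + 2P. The two candidate models, their log-likelihoods and their parameter counts are made-up values, not results from any real fit.

```python
def aic(log_likelihood, n_params):
    """Standard AIC: -2 * log-likelihood + 2 * number of estimated parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
# The larger model fits slightly better but uses three more parameters.
candidates = {
    "small": (-120.0, 3),
    "large": (-119.5, 6),
}

scores = {name: aic(ll, p) for name, (ll, p) in candidates.items()}
best = min(scores, key=scores.get)  # smallest AIC is preferred

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("Preferred model:", best)
```

Here the small gain in fit from the larger model does not offset its extra parameter penalty, so the simpler model has the lower AIC.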
When would you use AICc over AIC?
When the sample size is not a great deal larger than the number of parameters
Give the formula for AICc
AICc = AIC + 2P(P+1)/(N-P-1)
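The small-sample correction is easy to compute directly. This is a minimal sketch of the AICc formula above, with AIC taken as -2(log-likelihood) + 2P. The function name and the example numbers are hypothetical.

```python
def aicc(log_likelihood, n_params, n_obs):
    """AICc = AIC + 2P(P+1) / (N - P - 1), with AIC = -2 logL + 2P."""
    denominator = n_obs - n_params - 1
    if denominator <= 0:
        raise ValueError("AICc requires N > P + 1")
    aic_value = -2.0 * log_likelihood + 2.0 * n_params
    correction = 2.0 * n_params * (n_params + 1) / denominator
    return aic_value + correction

# Hypothetical fit: log-likelihood of -50 with P = 3 parameters (AIC = 106).
small_sample = aicc(-50.0, 3, 20)     # N is not much larger than P
large_sample = aicc(-50.0, 3, 2000)   # N is far larger than P

print(f"AICc with N = 20:   {small_sample:.3f}")
print(f"AICc with N = 2000: {large_sample:.3f}")
```

The correction term shrinks towards zero as N grows, so AICc and AIC agree closely once the sample is large relative to the number of parameters.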
Why does the BIC score differ from the AIC?
BIC employs a penalty that changes with the sample size (N): log(N) per parameter, rather than AIC's fixed 2 per parameter
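To make the difference concrete, the sketch below compares the two penalty terms, assuming the standard definitions (2P for AIC, log(N) * P for BIC). The function names and the choice of P = 4 are illustrative only.

```python
from math import log

def aic_penalty(n_params):
    """AIC penalty: 2 per estimated parameter, regardless of sample size."""
    return 2.0 * n_params

def bic_penalty(n_params, n_obs):
    """BIC penalty: log(N) per estimated parameter."""
    return log(n_obs) * n_params

# Compare the penalties for a hypothetical model with 4 parameters.
n_params = 4
for n_obs in (5, 8, 100, 10000):
    print(
        f"N = {n_obs:>5}: AIC penalty = {aic_penalty(n_params):.2f}, "
        f"BIC penalty = {bic_penalty(n_params, n_obs):.2f}"
    )
```

Because log(N) exceeds 2 once N is 8 or more, BIC penalizes extra parameters more heavily than AIC for all but very small samples, and so tends to favour simpler models.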