10. Model Comparisons Flashcards
What is used to test if multiple variables are significant/add to the model? (Name only)
F-Test for overall model significance
Incremental F-Test for the improvement between nested models
How does an F-Test work?
Uses the F-ratio (the ratio of explained variance to unexplained variance)
F-ratio is found through the mean sums of squares: mean square model / mean square residual (each sum of squares divided by its df)
F-ratio is then evaluated against an F-distribution with df model and df residual, at a pre-defined alpha
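The steps above can be sketched in Python for a one-predictor regression. The function name and the data values are made up for illustration. The p-value lookup against the F-distribution is left out because it needs a statistics library.

```python
def f_ratio_simple_regression(x, y):
    """Return (F, df_model, df_residual) for y = b0 + b1*x fitted by least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares slope and intercept
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    fitted = [b0 + b1 * xi for xi in x]

    # Sums of squares: explained (model) and unexplained (residual)
    ss_model = sum((fi - mean_y) ** 2 for fi in fitted)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

    # Mean squares = sum of squares / degrees of freedom
    df_model = 1          # one predictor
    df_resid = n - 2      # n minus (intercept + slope)
    ms_model = ss_model / df_model
    ms_resid = ss_resid / df_resid

    return ms_model / ms_resid, df_model, df_resid


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
f, df_m, df_r = f_ratio_simple_regression(x, y)
print(f"F({df_m}, {df_r}) = {f:.2f}")
```

The printed F would then be compared with the critical value of an F-distribution with those two df at the chosen alpha.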
How do we know through the F-test if our model is significant or not?
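Bigger F-ratio = More evidence the model is significant, as model variance is larger relative to residual variance
F-ratio = Expected to be close to 1 if the null hypothesis is true
F > 1 = Model variance exceeds residual variance; the model is significant when F exceeds the critical value (p < alpha)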
What is an incremental F-test?
The incremental F-test evaluates the statistical significance of the improvement in variance explained in an outcome with the addition of further predictor(s)
How does the incremental F-test work?
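Based on the difference in residual sums of squares between nested models, relative to the residual variance of the larger model
The simplest comparison model is a null or empty model: a linear model with only the intercept, where the predicted value of the outcome is the mean of the outcome (“the least wrong estimate”) and every beta = 0
For every predictor added, model DF increases by 1 and residual DF decreases by 1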
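The calculation behind an incremental F-test can be sketched as follows, assuming the residual sum of squares (RSS) and residual df of both nested models are already known. The function name and the numbers are hypothetical, chosen only to show the arithmetic.

```python
def incremental_f(rss_reduced, df_resid_reduced, rss_full, df_resid_full):
    """Return (F, df_change, df_resid_full) for two nested linear models.

    F = [(RSS_reduced - RSS_full) / df_change] / [RSS_full / df_resid_full]
    """
    # Number of predictors added to the reduced model
    df_change = df_resid_reduced - df_resid_full
    if df_change <= 0:
        raise ValueError("the full model must use more parameters than the reduced model")

    # Drop in unexplained variation per added predictor
    ms_change = (rss_reduced - rss_full) / df_change
    # Residual variance of the larger (full) model
    ms_resid_full = rss_full / df_resid_full

    return ms_change / ms_resid_full, df_change, df_resid_full


# Hypothetical example: n = 50, reduced model has 1 predictor (residual df 48),
# full model has 3 predictors (residual df 46)
f, df1, df2 = incremental_f(rss_reduced=120.0, df_resid_reduced=48,
                            rss_full=92.0, df_resid_full=46)
print(f"F({df1}, {df2}) = {f:.2f}")
```

When the reduced model is the null model, this reduces to the F-test for overall model significance.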
How do you interpret an incremental F-test from anova?
Compare the p-value for the F-test to your significance level. If the p-value is less than the significance level, your sample data provide sufficient evidence to conclude that the model with the additional predictor(s) fits the data better than the reduced model (the reduced model is the model with no independent variables when the comparison is against the null model).
What is a nested model?
Predictors in one model are a subset of the predictors in the other
What is a non-nested model?
Each model contains at least one predictor that the other does not
What can you use to compare models if they are based on the same data set and are nested?
Incremental F-Test
AIC
BIC
What can you use to compare models if they are based on the same data set and are non-nested?
AIC
BIC
Can we compare models that are not from the same data set?
No
What are parsimony corrections?
They penalise models for being complex = Helps avoid overfitting (adding predictors arbitrarily to improve fit)
What are AIC and BIC parsimony corrections?
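BIC has a harsher penalty than AIC for typical sample sizes
BIC penalises each parameter by ln(n) rather than AIC's 2, so its parsimony penalty is more severe whenever ln(n) > 2 (n of 8 or more)
BIC = Difference of 10 shows that one model is better than another (lower BIC = better)
No cut-offs for AIC in establishing a substantial difference (lower AIC = better)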
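The two penalties can be compared with a small sketch. It assumes a linear model with normal errors and uses the common RSS-based forms AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n), which drop an additive constant, so only differences between models fitted to the same data are meaningful. Statistical software may report different absolute values. The function name and the numbers are hypothetical.

```python
import math


def aic_bic(rss, n, k):
    """Return (AIC, BIC) up to an additive constant.

    rss: residual sum of squares, n: sample size,
    k: number of estimated coefficients (including the intercept).
    """
    fit_term = n * math.log(rss / n)      # shared goodness-of-fit part
    aic = fit_term + 2 * k                # AIC penalty: 2 per parameter
    bic = fit_term + k * math.log(n)      # BIC penalty: ln(n) per parameter
    return aic, bic


# Hypothetical example: same data (n = 50), small model vs. larger model
n = 50
aic_small, bic_small = aic_bic(rss=120.0, n=n, k=2)
aic_big, bic_big = aic_bic(rss=92.0, n=n, k=4)

# Negative differences favour the larger model; BIC favours it less strongly
print("AIC difference (big - small):", round(aic_big - aic_small, 2))
print("BIC difference (big - small):", round(bic_big - bic_small, 2))

# BIC's per-parameter penalty exceeds AIC's once ln(n) > 2
print("ln(7) > 2:", math.log(7) > 2, "| ln(8) > 2:", math.log(8) > 2)
```

Both models share the same fit term formula, so the gap between the AIC and BIC differences comes entirely from the extra parameters being penalised at ln(n) instead of 2.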