GLM model selection Flashcards
Model selection criteria
- Parsimony
- Accuracy
In case of misspecification errors:
- Excluding relevant regressors → bias in β^
- Including irrelevant regressors → increased variability in β^
Model selection in nested models
- σ2 known: lr = (SSEB - SSEA) / σ2 ~ χ2q under model B, with q = |A| - |B|
- σ2 unknown: F = (n-|A|)/q * (SSEB - SSEA) / SSEA = (n-|A|)/q * (R2A - R2B) / (1 - R2A) ~ Fq, n-|A| under model B
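The σ2-unknown case above can be sketched numerically. A minimal numpy sketch (Python here, though the deck's software examples are in R), with made-up data; model B (intercept + x1) is nested in model A (intercept + x1 + x2):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1 + 2 * x1 + rng.normal(size=n)  # x2 is irrelevant by construction

def sse(X, y):
    # Residual sum of squares from an OLS fit
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

XB = np.column_stack([np.ones(n), x1])       # smaller model B
XA = np.column_stack([np.ones(n), x1, x2])   # larger model A (nests B)
sse_A, sse_B = sse(XA, y), sse(XB, y)
q = XA.shape[1] - XB.shape[1]                # q = |A| - |B|
F = (n - XA.shape[1]) / q * (sse_B - sse_A) / sse_A
```

Under model B, F would be compared to the Fq, n-|A| quantiles (e.g. via `scipy.stats.f`).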
Model selection in non-nested models
- Adjusted R2
- Cross validation (CV)
- Akaike Information Criterion (AIC)
- Bayesian Information Criterion (BIC)
Adjusted R2
Adj. R2 = 1 - (n-1)/(n-p) (1-R2)
It can be less than 0 if many irrelevant regressors are added; it ranks models the same way as R2 when |A| = |B|.
BEST = MAX
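The adjustment above can be checked with a minimal numpy sketch (Python here, though the deck's software examples are in R); data and model are made up, and p counts all columns of X including the intercept:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
x = rng.normal(size=n)
y = 3 + 0.5 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
p = X.shape[1]  # regressors including the intercept

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
adj_r2 = 1 - (n - 1) / (n - p) * (1 - r2)  # penalizes extra regressors
```

Since (n-1)/(n-p) ≥ 1 whenever p ≥ 1, the adjusted value never exceeds R2.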
Cross validation+deleted residuals
CV = PRESS/n = Σi (yi - y^i(M,-i))2 / n
Deleted residuals: yi - y^i(M,-i) = ε^i / (1 - hiiM)
hiiM is the i-th diagonal element of the hat matrix H = X(X'X)-1X'
BEST = MIN
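The deleted-residual shortcut means PRESS needs no refitting; a minimal numpy sketch (Python here, though the deck's software examples are in R) verifies it against explicit leave-one-out refits on made-up data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
x = rng.normal(size=n)
y = 1 + x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta                              # ordinary residuals
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)  # hat-matrix diagonal
press_shortcut = np.sum((e / (1 - h)) ** 2)    # PRESS without refitting

# Explicit leave-one-out check: refit n times, dropping one point each time
press_loo = 0.0
for i in range(n):
    mask = np.arange(n) != i
    b_i, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
    press_loo += (y[i] - X[i] @ b_i) ** 2
```

The two quantities agree up to floating-point error; CV is then PRESS/n.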
Akaike Information Criterion
AIC = -2l(θ^) + 2(p+1) =GLM= n ln(σ^2ML) + 2(p+1)
We choose the model whose fitted density is closest (in Kullback-Leibler divergence) to the true density f0.
Under regularity assumptions it can select the "best" model even when there is no test to evaluate it.
In GLMs if |A|=|B| we have AIC min ↔ SSE min ↔ R2 max
BEST = MIN
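The n ln(σ^2ML) form can be sketched directly. A minimal numpy sketch (Python here, though the deck's software examples are in R) on made-up data, comparing an intercept-only model with one that includes the true regressor:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60
x = rng.normal(size=n)
y = 2 - x + rng.normal(size=n)

def aic(X, y):
    # AIC in the n*ln(sigma^2_ML) + 2(p+1) form (constants dropped)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2_ml = resid @ resid / len(y)  # ML variance estimate
    return len(y) * np.log(sigma2_ml) + 2 * (X.shape[1] + 1)

X_null = np.ones((n, 1))                 # intercept only
X_full = np.column_stack([np.ones(n), x])  # intercept + true regressor
aic_null, aic_full = aic(X_null, y), aic(X_full, y)
```

With a strong signal the full model's SSE drop dwarfs the +2 penalty, so aic_full < aic_null (BEST = MIN).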
Bayesian Information Criterion
BIC = -2l(θ^) + ln(n)(|M|+1) =GLM= n ln(σ^2ML) + ln(n)(|M|+1)
Puts more weight on complexity than AIC; under regularity assumptions it can select the "best" model even when there is no test to evaluate it.
In GLMs if |A|=|B| we have BIC min ↔ SSE min ↔ R2 max
BEST = MIN
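The heavier complexity weight is visible by computing both criteria on the same fit: once n > e^2 ≈ 7.4, ln(n) > 2, so the BIC penalty per parameter exceeds AIC's. A minimal numpy sketch (Python here, though the deck's software examples are in R) on made-up data:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
x = rng.normal(size=n)
y = 2 - x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2_ml = resid @ resid / n        # ML variance estimate
m = X.shape[1]                       # |M|: regressors incl. intercept

aic = n * np.log(sigma2_ml) + 2 * (m + 1)
bic = n * np.log(sigma2_ml) + np.log(n) * (m + 1)  # ln(60) ≈ 4.09 > 2
```

Both share the n ln(σ^2ML) term; only the per-parameter penalty differs.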
Nested GLM model selection in R
anova(fitmin, fitmax)
- Res.Df = n - pi for each model i
Non-nested GLM model selection in R
AIC(fit) # full AIC for GLMs; includes the constant n + n ln(2π) that extractAIC() drops
extractAIC(fit)
extractAIC(fit, k=log(nrow(data))) #BIC