GLM model selection Flashcards

1
Q

Model selection criteria

A
  • Parsimony
  • Accuracy
    In case of misspecification errors:
  • Excluding relevant regressors = bias in β̂
  • Including irrelevant regressors = higher variance of β̂
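The bias from excluding relevant regressors can be made explicit with a short derivation. The partition of the regressors into $X_1$ (kept) and $X_2$ (wrongly omitted) is notation assumed here for illustration; it does not appear on the card.

```latex
% Assumed true model: y = X_1 \beta_1 + X_2 \beta_2 + \varepsilon, with E[\varepsilon] = 0
% Fitted model uses X_1 only:
\hat{\beta}_1 = (X_1'X_1)^{-1} X_1' y
\quad\Longrightarrow\quad
E[\hat{\beta}_1] = \beta_1 + (X_1'X_1)^{-1} X_1' X_2 \, \beta_2
```

The second term is the bias. It vanishes only when $\beta_2 = 0$ (the omitted regressors were irrelevant after all) or $X_1'X_2 = 0$ (they are orthogonal to the included ones).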
2
Q

Model selection in nested models

A
  • σ² known: $lr = (SSE_B - SSE_A)/\sigma^2 \sim \chi^2_q$ under model B, with $q = |A| - |B|$
  • σ² unknown: $F = \frac{n-|A|}{q} \cdot \frac{SSE_B - SSE_A}{SSE_A} = \frac{n-|A|}{q} \cdot \frac{R^2_A - R^2_B}{1 - R^2_A} \sim F_{q,\,n-|A|}$ under model B
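As a rough illustration of the σ²-unknown case, the sketch below computes the F statistic both from the SSEs and from the R² values and checks that the two forms agree. The function names and all numbers are made up for the example; model B is the smaller (restricted) model and model A the larger one.

```python
def nested_f_from_sse(sse_b, sse_a, n, size_a, q):
    """F statistic for restricted model B against full model A, from the SSEs.

    size_a is |A| (number of coefficients in A); q = |A| - |B|.
    """
    return (n - size_a) / q * (sse_b - sse_a) / sse_a


def nested_f_from_r2(r2_b, r2_a, n, size_a, q):
    """Same statistic written with R^2; valid because both models share one SST."""
    return (n - size_a) / q * (r2_a - r2_b) / (1.0 - r2_a)


# Made-up values, for illustration only.
sst = 200.0                  # total sum of squares, common to both models
sse_b, sse_a = 120.0, 100.0  # restricted model B, full model A
n, size_a, q = 50, 5, 2

f_sse = nested_f_from_sse(sse_b, sse_a, n, size_a, q)
f_r2 = nested_f_from_r2(1 - sse_b / sst, 1 - sse_a / sst, n, size_a, q)
print(f_sse, f_r2)
```

The resulting value would then be compared with a quantile of the F distribution with q and n - |A| degrees of freedom.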
3
Q

Model selection in non-nested models

A
  • Adjusted R²
  • Cross validation (CV)
  • Akaike Information Criterion (AIC)
  • Bayesian Information Criterion (BIC)
4
Q

Adjusted R²

A

$\bar{R}^2 = 1 - \frac{n-1}{n-p}(1 - R^2)$
It can be negative when many irrelevant regressors are added. It ranks models in the same order as R² when |A| = |B|.
BEST = MAX
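A small sketch of the formula, using made-up R² values, shows both properties: the adjusted value can drop below zero for a weak fit with many regressors, and for models of equal size it preserves the R² ordering. The function name is hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p coefficients fitted on n observations."""
    return 1.0 - (n - 1) / (n - p) * (1.0 - r2)


# Weak fit with many regressors (made-up numbers): the penalty dominates.
weak_many = adjusted_r2(0.05, n=30, p=10)

# Two models of the same size: the ordering follows plain R^2.
same_size_hi = adjusted_r2(0.60, n=30, p=4)
same_size_lo = adjusted_r2(0.55, n=30, p=4)

print(weak_many, same_size_hi, same_size_lo)
```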

5
Q

Cross-validation and deleted residuals

A

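$CV = PRESS = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i^{M,-i})^2$, where $\hat{y}_i^{M,-i}$ is the prediction for observation i from model M fitted without observation i
Deleted residuals: $y_i - \hat{y}_i^{M,-i} = \hat{\varepsilon}_i / (1 - h_{ii}^M)$
$h_{ii}^M$ is the i-th diagonal element of the hat matrix $H = X(X'X)^{-1}X'$ of model M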
BEST = MIN
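The deleted-residual identity means leave-one-out CV needs only one fit. The sketch below checks this on a simple linear regression with made-up data, using the closed-form leverage for that special case, $h_{ii} = 1/n + (x_i - \bar{x})^2 / S_{xx}$. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def cv_by_refitting(xs, ys):
    """Leave-one-out CV the slow way: refit n times, each without observation i."""
    total = 0.0
    for i in range(len(xs)):
        b0, b1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (b0 + b1 * xs[i])) ** 2
    return total / len(xs)


def cv_by_leverage(xs, ys):
    """Same quantity from a single fit, using e_i / (1 - h_ii)."""
    n = len(xs)
    b0, b1 = fit_line(xs, ys)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    total = 0.0
    for x, y in zip(xs, ys):
        h_ii = 1.0 / n + (x - x_bar) ** 2 / sxx  # leverage in simple regression
        total += ((y - (b0 + b1 * x)) / (1.0 - h_ii)) ** 2
    return total / n


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
cv_refit = cv_by_refitting(xs, ys)
cv_lev = cv_by_leverage(xs, ys)
print(cv_refit, cv_lev)
```

The two values should agree up to floating-point error, which is why the leverage form is preferred in practice.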

6
Q

Akaike Information Criterion

A

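$AIC = -2\ell(\hat{\theta}) + 2(p+1)$, which for GLMs reduces to $n\ln(\hat{\sigma}^2_{ML}) + 2(p+1)$ once the constant $n + n\ln(2\pi)$ is dropped
We choose the model that contains the element closest to the true density $f_0$ (in Kullback-Leibler divergence).
Under regularity assumptions, it can select the “best” model even when no test is available to compare the candidates.
In GLMs, if |A| = |B|: AIC min ↔ SSE min ↔ R² max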
BEST = MIN
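To see where the simplified form comes from, the sketch below evaluates the Gaussian log-likelihood at the ML estimates and confirms that the full and simplified AIC differ only by the constant n + n ln(2π). It assumes a linear model with normal errors and $\hat{\sigma}^2_{ML} = SSE/n$; the function names and numbers are made up.

```python
import math


def gaussian_loglik_at_mle(sse, n):
    """Maximized log-likelihood of a normal linear model, with sigma^2_ML = SSE / n."""
    sigma2_ml = sse / n
    return -0.5 * n * (math.log(2 * math.pi) + math.log(sigma2_ml) + 1.0)


def aic_full(sse, n, p):
    """AIC = -2 l(theta_hat) + 2 (p + 1); the +1 counts sigma^2."""
    return -2.0 * gaussian_loglik_at_mle(sse, n) + 2 * (p + 1)


def aic_simplified(sse, n, p):
    """Simplified form: n ln(sigma^2_ML) + 2 (p + 1)."""
    return n * math.log(sse / n) + 2 * (p + 1)


# Made-up values, for illustration only.
n, p, sse = 40, 3, 55.0
gap = aic_full(sse, n, p) - aic_simplified(sse, n, p)
constant = n + n * math.log(2 * math.pi)
print(gap, constant)
```

Because the gap depends only on n, both versions rank models fitted to the same data identically.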

7
Q

Bayesian Information Criterion

A

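$BIC = -2\ell(\hat{\theta}) + \ln(n)(|M|+1)$, which for GLMs reduces to $n\ln(\hat{\sigma}^2_{ML}) + \ln(n)(|M|+1)$ up to an additive constant
It penalizes complexity more heavily than AIC whenever ln(n) > 2, i.e. n ≥ 8. Under regularity assumptions, it can select the “best” model even when no test is available to compare the candidates.
In GLMs, if |A| = |B|: BIC min ↔ SSE min ↔ R² max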
BEST = MIN
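The heavier penalty can change which model wins. The sketch below scores three hypothetical candidate models with the simplified AIC and BIC formulas; the SSE values, model sizes, and names are invented purely to illustrate the effect.

```python
import math


def aic_lm(sse, n, size):
    """Simplified AIC: n ln(SSE / n) + 2 (size + 1)."""
    return n * math.log(sse / n) + 2 * (size + 1)


def bic_lm(sse, n, size):
    """Simplified BIC: n ln(SSE / n) + ln(n) (size + 1)."""
    return n * math.log(sse / n) + math.log(n) * (size + 1)


# Hypothetical candidates: (name, SSE, number of coefficients |M|).
n = 100
candidates = [("small", 130.0, 2), ("medium", 121.0, 4), ("large", 115.0, 8)]

best_aic = min(candidates, key=lambda m: aic_lm(m[1], n, m[2]))[0]
best_bic = min(candidates, key=lambda m: bic_lm(m[1], n, m[2]))[0]
print(best_aic, best_bic)
```

With these invented numbers AIC prefers the medium model while BIC prefers the small one, since ln(100) ≈ 4.6 per parameter outweighs the modest drop in SSE.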

8
Q

Nested GLM model selection in R

A

anova(fitmin, fitmax)
- Res.Df = n - p_i for each model i (residual degrees of freedom)

9
Q

Non-nested GLM model selection in R

A

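AIC(fit)  # full log-likelihood version; includes the constant n + n*ln(2π)
extractAIC(fit)  # simplified form for linear models, without that constant
extractAIC(fit, k = log(nrow(data)))  # BIC: penalty ln(n) per parameter instead of 2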
