Model Selection Flashcards

1
Q

Prediction & explanation

A

If aim is to predict future outcomes:
- variables in model = covariates
- predictive model
- variable selection

If aim is to explain / find causal relationships:
- variables in model = explanatory variables
- explanatory model
- NO automatic variable selection strategies allowed

2
Q

AIC or AICc or BIC

A

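AICc is the small-sample correction of AIC
AICc must be used when n/p < 40 (n = observations, p = estimated parameters)
It converges to AIC as n grows, so it is also the better default in general

BIC penalizes complexity more strongly than AIC: ln(n) per parameter instead of 2

In R:
AICc() (e.g. from the MuMIn package)
stepAIC() (from the MASS package)

-> Coefficients of models selected this way should not be interpreted, because model selection may lead to biased parameter estimates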
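
The three criteria differ only in their penalty term, which a short calculation makes visible. The following is a minimal sketch in Python (not the R functions named on this card); the function name `information_criteria` and the example numbers are hypothetical, chosen only for illustration.

```python
import math

def information_criteria(log_lik, k, n):
    """Return AIC, AICc and BIC for a model with maximised log-likelihood
    log_lik, k estimated parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is only defined for n > k + 1")
    aic = 2 * k - 2 * log_lik                      # penalty: 2 per parameter
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)   # small-sample correction
    bic = k * math.log(n) - 2 * log_lik            # penalty: ln(n) per parameter
    return {"AIC": aic, "AICc": aicc, "BIC": bic}

# Hypothetical model: log-likelihood -50, 4 parameters, 30 observations.
# Here n/k = 7.5 < 40, so the AICc correction is not negligible.
crit = information_criteria(log_lik=-50.0, k=4, n=30)
for name, value in crit.items():
    print(f"{name}: {value:.2f}")
```

Lower values are better for all three criteria. The correction term shrinks towards zero as n grows, which is why AICc and AIC agree for large samples.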

3
Q

Workflow for an explanatory (confirmatory) model

A
  • clear hypothesis
  • select x according to a priori knowledge
  • formulate only one or a few models before analysing the data
4
Q

Confirmatory vs. Exploratory

A

Confirmatory:
- Clear hypothesis & a priori selection of regressors for y.
- No variable selection!
- Allowed to interpret the results and draw quantitative conclusions.

Exploratory:
- Build whatever model you want, but the results should only be used to generate new hypotheses, a.k.a. “speculations”.
- Clearly label the results as “exploratory”.

5
Q

How many variables to include in the model?

A

Not more than n/10 parameters (10% of the number of observations n)
Otherwise: overfitting
(A categorical variable with k = 3 levels already uses up 2 parameters)
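
The n/10 rule is easy to check before fitting anything. The sketch below is a hypothetical helper in Python, assuming that each numeric covariate costs one parameter, each categorical variable with k levels costs k - 1, and the intercept is not counted against the budget.

```python
def parameter_count(n_numeric, levels_per_factor):
    """Count slope parameters: one per numeric covariate,
    k - 1 per categorical variable with k levels."""
    return n_numeric + sum(k - 1 for k in levels_per_factor)

def within_budget(n_obs, n_numeric, levels_per_factor):
    """True if the model respects the n/10 rule of thumb."""
    return parameter_count(n_numeric, levels_per_factor) <= n_obs / 10

# Hypothetical data set with 50 observations: budget is 5 parameters.
# 3 numeric covariates + one 3-level factor = 5 parameters.
ok_model = within_budget(n_obs=50, n_numeric=3, levels_per_factor=[3])
# Adding a second 3-level factor gives 7 parameters.
too_big = within_budget(n_obs=50, n_numeric=3, levels_per_factor=[3, 3])
print(ok_model, too_big)
```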

6
Q

Collinearity of covariates

A

A covariate is collinear if it can be written (almost) as a linear combination of the others
-> extreme case x1 = x2: slope coefficients cannot be uniquely determined
-> standard errors too large
-> p-values too large

7
Q

How to detect collinearity

A

Variance inflation factor (VIF)
VIF_j = 1 / (1 - R_j^2)
R_j^2 = R^2 from regressing covariate x_j on all other covariates

If R_j^2 is large -> VIF_j is large -> high collinearity
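
The VIF formula can be tried by hand in the simplest case. With exactly two covariates, R_j^2 equals the squared Pearson correlation between them, so no regression routine is needed. The following Python sketch relies on that special case; the function names and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_covariates(x1, x2):
    """VIF = 1 / (1 - R^2) for a model with exactly two covariates,
    where R^2 is the squared correlation between them."""
    r2 = pearson_r(x1, x2) ** 2
    if r2 >= 1.0:
        return math.inf  # perfect collinearity: slopes are not identifiable
    return 1.0 / (1.0 - r2)

# Hypothetical covariates with correlation 0.8, so R^2 = 0.64.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_covariates(x1, x2)
print(round(vif, 2))  # about 2.78
```

With more than two covariates, R_j^2 has to come from a full regression of x_j on all the others, which this sketch does not cover.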

8
Q

What to do against collinearity

A
  • Avoid it
  • do not include variables with an unacceptably high VIF
  • be aware of it
  • interpret results with care

-> Note: collinearity is no problem in a predictive model -> AIC-based selection eliminates it

9
Q

What to preregister for explanatory models

A
  • which transformations I would try
  • which model simplifications will be considered
  • how I deal with outliers
  • how I treat missing values
  • how I treat collinear variables

-> Then analyse the data following this protocol:
  • fit the model and check whether the assumptions are met
  • if the assumptions are not met, adapt the model as outlined in the protocol
  • interpret model coefficients and p-values properly

-> any additional analyses: exploratory

10
Q

Post hoc/ a posteriori variable selection

A

AIC
BIC
dredge() (MuMIn package in R)

11
Q

Problem with collinearity

A

The standard errors of the parameter estimates are too large -> p-values are too large (tests become conservative)
