F14 Multilevel modeling II Flashcards
What is a generalized linear model?
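A model that is linear in its coefficients on the scale of the link function, but not on the scale of the outcome. A link function connects the linear predictor to a parameter of the outcome's distribution (e.g. the mean) instead of predicting the outcome directly.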
What is an example of a GLM?
The outcome is a count measure.
DGP: Poisson
Restriction: μ > 0
Linear regression is problematic because it can return negative predicted values (there are no negative counts).
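A minimal sketch of why a log link suits the count example above. The coefficients b0 and b1 are hypothetical and chosen only for illustration: the linear predictor can become negative, but exponentiating it always gives a positive Poisson mean.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = 0.5, -0.8

def linear_predictor(x):
    """eta = b0 + b1 * x; can take any real value, including negatives."""
    return b0 + b1 * x

def poisson_mean(x):
    """Inverse of the log link: mu = exp(eta), so mu > 0 for every x."""
    return math.exp(linear_predictor(x))

for x in [0.0, 1.0, 5.0]:
    eta = linear_predictor(x)
    print(f"x={x}: eta={eta:.2f}, mu={poisson_mean(x):.3f}")
```

An identity link would use eta itself as the predicted count, which is exactly where negative predictions come from.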
What are three components of a GLM?
An assumed probability distribution for the outcome (e.g. binomial for binary outcomes)
A link function to model the parameter (probit/logit for binary)
A set of linear predictors used in the link function
How can random effects be used in panel data models instead of fixed effects, and what is an important assumption?
If the Hausman test is not significant, then RE can be used instead, preserving degrees of freedom.
Exogeneity assumption: No correlation of the unit-level effects with the set of predictor variables
What are three problems with fixed effects that random effects help to overcome (in panel data)?
1) Fewer degrees of freedom
2) No estimation of whether higher-level variance is significant
3) We cannot estimate the effects of time-invariant variables at the unit level - all time-invariant unit-level variation (including confounders) is absorbed by the FE
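Problem 3 above can be illustrated with the within transformation (demeaning by unit), which is one common way FE is estimated. This is a sketch on a made-up panel; the unit names and the variables "region" and "income" are hypothetical.

```python
# Hypothetical panel: two units observed over three periods.
# "region" is time-invariant within each unit; "income" varies over time.
panel = {
    "unit_a": {"region": [1, 1, 1], "income": [10.0, 12.0, 14.0]},
    "unit_b": {"region": [0, 0, 0], "income": [20.0, 19.0, 24.0]},
}

def demean(values):
    """Within transformation: subtract the unit's own mean from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

within = {
    unit: {name: demean(vals) for name, vals in variables.items()}
    for unit, variables in panel.items()
}

for unit, variables in within.items():
    print(unit, variables)
```

After demeaning, "region" is zero in every row, so it carries no variation from which a coefficient could be estimated, while "income" keeps its within-unit variation.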
How can you check if the assumption for random effects is met?
You can visually inspect the distribution of FE
What does the Hausman test examine? From Bell & Jones (2015)
Are random effects valid for the panel data at hand? Does the efficiency gain outweigh the loss of consistency?
What is a key advantage of random effects compared to fixed effects?
They are more efficient regarding the beta-estimate.
Efficiency of the estimator reflects that I need less information/fewer degrees of freedom to provide my key estimate.
With a lower variance around the beta, our estimates are in general closer to the true value.
What is a key advantage of fixed effects compared to random effects?
They are generally more consistent regarding the beta-estimate (gold standard in panel data).
As we gather more data, our estimator will more closely approximate the true underlying value.
If it’s part of the identification strategy, FE should typically be used.
What is the trade-off between using fixed and random effects?
Efficiency (RE) vs. consistency (FE).
RE is preferable if they’re both consistent, but FE is the safe choice because it remains consistent even when the RE exogeneity assumption fails.
What is the null-hypothesis of the Hausman test and the test statistic?
H0: RE is as consistent as FE.
W = ((β_FE - β_RE)^2) / (Var(β_FE) - Var(β_RE))
The squared difference in beta-estimates divided by the difference in their variances.
Under H0, W follows a chi-squared distribution with k = 1 degree of freedom (one coefficient is tested).
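The statistic above can be computed by hand for a single coefficient. This is a sketch, assuming hypothetical estimates and variances that are not taken from any real data set; the function name hausman_single is made up for illustration.

```python
import math

def hausman_single(beta_fe, beta_re, var_fe, var_re):
    """Hausman statistic for one coefficient and its chi-squared(1) p-value."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("Var(beta_FE) - Var(beta_RE) must be positive")
    w = (beta_fe - beta_re) ** 2 / var_diff
    # Survival function of chi-squared with 1 df: P(Z^2 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2))
    return w, p_value

# Hypothetical estimates, chosen only for illustration.
w, p = hausman_single(beta_fe=0.50, beta_re=0.42, var_fe=0.0040, var_re=0.0025)
print(f"W = {w:.3f}, p = {p:.3f}")
```

With these made-up numbers W is about 4.27, which exceeds the 5% critical value of 3.84 for one degree of freedom, so H0 would be rejected and FE preferred.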
Can group-level variables be included as independent variables in multilevel models?
Yes - they can help to reduce the unexplained variance between groups.
Explain a relevant case for a model with varying slope but not varying intercepts
Samples are drawn from a common underlying population with the same baseline values. We only expect variation in the effect.
Different treatment intensities to the same population.
How does a multilevel model with non-nested data work?
Same logic, but you can include separate random intercepts for the groupings j and k (crossed rather than nested).
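A minimal simulation sketch of such a non-nested structure, in which each observation receives one random intercept per grouping. The groupings (schools and neighbourhoods), the group counts, and the standard deviations are all hypothetical.

```python
import random

random.seed(0)

# Hypothetical crossed design: each student i belongs to a school j and,
# independently of the school, to a neighbourhood k.
n_schools, n_neighbourhoods, n_students = 4, 3, 12
alpha = 2.0  # overall intercept

u_school = [random.gauss(0, 1.0) for _ in range(n_schools)]        # intercepts for j
v_neigh = [random.gauss(0, 0.5) for _ in range(n_neighbourhoods)]  # intercepts for k

rows = []
for i in range(n_students):
    j = random.randrange(n_schools)          # school membership
    k = random.randrange(n_neighbourhoods)   # neighbourhood membership
    e = random.gauss(0, 0.3)                 # individual-level error
    y = alpha + u_school[j] + v_neigh[k] + e
    rows.append((i, j, k, y))

for row in rows[:3]:
    print(row)
```

Because j and k are drawn independently, students from one school can live in different neighbourhoods, which is what makes the data crossed rather than nested.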
Can multilevel models work with GLM?
Yes