L7 - Mixed-effects models Flashcards
Common reasons for clustered (multilevel) data?
- Repeated measurements of the same person
- Cluster sampling
Why worry about clustered data?
- If data points are not independent of each other, the assumption of independent residuals is violated.
What happens if the assumption of independent residuals is violated?
The estimated standard errors are too small, which inflates the Type I error rate (we reject H0 too often).
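A minimal simulation sketch of this effect. It assumes a hypothetical design with two arms, five clusters per arm, ten observations per cluster, and no true difference between the arms. The function name and all parameter values are illustrative and not taken from the lecture. A naive test that treats all observations as independent should then reject H0 far more often than the nominal 5%.

```python
import math
import random
import statistics


def naive_false_positive_rate(n_sims=400, clusters_per_arm=5, cluster_size=10,
                              sd_between=1.0, sd_within=1.0, seed=1):
    """Share of simulations in which a naive two-sample test rejects a true H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        arms = []
        for _arm in range(2):
            values = []
            for _cluster in range(clusters_per_arm):
                # Every observation in a cluster shares this random cluster effect.
                cluster_effect = rng.gauss(0.0, sd_between)
                values.extend(cluster_effect + rng.gauss(0.0, sd_within)
                              for _ in range(cluster_size))
            arms.append(values)
        a, b = arms
        # Naive standard error: pretends all observations are independent.
        se = math.sqrt(statistics.variance(a) / len(a)
                       + statistics.variance(b) / len(b))
        t = (statistics.mean(a) - statistics.mean(b)) / se
        if abs(t) > 1.96:
            rejections += 1
    return rejections / n_sims


rate = naive_false_positive_rate()
print(f"Naive rejection rate under H0: {rate:.2f} (nominal level: 0.05)")
```

Setting sd_between to 0 removes the clustering, and the rejection rate should fall back to roughly the nominal level.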
What is the ecological fallacy?
Incorrectly generalizing a pattern observed at the aggregate level to the individual level.
- spurious patterns
- hidden patterns
Multi-level problem visualized
Intraclass correlation (ICC)
The proportion of the total variance that lies between groups: ICC = between-group variance / (between-group variance + within-group variance)
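When do you have a high ICC?
When most of the total variance lies between groups, i.e., observations within the same group are very similar.
When do you have a low ICC?
When most of the total variance lies within groups, i.e., group membership explains little of the variability.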
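A small sketch of how the ICC can be computed. The first function uses variance components directly. The second uses the standard one-way ANOVA estimator and assumes balanced groups (equal group sizes). The function names and toy data are made up for illustration.

```python
import statistics


def icc_from_components(var_between, var_within):
    """ICC as the between-group share of the total variance."""
    return var_between / (var_between + var_within)


def icc_oneway(groups):
    """One-way ANOVA estimate of the ICC; assumes equally sized groups."""
    k = len(groups[0])          # observations per group
    n_groups = len(groups)
    grand_mean = statistics.mean(v for g in groups for v in g)
    # Mean square between groups and mean square within groups.
    msb = k * sum((statistics.mean(g) - grand_mean) ** 2
                  for g in groups) / (n_groups - 1)
    msw = sum((v - statistics.mean(g)) ** 2
              for g in groups for v in g) / (n_groups * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# High ICC: groups differ a lot, members of a group are nearly identical.
similar_within = [[1.0, 1.1, 0.9], [5.0, 5.1, 4.9], [9.0, 9.1, 8.9]]
# Low ICC: groups have about the same mean, members of a group vary a lot.
similar_between = [[1.0, 5.0, 9.0], [1.2, 5.1, 8.8], [0.9, 4.9, 9.1]]

print(icc_oneway(similar_within))
print(icc_oneway(similar_between))
```

With small samples this estimator can be negative, which is usually read as an ICC of about zero.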
Are mixed-effects models, hierarchical models, and multilevel linear models the same?
Yes, the terms are used interchangeably.
What are fixed effects?
Population-level average effects (main effects or interactions).
What are random effects?
Random deviations of individual units (subjects, items) around a fixed effect.
Write down the formula for a mixed-effects model
What does the covariance matrix of the random effects look like?
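One common textbook form, given here as a sketch and not as the lecture's own notation. It assumes a single predictor x, observations i nested in subjects j, and a by-subject random intercept and random slope. All symbols are assumptions of this sketch.

```latex
% Model: fixed effects (\beta) plus by-subject deviations (u) plus residual
y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\, x_{ij} + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Random effects: bivariate normal with mean zero and covariance matrix \Sigma
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},\;
\Sigma =
\begin{pmatrix}
\tau_0^2 & \rho\,\tau_0 \tau_1 \\
\rho\,\tau_0 \tau_1 & \tau_1^2
\end{pmatrix}
\right)
```

Here \tau_0^2 and \tau_1^2 are the variances of the random intercepts and random slopes, and \rho is their correlation.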
What is an example of the by-subject random intercept?
It measures how far each individual participant deviates from the aggregate (fixed) intercept.
How do you decide between a mixed-effects model and an intercept-only model?
Look at the intraclass correlation coefficient (ICC), which indicates how strongly observations within a group depend on each other. An ICC close to 1 indicates strong dependence; an ICC close to 0 indicates weak dependence. In the lecture example, the ICC of .54 indicates moderate dependence between observations within a group, so a mixed-effects model is appropriate.
What is partial pooling?
Because of the hierarchical structure of the mixed-effects model, the individual-level parameters (random effects) are informed by the group-level parameters (fixed effects).
What is shrinkage?
Extreme values are pulled towards the mean of the distribution because they are more likely to be unreliable. Shrinkage is stronger when there is little data on the individual level.
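A minimal sketch of shrinkage for the simplest case, a random-intercept model. It assumes the variance components are known, which a real model would have to estimate. The function name and the numbers are hypothetical. Each individual's mean is pulled toward the grand mean with a weight that depends on how much data that individual contributes.

```python
def shrunken_mean(individual_mean, n_obs, grand_mean, var_between, var_within):
    """Partially pooled estimate of one individual's mean.

    weight is near 1 with lots of data (little shrinkage) and near 0 with
    very little data (strong shrinkage toward the grand mean).
    """
    weight = var_between / (var_between + var_within / n_obs)
    return grand_mean + weight * (individual_mean - grand_mean)


# Two hypothetical participants with the same extreme raw mean of 10,
# while the grand mean is 5.
few_obs = shrunken_mean(10.0, n_obs=2, grand_mean=5.0,
                        var_between=1.0, var_within=4.0)
many_obs = shrunken_mean(10.0, n_obs=50, grand_mean=5.0,
                         var_between=1.0, var_within=4.0)
print(few_obs, many_obs)  # the participant with 2 observations is pulled much closer to 5
```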
Is the variability of random effects in mixed-effects models smaller than the variability of individual parameters in non-hierarchical regression?
Yes, because shrinkage pulls the individual estimates toward the mean.
Two criteria for model fit
- Likelihood ratio test
- Information criteria
What are two information criteria?
- Akaike information criterion (AIC)
- Bayesian information criterion (BIC)
What do AIC and BIC tell you?
How well the model trades off complexity against model fit.
Do random slopes increase model complexity?
Yes
Is a higher AIC or BIC better?
No, it is worse. Lower values indicate a better trade-off between fit and complexity.
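A short sketch of the standard definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of estimated parameters and n the number of observations. The log-likelihoods and parameter counts below are invented for illustration.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: penalty of 2 per parameter."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: penalty of ln(n_obs) per parameter."""
    return n_params * math.log(n_obs) - 2 * log_lik


# Hypothetical fits: adding random slopes improves the log-likelihood
# but costs two extra parameters (slope variance and covariance).
intercepts_only = {"log_lik": -520.0, "n_params": 4}
with_slopes = {"log_lik": -512.0, "n_params": 6}
n_obs = 300

for name, fit in [("random intercepts", intercepts_only),
                  ("random slopes", with_slopes)]:
    print(name,
          round(aic(fit["log_lik"], fit["n_params"]), 1),
          round(bic(fit["log_lik"], fit["n_params"], n_obs), 1))
```

The model with the lower value is preferred. BIC penalizes extra parameters more heavily than AIC once n exceeds about 7, because ln(n) is then larger than 2.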
What does the fixed intercept in a mixed-effects model refer to?
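The mean of the DV across people when all predictors are 0 (in the lecture example: mean search effort).
What does the fixed slope tell you in the mixed-effects model?
The average change in the DV per one-unit increase in the IV. E.g., a slope of .09 means that increasing the IV by 1 yields .09 more search effort (the DV).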
How do you conduct a statistical evaluation for regression coefficients?
t = b / SE_b
b is the estimated regression coefficient and SE_b is its standard error.
What is the problem of computing degrees of freedom in mixed-effects models?
The df depend on the sample size N (e.g., df = N - 2).
The question is what N is in a mixed-effects model: the number of observations or the number of subjects?
Why is calculating df a problem for the p-value?
The p-value for the t-value of a regression coefficient depends on the df. If the df are ambiguous, so is the correct p-value.
What do the Satterthwaite and Kenward-Roger approximations do?
They approximate the degrees of freedom.
What can be used for model comparison?
- Likelihood ratio test
- Parametric bootstrap
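A minimal sketch of a likelihood ratio test between two nested models. The statistic is twice the difference in log-likelihoods and is compared against a chi-square distribution whose df equal the number of extra parameters. To stay within the standard library, the helper only handles df = 1 and even df. The log-likelihood values are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for df = 1 or even df (closed forms)."""
    if df < 1:
        raise ValueError("df must be at least 1")
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch only handles df = 1 or even df")


def likelihood_ratio_test(log_lik_simple, log_lik_complex, df_diff):
    """Return the test statistic and its chi-square p-value."""
    stat = 2.0 * (log_lik_complex - log_lik_simple)
    return stat, chi2_sf(stat, df_diff)


# Hypothetical comparison: random intercepts only vs. added random slopes
# (two extra parameters: slope variance and intercept-slope covariance).
stat, p = likelihood_ratio_test(-520.0, -512.0, df_diff=2)
print(f"chi-square = {stat:.1f}, p = {p:.5f}")
```

A parametric bootstrap replaces the chi-square reference distribution with one obtained by repeatedly simulating data from the simpler model and refitting both models.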
What is the t-as-z approach?
The t-value is treated as a z-value: if |t| > 2, the effect is considered significant.
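A brief sketch of the t-as-z shortcut. It computes t = b / SE_b and treats it as a standard normal z-value, which sidesteps the degrees-of-freedom question and is usually considered reasonable only with many observations. The standard error used below is a made-up value.

```python
import math


def t_as_z(b, se):
    """Return the t-value, a two-sided normal p-value, and the |t| > 2 verdict."""
    t = b / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p, abs(t) > 2


# Slope of .09 as in the example above; SE of .03 is hypothetical.
t, p, significant = t_as_z(0.09, 0.03)
print(f"t = {t:.2f}, p = {p:.4f}, significant: {significant}")
```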
Can you use mixed-effects logistic regression?
Yes
How many levels should the random-effects grouping factor (participants, items) have for reliable estimation of random effects?
At least 6 levels (e.g., 6 people)
What can you do if you have fewer than 6 levels?
One might consider running a regular multiple regression with the grouping factor as a dummy-coded covariate.
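A small sketch of what dummy coding a grouping factor looks like. It assumes treatment coding with the first level (in sorted order) as the reference, and the participant labels are invented. The resulting 0/1 columns would enter the regression as ordinary predictors.

```python
def dummy_code(labels, reference=None):
    """Turn a list of group labels into 0/1 indicator columns.

    The reference level gets no column; it is represented by all zeros.
    """
    levels = sorted(set(labels))
    if reference is None:
        reference = levels[0]
    columns = [level for level in levels if level != reference]
    rows = [[1 if label == level else 0 for level in columns]
            for label in labels]
    return columns, rows


# Hypothetical data set with only three participants.
participants = ["p1", "p2", "p3", "p1", "p3"]
columns, rows = dummy_code(participants)
print(columns)
for label, row in zip(participants, rows):
    print(label, row)
```

A factor with k levels yields k - 1 dummy columns, so each level gets its own fixed coefficient instead of a random effect.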