Introduction to multilevel modelling flashcards
What are other names for multilevel modelling?
Hierarchical linear models, mixed models, random effects models and variance components models.
What assumption made by simple multiple regression would not hold if we had hierarchical data?
It assumes the observations are independent, specifically that the residuals are uncorrelated with one another. If the data are in fact grouped and we have not taken this into account, the assumption will not hold.
How can we account for grouping effects in data? What are the problems with each approach?
- Include dummy variables for the groups
- Include explanatory variables that measure group characteristics believed to influence the outcome as additional predictors, which effectively controls for their effects
For example, let’s say you are studying academic performance in students across different schools. You suspect that the school itself may have an effect on individual student performance. In this case, you can include variables that measure school characteristics (e.g., school funding, student-teacher ratio, school size) as additional predictors in your model.
What happens if we don’t take the cluster effect into account in the model?
The standard errors of the regression coefficients will be underestimated. Consequently, confidence intervals will be too narrow and p values will be too small, which increases the risk of Type I errors.
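The underestimation can be seen in a small simulation. This is a rough sketch, not a formal demonstration: the function name, the number of groups, the group size and both standard deviations are illustrative assumptions. It repeatedly draws clustered samples, computes the usual standard error of the mean as if observations were independent, and compares it with how much the sample mean actually varies across repetitions.

```python
import random
import statistics


def simulate_grand_mean(n_groups, n_per_group, sigma_u, sigma_e, rng):
    """Draw one clustered sample; return (sample mean, naive standard error)."""
    ys = []
    for _ in range(n_groups):
        u_j = rng.gauss(0.0, sigma_u)  # group effect shared by everyone in group j
        for _ in range(n_per_group):
            ys.append(u_j + rng.gauss(0.0, sigma_e))  # add the individual residual
    # Naive SE treats all observations as independent (ignores the grouping).
    naive_se = statistics.stdev(ys) / len(ys) ** 0.5
    return statistics.fmean(ys), naive_se


rng = random.Random(1)  # fixed seed so the run is repeatable
results = [simulate_grand_mean(20, 25, 1.0, 1.0, rng) for _ in range(500)]
means = [m for m, _ in results]
naive_ses = [se for _, se in results]

# How much the sample mean really varies from sample to sample.
empirical_sd = statistics.stdev(means)
average_naive_se = statistics.fmean(naive_ses)

print(f"empirical SD of the sample mean: {empirical_sd:.3f}")
print(f"average naive standard error:    {average_naive_se:.3f}")
```

With clustering this strong, the naive standard error should come out well below the empirical spread of the mean, which is the pattern the card describes.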
If there is a clustered group effect in the data (e.g., students from the same school having more similar scores than students from different schools) and it is not accounted for, which coefficients will this affect most severely?
Underestimation of standard errors is particularly severe for coefficients of predictors that are measured at the group level, e.g., an indicator of whether a school is mixed or single sex.
If you are interested only in controlling for clustering rather than exploring its effects, what methods can you use, versus if you are interested in exploring it?
If interested in exploring it: multilevel modelling is needed. It gives correct standard errors.
If not, and the clustering is just a nuisance to control for, there are 2 ways:
* Use methods that adjust standard errors for design effects
* Model the dependency between individuals in the same group explicitly using a marginal model
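One widely used design-effect correction, for equal-sized clusters, inflates the naive variance by 1 + (n - 1) * ICC, where n is the cluster size and ICC is the intra-class correlation (the share of total variance that lies between groups). The sketch below assumes that formula applies; the function names and the input values are made up for illustration.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (n - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def adjusted_standard_error(naive_se, cluster_size, icc):
    """Inflate a naive standard error by the square root of the design effect."""
    return naive_se * design_effect(cluster_size, icc) ** 0.5


# Illustrative values: 25 students per school, 10% of variance between schools.
deff = design_effect(cluster_size=25, icc=0.10)
adjusted = adjusted_standard_error(naive_se=0.05, cluster_size=25, icc=0.10)

print(f"design effect: {deff:.2f}")
print(f"adjusted SE:   {adjusted:.4f}")
```

Note that this only repairs the standard error; it gives no estimate of how much groups differ, which is the limitation these cards attribute to this approach.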
Multilevel modelling enables researchers to investigate the nature of between-group variability, and the effects of group-level characteristics on individual outcomes. Is this true?
yes
What is a model with dummy variables for groups called?
Fixed effects model
Hierarchical data. What are the consequences of: Including a set of dummy variables for groups (a fixed effects model)
What are the limitations of this?
Group is treated as a fixed classification, so the target of inference is restricted to the groups represented in the sample.
If the number of groups is large, there will be a large number of additional parameters to estimate.
Hierarchical data. What are the consequences of: Fitting a single-level model with group-level predictors
High risk of Type I errors because standard errors of coefficients of group-level predictors may be severely underestimated. No estimate of the between-group variance that remains unaccounted for by the included group-level predictors.
Hierarchical data. What are the consequences of: Correcting standard errors for design effects, or fitting a marginal model in which the dependency is modelled directly
The standard errors will be correct (properly adjusted for clustering), but unable to assess the degree of between-group variation.
Hierarchical data. What are the consequences of: Multilevel modelling (random effects)
Correct standard errors and an estimate of between-group variance
What is a typical two-level model?
Yij = B0 + B1x1ij + Uj + Eij
where
Uj ~ N(0, sigma2u)
Eij ~ N(0, sigma2e)
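To make the notation concrete, the following sketch generates data from a two-level random intercept model of this form. Everything specific here is an assumption for illustration: the function name, the parameter values, and the choice to draw the predictor uniformly between 0 and 10.

```python
import random


def simulate_two_level(n_groups, n_per_group, b0, b1, sigma_u, sigma_e, seed=0):
    """Simulate y_ij = b0 + b1 * x_ij + u_j + e_ij for a balanced two-level design."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        u_j = rng.gauss(0.0, sigma_u)  # level-2 residual: drawn once per group
        for _ in range(n_per_group):
            x_ij = rng.uniform(0.0, 10.0)  # hypothetical individual-level predictor
            e_ij = rng.gauss(0.0, sigma_e)  # level-1 residual: drawn per individual
            y_ij = b0 + b1 * x_ij + u_j + e_ij
            rows.append({"group": j, "x": x_ij, "u": u_j, "e": e_ij, "y": y_ij})
    return rows


data = simulate_two_level(
    n_groups=5, n_per_group=4, b0=50.0, b1=2.0, sigma_u=3.0, sigma_e=5.0
)

for row in data[:4]:
    print(row)
```

Every individual in the same group shares one value of u, while e differs from person to person; that shared term is what makes residuals within a group correlated.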
Write the regression equation for:
> simple regression model
Multilevel data:
> fixed effects model (null model)
> random intercepts model (null model)
> random slope model
Simple: Yi = B0 + B1x1i + Ei
Fixed effects: Yij = B0 + Uj + Eij (Uj are fixed group effects, estimated with dummy variables)
Random intercepts: Yij = B0 + Uj + Eij (Uj ~ N(0, sigma2u), Eij ~ N(0, sigma2e))
Random slope: Yij = B0 + B1x1ij + U0j + U1jx1ij + Eij
What are the limitations of the fixed effects approach?
When the number of groups is large, there will be many extra parameters to estimate. (A multilevel model adds only one: the between-group variance.)
For groups with small sample sizes, the estimated group effects may be unreliable. (In a multilevel model, the residual estimates for such groups are ‘shrunken’ towards zero.)
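For a random intercept model with no predictors, the shrinkage can be written as a factor sigma2u / (sigma2u + sigma2e / nj) applied to the raw difference between the group mean and the overall mean. The sketch below assumes that form, with the variances treated as known; the function names and numbers are illustrative only.

```python
def shrinkage_factor(sigma2_u, sigma2_e, n_j):
    """Weight between 0 and 1 applied to a group's raw mean residual."""
    return sigma2_u / (sigma2_u + sigma2_e / n_j)


def shrunken_residual(group_mean, overall_mean, sigma2_u, sigma2_e, n_j):
    """Estimated group effect after shrinkage towards zero."""
    raw_residual = group_mean - overall_mean
    return shrinkage_factor(sigma2_u, sigma2_e, n_j) * raw_residual


# Two hypothetical groups with the same raw residual (+6) but different sizes.
small = shrinkage_factor(sigma2_u=4.0, sigma2_e=16.0, n_j=2)
large = shrinkage_factor(sigma2_u=4.0, sigma2_e=16.0, n_j=50)
shrunken_small = shrunken_residual(56.0, 50.0, 4.0, 16.0, n_j=2)
shrunken_large = shrunken_residual(56.0, 50.0, 4.0, 16.0, n_j=50)

print(f"n_j = 2:  factor {small:.3f}, shrunken residual {shrunken_small:.2f}")
print(f"n_j = 50: factor {large:.3f}, shrunken residual {shrunken_large:.2f}")
```

The small group's estimate is pulled much further towards zero because its mean rests on little data, whereas the large group's estimate stays close to its raw residual.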
What makes it a multilevel model?
As soon as you include a term for the difference between the group-specific mean and the overall mean (the group effect Uj).
What does the random intercept model help you see that the simple regression model doesn’t?
Between-group effects
Write out the multilevel model for group means
Yij = B0 + Uj + Eij
*Yij: the response for person i in group j
*B0: the overall mean of Y across all groups
*Uj: the difference between the group-specific mean and the overall mean
*Eij: the residual, the difference between the person’s individual score (Yij) and their group mean (B0 + Uj)
What does sigma2u reflect in a multilevel model?
Between-group variance
- Based on the differences between the group means and the overall mean
What does sigma2e reflect in a multilevel model?
Within-group, between-individual variance
- Based on individual differences from the group mean
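One simple way to see the two variances is to estimate them from a small balanced dataset with the one-way ANOVA (method-of-moments) formulas. This is only a sketch: multilevel software normally uses likelihood-based estimation instead, and the school names, scores and function name below are invented for illustration.

```python
from statistics import fmean

# Hypothetical scores: three schools, four students each.
scores = {
    "school_a": [4.0, 6.0, 5.0, 5.0],
    "school_b": [9.0, 11.0, 10.0, 10.0],
    "school_c": [14.0, 16.0, 15.0, 15.0],
}


def variance_components(groups):
    """Balanced one-way ANOVA estimates of (sigma2_u, sigma2_e)."""
    values = list(groups.values())
    n = len(values[0])
    if any(len(v) != n for v in values):
        raise ValueError("this sketch only handles equal group sizes")
    n_groups = len(values)

    grand_mean = fmean([y for v in values for y in v])
    group_means = [fmean(v) for v in values]

    # Within groups: individual deviations from their own group mean.
    ss_within = sum((y - m) ** 2 for v, m in zip(values, group_means) for y in v)
    # Between groups: group-mean deviations from the overall mean.
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)

    ms_within = ss_within / (n_groups * (n - 1))
    ms_between = ss_between / (n_groups - 1)

    sigma2_e = ms_within
    sigma2_u = max(0.0, (ms_between - ms_within) / n)  # truncate at zero
    return sigma2_u, sigma2_e


sigma2_u, sigma2_e = variance_components(scores)
print(f"between-group variance (sigma2_u): {sigma2_u:.3f}")
print(f"within-group variance  (sigma2_e): {sigma2_e:.3f}")
```

In this made-up data the schools differ a lot from one another while students within a school are very similar, so nearly all of the variance is between groups.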
Fixed vs random part of the multilevel regression equation
Fixed part
-Specifies the relationship between the mean of Y and the explanatory variables
-B0 + B1x1ij
-Fixed part parameters: B0, B1
Random part
-Contains the level 1 and level 2 residuals
-Random part: Uj + Eij
-Random part parameters: sigma2e, sigma2u