introduction to multilevel modelling Flashcards
what are different names for multilevel modelling?
Hierarchical linear models, mixed models, random effects models and variance components models.
what assumption is made by simple multiple regression that would not hold true if we had hierarchical data?
It assumes the observations are independent, specifically that the residuals are uncorrelated with one another. If the data are in fact grouped and we haven’t taken this into account, this assumption will not hold.
How can we account for grouping effects in data? What are the problems of each?
- Include dummy variables for the groups
- Include explanatory variables that measure group characteristics believed to influence outcomes as additional predictors in the model, effectively controlling for their effects
For example, let’s say you are studying academic performance in students across different schools. You suspect that the school itself may have an effect on individual student performance. In this case, you can include variables that measure school characteristics (e.g., school funding, student-teacher ratio, school size) as additional predictors in your model.
what happens if we don’t take into account the cluster effect in the model
The standard errors of the regression coefficients will be underestimated. Consequently, confidence intervals will be too narrow and p values will be too small, which raises the risk of a Type I error.
if there is a clustered group effect in the data (e.g., students from the same school having more similar scores than students from different schools) and it is not accounted for, which coefficients will this most severely impact?
Underestimation of standard errors is particularly severe for coefficients of predictors that are measured at the group level, eg, an indicator of whether a school is mixed or single sex
if you are only interested in controlling for clustering rather than exploring its effects, what methods can we use, compared with when you are interested in exploring it?
If interested in exploring it – multilevel modelling is needed. It gives us correct standard errors.
If not, and the clustering is just treated as a nuisance to control for, there are 2 ways:
* Methods to adjust standard errors for design effects
* Model the dependency between individuals in the same group explicitly using a marginal model
Multilevel modelling enables researchers to investigate the nature of between-group variability, and the effects of group-level characteristics on individual outcomes. Is this true?
yes
what is a model with dummy variables for groups called
fixed effects model
hierarchical data. What are the consequences of: Including a set of dummy variables for groups (a fixed effects model)
limitation of this?
Group is treated as a fixed classification, so the target of inference is restricted to the groups represented in the sample.
If the number of groups is large, there will be a large number of additional parameters to estimate.
hierarchical data. What are the consequences of: Fitting a single-level model with group-level predictors
High risk of Type I errors because standard errors of coefficients of group-level predictors may be severely underestimated. No estimate of the between-group variance that remains unaccounted for by the included group-level predictors.
hierarchical data. What are the consequences of: Correcting standard errors for design effects, or fitting a marginal model in which the dependency is modelled directly
The standard errors will be correct (properly adjusted for clustering), but unable to assess the degree of between-group variation.
hierarchical data. What are the consequences of: Multilevel modelling (random effects)
Correct standard errors and an estimate of between-group variance
what is a typical two level model
Yij = b0 + b1x1ij + Uj + eij
Where,
Uj = N(0, sigma2, U)
EIJ = N(sigma2 , e)
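A minimal sketch of how data from this two-level model could be simulated, to make the roles of Uj and Eij concrete. The function name, the parameter values and the standard-normal predictor are all assumptions made for illustration; they do not come from the cards.

```python
import random
import statistics


def simulate_two_level(n_groups, n_per_group, b0, b1, sigma2_u, sigma2_e, seed=0):
    """Simulate Yij = B0 + B1*X1ij + Uj + Eij with normally distributed Uj and Eij."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        # Level 2 residual: drawn once and shared by everyone in group j
        u_j = rng.gauss(0.0, sigma2_u ** 0.5)
        for _ in range(n_per_group):
            x = rng.gauss(0.0, 1.0)                 # hypothetical predictor
            e_ij = rng.gauss(0.0, sigma2_e ** 0.5)  # level 1 residual, one per individual
            rows.append((j, x, b0 + b1 * x + u_j + e_ij))
    return rows


# Hypothetical values: 100 groups of 50 individuals
data = simulate_two_level(n_groups=100, n_per_group=50, b0=10.0, b1=2.0,
                          sigma2_u=4.0, sigma2_e=16.0)

# Group means differ from each other because each group has its own Uj
group_means = [statistics.mean(y for g, _, y in data if g == j) for j in range(100)]
print(len(data), statistics.variance(group_means))
```

Because Uj is drawn once per group, individuals in the same group share it, which is what makes their residuals correlated.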
write the regression equation for
> simple regression model
multilevel data:
> fixed effects model (null model)
> random intercepts model (null model)
> random slope model
Simple: Yi = B0 + B1X1i + Ei
Fixed: Yij = B0 + Uj + Eij (Uj are fixed parameters, one per group)
R intercepts: Yij = B0 + Uj + Eij (Uj and Eij normally distributed)
R slope: Yij = B0 + B1X1ij + U0j + U1jX1ij + Eij
limitations of fixed effects approach?
When the number of groups is large, there will be many extra parameters to estimate. (Only one in a multilevel model: the between-group variance.)
For groups with small sample sizes, the estimated group effects may be unreliable. (In a multilevel model, residual estimates for such groups are ‘shrunken’ towards zero.)
what makes it a multilevel model
As soon as you include a term (Uj) for the difference between the group-specific mean and the overall mean.
what does the random intercept model help you see that the simple regression model doesn't?
between-group effects
Write out multilevel model for group means
Yij = B0 + Uj + Eij
*Yij is the response for person i in group j.
*B0 – the mean score for all individuals across groups
*Uj – the difference between the group-specific mean and the overall mean
*Eij – the residual, the difference between the person's individual score (Yij) and the group mean (B0 + Uj)
what does sigma2 U reflect in a multilevel model
Between group variance
- based on difference between group mean from overall mean
what does sigma2 e reflect in a multilevel model
Within group, between individual variance
- Individual differences from group mean
fixed vs random part of the multilevel regression equation
Fixed part
-Specifies the relationship between the mean of Y and the explanatory variables
-B0 + B1X1ij
-Fixed part parameters: B0 and B1
Random part
-Contains the level 1 and 2 residuals
-Random part: Uj + Eij
-Random part parameters: sigma2e and sigma2u
by including a group-level residual (Uj), how have we changed the structure of the data?
Now a 2-level structure with individuals (level 1) nested within groups (level 2)
variance partition coefficient (VPC)
measures the proportion of total variance that is due to differences between groups:
The variance at a certain level e.g., group divided by total variance (between group + within group variance)
for simple multilevel models, what is the VPC equal to?
The intra-class correlation coefficient.
if the VPC is 0.2, for example, we would say that 20% of the variation is between groups and 80% within. The correlation between randomly chosen pairs of individuals belonging to the same group is 0.2.
interpret a VPC of 0.4?
40% of the variation is between groups and 60% within.
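As a quick sketch, the VPC for a simple two-level model can be computed from the two variance components. The function name and the variance values are hypothetical, chosen so that the result matches the 0.4 example above.

```python
def vpc(sigma2_u, sigma2_e):
    """Proportion of total variance that lies between groups."""
    return sigma2_u / (sigma2_u + sigma2_e)


# Hypothetical variance components: 0.4 between groups, 0.6 within groups
print(round(vpc(sigma2_u=0.4, sigma2_e=0.6), 2))
```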
Comparing 2 models, a single-level model vs a multilevel model: how can we test the null hypothesis that there are no group-level differences?
The likelihood ratio test statistic is obtained from:
LR = -2 log L1 – (-2 log L2)
where L1 and L2 are the likelihood values of the single-level and multilevel models respectively and ‘log’ refers to the natural logarithm
what is the LR test statistic compared against? How many degrees of freedom do we have?
Chi-squared distribution with the number of degrees of freedom equal to the number of extra parameters in the more complex model.
Have 1 additional parameter in this example (between group variance) so there is 1 degree of freedom.
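A rough sketch of this test in code, assuming you already have the log-likelihoods of both fitted models. The log-likelihood values below are made up for illustration. For 1 degree of freedom the upper tail of the chi-squared distribution can be written with the complementary error function, so no extra library is needed.

```python
import math


def lr_test_1df(loglik_single, loglik_multilevel):
    """Likelihood ratio test of a single-level vs a multilevel model (1 extra parameter)."""
    lr = -2.0 * loglik_single - (-2.0 * loglik_multilevel)
    lr = max(lr, 0.0)  # guard against tiny negative values from rounding
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


# Hypothetical log-likelihoods (natural log) for the two models
lr, p = lr_test_1df(loglik_single=-1250.0, loglik_multilevel=-1240.0)
print(lr, p)
```

A small p value is evidence against the null hypothesis of no group-level differences.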
what is the difference between testing for group effects and estimating group effects (multilevel modelling)
Testing for group effects refers to determining whether there is evidence of systematic differences between groups. e.g., likelihood ratio test.
Estimating group effects, on the other hand, involves quantifying the magnitude and direction of each group's effect on the outcome variable, i.e., how far each group departs from the overall mean.
what are the two ways we can estimate group effects?
- Random effects approach
- fixed effects approach
what is the difference between random and fixed effects approach
fixed effects approach includes dummy-coded variables for each group
random effects approach includes a single parameter representing the between-group variance
main difference is that the group effects in the fixed approach are treated as fixed parameters, while in the random effects approach they are treated as following a normal distribution summarized by a variance parameter
how many residuals are there in the single vs multilevel model
In a single-level model there is a single set of residuals: the difference between each individual's observed score and the score predicted by the model.
In a multilevel model, we have 2 error terms. The total residual is made up of Uj + Eij, and the total variance is the between-group plus the within-group variance. How can we get the estimated residual for each?
how do we calculate a level 2 residual
Step 1: get the mean raw residual for the group
Step 2: multiply the mean raw residual by the shrinkage factor
what is the shrinkage factor?
- Between group variance
- Divided by between group variance + (within group variance / sample size in group)
do we have the same shrinkage factor for every group e.g., uk, germany and france?
no
Different shrinkage factor for each group
how do we get the level 1 residual
- Observed value (Yij)
- Minus the predicted value (B0 + B1X1ij)
- Minus level 2 residual
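The steps for the level 2 and level 1 residuals can be sketched as below for a single group. The function names and the numbers are hypothetical; `predicted` stands for the fixed-part prediction (B0 + B1X1ij), and the variance components are assumed to be already estimated.

```python
def level2_residual(y, predicted, sigma2_u, sigma2_e):
    """Shrunken level 2 residual for one group."""
    raw = [yi - pi for yi, pi in zip(y, predicted)]   # raw residuals
    mean_raw = sum(raw) / len(raw)                    # step 1: mean raw residual
    shrinkage = sigma2_u / (sigma2_u + sigma2_e / len(raw))
    return shrinkage * mean_raw                       # step 2: apply shrinkage


def level1_residuals(y, predicted, u_hat):
    """Observed value minus fixed-part prediction minus the group's level 2 residual."""
    return [yi - pi - u_hat for yi, pi in zip(y, predicted)]


# Hypothetical group of 4 individuals
y = [12.0, 14.0, 13.0, 15.0]
predicted = [10.0, 10.0, 10.0, 10.0]

u_hat = level2_residual(y, predicted, sigma2_u=4.0, sigma2_e=16.0)
e_hat = level1_residuals(y, predicted, u_hat)
print(u_hat, e_hat)
```

Here the mean raw residual is 3.5 and the shrinkage factor is 4 / (4 + 16/4) = 0.5, so the level 2 residual is 1.75.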
How does the group sample size affect the shrinkage
smaller sample size = more shrinkage
How does the within group variance affect the shrinkage
large variance = lot of shrinkage
How does the between group variance affect the shrinkage
small variance = lot of shrinkage
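To see the three effects side by side, here is a small sketch that varies one input at a time. All values are hypothetical. Remember that a smaller factor means more shrinkage.

```python
def shrinkage_factor(sigma2_u, sigma2_e, n_j):
    """Between-group variance / (between-group variance + within-group variance / n_j)."""
    return sigma2_u / (sigma2_u + sigma2_e / n_j)


baseline = shrinkage_factor(sigma2_u=4.0, sigma2_e=16.0, n_j=20)
small_group = shrinkage_factor(sigma2_u=4.0, sigma2_e=16.0, n_j=2)      # smaller sample size
large_within = shrinkage_factor(sigma2_u=4.0, sigma2_e=64.0, n_j=20)    # larger within-group variance
small_between = shrinkage_factor(sigma2_u=1.0, sigma2_e=16.0, n_j=20)   # smaller between-group variance

for label, value in [("baseline", baseline), ("small group", small_group),
                     ("large within variance", large_within),
                     ("small between variance", small_between)]:
    print(f"{label}: {value:.3f}")
```

Each of the three changes gives a factor below the baseline, so the mean raw residual is pulled further towards zero.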
how do we present level 2 residuals
can use a caterpillar plot, e.g., showing the level 2 residual and CI for each country
can the shrinkage factor be larger than 1
No, the shrinkage factor is always equal to or less than 1.
So when the mean raw residual is multiplied by it, the estimated residual is either equal to or smaller in magnitude than the mean raw residual.
Why are shrinkage residuals also regarded as precision-weighted estimates
These shrinkage residuals are also called precision-weighted estimates because we have taken their reliability into account in their estimation. Unreliable estimates with, for example, small nj will be shrunk towards the overall mean. Reliable estimates with a large nj will keep close to their raw mean value.
what is the fixed effects approach to looking for group differences called?
Analysis of variance
from the single level regression equation
how many extra parameters do we have in a fixed vs random effects model measuring outcome score for 20 countries?
Fixed effects – 19 additional parameters (one dummy variable for each country except the reference country)
Random effects – just one, the between group variance
why is a random effects approach better if each group varied in sample size
*Fixed effects – generated group effects might be unreliable. Reliability of residual estimate for each group not taken into account
*Random effects - recognizes that there is little information for these groups by ‘shrinking’ their residual estimates towards zero, and therefore pulling their mean towards the overall mean
What does it mean in terms of target of inference if fixed classifications are used vs random classifications?
If we want to draw conclusions about the specific categories chosen, e.g., treatment A vs treatment B, then the classification is treated as fixed. We want to make inferences about those specific categories.
In contrast, if we are sampling from a wider population (e.g., schools) and the target of inference is this wider population, then random classifications are used.
The fixed effects approach prohibits generalisations to groups beyond our sample. In the random effects approach, the between-group variance is interpreted as the between-group variation in the wider population.
extend this random intercept model with no predictors
to having 1 explanatory variable
Yij = B0 + Uj + Eij
Yij = B0 + Uj + B1X1ij + Eij
Yij = B0 + Uj + B1X1ij + Eij
random intercept model
what is the overall relationship between x and y? and what is the intercept for group J
- Overall relationship between x and y represented by the line: B0 + B1X1ij
- Intercept for individuals in group j: B0 + Uj
what is the random intercept model
Any model where the intercepts of the group regression lines are allowed to take on different values drawn from a distribution. The slope B1 is assumed to be the same for each group.
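A tiny sketch of what this means for the group regression lines: each group gets its own intercept (B0 + Uj) but shares the slope B1, so the lines are parallel. The function name and all numbers are hypothetical.

```python
def group_line(b0, b1, u_j, x):
    """Predicted value for group j: group-specific intercept plus common slope."""
    return (b0 + u_j) + b1 * x


# Two hypothetical groups, one above and one below the overall line
for x in [0.0, 1.0, 2.0, 3.0]:
    high = group_line(b0=10.0, b1=2.0, u_j=1.5, x=x)
    low = group_line(b0=10.0, b1=2.0, u_j=-0.5, x=x)
    # The gap equals the difference in Uj at every x because the slope is shared
    print(x, high, low, high - low)
```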
multilevel regression with a single explanatory variable: what exactly are the intercept and slope for a dichotomous explanatory variable, e.g., gender (coded as either 0 or 1)?
B0 is the overall mean of Y for individuals with x=0.
B0 + Uj is the mean for individuals with x=0 in group j.
The slope B1 is the difference in the mean for x=1 relative to x=0 (in any group).
Q: what does the intercept of a model represent in single vs multilevel regression?
In a linear regression model, the intercept represents the value of the dependent variable when all independent variables are set to zero. It is the value of the dependent variable when there is no contribution from any of the independent variables.
However, in a random intercept model, the intercept for a group (B0 + Uj) is the value of the DV for that group when all the independent variables are set to zero. It captures the average or typical value of the dependent variable for that particular group at that point.
why does ignoring clustering produce underestimated standard errors?
Because the analysis assumes that you have, for example, 5000 independent observations, the similarity of their results is taken as evidence of high precision.
In reality, individuals in the same group are alike, so you may have closer to 100 independent observations (e.g., the number of groups) and much less information than the analysis assumes.
what is the number of independent observations called?
Effective sample size (ESS)
in multilevel modelling, what does the effective sample size depend on
The degree of clustering (measured by the intra-class correlation or variance partition coefficient)
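One common way to put a number on this is the design effect, 1 + (n - 1) x ICC, where n is the group size. This formula is not given in the cards and assumes equal group sizes, so treat the sketch below as an approximation. The function name is hypothetical; the example reuses 5000 individuals in 100 groups of 50.

```python
def effective_sample_size(n_total, n_per_group, icc):
    """Approximate ESS under equal group sizes, using the design effect 1 + (n - 1) * ICC."""
    design_effect = 1.0 + (n_per_group - 1) * icc
    return n_total / design_effect


# 5000 individuals in 100 groups of 50, at different degrees of clustering
for icc in [0.0, 0.2, 1.0]:
    print(icc, round(effective_sample_size(5000, 50, icc), 1))
```

With no clustering (ICC = 0) all 5000 observations count as independent; with complete clustering (ICC = 1) the ESS falls to the number of groups.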