Introduction to Multilevel Modelling Flashcards

1
Q

What are the different names for multilevel modelling?

A

Hierarchical linear models, mixed models, random effects models and variance components models.

2
Q

What assumption made by simple multiple regression would not hold if we had hierarchical data?

A

It assumes the observations are independent, specifically that the residuals are uncorrelated with one another. If the data are in fact grouped and we have not taken this into account, this assumption will not hold.

3
Q

How can we account for grouping effects in data? What are the problems of each?

A
  1. Include dummy variables for the groups.
  2. Include explanatory variables that measure group characteristics believed to influence the outcome as additional predictors, effectively controlling for their effects.

For example, let’s say you are studying academic performance in students across different schools. You suspect that the school itself may have an effect on individual student performance. In this case, you can include variables that measure school characteristics (e.g., school funding, student-teacher ratio, school size) as additional predictors in your model.

4
Q

What happens if we don’t take the clustering effect into account in the model?

A

The standard errors of the regression coefficients will be underestimated. Consequently, confidence intervals will be too narrow and p-values will be too small, which increases the risk of a Type I error.

5
Q

If there is a clustering effect in the analysis (e.g., individuals from the same school having more similar scores than individuals from different schools) and it is not accounted for, which coefficients will be most severely affected?

A

Underestimation of standard errors is particularly severe for coefficients of predictors that are measured at the group level, e.g., an indicator of whether a school is mixed or single-sex.

6
Q

What methods can we use if we are only interested in controlling for clustering, versus if we are interested in exploring its effects?

A

If we are interested in exploring it, multilevel modelling is needed. It gives us correct standard errors.
If not, and clustering is treated as a nuisance to control for, there are two options:
* Use methods that adjust standard errors for design effects
* Model the dependency between individuals in the same group explicitly using a marginal model

7
Q

Multilevel modelling enables researchers to investigate the nature of between-group variability, and the effects of group-level characteristics on individual outcomes. Is this true?

A

Yes.

8
Q

What is a model with dummy variables for groups called?

A

A fixed effects model.

9
Q

Hierarchical data: what are the consequences of including a set of dummy variables for groups (a fixed effects model)? What are its limitations?

A

Group is treated as a fixed classification, so the target of inference is restricted to the groups represented in the sample.

If the number of groups is large, there will be a large number of additional parameters to estimate.

10
Q

Hierarchical data: what are the consequences of fitting a single-level model with group-level predictors?

A

High risk of Type I errors because standard errors of coefficients of group-level predictors may be severely underestimated. No estimate of the between-group variance that remains unaccounted for by the included group-level predictors.

11
Q

Hierarchical data: what are the consequences of correcting standard errors for design effects, or fitting a marginal model in which the dependency is modelled directly?

A

The standard errors will be correct (properly adjusted for clustering), but we are unable to assess the degree of between-group variation.

12
Q

Hierarchical data: what are the consequences of multilevel modelling (random effects)?

A

Correct standard errors and an estimate of the between-group variance.

13
Q

What is a typical two-level model?

A

$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}$

where

$u_j \sim N(0, \sigma^2_u)$

$e_{ij} \sim N(0, \sigma^2_e)$
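
To make the two-level model concrete, here is a minimal sketch that simulates data from it using only the Python standard library. The function name, the parameter values, and the choice of a standard-normal predictor are assumptions made for illustration, not part of the original card.

```python
import random
from statistics import mean

def simulate_two_level(n_groups, n_per_group, beta0, beta1, sigma_u, sigma_e, seed=0):
    """Simulate y_ij = beta0 + beta1 * x_ij + u_j + e_ij (hypothetical helper)."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        # One level 2 residual per group, shared by everyone in that group.
        u_j = rng.gauss(0.0, sigma_u)
        for _ in range(n_per_group):
            x_ij = rng.gauss(0.0, 1.0)       # assumed predictor distribution
            e_ij = rng.gauss(0.0, sigma_e)   # level 1 residual, one per individual
            y_ij = beta0 + beta1 * x_ij + u_j + e_ij
            rows.append((j, x_ij, y_ij))
    return rows

# Illustrative values only: 20 groups of 50 individuals.
data = simulate_two_level(n_groups=20, n_per_group=50,
                          beta0=10.0, beta1=2.0, sigma_u=3.0, sigma_e=1.0)

# Group means of y differ from each other mainly because of u_j.
group_means = {j: mean(y for g, _, y in data if g == j) for j in range(20)}
```

Because every individual in group j shares the same u_j, their outcomes are correlated, which is exactly the dependence a single-level regression ignores.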

14
Q

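Write the regression equation for:
> simple regression model

Multilevel data:
> fixed effects model (null model)
> random intercepts model (null model)
> random slope model

A

Simple: $y_i = \beta_0 + \beta_1 x_i + e_i$

Fixed: $y_{ij} = \beta_0 + u_j + e_{ij}$ (the $u_j$ are fixed parameters, estimated with dummy variables for the groups)

Random intercepts: $y_{ij} = \beta_0 + u_j + e_{ij}$ ($u_j$ and $e_{ij}$ normally distributed)

Random slope: $y_{ij} = \beta_0 + \beta_1 x_{ij} + u_{0j} + u_{1j} x_{ij} + e_{ij}$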

15
Q

What are the limitations of the fixed effects approach?

A

When the number of groups is large, there will be many extra parameters to estimate. (Only one in a multilevel model.)
For groups with small sample sizes, the estimated group effects may be unreliable. (In a multilevel model, residual estimates for such groups are ‘shrunken’ towards zero.)

16
Q

What makes it a multilevel model?

A

As soon as you include a term ($u_j$) for the difference between the overall mean and the group-specific mean, the model is multilevel.

17
Q

What does the random intercept model help you see that the simple regression model doesn’t?

A

Between-group effects.

18
Q

Write out the multilevel model for group means.

A

$y_{ij} = \beta_0 + u_j + e_{ij}$

* $y_{ij}$ is the response for person $i$ in group $j$.
* $\beta_0$: the overall mean score across all groups
* $u_j$: the difference between the overall mean and the mean of group $j$
* $e_{ij}$: the residual, i.e., the difference between the person's individual score ($y_{ij}$) and their group mean ($\beta_0 + u_j$)

19
Q

What does $\sigma^2_u$ reflect in a multilevel model?

A

Between-group variance
- based on the differences between the group means and the overall mean

20
Q

What does $\sigma^2_e$ reflect in a multilevel model?

A

Within-group, between-individual variance
- based on individual differences from the group mean

21
Q

What are the fixed and random parts of the multilevel regression equation?

A

Fixed part
- Specifies the relationship between the mean of $y$ and the explanatory variables
- $\beta_0 + \beta_1 x_{ij}$
- Fixed part parameters: $\beta_0$ and $\beta_1$

Random part
- Contains the level 1 and level 2 residuals
- Random part: $u_j + e_{ij}$
- Random part parameters: $\sigma^2_e$ and $\sigma^2_u$

22
Q

By including a group-level term ($u_j$), how have we changed the structure of the data?

A

We now have a two-level structure, with individuals (level 1) nested within groups (level 2).

23
Q

What is the variance partition coefficient (VPC)?

A

It measures the proportion of total variance that is due to differences between groups.

24
Q

How is the VPC calculated?

A

The variance at a given level (e.g., group) divided by the total variance (between-group + within-group variance): $\text{VPC} = \sigma^2_u / (\sigma^2_u + \sigma^2_e)$

25
Q

For simple multilevel models, what is the VPC equal to?

A

The intra-class correlation coefficient.

If the VPC is 0.2, for example, we would say that 20% of the variation is between groups and 80% within. The correlation between randomly chosen pairs of individuals belonging to the same group is 0.2.
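
As a quick worked illustration, the VPC for a simple two-level model can be computed directly from the two variance components. The function name and the variance values below are made up for the example.

```python
def vpc(sigma2_u, sigma2_e):
    """Proportion of total variance that lies between groups."""
    return sigma2_u / (sigma2_u + sigma2_e)

# Hypothetical variance estimates: between-group 2.0, within-group 8.0.
example_vpc = vpc(2.0, 8.0)   # 2 / (2 + 8) = 0.2
print(example_vpc)
```

With these assumed values, 20% of the variation is between groups and 80% is within groups.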

26
Q

How would you interpret a VPC of 0.4?

A

40% of the variation is between groups and 60% within.

27
Q

Comparing two models, a single-level model and a multilevel model: how can we test the null hypothesis that there are no group-level differences?

A

The likelihood ratio test statistic is obtained from:
$LR = -2 \log L_1 - (-2 \log L_2)$
where $L_1$ and $L_2$ are the likelihood values of the single-level and multilevel models respectively, and ‘log’ refers to the natural logarithm.

28
Q

What is the LR test statistic compared against? How many degrees of freedom do we have?

A

A chi-squared distribution with the number of degrees of freedom equal to the number of extra parameters in the more complex model.
There is 1 additional parameter in this example (the between-group variance), so there is 1 degree of freedom.
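
The test described on the last two cards can be sketched in a few lines. This is only an illustration: the function name and the log-likelihood values are hypothetical, and the p-value uses the identity that a chi-squared variable with 1 degree of freedom is a squared standard normal, so its upper tail is `erfc(sqrt(x / 2))`.

```python
from math import erfc, sqrt

def lr_test_1df(loglik_single, loglik_multi):
    """Likelihood ratio test of a single-level vs a multilevel model (1 df)."""
    # LR = -2 log L1 - (-2 log L2), with L1 single-level and L2 multilevel.
    lr = -2.0 * loglik_single - (-2.0 * loglik_multi)
    lr = max(lr, 0.0)  # guard against tiny negative values from rounding
    # Upper-tail probability of a chi-squared distribution with 1 df.
    p_value = erfc(sqrt(lr / 2.0))
    return lr, p_value

# Hypothetical log-likelihoods from two fitted models.
lr, p_value = lr_test_1df(loglik_single=-1250.0, loglik_multi=-1200.0)
print(lr, p_value)
```

A large LR with a small p-value would be read as evidence of group-level differences, so the multilevel model is preferred.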

29
Q

What is the difference between testing for group effects and estimating group effects (multilevel modelling)?

A

Testing for group effects refers to determining whether there is evidence of systematic differences between groups, e.g., with a likelihood ratio test.

Estimating group effects, on the other hand, involves quantifying the magnitude and direction of the group effects on the outcome variable, i.e., how far each group lies above or below the overall mean.

30
Q

What are the two ways we can estimate group effects?

A
  1. Random effects approach
  2. Fixed effects approach
31
Q

What is the difference between the random and fixed effects approaches?

A

The fixed effects approach includes dummy-coded variables for the groups.

The random effects approach includes a single parameter representing the between-group variance.

The main difference is that the group effects are treated as fixed parameters in the fixed effects approach, while in the random effects approach they are treated as following a normal distribution summarized by a variance parameter.

32
Q

How many sets of residuals are there in the single-level versus the multilevel model?

A

In a single-level model there is a single set of residuals: the difference between each individual's observed score and the score predicted by the model.

In a multilevel model we have two error terms. The total residual is $u_j + e_{ij}$, so the total variance is $\sigma^2_u + \sigma^2_e$ (between-group + within-group variance). The estimated residuals at each level have to be obtained separately.

33
Q

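How do we calculate a level 2 residual?

A

Step 1: Get the mean raw residual for the group (the average of observed minus predicted values across the individuals in group $j$).
Step 2: Multiply the mean raw residual by the shrinkage factor.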

34
Q

What is the shrinkage factor?

A
  • Between-group variance
  • Divided by between-group variance + (within-group variance / sample size in group), i.e., $\sigma^2_u / (\sigma^2_u + \sigma^2_e / n_j)$
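
The two-step calculation of a level 2 residual (mean raw residual, then shrinkage) can be sketched as follows. The function names and the numbers are assumptions chosen so the arithmetic is easy to follow.

```python
def shrinkage_factor(sigma2_u, sigma2_e, n_j):
    """Between-group variance / (between-group variance + within-group variance / n_j)."""
    return sigma2_u / (sigma2_u + sigma2_e / n_j)

def level2_residual(raw_residuals, sigma2_u, sigma2_e):
    """Shrunken level 2 residual for one group from its raw residuals (observed - predicted)."""
    n_j = len(raw_residuals)
    mean_raw = sum(raw_residuals) / n_j           # Step 1: mean raw residual
    k = shrinkage_factor(sigma2_u, sigma2_e, n_j)
    return k * mean_raw                           # Step 2: multiply by shrinkage factor

# Hypothetical group of 4 individuals, with assumed variance estimates.
raw = [2.0, 4.0, 6.0, 4.0]                        # mean raw residual = 4.0
u_hat = level2_residual(raw, sigma2_u=4.0, sigma2_e=16.0)
print(shrinkage_factor(4.0, 16.0, 4), u_hat)      # factor 0.5, residual 2.0
```

Here the group is small relative to the within-group variance, so its mean raw residual of 4.0 is pulled halfway towards zero.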
35
Q

Do we have the same shrinkage factor for every group (e.g., UK, Germany and France)?

A

No.

There is a different shrinkage factor for each group, because it depends on the group's sample size.

36
Q

How do we get the level 1 residual?

A
  • Observed value ($y_{ij}$)
  • Minus the predicted value ($\beta_0 + \beta_1 x_{ij}$)
  • Minus the level 2 residual
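
The subtraction above can be written out as a one-line calculation. The coefficient values, the data point, and the estimated level 2 residual are hypothetical.

```python
def level1_residual(y_ij, x_ij, beta0, beta1, u_hat_j):
    """Observed value minus fixed-part prediction minus the group's level 2 residual."""
    predicted = beta0 + beta1 * x_ij
    return y_ij - predicted - u_hat_j

# Assumed values: prediction is 10 + 2 * 2 = 14, group residual is 0.5.
e_hat = level1_residual(y_ij=15.0, x_ij=2.0, beta0=10.0, beta1=2.0, u_hat_j=0.5)
print(e_hat)  # 15 - 14 - 0.5 = 0.5
```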
37
Q

How does the group sample size affect the shrinkage?

A

Smaller sample size = more shrinkage

38
Q

How does the within-group variance affect the shrinkage?

A

Large within-group variance = a lot of shrinkage

39
Q

How does the between-group variance affect the shrinkage?

A

Small between-group variance = a lot of shrinkage
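
The three rules on these cards can be checked numerically by changing one input at a time. This is a small sketch with made-up values; a factor closer to 0 means more shrinkage.

```python
def shrinkage_factor(sigma2_u, sigma2_e, n_j):
    """Between-group variance / (between-group variance + within-group variance / n_j)."""
    return sigma2_u / (sigma2_u + sigma2_e / n_j)

# Baseline with assumed values: sigma2_u = 4, sigma2_e = 16, n_j = 20.
baseline = shrinkage_factor(4.0, 16.0, 20)        # 4 / (4 + 0.8)

small_group = shrinkage_factor(4.0, 16.0, 2)      # smaller n_j
noisy_within = shrinkage_factor(4.0, 64.0, 20)    # larger within-group variance
small_between = shrinkage_factor(1.0, 16.0, 20)   # smaller between-group variance

for label, k in [("baseline", baseline), ("small group", small_group),
                 ("large within-group variance", noisy_within),
                 ("small between-group variance", small_between)]:
    print(f"{label}: {k:.3f}")
```

Each of the three changes gives a factor below the baseline, i.e., more shrinkage towards zero.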

40
Q

How do we present level 2 residuals?

A

We can use a caterpillar plot, e.g., showing the level 2 residual and its confidence interval for each country.

41
Q

Can the shrinkage factor be larger than 1?

A

No, the shrinkage factor is always less than or equal to 1.

So when the mean raw residual (MRR) is multiplied by it, the estimated residual is equal to or smaller in magnitude than the MRR.

42
Q

Why are shrinkage residuals also regarded as precision-weighted estimates?

A

These shrinkage residuals are also called precision-weighted estimates because we have taken their reliability into account in their estimation. Unreliable estimates with, for example, small nj will be shrunk towards the overall mean. Reliable estimates with a large nj will keep close to their raw mean value.

43
Q

What is the fixed effects approach to looking for group differences called?

A

Analysis of variance (ANOVA)

44
Q

Starting from the single-level regression equation, how many extra parameters do we have in a fixed versus a random effects model measuring an outcome score for 20 countries?

A

Fixed effects: 19 additional parameters (one dummy variable per country, with one country as the reference category)

Random effects: just one, the between-group variance

45
Q

Why is a random effects approach better if the groups vary in sample size?

A

*Fixed effects: the estimated group effects for small groups might be unreliable, because the reliability of each group's estimate is not taken into account.
*Random effects: recognizes that there is little information for these groups by ‘shrinking’ their residual estimates towards zero, and therefore pulling their means towards the overall mean.

46
Q

What does it mean in terms of target of inference if fixed classifications are used vs random classifications?

A

If we want to draw conclusions about the specific categories chosen, e.g., treatment A vs treatment B, then the classification is treated as fixed. We want to make inferences about those specific categories.
In contrast, if we are sampling from a wider population (e.g., schools) and the target of inference is this wider population, then random classifications are used.
The fixed effects approach prohibits generalisations to groups beyond our sample. In the random effects approach, the between-group variance is interpreted as the between-group variance in the wider population.

47
Q

Extend this random intercept model with no predictors to include one explanatory variable:

$y_{ij} = \beta_0 + u_j + e_{ij}$

A

$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}$

48
Q

$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}$

Random intercept model.

What is the overall relationship between x and y, and what is the intercept for group $j$?

A
  • Overall relationship between x and y is represented by the line $\beta_0 + \beta_1 x_{ij}$
  • Intercept for individuals in group $j$: $\beta_0 + u_j$
49
Q

What is the random intercept model?

A

Any model where the intercepts of the group regression lines are allowed to take different values drawn from a distribution. The slope $\beta_1$ is assumed to be the same for each group.

50
Q

Multilevel regression with a single explanatory variable: what exactly are the intercept and slope for a dichotomous explanatory variable, e.g., gender (coded as 0 or 1)?

A

$\beta_0$ is the overall mean of $y$ for individuals with x = 0.

$\beta_0 + u_j$ is the mean for individuals with x = 0 in group $j$.

The slope $\beta_1$ is the difference in the mean for x = 1 relative to x = 0 (in any group).

51
Q

What does the intercept of a model represent in single-level vs multilevel regression?

A

In a linear regression model, the intercept represents the value of the dependent variable when all independent variables are set to zero. It is the value of the dependent variable when there is no contribution from any of the independent variables.

However, in a random intercept model, each group has its own intercept ($\beta_0 + u_j$), which is the expected value of the dependent variable for that group when all the independent variables are set to zero. It captures the average or typical value of the dependent variable for that particular group, while $\beta_0$ is the overall intercept across groups.

52
Q

Why does ignoring clustering produce underestimated standard errors?

A

Because the analysis assumes that every observation is independent (for example, 5000 independent observations all giving very similar results), which overstates the precision.
In contrast, observations in the same group carry overlapping information, so the number of effectively independent observations is much smaller (perhaps closer to 100), and the evidence is not as strong as it appears.

53
Q

What is the number of independent observations called?

A

Effective sample size (ESS)

54
Q

In multilevel modelling, what does the effective sample size depend on?

A

The degree of clustering (measured by the intra-class correlation or variance partition coefficient).
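
One common way to quantify this is the design effect for equal-sized groups, 1 + (m - 1) × ICC, where m is the group size; the effective sample size is the total sample size divided by it. The sketch below assumes equal group sizes, and the figures (5000 individuals in 100 groups of 50) are illustrative.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_total, cluster_size, icc):
    """Total sample size deflated by the design effect."""
    return n_total / design_effect(cluster_size, icc)

# Assumed design: 5000 individuals in 100 groups of 50.
for icc in (0.0, 0.2, 1.0):
    ess = effective_sample_size(5000, 50, icc)
    print(f"ICC = {icc}: effective sample size = {ess:.1f}")
```

With no clustering (ICC = 0) all 5000 observations count as independent. With complete clustering (ICC = 1) the effective sample size falls to 100, the number of groups.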