Intro to Multi-level Data Flashcards

1
Q

What is the difference between level-1 & level-2 variables?

A

Level-1 variables vary at level 1 (e.g., students within the same school differ in SES).

Level-2 variables cannot vary at level 1 (e.g., every student in a school has the same student-to-teacher ratio, school type, school size, etc.).
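A quick way to check the level of a variable is to ask whether it takes more than one value inside any cluster. The sketch below illustrates this with made-up student records; the function name, the field names (`school`, `ses`, `student_teacher_ratio`) and the data are hypothetical and not part of the original cards.

```python
from collections import defaultdict


def varies_within_clusters(rows, cluster_key, var):
    """Return True if `var` takes more than one value inside at least one cluster."""
    values = defaultdict(set)
    for row in rows:
        values[row[cluster_key]].add(row[var])
    return any(len(v) > 1 for v in values.values())


# Hypothetical toy data: four students in two schools.
students = [
    {"school": "A", "ses": 3, "student_teacher_ratio": 15},
    {"school": "A", "ses": 5, "student_teacher_ratio": 15},
    {"school": "B", "ses": 2, "student_teacher_ratio": 22},
    {"school": "B", "ses": 2, "student_teacher_ratio": 22},
]

# SES differs between students of school A, so it behaves like a level-1 variable.
print(varies_within_clusters(students, "school", "ses"))  # True
# The ratio is constant within every school, so it behaves like a level-2 variable.
print(varies_within_clusters(students, "school", "student_teacher_ratio"))  # False
```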

2
Q

How are multi-level and panel datasets similarly structured?

A
Both are hierarchical: lower-level units are nested in higher-level units.
  • PISA, SOEP, etc. (e.g., students nested in schools)
  • longitudinal data in general (time points nested in individuals)
3
Q

Why do we need special methods for multi-level data?

1) Correct statistical inference

A

“Dependence as nuisance” (Snijders & Bosker): basic assumptions of regression/inferential statistics are violated, so we can no longer assume:

  1. Independent observations
  2. Independent error terms
  3. Homoscedastic errors
  4. Normal distribution of errors

Examples

  • Exam scores are more similar within classes
  • Political attitudes cluster in regions
  • Measurements of body weight are correlated over time

Observations in multi-level/longitudinal data are highly correlated within clusters.
This means we cannot evaluate statistical uncertainty appropriately because our standard errors become too small. The larger the sample, the smaller the standard error, but here the sample size is artificially inflated by observations that are not independent. The n in the denominator is larger than it should be, so the SE is underestimated.
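How strongly clustering inflates the apparent sample size can be illustrated with the design effect, 1 + (m - 1) × ICC, where m is the cluster size. This formula is not part of the original cards; it is a standard approximation that assumes equally sized clusters, and the numbers below are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Design effect for equally sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is 'worth'."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical example: 2100 students in classes of 21, ICC = 0.25.
n_total, cluster_size, icc = 2100, 21, 0.25

deff = design_effect(cluster_size, icc)
n_eff = effective_sample_size(n_total, cluster_size, icc)

print(deff)   # 6.0
print(n_eff)  # 350.0
# A naive SE that ignores clustering is too small by roughly sqrt(deff).
print(f"{math.sqrt(deff):.2f}")  # 2.45
```

Under these assumptions, 2100 clustered observations carry about as much information as 350 independent ones.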

4
Q

Why do we need special methods for multi-level data?

2) Substantive questions

A
  • Dependencies/correlations within clusters as the subject matter itself (e.g., how much of the variance in grades can we attribute to differences between schools?)
5
Q

Why do we need special methods for multi-level data?

3) Dealing with unobserved heterogeneity

A

Because the data are hierarchical, there is unobserved heterogeneity at each level, which can affect the relationships between variables at those levels.
By capturing this heterogeneity with a random effect (RE), researchers gain a better understanding of how group-level factors influence individual outcomes and obtain more accurate estimates.

6
Q

Possible level-2 variables influencing math performance of students?

A
7
Q

How can the meaning of a level-1 variable change when it is aggregated?

Male -> Math performance

A
  • male at level 1 (individual student): pressure from parents and teachers, socialization → positive influence on math performance
  • male at level 2 (% of boys in class): more disturbance in class → negative influence on math performance
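The aggregation step itself is simple: the level-2 variable is the class mean of the level-1 indicator, attached back to every student in that class. A minimal sketch, assuming hypothetical field names (`class_id`, `male`, `share_male_class`) and made-up data:

```python
from collections import defaultdict


def share_by_cluster(rows, cluster_key, flag_key):
    """Return {cluster: share of rows with a truthy flag} for each cluster."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        totals[row[cluster_key]] += 1
        if row[flag_key]:
            hits[row[cluster_key]] += 1
    return {cluster: hits[cluster] / totals[cluster] for cluster in totals}


# Hypothetical toy data: male = 1 for boys, 0 for girls.
students = [
    {"class_id": "7a", "male": 1},
    {"class_id": "7a", "male": 1},
    {"class_id": "7a", "male": 0},
    {"class_id": "7a", "male": 1},
    {"class_id": "7b", "male": 0},
    {"class_id": "7b", "male": 1},
]

share_male = share_by_cluster(students, "class_id", "male")
print(share_male)  # {'7a': 0.75, '7b': 0.5}

# Attach the class-level share to each student: the same value for everyone
# in a class, so the new variable cannot vary at level 1.
for student in students:
    student["share_male_class"] = share_male[student["class_id"]]
```

In a model, `male` and `share_male_class` would then enter as two separate predictors with potentially different signs.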
8
Q

What could be level-1 and level-2 confounders of motivation -> math performance?

A
9
Q

How can we split variance?

A
10
Q

How do we model the mean?

A
11
Q

What is in the residuals/error term?

Model for the mean

A

Unobserved heterogeneity/omitted variables: the error term contains all the factors that influence the outcome but are not in the model.
These factors are assumed to be random.

12
Q

What is special about the residuals?

Model for the mean

A
13
Q

How do we need to modify the model for the mean for multi-level data?

A

Split the variance into two error terms, one per level.

Result: unobserved heterogeneity on both levels
  • Unobserved level-1 factors
  • Unobserved level-2 factors
Both are assumed to be random.
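The split into two error terms can be written out for the empty (intercept-only) model. The notation below is an assumption, since the original card does not show its formula: i indexes level-1 units (e.g., students), j indexes level-2 units (e.g., schools), u_j collects the unobserved level-2 factors and e_ij the unobserved level-1 factors.

```latex
% Single-level model for the mean: one error term
y_i = \beta_0 + e_i, \qquad e_i \sim N(0, \sigma_e^2)

% Two-level model for the mean: one error term per level
y_{ij} = \beta_0 + u_j + e_{ij}, \qquad
u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)

% With u_j and e_{ij} independent, the total variance splits into two parts
\operatorname{Var}(y_{ij}) = \sigma_u^2 + \sigma_e^2
```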

14
Q

What is the variance partition coefficient and how do you calculate it?

A

Variance Partition Coefficient (VPC), also known as the Intraclass Correlation Coefficient (ICC)

Statistical measure to quantify the proportion of variance in a dependent variable that is attributable to different levels of the data hierarchy
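In a two-level empty model, the usual calculation is the between-cluster variance divided by the total variance (between plus within). The sketch below assumes the two variance components have already been estimated; the function name and the numbers are hypothetical and only illustrate the arithmetic.

```python
def variance_partition_coefficient(var_between, var_within):
    """Share of total variance that lies between clusters (level 2)."""
    total = var_between + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_between / total


# Hypothetical variance components from an empty two-level model.
vpc = variance_partition_coefficient(var_between=20.0, var_within=80.0)
print(vpc)  # 0.2
```

With these made-up values, 20% of the variance in the outcome would lie between clusters and 80% within them.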

15
Q

Empty model mscore: Stata code

A
16
Q

How much variance is between vs within schools?

A
17
Q

Why do we need special methods for nested data?

Mentimeter

A