Multilevel classification Flashcards
The data we collect are structured by a number of classifications, or levels (e.g. classes).
The relationships between classifications can be modelled, and they take one of three forms. What are they?
> Nested: e.g. students within schools. Each student belongs to exactly one school.
> Cross-classified
-e.g. students lie within a cross-classification of school and area.
-Within one school, different students come from different areas.
-Likewise, within one area, students can go to different schools.
-So school is not nested within area, and area is not nested within school.
> Multiple membership
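A minimal sketch of how the nested versus cross-classified distinction could be checked in a dataset. The helper `is_nested` and the toy records are hypothetical and only illustrate the definitions above: a lower classification is nested in a higher one when every lower unit appears with exactly one higher unit.

```python
from collections import defaultdict

def is_nested(records, lower, higher):
    """Return True if every `lower` unit appears with exactly one `higher` unit."""
    parents = defaultdict(set)
    for row in records:
        parents[row[lower]].add(row[higher])
    return all(len(units) == 1 for units in parents.values())

# Hypothetical data: one row per student.
records = [
    {"student": "s1", "school": "A", "area": "north"},
    {"student": "s2", "school": "A", "area": "south"},
    {"student": "s3", "school": "B", "area": "north"},
    {"student": "s4", "school": "B", "area": "south"},
]

# Each student belongs to one school: nested.
students_in_schools = is_nested(records, "student", "school")
# School A draws students from two areas, and each area sends
# students to two schools: school and area are cross-classified.
schools_in_areas = is_nested(records, "school", "area")
areas_in_schools = is_nested(records, "area", "school")
```

In this toy data students are nested within schools, while neither school nor area is nested within the other.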
Two-level nested structure: students within schools.
Which is level 1 and which is level 2?
Students (level 1) are nested within schools (level 2).
Students nested within schools: how many units do I need in each classification?
It depends on the target of inference.
If you are interested in the units themselves, e.g. a particular school, then you need many students per school.
If you are interested in between-school differences in general, then you need many schools to get a reliable estimate.
Three variables: school, school type (state or private) and student. Which of these are explanatory variables and which are levels? Why would school type not be regarded as a level?
School and student are levels; school type is an explanatory variable. A classification/level is regarded as random if its units have been randomly sampled from a wider population and the conclusions are meant to generalise to that population (the target of inference). We can randomly select schools from a wider population of schools, and randomly select a subset of students, but we cannot randomly sample state and private from a wider population of school types: only these two categories exist. School type is therefore fixed.
How does a variable being regarded as fixed or random affect the analysis?
If it is to be treated as a level in a multilevel model, it has to be random.
The decision is also linked to the target of inference.
Multilevel structure can be imposed in two ways. What are they?
- By levels that exist within the population e.g., patients within hospitals
- By the study design and data collection
How can two-level nested structures arise from research design?
- Repeated measures, panel data
- Multivariate designs
- Multistage survey designs
- Intervention studies where the intervention is made at the group level
What form do data need to be in to analyse repeated measures?
Long form.
What is the difference between long and wide form data structures?
Long form has one row per measurement occasion. Wide form has one row per individual, with a column for each measurement occasion. Repeated measures data often come in wide form and need to be converted.
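A minimal sketch of the wide-to-long conversion using plain Python dictionaries. The function `wide_to_long`, the column names (`score_t1` etc.) and the toy scores are hypothetical; skipping occasions with no measurement is an assumption made here for illustration, not something the notes prescribe.

```python
def wide_to_long(wide_rows, id_key, occasion_keys):
    """Turn one row per individual into one row per measurement occasion."""
    long_rows = []
    for row in wide_rows:
        for occasion, key in enumerate(occasion_keys, start=1):
            value = row.get(key)
            if value is None:
                continue  # assumption: a missing occasion produces no row
            long_rows.append(
                {id_key: row[id_key], "occasion": occasion, "score": value}
            )
    return long_rows

# Hypothetical wide-form data: one row per individual, one column per occasion.
wide = [
    {"id": 1, "score_t1": 10, "score_t2": 12, "score_t3": 15},
    {"id": 2, "score_t1": 8, "score_t2": None, "score_t3": 11},
]

long = wide_to_long(wide, "id", ["score_t1", "score_t2", "score_t3"])
for row in long:
    print(row)
```

Individual 1 contributes three rows and individual 2 contributes two, so the long-form table has one row per occasion actually observed.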
Does the multilevel framework require balance across the groups?
Example: repeated measures data.
No. The multilevel approach does not require that each individual is measured on every occasion; that is, the multilevel framework does not require balance.
What do we mean by multivariate responses within individuals?
When each individual has two response variables, e.g. each student has both an exam score and a maths score as responses, with gender as the predictor.
How does multivariate structure differ from multilevel structure?
A multilevel structure has two levels in the data, e.g. individuals nested within schools. This contrasts with a multivariate structure, where two responses are recorded for each individual.
What is a multistage design?
A multistage design is a sampling design that is often used in multilevel studies. Higher-level units such as neighborhoods or places are selected first, and then individuals are selected within those units.
Multistage designs, however, generate dependent data: individuals sampled from the same neighborhood tend to be more alike than individuals from different neighborhoods (a clustering effect).
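What is the design effect?
A factor used to obtain correct standard errors when the data are dependent. It measures how much the sampling variance of an estimate is inflated by the clustered design relative to a simple random sample of the same size.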
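A minimal sketch of one common way to compute a design effect. The notes give no formula, so this uses the standard approximation deff = 1 + (m - 1) * ICC, which assumes equal cluster sizes m and a known intraclass correlation. The function names and the numbers are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Approximate design effect, assuming equal cluster sizes.

    cluster_size: number of individuals per cluster (m)
    icc: intraclass correlation, i.e. the share of variance between clusters
    """
    return 1.0 + (cluster_size - 1) * icc

def corrected_se(naive_se, cluster_size, icc):
    """Inflate a standard error that was computed as if the data were independent."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))

# Hypothetical example: 20 students per school, ICC of 0.1.
deff = design_effect(cluster_size=20, icc=0.1)
se = corrected_se(naive_se=0.2, cluster_size=20, icc=0.1)
print(f"design effect: {deff:.2f}, corrected SE: {se:.3f}")
```

With an ICC of zero the design effect is 1 and the standard error is unchanged; the larger the clusters or the ICC, the more the naive standard error understates the uncertainty.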
What problems arise when dependent data are treated as though they are independent?
Incorrect estimates of precision (standard errors that are too small) and an increased risk of type I error.