M7 - MLM Flashcards

1
Q

Part B - Question:

Which of the following is a potential cluster and why?

Models of cars by Holden.
Schools that students attend.
Health ratings at Time 1 of each individual within one’s sample, health ratings at Time 2, health ratings at Time 3, etc.

A

Schools that students attend.

2
Q

Part C - Question: Dr Pangloss has collected data on student experience of bullying at school (intended independent variable) and their level of engagement in classroom activities (intended dependent variable). Data were collected from students across 15 classes in the same school.

Which of the following statements is correct?

  1. Obtaining class-level averages for both variables would allow Dr Pangloss to evaluate whether average level of bullying in a classroom is predictive of average level of student engagement in a classroom.
  2. Dr Pangloss does not need to use multilevel modelling because all students were recruited from the same school.
  3. Dr Pangloss does not need to use multilevel modelling because the IV and DV are both Level 1 variables.
A

Obtaining class-level averages for both variables would allow Dr Pangloss to evaluate whether average level of bullying in a classroom is predictive of average level of student engagement in a classroom.

3
Q

Part D - Question: The following graph depicts…

[Graph: three parallel lines rising diagonally from bottom left to top right, each starting at a different point on the y-axis at the beginning of the x-axis.]

  1. Random intercept but fixed slopes.
  2. Random slopes but fixed intercepts.
  3. An instance where multilevel modelling is unnecessary.
  4. Random slopes and random intercepts
A
  1. Random intercept but fixed slopes.
4
Q

Part G - Question: The intra-class correlation indicates…

  1. How much of the variance in the DV is due to within-group variance.
  2. How much of the variance in the DV is due to between-group variance.
  3. The amount of variance in the slope for the relationship between an IV and DV
A
  2. How much of the variance in the DV is due to between-group variance.
5
Q

What is multilevel modelling (MLM) and when/why would it be used?

A

A flexible framework for modelling relationships between variables in clustered data. It asks: does the relationship we are interested in work consistently across the groups?

Can be used to

1) check for variance across clusters
2) correct for differences if needed
3) ponder why these differences exist - depending on our variables, we might be able to test hypotheses as to why these groups differ

Multiple IVs, 1 DV - both can be continuous or categorical, although categorical can be difficult to work with
- use with around 30 clusters or more

Clusters can vary on the DV
Clusters can vary on the IV-DV relationship
- Either sort of variance makes running standard MR problematic

Can be

  • multilevel regression
  • multilevel path analysis
  • multilevel factor analysis
6
Q

Define “clustering”.

A

Data are grouped so that scores from the same group tend to be more alike than scores from different groups; groups can therefore differ from one another
- also referred to as nesting or hierarchies
eg classes within schools
eg schools within a state
eg cases presided over by judges
eg time points within participants

7
Q

Define “intra-class correlations (ICCs)”

A

Intra-class correlations test the suitability of/need for a random intercept in MLM

ICC = between-cluster variance / (between-cluster + within-cluster variance)
= between-cluster variance / total variance
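
The ICC formula above can be sketched in a few lines of Python. The function name and the variance values are hypothetical and chosen only for illustration; in practice the two variance components would come from a model containing just the DV and the clustering variable.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """ICC = between-cluster variance / (between + within cluster variance)."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Hypothetical variance components (not from the source notes)
icc = intraclass_correlation(between_var=2.0, within_var=8.0)
print(f"ICC = {icc:.2f}")  # 2 / (2 + 8) = 0.20
```

Here 20% of the variance in the DV would sit between clusters and the remaining 80% within clusters.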

8
Q

Define “centering”.

A

Centering of L1 predictors is a way to separate within-cluster from between-cluster variance in the IV

It allows TWO questions to be asked of our data, because we then know both the within-cluster (L1) variance and the between-cluster (L2, cluster averages) variance

The group mean centering approach is best for L1 predictors
No need to centre the DV as the random intercept does this for us
Centering at L2 needs to be grand mean centering (unless L2 is part of an L3 hierarchy), ie the top level needs to be grand mean centred

Grand mean centering - calculate the mean score across all participants on an IV and subtract that mean from each original score
Group mean centering - calculate the mean score on an IV within each group and subtract each participant's own group mean from their original score
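
A minimal sketch of the two centering approaches, using only the Python standard library. The motivation scores and classroom labels are made up for illustration. The group-mean-centred scores carry the within-cluster (L1) information, while the class means carry the between-cluster (L2) information.

```python
from collections import defaultdict
from statistics import mean


def grand_mean_center(scores):
    """Subtract the mean across all participants from every score."""
    grand_mean = mean(scores)
    return [s - grand_mean for s in scores]


def group_mean_center(scores, groups):
    """Subtract each participant's own group mean from their score."""
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    group_means = {g: mean(values) for g, values in by_group.items()}
    centered = [s - group_means[g] for s, g in zip(scores, groups)]
    return centered, group_means


# Hypothetical L1 predictor (student motivation) in two classes
motivation = [2.0, 4.0, 6.0, 7.0, 8.0, 9.0]
classroom = ["A", "A", "A", "B", "B", "B"]

within, class_means = group_mean_center(motivation, classroom)
print(within)                         # [-2.0, 0.0, 2.0, -1.0, 0.0, 1.0]
print(class_means)                    # {'A': 4.0, 'B': 8.0}
print(grand_mean_center(motivation))  # [-4.0, -2.0, 0.0, 1.0, 2.0, 3.0]
```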

9
Q

What are Level 1 variables? What are Level 2 variables? How do they differ?

A

Level 1 variable occurs at the bottom of the hierarchy
- eg student performance, student motivation
Level 2 is the variable L1 is nested in
- eg average student performance, teacher experience, class size
Level 3 is the variable L2 is nested in
- eg average school performance, school size, principal experience

Different levels will have different variables and different amounts of information. There is more information (more units) at lower levels

When testing performance
Level 1 will have the greatest number of units. Take 1 state with 20 schools, where each school has 10 classes and each class has 25 students.
Total students = 1 x 20 x 10 x 25 = 5000
Total classes = 1 x 20 x 10 = 200
Total schools = 1 x 20 = 20
Total states = 1

For repeated measures design with people
Level 1 might be repeated ratings of mood per individual at multiple time points throughout the day
Level 2 might be a stable characteristic of the individual, eg their score on an extroversion/neuroticism scale
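
The unit counts in the school example above can be reproduced with a short sketch. The function and argument names are hypothetical, and equal-sized clusters are assumed at every level.

```python
def units_per_level(states, schools_per_state, classes_per_school, students_per_class):
    """Count the units at each level of a balanced four-level hierarchy."""
    schools = states * schools_per_state
    classes = schools * classes_per_school
    students = classes * students_per_class
    return {"state": states, "school": schools, "class": classes, "student": students}


counts = units_per_level(1, 20, 10, 25)
print(counts)  # {'state': 1, 'school': 20, 'class': 200, 'student': 5000}
```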

10
Q

Explain random vs. fixed effects.

A

A fixed effect is when a parameter (either the intercept b0 or the slope b1) is consistent across clusters.
Can also think of it as the average effect across all clusters - assumes 1 equation is sufficient to explain the data

A fixed slope is useful where clusters differ in their level on the DV but the IV–>DV relationship has the same magnitude in every cluster (ie different intercepts, same slope)
Also useful where the clusters do not differ much on the intercept or slope, so you can get away with averaging

Random effect
The intercept and/or slope varies across groups. The more the clusters differ, the more likely we need to use random effects

Every model will have fixed effects. The fixed effects give us the basis to determine statistically whether we need to use a random effect.

Test need for random intercept - ICC test for cluster difference on DV
Test need for random slope - significance test of random effect for L1 relationship
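
To make the distinction concrete, the sketch below computes predicted lines for individual clusters. The symbols u0 and u1 (a cluster's deviation from the fixed intercept b0 and fixed slope b1) and all numbers are assumptions introduced for illustration, not values from the notes.

```python
def cluster_line(x_values, b0, b1, u0=0.0, u1=0.0):
    """Predicted DV for one cluster: (b0 + u0) + (b1 + u1) * x."""
    return [(b0 + u0) + (b1 + u1) * x for x in x_values]


x = [0.0, 1.0, 2.0]
b0, b1 = 10.0, 2.0  # hypothetical fixed (average) intercept and slope

# Random intercept, fixed slope: clusters start at different heights
# but the lines stay parallel.
low = cluster_line(x, b0, b1, u0=-3.0)
high = cluster_line(x, b0, b1, u0=3.0)
print([h - l for h, l in zip(high, low)])  # [6.0, 6.0, 6.0]

# Random slope, fixed intercept: clusters start together and fan out.
steep = cluster_line(x, b0, b1, u1=1.0)
flat = cluster_line(x, b0, b1, u1=-1.0)
print([s - f for s, f in zip(steep, flat)])  # [0.0, 2.0, 4.0]
```

A constant gap between cluster lines corresponds to the parallel-lines picture of a random intercept with fixed slopes; a growing gap indicates that the slope itself differs across clusters.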

11
Q

Explain cross-level interactions.

A

A cross-level interaction is when you incorporate an L2 predictor to explain differences in the random intercept or random slope, where the intercept or slope is treated as the DV

eg random intercept
IV–>DV is L2 predictor –>b0
Y = b0

eg random slope
IV–>DV is L2 predictor –>b1
Y = b0 + b1*X1
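
Written out in a common two-level notation, the idea looks as follows. The symbols are assumptions not used in the notes: i indexes individuals, j indexes clusters, W_j is the L2 predictor, the gammas are fixed effects, and the u terms are the random effects.

```latex
\begin{align*}
\text{Level 1:}\quad & Y_{ij} = b_{0j} + b_{1j} X_{1ij} + e_{ij} \\
\text{Level 2 (random intercept as DV):}\quad & b_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
\text{Level 2 (random slope as DV):}\quad & b_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}
\end{align*}
```

Substituting the Level 2 equations into Level 1 produces the product term \gamma_{11} W_j X_{1ij}, which is the cross-level interaction.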

12
Q

How do relationships among variables differ for same or different levels?

A

Relationship can be L1 to L1
eg student motivation –> student test performance

L2 –> L1 eg does teacher experience predict student test performance?
Will need to look at Teacher experience –> average student test performance (between-group differences, ie between classes) so that the IV and DV have a comparable number of data points for making predictions

L2 and L1 –> L1
eg effect of student motivation and teacher experience on student test performance?

Two questions are asked
Level 1 is predicting the within-cluster version of the DV
- does student motivation on average predict their performance?
Level 2 is predicting the between-cluster version of the DV
- does teacher experience predict the average performance of a class?

L2 IVs can also be moderators of L1 IV–>DV relationships
Snijder and Boskers 20102 formula produces non-negative estimates:

13
Q

What happens if you don't properly deal with clustered data?

A

Ignoring the clustering

  • just treating as single level regression
  • increased Type 1 error rate (false positive)
  • N overestimated

Recognising clusters, but aggregating to the higher level

  • changes research question to one about averages and between-group differences
  • N reduces (as taken from higher level), and so does power, so Type 2 error rate increases
  • ignores variability within class and assumes the group average represents individual performance (ecological fallacy). This means we could find a positive result when the opposite is true, or vice versa
14
Q

What is ecological and atomistic fallacy?

A

Ecological fallacy is when you assume that the group’s average performance is representative of an individual’s performance

Atomistic fallacy is when the individual’s performance is assumed to represent the group’s average performance (eg treating a L2 predictor as if it were an L1 variable, falsely inflating n to L1 amount)
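
A small made-up data set shows why inferring across levels is risky. In the sketch below the IV and DV are negatively related within every class, yet the class averages are positively related, so conclusions drawn at one level would be wrong at the other. All values and names are hypothetical.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical (IV scores, DV scores) for three classes
classes = {
    "A": ([1.0, 2.0, 3.0], [4.0, 3.0, 2.0]),
    "B": ([4.0, 5.0, 6.0], [7.0, 6.0, 5.0]),
    "C": ([7.0, 8.0, 9.0], [10.0, 9.0, 8.0]),
}

# Within-class (individual-level) relationship in each class
within_slopes = {name: ols_slope(x, y) for name, (x, y) in classes.items()}

# Between-class relationship based on class averages only
class_mean_x = [mean(x) for x, _ in classes.values()]
class_mean_y = [mean(y) for _, y in classes.values()]
between_slope = ols_slope(class_mean_x, class_mean_y)

print(within_slopes)  # {'A': -1.0, 'B': -1.0, 'C': -1.0}
print(between_slope)  # 1.0
```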

15
Q

Describe different ways of dealing appropriately with clustered data.
1) Adjustment to standard error approach

A
1) Adjustment to standard errors
eg Huber-White sandwich estimator
Adjusts our standard errors to recognise the level of clustering in the data (addresses Type 1 error inflation)
test statistic = effect / SE
- tells us whether the effect is significant

The smaller the clustering effect, the closer the effective sample size will head towards the L1 n (6)
The larger the clustering effect, the closer the effective sample size will head towards the L2 n (2)

  • Doesn’t change RQ to one of averages like aggregating to higher level does
  • Less likely to find significant effect than when clustering is ignored
  • more accurate

Problems
  • corrects for cluster differences but doesn’t allow them to be a focus
  • doesn’t permit L2 predictors or interactions between L1 and L2 predictors
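
The point above about the sample size sliding between the L1 n and the L2 n can be illustrated with the Kish design effect. This is a swapped-in illustration rather than what a sandwich estimator literally computes, and it assumes the (6) and (2) in the card refer to 6 individuals in 2 equal clusters of 3.

```python
def effective_sample_size(n_total, cluster_size, icc):
    """Kish design effect: n_eff = N / (1 + (m - 1) * ICC), equal cluster size m."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    design_effect = 1.0 + (cluster_size - 1) * icc
    return n_total / design_effect


# Assumed example: 6 individuals (L1 n) in 2 clusters (L2 n) of 3 each
for icc in (0.0, 0.5, 1.0):
    print(icc, effective_sample_size(n_total=6, cluster_size=3, icc=icc))
# ICC 0.0 -> 6.0 (L1 n), ICC 0.5 -> 3.0, ICC 1.0 -> 2.0 (L2 n)
```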
16
Q

Describe different ways of dealing appropriately with clustered data.
2) MLM approach

A

2) MLM approach
- Retains research question as impact at individual level
- Less likely to find significant effect
- more accurate
- Can evaluate whether L1 IV->L1 DV varies across clusters (L2) - Random effect
If random effect found, can use L2 to predict variation (cross level interaction)

17
Q

Cross level interactions: How do we incorporate L2 predictors with the random intercept?

A

Incorporating Level 2 predictors – Random intercept
Y=b0

Takes the b0 intercept of each cluster (equivalent to the average) and treats it as the DV, with L2 predictor as the IV.
- RQ will be: does the L2 predictor predict the average DV?

Intra-class correlations are used to test the need for random intercept.
Assumption: Random intercept exists (i.e., clusters differ on the intercept) – test with ICC.
18
Q

Cross level interactions: How do we incorporate L2 predictors with the random slope?

A

Incorporating Level 2 predictors – Random slope
Y = b0 + b1*X1

Take b1 and treat as DV, use Level 2 predictors to explain differences in random slope.
Does teacher experience predict (classroom differences in the) strength of relationship between motivation and performance?
Assumption: Random slope exists (i.e., slopes differ across clusters) – test with significance test for random slope.
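
The "treat b0 or b1 as the DV" idea can be sketched as a two-stage calculation: fit a separate line in each class, then regress the class intercepts and slopes on the L2 predictor. This is only a conceptual illustration with made-up data; MLM software estimates both levels simultaneously rather than in two separate stages.

```python
from statistics import mean


def ols_line(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope


# Hypothetical classes: teacher experience (L2), group-mean-centred
# motivation (L1 IV) and performance (L1 DV)
classes = [
    {"experience": 1.0, "motivation": [-1.0, 0.0, 1.0], "performance": [49.0, 50.0, 51.0]},
    {"experience": 5.0, "motivation": [-1.0, 0.0, 1.0], "performance": [52.0, 54.0, 56.0]},
    {"experience": 9.0, "motivation": [-1.0, 0.0, 1.0], "performance": [55.0, 58.0, 61.0]},
]

# Stage 1: one regression per class gives each class its own b0 and b1
fits = [ols_line(c["motivation"], c["performance"]) for c in classes]
intercepts = [b0 for b0, _ in fits]
slopes = [b1 for _, b1 in fits]
print(intercepts)  # [50.0, 54.0, 58.0]
print(slopes)      # [1.0, 2.0, 3.0]

# Stage 2: b0 and b1 become the DVs, teacher experience the IV
experience = [c["experience"] for c in classes]
_, experience_on_intercept = ols_line(experience, intercepts)
_, experience_on_slope = ols_line(experience, slopes)
print(experience_on_intercept)  # 1.0
print(experience_on_slope)      # 0.25
```

In this toy data, more experienced teachers have classes with higher average performance (intercept) and a stronger motivation–>performance relationship (slope).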

19
Q

What are the MLM Assumptions?

A
outliers
normality
linearity
homoscedasticity
multicollinearity --> .6 is cause for concern, compared with .8 or .9 for MR
  • Random sampling of participants
  • Random effects are normally distributed
  • Suitability of MLM (ICC is sufficiently large to call for a random intercept)
  • Sufficient # of groups and participants within groups for good power
  • increasing # of groups at L2 is more important than increasing # of individuals within groups
  • recommendations range from 20 to 50 groups needed
  • Bayesian estimation can help with small N at level 2
20
Q

Do you need a random slope?

A

If you only have L2 predictors - don't need one at that level (unless you have a 3-level scenario)

If you have one or more L1 predictors - you need to check

Evaluate with significance test
p < .05 means the IV–>DV relationship strength varies from one group to the next significantly more than in a scenario where the relationships are identical across groups
–> retain fixed slope and random slope

If p > .05 just retain the fixed slope (as the strength of the IV–>DV relationship does not differ significantly across groups)

In SPSS - Use the UNSTRUCTURED approach by default

21
Q

Describe the four-step MLM modelling strategy.

A

Modelling Strategy

In MLM, we want to know whether the random intercept and slope(s) are necessary, and also determine relationships among variables in our model.

Can’t do this in one go; best to build up one step at a time:

Step 1: Model with just DV and clustering variable
• If ICC < .05, default to standard regression and ignore Steps 2‐4
Step 2: Include Level 1 predictors as fixed effects
Step 3: Add random slopes, one at a time
Step 4: Include Level 2 predictors

Calculate the pseudo R² values of interest

22
Q

Do you need a random intercept?

A

ICC tests whether a random intercept is needed

One approach

  • test whether b0 sig different from zero
  • if p
23
Q

How do you work out effect size in MLM?

A

Complex in MLM
- adding predictors can increase the unexplained variance in the model

MR variance is just in the DV
In MLM variance is:
- DV variance within group
- DV variance between group
- Total DV variance altogether
- Variance in random effects
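
One commonly used effect size is a pseudo R² built from the variance components listed above. The sketch below uses the level-1 definition usually attributed to Snijders and Bosker, which compares total unexplained variance (within plus between) in a model with predictors against the null model. The function name and all variance values are hypothetical.

```python
def pseudo_r2_level1(null_within, null_between, full_within, full_between):
    """Proportional reduction in total unexplained DV variance.

    R2 = 1 - (within + between variance, full model)
           / (within + between variance, null model)
    """
    return 1.0 - (full_within + full_between) / (null_within + null_between)


# Hypothetical variance components from a null model and a model with predictors
r2 = pseudo_r2_level1(null_within=8.0, null_between=2.0,
                      full_within=6.0, full_between=1.5)
print(r2)  # 1 - 7.5 / 10 = 0.25
```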
24
Q

How do you assess your Model in MLM?

A

Global assessments
- Have you chosen IVs that are related to the DV?
- How much variance in your model is explained?

Evaluation of individual effects

  • Which IV is most important for prediction of the DV?
  • Which IV is most important for explaining random components of model?