Multilevel Modelling Flashcards by Naomi L

What is multilevel data?

Data that has clustering at different hierarchical levels e.g. different pupils in different schools which are based in different areas.

How well did you know this?

Not at all

Perfectly

What are 3 types of multilevel data?

Geographic: – e.g data of country will have data from different regions

Relational- relationships between different factors e.g., Schools, classrooms, pupils
Countries, regions, towns, families
Company, office, team, individuals

TemporalLongitudinal: Different time points

How well did you know this?

Not at all

Perfectly

Where do we find multilevel data?

Naturally occurring e.g schools
Due to sample design e.g. clustered, multi-stage)
Repeated measures

How well did you know this?

Not at all

Perfectly

What is one way of differentiating between data that is multilevel and data that is not?

If can identify different hierarchies then that is a good example of multilevel data

If data is a single level with subgroups e.g. office with different ethnicities, this would not be considered multi level data

How well did you know this?

Not at all

Perfectly

What are two reasons for computing multilevel analysis?

1.Clustering as a problem - Sometimes we have data that is clustered and must take this into account

Clustering as a substantive interest- Answer multilevel hypotheses

How well did you know this?

Not at all

Perfectly

Why can clustering be a problem?

Clustered data violate the assumptions of simple random sampling.
Traditional multiple regression techniques treat the units of analysis (e.g. individuals) as independent observations.
If we fit hierarchical data with simple OLS models, (i.e. ignoring any clustering) the standard errors of our regression coefficient will be underestimated, leading to an overstatement of statistical significance.

How well did you know this?

Not at all

Perfectly

Why can clustering be a problem?

Clustered data violate the assumptions of simple random sampling.
Traditional multiple regression techniques treat the units of analysis (e.g. individuals) as independent observations.
If we fit hierarchical data with simple Ordinary least-squares (OLS) models, (i.e. ignoring any clustering) the standard errors of our regression coefficient will be underestimated, leading to an overstatement of statistical significance.

How well did you know this?

Not at all

Perfectly

Why can clustering be a substantive interest?

We are often interested in estimating the amount of variation between groups, and the extent to which it can be explained by group-level explanatory variables.

E.g., Is there between-school variability in students’ academic progress?
Do health outcomes vary across areas?
Are between-area variations in health explained by differences in access to health services?
Is the amount of variation between areas different for rural and urban areas?

How well did you know this?

Not at all

Perfectly

What are alternative data analysis strategies if clustering is not of substantive interest?

Fixed effects regression

Design-based modelling

Adjusted standard errors - Used most often when need to adjust for potential clustering

How well did you know this?

Not at all

Perfectly

According to Richard McElreath what should multilevel regression be the default approach?

Improved estimates for repeat sampling
Improved estimates for imbalance in sampling
Estimates of variation
Avoid averaging, retain variation

How well did you know this?

Not at all

Perfectly

What software can be used for multilevel analysis?

Stata

Julia

Python

Stan

brms

How well did you know this?

Not at all

Perfectly

What terms are classified under multilevel modelling?

Multilevel model
Random effects model
Mixed model
Random coefficient model
Random parameter models
Hierarchical model
Nested models
Split-plot designs
Subject specific models
Variance component models

How well did you know this?

Not at all

Perfectly

What is variance a measure of and how is it defined?

Variance is a measure of how spread out your data is.

Defined as the average of the squared differences from the mean.

How well did you know this?

Not at all

Perfectly

What can we think of a residual as?

As a measure of error, or how far off your predicted value was from the actual value.

How well did you know this?

Not at all

Perfectly

In a single level Ordinary least-squares regression what is the Bo represented by?

Bo - coefficient

How well did you know this?

Not at all

Perfectly

What does a least squares regression select?

What is the value called?

Study These Flashcards

The line with the lowest total sum of squared prediction errors.

Sum of Squares of Error, or SSE.

What does the notation yi = B0 +B1xi + Ei mean?

Study These Flashcards

Presents single level model.
I represents number of individuals in sample

What does the notation yij +B0 + B1xij+ Eij mean?

Study These Flashcards

Two level notation, outcome y for individual(s) in cluster j(ranges from 1 to however many clusters we have. If there was a three level model we would include convention k to represent this.

With clustered data, the residuals within each cluster are likely to be correlated. What does this mean?

Study These Flashcards

Individuals within a cluster are likely to be more similar to one another than they are to individuals from another clusters

Explain what the formula yij = B0 + uj + Eij mean?

Study These Flashcards

Uj represents distance from overall intercept B0 to group intercept

Eij represents distance from group level mean to observed data point at individual level

What is meant by partitioning variance?

Study These Flashcards

Having two sources of variance - between groups variance and within group variance

What does we assume about epsilons?

Study These Flashcards

They are normally distributed with a mean of 0 and variance of sigma squared e

What does the variance partition coefficient measure?

Study These Flashcards

Proportion of total variance
is due to differences between groups

What is the formulae for variance partition coefficient?

Study These Flashcards

Between group variance divided by total variance(between group variance plus within group variance)

What is the variance partition coefficient sometimes but not always equivalent to?

Intraclass correlation coefficient

What are two caveats of fitting a two level model and what is a possible solution?

- The random intercepts model introduced additional complexity. - Did it result in a significant improvement in model fit? - We can test this with a likelihood-ratio test

What does a likelihood ratio test examine?

Group effects

What is the formulae for a likelihood ratio test?

D = -2 Log L2 (Fit of multilevel model) - -2 Log L1(Fit of single level model) The deviance statistic (D) is compared against a X2 distribution with D.F. degrees of freedom.

What is the degrees of freedom?

The difference in the number of parameters in the multilevel vs. individual level model).

Multilevel Modelling Flashcards

(29 cards)