Intro to multi-level models (MLM) Flashcards
Clustering
What is clustered data?
= when our observations have some sort of grouping that is NOT something we are interested in studying
This is important as the observations in one cluster are likely to be more similar to each other than to observations in different clusters
For example:
- children within schools
- observations within individuals
This can quickly become complicated as the number of levels increases
Clustering
What is the intra-class correlation coefficient (ICC)?
It is a ratio of:
variance between groups : total variance
i.e. ICC = σ²_between / (σ²_between + σ²_within)
ICC can take values between 0 and 1. The larger the ICC, the lower the variability within clusters compared to the variability between clusters
For example:
ICC of 0.48 means that 48% of our variance is attributable to by-cluster differences
Clustering
Why and how is clustering a problem?
WHY:
It is something systematic that our model should (arguably) take into account
HOW:
standard errors are often underestimated (too small), which also means:
- CIs will be too narrow
- t-statistics will be too large
- p-values will be too small
Clustering
What is wide data?
observations are written across separate columns (not the preferred format for modelling)
Clustering
What is long data?
there is one column for all observations, and other columns tell us which cluster each observation belongs to
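A minimal sketch of reshaping wide data into long with tidyr (all column names here are hypothetical):
library(tidyr)
wide <- data.frame(school = c("A", "B"),
                   score_child1 = c(10, 12),
                   score_child2 = c(11, 14))
long <- pivot_longer(wide, cols = starts_with("score_"),
                     names_to = "child", values_to = "score")
# long now has one row per observation, with school identifying the cluster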
Clustering
Dealing with Clustering
What is complete pooling?
Ignoring clustering
= information from all clusters is pooled together to estimate one overall effect of x
- basically, it takes everyone’s data, pools it together and then plots it as one line on a graph
This is not the best method as clusters can show different patterns which this method does not account for
- also residuals are NOT independent
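As a sketch (df, y and x are hypothetical names), complete pooling is just an ordinary regression that ignores the grouping variable entirely:
mod_pooled <- lm(y ~ 1 + x, data = df)   # one line for everyone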
Clustering
Dealing with Clustering:
What is no pooling?
Fixed effects models
= information from a cluster contributes to an estimate for that cluster (and ONLY that cluster)
- information is not pooled
This method is good as it gives a separate estimate for each cluster
BUT it is flawed: each cluster is estimated from its own data alone, so anomalous clusters are taken at face value
it also has less statistical power due to having many more parameters
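A sketch of no pooling via fixed effects for the grouping factor g (all names hypothetical):
mod_nopool <- lm(y ~ 1 + g + x:g, data = df)   # a separate intercept and slope per cluster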
Clustering
Dealing with Clustering:
What is partial pooling?
Random Effects models
= cluster-level variance in intercepts and slopes is modelled as randomly distributed around fixed parameters.
- Effects are free to vary by cluster but information from all clusters contributes to an overall fixed parameter
Rather than estimating differences for each cluster, we are estimating the variation (or spread of distributions) of intercept points
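A minimal sketch of partial pooling with lme4 (df, y, x and the grouping factor g are all hypothetical names):
library(lme4)
mod_partial <- lmer(y ~ 1 + x + (1 + x | g), data = df)   # cluster intercepts and slopes vary, but are pulled towards the fixed estimates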
MLM
What is multilevel regression?
used for observation j in group i
used for data structures when we’ve observed things that happen at multiple levels
e.g. children in classes in schools etc
MLM
Multilevel regression equation
It looks similar to simple regression but we now need a 2-level equation
Level 1:
yij = β0i + β1ixij + εij
Level 2:
β0i = γ00 + ζ0i
β1i = γ10 + ζ1i
Where:
- γ00 is the population intercept and ζ0i is the deviation of group i from γ00
- γ10 is the population slope and ζ1i is the deviation of group i from γ10
Basically
γ means there is a fixed number for the whole population
ζ accounts for the deviation of each individual group (random effects)
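Substituting level 2 into level 1 gives the combined form:
yij = γ00 + γ10xij + ζ0i + ζ1ixij + εij
i.e. one fixed line (γ00 + γ10xij) plus each group’s random deviations from it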
MLM
Assumptions of multilevel regression:
we now assume ζ0, ζ1 and ε to be normally distributed with a mean of 0
MLM
What are fixed effects?
Items that do not vary by cluster are fixed effects
e.g. γ00 or γ10
If we repeated the experiment we would use the same levels
Desired inference:
the conclusions refer to the levels used
MLM
What are random effects?
common definition = “we allow [the effect] to vary by [the cluster/grouping]”
ζ is treated as random because the clusters are considered a random sample from a larger population
If we repeated the experiment different levels would be used
desired inference:
the conclusions refer to a population from which the levels used are just a (random) sample
In R, the random effects part of the model formula looks like:
…. (random intercept + random slope | grouping structure)
MLM
Random intercept vs Random slope (in R)
Random intercept:
lmer(y ~ 1 + x + (1|g), data = df)
Random intercept and slope:
lmer(y ~ 1 + x + (1 + x |g), data = df)
MLM
Advantages of MLM
MLM can be used to answer multi-level questions, for example:
- Do phenomena at level X predict outcomes at level Y?
e.g. “does left vs right handedness predict variation in reaction times?”
- Do phenomena at level X influence effects at level Y?
e.g. “does being mono vs bilingual influence grades over the duration of schooling?”
- Do random variances covary?
e.g. “do people who have higher cognitive scores at the start of the study show less decline over the duration of the study than those who started with lower scores?”
MLM
lmer output:
Fixed effects = one fixed model line
Random effects = random (individual) deviations around the fixed line (assumed to be normal)
Residual = captures the final step from individual line to individual observations
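A sketch of pulling these pieces out of a fitted lme4 model m (the model name is hypothetical):
fixef(m)    # fixed effects: the single fixed model line
ranef(m)    # random effects: each cluster's deviations from that line
VarCorr(m)  # variances of the random effects and of the residual
resid(m)    # residuals: observation-level deviations from the cluster lines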
MLM
ICC in lmer
Obtained by fitting an intercept-only model in lmer, as the ICC is conditional on random intercepts (so the inclusion of random slopes would change it)
From the random effects in your model summary:
ICC = intercept variance / (intercept variance + residual variance)
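A sketch of the calculation (df, y and grouping factor g are hypothetical; assumes lme4 is loaded):
m0 <- lmer(y ~ 1 + (1 | g), data = df)   # intercept-only model
vc <- as.data.frame(VarCorr(m0))
icc <- vc$vcov[1] / sum(vc$vcov)         # intercept variance / total variance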
MLM
what is marginal R squared?
= variance explained due to fixed effects
MLM
what is conditional R squared?
= variance explained due to fixed and random effects
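One way to obtain both, as a sketch (MuMIn is one package option among several; m is a hypothetical fitted lmer model):
library(MuMIn)
r.squaredGLMM(m)   # R2m = marginal (fixed only); R2c = conditional (fixed + random)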
MLM
Model estimation
MLMs are too complicated to solve with a closed-form solution, so instead we estimate all the parameters using an iterative maximum likelihood procedure
MLM
Maximum Likelihood estimation (MLE)
Aim = find the values for the unknown parameters that maximise the probability of obtaining the observed data
How = done by finding values that maximise the log-likelihood function
treats fixed effects as KNOWN when estimating the variance components at each iteration
- this can lead to biased estimates of variance components
MLM
Restricted maximum likelihood estimation (REML)
separates the estimation of the variance components from the estimation of the fixed effects, estimating the variance components first
- this leads to less biased estimates of the variance components
better for small sample sizes
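In lme4, REML is the default; a sketch of fitting each way (all variable names hypothetical):
m_reml <- lmer(y ~ x + (1 | g), data = df)                 # REML (the default)
m_ml   <- lmer(y ~ x + (1 | g), data = df, REML = FALSE)   # maximum likelihood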
MLM
What are convergence warnings?
They appear when the optimiser we are using either can’t find a suitable maximum, or gets stuck in a singularity (think of it like a black hole of likelihood, signalling that there is not enough variation in our data to support such a complex model)
MLM Inference
p-values in lmer?
In simple lm we test the reduction in SSresidual which follows an F-distribution with known df.
only in very specific conditions in lmer will we have known df.
Parameter estimates in MLM are MLE/REML estimates, which means it is:
- unclear how to calculate denominator degrees of freedom (DDF)
- also unclear whether the test statistics would even follow an F distribution
We need other options for inference
MLM Inference
Options for inference:
Approximating DDF
- Kenward-Roger
Kenward-Roger:
- models must be fitted with REML
- adjusts SEs to avoid small sample bias
- approximated denominator df (may not be whole number)
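A sketch using the lmerTest package (it relies on pbkrtest behind the scenes; df, y, x and g are hypothetical):
library(lmerTest)
m <- lmer(y ~ x + (1 + x | g), data = df)   # REML is the default
summary(m, ddf = "Kenward-Roger")           # t-tests with KR-approximated df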
MLM Inference
Options for inference:
Likelihood based methods
- profile likelihood confidence interval
Models need to be fitted with MLE
Evaluates the curvature of the likelihood surface at the estimate
- sharp curve = more certainty in estimate
- gradual curve = less certainty
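In lme4 this is available via confint(), as a sketch (m is a hypothetical fitted model):
confint(m, method = "profile")   # profile likelihood CIs for all parameters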
MLM Inference
Options for inference:
Likelihood based methods
- likelihood ratio tests
Models need to be fitted with MLE
- not good for small sample sizes
Uses anova()
Compares the log-likelihoods of two competing (nested) models
- minus twice the log of the likelihood ratio is asymptotically (as n increases towards infinity) chi-squared distributed
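A sketch comparing two nested models fitted with ML (all names hypothetical):
m0 <- lmer(y ~ 1 + (1 | g), data = df, REML = FALSE)
m1 <- lmer(y ~ 1 + x + (1 | g), data = df, REML = FALSE)
anova(m0, m1)   # chi-squared test on the change in log-likelihood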
MLM Inference
Options for inference:
Bootstrap
Parametric bootstrap:
- confidence interval
- likelihood ratio test
Case based bootstrap
- confidence interval
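A sketch of the parametric options (m0 and m1 are hypothetical nested lmer models; pbkrtest supplies the bootstrap LRT):
confint(m1, method = "boot", nsim = 500)   # parametric bootstrap CIs
library(pbkrtest)
PBmodcomp(m1, m0)                          # parametric bootstrap likelihood ratio test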
MLM inference
MLE vs REML
Fit with ML if:
- models differ in fixed effects only
- models differ in BOTH fixed and random effects
- you want to use anova()
Fit with REML if:
- models differ in random effects only
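In lme4 you can switch an REML fit to ML without re-specifying the model, as a sketch (m_reml is hypothetical):
m_ml <- refitML(m_reml)   # refit with ML before using anova()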