W7: Intro LMM Flashcards by Val Y

What is the assumption that can be relaxed when using linear mixed models instead of linear regression models?

The assumption of independent observations and residual errors

What are 2 examples of non-independent observations that LMMs can handle?

Repeated measures (e.g. longitudinal studies)
Individuals as clusters / groups (e.g. people within families / schools = cluster within higher order unit)

In an intercept only model, what does the intercept represent?

Unconditional (not conditioned on predictors) expectation of y
Same as the mean of y
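
A quick check of this in R, as a minimal sketch using the built-in mtcars data (hp as the outcome is only an illustrative choice):

```r
# Intercept-only model: the only predictor is the constant 1
fit <- lm(hp ~ 1, data = mtcars)

b0   <- unname(coef(fit)["(Intercept)"])  # estimated intercept
ybar <- mean(mtcars$hp)                   # plain sample mean of the outcome

# The two values agree (up to floating-point error)
print(c(intercept = b0, mean = ybar))
```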

What are the predictors in this model formula:
lm(hp ~ 1 + mpg, data = mtcars)

1 (the intercept) and mpg (explanatory variable)
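
One way to see that the 1 really is a predictor is to look at the design matrix R builds from the formula. A small sketch with base R's model.matrix():

```r
# Design matrix for hp ~ 1 + mpg: one column per predictor
X <- model.matrix(hp ~ 1 + mpg, data = mtcars)

print(colnames(X))  # the constant column is labelled "(Intercept)"
print(head(X, 3))   # the intercept column is 1 in every row
```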

What does inclusion of fixed intercept assume for the mean of residuals?

Mean of residuals will be 0

What are 2 reasons we fit a constant (fixed intercept) in models?

Errors will be unbiased
Regression line is free to find its own intercept, in a way that minimizes the mean squared error (i.e. the squared vertical distance between the regression line and the data points)

What happens if we make the constant 0?
lm(hp ~ 0 + mpg, data = mtcars)

There would be no intercept: the regression line is forced through the origin, which leads to biased estimates
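
The effect on the residuals can be checked directly. A minimal sketch comparing the two mtcars models from the cards:

```r
# Same predictor, with and without the constant
with_int <- lm(hp ~ 1 + mpg, data = mtcars)
no_int   <- lm(hp ~ 0 + mpg, data = mtcars)

mean_res_with    <- mean(resid(with_int))  # essentially 0
mean_res_without <- mean(resid(no_int))    # clearly not 0

print(c(with_intercept = mean_res_with, without_intercept = mean_res_without))
```

With the intercept, the residuals average to zero; without it, the line is forced through the origin and the residuals are systematically off.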

What format do we want our LMM data to be?

Long format
1 ID can have multiple rows
IDs can have different number of rows
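
As an illustration of long format, here is a sketch that reshapes a small made-up wide data set with base R's reshape(). The column names and values are hypothetical:

```r
# Hypothetical wide data: one row per ID, one column per time point
wide <- data.frame(
  ID        = 1:3,
  stress_T1 = c(2, 5, 3),
  stress_T2 = c(3, 4, NA),
  stress_T3 = c(4, NA, NA)
)

# Stack the three stress columns into one, adding a time index
long <- reshape(
  wide,
  direction = "long",
  varying   = c("stress_T1", "stress_T2", "stress_T3"),
  v.names   = "stress",
  timevar   = "time",
  times     = 1:3,
  idvar     = "ID"
)

# Drop unobserved time points, so IDs end up with different numbers of rows
long <- long[!is.na(long$stress), ]
long <- long[order(long$ID, long$time), ]
print(long)
```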

What are 3 conditions of RM data to be met in order to use RM ANOVA to analyze them?

Discrete time points (E.g T1, T2, T3)
Everyone has the same number of time points
Outcome is continuous, normally distributed data

What are 4 things RM ANOVA can’t handle?

Continuous time (if 1 person completes day 0, 13, 22 and another 0, 1, 20)
Continuous predictors (e.g age in years)
Missing data on any time points (completely excluded unless imputation)
Non-linear outcomes

What are 2 variations linear regression can’t capture for non-independent data?

Different intercepts (mean) by ID (between person variation)
Different slopes (relationship between predictor + outcome) by ID (within person variation)

Linear regression has 1 fixed intercept and 1 fixed slope which violates what assumption if it’s used to analyze RM data? This also means it can’t capture what kind of effect?

Violates assumption of independence
Can’t capture random effects (different regression coefficient across people)
Coefficient includes intercept and slope

What are 2 other names for LMMs?

Multilevel models (MLMs)
Hierarchical linear models (HLMs)

Why are LMMs called mixed?

Includes both
Fixed effects (reg coeffs identical for everyone) +
Random effects (reg coeffs vary randomly for each ppt)

When do you use H (hierarchical) LMs?

When you have multiple hierarchical levels (different levels of nesting, e.g kids nested within classroom / obsv nested within ppl)

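All HLMs are LMMs.
Are all LMMs HLMs?

No. LMMs can also have random effects that are not nested (e.g. crossed), which are not hierarchical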

In an intercept only linear regression model, what is the intercept (mean) assumed to be for all IDs?

Identical (fixed)

What is the equation for linear regression, intercept only model?

yi = b0 * 1 + ei

What is the value of M and SD for fixed effects?

M = estimated mean
SD = 0 (no variation in everybody’s intercept, identical)

What is the value of M and SD for random effects?

M = estimated mean
SD = estimated SD (SD is free to vary, can be > 0, individual variations in intercept/mean)

With mixed models, the total variance is composed of 2 variabilities:
Between (random intercept) +
Within (residual) person variations.
The ratio of between variance to total variance is captured by what?

Intraclass correlation coefficient (ICC)
Varies from 0 to 1

What does ICC = 0 indicate?

All variability occurs within individuals
individual means are identical (no between variation)

What does ICC = 1 indicate?

All variability occurs between individuals
Individual means differ
Within individuals, all values are the same

What does ICC of 0.40 tell us?

40% of total variance occur between people
60% of total variance occur within people
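
Written out, the ratio behind these ICC cards looks like this. The variance values in the worked line are made up purely to reproduce the 0.40 case:

```latex
\[
\mathrm{ICC} = \frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{between}} + \sigma^2_{\text{within}}},
\qquad 0 \le \mathrm{ICC} \le 1
\]
% Worked example with made-up variances: between = 2, within = 3
\[
\mathrm{ICC} = \frac{2}{2 + 3} = 0.40
\]
```

ICC = 0 corresponds to a between variance of 0, and ICC = 1 to a within variance of 0.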

What function do you use in R to calculate ICC?

iccMixed("dStress", id = "ID", data = d)

The residual output from iccMixed indicates what?

Within person variance
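
As a rough cross-check of what the ICC output means, the same ratio can be computed by hand from the variance components of a random intercept model. This sketch assumes the lme4 package and uses its sleepstudy example data in place of the course data set d, which is not available here:

```r
library(lme4)

# Random intercept only model on lme4's example data (stand-in for d)
m <- lmer(Reaction ~ 1 + (1 | Subject), data = sleepstudy)

# Variance components: one row for the random intercept, one for the residual
vc      <- as.data.frame(VarCorr(m))
between <- vc$vcov[vc$grp == "Subject"]   # between person variance
within  <- vc$vcov[vc$grp == "Residual"]  # within person (residual) variance

icc <- between / (between + within)
print(icc)
```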

Is lower ICC more or less stable from day to day than higher ICC?

Less stable (more within variation)

How do you interpret the mean/intercept for between effects /variation?

Mean/intercept is constant/the same for a single person across days
Each person has a different mean/intercept

How do you interpret the mean/intercept for within effects /variation?

Mean/intercept has daily fluctuations (changes across days)
I.e. individual deviations from the individual's own mean (aka residual variance)

What is the equation for linear mixed, random intercept only model?

yij = b0j * 1 + eij

Explain each component of the linear mixed, intercept only model equation: yij = b0j * 1 + eij

yij = outcome observed for a specific unit (j) at a specific time point (i), assumed to follow a normal distribution
b0j = estimated intercept (fixed + random) for each unit (j)
eij = residual error for unit (j) at time point (i)

What is the equation for b0j (random intercept)? b0j = ___ + ___

b0j = y00 (mean (fixed) intercept) + u0j (individual unit deviations (random) from y00)

How many parameters does intercept only linear mixed model have: yij = b0j * 1 + eij?

3 (fixed intercept, SD of individual intercepts, residual errors)

Besides the assumption that observations don't have to be independent, what is another new assumption of LMMs?

Random intercept assumed to follow normal distribution (bc added new parameter of SD of individual intercepts)

Linear regression models use lm(); what function do LMMs use instead?

lmer()

What is the equation for stress predicted by fixed and random intercept using lmer()?

lmer(stress ~ 1 + (1 | ID), data = d)

What does this function show: fixef(x)

Fixed effects coefficients only
1 intercept value

What does this function show: coef(x)

Each ID's own coefficients (fixed effect + that ID's random deviation)
Many intercept values, one per ID (ranef(x) shows the random deviations alone)
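
To see how these extractor functions relate, here is a small sketch. It assumes the lme4 package and its sleepstudy example data (Reaction and Subject stand in for stress and ID):

```r
library(lme4)

# Fixed + random intercept model, same structure as stress ~ 1 + (1 | ID)
m <- lmer(Reaction ~ 1 + (1 | Subject), data = sleepstudy)

fixed     <- fixef(m)[["(Intercept)"]]          # 1 value: the average intercept
per_id    <- coef(m)$Subject[["(Intercept)"]]   # 1 value per subject
deviation <- ranef(m)$Subject[["(Intercept)"]]  # each subject's deviation from fixed

print(fixed)
print(head(data.frame(per_id, deviation)))
```

In lme4, the per-ID values from coef() are the fixed intercept plus each ID's deviation from ranef().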

What is shrinkage and what does it do to individual estimates?

Shrinkage = difference between the model-estimated intercepts (BLUPs) and the actual (raw) mean of each ID
Random intercepts tend to shrink individual estimates towards the overall fixed effect estimate

What are best linear unbiased predictors (BLUPs) an estimation of?

random effects including shrinkage

Under which 2 conditions is the degree of shrinkage (difference between BLUPs and raw means) largest?

1. More extreme intercepts (people whose intercept = further away from fixed intercept)
2. People with fewer data points within ID
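
Both conditions can be read off the standard shrinkage formula for a random intercept only model. The symbols are assumptions for this sketch: γ00 is the fixed intercept (written y00 in these cards), n_j is the number of observations for ID j, and σ²_u and σ²_e are the between and within variances.

```latex
\[
\hat{b}_{0j} = \hat{\gamma}_{00} + \lambda_j \left( \bar{y}_j - \hat{\gamma}_{00} \right),
\qquad
\lambda_j = \frac{\sigma^2_u}{\sigma^2_u + \sigma^2_e / n_j}
\]
% Amount of shrinkage = raw mean minus BLUP
\[
\bar{y}_j - \hat{b}_{0j} = (1 - \lambda_j) \left( \bar{y}_j - \hat{\gamma}_{00} \right)
\]
```

The shrinkage grows when the raw mean is far from the fixed intercept (condition 1) and when n_j is small, because λ_j then moves towards 0 (condition 2).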

What function do you use to check diagnostics of LMMs?

plot(modelDiagnostics(x, ev.perc = .001))

What are the 3 plots from model diagnostics for model with random intercept?

1. Density plot of residuals (assumption of normally distributed residuals + identify outliers)
2. Plot of residuals against predicted values (assumption of homoscedasticity/equal variance)
3. Density plot of random effects, titled "ID: (intercept)" (assumption that random effects, i.e. the intercept coefficients, are normally distributed)

What function do you use to calculate each person's mean (between) and deviations from their mean (within variables)?

dd[!is.na(ID), c("Bstress", "Wstress") := meanDeviations(dStress), by = "ID"]
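
To make the between/within split concrete, here is a base R sketch of the same idea on made-up data. It does not call meanDeviations() itself. It assumes, as the card describes, that the between variable is each person's mean and the within variable is the deviation from that mean:

```r
# Made-up long-format data: 3 days for ID 1, 2 days for ID 2
d <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  dStress = c(2, 4, 6, 5, 7)
)

# Between: each person's own mean, repeated on every row of that person
d$Bstress <- ave(d$dStress, d$ID, FUN = mean)

# Within: each day's deviation from that person's mean
d$Wstress <- d$dStress - d$Bstress

print(d)
```

Within each ID the Wstress values sum to zero, and Bstress is constant across that ID's rows.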

When cleaning for extreme values for stress, should we start with between or within variables first?

Within
Extreme stress values at the within level will affect between level stress values

What is step 1 of cleaning extreme values, starting with examining within level stress data?

Plot/examine distribution of Wstress data
plot(testDistribution(dd[!is.na(ID)]$Wstress, extremevalues = "theoretical", ev.perc = .005))

What is step 2 of cleaning EVs?

Subset extreme values (pick only rows with EVs)
testDistribution(dd2$Wstress, extremevalues = "theoretical", ev.perc = .005)$Data[isEV == "Yes"]

After removing EVs for within level data, what do we do?

Recreate between and within person data using meanDeviations because removing EVs on within level changes between level (average) values

When examining between level data after cleaning within level data, which rows should we use?

Doesn't matter (between level value = same for all rows of an ID)
Just keep one row per ID by removing rows where ID is duplicated: dd.noev[!duplicated(ID)]

If there are EVs for within level data, do we exclude entire IDs or specific days?

Specific days

If there are EVs for between level data, do we exclude entire IDs or specific days?

Entire IDs (and all rows associated with that ID)
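
The order of the cleaning steps can be sketched in base R on made-up data. The is_extreme() helper below is a hypothetical stand-in (a simple 3 SD rule), not the testDistribution() rule used in the cards, and decompose() is likewise an illustrative helper:

```r
# Made-up long data: ID 3 has one wild day (30)
d <- data.frame(
  ID     = rep(1:4, each = 5),
  stress = c(3, 4, 3, 5, 4,  2, 2, 3, 2, 3,  4, 5, 4, 30, 5,  6, 7, 6, 7, 6)
)

# Hypothetical helper: person mean (between) and deviations from it (within)
decompose <- function(d) {
  d$Bstress <- ave(d$stress, d$ID, FUN = mean)
  d$Wstress <- d$stress - d$Bstress
  d
}

# Hypothetical stand-in rule: more than k SDs away from the mean
is_extreme <- function(x, k = 3) abs(x - mean(x)) > k * sd(x)

d <- decompose(d)

# Within level first: drop specific days, then recompute both variables
d.noev <- decompose(d[!is_extreme(d$Wstress), ])

# Between level next: one row per ID, and drop entire IDs if flagged
between <- d.noev[!duplicated(d.noev$ID), ]
bad_ids <- between$ID[is_extreme(between$Bstress)]
d.clean <- d.noev[!d.noev$ID %in% bad_ids, ]

print(between)
```

Dropping the one extreme day changes the mean for ID 3, which is why the between and within variables are recomputed before looking at the between level.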

b0j = y00 + u0j. u0j follows what distribution with mean and SD of what?

u0j assumed to follow normal distribution (mean = 0, SD = SD of deviations)
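
Putting the pieces of the random intercept only model together in one place. Notation is an assumption of this sketch: γ00 is the fixed intercept that the cards write as y00, and σ_u and σ_e are the SDs of the random intercepts and of the residuals.

```latex
\begin{align*}
y_{ij} &= b_{0j} \cdot 1 + e_{ij} \\
b_{0j} &= \gamma_{00} + u_{0j} \\
u_{0j} &\sim \mathcal{N}(0, \sigma_u^2), \qquad e_{ij} \sim \mathcal{N}(0, \sigma_e^2) \\
\text{so}\quad y_{ij} &= \gamma_{00} + u_{0j} + e_{ij}
\end{align*}
```

The three estimated parameters are γ00, σ_u and σ_e, matching the count of 3 given earlier in the deck.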

Will people with more data have a BLUP that is closer to / further from the observed mean of their own data / average mean of all people in an intercept only model?

Closer to observed mean of own data

Will people with less data (say only 1-2 observations) have a BLUP that is closer to / further from the observed mean of their own data / average mean of all people?

Closer to average mean of all people (assumed that the mean of their 1-2 data points is likely very noisy/inaccurate due to small sample size)
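
A sketch of this behaviour, assuming the lme4 package and its sleepstudy example data. One subject is cut down to 2 observations purely for illustration. The weight column uses the usual random intercept shrinkage weight, between variance / (between variance + within variance / n), where a weight near 1 keeps the BLUP at the person's own mean and a smaller weight pulls it towards the overall mean:

```r
library(lme4)

# Illustrative assumption: subject "308" keeps only 2 rows, all others keep 10
d <- subset(sleepstudy, !(Subject == "308" & Days > 1))

m <- lmer(Reaction ~ 1 + (1 | Subject), data = d)

subjects <- levels(d$Subject)
n    <- as.vector(tapply(d$Reaction, d$Subject, length))  # rows per subject
raw  <- as.vector(tapply(d$Reaction, d$Subject, mean))    # observed own mean
blup <- coef(m)$Subject[subjects, "(Intercept)"]          # model-based intercept
g00  <- fixef(m)[["(Intercept)"]]                         # overall fixed intercept

# Shrinkage weight from the estimated variance components
vc     <- as.data.frame(VarCorr(m))
s2u    <- vc$vcov[vc$grp == "Subject"]
s2e    <- vc$vcov[vc$grp == "Residual"]
weight <- s2u / (s2u + s2e / n)

out <- data.frame(ID = subjects, n, raw, blup, weight)
print(head(out))
```

The subject with only 2 observations gets the smallest weight, so their BLUP sits closer to the overall mean than it would with more data.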

W7: Intro LMM Flashcards

(54 cards)