Measurement Error & Mixed Models Flashcards
Sources of measurement error (ME)
- Measurement imprecision in the field or in the lab (length, weight, blood pressure, etc.).
- Incomplete or inaccurate observations (e.g., self-reported dietary aspects, health history).
- Rounding error, digit preference.
- Classification error (e.g., exposure or disease classification).
-> Correlation, regression, ANOVA, and GLMs all assume that x is measured without error
If this assumption is violated:
- biased estimates
- loss of power
- incorrect variable importance
- masks important features of the data -> making graphical model inspection difficult
Measurement error
Xi = correct unobserved variable
Wi = observed variable with error
-> slope is flatter (attenuated toward zero)
How to correct ME
Need error model & error model parameters
-> take repeated measurements to estimate error variance
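Attenuation factor lambda = sdx^2 / (sdx^2 + sdu^2), where sdx^2 is the variance of the true x and sdu^2 is the error variance
-> the naive slope is roughly lambda times the true slope, so dividing it by lambda corrects it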
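A minimal sketch of the correction, assuming classical additive error (W = X + U, with U independent of X) and a simple linear regression. All numeric values and variable names are illustrative assumptions, not from the notes. Two repeated measurements of each x are simulated, their difference is used to estimate the error variance, and the naive slope is divided by the estimated lambda.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def variance(v):
    """Sample variance (n - 1 denominator)."""
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)

rng = random.Random(1)
n, true_slope, sd_x, sd_u = 5000, 2.0, 1.0, 0.5  # assumed, for illustration

x = [rng.gauss(0, sd_x) for _ in range(n)]            # true, unobserved
y = [true_slope * xi + rng.gauss(0, 1) for xi in x]   # response
w1 = [xi + rng.gauss(0, sd_u) for xi in x]            # first measurement
w2 = [xi + rng.gauss(0, sd_u) for xi in x]            # repeated measurement

# With independent errors, var(w1 - w2) = 2 * sdu^2
var_u_hat = variance([a - b for a, b in zip(w1, w2)]) / 2
# var(w1) = sdx^2 + sdu^2, so the rest is the variance of the true x
var_x_hat = variance(w1) - var_u_hat
lam_hat = var_x_hat / (var_x_hat + var_u_hat)

naive_slope = ols_slope(w1, y)            # attenuated
corrected_slope = naive_slope / lam_hat   # attenuation-corrected

print(f"lambda: {lam_hat:.3f}")
print(f"naive slope: {naive_slope:.3f}, corrected slope: {corrected_slope:.3f}")
```

With these settings the theoretical lambda is 1 / (1 + 0.25) = 0.8, so the naive slope should land near 0.8 times the true slope. This simple division only applies to a single error-prone covariate with classical error.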
SIMEX
Simulation phase:
- Error in the data is progressively increased in order to determine how the model parameter of interest is affected.
Extrapolation phase:
- The simulated trend is then extrapolated back to a hypothetical error-free value of the model parameter.
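The two phases can be sketched as follows. This is a simplified illustration, not a replacement for a dedicated SIMEX implementation. It assumes classical additive error with a known error standard deviation (`sd_u`), a simple linear regression, and a quadratic extrapolation function. The grid of error levels (`zetas`), the number of replicates (`B`), and all data values are illustrative assumptions.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def fit_quadratic(zs, vals):
    """Least-squares coefficients [a, b, c] of a + b*z + c*z^2."""
    # Build the 3x3 normal equations and solve by Gaussian elimination.
    powers = [sum(z ** k for z in zs) for k in range(5)]
    A = [[powers[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(v * z ** i for z, v in zip(zs, vals)) for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for j in range(col, 3):
                A[r][j] -= f * A[col][j]
            rhs[r] -= f * rhs[col]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = sum(A[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (rhs[i] - s) / A[i][i]
    return coef

rng = random.Random(2)
n, true_slope, sd_u = 2000, 2.0, 0.5   # assumed, for illustration
x = [rng.gauss(0, 1) for _ in range(n)]               # true, unobserved
y = [true_slope * xi + rng.gauss(0, 1) for xi in x]
w = [xi + rng.gauss(0, sd_u) for xi in x]             # observed with error

# Simulation phase: add extra error so the total error variance
# becomes (1 + zeta) * sd_u^2, and record the average slope per level.
zetas = [0.0, 0.5, 1.0, 1.5, 2.0]
B = 50
naive_slope = ols_slope(w, y)
slopes = [naive_slope]
for zeta in zetas[1:]:
    total = 0.0
    for _ in range(B):
        w_worse = [wi + rng.gauss(0, zeta ** 0.5 * sd_u) for wi in w]
        total += ols_slope(w_worse, y)
    slopes.append(total / B)

# Extrapolation phase: fit the trend and evaluate it at zeta = -1,
# which corresponds to zero measurement error.
a, b, c = fit_quadratic(zetas, slopes)
simex_slope = a - b + c

print(f"naive slope: {naive_slope:.3f}, SIMEX slope: {simex_slope:.3f}")
```

The quadratic extrapolant usually removes most, but not all, of the bias, so the SIMEX estimate tends to sit between the naive and the true slope.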
Practical advice
- Think about measurement error before you start collecting your data.
- Ideally, take repeated measurements
- Figure out if error is a problem and what the bias in your parameters might be. You might need simulations to find out.
- If needed, model the error. Seek help from a statistician!
Mixed models
Not every observation is an independent data point!
Taking averages would discard a lot of information
Problem: the degrees of freedom (df) are not clearly defined -> that's why significance tests of fixed effects are difficult when using mixed models
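A small simulation can illustrate why non-independent observations matter. It is a sketch under assumed values: 20 groups (e.g., individuals) with 10 measurements each, and more variation between groups than within them. It compares the standard error you would compute when treating all 200 values as independent with the actual spread of the grand mean across many simulated datasets.

```python
import random
import statistics

rng = random.Random(3)
n_groups, n_per_group = 20, 10          # assumed design
sd_between, sd_within = 1.0, 0.5        # assumed variance components

def simulate():
    """One dataset: each group has its own random intercept."""
    data = []
    for _ in range(n_groups):
        intercept = rng.gauss(0, sd_between)
        data.append([intercept + rng.gauss(0, sd_within)
                     for _ in range(n_per_group)])
    return data

grand_means, naive_ses = [], []
for _ in range(500):
    flat = [v for group in simulate() for v in group]
    grand_means.append(statistics.fmean(flat))
    # Naive standard error: pretends all values are independent
    naive_ses.append(statistics.stdev(flat) / len(flat) ** 0.5)

actual_sd = statistics.stdev(grand_means)     # real uncertainty of the mean
mean_naive_se = statistics.fmean(naive_ses)   # what the naive analysis reports

print(f"actual SD of the grand mean: {actual_sd:.3f}")
print(f"average naive SE:            {mean_naive_se:.3f}")
```

The naive standard error comes out much smaller than the real uncertainty, which is the false confidence a random effect for the grouping is meant to prevent. In lme4 such a structure would typically be written with a random intercept, e.g. `lmer(y ~ 1 + (1 | group))`, where `y` and `group` are hypothetical column names.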
Why not take the average?
- To adjust estimates for imbalanced sampling.
- Which average? (Mean/median/mode?)
- Avoid false confidence. (Due to removing variance)
- To study variation.
- To adjust estimates for repeat sampling (sharing information)
- To keep information.
Fixed effects
Sex
Height
Weight
Size
…
-> Groups / levels are predetermined, of direct interest, repeatable
Random effects
Individual
Nest
Family
-> Groups that are randomly sampled from a larger population of groups
Mixed Models in R
lmer()
Package: lme4
Many or just one measure
- measurements of 1 RT are spread more along the x-axis
- slope with 1 RT is less steep
- a measurement of only 1 RT is more variable than the mean of 5 RTs
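The flatter slope with a single measurement follows from the attenuation factor. As a sketch, assume the x-axis variable is measured with independent error on each occasion: the mean of k measurements then has error variance sdu^2 / k, so lambda moves closer to 1 as k grows. The function name and the example values below are illustrative assumptions.

```python
def attenuation_factor(sd_x, sd_u, k=1):
    """lambda when the covariate is the mean of k replicate measurements.

    Assumes independent errors, so the error variance of the mean
    is sd_u^2 / k.
    """
    return sd_x ** 2 / (sd_x ** 2 + sd_u ** 2 / k)

lam_single = attenuation_factor(sd_x=1.0, sd_u=1.0, k=1)
lam_mean5 = attenuation_factor(sd_x=1.0, sd_u=1.0, k=5)

print(f"lambda with 1 measurement:        {lam_single:.3f}")
print(f"lambda with mean of 5 measurements: {lam_mean5:.3f}")
```

A larger lambda means less attenuation, so the slope estimated from the mean of 5 measurements is steeper (closer to the true slope) than the one from a single measurement.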
Why prefer a multilevel model relative to analysing averages?
- to avoid false confidence
- because we would retain variation that may be of interest
- averaging would require an arbitrary decision about which average to use
Why is ME in covariates of regression models problematic?
- parameter estimates may be biased
- it is a fundamental assumption of regression models that covariates do not contain any error
- it is much harder to find patterns by visual inspection
- the p-values of the regression coefficients may be wrong