Week 12 - Missing Data Analysis Flashcards

1
Q

Listwise deletion

A

Dropping participants from the analysis who don’t have complete scores on all the variables in the model

Need nonmissing scores on all variables

It reduces the sample size and makes it harder to find a significant effect
Lower power, wasted data and exacerbated bias (especially when missingness is nonignorable)
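
A minimal sketch of listwise deletion, to show how quickly the sample shrinks. The participants, variable names and scores below are made up for illustration; None marks a missing score.

```python
# Hypothetical data: one dict per participant; None marks a missing score.
rows = [
    {"id": 1, "age": 23, "stress": 4.0, "sleep": 7.5},
    {"id": 2, "age": 31, "stress": None, "sleep": 6.0},
    {"id": 3, "age": 45, "stress": 3.5, "sleep": None},
    {"id": 4, "age": 52, "stress": 2.0, "sleep": 8.0},
    {"id": 5, "age": None, "stress": 5.0, "sleep": 5.5},
]
model_vars = ["age", "stress", "sleep"]


def listwise_delete(rows, variables):
    """Keep only participants with nonmissing scores on every model variable."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


complete = listwise_delete(rows, model_vars)
print(len(rows), "participants before,", len(complete), "after listwise deletion")
```

Each dropped participant is missing only one score, yet all of their other scores are discarded too.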

2
Q

Balanced vs unbalanced data

A

Balanced design - same number of observations in each cell of the analysis

- makes computation easier

Unbalanced design - unequal numbers of observations across cells

3
Q

Why are scores missing initially?

A

Participant factor - mortality, attrition (in longitudinal designs)
Experimenter factor - clerical error, malfunction
Balance between not coercing people into giving answers and making it too easy not to respond

4
Q

What are the old approaches to missing data?

A
Listwise deletion
Pairwise deletion
Mean substitution (mean imputation)
Regression imputation
Last value carried forward
5
Q

Pairwise deletion

A

Only available for correlation and factor analysis

Use all cases available for each pair of variables
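
A rough sketch of pairwise deletion for a correlation matrix. The data are invented, and `pearson` and `pairwise_corr` are hypothetical helper names. Each correlation is computed on whichever cases have both scores, so the n can differ from pair to pair.

```python
from itertools import combinations
from statistics import mean

# Hypothetical scores; None marks a missing value.
data = {
    "age":    [23, 31, 45, 52, None, 38],
    "stress": [4.0, None, 3.5, 2.0, None, 3.0],
    "sleep":  [7.5, 6.0, None, 8.0, 5.5, 7.0],
}


def pearson(x, y):
    """Pearson correlation for two equal-length sequences of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pairwise_corr(data):
    """Correlate each pair of variables using all cases available for that pair."""
    out = {}
    for v1, v2 in combinations(data, 2):
        pairs = [(a, b) for a, b in zip(data[v1], data[v2])
                 if a is not None and b is not None]
        x, y = zip(*pairs)
        out[(v1, v2)] = (pearson(x, y), len(pairs))
    return out


for (v1, v2), (r, n) in pairwise_corr(data).items():
    print(f"{v1} x {v2}: r = {r:.2f}, n = {n}")
```

Listwise deletion on the same data would leave only three complete cases for every correlation.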

6
Q

Regression Imputation

A

Replace each missing score with the predicted score from a regression based on all available cases

Standard errors are too small because the imputed scores carry no residual variance
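
A small sketch of single regression imputation with one predictor, using made-up numbers. It illustrates why the standard error ends up too small: every imputed score sits exactly on the regression line, so it adds sample size without adding any error variance.

```python
from statistics import mean

# Hypothetical data: x is fully observed, y has missing scores (None).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, None]

# Fit y = intercept + slope * x on the available cases only.
obs = [(a, b) for a, b in zip(x, y) if b is not None]
xo, yo = zip(*obs)
mx, my = mean(xo), mean(yo)
slope = sum((a - mx) * (b - my) for a, b in obs) / sum((a - mx) ** 2 for a in xo)
intercept = my - slope * mx

# Replace each missing y with its predicted score.
y_imputed = [b if b is not None else intercept + slope * a for a, b in zip(x, y)]

# Residuals of the imputed cases are exactly zero: no error variance was added.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y_imputed)]
imputed_resid = [r for r, b in zip(residuals, y) if b is None]
print("residuals of imputed cases:", imputed_resid)
```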

7
Q

Last value carried forward

A

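No longer considered valid

Approach used in longitudinal designs

When attrition (drop out) loses a data point, the last observed score replaces it (e.g. if a participant drops out at the third wave, their 2nd wave score replaces the third wave score)

Used in intention-to-treat analysis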
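
A minimal sketch of last value carried forward for one participant's scores across waves. `locf` is a hypothetical helper name and the scores are made up.

```python
def locf(scores):
    """Fill each missing wave (None) with the most recent observed score."""
    filled, last = [], None
    for s in scores:
        if s is not None:
            last = s
        filled.append(last)
    return filled


# Participant drops out after wave 2 of a four-wave study.
waves = [12, 15, None, None]
print(locf(waves))  # [12, 15, 15, 15]
```

The filled-in scores assume no change after drop out, which is one reason the method has fallen out of favour.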

8
Q

Problems with old approaches

A

Underestimating error variance
SE too small
CI too narrow
Type I error rate too high
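
A quick illustration of the problem using mean substitution on made-up scores. Filling in the sample mean leaves the mean unchanged but shrinks the standard deviation and inflates n, so the standard error comes out too small.

```python
from statistics import mean, stdev

# Hypothetical scores; None marks a missing value.
scores = [10, 12, None, 15, None, 18, 20, None]

observed = [s for s in scores if s is not None]
m = mean(observed)
filled = [s if s is not None else m for s in scores]  # mean substitution

sd_observed, sd_filled = stdev(observed), stdev(filled)
se_observed = sd_observed / len(observed) ** 0.5
se_filled = sd_filled / len(filled) ** 0.5

print(f"observed only: SD = {sd_observed:.2f}, SE = {se_observed:.2f}")
print(f"mean-filled:   SD = {sd_filled:.2f}, SE = {se_filled:.2f}")
```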

9
Q

Previous approach to missingness

A

Treat missingness as a nuisance factor and try to lessen its impact

10
Q

Rubin and Little approach to missingness

A

Estimate missingness statistically

Mechanism of missingness is important

11
Q

Types of missingness

A

Ignorable and nonignorable

Ignorable - fewer constraints on type of analysis and reduced bias, but still problems with power (precision)

Nonignorable - listwise deletion will lead to problems with both bias and precision

12
Q

Three Types of missing data

A

MCAR - Missing completely at random
MAR - Missing at random
MNAR - Missing not at random

13
Q

MCAR

A

Ignorable

Probability of a score being missing on a given variable is not conditional on that variable itself or on other variables in the data set

Cause of missingness is completely outside of the data

14
Q

MAR

A

Ignorable

Probability of being missing on a given variable is not conditional on the variable itself but IS conditional on other variables in the data set

e.g. older people are less likely to respond to a question on sex

15
Q

MNAR

A

Non-ignorable

Probability of being missing on a given variable is conditional on the variable itself; missingness is predicted by what the answer would have been

e.g. embarrassed to answer a question because of what the answer would have been (often affects the outcome variable)

Can lead to big bias

Problematic when trying to estimate population prevalence of behaviour or state (people who are too sick, too drunk are missing from the analysis because of what they would have answered)
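
A simulation sketch of the three mechanisms, under made-up assumptions: a standardised `age` that is always observed, a `score` that depends on age, and arbitrary missingness probabilities of .3, .6 and .1. It shows that the observed mean stays close to the true mean under MCAR, can be recovered within levels of age under MAR, and is biased under MNAR.

```python
import random
from statistics import mean

rng = random.Random(12)
n = 5000
age = [rng.gauss(0, 1) for _ in range(n)]           # always observed
score = [0.5 * a + rng.gauss(0, 1) for a in age]    # may go missing


def apply_missing(values, prob_missing):
    """Set each value to None with its own probability of being missing."""
    return [None if rng.random() < p else v for v, p in zip(values, prob_missing)]


def observed_mean(values):
    return mean(v for v in values if v is not None)


# MCAR: same probability for everyone, unrelated to anything in the data.
mcar = apply_missing(score, [0.3] * n)
# MAR: probability depends on age (another variable), not on score itself.
mar = apply_missing(score, [0.6 if a > 0 else 0.1 for a in age])
# MNAR: probability depends on the score that would have been given.
mnar = apply_missing(score, [0.6 if s > 0 else 0.1 for s in score])

print(f"true mean {mean(score):.2f}")
print(f"MCAR {observed_mean(mcar):.2f}  MAR {observed_mean(mar):.2f}  "
      f"MNAR {observed_mean(mnar):.2f}")

# Under MAR, conditioning on age removes the bias: compare older cases only.
older_true = mean(s for s, a in zip(score, age) if a > 0)
older_obs = mean(s for s, a in zip(mar, age) if a > 0 and s is not None)
print(f"older group: true {older_true:.2f}, observed under MAR {older_obs:.2f}")
```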

16
Q

Approaches to Missing Data

A
Listwise deletion (MCAR)
Multiple Imputation 
Direct maximum likelihood (Modelling approach)
17
Q

Listwise deletion for MCAR

A

Can be used when data are MCAR; however, if a high proportion is missing it will lead to low power
Bias is unacceptable for MAR and MNAR
- Will under- or overestimate the regression weights

18
Q

Direct Maximum Likelihood

A

Use all available data and the modelling is built into the procedure

Statistical way to deal with missingness

Used in SEM and mixed effects regression

Can be used when data are MAR

19
Q

Multiple Imputation

A

Fill in missing data with values that include extra random variance

Overcomes the problem with filling in the sample mean or a predicted score: impoverished variance in the imputed scores (error is underestimated, which leads to Type I errors)

Can be used for MAR

20
Q

Steps to Multiple Imputation

A

1) Do regression imputation, where missing scores are replaced with predicted scores from a regression on all available cases (imputation model)
2) Add random error to each imputed score
3) Repeat the process m separate times to make m of these data sets
4) Run the desired stats on each of the m data sets
5) Take the model parameters of interest (M and b-weight), average them across the m data sets, and use the combined SE to calculate the statistical test
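
The five steps above can be sketched as follows. This is a simplified illustration on simulated data, not a full implementation: the data, the MAR mechanism, m = 20 and the helper `fit_line` are all assumptions, and proper multiple imputation software also redraws the regression coefficients for each data set. The SE in step 5 is combined with Rubin's rules, which add the between-imputation variance to the average within-imputation variance.

```python
import random
from statistics import mean, variance

rng = random.Random(7)

# Hypothetical data: x fully observed; y is MAR (more often missing when x is high).
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
y_full = [2.0 + 1.5 * a + rng.gauss(0, 1) for a in x]
y = [None if rng.random() < (0.5 if a > 0 else 0.1) else b
     for a, b in zip(x, y_full)]


def fit_line(xs, ys):
    """Least-squares line; returns intercept, slope and residual SD."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(xs, ys))
    return intercept, slope, (sse / (len(xs) - 2)) ** 0.5


# Step 1: imputation model fitted on all available cases.
obs = [(a, b) for a, b in zip(x, y) if b is not None]
b0, b1, resid_sd = fit_line(*zip(*obs))

m = 20  # Step 3: number of imputed data sets
estimates, within_vars = [], []
for _ in range(m):
    # Steps 1-2: predicted score plus random error for each missing y.
    y_imp = [b if b is not None else b0 + b1 * a + rng.gauss(0, resid_sd)
             for a, b in zip(x, y)]
    # Step 4: the analysis of interest, here the mean of y and its squared SE.
    estimates.append(mean(y_imp))
    within_vars.append(variance(y_imp) / n)

# Step 5: average the estimates and combine the variances (Rubin's rules).
pooled = mean(estimates)
W = mean(within_vars)            # average within-imputation variance
B = variance(estimates)          # between-imputation variance
total_var = W + (1 + 1 / m) * B
pooled_se = total_var ** 0.5

print(f"pooled mean = {pooled:.2f}, SE = {pooled_se:.3f}")
```

The between-imputation term is what keeps the SE from being too small: it reflects the uncertainty about what the missing scores really were.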

21
Q

Can you tell which type of missingness is present?

A

Can only test MCAR
- if planning to use listwise deletion

No way of telling for MAR and MNAR
- Have to use logical and theoretical understanding
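
One informal way to probe MCAR is to compare respondents and non-respondents on a variable that is fully observed. The sketch below uses invented ages and a hand-written Welch t statistic (`welch_t` is a hypothetical helper). A large difference is evidence against MCAR; a small one does not prove MCAR, and nothing here can separate MAR from MNAR. Formal procedures such as Little's MCAR test exist but are not what this sketch implements.

```python
from statistics import mean, variance

# Hypothetical data: age is fully observed; answered flags whether the
# participant responded to the item of interest.
age = [21, 24, 30, 35, 41, 48, 55, 60, 66, 72]
answered = [True, True, True, True, True, False, True, False, False, False]


def welch_t(a, b):
    """Welch t statistic for the difference in means of two groups."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se


responders = [a for a, ok in zip(age, answered) if ok]
nonresponders = [a for a, ok in zip(age, answered) if not ok]

t = welch_t(nonresponders, responders)
print(f"mean age: responders {mean(responders):.1f}, "
      f"non-responders {mean(nonresponders):.1f}, t = {t:.2f}")
```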

22
Q

What is the best way to deal with missing data?

A

Good clear items that have been piloted
Follow up on non-responders
Mandatory questioning

If missingness is below .5 the approach does not matter
- unlikely to make a difference to the parameter estimates