Week 4 - Preparing for Analysis Flashcards

1
Q

What two pieces of information about missing data should every research paper include?

A
  1. The extent and nature of the missing data
  2. The procedures used to manage the missing data, including the rationale for the method selected

2
Q

What are the three patterns of missingness?

A

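MCAR - missing completely at random (missingness is unrelated to any variable, observed or unobserved)
MAR - missing at random (missingness is related to other observed variables but not to the missing values themselves; a dummy variable coding missing vs. not missing is used to check this)
NMAR - not missing at random (missingness is related to the missing values themselves, so there is a systematic pattern; also called nonignorable nonresponse)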

3
Q

What is listwise deletion?

A

Cases with any missing values are deleted from the analysis entirely (complete case analysis).

4
Q

What is pairwise deletion?

A

The maximum amount of available data is retained. Cases are excluded only from those operations that require a variable on which they have missing data (available case analysis).
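To make the contrast with listwise deletion concrete, here is a minimal sketch in plain Python. The data set, the variable names x, y and z, and the helper functions `listwise` and `pairwise` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: three variables, None marks a missing value.
rows = [
    {"x": 1.0, "y": 2.0, "z": 1.0},
    {"x": 2.0, "y": None, "z": 3.0},
    {"x": 3.0, "y": 6.0, "z": None},
    {"x": 4.0, "y": 8.0, "z": 7.0},
    {"x": 5.0, "y": 10.0, "z": 9.0},
]

def listwise(rows):
    """Keep only cases with no missing value on any variable."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pairwise(rows, a, b):
    """Keep every case that is complete on the two variables being analysed."""
    return [r for r in rows if r[a] is not None and r[b] is not None]

# Listwise: one sample size for the whole analysis.
n_listwise = len(listwise(rows))

# Pairwise: the sample size can differ for each pair of variables.
n_pairwise = {
    pair: len(pairwise(rows, *pair))
    for pair in [("x", "y"), ("x", "z"), ("y", "z")]
}

print("listwise n:", n_listwise)
print("pairwise n:", n_pairwise)
```

Pairwise deletion keeps more cases for some pairs, but each statistic may then rest on a different subset of the sample.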

5
Q

What is mean substitution?

A

Missing values are imputed with the mean value of that variable (this method reduces the variance of the variable, which also attenuates covariances that the variable has with other variables).
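The variance shrinkage can be checked on a small example. This is a sketch with made-up scores; the numbers and variable names are assumptions, not data from the course.

```python
from statistics import mean, variance

observed = [2.0, 4.0, 6.0, 8.0]   # hypothetical non-missing scores
n_missing = 2                      # two cases have no score on this variable

# Mean substitution: every missing value receives the observed mean.
fill = mean(observed)
imputed = observed + [fill] * n_missing

var_before = variance(observed)    # sample variance of the observed scores
var_after = variance(imputed)      # sample variance after substitution

print("mean before/after:", mean(observed), mean(imputed))
print("variance before/after:", var_before, var_after)
```

The mean is unchanged, but the imputed cases add nothing to the sum of squared deviations while still increasing n, so the variance drops.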

6
Q

What is regression substitution?

A

A regression equation based on the nonmissing data is used to predict expected values for the missing data (it's a best guess, but it produces biases in the variances and covariances).

7
Q

What is the difference between stochastic and nonstochastic imputation methods?

A

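Stochastic means having a random probability distribution or pattern that may be analysed statistically but may not be predicted precisely. Stochastic imputation methods therefore add a random component to each imputed value, whereas nonstochastic (deterministic) methods always produce the same imputed value from the same data.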

8
Q

What is pattern-matching imputation (two types)?

A

Two types

  1. Hot-deck: values are imputed by finding participants who match the case with missing data on other variables
  2. Cold-deck: a variation of the above where information from external sources is used to determine the matching variables.
9
Q

What is stochastic regression?

A

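A random value is added to each regression-predicted (imputed) value. This restores some of the variability that plain regression substitution removes, which reduces the bias in variance estimates.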
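A minimal sketch of the idea, comparing deterministic regression substitution with its stochastic version. The data, the helper `fit_line`, and the choice to draw the random value from a normal distribution with the residual standard deviation are assumptions made for illustration.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical complete cases, plus x values for cases whose y is missing.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_obs = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
x_mis = [2.5, 4.5]

intercept, slope = fit_line(x_obs, y_obs)

# Residual standard deviation from the complete cases (n - 2 degrees of freedom).
residuals = [y - (intercept + slope * x) for x, y in zip(x_obs, y_obs)]
resid_sd = (sum(e ** 2 for e in residuals) / (len(x_obs) - 2)) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Regression substitution: imputed values fall exactly on the fitted line.
deterministic = [intercept + slope * x for x in x_mis]

# Stochastic regression: add a random residual to each predicted value.
stochastic = [pred + rng.gauss(0.0, resid_sd) for pred in deterministic]

print("deterministic:", deterministic)
print("stochastic:   ", stochastic)
```

Rerunning with a different seed changes the stochastic values but not the deterministic ones.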

10
Q

What is expectation maximisation (EM)?

A

Two steps

  1. Values for the parameters are obtained with available data. Regression methods are used to impute, on the basis of these initial values.
  2. After this, new values for the parameters are calculated with the newly imputed data along with the original observed data. The process repeats until the estimates change very little from one iteration to the next.
11
Q

What is maximum likelihood?

A

Strategies where observed data are used to estimate parameters, which are then used to estimate the missing scores.

12
Q

What is multiple imputation?

A

Several imputed data sets are created. The same analysis is carried out on each data set to obtain parameter estimates. Final results are obtained by averaging the parameter estimates across the multiple analyses. The pooled estimates, together with their variability within and across the data sets, are then used to construct confidence intervals around the parameter estimates.
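The pooling step can be sketched as follows. The card does not name a pooling method; this example assumes Rubin's rules, a standard choice, and uses made-up estimates and standard errors. For simplicity it uses a normal-approximation 95% interval, whereas Rubin's rules proper use a t reference distribution.

```python
from statistics import mean, variance

# Hypothetical results: the same parameter estimated in m = 5 imputed data sets.
estimates = [0.52, 0.48, 0.55, 0.50, 0.45]
std_errors = [0.10, 0.11, 0.10, 0.12, 0.10]
m = len(estimates)

# Pooled point estimate: the average across the imputed data sets.
pooled = mean(estimates)

# Within-imputation variance: average of the squared standard errors.
within = mean([se ** 2 for se in std_errors])

# Between-imputation variance: how much the estimates differ across data sets.
between = variance(estimates)

# Total variance (Rubin's rules, assumed here) and the pooled standard error.
total_var = within + (1 + 1 / m) * between
pooled_se = total_var ** 0.5

# Normal-approximation 95% confidence interval (a simplification).
ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)

print("pooled estimate:", pooled)
print("pooled SE:", pooled_se)
print("approx. 95% CI:", ci)
```

The between-imputation term is what lets the interval reflect the extra uncertainty caused by the missing data.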

13
Q

What is full information maximum likelihood (FIML)?

A

It estimates parameters on the basis of the available complete data as well as the implied values of the missing data given the observed data.

14
Q

What is the central limit theorem (CLT)?

A

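As the sample size becomes bigger, the sampling distribution of the sample mean gets closer to a normal distribution, regardless of the shape of the population distribution (provided the population has finite variance).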
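A small simulation can illustrate this. The sketch below assumes a skewed exponential population, 2000 replications, and sample sizes of 2 and 50; all of these choices, and the helper names, are arbitrary.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the average cubed z-score."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

def sample_means(n, reps, rng):
    """Means of `reps` samples of size n from a skewed exponential population."""
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

rng = random.Random(42)  # fixed seed, chosen arbitrarily

means_n2 = sample_means(2, 2000, rng)
means_n50 = sample_means(50, 2000, rng)

# A normal distribution has skewness 0, so a value closer to 0 indicates a
# more symmetric, more normal-looking sampling distribution.
skew_n2 = skewness(means_n2)
skew_n50 = skewness(means_n50)

print("skewness of sample means, n = 2: ", skew_n2)
print("skewness of sample means, n = 50:", skew_n50)
```

The sample means for n = 50 should be markedly less skewed than those for n = 2, even though the underlying population is strongly skewed.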
