Week 7 Flashcards
What test do you use to check whether the pattern of missing data is random?
Little's MCAR test
Explain Little's MCAR test:
Little's test checks whether your data are consistent with data that are missing completely at random (MCAR). It compares your data against what would be expected if the missingness were completely random and tests how much they differ. You want a non-significant result: a significant result means your data are not missing completely at random.
If only a small percentage of your data is missing, what can you use?
Listwise or pairwise deletion.
Explain listwise deletion
It deletes the whole case for any participant who has any missing data. None of that case's data will be used in any analysis.
Explain pairwise deletion
It removes a case only from the analyses that require the data that case is missing.
So if a person filled out questions 1 and 2 but not 3, pairwise deletion would leave that person's data out of only the analyses that involve question 3.
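A minimal Python sketch of the difference between the two deletion methods. The survey data and the function names (listwise, pairwise) are made up for illustration; None marks a missing answer.

```python
# Hypothetical survey: three questions, None marks a missing answer.
responses = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": 2, "q2": 3, "q3": None},   # skipped question 3
    {"q1": 5, "q2": None, "q3": 4},   # skipped question 2
    {"q1": 3, "q2": 4, "q3": 2},
]


def listwise(rows):
    """Keep only the cases that have no missing values at all."""
    return [r for r in rows if all(v is not None for v in r.values())]


def pairwise(rows, variables):
    """Keep the cases that are complete on the variables this analysis needs."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


complete_cases = listwise(responses)
q1_q2_cases = pairwise(responses, ["q1", "q2"])
q1_q3_cases = pairwise(responses, ["q1", "q3"])

print(len(complete_cases))  # 2 cases survive listwise deletion
print(len(q1_q2_cases))     # 3 cases can be used for a q1-q2 analysis
print(len(q1_q3_cases))     # 3 cases can be used for a q1-q3 analysis
```

Note that pairwise deletion keeps more data, but each analysis may end up based on a different set of people.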
Explain mean substitution
It replaces each missing value with the mean of the observed scores on that variable.
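A short Python sketch of mean substitution, using made-up scores (None marks a missing value). It also compares the variance before and after, since filling every gap with the same value pulls the spread of the variable down.

```python
from statistics import mean, variance

# Hypothetical scores on one variable; None marks a missing value.
scores = [10, 12, None, 14, None, 16]

observed = [s for s in scores if s is not None]
fill_value = mean(observed)  # mean of the observed scores only

# Replace every missing value with that mean.
imputed = [fill_value if s is None else s for s in scores]

print(imputed)             # [10, 12, 13, 14, 13, 16]
print(variance(observed))  # spread of the observed scores
print(variance(imputed))   # smaller, because the filled-in values sit on the mean
```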
Explain estimation by regression
It treats the variable with missing data like a DV and predicts the missing values from the information that is known about that person.
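A rough Python sketch of estimation by regression with a single predictor. The variables (ages, incomes) and their values are hypothetical, and a real analysis would normally use several predictors and a statistics package; this only shows the idea of predicting the missing score from what is known about the person.

```python
# Hypothetical data: age is fully observed, income has one missing value.
ages = [25, 30, 35, 40, 45]
incomes = [30, 40, None, 60, 70]

# Fit a simple least-squares line on the complete cases only.
pairs = [(x, y) for x, y in zip(ages, incomes) if y is not None]
n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    / sum((x - mean_x) ** 2 for x, _ in pairs)
)
intercept = mean_y - slope * mean_x

# Treat income as the DV and predict it wherever it is missing.
imputed_incomes = [
    intercept + slope * x if y is None else y
    for x, y in zip(ages, incomes)
]

print(imputed_incomes[2])  # 50.0, the value predicted from age 35
```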
Explain Expectation Maximization
It fills in the missing data based on an assumed normal distribution of the data. For this to work, the data must be missing at random.
Explain Multiple Imputation
It makes no assumptions about the randomness of the missing data. It is the gold standard for filling in missing data. However, it is more complex than the other methods.
Violations of assumptions introduce the following biases into the analysis:
- Biased parameter estimates
- Biased standard errors and confidence intervals
- Biased test statistics and p-values
What are the 2 main types of outliers?
Univariate and Multivariate
Explain Multivariate Outliers
A person whose pattern of scores on two or more variables is very different from the rest of the sample. Example: LeBron, where it is the combination of the variables that is the outlier.
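One common way to look for multivariate outliers is the Mahalanobis distance, which measures how far a case is from the centre of the sample while taking the correlation between the variables into account. The cards do not name a method, so treat this as an assumed illustration; the heights and weights below are made up. The last case is not extreme on either variable alone, but its combination (tall and light) is unusual.

```python
from statistics import mean, stdev

# Hypothetical sample: height in cm and weight in kg for eight people.
heights = [160, 165, 170, 175, 180, 185, 190, 186]
weights = [54, 61, 64, 71, 74, 81, 84, 58]

n = len(heights)
mean_h, mean_w = mean(heights), mean(weights)

# Sample variances and covariance (n - 1 in the denominator).
var_h = sum((h - mean_h) ** 2 for h in heights) / (n - 1)
var_w = sum((w - mean_w) ** 2 for w in weights) / (n - 1)
cov_hw = sum(
    (h - mean_h) * (w - mean_w) for h, w in zip(heights, weights)
) / (n - 1)
det = var_h * var_w - cov_hw ** 2  # determinant of the 2x2 covariance matrix


def mahalanobis_sq(h, w):
    """Squared Mahalanobis distance of one case from the sample centre."""
    dh, dw = h - mean_h, w - mean_w
    return (var_w * dh ** 2 - 2 * cov_hw * dh * dw + var_h * dw ** 2) / det


distances = [mahalanobis_sq(h, w) for h, w in zip(heights, weights)]
most_unusual = distances.index(max(distances))

# Univariate z-scores of that same case, for comparison.
z_height = (heights[most_unusual] - mean_h) / stdev(heights)
z_weight = (weights[most_unusual] - mean_w) / stdev(weights)

print(most_unusual)        # 7, the tall-but-light case
print(z_height, z_weight)  # both within 1 SD of the mean
```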
Explain Univariate Outliers
A person's score on ONE variable is very high or very low compared to the other participants.
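A small Python sketch of one way to screen for univariate outliers: convert the scores to z-scores and flag the very large ones. The scores are made up, and the cutoff of 3 standard deviations is an assumption (a common rule of thumb), not something these cards specify.

```python
from statistics import mean, stdev

# Hypothetical scores on ONE variable; the last one looks suspicious.
scores = [12, 14, 15, 13, 16, 14, 15, 13, 14, 15, 13, 14, 16, 12, 45]

CUTOFF = 3  # assumed rule of thumb, in standard deviations

m, sd = mean(scores), stdev(scores)
z_scores = [(s - m) / sd for s in scores]

# Flag any score whose z-score is beyond the cutoff in either direction.
flagged = [s for s, z in zip(scores, z_scores) if abs(z) > CUTOFF]

print(flagged)  # [45]
```

With very small samples a z-score cannot get very large, so a fixed cutoff like this may never flag anything.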
Explain LINEARITY
The assumption is that there is a straight-line relationship between two variables. You need to check for nonlinear relationships between the variables, and you do this by using scatterplots.
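A short Python sketch of why a scatterplot is worth looking at. The data are made up: y_curved depends perfectly on x, but along a curve, so the Pearson correlation (which only measures straight-line association) comes out at zero. The helper name pearson_r is just for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: strength of the straight-line relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]  # straight-line relationship
y_curved = [v ** 2 for v in x]     # perfect but U-shaped relationship

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print(r_linear)  # 1.0
print(r_curved)  # 0.0, even though y is completely determined by x
```

Plotting x against y_curved would show the U shape immediately, which the correlation alone hides.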
Explain NORMALITY
The assumption is not that the distributions of the variables in the data set have to be normal