Page 8 - Preparing for Analysis Flashcards
Before conducting a statistical analysis you need to check your data for...
- Accuracy of data entry
- Missing data
- Outliers
- Normality
- Linearity, homoscedasticity, homogeneity of variance
- Independence
- Multicollinearity and singularity (MANOVA and multiple regression)
- Other assumptions
What can you use to check data entry?
SPSS Procedures
or Frequencies
You must consider the … and … of missing data
Amount and pattern
Amount of data missing
If more than 5% is missing, check the pattern of missingness
What kinds of patterns of missing data are there?
MCAR (missing completely at random)
MAR (missing at random)
MNAR (missing not at random)
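The 5% rule of thumb can be checked with a few lines of code. The following is a minimal sketch in Python, assuming a hypothetical toy data set in which `None` marks a missing value; the variable names and the `percent_missing` helper are illustrative only and not part of any SPSS procedure.

```python
# Hypothetical toy data set: None marks a missing value.
data = {
    "age":    [23, 31, None, 45, 52, 38, 29, 41, 36, 27],
    "income": [None, 42, 38, None, 55, 61, None, 47, 50, 44],
    "score":  [7, 5, 6, 8, 4, 9, 6, 7, 5, 8],
}

THRESHOLD = 5  # percent, the rule of thumb from the card above


def percent_missing(values):
    """Return the percentage of entries that are missing (None)."""
    return 100 * sum(v is None for v in values) / len(values)


# Percentage missing for each variable.
report = {name: percent_missing(vals) for name, vals in data.items()}

# Variables that exceed the threshold and therefore need a pattern check.
needs_pattern_check = [name for name, pct in report.items() if pct > THRESHOLD]

for name, pct in report.items():
    print(f"{name}: {pct:.1f}% missing")
print("Check patterns for:", needs_pattern_check)
```

Any variable listed in `needs_pattern_check` has more than 5% of its values missing, so its pattern of missingness should be examined before choosing how to handle it.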
MAR
A pattern of missingness predictable from other variables in the data set
MNAR
A pattern of missingness related to the variable itself
Little's MCAR test - when are data MCAR?
If the p-value is above 0.05 (a non-significant result, so the missingness does not differ significantly from MCAR)
Missing data can be handled by?
List-wise deletion
Mean substitution
Expectation maximisation
Multiple imputation
List-wise deletion is used when
Few cases missing
Variables not critical to your analysis
Data are missing at random
Missing data on a different variable
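List-wise deletion simply drops every case that has a missing value on any variable. The following is a minimal sketch in Python, assuming a hypothetical data set of six cases stored as dictionaries with `None` marking a missing value; the variable names are illustrative only.

```python
# Hypothetical cases: None marks a missing value.
rows = [
    {"id": 1, "anxiety": 12, "sleep": 7},
    {"id": 2, "anxiety": None, "sleep": 6},
    {"id": 3, "anxiety": 9, "sleep": 8},
    {"id": 4, "anxiety": 15, "sleep": None},
    {"id": 5, "anxiety": 11, "sleep": 5},
    {"id": 6, "anxiety": 8, "sleep": 9},
]

# Keep only cases with a value on every variable (list-wise deletion).
complete_cases = [
    row for row in rows if all(value is not None for value in row.values())
]

dropped = len(rows) - len(complete_cases)
print(f"Kept {len(complete_cases)} of {len(rows)} cases, dropped {dropped}")
```

Note that a single missing value removes the whole case, which is why this approach is only reasonable when few cases are affected.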
Mean substitution
Replacing the missing value with the mean of cases across items
Not highly recommended: it shrinks the variance and can bias results
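One reason mean substitution is discouraged can be seen numerically. The following is a minimal sketch in Python, assuming a hypothetical list of scores with `None` marking a missing value: filling the gaps with the observed mean leaves the mean unchanged but makes the variance smaller, because the filled-in values add no spread.

```python
import statistics

# Hypothetical scores: None marks a missing value.
scores = [4, 6, None, 8, 10, None, 2]

# Mean of the observed (non-missing) values.
observed = [s for s in scores if s is not None]
mean_observed = statistics.mean(observed)

# Mean substitution: replace each missing value with the observed mean.
imputed = [mean_observed if s is None else s for s in scores]

# Sample variance before and after substitution.
var_before = statistics.variance(observed)
var_after = statistics.variance(imputed)

print("Mean before/after:", mean_observed, statistics.mean(imputed))
print("Variance before/after:", var_before, var_after)
```

The smaller variance after substitution can make standard errors look smaller than they should be, which is part of why the other methods are preferred.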
Expectation maximisation
Estimates the shape of the distribution and infers the likelihood of the value falling within that distribution
Simplest and most reasonable approach when data are missing at random
Multiple imputation
Uses regression to predict values based on other variables in your dataset
Most respectable; can be used when data are MNAR or MCAR
More difficult
An outlier is…
A case with such an extreme value on one variable (univariate) or such a strange combination of scores on two or more variables (multivariate) that it distorts statistics
Can lead to Type I (false positive) and Type II (false negative) errors
When can an outlier occur?
Participant interpreted question incorrectly
Experimenter error
Participant's answer comes from a different population
Population of participants has extreme values and is not normally distributed
Checking univariate outliers
Frequency distribution in histogram
Box-plots
Normal probability plots
Calculating standardised scores (z-scores beyond ±3.29)
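The z-score check can be sketched in a few lines. The following is a minimal example in Python, assuming a hypothetical sample of 20 scores and using the sample standard deviation; the data and variable names are illustrative only. A case is flagged when its standardised score is more extreme than ±3.29.

```python
import statistics

# Hypothetical sample: 19 typical scores and one extreme value.
values = [48, 52, 50, 49, 51, 47, 53, 50, 49, 51,
          48, 52, 50, 49, 51, 50, 47, 53, 50, 120]

CUTOFF = 3.29  # cutoff for |z| from the card above

mean = statistics.mean(values)
sd = statistics.stdev(values)  # sample standard deviation (n - 1)

# Standardise each score: z = (x - mean) / sd.
z_scores = [(x - mean) / sd for x in values]

# Flag cases whose absolute z-score exceeds the cutoff.
outliers = [x for x, z in zip(values, z_scores) if abs(z) > CUTOFF]

print("Univariate outliers:", outliers)
```

In very small samples no case can reach a z-score of 3.29 however extreme it is, so this check is best combined with the histograms and box-plots listed above.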