Lectures 1, 2 and 3 Flashcards
What are the 6 reasons for conducting exploratory data analysis?
- Checking for data entry errors
- Obtaining a thorough descriptive analysis of your data
- Examining patterns that are not otherwise obvious
- Analysing and dealing with missing data
- Checking for outliers
- Checking assumptions
What are the two options for dealing with data entry errors?
- Remove data
- Make an “educated guess” about what was intended
How to examine patterns that are not otherwise obvious?
- Stem and leaf plots
- Box and whisker plots
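As a rough illustration of what a stem and leaf plot shows, here is a minimal Python sketch. The function name `stem_and_leaf` and the scores are made up for this example, and it assumes non-negative whole numbers with the tens as the stem and the units as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return text rows of a stem and leaf plot (stem = tens, leaf = units).

    Assumes a non-empty list of non-negative integers.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


# Hypothetical test scores, invented for illustration.
scores = [12, 15, 21, 23, 23, 28, 34, 35, 61]
lines = stem_and_leaf(scores)
for line in lines:
    print(line)
```

The empty rows for stems 4 and 5 make it easy to see that the score of 61 sits apart from the rest, which is the kind of pattern a summary statistic would hide.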
What does screening and cleaning involve?
- Computing new variables from existing ones
- Recoding variables
- Dealing with missing data
How to check for data entry errors in categorical/nominal variables?
Frequencies command
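The same idea can be sketched outside SPSS. The Python below is only an illustration: the variable `gender`, its values, and the assumption that 1 and 2 are the only valid codes are all invented for the example.

```python
from collections import Counter

# Hypothetical coded responses for a nominal variable.
# Assumption for this example: only codes 1 and 2 are valid.
gender = [1, 2, 2, 1, 1, 22, 2, 1, 3, 2]
valid_codes = {1, 2}

# Frequency table: how many times each code occurs.
frequencies = Counter(gender)

# Any code outside the valid set is a likely data entry error.
suspect = {code: n for code, n in frequencies.items() if code not in valid_codes}

for code in sorted(frequencies):
    print(code, frequencies[code])
print("Suspect codes:", suspect)
```

Codes that appear in the frequency table but are not part of the coding scheme (here 22 and 3) stand out immediately.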
How to check for data entry errors in continuous/scale variables?
The outliers option in the explore command
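The outliers option lists the most extreme cases at each end of the distribution. A minimal Python sketch of that listing follows; the function name `extreme_values`, the default of five cases per end, and the ages are assumptions made for illustration.

```python
def extreme_values(values, k=5):
    """Return the k lowest and k highest values as (case number, value) pairs.

    Case numbers are 1-based positions in the original list.
    """
    cases = sorted(enumerate(values, start=1), key=lambda pair: pair[1])
    lowest = cases[:k]
    highest = cases[-k:][::-1]  # largest value first
    return lowest, highest


# Hypothetical ages in years; 230 and 2 look like typing mistakes.
ages = [23, 31, 27, 45, 230, 38, 29, 2, 41, 35, 33, 26]
lowest, highest = extreme_values(ages)

print("Highest:", highest)
print("Lowest:", lowest)
```

Impossible or implausible values appear at the top of one of the two lists, together with the case number needed to find them in the data file.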
What is the normality assumption?
It is assumed that your data come from a population that is normally distributed.
What does homogeneity of variance assume?
It is assumed that, if your data are divided into groups, the level of variability in the groups will be approximately equal (i.e., not significantly different)
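A quick descriptive look at this assumption is to compare the group variances directly. The Python sketch below uses invented group names and scores; it only reports the variances and their ratio, and is not a formal significance test (that would need something like Levene's test, which these notes do not name).

```python
from statistics import variance

# Hypothetical scores for two groups, invented for illustration.
groups = {
    "control": [10, 12, 11, 13, 14],
    "treatment": [5, 15, 10, 20, 0],
}

# Sample variance (n - 1 denominator) for each group.
variances = {name: variance(scores) for name, scores in groups.items()}

# Ratio of the largest to the smallest variance; values near 1 suggest similar spread.
ratio = max(variances.values()) / min(variances.values())

for name, var in variances.items():
    print(name, var)
print("Largest / smallest variance:", ratio)
```

A ratio far above 1, as in this made-up data, is a warning sign that the groups do not have similar variability.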
What are the four ways normality is tested?
- Visual inspection of histograms and stem and leaf plots
- Visual inspection of normality and detrended normality plots
- Normality tests
- Skewness divided by the standard error (SE) of skewness
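The last check in the list can be worked through by hand. The Python sketch below assumes the bias-adjusted sample skewness and its usual standard error formula; the function name `skewness_z` and the data are invented for the example.

```python
import math


def skewness_z(values):
    """Return (skewness, SE of skewness, skewness / SE).

    Assumes the bias-adjusted sample skewness:
        G1 = n / ((n - 1)(n - 2)) * sum(((x - mean) / s) ** 3)
    and SE = sqrt(6n(n - 1) / ((n - 2)(n + 1)(n + 3))).
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("all values are identical")
    skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)
    se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew, se, skew / se


# Hypothetical data: one symmetric sample and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness_z(symmetric))
print(skewness_z(right_skewed))
```

The further the ratio is from zero, the stronger the evidence of skew. A cutoff of about ±1.96 is a common convention, but it is an assumption here rather than something stated in these notes.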
What are two reasons to recode data?
- Reducing numbers of groups
- Reverse scoring
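Both kinds of recoding can be sketched in a few lines of Python. The function names, the 1 to 5 response scale, and the age bands below are assumptions chosen for illustration only.

```python
def reverse_score(score, low=1, high=5):
    """Reverse a score on a low..high scale (on 1..5: 1 -> 5, 2 -> 4, and so on)."""
    return (low + high) - score


def collapse_age(age):
    """Collapse age in years into three groups (hypothetical bands)."""
    if age < 30:
        return 1  # under 30
    if age < 50:
        return 2  # 30 to 49
    return 3      # 50 and over


# Hypothetical responses to a negatively worded item on a 1-5 scale.
item = [1, 2, 5, 4, 3]
item_reversed = [reverse_score(x) for x in item]

# Hypothetical ages recoded into fewer groups.
ages = [19, 34, 52, 29, 50]
age_group = [collapse_age(a) for a in ages]

print(item_reversed)
print(age_group)
```

Reverse scoring keeps the midpoint of the scale in place, and collapsing replaces many distinct values with a small number of group codes.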