Module 2 - Examining Data Flashcards
Wide format
What is a Data dictionary?
Why do we need to examine data?
important
What are Data Entry Errors?
“Impossible values”: e.g., a person’s age of 200 years or a height of 1.8 cm
Wrong coding: e.g., a two-group variable coded 0/1 that contains a stray value (0, 0, 0, 0, 0, 1, 1, 1, 1, A, 0, 1)
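Both kinds of entry error can be screened for with simple checks. The sketch below is illustrative only: the sample data, the 0 to 120 age range, and the helper names `out_of_range` and `invalid_codes` are assumptions, not part of the flashcards.

```python
# Hypothetical records; the suspicious values mirror the flashcard examples.
ages = [34, 200, 51, 27]
groups = [0, 0, 0, 0, 0, 1, 1, 1, 1, "A", 0, 1]

# Assumed plausible range for a person's age in years (illustrative choice).
AGE_MIN, AGE_MAX = 0, 120
# Assumed valid codes for a two-group variable.
VALID_GROUP_CODES = {0, 1}


def out_of_range(values, low, high):
    """Return (index, value) pairs that fall outside [low, high]."""
    return [(i, v) for i, v in enumerate(values) if not (low <= v <= high)]


def invalid_codes(values, valid):
    """Return (index, value) pairs whose code is not in the valid set."""
    return [(i, v) for i, v in enumerate(values) if v not in valid]


impossible_ages = out_of_range(ages, AGE_MIN, AGE_MAX)
wrong_codes = invalid_codes(groups, VALID_GROUP_CODES)

print(impossible_ages)  # [(1, 200)]
print(wrong_codes)      # [(9, 'A')]
```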
What are Outliers?
Extreme or rare values that lie far away from the other scores in the same distribution
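One common way to make "far away" concrete is the 1.5 × IQR rule. The flashcards do not name a specific rule, so treat the rule and the made-up scores below as assumptions chosen for illustration.

```python
import statistics

# Hypothetical scores with one value far from the rest.
scores = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]

# Quartiles of the distribution (statistics.quantiles returns Q1, Q2, Q3 for n=4).
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

# Assumed convention: flag values beyond 1.5 * IQR from the quartiles.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

outliers = [s for s in scores if s < low_fence or s > high_fence]
print(outliers)  # [102]
```

A flagged value is only a candidate: it may be a genuine rare score or a data entry error, so it still needs to be checked against the source.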
What is Missing Data?
Data not captured, impossible to read, or lost (blank cells)
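Counting blank cells per variable is a quick first check. This is a minimal sketch: the records are made up, and it assumes blanks arrive as `None` or as empty strings.

```python
# Hypothetical records; None and "" both stand for blank cells.
rows = [
    {"id": 1, "age": 34, "height_cm": 180},
    {"id": 2, "age": None, "height_cm": 165},
    {"id": 3, "age": 51, "height_cm": ""},
]


def is_missing(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


# Number of missing cells in each column.
missing_counts = {
    col: sum(is_missing(row.get(col)) for row in rows) for col in rows[0]
}
print(missing_counts)  # {'id': 0, 'age': 1, 'height_cm': 1}
```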
What are 3 tools for examining data?
- Tables
- Graphs
- Statistics
What are 2 Types of Tables?
- Frequency Distribution Table
- Stem and Leaf Table
Frequency Distribution Table
Frequency distribution table: example with a ratio variable (missing data)
Frequency distribution table: example with an ordinal variable with categories
When is a Frequency Distribution Table used?
- Used for all levels of measurement (NOIR: nominal, ordinal, interval, ratio)
- Detects entry errors and missing data
- More useful for Nominal and Ordinal with categories
- Can be tedious with ratio variables
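A frequency distribution table can be built by counting each distinct value. The sketch below uses made-up ordinal responses, with an empty string standing in for a blank cell and "hgih" for a typing error; both are assumptions for illustration.

```python
from collections import Counter

# Hypothetical ordinal responses; "" is a blank cell, "hgih" a typing error.
responses = ["low", "medium", "high", "medium", "", "low", "hgih", "medium"]

freq = Counter(responses)
total = len(responses)

# Listing every distinct value makes unexpected entries stand out.
for value, count in sorted(freq.items()):
    label = value if value else "(missing)"
    print(f"{label:<10} {count:>3} {100 * count / total:5.1f}%")

# Anything outside the expected categories is an entry error or a blank.
expected = {"low", "medium", "high"}
unexpected = {v: c for v, c in freq.items() if v not in expected}
print(unexpected)
```

With a ratio variable the same code works, but the table gets one row per distinct value, which is why it becomes tedious.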
Stem and Leaf Table
Stem and Leaf Table Example
Score = 79, frequency = 1
Score = 86, frequency = 4
Score = 100, frequency = 1
Score = 101, frequency = 3
When are Stem and Leaf Tables used?
- Should be used for interval and ratio variables
- Similar in structure to a histogram
- Not recommended for nominal or ordinal variables
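A stem and leaf table splits each score into a stem (here the tens) and a leaf (the units). The sketch below uses only the four scores and frequencies listed in the example card as its data; the function name `stem_and_leaf` and the choice of tens as the stem are assumptions.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Return (stem, leaves) rows, with stem = tens and leaf = units digit."""
    leaves = defaultdict(list)
    for score in sorted(scores):
        leaves[score // 10].append(score % 10)
    lowest, highest = min(leaves), max(leaves)
    # Keep empty stems so the shape reads like a histogram turned on its side.
    return [(stem, leaves.get(stem, [])) for stem in range(lowest, highest + 1)]


# Scores taken from the example card: 79 (x1), 86 (x4), 100 (x1), 101 (x3).
scores = [79, 86, 86, 86, 86, 100, 101, 101, 101]
table = stem_and_leaf(scores)

for stem, leaf_list in table:
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaf_list)}")
```

Reading a row back gives the original scores: stem 8 with leaves 6 6 6 6 means the score 86 occurs four times.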