Data Exploration Flashcards

1
Q

What is EDA?

A

EDA: Exploratory Data Analysis
Descriptive statistics
Graphical
Data-driven

2
Q

What is CDA?

A

CDA: Confirmatory Data Analysis
Inferential statistics
EDA-driven and theory-driven

3
Q

How to describe data?

A

Describe data

  • Case: A single object on which several variables are measured
    E.g. a person, an email
  • Variable: A property expressed as a number or category
4
Q

Types of data

A

Types of data

  • Qualitative (categorical)
    • Nominal (the number is just a symbol that identifies a quality)
    • Ordinal (rank order)
  • Quantitative (continuous or discrete)
    • Interval (units are of identical size, e.g. years)
    • Ratio (distance from an absolute zero, e.g. age)
5
Q

Measures for describing variables

A
Variables
Measures of central tendency
Mean (average value)
Median (middle value)
Mode (most frequent value)

Measures of variability
Variance (spread around the mean)
Standard deviation (square root of the variance)
Standard error of the mean (estimated standard deviation of the sample mean)
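
As a quick illustration of the measures above, here is a minimal sketch using Python's standard `statistics` module. The `scores` list is made-up sample data, and the sample (n - 1) versions of variance and SD are assumed.

```python
import math
import statistics

scores = [2, 4, 4, 5, 7, 8]  # hypothetical sample data

# Measures of central tendency
mean = statistics.mean(scores)      # average value
median = statistics.median(scores)  # middle value
mode = statistics.mode(scores)      # most frequent value

# Measures of variability (sample versions, n - 1 in the denominator)
variance = statistics.variance(scores)  # spread around the mean
sd = statistics.stdev(scores)           # square root of the variance
sem = sd / math.sqrt(len(scores))       # standard error of the mean

print(mean, median, mode, variance, round(sd, 3), round(sem, 3))
```

For population data, `statistics.pvariance` and `statistics.pstdev` would be the likely substitutes.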

6
Q

Outliers and errors: what are they and how do you fix them?

A
  • Mistake in the data collection phase (typo)?
  • Data actually from outside the targeted population?
  • Multiple distributions?
  • Simple chance?
  • Complications

What to do?

  • If you find a mistake: fix or delete
  • If you find an outlier: trim, winsorize, or delete
  • If your distribution is skewed: transform the data
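
The three remedies for outliers and skew can be sketched in plain Python. The helper names (`trim`, `winsorize`, `log_transform`), the cutoff `k`, and the sample data are all assumptions made for illustration; a log transform is only one of several possible transformations and requires strictly positive values.

```python
import math

def trim(values, k=1):
    """Drop the k smallest and k largest values (needs len(values) > 2 * k)."""
    ordered = sorted(values)
    return ordered[k:len(ordered) - k]

def winsorize(values, k=1):
    """Clamp the k smallest/largest values to the nearest retained value."""
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    return [min(max(v, low), high) for v in values]

def log_transform(values):
    """Compress a right-skewed, strictly positive variable with a natural log."""
    return [math.log(v) for v in values]

data = [3, 5, 6, 7, 8, 120]  # hypothetical sample; 120 is a suspected outlier

trimmed = trim(data)          # extremes removed
winsorized = winsorize(data)  # extremes pulled in, sample size unchanged
logged = log_transform(data)  # same cases, compressed scale

print(trimmed)
print(winsorized)
```

Trimming shrinks the sample, while winsorizing keeps every case but limits how far the extremes can pull the mean.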
7
Q

How do you interpret standard deviation (SD)?

A

Interpreting standard deviation (SD)

  • SD tells you how the scores are distributed around the mean
  • A high SD (relative to the mean) indicates the scores are spread out
  • A low SD tells you that most scores are very near the mean
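
To make the contrast concrete, here is a small sketch comparing two made-up score sets that share the same mean. The data and variable names are hypothetical; the ratio SD/mean (sometimes called the coefficient of variation) is used here as one way to read "SD relative to the mean".

```python
import statistics

tight = [48, 49, 50, 51, 52]   # hypothetical scores clustered near the mean
spread = [10, 30, 50, 70, 90]  # hypothetical scores far from the mean

results = {}
for name, scores in (("tight", tight), ("spread", spread)):
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)           # sample SD
    results[name] = (mean, sd, sd / mean)   # last item: SD relative to the mean
    print(f"{name}: mean={mean}, SD={sd:.2f}, SD/mean={sd / mean:.2f}")
```

Both sets have the same mean, so the mean alone cannot tell them apart; the SD is what shows how closely the scores cluster.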