Chapter 2 Flashcards
What are the two types of data analysis?
Exploratory and explanatory
What is exploratory?
Looking for patterns
What is explanatory?
Communicating the conclusion. Explanatory plots have to be designed so that they are easy to interpret.
What are the four characteristics of a good plot?
- Show the data
- Make patterns easy to see
- Display magnitudes honestly
- Draw graphics clearly
What are the characteristics of a bad plot?
Hiding the data: variations and densities
(Unable to see patterns, unable to say anything about the population - how many)
How to show patterns?
Scales and arrangement: use appropriate scales and arrange the data in a reasonable way
“Zoom” into the more relevant scales
Adjust the axis in such a way that it shows patterns
What is representing magnitudes dishonestly?
Not starting the y-axis at 0 in a bar plot. The height of a bar implies magnitude, so the y-axis should start at 0.
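A small arithmetic sketch of why the baseline matters. The two values and the truncated baseline of 45 are made up for illustration; `apparent_ratio` is a hypothetical helper, not something from the chapter.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)

# Hypothetical values, chosen only for illustration.
low, high = 50.0, 55.0

true_ratio = apparent_ratio(high, low)             # y-axis starts at 0
truncated_ratio = apparent_ratio(high, low, 45.0)  # y-axis starts at 45

print(true_ratio)       # 1.1
print(truncated_ratio)  # 2.0
```

With the axis starting at 45, a 10% difference is drawn as a bar twice as tall.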
What are two things to consider when drawing graphical elements clearly?
Is it readable? Does it work for a diverse audience (e.g., colorblind readers)?
What is the difference between a bar graph and a histogram?
A bar graph has space between the bars and a histogram does not. The x-axis of a bar graph represents a categorical variable, while the x-axis of a histogram represents a numerical variable.
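A minimal sketch of the counting behind each plot, assuming made-up data and a bin width of 1.0 that starts at 1.0. A bar graph counts observations per category; a histogram counts observations per adjacent numerical bin.

```python
from collections import Counter

# Hypothetical data, for illustration only.
colors = ["red", "blue", "red", "green", "blue", "red"]  # categorical
heights = [1.2, 1.9, 2.4, 2.6, 3.1, 3.8, 4.0]            # numerical

# Bar graph: one bar per category, bar height = frequency.
bar_heights = Counter(colors)

# Histogram: cut the number line into adjacent bins of equal width,
# then count the observations that fall into each bin.
bin_width = 1.0  # assumed bin width
start = 1.0      # assumed left edge of the first bin
hist_counts = Counter(int((h - start) // bin_width) for h in heights)

print(dict(bar_heights))
print(dict(hist_counts))
```

Because the bins share edges, histogram bars touch; categories have no such ordering, so bar graph bars are separated.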
What is a pie chart and why is it not as good?
Displays the frequencies of a categorical variable. The issue with a pie chart is that the “slices” are hard to see when a frequency is small.
What kind of shapes can histogram have?
Skewness: right/positive skew and left/negative skew
What are types of histograms and density plots?
Bell shaped, bimodal, skewed, and uniform
What is an outlier?
An extreme observation
Showing association between two categorical variables (categories different from each other)
Contingency table, Grouped bar graph, Mosaic Plot
Contingency Table
Frequency of occurrence of all combinations of two or more categorical variables
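A contingency table can be built by counting pairs. This is a sketch with made-up observations; the group and outcome names are placeholders.

```python
from collections import Counter

# Hypothetical observations: (group, outcome) pairs.
observations = [
    ("treated", "survived"), ("treated", "survived"), ("treated", "died"),
    ("control", "survived"), ("control", "died"), ("control", "died"),
]

# Each cell is the number of times a combination occurs.
table = Counter(observations)

rows = sorted({group for group, _ in observations})
cols = sorted({outcome for _, outcome in observations})

print("columns:", cols)
for group in rows:
    print(group, [table[(group, outcome)] for outcome in cols])
```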
Grouped Bar Graph
Heights display frequency distributions of 2 or more categorical variables
Mosaic Plot
Areas of rectangles display the relative frequency of occurrence of all combinations of 2 or more categorical variables
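A sketch of how the rectangle areas could be computed, assuming the usual layout where the width is the share of a group and the height is the share of an outcome within that group. The counts are made up.

```python
# Hypothetical counts for two categorical variables.
counts = {
    ("treated", "survived"): 30, ("treated", "died"): 10,
    ("control", "survived"): 20, ("control", "died"): 40,
}
total = sum(counts.values())

# Total number of observations in each group.
group_totals = {}
for (group, _), n in counts.items():
    group_totals[group] = group_totals.get(group, 0) + n

areas = {}
for (group, outcome), n in counts.items():
    width = group_totals[group] / total   # share of all observations in this group
    height = n / group_totals[group]      # share of this outcome within the group
    areas[(group, outcome)] = width * height  # same as n / total

for cell, area in areas.items():
    print(cell, round(area, 3))
```

The areas sum to 1, so each rectangle is the relative frequency of its combination.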
Showing Associations between two numerical variables
Scatter plot: positive association, negative association, and absent (no) association
Showing association between one numerical and one categorical variable
Strip chart: each observation is a dot, they are jittered
Multiple histograms: stack vertically
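A sketch of the jitter used in a strip chart: each dot keeps its measured value, and only its position along the category axis gets a small random offset so overlapping dots separate. The data, the seed, and the jitter width of 0.2 are assumptions for illustration.

```python
import random

# Hypothetical measurements for two groups; note the repeated values.
data = {"A": [2.0, 2.0, 2.1, 3.5], "B": [1.0, 1.0, 1.0, 4.2]}

rng = random.Random(0)  # fixed seed so the jitter is reproducible
jitter_width = 0.2      # assumed spread, small relative to the gap between groups

points = []
for position, (group, values) in enumerate(data.items()):
    for y in values:
        # Jitter only the category axis; the numerical value is untouched.
        x = position + rng.uniform(-jitter_width, jitter_width)
        points.append((group, x, y))

for point in points:
    print(point)
```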
How to show trends in time or/and place?
Line graph and map
Line graph
- Connect the dots
- Trends
- Rate of change
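The rate of change in a line graph is the slope of each connecting segment. A sketch with made-up yearly values:

```python
# Hypothetical yearly measurements.
years = [2000, 2001, 2002, 2003]
values = [10.0, 12.0, 11.0, 15.0]

points = list(zip(years, values))

# Slope of each segment between consecutive points: change in value / change in time.
slopes = [
    (v1 - v0) / (t1 - t0)
    for (t0, v0), (t1, v1) in zip(points, points[1:])
]

print(slopes)  # [2.0, -1.0, 4.0]
```

Steeper segments mean faster change; a negative slope means the value is falling.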
Map
- Color gradient
- Explanatory variables = location in space