Summarizing Data Flashcards
1
Q
Describe univariate data
A
- 1 variable
- For describing a distribution
2
Q
Describe bivariate and multivariate data
A
- 2 / 3 or more respectively
- For exploring relationships between variables
3
Q
Graphs that can be used for any type of categorical or quantitative data:
A
- Categorical:
  - Bar graph
- Quantitative:
  - Histogram
  - Dotplot
  - Boxplot
4
Q
Best graph for: 2 quantitative (and 1 categorical sometimes)
A
- Scatterplot
5
Q
Best graph for: 2 categorical
A
Two-way table
6
Q
Best graph for: 1 quantitative, 1 (or 2) categorical
A
- Stripchart
- Boxplot
7
Q
How do you determine skew?
A
- mean > median = right skew
- Boxplot: mean is above the median
- mean < median = left skew
- Boxplot: mean is below the median
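The mean-versus-median comparison above can be sketched in a few lines. This is only a rule of thumb, not a formal skewness test; the function name `skew_direction` and the sample data are made up for illustration.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule-of-thumb skew check: compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right"      # long tail of large values pulls the mean up
    if m < med:
        return "left"       # long tail of small values pulls the mean down
    return "symmetric"      # mean and median coincide

# Illustrative data: one large value creates a right tail.
right_tailed = [1, 2, 2, 3, 3, 4, 20]   # mean = 5, median = 3
print(skew_direction(right_tailed))      # right
```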
8
Q
Define standard deviation
A
- Standard deviation: measure of spread about the mean
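As a rough illustration of "spread about the mean", the sketch below computes the sample standard deviation by hand and compares it with Python's `statistics.stdev`. It assumes the sample (n - 1) version is wanted; the function name `sample_sd` and the data are made up.

```python
from math import sqrt
from statistics import mean, stdev

def sample_sd(values):
    """Sample standard deviation: root of the average squared deviation (n - 1 divisor)."""
    m = mean(values)
    squared_devs = [(x - m) ** 2 for x in values]   # squared distance of each value from the mean
    return sqrt(sum(squared_devs) / (len(values) - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]          # illustrative values, mean = 5
print(sample_sd(data), stdev(data))      # the two results should agree
```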
9
Q
Describe the quartiles shown in a boxplot
A
- Each section (each whisker and each half of the box) holds about 25% of the data
- However, other percentiles (e.g., the 85th) cannot be read from a boxplot
- Q1: median of the lower half of the data
- Q2: the median
- Q3: median of the upper half of the data
10
Q
What is the 5 number summary?
A
- Minimum, Q1, median (M), Q3, maximum
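A minimal sketch of the five-number summary, assuming the "median of each half" rule for Q1 and Q3 and leaving the overall median out of both halves when the count is odd. Software packages use several slightly different quartile conventions, so results may differ. The name `five_number_summary` and the data are illustrative.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                  # values left of the median
    upper = data[half + 1:] if n % 2 else data[half:]    # values right of the median
    return (data[0], median(lower), median(data), median(upper), data[-1])

sample = [1, 3, 4, 6, 7, 8, 10, 12, 15]
print(five_number_summary(sample))   # minimum 1, Q1 3.5, median 7, Q3 11, maximum 15
```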
11
Q
What is IQR and what is it used for?
A
- IQR = Q3 - Q1 (the spread of the middle 50% of the data)
- Used to flag outliers: values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR
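The 1.5 × IQR rule can be sketched as follows: a value is flagged when it falls below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. The helper names `iqr_fences` and `find_outliers` are hypothetical, and Q1 and Q3 are passed in directly because quartile conventions vary.

```python
def iqr_fences(q1, q3, k=1.5):
    """Lower and upper outlier fences for the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3):
    """Return the values lying outside the fences."""
    low, high = iqr_fences(q1, q3)
    return [x for x in values if x < low or x > high]

# Illustrative data with Q1 = 3.5 and Q3 = 11 (median-of-halves rule).
sample = [1, 3, 4, 6, 7, 8, 10, 12, 40]
print(iqr_fences(3.5, 11))            # (-7.75, 22.25)
print(find_outliers(sample, 3.5, 11)) # [40]
```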
12
Q
Define missing completely at random (MCAR)
A
- MCAR: missingness is purely random and unrelated to any variable, observed or not (e.g., accidentally skipped the question, accidentally dropped the test tube)
13
Q
Define missing not at random (MNAR)
A
- MNAR: likelihood of missingness is associated with the missing value itself (e.g., a low test score has a higher chance of being missing)
14
Q
Define missing at random (MAR)
A
- MAR: the likelihood of missingness depends on a second, observed variable, not on the missing value itself (e.g., females being less likely to report their weight)
15
Q
Describe the 3 ways to handle missing values
A
- Deletion
  - Complete case: remove any individual with a missing value, which decreases the sample size
  - Pairwise: use every value available for each analysis (e.g., 8 individuals, but only 7 data points for income)
- Imputation: replace with the mean, the most frequent value, or a predicted value (controversial)
- Treat as a new category (e.g., a missing choice of colour is now ‘N/A’)
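The three approaches can be illustrated on a tiny made-up dataset, using `None` to stand for a missing value. The field names `income` and `colour` and all the values are hypothetical, and mean imputation is only one of the replacement options listed above.

```python
from statistics import mean

# Made-up records; None marks a missing value.
records = [
    {"income": 40, "colour": "red"},
    {"income": None, "colour": "blue"},
    {"income": 60, "colour": None},
]

# Deletion, complete case: keep only individuals with no missing values.
complete_cases = [r for r in records if None not in r.values()]

# Deletion, pairwise: for an income analysis, use every income that was observed.
observed_income = [r["income"] for r in records if r["income"] is not None]

# Imputation: fill missing incomes with the mean of the observed incomes.
income_mean = mean(observed_income)
imputed = [
    {**r, "income": income_mean if r["income"] is None else r["income"]}
    for r in records
]

# New category: label a missing colour as its own category.
with_category = [
    {**r, "colour": "N/A" if r["colour"] is None else r["colour"]}
    for r in records
]

print(len(complete_cases), len(observed_income), income_mean)   # 1 2 50
```

Complete-case deletion leaves one of the three individuals, while the pairwise approach still uses two incomes, which is why the two can give different sample sizes for the same dataset.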