Describing Data Flashcards
1
Q
Qualitative description properties - 3
A
- Shape
- Location
- Dispersion
2
Q
Shape - 5 (DEF + LIST)
A
- Make a smooth approximation of the histogram
- Shape of smooth curve can give idea of data distribution
- Symmetrical
- Right/Left Skewed
- Uniform
3
Q
Location - 1
A
- The position of the peak on the x axis
4
Q
Dispersion - 1
A
- How much the data spreads on the x axis
5
Q
Numerical summaries - 1
A
- Describe the data distribution with numerical values
6
Q
Measures of center - 4 (DEF + LIST)
A
- Value which stands at the center or middle of the data set
- Mean
- Median
- Mode
7
Q
Mean - 4
A
- Mean is the average: the sum of all elements divided by the number of elements
- Mean is not robust: it is affected by outliers
- Sample mean is x bar
- Population mean is μ (Mu)
8
Q
Median - 2
A
- Middle value of the data set after sorting
- Is robust: barely affected by outliers
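The robustness claims on the mean and median cards can be checked with a small sketch. The data values below are made up for illustration, and `statistics` is Python's standard library module.

```python
from statistics import mean, median

# Hypothetical sample; 100 is a deliberately added outlier.
data = [2, 3, 4, 5, 6]
with_outlier = data + [100]

mean_clean = mean(data)                # mean is 4
median_clean = median(data)            # median is 4
mean_outlier = mean(with_outlier)      # mean jumps to 20
median_outlier = median(with_outlier)  # median only moves to 4.5

print(mean_clean, median_clean)
print(mean_outlier, median_outlier)
```

A single extreme value pulls the mean from 4 to 20, while the median shifts by only half a unit.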
9
Q
Mode - 4
A
- Value that occurs with highest frequency
- Mostly used for nominal data, not as much for numerical
- The number of modes defines how modal a data set is:
- 1 = Unimodal
- 2 = Bimodal
- 3+ = Multimodal
- When a graph has multiple peaks it follows the same logic
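As a rough illustration of the mode card, the sketch below counts the modes of a nominal data set and labels it accordingly. The helper name `describe_modality` and the color list are hypothetical; `statistics.multimode` requires Python 3.8 or later.

```python
from statistics import multimode

def describe_modality(values):
    """Return the modes of values and a label based on how many modes there are."""
    modes = multimode(values)  # every value tied for the highest frequency
    if not modes:
        raise ValueError("need at least one value")
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

# Hypothetical nominal data: "red" and "blue" both occur twice.
colors = ["red", "blue", "red", "green", "blue"]
modes, label = describe_modality(colors)
print(modes, label)
```

Note that when every value occurs exactly once, all values tie for the highest frequency, so this simple count would label such data multimodal.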
10
Q
Measures of variation - 3
A
- Standard deviation
- Variance
- Range
11
Q
Variance - 3
A
- Is the average squared deviation from the mean
- The sample variance is s^2
- The population variance is σ^2 (Sigma squared)
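A minimal sketch of the variance computation, using made-up data. One detail the card does not spell out: the sample variance s^2 is conventionally computed by dividing by n - 1 rather than n, which is also what `statistics.variance` does, while `statistics.pvariance` divides by n.

```python
from statistics import mean, pvariance, variance

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
center = mean(data)

# Squared deviation of each value from the mean; they sum to 32 here.
squared_deviations = [(x - center) ** 2 for x in data]

population_var = sum(squared_deviations) / n    # sigma^2: divide by n
sample_var = sum(squared_deviations) / (n - 1)  # s^2: divide by n - 1

print(population_var, pvariance(data))
print(sample_var, variance(data))
```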
12
Q
Standard deviation - 4
A
- Measures how much the values typically deviate from the mean
- It is the square root of the variance
- The sample standard deviation is s
- The population standard deviation is σ (Sigma)
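The square-root relationship between standard deviation and variance can be sketched with the standard library. The data is hypothetical; `stdev` and `pstdev` are the sample and population versions in `statistics`.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = stdev(data)       # s, the square root of the sample variance
population_sd = pstdev(data)  # sigma, the square root of the population variance

print(sample_sd, math.sqrt(variance(data)))
print(population_sd, math.sqrt(pvariance(data)))
```

Unlike the variance, the standard deviation is expressed in the same units as the data, which makes it easier to interpret.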
13
Q
Range - 2
A
- Maximum - minimum
- It is very sensitive to extreme values because it uses only two values
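To see why the range is sensitive to extreme values, consider this small sketch. The helper `value_range` and the scores are hypothetical.

```python
def value_range(values):
    """Range of a data set: maximum minus minimum."""
    return max(values) - min(values)

# Hypothetical scores clustered between 10 and 13.
scores = [10, 12, 11, 13, 12]

range_clean = value_range(scores)           # 13 - 10 = 3
range_outlier = value_range(scores + [50])  # 50 - 10 = 40

print(range_clean, range_outlier)
```

One added value changes the range from 3 to 40, even though the other five values are unchanged.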
14
Q
Percentiles and Quartiles - 5 (DEF + LIST)
A
- Percentile Pi indicates that i% of the data is smaller than Pi and (100 - i)% is larger than Pi
- Quartiles divide the data set into four groups, each containing approximately 25% of the values
- Q1 = P25
- Q2 = P50 = Median
- Q3 = P75
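The correspondence between quartiles and percentiles can be sketched with `statistics.quantiles` (Python 3.8 or later). The data is made up, and the choice of `method="inclusive"` is an assumption: several quartile conventions exist and they can give slightly different cut points on small data sets.

```python
from statistics import median, quantiles

# Hypothetical data set.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 yields the three quartile cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# n=100 yields the 99 percentile cut points P1..P99 (index i - 1 holds Pi).
percentiles = quantiles(data, n=100, method="inclusive")
p25, p50, p75 = percentiles[24], percentiles[49], percentiles[74]

print(q1, q2, q3)
print(p25, p50, p75, median(data))
```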
15
Q
5 Number Summary - 6 (DEF + LIST)
A
- Its graphical representation is the boxplot
- Minimum
- First Quartile
- Median
- Third Quartile
- Maximum
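The five values listed above can be collected in one step, as in this sketch. The function name `five_number_summary`, the dictionary keys, and the data are hypothetical, and the quartiles again assume the `"inclusive"` method of `statistics.quantiles`.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return minimum, quartiles, and maximum of values as a dictionary."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "minimum": min(values),
        "first_quartile": q1,
        "median": med,
        "third_quartile": q3,
        "maximum": max(values),
    }

# Hypothetical unsorted data; quantiles() sorts a copy internally.
data = [7, 1, 9, 3, 5, 2, 8, 4, 6]
summary = five_number_summary(data)
print(summary)
```

In a basic boxplot, the box spans the first to the third quartile, the line inside it marks the median, and the whiskers extend toward the minimum and maximum.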