Chapter 2 Flashcards
Frequency distribution
Shows how a data set is partitioned among several categories (or classes) by listing each category along with the number of data values in it.
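A minimal sketch of building a frequency distribution in Python. The `grades` list is made-up data used only for illustration.

```python
from collections import Counter

# Hypothetical sample: letter grades for ten students.
grades = ["B", "A", "C", "B", "B", "A", "D", "C", "B", "A"]

# Count how many data values fall in each category.
frequency = Counter(grades)

# List every category alongside its frequency, in alphabetical order.
for category in sorted(frequency):
    print(category, frequency[category])
```

The frequencies always add up to the number of data values, which is a handy check on a hand-built table.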
Advantages of using means
- means are relatively reliable
- sample means drawn from the same population vary less than other measures of centre
- the mean takes every data value into account
Disadvantages of means
- the mean is sensitive to every data value, so a single extreme value can change it dramatically (it is not a resistant measure of centre)
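To see this sensitivity in numbers, here is a small sketch with made-up values (the `typical` list and the extreme value 500 are assumptions for illustration). The median is shown only for contrast.

```python
from statistics import mean, median

# Hypothetical data set with no extreme values.
typical = [30, 32, 35, 38, 40]

# The same data with one extreme value added.
with_extreme = typical + [500]

print("mean:  ", mean(typical), "->", mean(with_extreme))
print("median:", median(typical), "->", median(with_extreme))
```

One extreme value more than triples the mean here, while the median barely moves.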
If a sample of distances has a mean of 24 meters and a median of 24.5 meters, and one observation is found to have been wrongly recorded as 30 when the actual value was 35, what is the effect on the mean and median?
The median remains the same but the mean increases (30 and 35 both lie above the median, while the sum of the values rises by 5)
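This answer can be checked with a small sketch. The `recorded` values are hypothetical, chosen only so that the mean is 24 and the median is 24.5 as in the question.

```python
from statistics import mean, median

# Hypothetical distances (meters) with mean 24 and median 24.5.
recorded = [10, 20, 24, 25, 30, 35]

# Replace the wrongly recorded 30 with the actual value, 35.
corrected = recorded.copy()
corrected[recorded.index(30)] = 35

print("mean:  ", mean(recorded), "->", mean(corrected))
print("median:", median(recorded), "->", median(corrected))
```

The corrected value stays on the same side of the middle of the ordered data, so the median is unchanged, while the mean rises by 5 divided by the sample size.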
Symmetric data
Distribution of data is symmetrical if the left and right sides of the histogram are roughly mirror images of each other
Skewed to the right
Also called positively skewed
Longer right tail
Mean > median
Skewed to the left
Also called negatively skewed
Longer left tail
Mean < median
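The mean and median relationships for both kinds of skew can be illustrated with a short sketch. Both lists are made-up data, one with a long right tail and one with a long left tail.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are small, one is far to the right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

# Hypothetical left-skewed data: most values are large, one is far to the left.
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, "mean =", mean(data), "median =", median(data))
```

In each case the mean is pulled toward the long tail, while the median stays near the bulk of the data.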
Difference between variance and standard deviation
Standard deviation is the square root of variance
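A brief sketch of this relationship using Python's `statistics` module. The `data` list is made up, and the sample versions (`variance`, `stdev`) are used here as an assumption. The population versions (`pvariance`, `pstdev`) are related in the same way.

```python
import math
from statistics import stdev, variance

# Hypothetical sample data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_variance = variance(data)  # in squared units of the data
sample_sd = stdev(data)           # in the original units of the data

# The standard deviation is the square root of the variance.
print(math.isclose(sample_sd, math.sqrt(sample_variance)))
```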
Standard deviation characteristics
- measure of variation from the mean
- non-negative
- one or more outliers can increase the standard deviation dramatically
- units of the standard deviation are the same as the units of the original data
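The effect of an outlier on the standard deviation can be seen in a short sketch. The values are made up, and the sample standard deviation is assumed.

```python
from statistics import stdev

# Hypothetical measurements with no outliers.
data = [10, 11, 12, 13, 14]

# The same measurements with a single outlier added.
with_outlier = data + [60]

sd_before = stdev(data)
sd_after = stdev(with_outlier)

print(round(sd_before, 2), "->", round(sd_after, 2))
```

A single outlier makes the standard deviation many times larger here, and both results are non-negative, as a standard deviation always is.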
What is the coefficient of variation?
Measures the size of the standard deviation relative to the size of the mean, usually expressed as a percentage: CV = (standard deviation / mean) × 100%
What is coefficient of variation used for?
- compare the relative variability of values about the mean
- compare the relative variability of populations or samples with different means and different standard deviations
- compare variability in two data sets that may be in different units, because the coefficient of variation is unit free
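A minimal sketch of these uses, assuming the usual definition of the coefficient of variation as the standard deviation divided by the mean, times 100%. The helper name `coefficient_of_variation` and both data sets are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

# Hypothetical data sets measured in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 60, 70, 80, 85]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)
print(round(cv_heights, 1), round(cv_weights, 1))

# Changing the units (cm to m) leaves the coefficient of variation unchanged.
heights_m = [h / 100 for h in heights_cm]
cv_heights_m = coefficient_of_variation(heights_m)
```

Because the units cancel in the ratio, the two percentages can be compared directly even though one data set is in centimeters and the other in kilograms.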
What percentage of values lies within 1, 2 and 3 standard deviations of the mean (for bell-shaped data)?
1 = about 68%
2 = about 95%
3 = about 99.7%
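These percentages can be checked against a normal distribution, which is the assumption behind the rule. The sketch below uses `statistics.NormalDist` (Python 3.8 or later). The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, f"{share_within(k):.1%}")
```

The results round to the 68%, 95% and 99.7% figures. For data that is not roughly bell-shaped, the actual percentages can differ.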