Describing Data Flashcards
1
Q
sample mean (3)
A
- sum of all observations in a sample divided by n, the number of observations
- varies, depending on the composition of the sample
- symbol: Y bar
2
Q
standard deviation (4)
A
- common measure of the spread of a distribution and indicates how far the different measurements typically are from the mean
- for a bell-shaped distribution, roughly two-thirds (about 68%) of the data lie within Y bar +/- s and about 95% lie within Y bar +/- 2s
- the square root of the variance
- symbol: s
3
Q
variance (2)
A
- a measure of spread of data from the mean
- symbol (sample): s^2
- symbol (population): σ^2 (sigma squared)
4
Q
deviation (2)
A
- the difference between a measurement and the mean
- formula: Yi - Y bar
5
Q
sum of squares
A
- the sum of the squared deviations from the mean, which is the summation in the numerator of the variance formula
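A minimal Python sketch of how cards 1 to 5 fit together: deviations are squared and summed, the sum of squares is divided by n - 1 to give the sample variance, and the square root gives s. The function name `describe_spread` and the sample values are made up for illustration, and the n - 1 denominator is the standard sample-variance definition rather than something stated on the cards.

```python
import math

def describe_spread(measurements):
    """Return (mean, sum of squares, sample variance, standard deviation)."""
    n = len(measurements)
    mean = sum(measurements) / n                      # Y bar
    deviations = [y - mean for y in measurements]     # Yi - Y bar
    sum_of_squares = sum(d ** 2 for d in deviations)  # numerator of the variance
    variance = sum_of_squares / (n - 1)               # s^2 (sample variance)
    sd = math.sqrt(variance)                          # s
    return mean, sum_of_squares, variance, sd

# Illustrative data only
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, ss, variance, sd = describe_spread(sample)
print(mean, ss, variance, sd)
```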
6
Q
what is the general rule for rounding answers?
A
- round descriptive statistics to one decimal place more than the measurements themselves
7
Q
coefficient of variation (4)
A
- standard deviation expressed as a percentage of the mean
- CV = (s/Y bar) x 100%
- higher CV means more variability and lower CV means individuals are more similar relative to the mean
- only meaningful when all measurements are greater than or equal to 0 (and the mean is greater than 0)
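The CV formula above could be sketched in Python as follows. The function name `coefficient_of_variation` and the sample values are hypothetical; the checks for negative measurements and a zero mean are one way to enforce the applicability condition, not the only way.

```python
import statistics

def coefficient_of_variation(measurements):
    """Return the standard deviation as a percentage of the mean."""
    if any(y < 0 for y in measurements):
        raise ValueError("CV assumes all measurements are >= 0")
    mean = statistics.mean(measurements)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is 0")
    sd = statistics.stdev(measurements)  # sample standard deviation (n - 1)
    return sd / mean * 100

# Illustrative data only
cv = coefficient_of_variation([10, 12, 14])
print(f"CV = {cv:.1f}%")
```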
8
Q
how do you calculate mean from a frequency table? (3)
A
- calculate the sample size, n, by adding all the frequencies
- multiply each value by its frequency, then add the products together
- divide the sum from step 2 by the sample size from step 1
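The three steps above can be sketched in Python. Representing the frequency table as a dict that maps each value to its frequency is an assumption made for this example, as are the function name and the data.

```python
def mean_from_frequency_table(table):
    """table maps each measured value to the number of times it occurs."""
    n = sum(table.values())                                      # step 1: sample size
    total = sum(value * freq for value, freq in table.items())   # step 2: weighted sum
    return total / n                                             # step 3: divide

# Illustrative table: value -> frequency
table = {0: 3, 1: 5, 2: 2}
freq_mean = mean_from_frequency_table(table)
print(freq_mean)
```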
9
Q
how do you calculate standard deviation from a frequency table?
A
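- subtract the mean from each value, square the deviation, and multiply the squared deviation by that value's frequency before summing; then divide by n - 1 and take the square root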
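A short Python sketch of the same idea, in which each squared deviation is weighted by its frequency before the sum of squares is formed. The dict layout (value mapped to frequency), the function name, and the data are assumptions for illustration; the n - 1 denominator is the usual sample-variance convention.

```python
import math

def sd_from_frequency_table(table):
    """table maps each measured value to the number of times it occurs."""
    n = sum(table.values())
    mean = sum(value * freq for value, freq in table.items()) / n
    # Each squared deviation counts once per occurrence of the value
    sum_of_squares = sum(freq * (value - mean) ** 2 for value, freq in table.items())
    return math.sqrt(sum_of_squares / (n - 1))

# Illustrative table: value -> frequency
freq_sd = sd_from_frequency_table({0: 3, 1: 5, 2: 2})
print(freq_sd)
```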
10
Q
median (2)
A
- middle measurement of a set of observations
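- with the measurements sorted, median = Y([n+1]/2) when n is odd; when n is even, median = the average of Y(n/2) and Y(n/2 + 1)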
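As a sketch, the median rule can be written in Python like this. The function name `median_of` is hypothetical, and the code uses 0-based list indices, so position (n+1)/2 on the card corresponds to index n // 2 when n is odd.

```python
def median_of(measurements):
    """Return the middle measurement of the sorted data."""
    ordered = sorted(measurements)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the single middle value, Y((n+1)/2) in 1-based terms
        return ordered[middle]
    # Even n: average of the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median_of([3, 1, 2]))
print(median_of([4, 1, 3, 2]))
```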
11
Q
interquartile range (2)
A
- difference between the third and first quartiles of the data and spans the middle 50% of the data
- IQR = third quartile - first quartile
12
Q
first quartile
A
- middle value of the measurements lying below the median
13
Q
second quartile
A
- median
14
Q
third quartile
A
- middle value of the measurements larger than the median
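The quartile definitions on cards 11 to 14 could be sketched in Python as below. This follows the cards' "median of each half" approach and leaves the median itself out of both halves when n is odd; that exclusion is an assumption, and statistical software often uses other quartile conventions that give slightly different answers. The function name and data are illustrative.

```python
import statistics

def quartiles(measurements):
    """Return (first quartile, median, third quartile); needs at least 2 values."""
    ordered = sorted(measurements)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]             # measurements below the median
    upper = ordered[half + (n % 2):]   # measurements above the median
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

# Illustrative data only
q1, q2, q3 = quartiles([1, 3, 5, 7, 9, 11, 13])
iqr = q3 - q1  # IQR = third quartile - first quartile
print(q1, q2, q3, iqr)
```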
15
Q
box plot (4)
A
- displays median and interquartile range
- lower and upper edges of the box are the first and third quartiles, thus the IQR is visualized by the span of the box
- lines extend vertically from the box to represent the “non-extreme” values in the data
- “extreme” values are represented as isolated dots
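The cards do not say how "extreme" is decided. A common convention, assumed here, treats any value more than 1.5 x IQR beyond the edges of the box as extreme. The Python sketch below computes the pieces a box plot would draw under that assumption; the function name `box_plot_summary`, the `fence_factor` parameter, and the data are all hypothetical.

```python
import statistics

def box_plot_summary(measurements, fence_factor=1.5):
    """Return the values a box plot displays (1.5 x IQR rule assumed)."""
    ordered = sorted(measurements)
    n = len(ordered)
    half = n // 2
    q1 = statistics.median(ordered[:half])            # lower edge of the box
    q3 = statistics.median(ordered[half + (n % 2):])  # upper edge of the box
    iqr = q3 - q1
    low_fence = q1 - fence_factor * iqr
    high_fence = q3 + fence_factor * iqr
    non_extreme = [y for y in ordered if low_fence <= y <= high_fence]
    return {
        "median": statistics.median(ordered),
        "box": (q1, q3),
        # Lines extend to the smallest and largest non-extreme values
        "whiskers": (non_extreme[0], non_extreme[-1]),
        # Extreme values are drawn as isolated dots
        "extreme": [y for y in ordered if y < low_fence or y > high_fence],
    }

# Illustrative data only
summary = box_plot_summary([1, 3, 5, 7, 9, 11, 40])
print(summary)
```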