Lesson 0.2: Statistical Analysis Flashcards
Descriptive statistics _________ data. They do not seek __________ within it.
Describe, relationships
What are the two main types of descriptive statistics?
Measures of central tendency and measures of dispersion
What do measures of central tendency do?
Estimate the center position of values in a data set
What do measures of dispersion do?
Describe how spread out the values of the data are
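Worked example: a minimal sketch of both kinds of measure, using Python's standard statistics module. The masses list is made-up illustration data, and the variable names are hypothetical.

```python
import statistics

# Hypothetical sample of masses in kg (made-up values for illustration only).
masses = [58.0, 61.5, 62.6, 63.0, 70.4]

# Measures of central tendency: estimate the center of the values.
center_mean = statistics.mean(masses)
center_median = statistics.median(masses)

# Measures of dispersion: describe how spread out the values are.
spread_range = max(masses) - min(masses)
spread_stdev = statistics.stdev(masses)  # sample standard deviation

print("mean:", center_mean, "median:", center_median)
print("range:", spread_range, "standard deviation:", spread_stdev)
```

The range and standard deviation are only two common choices of dispersion measure; they are used here as an assumption, since the cards do not name specific ones.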
What is discrete data?
Numerical data restricted to certain (usually integer) values
Example: rolling a die can only yield 1, 2, 3, 4, 5, or 6; you can’t get a 5.6
What is continuous data?
Numerical data that can take any value within a range, not just certain fixed values
Example: the mass of a person can be 63 kg, 62.6 kg, or 62.6523782 kg
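Worked example: a rough sketch contrasting discrete and continuous values with Python's random module. The 50 to 90 kg range, the seed, and the variable names are assumptions chosen purely for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Discrete: a die roll can only be one of the integers 1 to 6.
die_rolls = [rng.randint(1, 6) for _ in range(10)]

# Continuous: a simulated mass can be any real number in the range.
# The 50-90 kg bounds are an arbitrary assumption, not real data.
simulated_masses = [rng.uniform(50.0, 90.0) for _ in range(10)]

print(die_rolls)
print(simulated_masses)
```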
What is a uniform distribution?
A type of continuous probability distribution in which all values in the range are equally likely
Example: date/time of birth
What is a normal distribution?
A type of continuous probability distribution with a bell curve shape
Example: heights of adult Canadian females
All normal distributions share the same 3 properties. Name them.
1) They have a bell shape and are symmetrical
2) The mean is in the center of the distribution
3) The area under the curve is 1
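Worked example: the three properties above can be checked numerically with statistics.NormalDist from the Python standard library. This is only a sketch; the mean of 162 cm and standard deviation of 7 cm are hypothetical parameters, not measured heights.

```python
from statistics import NormalDist

# Hypothetical parameters (assumed for illustration, not real survey data).
heights = NormalDist(mu=162.0, sigma=7.0)

# 1) Symmetry: the curve has the same height 5 cm either side of the mean.
left = heights.pdf(heights.mean - 5.0)
right = heights.pdf(heights.mean + 5.0)

# 2) The mean is in the center: half of the area lies below it.
below_mean = heights.cdf(heights.mean)

# 3) Area under the curve is 1: trapezoid rule over mean +/- 8 standard
#    deviations, which covers essentially the whole curve.
lo = heights.mean - 8 * heights.stdev
hi = heights.mean + 8 * heights.stdev
n = 10_000
step = (hi - lo) / n
area = step * (
    0.5 * (heights.pdf(lo) + heights.pdf(hi))
    + sum(heights.pdf(lo + i * step) for i in range(1, n))
)

print(left, right, below_mean, area)
```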
The Y axis in a continuous probability distribution is the …
Frequency (strictly, the probability density)
The X axis in a continuous probability distribution is the …
Variable of interest (e.g., mass)
What is an advantage and disadvantage of using the mean?
Pro: it takes all values into account and can thus help minimize error
Con: it takes into account outliers, which can dramatically skew the mean
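Worked example: a small sketch of how a single outlier pulls the mean while the median barely moves. The numbers are made up for illustration.

```python
import statistics

# Made-up values; the second list adds one extreme outlier (100).
values = [10, 11, 12, 13, 14]
with_outlier = values + [100]

mean_before = statistics.mean(values)
mean_after = statistics.mean(with_outlier)

median_before = statistics.median(values)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

Here the mean more than doubles (12 to about 26.7), while the median only shifts from 12 to 12.5.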
What does x̄ represent?
Sample mean
What does µ represent?
Population mean
What is the median?
The middle value of an ordered set; the 50th percentile
In what type of data set are the mean and median the same?
In a symmetric distribution
Which measure(s) of central tendency can be used with nominal data sets?
Mode
Which measure(s) of central tendency can be used with ordinal data sets?
Mode, median
Which measure(s) of central tendency can be used with interval data sets?
Mode, median, mean
Which measure(s) of central tendency can be used with ratio data sets?
Mode, median, mean
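Worked example: a sketch of which measures make sense at the nominal and ordinal levels. The eye-colour and survey data are invented, and mapping the ordinal categories to rank numbers is one possible approach, assumed here for illustration.

```python
import statistics

# Nominal data: categories with no order, so only the mode is meaningful.
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue"]
most_common_colour = statistics.mode(eye_colours)

# Ordinal data: categories with an order, so the median is also meaningful.
# Each response is mapped to its rank (0, 1, 2) before taking the median.
levels = ["low", "medium", "high"]
responses = ["low", "high", "medium", "medium", "high", "medium", "low"]
ranks = [levels.index(r) for r in responses]
median_response = levels[statistics.median_low(ranks)]
modal_response = statistics.mode(responses)

print(most_common_colour, median_response, modal_response)
```

median_low is used so the result is always one of the actual ranks, never a value halfway between two categories. A mean of the ranks is deliberately not computed, because the gaps between ordinal categories are not known to be equal.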
What is the most appropriate measure of central tendency for interval or ratio data that are skewed or contain outliers?
Median
What is the most appropriate measure of central tendency for non-skewed data?
Mean