Data-based and statistical reasoning Flashcards
Measures of central tendency
provide a single-value representation for the middle of a group of data
arithmetic mean or average
measure of central tendency that equally weighs all values; it is most affected by outliers
median
the value that lies in the middle of the data set when the values are sorted. Fifty percent of data points lie above the median and fifty percent lie below it
mode
the data point that appears most often; there may be multiple (or zero) modes in a data set
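A minimal sketch of the three measures, using Python's standard `statistics` module. The sample values are hypothetical and chosen so that one extreme value (40) shows how strongly an outlier pulls the mean compared with the median.

```python
import statistics

# Hypothetical sample; 40 is a deliberately extreme value.
data = [2, 3, 3, 5, 7, 40]

mean = statistics.mean(data)        # every value is weighted equally
median = statistics.median(data)    # middle of the sorted values
modes = statistics.multimode(data)  # list of the most frequent value(s)

# Dropping the extreme value shifts the mean far more than the median.
trimmed = data[:-1]
mean_trimmed = statistics.mean(trimmed)
median_trimmed = statistics.median(trimmed)

print(mean, median, modes)
print(mean_trimmed, median_trimmed)
```

`multimode` returns a list, so a data set with several equally frequent values yields several modes.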
Normal distribution
is symmetrical. The mean, median, and mode are all the same in the normal distribution.
standard normal distribution
is a normal distribution with a mean of zero and a standard deviation of one; it is used for most calculations. As in any normal distribution, about 68% of data points fall within one standard deviation of the mean, 95% within two, and 99.7% within three.
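The percentages on this card can be checked numerically. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available; `fraction_within` is a hypothetical helper name introduced only for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of the distribution within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
```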
Skewed distributions
have differences in their mean, median, and mode; the skew direction is the direction of the tail of the distribution.
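A rough illustration with hypothetical data: in a sample with a long right tail, the mean is pulled toward the tail and ends up above the median. Comparing mean and median is only a rule of thumb for spotting the tail direction, not a formal definition of skew.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

mean = statistics.mean(right_skewed)
median = statistics.median(right_skewed)

# Rule of thumb (an assumption, not a formal test): the mean sits on the tail side.
if mean > median:
    tail = "right"
elif mean < median:
    tail = "left"
else:
    tail = "none"

print(mean, median, tail)
```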
Bimodal distributions
have multiple peaks, although not necessarily multiple modes, strictly speaking. It may be useful to perform data analysis on the two groups separately.
Range
is the difference between the largest and smallest values in a data set.
Interquartile range
the difference between the value of the third quartile and first quartile; interquartile range can be used to determine outliers.
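A sketch of how the range and interquartile range might be computed, and how the IQR can flag outliers. The 1.5 × IQR fence used here is a common convention and an assumption of this example, not something the card specifies. Quartile values also depend on the interpolation method; this sketch uses the `inclusive` method of `statistics.quantiles`.

```python
import statistics

# Hypothetical sample with one extreme value.
data = [1, 3, 4, 5, 6, 7, 9, 30]

data_range = max(data) - min(data)

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: values beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(data_range, q1, q3, iqr, outliers)
```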
Standard deviation
a measurement of variability about the mean; standard deviation can also be used to determine outliers
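A minimal sketch of outlier detection with the standard deviation. The threshold of three standard deviations from the mean is an assumed convention for this example; the data are hypothetical, and the sample standard deviation (`statistics.stdev`) is used.

```python
import statistics

# Hypothetical sample: twenty values clustered near 10, plus one extreme value.
data = [9, 10, 11] * 6 + [10, 50]

mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)

# Assumed convention: flag values more than 3 standard deviations from the mean.
threshold = 3
outliers = [x for x in data if abs(x - mean) > threshold * sd]

print(mean, round(sd, 2), outliers)
```

In very small samples no value can lie three sample standard deviations from the mean, so this rule only becomes useful as the sample grows.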
outliers
may be a result of true population variability, measurement error, or a non-normal distribution
independent events
the probability does not change based on outcomes of other events
dependent events
the probability changes depending on the outcomes of other events
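One way to see the difference is drawing two cards from a standard 52-card deck. This is a hypothetical example: with replacement the draws are independent, and without replacement the second draw depends on the first.

```python
from fractions import Fraction

ACES, DECK = 4, 52

# Probability that the first card drawn is an ace.
p_first_ace = Fraction(ACES, DECK)

# Independent: the first ace is put back, so the deck is unchanged.
p_second_with_replacement = Fraction(ACES, DECK)

# Dependent: the first ace stays out, leaving 3 aces among 51 cards.
p_second_without_replacement = Fraction(ACES - 1, DECK - 1)

print(p_first_ace, p_second_with_replacement, p_second_without_replacement)
```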
mutually exclusive outcomes
cannot occur simultaneously
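A small sketch using one roll of a fair six-sided die, with events modeled as sets of outcomes. The names and the helper `prob` are hypothetical. It also illustrates a standard probability fact not stated on the card: for mutually exclusive outcomes, the probability of either occurring is the sum of the individual probabilities.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) for one fair die roll."""
    return Fraction(len(event & sample_space), len(sample_space))

roll_is_one = {1}
roll_is_two = {2}
roll_is_even = {2, 4, 6}

# Mutually exclusive: the two events share no outcome.
mutually_exclusive = not (roll_is_one & roll_is_two)
p_one_or_two = prob(roll_is_one | roll_is_two)

# Not mutually exclusive: a roll of 2 satisfies both events.
overlapping = bool(roll_is_two & roll_is_even)
p_two_or_even = prob(roll_is_two | roll_is_even)

print(mutually_exclusive, p_one_or_two, overlapping, p_two_or_even)
```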