Descriptive Statistics Flashcards
What is the purpose of measuring central tendency?
It tells us where the middle of a set of data lies.
It shows what is “typical” about the data.
What are mean, median, and mode?
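- Mean is the average: the sum of the scores divided by the number of scores.
- Median is the middle value when the scores are ordered from lowest to highest.
- Mode is the most frequently occurring score.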
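As a quick check on the three definitions, here is a minimal sketch using Python's standard `statistics` module. The `scores` list is made up for illustration and is not from the course material.

```python
from statistics import mean, median, mode

# Hypothetical scores, used only for illustration.
scores = [2, 3, 3, 5, 7]

mean_score = mean(scores)      # sum of scores / number of scores = 20 / 5
median_score = median(scores)  # middle value of the sorted scores
mode_score = mode(scores)      # most frequently occurring score

print(mean_score, median_score, mode_score)
```

Here the mean is 4, while the median and the mode are both 3.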
What is the purpose of measuring variability?
It tells us how spread out the scores are, that is, how far they tend to fall from the average.
What are range, variance, and standard deviation?
- Range is the difference between the largest and the smallest score.
- Variance is the average of the squared distances of each score from the mean.
- Standard deviation is the square root of the variance; roughly, the typical distance of each score from the mean.
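The three measures of variability can be sketched the same way. This example assumes the population formulas (divide by the number of scores); `statistics.variance` and `statistics.stdev` are the sample versions that divide by one less. The `scores` list is hypothetical.

```python
from statistics import mean, pvariance, pstdev

# Hypothetical scores, used only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # largest minus smallest
variance = pvariance(scores)             # mean squared distance from the mean
std_dev = pstdev(scores)                 # square root of the variance

# For comparison: the literal "average distance from the mean".
center = mean(scores)
mean_abs_dev = sum(abs(x - center) for x in scores) / len(scores)

print(score_range, variance, std_dev, mean_abs_dev)
```

For these scores the range is 7, the variance is 4, and the standard deviation is 2. The literal average distance from the mean is 1.5, which is why "average distance" is best read as a rough description of the standard deviation rather than its formula.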
What is skew and what causes it?
Skew is the degree to which a distribution leans to one side of the histogram, with a longer tail on the right or the left. It is caused by a few extremely high or low scores that greatly affect the mean.
What is kurtosis and what causes it?
Kurtosis is the degree to which scores are clustered together or spread out. It is caused by too much or not enough variation in the scores.
What is positive skew?
A few high scores make the mean unrealistically high.
What is negative skew?
A few very low scores make the mean unrealistically low.
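The direction of skew can be checked numerically. The sketch below assumes one common definition of skewness (the average cubed z-score, using the population standard deviation); the helper name `skewness` and both data sets are made up for illustration.

```python
from statistics import mean, pstdev

def skewness(scores):
    """Population skewness: the average cubed z-score."""
    center = mean(scores)
    spread = pstdev(scores)
    return sum(((x - center) / spread) ** 3 for x in scores) / len(scores)

# Hypothetical data: one very high score, then one very low score.
high_outlier = [30, 32, 35, 38, 40, 250]
low_outlier = [5, 60, 62, 65, 68, 70]

positive_skew = skewness(high_outlier)  # greater than 0: tail on the right
negative_skew = skewness(low_outlier)   # less than 0: tail on the left

print(positive_skew, negative_skew)
```

A single extreme score is enough to flip the sign, which matches the idea that a few high or low scores drag the mean in their direction.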
What is platykurtosis?
A histogram that looks flat.
The scores are very spread out.
What is leptokurtosis?
The histogram looks peaked.
Scores are too tightly clustered together in the center.
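Flat versus peaked can also be put into a number. This is a minimal sketch assuming the moment-based "excess kurtosis" (fourth moment divided by the squared variance, minus 3, so a normal distribution scores about 0); the helper name `excess_kurtosis` and both data sets are hypothetical.

```python
from statistics import mean

def excess_kurtosis(scores):
    """Fourth central moment over squared variance, minus 3."""
    center = mean(scores)
    n = len(scores)
    m2 = sum((x - center) ** 2 for x in scores) / n  # population variance
    m4 = sum((x - center) ** 4 for x in scores) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data: evenly spread scores vs. scores piled up in the center.
flat_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
peaked_scores = [1, 5, 5, 5, 5, 5, 5, 5, 9]

flat_kurtosis = excess_kurtosis(flat_scores)      # negative: platykurtic
peaked_kurtosis = excess_kurtosis(peaked_scores)  # positive: leptokurtic

print(flat_kurtosis, peaked_kurtosis)
```

Under this definition, negative values go with a flat histogram and positive values with a peaked one.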
If you line up scores from lowest to highest, what is the score in the middle?
The Median
What statistic adds together all of the scores and divides by the number of scores?
The Mean
What is the most frequently occurring score in the data called?
The Mode
Which measure of central tendency is the most sensitive to the influence of extreme scores (really high or low)?
The Mean
Extreme scores (outliers) can pull it far away from what is really typical in the data.
What measure of central tendency should be used if you have skewed data?
The Median
In skewed data it typically falls somewhere between the mean and the mode, which makes it the safest bet. It will be the most representative of what’s truly typical in the data (e.g., house prices in a neighborhood, salaries in an organization, exam scores in college).
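The house-price case can be illustrated with a small sketch. The prices below (in thousands) are invented for illustration; one very expensive house plays the role of the outlier.

```python
from statistics import mean, median, mode

# Hypothetical house prices in thousands; 900 is the outlier.
house_prices = [200, 200, 210, 220, 230, 250, 900]

price_mode = mode(house_prices)      # most common price
price_median = median(house_prices)  # middle price when sorted
price_mean = mean(house_prices)      # pulled upward by the 900 house

print(price_mode, price_median, price_mean)
```

With these numbers the mode is 200, the median is 220, and the mean is about 316. Six of the seven houses cost 250 or less, so the median describes a typical house far better than the mean does.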