7 Measures of Central Tendency and Dispersion Flashcards
What is a distribution in statistics?
A function that describes the probability of each possible outcome of a variable, often estimated from evidence gathered in a study.
Distributions help in understanding how likely a specific value is when compared to historical data.
What is the purpose of creating a histogram?
To visualize the distribution of data and understand its shape.
A histogram allows analysts to see the frequency of different values in a dataset.
What is a normal distribution?
A bell-shaped distribution where the mean, median, and mode are all at the center, and probabilities decrease symmetrically as you move away from the center.
It is one of the most common distributions in statistics.
What does a uniform distribution look like?
A flat line where every value has the same probability of occurring.
In a uniform distribution, the probability does not vary from one value to the next; the observed data points themselves can still differ.
What is a Poisson distribution?
A distribution that describes the number of times an event occurs in a fixed interval of time or space.
Examples include counting occurrences of events like website hits or roadkill.
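As a rough illustration of the card above, the standard Poisson probability formula can be sketched in Python. The function name `poisson_pmf` and the rate of 2 hits per minute are assumptions made for this example, not values from the deck.

```python
from math import exp, factorial

def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k events when the average count per interval is `rate`."""
    return exp(-rate) * rate**k / factorial(k)

# Hypothetical example: a website that averages 2 hits per minute.
p_no_hits = poisson_pmf(0, 2.0)   # chance of a minute with zero hits
p_two_hits = poisson_pmf(2, 2.0)  # chance of a minute with exactly two hits
```

Summing `poisson_pmf(k, rate)` over all counts `k` approaches 1, as expected for a probability distribution.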
What is the main characteristic of an exponential distribution?
It describes the time between events in a Poisson process; its curve is highest near zero and decays steadily as the waiting time grows.
In an exponential distribution, short waits are the most likely and long waits become progressively rarer.
What are Bernoulli distributions?
Distributions with only two possible outcomes, such as true/false or success/failure.
They are used to model binary variables.
What is the key difference between Bernoulli and binomial distributions?
Bernoulli distributions have one trial, while binomial distributions involve multiple trials.
Binomial distributions calculate the probability of a specific number of successes across several trials.
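The relationship between the two distributions can be sketched with the standard binomial probability formula. The function name `binomial_pmf` and the coin-flip numbers are hypothetical choices for this example.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: probability of exactly 2 heads in 3 fair coin flips.
two_heads = binomial_pmf(2, 3, 0.5)

# With a single trial (n = 1) the binomial reduces to a Bernoulli trial.
bernoulli_success = binomial_pmf(1, 1, 0.5)
```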
What does skew measure in a distribution?
The degree of asymmetry of a distribution around its mean.
A distribution can be negatively skewed (left) or positively skewed (right).
What is kurtosis?
A measure of the ‘tailedness’ of a distribution, indicating how much of the variance is due to extreme values.
Compared with a normal distribution, distributions can be leptokurtic (tall and skinny, with heavier tails) or platykurtic (short and wide, with lighter tails).
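Skew and kurtosis can both be estimated from standardized moments. The sketch below uses the population-moment definitions, which are only one of several conventions; the function names and the sample data are assumptions for illustration.

```python
def central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)

def skewness(values):
    # Positive: longer right tail. Negative: longer left tail.
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    # Measured relative to a normal distribution, whose kurtosis is 3.
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

right_tailed = [1, 2, 2, 3, 10]  # hypothetical data with one large value
symmetric = [1, 2, 3]            # hypothetical symmetric data
```

Here `skewness(right_tailed)` comes out positive, `skewness(symmetric)` is zero, and `excess_kurtosis(symmetric)` is negative (platykurtic).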
What are the three measures of central tendency?
- Mean
- Median
- Mode
These measures summarize a dataset with a single value representing the center.
How is the mean calculated?
By summing all values and dividing by the number of values.
This represents the average of the dataset.
What is the median in a dataset?
The middle value when the data is ordered from least to greatest.
If there is an even number of values, the median is the average of the two middle numbers.
What is the mode?
The value that appears most frequently in a dataset.
A dataset can have more than one mode or no mode at all.
What is the mean?
The mean is also called the average. To find it, take the sum of the values and divide it by the number of values.
What are the steps to calculate the mean?
- Find the sum of the values.
- Divide the sum by the number of values.
Calculate the mean of the weights: 22 lb, 26 lb, 24 lb.
Mean = 24 lb
Sum = 22 + 26 + 24 = 72; Mean = 72 / 3 = 24
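The two steps for the mean can be checked in a few lines of Python. The variable names are illustrative only.

```python
weights_lb = [22, 26, 24]

total = sum(weights_lb)             # step 1: sum of the values
mean_lb = total / len(weights_lb)   # step 2: divide by the number of values
```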
What is the median?
The median is the middle value of a dataset when arranged in ascending order.
What are the steps to find the median?
- Arrange values in ascending or descending order.
- Find the value that is in the exact middle.
- If the dataset has an even number of values, find the mean of the two middle values.
Find the median of the weights: 22 lb, 24 lb, 26 lb.
Median = 24 lb
Middle value of 22, 24, 26 is 24.
Find the median of the weights: 22 lb, 24 lb, 25 lb, 26 lb.
Median = 24.5 lb
Middle values are 24 and 25; (24 + 25) / 2 = 24.5.
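The median steps, including the even-count case, can be sketched as a small function. The name `median` is chosen for this example; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([22, 24, 26])        # single middle value
even_case = median([22, 24, 25, 26])   # average of the two middle values
```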
What is the mode?
The mode is the number that occurs most often in a dataset.
What are the steps to find the mode?
- Arrange values in ascending or descending order.
- Count the occurrences of all repeating values.
- Compare the number of occurrences for every repeated value; the value with the most occurrences is the mode.
What is the mode of the weights: 22 lb, 25 lb, 25 lb, 24 lb, 24 lb, 23 lb, 25 lb, 22 lb?
Mode = 25 lb
25 occurs most frequently.
True or False: A dataset has no mode if no value is repeated.
True.
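Counting occurrences is easy to sketch with `collections.Counter`. The function name `modes` is hypothetical, and returning an empty list when nothing repeats is a convention assumed here to match the no-mode case.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, or [] if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value is repeated, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)

weights_mode = modes([22, 25, 25, 24, 24, 23, 25, 22])
no_mode = modes([22, 24, 26])
```

Because the function returns a list, it also handles datasets with more than one mode.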
When should you use the mean?
Use the mean as a default; it works well with normal distributions.
When is the median preferred over the mean?
Use the median when dealing with skewed or asymmetrical data.
When is the mode useful?
The mode is ideal for handling nominal variables.
What is range?
Range is the difference between the maximum and minimum values in a dataset.
What are the steps to calculate the range?
- Arrange values in ascending or descending order.
- Identify the minimum and maximum values.
- Subtract the minimum from the maximum.
Calculate the range of the following dataset: 12, 18, 10, 22, 15, 25, 16, 17, 14, 19.
Range = 15
Max = 25, Min = 10; 25 - 10 = 15.
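The same range calculation in Python, as a quick check. Sorting is not strictly needed in code because `min` and `max` scan the data directly.

```python
data = [12, 18, 10, 22, 15, 25, 16, 17, 14, 19]

lowest = min(data)
highest = max(data)
data_range = highest - lowest   # maximum minus minimum
```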
What are quartiles?
Quartiles divide your data into four equal parts.
What are the steps to find quartiles?
- Arrange values in ascending or descending order.
- Find the median (Q2).
- Split the dataset at the median.
- Find the median of the lower dataset (Q1) and the upper dataset (Q3).
Find the quartiles of the dataset: 12, 18, 10, 22, 15, 25, 16, 17, 14, 19.
Q1 = 14, Q2 = 16.5, Q3 = 19
Q2 is the median; Q1 is median of lower half; Q3 is median of upper half.
What is the interquartile range?
The interquartile range is the difference between Q3 and Q1.
Calculate the interquartile range given Q1 = 14 and Q3 = 19.
Interquartile range = 5
19 - 14 = 5.
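The quartile steps and the interquartile range can be sketched as follows. One assumption is labeled in the code: when the dataset has an odd number of values, the median itself is left out of both halves, which the cards do not specify. Library routines such as `statistics.quantiles` use interpolation rules that may give slightly different quartiles.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for an odd count, exclude the median from the upper half too.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)

data = [12, 18, 10, 22, 15, 25, 16, 17, 14, 19]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1   # interquartile range
```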
What is variance?
Variance is a measure of dispersion that looks at the squared deviation of each data point from the mean.
What are the steps to calculate variance?
- Find the mean of the dataset.
- Subtract the mean from each data point.
- Square the results.
- Find the sum of the squared results.
- Divide by the number of data points minus 1 (for samples).
Calculate the variance for the dataset: 3, 6, 4, 7, 5, 1, 9, 5, 4, 6.
Variance ≈ 4.9
Mean = 5; Squared deviations summed = 44; 44 / (10 - 1) ≈ 4.89, which rounds to 4.9.
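The five variance steps map directly onto a few lines of Python. Variable names are illustrative; `statistics.variance` from the standard library computes the same sample variance in one call.

```python
data = [3, 6, 4, 7, 5, 1, 9, 5, 4, 6]

mean = sum(data) / len(data)                 # step 1: mean
deviations = [x - mean for x in data]        # step 2: subtract the mean
squared = [d ** 2 for d in deviations]       # step 3: square the results
sum_squared = sum(squared)                   # step 4: sum the squares
sample_variance = sum_squared / (len(data) - 1)  # step 5: divide by n - 1
```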
What is standard deviation?
Standard deviation is the square root of the variance.
Why is standard deviation important?
It provides a measure of dispersion in the same units as your dataset.
What happens when you apply standard deviation to a normal distribution?
About 34.1% of your data points fall between the mean and one standard deviation above it.
What is the square root of the variance called?
Standard deviation
Why is standard deviation important?
It expresses dispersion in the same units as your dataset, whereas variance is in squared units.
What percentage of data points are within one standard deviation of the mean in a normal distribution?
68.2%
What percentage of data points are within two standard deviations of the mean in a normal distribution?
95.4%
What percentage of data points are within three standard deviations of the mean in a normal distribution?
99.7%
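These coverage percentages can be derived rather than memorized. For a normal distribution, the share of values within k standard deviations of the mean equals erf(k / √2). The helper name `share_within` is an assumption for this sketch.

```python
from math import erf, sqrt

def share_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

one_sd = share_within(1)    # about 0.6827
two_sd = share_within(2)    # about 0.9545
three_sd = share_within(3)  # about 0.9973
```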
What is a common cutoff for outliers in data analysis?
Three standard deviations from the mean
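A minimal sketch of applying this cutoff, assuming the sample standard deviation is used. The function name `outliers_beyond` and the `readings` data are hypothetical. Note that in very small samples no point can sit three sample standard deviations from the mean, so the rule only becomes useful with more data.

```python
from statistics import mean, stdev

def outliers_beyond(values, k=3.0):
    """Return the values lying more than k sample standard deviations from the mean."""
    center = mean(values)
    spread = stdev(values)
    return [x for x in values if abs(x - center) > k * spread]

# Hypothetical readings: twenty typical values and one extreme value.
readings = [10] * 20 + [100]
flagged = outliers_beyond(readings)
```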
What is the first step in calculating standard deviation?
Find the mean of the dataset
What do you do after finding the mean in the standard deviation calculation?
Subtract the mean from each data point
What do you do after subtracting the mean from each data point?
Square the results
What is done after squaring the results in the standard deviation calculation?
Find the sum of the squared results
What do you divide the sum of squared results by?
Number of data points minus 1
What do you take the square root of in the standard deviation calculation?
Variance
What is the mean of the dataset: 4, 5, 3, 4, 2, 2, 6, 4, 6, 4?
4
What is the variance of the dataset: 4, 5, 3, 4, 2, 2, 6, 4, 6, 4?
2
What is the standard deviation of the dataset: 4, 5, 3, 4, 2, 2, 6, 4, 6, 4?
1.41
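The three answers above can be reproduced step by step. This sketch uses the sample formula (dividing by n - 1); the variable names are illustrative.

```python
from math import sqrt

data = [4, 5, 3, 4, 2, 2, 6, 4, 6, 4]

mean = sum(data) / len(data)
sum_squared = sum((x - mean) ** 2 for x in data)
variance = sum_squared / (len(data) - 1)   # sample variance
std_dev = sqrt(variance)                   # standard deviation
```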
What are common types of distributions?
- Normal
- Uniform
- Poisson
- Exponential
- Bernoulli
- Binomial
What is skew in data distribution?
How asymmetric the data is, with a longer tail stretching to the left or right
What is kurtosis in data distribution?
How heavy the tails of the data are, often seen as a taller or flatter peak
What are the common measures of central tendency?
- Mean
- Median
- Mode
What are the simpler methods of measures of dispersion?
- Range
- Quartiles
What are more complicated methods of measures of dispersion?
- Variance
- Standard deviation
What does standard deviation measure?
Roughly the typical distance of every point from the mean (the square root of the average squared deviation)
True or False: The middle quartile (Q2) is the same as the median.
True
What is the mode of the dataset: 24, 18, 36, 51, 24, 48, 18?
- 18
- 24
What is the range of the dataset: 15, 615, 46, 73, 45, 80, 46?
600
What is the middle quartile (Q2) of the dataset: 10, 24, 13, 9, 15, 7, 19?
13
What is the standard deviation of the sample dataset: 9, 11, 7, 8, 9, 10?
√2