Applied Economics & Statistics 2: Describing Data—Measures of Central Tendency and Measures of Dispersion* Flashcards
1
Q
What are ‘Measures of Central Tendency’ also known as?
A
Measures of Location
2
Q
List the ‘Measures of Central Tendency’
A
- Mean
- Median
- Mode
3
Q
What do ‘measures of central tendency’ and ‘measures of dispersion’ do?
A
They summarize a set of data into simple, but meaningful, statistics
4
Q
Define & explain ‘mode’
A
- The most frequently occurring value in the data
- Only measure of central tendency for nominal data. E.g. heterosexuality is the modal sexual orientation
- When is the Mode Useful?
1. Determine where there is a clustering of values.
2. Clothes shop might want to know modal dress size.
- Advantage of the Mode: not affected by extreme high or low values.
- Disadvantages of the Mode:
1. Non-modal values are given zero weight.
2. Some distributions have no mode.
3. Some distributions have more than one mode!
4. Modal value may not be central at all!
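A minimal Python sketch of finding the mode, assuming the standard library `statistics` module (Python 3.8+). The dress sizes and colours are hypothetical illustration data, not from the course material.

```python
from statistics import multimode

# Hypothetical dress sizes sold in one day (illustration only).
dress_sizes = [10, 12, 12, 14, 12, 16, 10, 8]
modal_sizes = multimode(dress_sizes)
print(modal_sizes)  # [12]

# The mode also works for nominal data, where mean and median do not.
colours = ["red", "blue", "red", "green"]
modal_colours = multimode(colours)
print(modal_colours)  # ['red']

# A distribution can have more than one mode.
bimodal = [1, 1, 2, 3, 3]
two_modes = multimode(bimodal)
print(two_modes)  # [1, 3]

# When no value repeats, every value ties, so there is no useful mode.
no_repeats = [4, 5, 6]
all_tied = multimode(no_repeats)
print(all_tied)  # [4, 5, 6]
```

`multimode` returns a list, which makes the "no mode" and "more than one mode" cases visible instead of hiding them behind a single value.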
5
Q
In statistics, what’s ‘the most frequently occurring value in the data’?
A
Mode
6
Q
Define and explain ‘median’
A
- Median: the midpoint (middle score) of the values after they have been ordered from the minimum to the maximum.
- There should be as many values above the median as below the median.
- Odd number of observations: if there is an odd number of values in the dataset then the median is the middle value.
- Even number of observations: if there is an even number of values in the dataset then the median is the average of the two middle numbers.
- Properties of the Median:
1. There is a unique median for every set of data.
2. It is not affected by extremely large or small values (outliers).
3. It can be computed for ratio, interval, and ordinal data.
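The odd/even rule can be sketched in Python as follows. The function name `median_of` and the sample lists are hypothetical, chosen only for illustration.

```python
def median_of(values):
    """Return the median: the middle value, or the average of the two middle values."""
    ordered = sorted(values)  # order from minimum to maximum first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: take the middle value.
        return ordered[mid]
    # Even number of observations: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_result = median_of([7, 1, 5, 3, 9])       # sorted: 1, 3, 5, 7, 9
even_result = median_of([7, 1, 5, 3])         # sorted: 1, 3, 5, 7
outlier_result = median_of([7000, 1, 5, 3])   # sorted: 1, 3, 5, 7000
print(odd_result, even_result, outlier_result)  # 5 4.0 4.0
```

Replacing 7 with 7000 leaves the median at 4.0, which illustrates property 2: the median is not affected by extreme values.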
7
Q
Describe & explain ‘mean’
A
- Most commonly used measure of central tendency. It is the arithmetic average of a set of data.
- When we consider the whole population and we have raw (ungrouped) data: Population Mean - μ = ∑ xi / n
- Example - Q: Calculate the mean of the following set of raw values: 12, 54, 8, 10, 31, 8, 12, 57, 8.
  A: μ = ∑ xi / n = (12 + 54 + 8 + 10 + 31 + 8 + 12 + 57 + 8) / 9 = 200 / 9 ≈ 22.22
- When we consider the whole population, and we have grouped data (in the form of a frequency table): Weighted Population Mean - μ = ∑ (xi * fi) / n, where fi is the frequency of the value xi and n = ∑ fi.
- When we consider only a sample of the total population, we designate the sample mean by x̄ rather than μ, but it is obtained in the same way.
- Properties of the Mean:
1. To compute a mean, the data must be measured at the interval or ratio level.
2. All the values are included in computing the mean.
3. The mean is unique.
4. The sum of the deviations of each value from the mean is zero: ∑ (xi − x̄) = 0.
5. Unlike the median, the mean is always affected by unusually large or small data values (extreme scores or outliers).
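A short Python check of the worked example and of properties 4 and 5. The data are the nine values from the example above; the appended outlier of 1000 is a hypothetical value added only to show the effect.

```python
from statistics import median

values = [12, 54, 8, 10, 31, 8, 12, 57, 8]

# Mean = sum of the values divided by the number of values.
mean_value = sum(values) / len(values)
print(round(mean_value, 2))  # 22.22

# Property 4: deviations from the mean sum to zero
# (up to floating-point rounding, hence the tolerance).
deviation_sum = sum(x - mean_value for x in values)
print(abs(deviation_sum) < 1e-9)  # True

# Property 5: one extreme value (hypothetical) drags the mean,
# while the median barely moves.
with_outlier = values + [1000]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(mean_with_outlier)                      # 120.0
print(median(values), median(with_outlier))   # 12 12.0
```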
8
Q
When it comes to a raw set of data vs a frequency table, is the mean the same?
A
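Yes. The weighted mean computed from a frequency table of the exact values equals the mean of the raw data.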
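A sketch that checks this on the nine values used in the mean example, assuming the frequency table lists each exact value with its count (built here with `collections.Counter`).

```python
from collections import Counter
from math import isclose

values = [12, 54, 8, 10, 31, 8, 12, 57, 8]

# Mean from the raw (ungrouped) data.
raw_mean = sum(values) / len(values)

# Frequency table: each distinct value xi mapped to its frequency fi.
freq_table = Counter(values)  # e.g. 8 occurs 3 times, 12 occurs twice
n = sum(freq_table.values())  # n is the sum of the frequencies

# Weighted mean: sum of xi * fi, divided by n.
weighted_mean = sum(x * f for x, f in freq_table.items()) / n

print(round(raw_mean, 2), round(weighted_mean, 2))  # 22.22 22.22
```

If the table groups values into class intervals instead of listing exact values, the weighted mean uses class midpoints and is only an approximation of the raw mean.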
9
Q
Describe the 3 common shapes of data
A
- Mean = Median = Mode: symmetrical distribution.
- Mean > Median > Mode: positive skew (skewed to the right, positive direction)
- Mean < Median < Mode: negative skew (skewed to the left, negative direction)
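The three orderings can be sketched as a rule-of-thumb classifier in Python. The function `describe_shape` and the sample datasets are hypothetical. Real data do not always follow these orderings, so the sketch returns "unclear" when none of them holds.

```python
from statistics import mean, median, mode

def describe_shape(values):
    """Classify a dataset by comparing its mean, median, and mode (rule of thumb)."""
    m, md, mo = mean(values), median(values), mode(values)
    if m == md == mo:
        return "symmetrical"
    if m > md > mo:
        return "positive skew"
    if m < md < mo:
        return "negative skew"
    return "unclear"

symmetric = [1, 2, 3, 3, 3, 4, 5]                        # mean 3, median 3, mode 3
right_tail = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]             # mean 5, median 3, mode 2
left_tail = [19, 18, 18, 18, 17, 17, 16, 15, 11, 1]      # mean 15, median 17, mode 18

shape_symmetric = describe_shape(symmetric)
shape_right = describe_shape(right_tail)
shape_left = describe_shape(left_tail)
print(shape_symmetric, shape_right, shape_left, sep=" | ")
```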
10
Q
Generally, in which scenario do you use each measure of central tendency?
A
- Use the mode when the question asks for the most common or most popular data point
- Use the median when there are few or no modal values and the question asks for the average, but the data set has a lot of outliers
- Use the mean when the question asks for the average and there are few to no outliers to distort it