Measures of Central Tendency. Flashcards
Once we have collected our data, what do we need to do with it?
It needs to be summarised and analysed.
What must the summary of our data be?
- Fair
- Useful
- Not Misleading
When we summarise data we must present it with the least amount of _____.
Ambiguity.
What is the trouble with summarising data?
The act of summarising inevitably results in distortions.
Clearly Summarising Data consists of 2 things, name them.
- Measures of Central Tendency
- Measures of Dispersion.
A measure of central tendency refers, in some way, to the _____ value in a data ____.
Central, Set.
What measure does this refer to?
- To what extent do the values in a data set tend to vary around the central or typical value?
The Measure of Dispersion.
Name the 3 measures of Central Tendency.
- Mean
- Mode
- Median.
What is the mean also known as?
The average.
How do we calculate the mean (average)?
- Add up all the values in the data set
- Then divide by the number of values in the data set.
We calculate the mean to how many decimal places?
2 :)
(Sum of all the scores) ÷ (Number of scores) = the _____.
Mean.
What is the symbol for the mean?
X bar = x̄
What is the notation for the formula to calculate the mean?
x̄ = ∑x / N
∑x
What does this symbol mean?
Add up all the values in the data set.
N
What does this mean?
(Divide by) the total number of values.
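The notation above can be sketched in a few lines of Python. This is only an illustration: the function name `mean_of` and the `scores` data set are made up for the example, and the result is rounded to 2 decimal places as in the earlier card.

```python
def mean_of(values):
    """Return the mean: the sum of the values (∑x) divided by their count (N)."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    total = sum(values)   # ∑x: add up all the values in the data set
    n = len(values)       # N: the total number of values
    return total / n


scores = [4, 7, 9, 3, 8, 6]          # hypothetical data set
x_bar = round(mean_of(scores), 2)    # mean to 2 decimal places
print(x_bar)                         # prints 6.17
```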
What are the Pros of the mean?
- Powerful Statistic which is used in estimating population parameters
- Most Sensitive
- Most Accurate
Why is the mean the most sensitive and accurate?
Because it works at an interval level of measurement, so it uses the exact value of every score in the data set.
What are the cons of the mean?
- You can get funny numbers, e.g. 2.4 children
- Sensitive therefore easily distorted (by an outlier).
In a data set, what can the mean be distorted by?
An outlier.
What is an outlier?
A value that is much higher or lower than the rest of the data, and which distorts the mean.
What is meant when we say that the mean is distorted?
It is not representative of the data set.
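A small sketch of how one outlier can pull the mean away from the rest of the data. The ages below are invented for illustration; `statistics.mean` comes from the Python standard library.

```python
from statistics import mean

ages = [21, 22, 23, 24, 25]       # hypothetical data set with no outlier
ages_with_outlier = ages + [95]   # the same data plus one extreme value

mean_without = mean(ages)               # 115 / 5
mean_with = mean(ages_with_outlier)     # 210 / 6

print(mean_without)   # prints 23
print(mean_with)      # prints 35
```

With the outlier included, the mean (35) is higher than every value except the outlier itself, so it no longer represents the data set.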
What measure of central tendency gets around the main disadvantage of the mean, regarding the effect of extreme values?
The Median.
What is the median?
The central (middle) value in a data set.
The median is easy to find in ___ numbered data sets.
Odd.
In order to find the median, what must we do first?
Put the data set in numerical order.
The bigger the data set, the harder it becomes to find the ______. It is more _____ consuming. What is used to help find the median position for big data sets?
Median.
Time.
A formula.
What is the formula for finding the median position (k)?
k = (N + 1) / 2
When we calculate the median we need to have our data in ________ ___________.
Numerical Order.
If there is an even number of numbers in the data set, where will the median position be?
It will be midway between 2 numbers.
True or False?
Regardless of odd or even numbered data set we use the same formula to get the median position.
True.
In an even numbered data set, once we have found the median position what do we have to do?
Find the two numbers that the median position falls between and then find the average of these 2 numbers.
If the median position is 4.5, where will the median be?
Midway between the 4th and 5th values.
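The median steps above (order the data, find the position with (N + 1) / 2, average the two neighbours when the position falls between them) might be sketched as follows. The function names and both data sets are hypothetical.

```python
def median_position(n):
    """Median position k = (N + 1) / 2, counting positions from 1."""
    return (n + 1) / 2


def median_of(values):
    """Return the median using the position formula."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)            # step 1: put the data in numerical order
    k = median_position(len(ordered))   # step 2: find the median position
    if k.is_integer():
        # Odd-numbered data set: the median is the value at position k.
        return ordered[int(k) - 1]      # -1 because Python lists count from 0
    # Even-numbered data set: k ends in .5, so average the two values
    # on either side of it (e.g. the 4th and 5th values when k = 4.5).
    lower = ordered[int(k) - 1]
    upper = ordered[int(k)]
    return (lower + upper) / 2


odd_set = [7, 3, 9, 1, 5]                    # N = 5, k = 3
even_set = [8, 2, 6, 4, 10, 12, 14, 16]      # N = 8, k = 4.5

print(median_of(odd_set))    # prints 5
print(median_of(even_set))   # prints 9.0
```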
What is a Pro of The Median?
It is unaffected by extreme values in one direction, therefore it is better for use with “skewed distributions”.
Normally distributed data is best suited to the _____ and forms a classic ____-_____ curve.
Mean.
Bell-shaped.
What measure of central tendency is best for data with a skewed distribution?
The median.
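To illustrate why the median suits skewed data, this sketch compares the two measures on an invented set of incomes (in thousands) with one very high value. `statistics.mean` and `statistics.median` are standard-library functions.

```python
from statistics import mean, median

# Hypothetical incomes in thousands; one high value skews the distribution.
incomes = [18, 20, 21, 22, 23, 25, 120]

skewed_mean = round(mean(incomes), 2)   # 249 / 7, pulled up by the 120
skewed_median = median(incomes)         # the 4th of the 7 ordered values

print(skewed_mean)     # prints 35.57
print(skewed_median)   # prints 22
```

Six of the seven incomes are 25 or below, so the median (22) describes the typical value far better here than the mean (35.57).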
What are the cons of the median?
- Doesn’t take into account exact distances between values
- We can’t use this measure in estimates of population parameters
- Can be unrepresentative in small data sets.
Aside from the mean and median, name the third measure of central tendency.
The mode.
We can’t calculate an average or a median with ____ data. So what measure of central tendency is often used?
Nominal. The mode.
What is the mode?
The value that occurs most often (the most frequent value) :)
The mode is the most _____ occurring _______.
Frequently, Category.
The mode is sometimes referred to as what?
The modal value.
1, 2, 3, 4, 4, 4, 4, 5.
What is the mode?
4 :)
1, 2, 3, 4, 5.
What is the mode?
There is no (single) mode: all the values are equally frequent.
If there are 2 modes in a data set, what is this known as?
Bimodal.
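The mode cards above can be checked with a short sketch. It assumes Python 3.8 or later, where the standard library provides `statistics.multimode`, which returns every most-frequent value as a list. The first two data sets are the ones from the cards; the bimodal one is made up.

```python
from statistics import multimode

one_mode = [1, 2, 3, 4, 4, 4, 4, 5]   # data set from the card above
no_single_mode = [1, 2, 3, 4, 5]      # every value occurs once
bimodal = [1, 2, 2, 3, 4, 4, 5]       # hypothetical set with two modes

print(multimode(one_mode))         # prints [4]
print(multimode(no_single_mode))   # prints [1, 2, 3, 4, 5]
print(multimode(bimodal))          # prints [2, 4]
```

When every value is returned, as in the second case, there is no single mode.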
Nominal values are ________.
Categories.
What is the typical measure of central tendency for nominal data?
The Mode.
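Because nominal values are categories, finding the mode is just a matter of counting. A minimal sketch using `collections.Counter` from the standard library; the eye-colour data is invented for illustration.

```python
from collections import Counter

# Hypothetical nominal data: categories with no numerical order.
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue"]

counts = Counter(eye_colours)        # how often each category occurs
highest = max(counts.values())       # the largest frequency
modal_categories = [c for c, n in counts.items() if n == highest]

print(counts["brown"])      # prints 3
print(modal_categories)     # prints ['brown']
```

A mean or median of these labels would be meaningless, but the most frequently occurring category is still well defined.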
The ____ can also help avoid instances such as 2.4 children in ______ data at ______ levels of measurement.
Mode, discrete, higher.
What are the pros of the mode?
- Shows the most frequent, or typical value in a data set
- Unaffected by extreme values in one direction
- Can sometimes be informative when scale is discrete (2.4 children).
What are the cons of the mode?
- Doesn’t take into account the exact distances between values.
- Can’t be used in estimates of population parameters
- Not really useful for small datasets
- Bimodal distributions can occur.
For what scales can the mean be used?
Interval or Ratio.
For what scales can the median be used?
Ordinal (or interval or ratio, if more appropriate than the mean).
For what scales can the mode be used?
Nominal (or ordinal, interval or ratio, if more appropriate than the other measures).
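The last three cards can be summarised as a small lookup, sketched here under the assumption that the four levels of measurement are ordered nominal, ordinal, interval, ratio. The names `SCALES` and `usable_measures` are hypothetical.

```python
# Levels of measurement, from lowest to highest (assumed ordering).
SCALES = ["nominal", "ordinal", "interval", "ratio"]


def usable_measures(scale):
    """List the measures of central tendency that can be used at a given scale."""
    level = SCALES.index(scale.lower())   # raises ValueError for an unknown scale
    measures = ["mode"]                   # the mode works at every level
    if level >= SCALES.index("ordinal"):
        measures.append("median")         # needs values that can be put in order
    if level >= SCALES.index("interval"):
        measures.append("mean")           # needs exact distances between values
    return measures


print(usable_measures("nominal"))    # prints ['mode']
print(usable_measures("ordinal"))    # prints ['mode', 'median']
print(usable_measures("ratio"))      # prints ['mode', 'median', 'mean']
```

The list shows what can be used at each level, not what should be: at interval or ratio level the mean is usual, with the median or mode chosen only when they are more appropriate.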