Descriptive Statistics Flashcards
Define measures of central tendency?
These inform us about the centre of the data
What are the 3 measures of central tendency?
Mean, median, mode
How do you calculate the mean?
Add up all the values and divide the total by the number of values
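A minimal sketch of the mean calculation (sum of the values divided by how many there are). The function name `mean` and the `scores` list are made up for illustration.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


scores = [4, 8, 6, 5, 7]  # hypothetical example data
print(mean(scores))  # 30 / 5 = 6.0
```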
What are the advantages of using the mean?
It is a very sensitive statistic because it takes into account the exact value of all the data
It is representative of all of the data
What are the disadvantages of using the mean?
If one of the values is extremely high or low (an anomaly) then the mean can be distorted and therefore misrepresent the data
It cannot be used with nominal data
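To illustrate how a single anomaly can distort the mean, here is a small sketch using invented data. Both lists and all variable names are assumptions for the example only.

```python
# Hypothetical ages in a class; the second list swaps one value for an anomaly.
typical = [21, 22, 23, 24, 25]
with_anomaly = [21, 22, 23, 24, 95]

mean_typical = sum(typical) / len(typical)
mean_with_anomaly = sum(with_anomaly) / len(with_anomaly)

print(mean_typical)       # 115 / 5 = 23.0
print(mean_with_anomaly)  # 185 / 5 = 37.0, higher than four of the five values
```

One extreme value moves the mean from 23 to 37, so it no longer sits near most of the data.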
How do you calculate the median?
The middle value in an ordered list. All data must first be put in numerical order. If there is an even number of values, add the 2 central values and divide by 2 to get the median
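The steps above can be sketched as follows. The function name `median` and the sample lists are illustrative assumptions.

```python
def median(values):
    """Return the middle value of the data once it is put in order."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)  # the data must be in numerical order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: there is a single middle item.
        return ordered[middle]
    # Even number of values: add the two central items and divide by 2.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 3, 9]))     # ordered: 3, 7, 9 -> 7
print(median([7, 3, 9, 5]))  # ordered: 3, 5, 7, 9 -> (5 + 7) / 2 = 6.0
```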
What are the advantages of using the median?
It isn’t affected by extreme scores so it can be useful under such circumstances
It is easy to calculate
It isn’t distorted by any anomalies
What are the disadvantages of using the median?
It doesn’t reflect the whole data set
It is less sensitive than the mean because exact values may not always be used
How do you calculate the mode?
The most frequent value
A data set can be bimodal (2 modes) or trimodal (3 modes)
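A short sketch of finding the mode, including the case where several values tie for most frequent. The function name `modes` and the sample data are assumptions for illustration; it returns a list so that more than one mode can be reported.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3]))           # [2]
print(modes([1, 1, 2, 2, 3]))        # [1, 2] (two modes)
print(modes(["cat", "dog", "dog"]))  # ['dog'] (works for nominal data too)
```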
What are the advantages of using the mode?
It is unaffected by anomalies
It is useful for discrete data
The only measure of central tendency that can be used for nominal data
Its output is always an actual value from the data set, whereas a mean can give an impossible value (you can’t have 2.4 children)
What are the disadvantages of using the mode?
Sometimes there may be so many modes that using it becomes meaningless
It doesn’t represent the full data set
Define measures of dispersion?
A type of descriptive statistic that describes how spread out the data items are
What are the 2 measures of dispersion?
Range
Standard deviation
Define the range?
The arithmetic difference between the highest and lowest values in a set of data
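As a minimal sketch, the range is the highest value minus the lowest value. The function name `value_range` and the sample data are invented for this example.

```python
def value_range(values):
    """Return the difference between the highest and lowest values."""
    if not values:
        raise ValueError("range requires at least one value")
    return max(values) - min(values)


print(value_range([3, 9, 4, 12, 7]))  # 12 - 3 = 9
```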
What are the advantages of using the range?
It is easy to calculate
Even if 2 sets of data have the same mean they could have different ranges, so the range helps describe how the data sets differ
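A small sketch of this point, using two invented data sets that share a mean but differ in range. All names and values here are assumptions for illustration.

```python
# Two hypothetical sets of test scores.
set_a = [9, 10, 11]
set_b = [1, 10, 19]

mean_a = sum(set_a) / len(set_a)
mean_b = sum(set_b) / len(set_b)
range_a = max(set_a) - min(set_a)
range_b = max(set_b) - min(set_b)

print(mean_a, mean_b)    # 10.0 10.0 (same mean)
print(range_a, range_b)  # 2 18 (very different spread)
```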