Data analysis: descriptive statistics flashcards
define descriptive statistics
the use of graphs, tables and summary statistics to identify trends in, and analyse, sets of data
define measures of central tendency
the general term for any measure of the average value in a set of data
what are the 3 measures of central tendency?
mean, median and mode
how is the mean calculated?
by adding up all the scores or values in a data set and dividing the total by the number of scores
work out the mean for the following data set of scores:
5, 7, 7, 9, 10, 11, 12, 14, 15, 17
total is 107
107 / 10 (the number of scores)
mean value of 10.7
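the same calculation as a minimal Python sketch; the variable names are assumptions chosen for illustration:

```python
# scores from the worked example above
scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]

total = sum(scores)   # add up all the scores
n = len(scores)       # count how many scores there are
mean = total / n      # divide the total by the number of scores

print(total, n, mean)
```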
why is the mean the most sensitive of the measures of central tendency, and what does that mean?
because it includes all of the scores/values in the data set within the calculation
-> this means it is more representative of the data as a whole
what is a limitation of the mean? give an example of how this could happen
it is easily distorted by extreme values
example: if we replace 17 in the data above with the number 98, the mean becomes 18.8, which doesn’t really represent the data overall
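a short Python sketch of this distortion, using the standard-library `statistics.mean`; the variable names are assumptions for illustration:

```python
from statistics import mean

scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]

# swap the highest score (17) for the extreme value 98
distorted = [98 if s == 17 else s for s in scores]

original_mean = mean(scores)
distorted_mean = mean(distorted)

# count how many scores fall below the distorted mean
below = [s for s in distorted if s < distorted_mean]

print(original_mean, distorted_mean, len(below))
```

nine of the ten scores sit below the distorted mean, which is why it no longer looks typical of the set.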
define mean
the arithmetic average, calculated by adding up all the values in a set of data and dividing by the number of values there are
define median
the central value in a set of data when values are arranged from lowest to highest
how is the median calculated?
by arranging the scores from lowest to highest and taking the middle value
when is the median easily identified?
when there is an odd number of scores
how is the median identified when there is an even number of scores?
the median is halfway between the 2 middle scores (add them together and divide by 2)
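a minimal Python sketch of both cases; the function name `median` and the nine-score list (the earlier data set with 17 dropped to give an odd count) are assumptions for illustration:

```python
def median(scores):
    """Return the middle value of the scores once they are in order."""
    if not scores:
        raise ValueError("median needs at least one score")
    ordered = sorted(scores)      # arrange from lowest to highest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # odd number of scores: there is a single middle value
        return ordered[middle]
    # even number of scores: halfway between the 2 middle values
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_scores = [5, 7, 7, 9, 10, 11, 12, 14, 15]        # 9 scores
even_scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]   # 10 scores

print(median(odd_scores))    # the 5th score
print(median(even_scores))   # halfway between 10 and 11
```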
what are the strengths of the median?
- extreme scores do not affect the result
- it is easy to calculate
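the first strength can be checked with a short Python sketch that uses the standard-library `statistics.median` and the same swap as in the mean example (17 replaced by 98); the variable names are assumptions for illustration:

```python
from statistics import mean, median

scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]
with_extreme = [5, 7, 7, 9, 10, 11, 12, 14, 15, 98]

# the median only looks at the middle of the ordered scores,
# so changing the highest score leaves it where it was
median_before = median(scores)
median_after = median(with_extreme)

# the mean uses every score, so it moves
mean_before = mean(scores)
mean_after = mean(with_extreme)

print(median_before, median_after)
print(mean_before, mean_after)
```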
what is a limitation of the median?
it is less sensitive than the mean, as not all scores are included in the final calculation
what does ‘sensitive’ refer to?
how easily a measure is influenced by data that is unusual or doesn’t fit the rest
define mode
the most frequently occurring value in a set of data
how is the mode calculated?
by finding the most frequently occurring score/value within a data set
what may there be in some sets of data?
2 modes (bimodal)
OR
no mode if all the scores are different
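a minimal Python sketch covering one mode, two modes and no mode; the function name `modes` and the bimodal and all-different lists are assumptions for illustration:

```python
from collections import Counter


def modes(scores):
    """Return a list of the most frequent values, or [] if there is no mode."""
    if not scores:
        return []
    counts = Counter(scores)              # how often each value occurs
    highest = max(counts.values())
    if highest == 1:
        # every score is different, so there is no mode
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([5, 7, 7, 9, 10, 11, 12, 14, 15, 17]))  # one mode
print(modes([2, 3, 3, 5, 8, 8, 9]))                 # two modes (bimodal)
print(modes([1, 2, 3, 4]))                          # no mode
```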
what is a limitation of the mode? give an example
it is a very crude measure
- the mode can sometimes be quite different from the mean and the median
when would the mode be the only measure that can be used?
when the data are in categories
example: if you asked your class to list their favourite dessert, the only way to identify the most ‘typical’ or average choice would be to select the modal group
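a short Python sketch of finding the modal group for category data; the list of dessert answers is made up for illustration and is not real class data:

```python
from collections import Counter

# hypothetical answers from a class (assumed data, for illustration only)
favourites = ["ice cream", "cake", "ice cream", "trifle", "cake", "ice cream"]

counts = Counter(favourites)                       # tally each category
modal_group, frequency = counts.most_common(1)[0]  # the most frequent category

print(modal_group, frequency)
```

a mean or median cannot be worked out here because dessert names cannot be added up or put in numerical order.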
when deciding which measure of central tendency to use, what should be considered?
whether there are any extreme scores
which measure is best if there are no extreme values, and why?
the mean is the best option as it is the most sensitive measure of the three
which measure is best if there are extreme values, and why?
the median is most suitable as the mean would become distorted
which measure would never be the best option, and when would it be appropriate?
the mode, except when the data are in categories
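the decision rule from the last few cards, written as a rough Python sketch; the function name `choose_measure` and its two yes/no parameters are assumptions for illustration, and judging whether a data set has extreme scores is left to the person using it:

```python
def choose_measure(is_categorical, has_extreme_scores):
    """Suggest a measure of central tendency, following the cards above."""
    if is_categorical:
        # category data: the mode is the only measure that can be used
        return "mode"
    if has_extreme_scores:
        # extreme scores would distort the mean
        return "median"
    # no extreme scores: the mean is the most sensitive measure
    return "mean"


print(choose_measure(is_categorical=False, has_extreme_scores=False))
print(choose_measure(is_categorical=False, has_extreme_scores=True))
print(choose_measure(is_categorical=True, has_extreme_scores=False))
```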