data analysis: descriptive statistics Flashcards
define descriptive statistics
the use of graphs, tables and summary statistics to identify and analyse sets of data
define measures of central tendency
the general term for any measure of the average value in a set of data
what are the 3 types of measures of central tendency ?
mean , median and mode
how is the mean calculated ?
by adding up all the scores or values in a data set and dividing the total by the number of scores
work out the mean for the following data set of scores :
5, 7, 7, 9, 10, 11, 12, 14, 15, 17
total is 107
107 / 10 (the number of scores)
mean value of 10.7
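The worked example above can be checked with a few lines of Python. This is only a sketch of the same arithmetic; the variable names are chosen for illustration.

```python
# scores from the worked example above
scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]

total = sum(scores)          # add up all the scores
mean = total / len(scores)   # divide by the number of scores

print(total)  # 107
print(mean)   # 10.7
```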
why is the mean the most sensitive of the measures of central tendency and what does that mean ?
as it includes all of the scores/values in the data set within the calculation
–> this means it is more representative of the data as a whole
what is a limitation of the mean ? give an example of how this could happen
it is easily distorted by extreme values
example : if we replace 17 in the data above with the number 98 –> the mean becomes 18.8 which doesn’t really seem to represent the data overall
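A short sketch of the distortion described above, using the same ten scores. The helper name `mean` and the list names are illustrative assumptions. It also counts how many scores sit below the distorted mean.

```python
original = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]
distorted = [5, 7, 7, 9, 10, 11, 12, 14, 15, 98]  # 17 replaced with 98

def mean(values):
    """Arithmetic mean: total of the values divided by how many there are."""
    return sum(values) / len(values)

# scores that fall below the distorted mean
below = [s for s in distorted if s < mean(distorted)]

print(mean(original))   # 10.7
print(mean(distorted))  # 18.8
print(len(below))       # 9
```

Nine of the ten scores are lower than 18.8, which is why the mean no longer looks typical of the data.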
define mean
the arithmetic average calculated by adding up
all the values in a set of data and dividing by the number
of values there are
define median
the central value in a set of data when values
are arranged from lowest to highest
how is the median calculated ?
by arranging the scores from lowest to highest and taking the middle value
when is the median easily identified ?
when there is an odd number of scores
how is the median identified when there is an even number of scores ?
the median is halfway between the 2 middle scores (the mean of those 2 scores)
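The odd and even cases can be sketched as one small function. The function name `median` and the five-score list are assumptions made for illustration; the ten-score list is the data set used for the mean earlier in this set.

```python
def median(values):
    """Middle value of the sorted data; halfway between the 2 middle values if the count is even."""
    ordered = sorted(values)          # arrange from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd number of scores: one middle value
        return ordered[mid]
    # even number of scores: halfway between the 2 middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 7, 7, 9, 10]))                        # 7
print(median([5, 7, 7, 9, 10, 11, 12, 14, 15, 17]))    # 10.5
```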
what are the strengths of the median ?
- extreme scores do not affect the result
- it is easy to calculate
what is a limitation of the median ?
it is less sensitive than the mean as not all scores are included in the final calculation
what does ‘sensitive’ refer to ?
refers to how easily a measure is influenced by data that’s unusual or doesn’t fit the rest
define mode
the most frequently occurring value in a set of
data
how is the mode calculated ?
by finding the most frequently occurring score/value within a data set
what may there be in some sets of data ?
2 modes - bimodal
OR
no mode if all the scores are different
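The three situations (one mode, two modes, no mode) can be sketched with `collections.Counter`. The function name `modes` and the two shorter example lists are assumptions for illustration, and returning an empty list for "no mode" is a convention chosen here, not a standard one.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list if every value occurs only once."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []                     # all the scores are different: no mode
    return [v for v, c in counts.items() if c == highest]

print(modes([5, 7, 7, 9, 10, 11, 12, 14, 15, 17]))  # [7]
print(modes([2, 3, 3, 5, 5, 8]))                    # [3, 5]  (bimodal)
print(modes([1, 2, 3, 4]))                          # []      (no mode)
```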
what is a limitation of the mode ? give an example
it is a very crude measure
- the mode can sometimes be quite different from the mean and the median
example : in the data set above the mode is 7 , while the mean is 10.7 and the median is 10.5
when would the mode be the only measure that can be used ?
when the data are in categories
example :
if you asked your class to list their favourite dessert , the only way to identify the most ‘typical’ or average would be to select the modal group
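A minimal sketch of finding the modal group for category data. The list of desserts is made-up illustrative data, not results from any real class.

```python
from collections import Counter

# hypothetical answers from a class (invented for illustration)
favourites = ["cake", "ice cream", "brownie", "ice cream", "cake", "ice cream"]

counts = Counter(favourites)                        # frequency of each category
modal_group, frequency = counts.most_common(1)[0]   # most frequent category

print(modal_group, frequency)  # ice cream 3
```

A mean or median cannot be worked out here because the categories have no numerical value or order.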
when deciding which measure of central tendency should be used , what should be considered ?
whether there are any extreme scores
which measure is best if there are no extreme values , and why ?
the mean is the best option as it is
the most sensitive measure of the three
which measure is best if there are extreme values , and why ?
the median is most suitable as the mean
would become distorted
which measure would never be the best option , and when would it be appropriate ?
the mode - except if the data are in categories
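To see why extreme scores matter for this choice, the sketch below compares the mean and the median with Python's standard `statistics` module. The two lists are the ten scores used earlier, with and without the extreme value 98; the list names are assumptions.

```python
from statistics import mean, median

no_extremes = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]
with_extreme = [5, 7, 7, 9, 10, 11, 12, 14, 15, 98]

print(mean(no_extremes), median(no_extremes))    # 10.7 10.5
print(mean(with_extreme), median(with_extreme))  # 18.8 10.5
```

The median stays at 10.5 in both cases, while the mean is pulled up by the single extreme score.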
define measures of dispersion
the general term for any
measure of the spread or variation in a set of scores
what are the 2 examples of measures of dispersion ?
range and standard deviation
define the range
a simple calculation of the dispersion in a
set of scores which is worked out by subtracting the
lowest score from the highest score and adding 1 as a
mathematical correction
how is the range calculated ?
by subtracting the lowest value from the highest value and (usually) adding 1
why is 1 added when calculating the range and give an example ?
it allows for the fact that raw scores are often rounded up (or down) when they are recorded within research
- someone may complete a simple task in 45 seconds
- however, it is unlikely they took exactly 45 seconds to complete this task (in fact it may have taken them anywhere between 44.5 and 45.5 seconds), so the addition of 1 accounts for this margin of error
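The range calculation, including the correction of 1, can be sketched as follows. The function name `range_with_correction` and its `correction` parameter are illustrative assumptions; the scores are the ten values used for the mean earlier.

```python
def range_with_correction(values, correction=1):
    """Highest value minus lowest value, plus a correction (1 by default)."""
    return max(values) - min(values) + correction

scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]

print(range_with_correction(scores))                # 13  (17 - 5 + 1)
print(range_with_correction(scores, correction=0))  # 12  (without the correction)
```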
what is a strength for the range ?
it is easy to calculate
what is a limitation of the range ? give an example
- it only takes into account the 2 most extreme values
–> this may be unrepresentative of the data set as a whole
EXAMPLE :
a student who was ill during a test scored 0 , and the highest score was 100 because that student was given the paper as homework. The range may not give a fair representation of the general spread of scores here , as most students achieved around half marks in the test
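The problem in this example can be sketched with invented numbers. The scores below are hypothetical (most students around half marks, plus one 0 and one 100), and the range formula with the added 1 follows the definition given earlier.

```python
# hypothetical test scores out of 100 (invented for illustration)
class_scores = [0, 47, 48, 49, 50, 51, 52, 53, 100]

full_range = max(class_scores) - min(class_scores) + 1

# the same calculation with the 2 extreme scores left out
typical = sorted(class_scores)[1:-1]
typical_range = max(typical) - min(typical) + 1

print(full_range)     # 101
print(typical_range)  # 7
```

Two unusual scores turn a spread of 7 into a range of 101, so the range says little about how most of the class performed.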
define standard deviation
a sophisticated measure of
dispersion in a set of scores. It tells us how much
scores deviate from the mean by calculating the
difference between the mean and each score. Each
difference is squared, and the squared differences are
added up and divided by the number of scores. This
gives the variance. The standard
deviation is the square root of the variance
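The steps in this definition can be sketched in Python using the ten scores from the mean example. Note the assumptions: each difference is squared before adding up, and the total is divided by the number of scores (some textbooks divide by the number of scores minus 1 when the data are a sample).

```python
import math

scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]

mean = sum(scores) / len(scores)                          # 10.7
squared_differences = [(s - mean) ** 2 for s in scores]   # square each difference from the mean
variance = sum(squared_differences) / len(scores)         # divide by the number of scores
sd = math.sqrt(variance)                                  # SD is the square root of the variance

print(round(variance, 2))  # 13.41
print(round(sd, 2))        # 3.66
```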
what is the SD considered to be ?
a much more sophisticated measure of dispersion than the range
what does the SD tell us ?
it is a single value that tells us how far scores deviate (move away) from the mean
the larger the SD
the greater the dispersion or spread within a set
of data
when we are talking about a particular condition within an experiment , a large SD suggests that ...
not all participants were affected by the IV in the same
way because the data are quite widely spread
- there may be a few anomalous results
what may a low SD value reflect ?
the fact that the data are tightly clustered
around the mean
–> might imply that all participants responded in a fairly similar way
what is a strength of SD ?
it is a much more precise measure of dispersion than the range as
it includes all values within the final calculation
what is a limitation of SD ?
like the mean –
it can be distorted by a single extreme value
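A sketch of this distortion, reusing the ten scores from the mean example and replacing 17 with 98. It uses `statistics.pstdev` from Python's standard library, which divides by the number of scores; the list names are assumptions.

```python
from statistics import pstdev

original = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]
distorted = [5, 7, 7, 9, 10, 11, 12, 14, 15, 98]  # 17 replaced with 98

print(round(pstdev(original), 2))   # 3.66
print(round(pstdev(distorted), 2))  # 26.57
```

One extreme value makes the SD roughly seven times larger, even though the other nine scores are unchanged.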