data analysis part 1 descriptive statistics Flashcards
what are descriptive statistics
analyses of quantitative data that summarise patterns, saving readers from having to read through a lot of raw data to understand the findings of research
what are measures of central tendency
descriptive statistics that depict the overall central trend of a set of data: the mean, median and mode
what is the mean
the sum of all numbers in the data set divided by how many numbers there are in the data set. it takes every number in the data set into account
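A minimal sketch of the mean calculation in Python. The scores are made up for illustration, and `scores` and `mean_score` are hypothetical names.

```python
# Hypothetical scores, invented for this example
scores = [4, 8, 6, 5, 7]

# Mean = sum of all values divided by how many values there are
mean_score = sum(scores) / len(scores)

print(mean_score)  # 6.0
```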
strength of mean - calculating
all raw data points are used in calculating the mean so it is the most sensitive measure of central tendency
weakness of mean - distortion
due to its sensitivity the mean is distorted by extremely high or low values (outliers)
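To illustrate the distortion, the sketch below adds a single extreme value to a small set of made-up scores. The numbers and variable names are assumptions chosen only for the example.

```python
# Hypothetical scores, with and without one extreme value (an outlier)
scores = [4, 8, 6, 5, 7]
scores_with_outlier = scores + [60]

mean_without = sum(scores) / len(scores)
mean_with = sum(scores_with_outlier) / len(scores_with_outlier)

print(mean_without)  # 6.0
print(mean_with)     # 15.0
```

One outlier moves the mean from 6.0 to 15.0, a value higher than every score except the outlier itself.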
what is the median
the middle score when the data is in numerical order. when a data set contains extreme values, the median may be a better descriptive statistic to report than the mean, as it yields a value that is unaffected by those extremes
median strength - central
it is the central value, so it is not affected by outliers. it is also easy to calculate
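A short sketch comparing how the median and the mean react to one outlier. The scores are made up; `median` comes from Python's standard `statistics` module.

```python
from statistics import median

# Hypothetical scores, with and without one extreme value
scores = [4, 5, 6, 7, 8]
scores_with_outlier = scores + [60]

median_without = median(scores)
median_with = median(scores_with_outlier)
mean_with = sum(scores_with_outlier) / len(scores_with_outlier)

print(median_without)  # 6
print(median_with)     # 6.5
print(mean_with)       # 15.0
```

In this example the outlier barely moves the median (6 to 6.5), while the mean jumps to 15.0.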
median weakness - sensitivity
does not include all values in the calculation, so it is less sensitive than the mean. if there is an even number of data points, the median is the mean of the two middle values, so it may not be a value that was actually recorded
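The even-number case can be sketched as follows, using made-up scores and `median` from the standard `statistics` module.

```python
from statistics import median

# Hypothetical scores
odd_scores = [3, 9, 5, 7, 1]   # in order: 1 3 5 7 9
even_scores = [3, 9, 5, 7]     # in order: 3 5 7 9

median_odd = median(odd_scores)    # the single middle value
median_even = median(even_scores)  # mean of the two middle values, 5 and 7

print(median_odd)   # 5
print(median_even)  # 6.0
```

Here 6.0 is the median of the even-sized set even though no participant actually scored 6.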
what is the mode
the most frequently occurring number in a data set. the mode is useful for showing the most frequent value, but it is of little use where the data set includes many different values that occur with the same frequency
strength - mode - discrete
the mode is good for discrete and categorical data, where it makes more sense to report a whole number or a category than an average such as 1.89. it is also not distorted by outliers
weakness - mode - no mode and sensitivity
there may be no mode if all values are different, or there may be multiple modes
not sensitive, because it does not include all numbers in the calculation
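A sketch of the three situations (one mode, several modes, no useful mode) with made-up data. It assumes Python 3.8 or later, where `statistics.multimode` returns a list of all the most frequent values.

```python
from statistics import multimode

# Hypothetical data sets
one_mode = [2, 3, 3, 5, 3, 2]
two_modes = [1, 1, 2, 4, 4]
all_different = [1, 2, 3, 4]

modes_one = multimode(one_mode)
modes_two = multimode(two_modes)
modes_none = multimode(all_different)

print(modes_one)   # [3]
print(modes_two)   # [1, 4]
print(modes_none)  # [1, 2, 3, 4] (every value ties, so there is no useful mode)
```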
what are measures of dispersion
they describe the spread of data around a central value, telling us how much variability there is in the data. there are two: the range and the standard deviation
what is the range
the difference between the highest and lowest scores, found by subtracting the lowest score from the highest score
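A minimal sketch of the range calculation, using made-up scores. `score_range` is a hypothetical name, chosen to avoid shadowing Python's built-in `range`.

```python
# Hypothetical scores
scores = [12, 7, 19, 3, 15]

# Range = highest score minus lowest score
score_range = max(scores) - min(scores)

print(score_range)  # 16
```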
range - strength - time
quick and easy to calculate
range - weakness - extreme
only takes into account the most extreme values and therefore is not representative of the data set as a whole
range - weakness - indication
does not indicate whether most numbers are clustered around the mean or evenly spread out
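To illustrate, the two made-up data sets below have the same mean and the same range, yet one is tightly clustered and the other is evenly spread. All names and values are assumptions for the example.

```python
# Hypothetical data sets with the same lowest and highest scores
clustered = [0, 10, 10, 10, 10, 10, 20]   # most scores sit on the mean
spread_out = [0, 3, 7, 10, 13, 17, 20]    # scores are evenly spread

range_clustered = max(clustered) - min(clustered)
range_spread_out = max(spread_out) - min(spread_out)

mean_clustered = sum(clustered) / len(clustered)
mean_spread_out = sum(spread_out) / len(spread_out)

print(range_clustered, range_spread_out)  # 20 20
print(mean_clustered, mean_spread_out)    # 10.0 10.0
```

The range is 20 in both cases, so on its own it cannot distinguish the two patterns.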
what is standard deviation
a measure of how far scores are spread around the mean. it is calculated from the distance of every score from the mean
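For reference, one common form of the calculation is the sample standard deviation. The symbols are assumptions not used in the cards themselves: $x_i$ is each score, $\bar{x}$ is the mean, and $n$ is the number of scores.

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```

Dividing by $n$ instead of $n - 1$ gives the population version, which is used when the scores are the whole population rather than a sample.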
strength - SD - precision
a more precise measure of dispersion than the range because it takes all values into account
weakness - SD - calculation
can be distorted by extreme values and it is more difficult to calculate
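The steps involved can be sketched in Python. This assumes the sample version of the standard deviation (dividing by n - 1) and uses made-up scores. The result is checked against `statistics.stdev`, which computes the same sample version.

```python
import math
from statistics import stdev

# Hypothetical scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: find the mean
mean_score = sum(scores) / len(scores)

# Step 2: square each score's distance from the mean and add them up
sum_of_squares = sum((x - mean_score) ** 2 for x in scores)

# Step 3: divide by n - 1 (sample version), then take the square root
sd = math.sqrt(sum_of_squares / (len(scores) - 1))

print(mean_score)            # 5.0
print(sum_of_squares)        # 32.0
print(round(sd, 2))          # 2.14
print(round(stdev(scores), 2))  # 2.14
```

Compared with the single subtraction needed for the range, this takes several steps, which is why the standard deviation is harder to calculate by hand.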
concrete example of SD
a high SD means scores are widely spread around the mean, which suggests that not all participants were affected by the IV (independent variable) in the same way. a low SD means the data is tightly clustered around the mean, which suggests that most participants responded in the same way
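A sketch with two made-up conditions that share the same mean but differ in spread. The condition names and scores are hypothetical, and `stdev` is the sample standard deviation from Python's `statistics` module.

```python
from statistics import stdev

# Hypothetical scores from two conditions, both with a mean of 10
condition_a = [10, 11, 9, 10, 10]   # tightly clustered around the mean
condition_b = [2, 18, 10, 5, 15]    # widely spread around the mean

mean_a = sum(condition_a) / len(condition_a)
mean_b = sum(condition_b) / len(condition_b)
sd_a = stdev(condition_a)
sd_b = stdev(condition_b)

print(mean_a, mean_b)   # 10.0 10.0
print(round(sd_a, 2))   # 0.71
print(round(sd_b, 2))   # 6.67
```

The low SD in condition A suggests participants responded in a similar way, while the high SD in condition B suggests they responded very differently, even though the means are identical.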