data analysis part 1: descriptive statistics Flashcards
what are descriptive statistics
analyses of quantitative data that summarise patterns, saving readers from working through all the raw data to understand the findings of research
what are measures of central tendency
descriptive statistics that give a single central or typical value for a set of data: the mean, median and mode
what is the mean
the sum of all the values in the data set divided by the number of values; it takes every value in the data set into account
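a minimal sketch of the calculation in python. the scores are hypothetical, made up only to illustrate the definition; `statistics.mean` from the standard library is shown as a cross-check.

```python
import statistics

# hypothetical scores, invented for illustration
scores = [4, 7, 9, 10, 15]

# mean = sum of all values / number of values
mean = sum(scores) / len(scores)

print(mean)                     # 9.0
print(statistics.mean(scores))  # same value from the standard library
```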
strength of mean - calculating
all raw data points are used in calculating the mean so it is the most sensitive measure of central tendency
weakness of mean - distortion
due to its sensitivity the mean is distorted by extremely high or low values (outliers)
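a small sketch of the distortion, assuming two hypothetical sets of scores that differ only in their highest value.

```python
# hypothetical scores, invented for illustration
typical = [4, 7, 9, 10, 15]
with_outlier = [4, 7, 9, 10, 95]  # same scores, but the top score is an extreme value

mean_typical = sum(typical) / len(typical)
mean_with_outlier = sum(with_outlier) / len(with_outlier)

print(mean_typical)       # 9.0
print(mean_with_outlier)  # 25.0
```

with the outlier included the mean is 25.0, even though four of the five scores are 10 or lower.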
what is the median
the middle score when the data are placed in numerical order; when a data set contains outliers, the median may be a better descriptive statistic to report than the mean because it is unaffected by extreme values
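a minimal sketch of finding the median for an odd number of scores. the scores are hypothetical and deliberately include one extreme value.

```python
import statistics

# hypothetical scores, invented for illustration (95 is an extreme value)
scores = [10, 4, 95, 7, 9]

ordered = sorted(scores)             # put the data in numerical order
middle = ordered[len(ordered) // 2]  # middle position when the count is odd

print(ordered)                    # [4, 7, 9, 10, 95]
print(middle)                     # 9
print(statistics.median(scores))  # 9, cross-check with the standard library
print(sum(scores) / len(scores))  # 25.0, the mean of the same scores
```

the median stays at 9 while the mean is pulled up to 25.0 by the single extreme score.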
median strength - central
central value so not affected by outliers. easy to calculate
median weakness - sensitivity
does not include all values in the calculation, so it is less sensitive than the mean. if there is an even number of data points, the median is the mean of the two middle values, so it may not be a value that was actually recorded
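a short sketch of the even-numbered case, assuming four hypothetical scores. the median is taken as the mean of the two middle scores, which is also what `statistics.median` does.

```python
import statistics

# hypothetical scores, invented for illustration (an even number of them)
scores = [4, 7, 10, 15]

ordered = sorted(scores)
n = len(ordered)
# with an even count there is no single middle score,
# so average the two scores either side of the centre
median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(median)                     # 8.5
print(statistics.median(scores))  # 8.5
print(median in scores)           # False: 8.5 was never actually recorded
```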
what is the mode
the most frequently occurring value in a data set. it is useful for showing the most common value, but it is of little use where many different values occur with the same frequency
strength - mode - discrete
the mode is good for discrete and categorical data because it is always a value that actually occurs, e.g. a whole number rather than an average such as 1.89; it is also not distorted by outliers
weakness - mode - no mode and sensitivity
there is no mode if all values are different, and there can be several modes if values tie for most frequent
not sensitive, as it does not include all values in the calculation
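a sketch of these cases using `statistics.multimode` (python 3.8+), which returns a list of the most frequent values. all three data sets are hypothetical.

```python
import statistics

# hypothetical data, invented for illustration
shoe_sizes = [5, 6, 6, 7, 7, 9]       # two values tie for most frequent
pets = ["dog", "cat", "dog", "fish"]  # categorical data
all_different = [1, 2, 3]             # every value occurs once

print(statistics.multimode(shoe_sizes))     # [6, 7]
print(statistics.multimode(pets))           # ['dog']
print(statistics.multimode(all_different))  # [1, 2, 3]
```

note that when every value occurs once, `multimode` returns all of them; in the terms used on these cards, that data set has no mode.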
what are measures of dispersion
they describe the spread of data around a central value, telling us how much variability there is in the data. the two covered here are the range and the standard deviation
what is the range
the difference between the highest and lowest scores, found by subtracting the lowest score from the highest score
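a minimal sketch of the calculation, assuming a hypothetical set of scores.

```python
# hypothetical scores, invented for illustration
scores = [4, 7, 9, 10, 15]

# range = highest score - lowest score
data_range = max(scores) - min(scores)

print(data_range)  # 11
```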
range - strength - time
quick and easy to calculate
range - weakness - extreme
only takes into account the two most extreme values, so it may not be representative of the data set as a whole
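a sketch of this weakness, assuming two hypothetical data sets that share the same lowest and highest scores. `statistics.pstdev` (population standard deviation) is used here only as a comparison measure of spread.

```python
import statistics

# hypothetical data, invented for illustration
clustered = [1, 5, 5, 5, 5, 5, 9]   # most scores sit at the centre
spread_out = [1, 2, 3, 5, 7, 8, 9]  # scores are scattered across the scale

range_clustered = max(clustered) - min(clustered)
range_spread_out = max(spread_out) - min(spread_out)

print(range_clustered, range_spread_out)  # 8 8

# the standard deviation uses every score, so it tells the two sets apart
print(statistics.pstdev(clustered) < statistics.pstdev(spread_out))  # True
```

both sets have a range of 8, yet the scores inside them are distributed very differently.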