Statistics at Square One Flashcards
Summary Statistics:
Median and interquartile range (IQR) are examples
Median = measure of location
(making the smallest value smaller, or the largest value larger, would not affect the median)
minimises the SUM OF ABSOLUTE DIFFERENCES of the observations from a point
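A minimal Python sketch of the two median properties on this card. The data values and the helper name `sum_abs_diff` are made up for illustration and are not from the source.

```python
from statistics import median

def sum_abs_diff(data, point):
    """Sum of absolute differences between each observation and `point`."""
    return sum(abs(x - point) for x in data)

data = [2, 3, 5, 8, 13]           # made-up observations
m = median(data)                  # middle value of the sorted data

# Pushing the extremes further out leaves the median where it was.
stretched = [-100, 3, 5, 8, 1000]
median_stretched = median(stretched)

# The sum of absolute differences is smallest at the median,
# so moving one unit either side makes it larger.
sad_at_median = sum_abs_diff(data, m)
sad_below = sum_abs_diff(data, m - 1)
sad_above = sum_abs_diff(data, m + 1)

print(m, median_stretched, sad_at_median, sad_below, sad_above)
```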
Summary Statistics (2)
Mean: disadvantage is that it is sensitive to outliers
- minimises SUM OF SQUARED DIFFERENCES of the observations from a point
- SUM of DIFFERENCES from the mean (keeping the negative signs) will ALWAYS BE ZERO
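The mean properties on this card can be checked with a short Python sketch. The data set (with 30 as a deliberate outlier) and the helper name `sum_sq_diff` are illustrative assumptions, not from the source.

```python
from statistics import mean, median

def sum_sq_diff(data, point):
    """Sum of squared differences between each observation and `point`."""
    return sum((x - point) ** 2 for x in data)

data = [2, 4, 6, 8, 30]           # made-up observations; 30 is an outlier
m = mean(data)
med = median(data)

# Differences from the mean, signs kept, cancel out exactly.
deviations_sum = sum(x - m for x in data)

# The sum of squared differences is smaller at the mean than at any
# other point, for example the median.
ss_at_mean = sum_sq_diff(data, m)
ss_at_median = sum_sq_diff(data, med)

# Sensitivity to outliers: dropping the outlier moves the mean a lot
# and the median only a little.
mean_without_outlier = mean(data[:-1])
median_without_outlier = median(data[:-1])

print(m, med, deviations_sum, ss_at_mean, ss_at_median)
```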
SD - average spread of observations around the mean
when the population from which the data are derived has an approximately NORMAL (GAUSSIAN) distribution, the SD provides a useful basis for interpreting the data in terms of probability
many biological characteristics closely conform to normal distribution
for a normal distribution, a range covering 2 SD above and below the mean includes about 95% of observations
STANDARD DEVIATION is a SUMMARY MEASURE
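A quick check of the "2 SD covers about 95%" rule, sketched with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). It assumes an exactly normal distribution; real data will only approximate this.

```python
from statistics import NormalDist

# Standard normal: mean 0, SD 1, so values are already in SD units.
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of observations within 2 SD either side of the mean.
within_2sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

# Number of SDs that covers exactly 95% (2.5% left in each tail).
z_95 = standard_normal.inv_cdf(0.975)

print(round(within_2sd, 4), round(z_95, 2))
```

The exact multiplier for 95% is slightly under 2, which is why "2 SD" is a convenient rounding.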
Clues that distribution is not symmetrical but skewed (common in discrete quantitative variables)
- median and mean VERY different
Transformation will sometimes convert a skewed distribution into an approximately normal one, e.g.
- data on counts (e.g. number of doctor visits) - square root transformation
- logarithmic transformation - only if there are no negative or zero values
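A small Python sketch of the skewness clue and the two transformations on these cards. Both data sets are made up and chosen so the arithmetic is easy to follow; real data will not transform this cleanly.

```python
import math
from statistics import mean, median

# Made-up, strictly positive, right-skewed data.
data = [1, 10, 100, 1000, 10000]
mean_raw = mean(data)
median_raw = median(data)          # far below the mean: a clue to skewness

# Log transformation (valid here because every value is > 0).
logged = [math.log10(x) for x in data]
mean_log = mean(logged)
median_log = median(logged)        # now mean and median agree

# Square root transformation for made-up count data (zeros are allowed).
counts = [0, 1, 4, 9, 16]
rooted = [math.sqrt(c) for c in counts]

print(mean_raw, median_raw, mean_log, median_log, rooted)
```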
Summary Statistics for binary data [1]:
Two ways of summarising:
- PROPORTIONS
- ODDS
[2]
a proportion commonly used in medicine = PREVALENCE
A proportion is a special type of ratio (the numerator is included in the denominator)
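The two summaries of binary data can be sketched in Python as below. The counts (20 people with a condition out of 100) and the function names are hypothetical, and the conversions assume the usual definition of odds as events divided by non-events.

```python
def proportion(events, total):
    """Events divided by everyone (events are part of the denominator)."""
    return events / total

def odds(events, total):
    """Events divided by non-events."""
    return events / (total - events)

# Hypothetical example: 20 of 100 people have the condition.
p = proportion(20, 100)            # prevalence as a proportion
o = odds(20, 100)

# Converting between the two summaries.
odds_from_p = p / (1 - p)
p_from_odds = o / (1 + o)

print(p, o, odds_from_p, p_from_odds)
```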