Summarising data Flashcards
What are the 2 types of summary descriptive statistics?
- Measure of central tendency = avg
- Measure of dispersion = spread of scores
What are the different ways to measure central tendency/ typical performance?
- Mean
- Mode
- Median
What are the dis and ad of using mode?
Mode: most frequent score in a set of scores
AD:
- simple + easy
- only avg which can be used w/ nominal data (categorical)
DIS:
- can be unrepresentative therefore misleading
–> e.g. 39 = mode but the rest of the numbers may be low
- may be more than one mode in a set of scores
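A minimal sketch of the points above, using Python's standard `statistics` module. All the data sets are made up for illustration: one nominal set, one where the mode is unrepresentative, and one with two modes.

```python
from statistics import mode, multimode

# Hypothetical nominal (categorical) data: only the mode makes sense here
colours = ["red", "blue", "blue", "green", "red", "blue"]
colour_mode = mode(colours)  # "blue"

# Hypothetical scores: 39 is the mode, but the rest of the scores are low
unrepresentative = [2, 3, 4, 5, 39, 39]
score_mode = mode(unrepresentative)  # 39

# Hypothetical scores with two values tied for most frequent
two_modes = multimode([3, 3, 5, 8, 8])  # [3, 8]

print(colour_mode, score_mode, two_modes)
```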
What are the dis + ad of using mean?
Mean: (add all scores)/ total number of scores
AD:
- uses info from every single score
- resistant to sampling fluctuation
DIS:
- susceptible to distortion from extreme scores (outliers) + skew
What are the dis + ad of using median?
Median: arrange scores in order, median = middle value or avg of middle 2 scores
AD:
- resistant to the distorting effects of extreme high or low scores
DIS:
- ignores scores' numerical values = wasteful of data
- more susceptible to sampling fluctuations than the mean
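To see the outlier effect in numbers, here is a small sketch with made-up scores (the values and variable names are assumptions for illustration). Adding one extreme score drags the mean a long way but barely moves the median.

```python
from statistics import mean, median

# Hypothetical scores, then the same scores plus one extreme value
scores = [4, 5, 5, 6, 7]
with_outlier = scores + [60]

mean_before = mean(scores)            # 5.4
mean_after = mean(with_outlier)       # 14.5
median_before = median(scores)        # 5
median_after = median(with_outlier)   # 5.5 (avg of middle 2 scores: 5 and 6)

print(mean_before, mean_after)
print(median_before, median_after)
```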
What are the different measures of dispersion/ variability in performance?
- Range
- Standard deviation
What are the AD and DIS of the range?
Range = difference between the highest + lowest score
AD:
- quick + easy to calculate
DIS:
- influenced by extreme scores
- conveys no info about the spread of scores between the highest + lowest scores
–> could have same range but spread of data completely different
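A quick illustration of that last point, with two made-up data sets (the helper name `score_range` is hypothetical). Both have the same range, yet one is tightly clustered around 50 and the other is evenly spread.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical data: same highest + lowest scores, very different spread
clustered = [10, 49, 50, 50, 51, 90]
spread_out = [10, 26, 42, 58, 74, 90]

range_clustered = score_range(clustered)    # 80
range_spread_out = score_range(spread_out)  # 80

print(range_clustered, range_spread_out)
```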
What is SD?
- The spread of scores around a sample mean
- tells us how well the mean summarises the sample
–> the bigger the SD, the more scores differ from the mean + from each other, and the less satisfactory the mean becomes as a summary of the data
What are the ads + dis of SD?
AD:
- like the mean, use info from every score
DIS:
- not intuitively easy to understand
How do you calculate SD?
- Work out mean of data
- Subtract mean from each score
- Square the differences obtained
- Add up the squared differences = SS (sum of squares)
- SS/ total number of scores = variance
- SD = square root of variance
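The steps above can be sketched directly in code. This is a minimal version that divides by the total number of scores, as in the steps; the function name and the example scores are made up for illustration.

```python
from math import sqrt

def sd_from_steps(scores):
    """SD following the flashcard steps (divides SS by n)."""
    n = len(scores)
    mean = sum(scores) / n                     # work out the mean
    differences = [x - mean for x in scores]   # subtract mean from each score
    squared = [d ** 2 for d in differences]    # square the differences
    ss = sum(squared)                          # sum of squares (SS)
    variance = ss / n                          # SS / total number of scores
    return sqrt(variance)                      # SD = square root of variance

# Hypothetical scores: mean = 5, SS = 32, variance = 4
example_sd = sd_from_steps([2, 4, 4, 4, 5, 5, 7, 9])
print(example_sd)  # 2.0
```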
What are some issues with using the mean and SD?
- usually obtain the mean/ SD from a sample, not the whole population = they are only estimates of the population values (even if good ones)
- sample SD tends to underestimate the population SD
How can we deal with SD typically underestimating the SD of the population?
- when describing the sample itself, divide SS by n
- when using the sample SD as an ESTIMATE of the population SD, divide SS by n-1
(smaller divisor makes the SD larger)
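As a rough illustration, Python's `statistics` module offers both versions: `pstdev` divides by n and `stdev` divides by n-1. The scores below are made up.

```python
from statistics import pstdev, stdev

# Hypothetical sample of scores (SS = 32, n = 8)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

sd_n = pstdev(scores)          # divide by n: describes this sample
sd_n_minus_1 = stdev(scores)   # divide by n-1: estimate of the population SD

print(sd_n)           # 2.0
print(sd_n_minus_1)   # slightly larger, roughly 2.14
```

The gap between the two shrinks as the sample gets bigger, since n and n-1 become nearly the same divisor.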
What is the relationship between the normal curve and the SD?
- On a normal curve, each SD cuts off a constant proportion of the distribution of scores
> approx. 68% of ppl have IQs between 85 + 115 (mean = 100, SD = 15; mean +/- 1 SD)
> approx. 95% have IQs between 70 + 130 (100 +/- (2*15); mean +/- 2 SD)
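These proportions can be checked with `statistics.NormalDist` from the Python standard library. This is a sketch that assumes IQ follows an exactly normal curve with mean 100 and SD 15.

```python
from statistics import NormalDist

# Assumed model: IQ is normally distributed with mean 100, SD 15
iq = NormalDist(mu=100, sigma=15)

# Proportion of the curve between two scores = difference in cumulative area
within_1sd = iq.cdf(115) - iq.cdf(85)   # roughly 0.683
within_2sd = iq.cdf(130) - iq.cdf(70)   # roughly 0.954

print(round(within_1sd * 100, 1), "% within 1 SD")
print(round(within_2sd * 100, 1), "% within 2 SD")
```

The familiar 68% and 95% figures are rounded versions of these values.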
99.7% of a population have an IQ between 55 - 145 (mean = 100, SD = 15, so mean +/- 3 SD). What are the chances of an IQ above 145?
- (100 - 99.7)/ 2
–> divide by 2 since the remaining 0.3% is split between the 2 tails (below 55 + above 145)
= only occurs in 0.15% of the population
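A quick check of the tail arithmetic, again assuming an exactly normal curve with mean 100 and SD 15. The rule-of-thumb calculation gives 0.15%; the exact normal-curve value is slightly smaller because 99.7% is itself rounded.

```python
from statistics import NormalDist

# Rule of thumb: the 0.3% outside +/- 3 SD is split evenly over both tails
rule_of_thumb_pct = (100 - 99.7) / 2          # about 0.15 (percent)

# Exact tail area under the assumed normal model
iq = NormalDist(mu=100, sigma=15)
above_145_pct = (1 - iq.cdf(145)) * 100       # roughly 0.135 (percent)

print(round(rule_of_thumb_pct, 2), round(above_145_pct, 3))
```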
What is the standard error of the mean?
A type of SD
- is the SD of a set of sample means
- shows how much variation there is within a set of sample means
–> indicates the reliability of each sample mean as an estimate of the true population mean
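A small simulation can make this concrete. The sketch below draws many samples from an assumed normal population (mean 100, SD 15; the sample size, number of samples, and seed are arbitrary choices), then takes the SD of the resulting sample means. It also compares the result with the standard shortcut SD / sqrt(n), which is not part of the notes above.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 100, 15   # assumed population values
SAMPLE_SIZE = 25             # scores per sample (arbitrary)
N_SAMPLES = 2000             # number of samples drawn (arbitrary)

# Draw many samples and record the mean of each one
sample_means = [
    mean([rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

# Standard error of the mean = SD of the set of sample means
standard_error = stdev(sample_means)

# Shortcut for comparison: population SD / sqrt(sample size)
shortcut = POP_SD / sqrt(SAMPLE_SIZE)   # 3.0

print(standard_error, shortcut)
```

The simulated value should land close to the shortcut value; a smaller standard error means each sample mean is a more reliable estimate of the population mean.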