Semester 1 Flashcards
What is important to consider when looking at sample size?
- Size matters
- Sampling error can result if your sample is not large enough
- Trade off between size and time/cost
Factors in deciding sample size?
o Design
o Response rate
o Heterogeneity of population
What is a population parameter?
- a quantity that describes some characteristic of a population with respect to a specific variable
- E.g., population mean, population range etc.
- Not usually possible to calculate
- Might be given to you if available
What is a sample statistic?
- a quantity that describes some characteristic of a sample with respect to a specific variable
- E.g., sample mean, sample range etc.
- We can always calculate these from a sample
- Sample statistics provide an estimate of population parameters
Why is it important to summarise data?
- Data can be very complex and therefore it is useful to summarise it
- Allows for interpretation
What are measures of central tendency?
They provide an indication of a “typical” score in the data set
What is the mean?
o Provides an estimate of the average score in the data set
o Is affected by extreme data points
What is the median?
o Is insensitive to extreme scores in the data set
o Doesn’t reflect the shape of the scores e.g., doesn’t care how far away the extreme scores are
What is the mode?
o Easy to calculate from a histogram and easy to understand – the most common value
o Data might have more than 1 mode or no mode at all
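Worked example (a minimal sketch with made-up scores, using Python's standard statistics module): adding one extreme score pulls the mean a long way but barely moves the median, and a data set can have more than one mode.

```python
from statistics import mean, median, multimode

scores = [2, 3, 3, 4, 5]        # made-up scores for illustration
with_extreme = scores + [50]    # the same scores plus one extreme value

mean_before = mean(scores)            # 17 / 5 = 3.4
mean_after = mean(with_extreme)       # 67 / 6, roughly 11.17
median_before = median(scores)        # middle value: 3
median_after = median(with_extreme)   # average of 3 and 4: 3.5

modes_one = multimode(scores)            # [3], a single mode
modes_two = multimode([1, 1, 2, 2, 3])   # [1, 2], two modes

print("mean:", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
print("modes:", modes_one, modes_two)
```

Note that multimode returns every value when all values are equally common, which corresponds to the "no mode" case in the card above.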
What is the range?
o Difference between min and max scores
o Range doesn’t always change for distributions with different shapes
What is a deviation?
o The signed distance of a score from the mean
How to calculate simple variance?
o Calc mean
o Calc deviations
o Square deviations
o Calc a slightly adjusted average squared deviation
- You divide by n-1
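Worked example (a sketch with made-up scores; the variable names are illustrative only): the variance steps done by hand, then checked against the standard library's sample variance, which also divides by n-1.

```python
from statistics import variance

scores = [2, 4, 4, 6, 9]                        # made-up scores
n = len(scores)

mean_score = sum(scores) / n                    # step 1: mean = 25 / 5 = 5.0
deviations = [x - mean_score for x in scores]   # step 2: signed distances from the mean
squared = [d ** 2 for d in deviations]          # step 3: squared deviations
sample_variance = sum(squared) / (n - 1)        # step 4: 28 / 4 = 7.0

print("deviations:", deviations)                # they sum to zero
print("variance:", sample_variance)
print("library check:", variance(scores))
```

The deviations always sum to zero, which is why they are squared before averaging.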
What is the danger of bimodal data?
- Danger – mean is not representative
- Tends to suggest an issue with your experiment – more than one underlying population
What is the normal distribution?
- Bell-shaped
- Symmetric about the centre
- Tails never reach 0 – they extend towards infinity in both directions, getting ever closer to 0
- The area under the curve is always equal to 1
- Very close to 0 by the time it gets to 3 SD from the mean – can use this to draw a rough idea of a normal distribution
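Quick check (a sketch using statistics.NormalDist; the mean of 100 and SD of 15 are assumed values chosen only for illustration): the curve is symmetric, almost all of the area lies within 3 SD of the mean, and the curve is very low but not zero at 3 SD.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)   # assumed mean and SD, for illustration

# Area between 3 SD below and 3 SD above the mean
within_3sd = dist.cdf(145) - dist.cdf(55)

# Height of the curve at the centre and at 3 SD out
peak = dist.pdf(100)
height_at_3sd = dist.pdf(145)

# Symmetry: same height one SD either side of the centre
left, right = dist.pdf(85), dist.pdf(115)

print("area within 3 SD:", round(within_3sd, 4))
print("height at 3 SD relative to peak:", round(height_at_3sd / peak, 4))
print("symmetric:", abs(left - right) < 1e-12)
```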
What is probability?
– a measure of how likely it is that an uncertain event will occur
What is conditional probability?
- Probability of an event given that something else is known/assumed e.g., A|B
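Worked example (a sketch using a fair six-sided die, which is an assumed scenario not in the notes): A|B can be calculated as P(A and B) / P(B). Here A = "roll a 6" and B = "roll an even number".

```python
from fractions import Fraction

outcomes = set(range(1, 7))                # fair six-sided die (assumed example)
A = {6}                                    # event A: roll a 6
B = {x for x in outcomes if x % 2 == 0}    # event B: roll an even number

p_a = Fraction(len(A), len(outcomes))            # 1/6 with nothing known
p_b = Fraction(len(B), len(outcomes))            # 3/6 = 1/2
p_a_and_b = Fraction(len(A & B), len(outcomes))  # 1/6
p_a_given_b = p_a_and_b / p_b                    # 1/3 once we know the roll is even

print("P(A) =", p_a)
print("P(A|B) =", p_a_given_b)
```

Knowing B has happened changes the probability of A from 1/6 to 1/3.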
What is a z-score?
- Z measures how far away a score (x) is from the population mean in multiples of the SD
- If you were to find z-scores for all points on a normal distribution, you would find that it would form a normal distribution with mean 0 and SD 1 – N (0, 1)
- The area underneath a normal distribution above/below some value x EQUALS the area underneath N (0, 1) above/below the corresponding z
How do you obtain a z-score?
- Obtained by subtracting the population mean from x and then dividing by the population SD – (x-µ)/σ
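Worked example (a sketch; the population mean of 100, SD of 15 and score of 130 are assumed values for illustration): calculating z with (x-µ)/σ, then checking that the area above x on the original distribution matches the area above z on N (0, 1).

```python
from statistics import NormalDist

mu, sigma = 100, 15    # assumed population mean and SD
x = 130                # assumed score

z = (x - mu) / sigma   # (130 - 100) / 15 = 2.0

# Area above x on the original scale vs area above z on N(0, 1)
area_above_x = 1 - NormalDist(mu, sigma).cdf(x)
area_above_z = 1 - NormalDist(0, 1).cdf(z)

print("z =", z)
print("area above x:", round(area_above_x, 4))
print("area above z:", round(area_above_z, 4))
```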
What is a sampling error?
Sampling error – the error associated with examining statistics calculated from a sample rather than the population
Why do sampling errors occur?
- It occurs because in our sample we don’t have all the members of the population
What does the magnitude of a sampling error depend on?
The sample size
- Bigger sample = big sampling error less likely
- Smaller sample = big sampling error more likely
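Simulation (a rough sketch, assuming a normally distributed population with mean 50 and SD 10; average_error is a hypothetical helper written for this example): the typical gap between the sample mean and the population mean shrinks as the sample gets bigger.

```python
import random
from statistics import mean

random.seed(1)                 # fixed seed so the run is repeatable
POP_MEAN, POP_SD = 50, 10      # assumed population values

def average_error(sample_size, repeats=500):
    """Average absolute gap between sample mean and population mean."""
    errors = []
    for _ in range(repeats):
        sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(sample_size)]
        errors.append(abs(mean(sample) - POP_MEAN))
    return mean(errors)

small_sample_error = average_error(5)
large_sample_error = average_error(100)

print("average error, N=5:  ", round(small_sample_error, 2))
print("average error, N=100:", round(large_sample_error, 2))
```

Any single small sample can still land close to the population mean by chance; the simulation only shows what happens on average.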
How do we generate a sampling distribution?
- Take a sample (size N) from a population
- Calculate a sample statistic (e.g., mean, SD etc.)
- Add the new statistic to a frequency plot (a histogram) of the sample statistic
- Repeat the above 3 steps multiple times
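Simulation (a sketch of the steps above, assuming a made-up normal population with mean 50 and SD 10, samples of size 25 and 2000 repeats): each sample mean is added to a rough text histogram.

```python
import random
from collections import Counter
from statistics import mean

random.seed(0)                                               # repeatable run
population = [random.gauss(50, 10) for _ in range(10_000)]   # made-up population
N = 25                                                       # sample size

sample_means = []
for _ in range(2000):                        # repeat the steps many times
    sample = random.sample(population, N)    # take a sample of size N
    sample_means.append(mean(sample))        # calculate the sample statistic

# Frequency plot: count sample means rounded to the nearest whole number
histogram = Counter(round(m) for m in sample_means)
for value in sorted(histogram):
    print(f"{value:3d} {'#' * (histogram[value] // 10)}")

print("mean of sample means:", round(mean(sample_means), 2))
print("population mean:     ", round(mean(population), 2))
```

The same loop works for other statistics (e.g., SD or median) by swapping the mean call.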
What does the sampling distribution tell us?
- Tells us important info about how a statistic changes from sample to sample
- What is the mean value of the statistic over all samples?
- How variable is the statistic over all samples?
- What shape is the distribution of the statistic over all samples?
What are the properties of the sampling distribution of the mean (SDM)?
- Mean is the same as that of the parent population
- SD is different to that of the parent population – find by calculating σ/√N, where σ is the SD of the parent population and N is the sample size
- SD is called the standard error of the mean (s.e.m.) or standard error (s.e.)
- S.e.m. must be smaller than the SD of the parent population because you are dividing by something that is bigger than one (whenever N is greater than 1)
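Worked example (a sketch, assuming a parent population SD of 10): the standard error σ/√N for a few sample sizes.

```python
from math import sqrt

sigma = 10   # assumed SD of the parent population

# Standard error of the mean for several sample sizes
sem_by_n = {n: sigma / sqrt(n) for n in (1, 4, 25, 100)}

for n, sem in sem_by_n.items():
    print(f"N = {n:3d}  s.e.m. = {sem}")
```

With N = 1 the s.e.m. equals the population SD; for any larger N it is smaller, and it keeps shrinking as N grows.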