Sampling Variation, Bias, Confounding Flashcards
What is the sampling distribution?
The distribution of a statistic (e.g. the mean) across an infinite number of samples of the same size drawn from the same population.
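
A minimal simulation sketch of this idea, using only the Python standard library. The population values, the sample size of 30 and the number of repeated samples are illustrative assumptions, not figures from these cards, and a finite number of samples only approximates the "infinite" case.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples (with replacement) and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical skewed population (values are made up for illustration).
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1 / 50) for _ in range(10_000)]

# Approximate the sampling distribution of the mean with 2,000 samples of size 30.
means = sample_means(population, sample_size=30, n_samples=2_000)

population_mean = statistics.fmean(population)
mean_of_means = statistics.fmean(means)      # centre of the sampling distribution
spread_of_means = statistics.stdev(means)    # empirical standard error
expected_se = statistics.pstdev(population) / 30 ** 0.5

print(f"population mean:        {population_mean:.2f}")
print(f"mean of sample means:   {mean_of_means:.2f}")
print(f"SD of sample means:     {spread_of_means:.2f}")
print(f"population SD / sqrt n: {expected_se:.2f}")
```

The sample means should cluster around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.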
What is the 95% confidence interval?
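- Range calculated from a sample such that, if the sampling were repeated many times, 95% of the ranges calculated this way would contain the mean of the population values.
- I.e. Range within which you can be 95% confident that the mean of the population values lies.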
What is the 95% Confidence Interval based on?
Gaussian distribution (of the means of the samples)
The 95% confidence interval is wider the…
… greater the variation in population values.
… smaller the size of the sample used to calculate it.
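
Both statements can be checked with a short sketch. It assumes the usual Gaussian approximation, mean ± 1.96 × standard error, with standard error = standard deviation / √n. The function names and the numbers (mean 100, SD 15 or 30, n of 50 or 10) are hypothetical.

```python
import math

def ci95(sample_mean, sample_sd, n):
    """Approximate 95% CI for a mean using the Gaussian multiplier 1.96."""
    standard_error = sample_sd / math.sqrt(n)
    margin = 1.96 * standard_error
    return sample_mean - margin, sample_mean + margin

def width(interval):
    """Width of an interval given as (low, high)."""
    low, high = interval
    return high - low

base = ci95(100, 15, 50)            # reference case
more_variation = ci95(100, 30, 50)  # same n, twice the standard deviation
smaller_sample = ci95(100, 15, 10)  # same standard deviation, smaller n

for label, interval in [
    ("base", base),
    ("more variation", more_variation),
    ("smaller sample", smaller_sample),
]:
    print(f"{label:15s} width = {width(interval):.2f}")
```

Doubling the standard deviation doubles the width, while shrinking the sample widens the interval in proportion to 1/√n.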
What is a confidence interval?
A range around a sample estimate (e.g. the sample mean) that indicates how precisely the observed values in the sample estimate the true population value.
What are the 2 main categories of bias?
- Selection bias
- Information bias
What are the reasons for selection bias and the impact of these?
- Errors in generalisability (external validity).
  - Study participants are drawn from a sampling frame that is not representative of the general population.
- Errors in comparability (internal validity).
  - The groups being compared are not from the same population.
  - One of the groups being studied is not representative of the sampling frame from which it was drawn (e.g. low response rate to a survey, high attrition rate).
What are the different types of information bias?
- Differential recall error (especially in case-control studies)
  - e.g. Parents of an ill child may recall past exposures better than parents of an unaffected child.
- Differential observer or interviewer error
- Differential measurement error
- Differential misclassification
What is a confounding variable?
An extraneous variable that is associated with both the independent variable (exposure) and the dependent variable (outcome), and so can distort the apparent relationship between them.
What is the biggest confounding variable and what does this arise from?
- Age
- Demographic differences between populations (i.e. rapidly expanding, expanding, stationary and contracting populations).
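How can confounding variables be controlled for (e.g. standardising for the effect of age on the death rates of 2 populations with different demographics)?
- Adjust with direct standardisation.
  - Apply each population's age-specific death rates to the same standard age structure, so that the number of people in each age category is equal in the 2 populations.
- Adjust with indirect standardisation.
  - Apply the same standard age-specific death rates to each population's own age structure to get the expected deaths, then compare observed with expected deaths.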
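
A small worked sketch of both adjustments. All figures are hypothetical: two populations with identical age-specific death rates but different age structures, three age bands (young, middle, old) and an arbitrary standard population. The function names are illustrative, and the observed-to-expected ratio used for the indirect method is the standardised mortality ratio (SMR).

```python
def crude_rate(age_specific_rates, population_by_age):
    """Overall death rate: total deaths divided by total population."""
    deaths = sum(r * n for r, n in zip(age_specific_rates, population_by_age))
    return deaths / sum(population_by_age)

def direct_standardised_rate(age_specific_rates, standard_population):
    """Direct method: weight a population's own rates by a standard age structure."""
    expected = sum(r * n for r, n in zip(age_specific_rates, standard_population))
    return expected / sum(standard_population)

def standardised_mortality_ratio(observed_deaths, population_by_age, standard_rates):
    """Indirect method: observed deaths divided by deaths expected under standard rates."""
    expected = sum(r * n for r, n in zip(standard_rates, population_by_age))
    return observed_deaths / expected

# Hypothetical data, age bands: young, middle, old.
rates_a = [0.001, 0.005, 0.05]
rates_b = [0.001, 0.005, 0.05]   # same age-specific rates as population A
pop_a = [6000, 3000, 1000]       # young (expanding) population
pop_b = [2000, 3000, 5000]       # old (contracting) population
standard_pop = [4000, 3000, 3000]

crude_a = crude_rate(rates_a, pop_a)
crude_b = crude_rate(rates_b, pop_b)
direct_a = direct_standardised_rate(rates_a, standard_pop)
direct_b = direct_standardised_rate(rates_b, standard_pop)

# Indirect method for B, taking A's age-specific rates as the standard rates.
observed_b = sum(r * n for r, n in zip(rates_b, pop_b))
smr_b = standardised_mortality_ratio(observed_b, pop_b, rates_a)

print(f"crude rates:  A = {crude_a:.4f}, B = {crude_b:.4f}")
print(f"direct rates: A = {direct_a:.4f}, B = {direct_b:.4f}")
print(f"SMR for B (A's rates as standard): {smr_b:.2f}")
```

In this made-up example the crude rates differ only because B is older. After either adjustment the two populations look the same, which is what controlling for age as a confounder means here.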