Bias and Confounding Flashcards
Explain how chance may lead to an apparent association between exposure and outcome that is not true.
With any study, there is always the possibility that the observed association simply arose by chance. This is because our measure of association between exposure and outcome (e.g. odds ratio, mean difference) is based on a sample rather than the entire target population and is subject to sampling error. It is therefore only an estimate of the true value in the target population. This concept of sampling variation was introduced in section 3, and we have seen how we can compute a confidence interval which tells us where the ‘true’ or ‘real’ measure of effect (e.g. the true odds ratio) is likely to lie. Chance is also explored in section 6, where we will look at how we can use p-values to assess the probability that an observed association has arisen by chance.
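As a rough illustration of how sampling error is expressed, the sketch below computes an odds ratio and an approximate 95% confidence interval from a 2x2 table using the standard log (Woolf) method. The function name and the counts are hypothetical, chosen only for illustration.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table (log method).

    a: exposed cases,    b: unexposed cases,
    c: exposed controls, d: unexposed controls.
    """
    or_ = (a * d) / (b * c)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# Hypothetical counts, for illustration only.
or_, lower, upper = odds_ratio_ci(a=30, b=70, c=20, d=80)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these made-up counts the estimated odds ratio is above 1, but the interval spans 1, so an association of this size could plausibly have arisen by chance in a sample this small.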
Explain how reverse causation may lead to an apparent association between exposure and outcome that is not true.
Reverse causation is the concept by which, instead of the exposure causing the outcome, the outcome actually caused the exposure. Often the onset of a disease, even before a diagnosis has been made, may cause individuals to change their behaviour or lifestyle. For example, some cross-sectional and case-control studies have shown a positive association between being underweight and having cancer. But rather than being underweight being a cause of cancer, it may be that when people get cancer, the illness leads to them losing weight. To overcome this problem, it is important to ensure the exposure measurements relate to the period before the disease onset, allowing for the fact that there may be a preclinical stage of the disease and that the person may experience symptoms prior to a diagnosis being made. Reverse causation is primarily a problem in cross-sectional and case-control studies, although it can occur in cohort studies if the follow-up period is short.
What is confounding?
Confounding occurs when the estimate of an exposure-outcome association is distorted by some other exposure which is related to both the outcome of interest and the exposure of interest. For example, if we find that coal miners are more likely to develop lung cancer than people in other occupations, rather than coal dust causing lung cancer, the observed association may actually be due to the fact that miners smoke more than the general population and that smoking is the true cause of the increased risk. In this example coal dust is the exposure, lung cancer is the outcome, and smoking is the possible confounder.
For a variable to be a confounder, it must be associated with the exposure of interest and it must be independently associated with the outcome.
Note, a confounder cannot be a factor which is on the causal pathway. For example, blood pressure would not be considered a confounder in the relationship between exercise (exposure) and heart attack (outcome). This is because blood pressure is actually involved in the mechanism by which exercise works to lower the risk of a heart attack - i.e. blood pressure is on the causal pathway between exercise and heart attack.
How can confounders distort an association?
Usually confounders are investigated to determine whether an apparent association between exposure and disease is still present after allowing for confounding factors. However, confounders can distort an association in either direction. Sometimes a confounder actually masks a true relation between exposure and disease, that is, initially there is no apparent relation but after allowing for the confounder an association appears. A confounder can even change the direction of an association completely, that is, an apparent positive association between exposure and disease may become negative when the confounder is allowed for, or vice versa.
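A small numerical sketch can make the reversal concrete. The cohort counts below are hypothetical, and age group is used as an assumed confounder: the crude risk ratio suggests the exposure is harmful, yet within each age group the exposed have the lower risk.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)

# Hypothetical cohort split by an assumed confounder (age group).
# Each tuple: (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "older": (320, 800, 100, 200),
    "younger": (10, 200, 50, 800),
}

# Crude analysis: ignore the confounder and pool the strata.
crude_counts = [sum(values) for values in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

# Stratified analysis: one risk ratio per level of the confounder.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

print(f"crude RR = {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"{name}: RR = {rr:.2f}")
```

The reversal happens because, in these made-up data, older people are both more likely to be exposed and at much higher risk of disease.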
What can you do to avoid confounding?
Confounding can be dealt with in the following ways, either at the design stage or analysis stage:
1) Appropriate study design - Randomisation, Restriction, Matching
2) Analysis methods - Stratification, Multivariate analysis
Describe randomisation as an appropriate study design to avoid confounding.
The most effective way to eliminate confounders (known and unknown) is by randomisation - i.e. perform a randomised clinical trial. Randomisation works because people are assigned to be either exposed or unexposed in an entirely random way. This ensures that all potential confounding factors (known and unknown) are, on average, distributed equally between the groups and therefore cannot distort the true association between exposure and outcome. Randomisation breaks the link between the confounder and the exposure, and breaking one side of the exposure-confounder-outcome triangle is enough to remove the confounding. It is for this reason that randomised controlled trials provide the strongest evidence for an association between exposure and outcome. However, it should be noted that for ethical reasons randomisation is not always possible - for example, if our exposure of interest is smoking, we cannot randomise some people to smoke and others not to.
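The balancing effect of random allocation can be sketched with a short simulation. Everything here is an assumption made for illustration: the function name, the 30% smoking prevalence, the sample size and the seed.

```python
import random

def simulate_randomisation(n=10_000, smoking_prevalence=0.3, seed=1):
    """Randomly allocate n people to two arms and compare smoking prevalence."""
    rng = random.Random(seed)
    # Smoking status is fixed before allocation and independent of it.
    smokers = [rng.random() < smoking_prevalence for _ in range(n)]
    # Allocation by a coin toss that knows nothing about smoking status.
    arms = [rng.random() < 0.5 for _ in range(n)]

    exposed = [s for s, in_exposed in zip(smokers, arms) if in_exposed]
    unexposed = [s for s, in_exposed in zip(smokers, arms) if not in_exposed]
    return sum(exposed) / len(exposed), sum(unexposed) / len(unexposed)

p_exposed, p_unexposed = simulate_randomisation()
print(f"smokers in exposed arm:   {p_exposed:.3f}")
print(f"smokers in unexposed arm: {p_unexposed:.3f}")
```

The two proportions should come out close to each other and to the assumed prevalence. The same would hold for any other characteristic, measured or not, which is why randomisation also deals with unknown confounders.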
Describe restriction as an appropriate study design to avoid confounding.
Confounding can be avoided at the design stage by restricting the study group to only one level of the confounding variable. For example, in a study of occupational disease (the previous mining example), one might restrict the study to non-smokers only to remove smoking as a confounding factor. This method of dealing with confounding is called restriction. The downside of restriction is that it limits the number of participants in the study and it also limits how generalisable the findings are.
Describe matching as an appropriate study design to avoid confounding.
In case-control studies, cases and controls can be matched to one another according to potential confounding factors. For example, in a case-control study to determine whether coffee drinking causes carcinoma of the pancreas, to remove the potential confounding influence of smoking we find a case of cancer of the pancreas who smokes and compare them with a control (a person from the same base population who does not have cancer of the pancreas) who also smokes. It is important to note, however, that matched case-control studies have to be analysed using a particular statistical method whereby the matching factors are taken into account. If this is not done, the results from the study will be biased.
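For 1:1 matched pairs, one standard analysis that respects the matching estimates the odds ratio from the discordant pairs only (pairs in which exactly one member was exposed). The sketch below uses hypothetical pair counts and hypothetical names, and compares this with an analysis that wrongly ignores the pairing.

```python
def matched_pair_odds_ratio(case_only_exposed, control_only_exposed):
    """Odds ratio for 1:1 matched case-control data.

    Only discordant pairs contribute: pairs where exactly one member
    of the pair was exposed.
    """
    return case_only_exposed / control_only_exposed

# Hypothetical pair counts, for illustration only.
pairs = {
    "both_exposed": 40,          # concordant, carries no information
    "case_only_exposed": 30,     # discordant
    "control_only_exposed": 15,  # discordant
    "neither_exposed": 15,       # concordant, carries no information
}

matched_or = matched_pair_odds_ratio(
    pairs["case_only_exposed"], pairs["control_only_exposed"]
)

# Unmatched analysis of the same people, with the pairing broken.
a = pairs["both_exposed"] + pairs["case_only_exposed"]        # exposed cases
b = pairs["control_only_exposed"] + pairs["neither_exposed"]  # unexposed cases
c = pairs["both_exposed"] + pairs["control_only_exposed"]     # exposed controls
d = pairs["case_only_exposed"] + pairs["neither_exposed"]     # unexposed controls
unmatched_or = (a * d) / (b * c)

print(f"matched OR   = {matched_or:.2f}")
print(f"unmatched OR = {unmatched_or:.2f}")
```

In this made-up example the unmatched estimate is closer to 1 than the matched one, which illustrates how ignoring the matching can bias the result.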
Describe stratification as an appropriate analysis method to avoid confounding.
More commonly, confounding is controlled in the analysis, although this relies on the relevant information on the confounding factors being collected accurately. Stratification is a widely used technique in which you carry out the analysis and compute a measure of effect separately for each level of the potential confounder. For example, suppose you carry out a case-control study to examine the association between coffee consumption and cancer of the pancreas. When the association is initially examined, the odds ratio is estimated to be 1.9. However, we might think that smoking is a potential confounding factor. Therefore we can separate subjects into smokers and non-smokers and see if the association is still there. If we reanalyse the data we may find that it supports the suggestion that smoking confounds the association between coffee consumption and cancer of the pancreas - and that there is actually no effect of coffee on cancer of the pancreas seen for smokers or non-smokers.
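The coffee example can be sketched numerically. The counts below are hypothetical and do not reproduce the odds ratio of 1.9 quoted above; they only show the pattern of a raised crude odds ratio that disappears within strata. The Mantel-Haenszel estimator, a standard way of pooling stratum-specific odds ratios, is used here as one possible summary.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: unexposed cases, c: exposed controls, d: unexposed controls."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Pooled odds ratio across strata (Mantel-Haenszel estimator)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Hypothetical counts: (coffee cases, no-coffee cases, coffee controls, no-coffee controls)
strata = {
    "smokers": (80, 20, 40, 10),
    "non-smokers": (10, 40, 20, 80),
}

# Crude table: add the strata together, ignoring smoking.
crude_table = tuple(sum(values) for values in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)

stratum_or = {name: odds_ratio(*table) for name, table in strata.items()}
adjusted_or = mantel_haenszel_or(strata.values())

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_or.items():
    print(f"{name}: OR = {value:.2f}")
print(f"Mantel-Haenszel OR = {adjusted_or:.2f}")
```

In these made-up data smokers are more likely both to drink coffee and to be cases, so the crude odds ratio is raised even though coffee has no effect in either stratum.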
Describe multivariate analysis as an appropriate analysis method to avoid confounding.
In practice, researchers often want to allow for the potential confounding effects of many factors, not just one, and more sophisticated techniques have been developed to do this. Again, to use these methods, the information on the potential confounders needs to have been collected as part of the study design. These techniques effectively ‘adjust’ the estimated measure of effect, e.g. odds ratio, to control for the confounders.
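One commonly used technique of this kind is logistic regression; it is named here as an example only, since these notes do not specify a method. As a sketch, with p the probability of disease, E the exposure and C_1, ..., C_k the measured confounders (all symbols are assumptions introduced for illustration), the model and the adjusted odds ratio for the exposure can be written as:

```latex
\log\!\left(\frac{p}{1-p}\right)
  = \beta_0 + \beta_1 E + \beta_2 C_1 + \dots + \beta_{k+1} C_k,
\qquad
\mathrm{OR}_{\text{adjusted}} = e^{\beta_1}
```

Here the coefficient for E describes the association between exposure and disease with the confounders held fixed.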
Which factors might be confounders?
This depends on the exposure and disease under investigation, and it is important to read the literature and identify risk factors for the disease prior to starting the study so that data on them can be collected. The following factors, however, are important determinants of most diseases and are also associated with many exposures, so they are a good starting point:
- Age
- Gender
- Socioeconomic status
- Diet
- Ethnicity
- Smoking habit
What is bias?
Bias is a systematic error in an estimate. It results in a measure of effect (e.g. odds ratio or mean difference) which can be either above or below the true value, depending on the nature of the systematic error.
What is the difference between bias and confounding?
Bias is a consequence of a defect in the design or execution of an epidemiological study. Bias cannot be controlled for in the analysis of a study, whereas you can adjust for confounding if you have measured the confounding variable.
To make clear the difference between bias and confounding, consider two possible interpretations of an epidemiological study which reports finding an association between exposure and disease.
If the association results from bias, this means that the disease is not associated with the exposure in the population under study; the finding is simply wrong.
If the association results from confounding, this means that although the disease is associated with the exposure, some other factor (which is associated with exposure and independently associated with disease) can explain the association.
For example, consider a case-control study which finds an association between coffee drinking and bladder cancer, i.e. coffee drinkers are more likely to have bladder cancer.
If the association arose because of error in the exposure data such that cases with bladder cancer recalled their coffee drinking better than controls, this would be bias.
If smoking is associated with coffee drinking and also associated with an increased risk of bladder cancer, then the observed relation between coffee drinking and bladder cancer could be due to confounding by smoking status.
What are the two main forms of bias in epidemiological studies?
There are two main forms of bias in epidemiological studies:
- selection bias
- information bias
What is selection bias?
Selection bias arises from defects in the study design relating to how people are chosen to be in, or end up in, the study. One way this can occur is if the method the researchers use to sample or select the study population results in an unrepresentative population. If the criteria for inclusion in the study are related to the exposure of interest, and this differs between the diseased and non-diseased, then the estimate of effect will be biased. This is a particular issue in case-control studies.
For example, in a case-control study of the effect of smoking on lung cancer, if controls are selected from people who were suffering from non-malignant respiratory disease, this will not be a representative group of people without lung cancer. Since smoking is a cause of many non-malignant respiratory diseases, the prevalence of smoking in the controls would be higher than in the target population of possible controls without lung cancer. In consequence, the strength of the association between smoking and lung cancer would be underestimated.
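The size of this underestimate can be sketched with hypothetical counts. Below, the same set of cases is compared first with a representative control group and then with controls drawn from patients with non-malignant respiratory disease, among whom smoking is assumed to be more common. All names and numbers are made up for illustration.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure in cases divided by odds of exposure in controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical counts, for illustration only.
cases = {"smokers": 90, "non_smokers": 10}
representative_controls = {"smokers": 30, "non_smokers": 70}
respiratory_controls = {"smokers": 60, "non_smokers": 40}  # smoking over-represented

or_representative = odds_ratio(
    cases["smokers"], cases["non_smokers"],
    representative_controls["smokers"], representative_controls["non_smokers"],
)
or_respiratory = odds_ratio(
    cases["smokers"], cases["non_smokers"],
    respiratory_controls["smokers"], respiratory_controls["non_smokers"],
)

print(f"OR with representative controls: {or_representative:.1f}")
print(f"OR with respiratory controls:    {or_respiratory:.1f}")
```

Because the respiratory controls smoke more than the population the cases came from, the odds ratio computed against them is smaller than the one computed against representative controls.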
Selection bias can also arise from non-response. Few studies achieve 100% participation from those sampled, and why people agree to take part can often be related to the exposure or disease of interest. People with the disease of interest may be more likely to respond because of their interest in the study results, or for some conditions, they may be less likely to take part because of poor health. In surveys, this can result in the prevalence of the disease being over- or underestimated. In studies that aim to determine an association between an exposure and disease, non-response will only result in a biased estimate of association (e.g. odds ratio) if there is a differential effect of exposure status upon the response rates of diseased and non-diseased individuals, or a differential effect of disease status upon the response rates of exposed and unexposed individuals. An example would be a case-control study of the association between social advantage and breast cancer, in which cases overall had a high response rate, but amongst the controls, those of lower social class tended to have a poorer response rate.
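The distinction between differential and non-differential non-response can be sketched as follows. The sampled numbers and response rates are hypothetical, and "exposed" stands in for socially advantaged; the true odds ratio among those sampled is set to 1.

```python
def observed_or(sampled, response_rate):
    """Odds ratio among responders.

    Both arguments are dicts keyed by (disease status, exposure status).
    """
    n = {key: sampled[key] * response_rate[key] for key in sampled}
    return (n["case", "exposed"] * n["control", "unexposed"]) / (
        n["case", "unexposed"] * n["control", "exposed"]
    )

# Hypothetical sample with no true association (odds ratio of 1).
sampled = {
    ("case", "exposed"): 200, ("case", "unexposed"): 200,
    ("control", "exposed"): 200, ("control", "unexposed"): 200,
}

# Response depends on disease status only: no bias in the odds ratio.
by_disease_only = {
    ("case", "exposed"): 0.9, ("case", "unexposed"): 0.9,
    ("control", "exposed"): 0.5, ("control", "unexposed"): 0.5,
}

# Among controls only, response also depends on exposure: biased odds ratio.
differential = {
    ("case", "exposed"): 0.9, ("case", "unexposed"): 0.9,
    ("control", "exposed"): 0.8, ("control", "unexposed"): 0.4,
}

or_by_disease_only = observed_or(sampled, by_disease_only)
or_differential = observed_or(sampled, differential)

print(f"OR, response differs by disease only: {or_by_disease_only:.2f}")
print(f"OR, differential response:            {or_differential:.2f}")
```

In the second scenario the less advantaged controls are under-represented, so social advantage looks more common among controls than it really is and the odds ratio is pushed away from its true value of 1.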
One more type of selection bias, which is an issue in cohort (longitudinal) studies, is loss to follow-up. In cohort studies you follow people over time. In an ideal world, you would be able to keep everyone in the research study. However, people are going to drop out for a variety of reasons, which could possibly bias the results, particularly if the reason for leaving the study is related to the exposure or outcome of interest.