Bias and Confounding Flashcards

1
Q

Explain how chance may lead to an apparent association between exposure and outcome that is not true.

A

With any study, there is always the possibility that the observed association simply arose by chance. This is because our measure of association between exposure and outcome (e.g. odds ratio, mean difference) is based on a sample rather than the entire target population and is subject to sampling error. Therefore it is only an estimate of the true value in the target population. This concept of sampling variation was introduced in section 3, and we have seen how we can compute a confidence interval which tells us where the ‘true’ or ‘real’ measure of effect (e.g. the true odds ratio) is likely to lie. Chance is explored further in section 6, where we will look at how we can use p-values to assess the probability that an observed association has arisen by chance.
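
As a rough illustration of sampling error, the sketch below computes an odds ratio and an approximate 95% confidence interval from a 2x2 table using the log (Woolf) method. The function name and all counts are made up for illustration and are not taken from the course material.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf / log method).

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls (hypothetical layout).
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# Same proportions, very different sample sizes (invented counts).
small = odds_ratio_ci(12, 8, 8, 12)
large = odds_ratio_ci(1200, 800, 800, 1200)

for label, (or_, lower, upper) in (("small", small), ("large", large)):
    print(f"{label}: OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Both tables give the same odds ratio, but the interval for the small study is wide enough to include 1, so chance cannot be ruled out there, whereas the interval for the large study excludes 1.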

2
Q

Explain how reverse causation may lead to an apparent association between exposure and outcome that is not true.

A

Reverse causation is the concept by which, instead of the exposure causing the outcome, the outcome actually caused the exposure. Often the onset of a disease, even before a diagnosis has been made, may cause individuals to change their behaviour or lifestyle. For example, some cross-sectional and case-control studies have shown a positive association between being underweight and having cancer. But rather than being underweight being a cause of cancer, it may be that when people get cancer, the illness leads to them losing weight. To overcome this problem, it is important to ensure the exposure measurements relate to the period before the disease onset, allowing for the fact that there may be a preclinical stage of the disease and that the person may experience symptoms prior to a diagnosis being made. Reverse causation is primarily a problem in cross-sectional and case-control studies, although it can occur in cohort studies if the follow-up period is short.

3
Q

What is confounding?

A

Confounding occurs when the estimate of an exposure-outcome association is distorted by some other exposure which is related to both the outcome of interest and the exposure of interest. For example, if we find that coal miners are more likely to develop lung cancer than people in other occupations, rather than coal dust causing lung cancer, the observed association may actually be due to the fact that miners smoke more than the general population and that smoking is the true cause of the increased risk. In this example coal dust is the exposure, lung cancer is the outcome, and smoking is the possible confounder.

For a variable to be a confounder, it must be associated with the exposure of interest and it must be independently associated with the outcome.

Note, a confounder cannot be a factor which is on the causal pathway. For example, blood pressure would not be considered a confounder in the relationship between exercise (exposure) and heart attack (outcome). This is because blood pressure is actually involved in the mechanism by which exercise works to lower the risk of a heart attack - i.e. blood pressure is on the causal pathway between exercise and heart attack.

4
Q

How can confounders distort an association?

A

Usually confounders are investigated to determine whether an apparent association between exposure and disease is still present after allowing for confounding factors. However, confounders can work in any direction. Sometimes a confounder actually masks a true relation between exposure and disease: initially there is no apparent relation, but after allowing for the confounder an association appears. A confounder can even change the direction of an association completely: an apparent positive association between exposure and disease may become negative when the confounder is allowed for, or vice versa.
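
A small numerical sketch of a change in direction is given below. The counts are invented purely to show the arithmetic: within each level of a hypothetical confounder the exposed have a lower risk than the unexposed, yet the crude comparison suggests the opposite.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)

# Invented counts: (cases in exposed, total exposed,
#                   cases in unexposed, total unexposed)
confounder_absent = (10, 100, 150, 1000)
confounder_present = (400, 1000, 50, 100)

rr_absent = risk_ratio(*confounder_absent)
rr_present = risk_ratio(*confounder_present)

# Crude table: add the two strata together, ignoring the confounder.
crude = tuple(x + y for x, y in zip(confounder_absent, confounder_present))
rr_crude = risk_ratio(*crude)

print(f"confounder absent:  RR = {rr_absent:.2f}")
print(f"confounder present: RR = {rr_present:.2f}")
print(f"crude (ignoring confounder): RR = {rr_crude:.2f}")
```

The reversal arises because, in these made-up data, most exposed people have the confounder and most unexposed people do not, and the confounder itself strongly raises the risk of disease.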

5
Q

What can you do to avoid confounding?

A

Confounding can be dealt with in the following ways, either at the design stage or analysis stage:

1) Appropriate study design - Randomisation, Restriction, Matching
2) Analysis methods - Stratification, Multivariate analysis

6
Q

Describe randomisation as an appropriate study design to avoid confounding.

A

The most effective way to eliminate confounders (known and unknown) is by randomisation - i.e. perform a randomised controlled trial. Randomisation works because people are assigned to be either exposed or unexposed in an entirely random way. This ensures that, on average, all potential confounding factors (known and unknown) are distributed equally between the groups and therefore cannot distort the true association between exposure and outcome. In other words, randomisation breaks the link between confounder and exposure, and breaking one side of the confounding triangle (exposure, confounder, outcome) is enough to remove the confounding. It is for this reason that randomised controlled trials provide the strongest evidence for an association between exposure and outcome. However, it should be noted that for ethical reasons randomisation is not always possible - for example if our exposure of interest is smoking, we cannot randomise some people to smoke and others not to.
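
The balancing effect of random allocation can be sketched with a small simulation. Everything here is an assumption made for illustration: a hypothetical population of 10,000 people, about 30% of whom smoke, allocated to treatment or control by a coin flip that ignores smoking status.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

n = 10_000
# Hypothetical potential confounder: roughly 30% of people smoke.
smoker = [rng.random() < 0.30 for _ in range(n)]
# Random allocation: a coin flip that does not look at smoking status.
treated = [rng.random() < 0.50 for _ in range(n)]

def prevalence(flags):
    """Proportion of True values in a list of booleans."""
    return sum(flags) / len(flags)

smoking_in_treated = prevalence([s for s, t in zip(smoker, treated) if t])
smoking_in_control = prevalence([s for s, t in zip(smoker, treated) if not t])

print(f"smokers among treated: {smoking_in_treated:.3f}")
print(f"smokers among control: {smoking_in_control:.3f}")
```

The two proportions come out close to each other, so smoking is no longer associated with the exposure. The same would hold for a factor nobody thought to measure, which is why randomisation also deals with unknown confounders.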

7
Q

Describe restriction as an appropriate study design to avoid confounding.

A

Confounding can be avoided at the design stage by restricting the study group to only one level of the confounding variable. For example, in a study of occupational disease (see the previous mining example), one might restrict the study to non-smokers only to remove smoking as a confounding factor. This method of dealing with confounding is called restriction. The downside of restriction is that it limits the number of participants in the study and it also limits how generalisable the findings are.

8
Q

Describe matching as an appropriate study design to avoid confounding.

A

In case-control studies, cases and controls can be matched to one another according to potential confounding factors. For example, in a case-control study to determine whether coffee drinking causes carcinoma of the pancreas, to remove the potential confounding influence of smoking we find a case of cancer of the pancreas who smokes and compare this to a control (a person from the same base population who does not have cancer of the pancreas) who also smokes. It is important to note, however, that matched case-control studies have to be analysed using a particular statistical method whereby the matching factors are taken into account. If this is not done, the results from the study will be biased.

9
Q

Describe stratification as an appropriate analysis method to avoid confounding.

A

More commonly confounding is controlled in the analysis, although this relies on the relevant information on the confounding factors being collected accurately. Stratification is a widely used technique in which you carry out the analysis and compute a measure of effect separately for each level of the potential confounder. For example, suppose you carry out a case-control study to examine the association between coffee consumption and cancer of the pancreas. When the association is initially examined, the odds ratio is estimated to be 1.9. However, we might think that smoking is a potential confounding factor. Therefore we can separate subjects into smokers and non-smokers and see if the association is still there. If we reanalyse the data we may find that it supports the suggestion that smoking confounds the association between coffee consumption and cancer of the pancreas - and that there is actually no effect of coffee on cancer of the pancreas seen for smokers or non-smokers.
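
A minimal sketch of a stratified analysis is shown below. The counts are invented (they do not reproduce the odds ratio of 1.9 quoted above), and the Mantel-Haenszel pooled odds ratio is used here as one common way of combining strata; the card itself does not name a pooling method.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)

# Invented counts for coffee (exposure) and cancer of the pancreas (outcome),
# split by smoking status.
strata = {
    "smokers": (80, 20, 40, 10),
    "non-smokers": (10, 40, 20, 80),
}

# Crude table: add the strata together, ignoring smoking.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)

# Stratum-specific odds ratios.
stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}

def mantel_haenszel_or(tables):
    """Pooled odds ratio across strata (Mantel-Haenszel estimator)."""
    tables = list(tables)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

adjusted_or = mantel_haenszel_or(strata.values())

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"{name}: OR = {value:.2f}")
print(f"Mantel-Haenszel adjusted OR = {adjusted_or:.2f}")
```

In these made-up data the crude odds ratio is well above 1, but the odds ratio is 1 within each smoking stratum, so the crude association is explained entirely by smoking.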

10
Q

Describe multivariate analysis as an appropriate analysis method to avoid confounding.

A

In practice, researchers often want to allow for the potential confounding effects of many factors, not just one, and more sophisticated techniques have been developed to do this. Again, to use these methods, the information on the potential confounders needs to have been collected as part of the study design. These techniques effectively ‘adjust’ the estimated measure of effect, e.g. odds ratio, to control for the confounders.

11
Q

Which factors might be confounders?

A

This depends on the exposure and disease under investigation, and it is important to read the literature and identify risk factors for the disease prior to starting the study so that data on them can be collected. The following factors however are important determinants of most diseases and also associated with many exposures, so are a good starting point:

  • Age
  • Gender
  • Socioeconomic status
  • Diet
  • Ethnicity
  • Smoking habit
12
Q

What is bias?

A

Bias is a systematic error in an estimate and results in a measure of effect (e.g. odds ratio or mean difference) which can be either above or below the true value, depending on the nature of the systematic error.

13
Q

What is the difference between bias and confounding?

A

Bias is a consequence of a defect in the design or execution of an epidemiological study. Bias cannot be controlled for in the analysis of a study, whereas you can adjust for confounding if you have measured the confounding variable.

To make clear the difference between bias and confounding, consider two possible interpretations of an epidemiological study which reports finding an association between exposure and disease.

If the association results from bias, this means that the disease is not associated with the exposure in the population under study; the finding is simply wrong.

If the association results from confounding, this means that although the disease is associated with the exposure, some other factor (which is associated with exposure and independently associated with disease) can explain the association.

For example, consider a case-control study which finds an association between coffee drinking and bladder cancer, i.e. coffee drinkers are more likely to have bladder cancer.

If the association arose because of error in the exposure data such that cases with bladder cancer recalled their coffee drinking better than controls, this would be bias.

If smoking is associated with coffee drinking and also associated with an increased risk of bladder cancer, then the observed relation between coffee drinking and bladder cancer could be due to confounding by smoking status.

14
Q

What are the two main forms of bias in epidemiological studies?

A

There are two main forms of bias in epidemiological studies:

  • selection bias
  • information bias
15
Q

What is selection bias?

A

Selection bias arises from defects in the study design relating to how people are chosen to be in, or end up in, the study. One way this can occur is if the method the researchers use to sample or select the study population results in an unrepresentative population. If the criteria for inclusion in the study are related to the exposure of interest, and this differs between the diseased and non-diseased, then the estimate of effect will be biased. This is a particular issue in case-control studies.

For example, in a case-control study of the effect of smoking on lung cancer, if controls are selected from people who were suffering from non-malignant respiratory disease, this will not be a representative group of people without lung cancer. Since smoking is a cause of many non-malignant respiratory diseases, the prevalence of smoking in the controls would be higher than in the target population of possible controls without lung cancer. In consequence, the strength of the association between smoking and lung cancer would be underestimated.
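
The effect of a poorly chosen control group can be sketched numerically. The counts below are invented: the same cases are compared first with controls whose smoking prevalence matches the target population, then with controls drawn from respiratory patients, among whom smoking is more common.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)

# Invented counts: (smokers, non-smokers)
cases = (80, 20)
representative_controls = (30, 70)   # smoking prevalence as in target population
respiratory_controls = (60, 40)      # smoking over-represented

or_representative = odds_ratio(cases[0], cases[1], *representative_controls)
or_biased = odds_ratio(cases[0], cases[1], *respiratory_controls)

print(f"OR with representative controls: {or_representative:.2f}")
print(f"OR with respiratory-disease controls: {or_biased:.2f}")
```

Because the respiratory controls smoke more than the population they are meant to represent, the odds ratio is pulled towards 1 and the association is underestimated.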

Selection bias can also arise from non-response. Few studies achieve 100% participation from those sampled, and why people agree to take part can often be related to the exposure or disease of interest. People with the disease of interest may be more likely to respond because of their interest in the study results, or for some conditions, they may be less likely to take part because of poor health. In surveys, this can result in the prevalence of the disease being over- or underestimated. In studies that aim to determine an association between an exposure and disease, non-response will only result in a biased estimate of association (e.g. odds ratio) if there is a differential effect of exposure status upon the response rates of diseased and non-diseased individuals, or a differential effect of disease status upon the response rates of exposed and unexposed individuals. An example would be a case-control study of the association between social advantage and breast cancer, in which cases overall had a high response rate, but amongst the controls, those of lower social class tended to have a poorer response rate.

One more type of selection bias, which is an issue in cohort (longitudinal) studies, is loss to follow-up. In cohort studies you follow people over time. In an ideal world you would be able to keep everyone in the study; however, people drop out for a variety of reasons, which could possibly bias the results, particularly if the reason for leaving the study is related to the exposure or outcome of interest.

16
Q

What is information bias and how does it occur?

A

Information bias occurs when information is collected incorrectly or inaccurately, or in other words, when study subjects are misclassified according to their disease status, their exposure status, or both. Some reasons for this include problems with the equipment used to make the measurements, mistakes by researchers in recording data, or subjects giving incorrect information in interviews or on questionnaires.

Non-differential misclassification occurs when errors in classification of exposure status affect diseased and non-diseased equally, or when the errors in classification of disease status affect exposed and unexposed individuals equally. The effect of this type of misclassification is generally to underestimate the true strength of association between exposure and disease: that is, to bias the odds ratio towards unity (a value of one).

For example, measurements such as height or weight inevitably involve some degree of random error, which means that some heights and weights will be underestimated and some overestimated. This results in an underestimation (odds ratio closer to one than it should be) of an association between exposures such as height and weight and an outcome or disease.

Differential misclassification occurs when errors in classification of disease status are dependent upon exposure status or vice versa.

For example, in a case-control study, cases' recall of their past exposure to risk factors may differ from the recall of the controls. This differential misclassification can bias the estimates of association in either direction, and hence can be responsible for associations which prove to be spurious.

Three common types of information bias are recall bias, reporting bias and observer (or interviewer) bias.

Recall bias is especially a problem in case-control studies, when cases may recall their exposure more completely than controls. In a study of congenital malformations and exposures to drugs in utero, mothers of malformed infants may be more highly motivated to recall use of a drug in early pregnancy than control mothers.

Reporting bias occurs when individuals with a disease are more likely to report exposure (especially if they know or can guess what association the researcher is interested in). For example, parents of children with asthma may be more likely to report living on a busy road (because they expect some association between traffic pollution and their child’s condition) than parents of healthy children.

Observer bias is when the interviewer or researcher knows who the diseased and non-diseased people are and collects exposure information differently for them (or vice versa). For example, in a case-control study of occupational exposure to metal dust, the interviewer may probe the cases about their past exposures more than the controls. Or in a cohort study of treatments, the researchers may monitor the treated people more carefully than the untreated and hence record the outcomes more readily in the treated group.

17
Q

Regarding information bias what is non-differential misclassification?

A

Non-differential misclassification occurs when errors in classification of exposure affect diseased and non-diseased equally, or when the errors in classification of disease status affect exposed and non-exposed individuals equally.

The effect of this type of misclassification is generally to underestimate the true strength of association between exposure and disease: that is, to bias the odds ratio towards unity (a value of one).

For example, measurements such as height or weight inevitably involve some degree of random error, which means that some heights and weights will be underestimated and some overestimated. This results in an underestimation (odds ratio closer to one than it should be) of an association between exposures such as height and weight and an outcome or disease.
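
The pull towards one can be sketched with expected counts. The table and the error rates below are invented for illustration: a binary exposure is measured with the same sensitivity (0.8) and specificity (0.9) in cases and controls, which is what makes the error non-differential.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)

def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts when exposure is measured imperfectly."""
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed

# Invented true counts: (exposed, unexposed)
true_cases = (60, 40)
true_controls = (30, 70)
true_or = odds_ratio(*true_cases, *true_controls)

# Same error rates in both groups: non-differential misclassification.
obs_cases = misclassify(*true_cases, sensitivity=0.8, specificity=0.9)
obs_controls = misclassify(*true_controls, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(f"true OR = {true_or:.2f}")
print(f"observed OR = {observed_or:.2f}")
```

The observed odds ratio still points in the right direction, but it lies closer to one than the true value.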

18
Q

Regarding information bias what is differential misclassification?

A

Differential misclassification occurs when errors in classification of disease status are dependent upon exposure status or vice versa.

For example, in a case-control study, cases' recall of their past exposure to risk factors may differ from the recall of the controls. This differential misclassification can bias the estimates of association in either direction, and hence can be responsible for associations which prove to be spurious.
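
A sketch of how differential recall can create a spurious association is given below. The counts and recall rates are invented: there is no true association (odds ratio of 1), but exposed cases and exposed controls report their exposure with different sensitivity.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)

def observed_counts(exposed, unexposed, sensitivity):
    """Expected reported counts when only a fraction of the truly exposed
    report their exposure (no false reports by the unexposed)."""
    reported = exposed * sensitivity
    return reported, unexposed + (exposed - reported)

# Invented true counts with no association: (exposed, unexposed)
true_cases = (30, 70)
true_controls = (30, 70)
true_or = odds_ratio(*true_cases, *true_controls)

# Cases recall better than controls: bias away from one.
or_cases_recall_better = odds_ratio(
    *observed_counts(*true_cases, sensitivity=0.9),
    *observed_counts(*true_controls, sensitivity=0.6),
)

# Controls recall better than cases: bias in the other direction.
or_controls_recall_better = odds_ratio(
    *observed_counts(*true_cases, sensitivity=0.6),
    *observed_counts(*true_controls, sensitivity=0.9),
)

print(f"true OR = {true_or:.2f}")
print(f"cases recall better: OR = {or_cases_recall_better:.2f}")
print(f"controls recall better: OR = {or_controls_recall_better:.2f}")
```

Depending on which group recalls better, the observed odds ratio moves above or below one even though the true value is exactly one.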

19
Q

Describe the three common types of information bias.

A

Three common types of information bias are recall bias, reporting bias and observer (or interviewer) bias.

Recall bias is especially a problem in case-control studies, when cases may recall their exposure more completely than controls. In a study of congenital malformations and exposures to drugs in utero, mothers of malformed infants may be more highly motivated to recall use of a drug in early pregnancy than control mothers.

Reporting bias occurs when individuals with a disease are more likely to report exposure (especially if they know or can guess what association the researcher is interested in). For example, parents of children with asthma may be more likely to report living on a busy road (because they suspect some association between traffic pollution and their child’s condition) than parents of healthy children.

Observer bias is when the interviewer or researcher knows who the diseased and non-diseased people are and collects exposure information differently for them (or vice versa). For example, in a case-control study of occupational exposure to metal dust, the interviewer may probe the cases about their past exposure more than the controls. Or in a cohort study of treatments, the researchers may monitor the treated people more carefully than the untreated and hence record the outcomes more readily in the treated group.

20
Q

Describe how you would go about dealing with bias.

A

It is important to think about potential biases while designing the study in order to try to avoid them. In particular, think about how the study population is going to be defined and chosen. In addition, researchers should use standard methods throughout the study for collecting exposure and outcome variables and, where possible, use objective measures of exposure and outcome.

It is not possible to control for bias in the analysis. The only solution is to try to identify possible sources of bias, and to collect information which helps to establish the likely extent and direction of the bias. For instance, in a survey it is often possible to collect some baseline information about the demographic characteristics of non-responders, who can then be compared with the responders to determine whether they differ systematically by social class, age etc. Sometimes non-responders can be followed up for mortality.

21
Q

Give a brief summary of bias.

A

There are two main types of bias:

1) Selection bias, which arises from differences between those people included in a study and those not.
2) Information bias, which arises from error in individual measurements of exposure or disease and which can be either differential or non-differential.

To avoid bias, careful design and good conduct throughout the duration of the study are required.

When reading or conducting a study it is important to identify what biases may have occurred and how they may affect the findings.

22
Q

What are the four possible explanations for a relationship between exposure and outcome other than it being true (a causal association)?

A

Other than the truth, there are four possible alternative explanations:

1) Confounding
2) Bias
3) Chance (sampling variation)
4) Reverse causation

23
Q

What is information bias?

A

Information bias refers to bias and misclassification arising from measurement error as a result of instrument error, subjects providing the wrong information, or researchers recording the wrong information.

24
Q

What is selection bias?

A

Selection bias is the selection of individuals, groups or data for analysis in such a way that proper randomisation is not achieved, with the result that the sample is not representative of the population intended to be analysed.

25
Q

What are the different types of bias?

A

Information / measurement bias - caused by errors in recording data, reporting and recall bias (inadvertent false responses given by respondents) and observer / investigator bias (leading questions etc). May be differential (may underestimate or overestimate the effect) or non-differential (tends to underestimate the effect).

Selection (sampling) bias - caused by non-random selection, dropout or non-response.