Bias and confounding Flashcards
Bias - definition, types (3)
Any systematic error in the design, conduct or analysis of a study that results in a mistaken estimate of an exposure's effect on the risk of disease. Impact is on internal validity.
Types:
- Selection bias
- Information bias
- Confounding bias
Internal validity - definition
How well the findings from a study depict the situation in the source population. An internally valid study is one in which selection and information bias have been prevented (in the design phase), and confounding bias has been prevented or controlled for in the analysis.
External validity - definition
How well the results of a study can be extrapolated/inferred beyond the source population (i.e. to the target population).
Selection bias - definition, effect, examples (6)
Occurs when the composition of the study population differs from that of the source population with respect to the distribution of exposure/outcome. The effect is that the association between exposure and outcome among those selected for analysis differs from the association among those in the source population.
Arises due to factors affecting selection/participation/retention of study subjects:
- Non-response - association between disease and exposure in the responders differs to that in non-responders, hence association in study population differs to that in source population. e.g. non-response could be an indicator of management, feeding or housing differences that could relate to both the exposure and the outcome
- Selective entry/survival bias - humans/animals that are available to participate in a study may differ to source population e.g. healthy worker effect, assessment of calving to conception rates as measure of fertility (excludes cows which did not get pregnant)
- Detection bias - probability of detecting disease differs by exposure status (case-control studies only; in cohort study this is best viewed as information bias)
- Admission risk bias (Berkson’s bias) - occurs when both exposure and outcome are related to the risk of being in the source population (e.g. hospitalization rates higher in people with cancer and cardiovascular disease)
- Loss to follow-up bias - differential loss to follow-up (LTF) related to the exposure and disease status
- Missing data - causes bias if lost in a way that is related to the exposure and disease status (does not cause bias if it is random with regard to the exposure and disease status)
Selection bias - detection
1. Compare response rates/LTF between exposed/unexposed (cohort) and diseased/non-diseased (case-control)
2. Compare characteristics of responders/non-responders or retained/LTF (as data permits)
3. Ask: does exposure to a factor increase the likelihood of a subject being in the study?
4. Ask: is the likelihood of inclusion of cases and controls in the study population the same?
- Ascertain extent of non-response/LTF in each group (exposed/unexposed, cases/controls)
- Consider whether subjects enrolled in the study are just those that currently have the exposure/condition vs those that ever had it e.g. horses which have ever raced vs currently racing (the latter selects for healthier animals)
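Why the detection questions matter can be shown with a little arithmetic: if the chance of ending up in the study differs across the four exposure/outcome cells, the odds ratio among those selected departs from the odds ratio in the source population. The sketch below is only an illustration; the counts, the selection probabilities and the function names are made-up assumptions, not data from a real study.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    return (a * d) / (b * c)


def apply_selection(table, selection_probs):
    """Expected cell counts when each cell is sampled with its own probability."""
    return tuple(n * p for n, p in zip(table, selection_probs))


# Hypothetical source population with no true association (OR = 1).
source = (100, 100, 100, 100)

# Assumed selection probabilities (a, b, c, d order).
# Exposed cases are more likely to take part than anyone else.
differential = (0.9, 0.5, 0.5, 0.5)
# Participation depends on exposure only, not on the outcome.
exposure_only = (0.9, 0.9, 0.5, 0.5)

or_source = odds_ratio(*source)
or_differential = odds_ratio(*apply_selection(source, differential))
or_exposure_only = odds_ratio(*apply_selection(source, exposure_only))

print(f"Source population OR: {or_source:.2f}")
print(f"OR when selection relates to exposure and outcome: {or_differential:.2f}")
print(f"OR when selection relates to exposure only: {or_exposure_only:.2f}")
```

In this toy example the odds ratio is distorted only when selection depends jointly on exposure and outcome; selection that depends on exposure alone leaves the odds ratio unchanged.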
Selection bias - control (3)
Control/prevention only possible in the design phase:
- Don’t mention exposure and outcome in recruitment materials
- Blind recruiter to study hypotheses
- Cohort: stay in touch with study subjects, if they leave identify reason
Information bias - definition, examples (2)
Occurs when subjects are incorrectly classified with respect to their exposure/outcome status. Includes both misclassification bias (categorical data) and measurement error (continuous data). Recall bias is a common reason for misclassification bias.
Important to distinguish:
- Non-differential (random) misclassification of outcome/exposure - generally biases results toward the null (no association)
- Differential (non-random) misclassification of outcome/exposure - biases results in either direction
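The pull toward the null under non-differential misclassification can be checked numerically. The sketch below takes a hypothetical case-control table and misclassifies exposure with the same sensitivity and specificity in cases and controls; all counts, the sensitivity/specificity values and the function names are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts when exposure is measured imperfectly."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


# Hypothetical true counts.
cases_exposed, cases_unexposed = 200, 100
controls_exposed, controls_unexposed = 100, 200

# Non-differential: the same error rates apply to cases and controls.
sensitivity, specificity = 0.8, 0.9

obs_cases = misclassify_exposure(cases_exposed, cases_unexposed,
                                 sensitivity, specificity)
obs_controls = misclassify_exposure(controls_exposed, controls_unexposed,
                                    sensitivity, specificity)

true_or = odds_ratio(cases_exposed, controls_exposed,
                     cases_unexposed, controls_unexposed)
observed_or = odds_ratio(obs_cases[0], obs_controls[0],
                         obs_cases[1], obs_controls[1])

print(f"True OR: {true_or:.2f}")
print(f"Observed OR: {observed_or:.2f}")
```

With these assumed numbers the observed odds ratio sits between 1 and the true value. Giving cases and controls different error rates (differential misclassification) can push the estimate either way.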
Information bias - control/prevention (5)
Control/prevention only possible in the design phase:
- Use best possible (validated) instrument/tool/test to collect information
- Use objective measures/avoid self-reported information/seek confirmation of information e.g. lab tests, medical records
- Standardize assessment procedures between groups - use clear and explicit guidelines (e.g. definitions)
- Blinding - person doing data collection should be blinded to exposure/outcome status of subject
Confounding - definition
Occurs when the observed association between the exposure and outcome of interest is actually due to another factor or factors.
Confounders are:
- Risk factors for the outcome
- Associated with the exposure
- Not on the causal pathway between the exposure and the outcome
Confounding - detection
- Stratification: confounding is said to exist when there is a considerable (>20-30%) difference between the crude and the adjusted (stratified) measures of association.
- Regression model-based approaches:
Confounding - control/prevention (4)
Design phase:
- Randomization - in theory, groups are balanced with respect to potential confounders (must check)
- Restricting - enrol only 1 level of a factor e.g. only one breed, one gender
- Matching - enrol subjects in such a way that extraneous factors are balanced across groups e.g. matching on breed, age, sex, parity, lactation status. Note: Matching can introduce a selection bias into case-control studies (if a factor is strongly associated with exposure [as is the case for a confounding variable] and we match on that factor, then we alter the distribution of the exposure in cases and controls). Requires
Analysis phase:
- Stratification - stratify data based on the potential confounder; if stratum-specific measures are approximately equal, then obtain an adjusted odds ratio (e.g. Mantel-Haenszel procedure - OR is a weighted average across strata)
- Multivariate analysis
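A minimal sketch of the stratification approach, assuming two hypothetical strata of a potential confounder: it computes the crude odds ratio, the stratum-specific odds ratios, and the Mantel-Haenszel pooled odds ratio, then compares crude and adjusted values. The counts are invented for illustration, and expressing the difference relative to the adjusted estimate is one common convention, assumed here.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical data: one (a, b, c, d) table per level of the confounder.
strata = [
    (10, 90, 10, 180),  # confounder absent
    (80, 20, 40, 20),   # confounder present
]

# Collapsing over strata gives the crude table.
crude_table = tuple(sum(cells) for cells in zip(*strata))

crude_or = odds_ratio(*crude_table)
stratum_ors = [odds_ratio(*table) for table in strata]
adjusted_or = mantel_haenszel_or(strata)

# Difference between crude and adjusted, relative to the adjusted estimate.
percent_difference = abs(crude_or - adjusted_or) / adjusted_or * 100

print(f"Crude OR: {crude_or:.2f}")
print("Stratum-specific ORs:", [round(x, 2) for x in stratum_ors])
print(f"Mantel-Haenszel OR: {adjusted_or:.2f}")
print(f"Crude vs adjusted difference: {percent_difference:.0f}%")
```

Here the stratum-specific odds ratios agree with each other but differ markedly from the crude value, which is the pattern expected when the stratifying factor is a confounder rather than an effect modifier.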
Interaction - definition, implications (2)
Occurs when the incidence of disease in the presence of two or more risk factors differs from the incidence expected to result from their individual effects. In other words, the joint effect of the 2 factors is not what would be predicted based on the singular effects of each factor (the effects of one depends on the other).
Implications of interaction:
- Synergism increases disease risk above that expected; persons with one exposure are more susceptible to another exposure
- Antagonism decreases disease risk below that expected; persons with one exposure are less susceptible to another
Interaction - models (2)
- Additive model: interaction exists when the AR (attributable risk) related to A alone and the AR related to B alone do not sum to the AR related to AB (expected incidence for AB = incidence with A + incidence with B - baseline incidence)
- Multiplicative model: interaction exists when the RR of A alone multiplied by the RR of B alone does not equal the RR of AB (to get the expected incidence for AB: divide the incidence with A and the incidence with B by baseline to get each RR, multiply the two RRs, then multiply by baseline)
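The two expected-incidence rules above can be worked through with a small example. The incidences below (cases per 1000) are hypothetical, chosen so that the same observed joint incidence looks like synergism on the additive scale but shows no interaction on the multiplicative scale; the function and variable names are illustrative.

```python
def expected_additive(baseline, incidence_a, incidence_b):
    """Expected joint incidence if the attributable risks simply add."""
    return incidence_a + incidence_b - baseline


def expected_multiplicative(baseline, incidence_a, incidence_b):
    """Expected joint incidence if the relative risks simply multiply."""
    rr_a = incidence_a / baseline
    rr_b = incidence_b / baseline
    return rr_a * rr_b * baseline


# Hypothetical incidences per 1000.
baseline = 10.0      # neither factor
incidence_a = 30.0   # factor A only
incidence_b = 50.0   # factor B only
observed_ab = 150.0  # both factors

additive = expected_additive(baseline, incidence_a, incidence_b)
multiplicative = expected_multiplicative(baseline, incidence_a, incidence_b)

print(f"Expected under additive model: {additive:.0f} per 1000")
print(f"Expected under multiplicative model: {multiplicative:.0f} per 1000")
print(f"Observed with both factors: {observed_ab:.0f} per 1000")
```

Whether interaction is judged present therefore depends on the scale chosen, so the model used should be stated alongside the conclusion.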