Confounding Flashcards
What is Confounding?
–Confounding is bias in the estimation of the effect of exposure on disease occurrence, due to a lack of comparability between the exposed and unexposed populations.
–Occurs when the substitute population is NOT equivalent to the counterfactual condition.
(i.e., the substitute population does not show the outcome that the exposed population would have had WITHOUT the exposure.)
How do we Identify and Control for Empirical Manifestations of Confounding?
Practically, there is no empirical method for directly examining the correctness of the comparability assumption.
Instead, we search for differences in the distributions of risk factors among the exposure groups; those risk factors = confounding variables
Interaction
Example : smoking, lung cancer, and asbestos
Occurs when the association between the exposure and the disease varies by levels of a third factor
i.e., when the magnitude of the effect is “modified” by varying levels of a 3rd factor
Ex: the association btwn smoking + lung cancer varies by exposure to asbestos.
–(risk is much higher when smoking + asbestos exposures are present together)
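As a rough illustration, the smoking/lung cancer risk ratio can be computed separately within each asbestos stratum; stratum-specific ratios that differ are the signature of interaction. All counts below are made up for the sketch and do not come from any study.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk ratio = risk in exposed / risk in unexposed."""
    return (cases_exp * n_unexp) / (cases_unexp * n_exp)

# Hypothetical counts: lung cancer cases among smokers (exposed) and
# non-smokers (unexposed), split by the third factor (asbestos).
strata = {
    "no asbestos": dict(cases_exp=10, n_exp=1000, cases_unexp=1, n_unexp=1000),
    "asbestos": dict(cases_exp=90, n_exp=1000, cases_unexp=5, n_unexp=1000),
}

rr_by_stratum = {name: risk_ratio(**counts) for name, counts in strata.items()}
for name, rr in rr_by_stratum.items():
    print(f"{name}: RR for smoking = {rr:.1f}")
```

With these invented numbers the effect of smoking is larger among the asbestos-exposed, so the association varies by level of the third factor.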
Example of Confounding
smoking, lung cancer, and age
If smokers are older than non-smokers, how would we know whether the observed association between smoking + lung cancer is due to smoking or to age?
(if we have adequately measured confounders in all subjects, we can correct or control for their distorting effect in the analysis)
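A small sketch of the smoking/age example, using invented counts: smokers are concentrated in the older stratum, so the crude risk ratio mixes the effect of smoking with the effect of age, while the age-specific risk ratios do not. The numbers and variable names are hypothetical.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk ratio = risk in exposed / risk in unexposed."""
    return (cases_exp * n_unexp) / (cases_unexp * n_exp)

# Hypothetical counts per age stratum:
# (cases among smokers, smokers, cases among non-smokers, non-smokers)
age_strata = {
    "older": (80, 800, 10, 200),    # most smokers are older
    "younger": (4, 200, 8, 800),
}

# Age-specific risk ratios (age held fixed within each stratum).
stratum_rr = {age: risk_ratio(*counts) for age, counts in age_strata.items()}

# Crude risk ratio: collapse over age by summing each column.
totals = [sum(col) for col in zip(*age_strata.values())]
crude_rr = risk_ratio(*totals)

print("stratum-specific RR:", stratum_rr)
print(f"crude RR: {crude_rr:.2f}")
```

Here both age-specific risk ratios are 2, but the crude risk ratio is well above 2 because age is distributed unequally between smokers and non-smokers.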
Comparability-based confounding
The observed value of the outcome measure in the exposed group is compared with the expected value that would have been observed in the exposed group if it had not been exposed.
The unexposed group’s actual outcome is used as a proxy for the exposed group’s unobserved (counterfactual) value. When the unexposed group’s outcome differs from that counterfactual value, the groups are noncomparable
=confounding occurs
Collapsibility-based confounding
Confounding is a failure of the estimate for an adjusted effect parameter to equal the estimate for the crude parameter that is obtained when a covariate is ignored (collapsed).
Confounding is equated with non-collapsibility (or change in the estimated parameter)
Properties of a Confounder
- A confounder must be a cause of the disease (or a marker for a cause)
- A confounder must be associated with the exposure in the base population
- A confounder must not be affected by the exposure or the disease
Property 1:
A confounder must be a cause of the disease
Meaning?
Covariate must be a risk factor for the disease in the unexposed base population.
OR, it can be a marker for another, often unmeasured risk factor.
Meaning: the association can be observed in the unexposed group in both cohort and case-control studies.
C is not a confounder if:
- -its association with D is due to chance/bias
- -association is due to the effect of D on C
- -the effect of C on D in the base pop. is not independent of the exposure
Property 2:
A confounder must be associated with the exposure in the base population
Covariate must be associated with exposure status in the total base population.
- -Cohort study: association can be observed in the total sample.
- -Case-control study: association can be observed in controls, assuming that it reflects the association in the base population.
- C-E association should be known prior to study, or else we may have to assume that the observed C-E association reflects this association in the base population
Property 1 and 2 depend on:
–Prior knowledge of covariate associations or effects in the base population, which may conflict with associations observed in the data
Property 3:
A confounder must not be affected by the exposure or the disease
C is not a confounder if its association with E in the base population is due entirely to the effect of E on C –even if C is a proxy/risk factor for D
C is not a confounder if:
- C is an intermediate variable in the causal pathway between E + D
- both C + D are affected by the same unmeasured risk factors and C is affected by E
- both C and D are affected by another unmeasured risk factor
*often requires prior knowledge
Positive v Negative Confounding
When X is a Risk Factor:
- -If Ba*Bb > 0, bias is away from the null (positive)
- -If Ba*Bb < 0, bias is toward the null (negative)
When X is a Protective Factor:
- -If Ba*Bb > 0, bias is toward the null (negative)
- -If Ba*Bb < 0, bias is away from the null (positive)
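Positive confounding: crude OR/RR is farther from the null than the adjusted OR/RR (for a risk factor: crude > adjusted)
Negative confounding: crude OR/RR is closer to the null than the adjusted OR/RR (for a risk factor: crude < adjusted)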
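One way to make the direction rule concrete is to compare how far the crude and adjusted ratios sit from the null value of 1 on the log scale. This is only a sketch: the function name and the handling of estimates on opposite sides of the null are assumptions made for illustration.

```python
import math

def confounding_direction(crude, adjusted):
    """Classify confounding by comparing distance from the null (ratio = 1)."""
    if (crude - 1) * (adjusted - 1) < 0:
        # Crude and adjusted estimates fall on opposite sides of the null.
        return "crosses the null"
    crude_dist = abs(math.log(crude))
    adjusted_dist = abs(math.log(adjusted))
    if math.isclose(crude_dist, adjusted_dist):
        return "no confounding"
    if crude_dist > adjusted_dist:
        return "positive"   # bias away from the null
    return "negative"       # bias toward the null

print(confounding_direction(3.0, 2.0))  # risk factor, crude exaggerated
print(confounding_direction(0.8, 0.6))  # protective factor, crude attenuated
```

Working on the log scale treats a risk factor (ratio above 1) and a protective factor (ratio below 1) symmetrically.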
Stratification Methods:
Mantel-Haenszel method
- Stratify data by the level (i) of a confounder
- Calculate a weighted average of the stratum specific estimates (adjusting)
- Compare crude vs. adj estimates
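The three steps above can be sketched for the odds ratio with the standard Mantel-Haenszel estimator, OR_MH = sum(a_i*d_i/n_i) / sum(b_i*c_i/n_i). The 2x2 counts are hypothetical, and the cell labels (a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls) are a convention chosen for this sketch.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Weighted average of stratum-specific ORs over 2x2 tables (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Step 1: data stratified by level i of the confounder (hypothetical counts).
tables = [
    (40, 10, 20, 10),   # stratum 1
    (10, 20, 10, 40),   # stratum 2
]

# Step 2: adjusted (weighted-average) estimate.
adjusted_or = mantel_haenszel_or(tables)

# Step 3: crude estimate from the collapsed table, for comparison.
crude_or = odds_ratio(*(sum(col) for col in zip(*tables)))

print(f"crude OR = {crude_or:.2f}, MH-adjusted OR = {adjusted_or:.2f}")
```

In this made-up data each stratum has an odds ratio of 2, the adjusted estimate is 2, and the crude estimate is larger.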
Advantages and Disadvantages of Stratification Methods
Advantage:
-easy to understand and compute
Disadvantage:
- cannot handle a large number of variables (problematic if there are sparse data in some strata or too many variables to adjust for)
- each calculation requires a rearrangement of tables
- limited to categorical confounders
How to Identify a Confounder
- Prior knowledge
- Change-In-Estimate strategy
Collapsibility-based confounding:
–when the estimate for adjusted effect measure does not equal the estimate for crude effect measure (which is obtained when covariate is ignored/collapsed)
Identifying a Confounder: Prior Knowledge
Prior knowledge of a causal relationship from previous empirical studies, biologic plausibility, theories/models, or existing DAGs
Identifying a Confounder: Change in Estimate Strategy
A crude effect estimate is compared to the adjusted effect estimate.
- Stratify effect estimates by variable
- If the difference after adjustment is >= 10%, then the variable in question is a confounder
* provided stratum-specific measures are homogeneous
Magnitude of confounding = (crude - adjusted) / adjusted
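A minimal sketch of the change-in-estimate check, assuming the crude and adjusted estimates have already been computed and that the stratum-specific measures are homogeneous. The function names are hypothetical; the 10% cut-off is the one stated in the card.

```python
def magnitude_of_confounding(crude, adjusted):
    """Relative change in the estimate: (crude - adjusted) / adjusted."""
    return (crude - adjusted) / adjusted

def is_confounder(crude, adjusted, threshold=0.10):
    """Flag the variable if adjustment changes the estimate by >= threshold."""
    return abs(magnitude_of_confounding(crude, adjusted)) >= threshold

# Hypothetical estimates
print(magnitude_of_confounding(2.5, 2.0))  # 25% change
print(is_confounder(2.5, 2.0))             # meets the 10% rule
print(is_confounder(2.1, 2.0))             # about 5% change, does not
```

The absolute value is taken so that the rule applies whether adjustment moves the estimate up or down.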
Methods of Controlling for Confounding
- Design and conduct of a study (randomization, matching)
- Analytic methods of adjustment
(DAG’s)
Controlling for Confounding:
Design and Conduct of a Study
- Randomization (experiments):
- -controls, on average, for all confounders, including those that are unmeasured or unrecognized.
- -since there is no guarantee that randomization has eliminated all confounding, especially in small samples, other options are also used to control for confounding.
- Restriction: restricting the eligibility of subjects according to values of potential confounders (in any type of study).
- -commonly used in observational studies to control for known confounders.
Matching
Method used to control for confounders in observational studies
- Restrict eligibility of unexposed subjects by making them similar/comparable to exposed subjects with respect to matching variables (confounders)
Matching in Cohort Studies
- -Unexposed subjects are matched to exposed subjects
- -used to prevent confounding due to the matching factors
- -if there is no source of bias other than confounding by the matching factors, statistical adjustment (modeling) for these factors may be unnecessary to remove bias
Matching in Case-Control Studies
Matching is more common in case-control studies, because a relative shortage of cases is more likely than a shortage of controls
Controls are matched to the cases
- -used to increase statistical efficiency when a subsequent procedure (stratification) is used to adjust for confounding, but the matching itself introduces bias
- -statistical adjustment/modeling for the matching factors may be necessary to remove bias even if they were not originally confounders
Types of Matching:
Individual matching
Frequency matching
Individual Matching
- –One or more unexposed subjects are selected separately for each exposed subject
- –such that each set of unexposed subjects is made similar to the corresponding exposed subject on one or more matching variables
Methods for selecting Comparison (Unexposed) Subjects
- Category matching: each set of comparison subjects is selected (preferably randomly) from the same matching category (stratum) to which the exposed subjects belong (ex white males aged 30-34)
- Caliper matching: each set of comparison subjects is selected to have values on the matching variable that are close to the corresponding value of the index subject
–fixed caliper: the tolerance for eligibility is the same for all matched sets
–variable caliper: tolerance for eligibility varies among matched sets
(practical -can avoid not getting a match for certain exposed subjects, or can get the best possible match)
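Fixed-caliper matching can be sketched as a greedy 1:1 search: for each exposed (index) subject, pick the closest still-available unexposed subject whose value lies within the caliper. The greedy strategy, the function name, and the ages below are all assumptions for illustration; real studies may use other selection rules.

```python
def caliper_match(exposed, unexposed, caliper):
    """Greedy 1:1 fixed-caliper matching.

    exposed, unexposed: dicts mapping subject id -> matching-variable value.
    Returns a dict: exposed id -> matched unexposed id (None if no match).
    """
    available = dict(unexposed)
    matches = {}
    for exp_id, exp_value in exposed.items():
        # Eligible comparison subjects: within the caliper of the index subject.
        candidates = [
            (abs(value - exp_value), unexp_id)
            for unexp_id, value in available.items()
            if abs(value - exp_value) <= caliper
        ]
        if not candidates:
            matches[exp_id] = None          # no eligible match at this caliper
            continue
        _, best_id = min(candidates)        # closest eligible subject
        matches[exp_id] = best_id
        del available[best_id]              # each unexposed subject used once
    return matches

# Hypothetical ages
exposed_ages = {"E1": 34, "E2": 52, "E3": 70}
unexposed_ages = {"U1": 33, "U2": 50, "U3": 55, "U4": 90}
matches = caliper_match(exposed_ages, unexposed_ages, caliper=3)
print(matches)
```

E3 is left unmatched at a caliper of 3 years, which is the situation a variable caliper is meant to avoid.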
Frequency Matching
The total comparison group is selected in such a way as to make the joint distribution of one or more matching variables in the unexposed and exposed groups similar
Unexposed subjects are not selected until after all exposed subjects are identified
Ex: freq. matching on age and sex
If 20% of cases are females age 50-54, then controls are selected so that 20% are also females aged 50-54. Must wait for cases to accumulate before controls are selected
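The age-and-sex example can be sketched as a quota calculation: tabulate the joint distribution of the matching variables among the accumulated cases, then apply the same proportions to the planned number of controls. The function name, the strata, and the sample sizes are hypothetical.

```python
from collections import Counter

def control_quota(case_strata, n_controls):
    """Number of controls to select per stratum so that the controls'
    joint distribution mirrors that of the cases."""
    counts = Counter(case_strata)
    n_cases = len(case_strata)
    # Rounding means the quotas may not sum exactly to n_controls.
    return {stratum: round(n_controls * count / n_cases)
            for stratum, count in counts.items()}

# Hypothetical cases, each labelled with its (sex, age band) stratum.
case_strata = (
    [("F", "50-54")] * 20
    + [("M", "50-54")] * 30
    + [("F", "55-59")] * 50
)
quota = control_quota(case_strata, n_controls=200)
print(quota)
```

Because the quotas depend on the cases' distribution, they cannot be fixed until the cases have been identified, which is why control selection has to wait.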
Why Match?
In a case-control study, matching on a variable generally introduces selection bias, which must be controlled in data analysis–even if matching factor is not a risk factor for the disease
–so it’s not the matching but the analysis that controls for confounding
In a cohort study, matching does not usually introduce bias, but it reduces confounding by the matching variables.
Adjustment for matching factors is unnecessary if there are no other biases
Matching + Small Sample Size
We use matching to control for confounding more effectively when sample size is small
–without matching, controlling for confounding in analysis will result in many strata with sparse data
–by balancing the distribution of a matching variable across strata, the effect estimates will be more stable
(smaller standard errors, narrower CI)
Reasons for Matching in Case-Control Studies
The major advantage of matching in case-control studies is to make the adjusted estimate of effect more precise for a given sample size
This gain in precision occurs when the matching variable is associated with both exposure and disease in the base population, so we must control for M as a confounder
The major statistical reason for matching is therefore not to control for confounders (done in analysis) but to control for confounding more efficiently than if we had not matched
You need BALANCED STRATA
Matching and Selection Bias in Case Control Studies
In case control studies, selection bias introduced by the matching process can occur whether or not there is confounding by the matched factors in the base population
If there is confounding in the base population, the process of matching will superimpose a selection bias over the initial confounding
Bias generally toward the null, because matching selects controls who are more like cases with respect to exposure than randomly selected controls
Why matching in a case control study introduces bias
If controls are selected to match the cases on a factor that is correlated with the exposure, then the crude exposure frequency in controls will be distorted in direction of similarity to cases
Matched controls are identical to cases with respect to the matched variables. If the matching variable were perfectly correlated with the exposure, the exposure distribution of controls would be identical to that of the cases, therefore crude OR would = 1
So in a case-control study, matching by itself usually doesn’t control confounding, but adds a selection bias on top of any existing confounding; the analysis must then adjust for the matching factors
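A worked sketch of this bias, using expected counts in an invented base population: the matching factor M is strongly associated with exposure but is not a risk factor for disease, and the true exposure odds ratio is 2 within each level of M. All numbers are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Exposure prevalence in the base population by level of M (hypothetical).
exposure_prev = {1: 0.8, 0: 0.2}
# Cases by level of M: (exposed cases, unexposed cases); OR = 2 within each level.
cases = {1: (16, 2), 0: (4, 8)}

# 1:1 matching on M: controls copy the cases' M distribution, and within each
# level of M their expected exposure prevalence equals the base population's.
matched_tables = []
for m, (exp_cases, unexp_cases) in cases.items():
    n_controls = exp_cases + unexp_cases
    exp_controls = n_controls * exposure_prev[m]
    matched_tables.append(
        (exp_cases, unexp_cases, exp_controls, n_controls - exp_controls)
    )

# Crude analysis ignores the matching; stratified analysis adjusts for M.
crude_matched_or = odds_ratio(*(sum(col) for col in zip(*matched_tables)))
stratified_or = mantel_haenszel_or(matched_tables)

print(f"crude OR in matched data = {crude_matched_or:.2f}")
print(f"OR stratified on M       = {stratified_or:.2f}")
```

In this sketch the crude odds ratio from the matched data falls between 1 and 2 (biased toward the null), while stratifying on M recovers the odds ratio of 2.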
How to Maximize Efficiency of Matching
In Cohort Studies - the potential gain in size efficiency from matching is usually greater when the association between the matching variables and exposure status is stronger (because unexposed are matched to exposed)
In Case Control Studies - the potential gain in size efficiency from matching is usually greater when the association btwn matching variable + disease is stronger (bc controls are matched to cases)
Analysis of Matched Data
1. General stratified analysis
Must take matching into consideration through some form of stratification
- General stratified analysis (Mantel Haenszel): most efficient strategy for estimating the effect
- ignore matched sets and re-stratify on all matching variables in the analysis
MH estimates are robust and not affected by small numbers in specific strata
-hard/impossible to control for factors other than matching factors if some strata involve small numbers
Analysis of Matched Data
2. Matched analysis
With caliper matching or a mixture of caliper and category matching, we usually conduct a special type of (matched) stratified analysis
- -treat each matched set as a separate stratum.
- -then estimate and test exposure effect using MH method or model-fitting techniques
- -Preserves the matched sets
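For 1:1 matched case-control pairs, treating each pair as its own stratum and applying the MH odds ratio reduces to the ratio of the two kinds of discordant pairs. The sketch below uses hypothetical pair counts and a hypothetical function name.

```python
def matched_pair_or(pairs):
    """MH odds ratio with each matched pair as a separate stratum.

    pairs: iterable of (case_exposed, control_exposed) booleans.
    """
    num = den = 0.0
    for case_exposed, control_exposed in pairs:
        a, b = int(case_exposed), int(not case_exposed)        # the case
        c, d = int(control_exposed), int(not control_exposed)  # the control
        num += a * d / 2   # stratum size n = 2
        den += b * c / 2
    return num / den

# Hypothetical matched pairs
pairs = (
    [(True, True)] * 10      # both exposed (concordant)
    + [(True, False)] * 30   # only the case exposed (discordant)
    + [(False, True)] * 15   # only the control exposed (discordant)
    + [(False, False)] * 45  # neither exposed (concordant)
)
pair_or = matched_pair_or(pairs)
print(f"matched-pair OR = {pair_or:.1f}")
```

Concordant pairs contribute zero to both sums, so only the 30 and 15 discordant pairs determine the estimate.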