Error & Bias in Epidemiological Studies Flashcards
How are the 2 types of error related to precision and accuracy?
- RANDOM ERROR (due to chance) lowers precision, but measurements are accurate on average
- SYSTEMATIC ERROR (not due to chance) lowers accuracy, but measurements are consistent (they share a bias)
What is accuracy? Bias?
ACCURACY = whether there is agreement between a measurement made on an object and its true value
BIAS = difference between the average measurements made on the same object and its true value (not accurate = bias)
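As a minimal sketch of the two definitions above, bias can be computed as the mean of repeated measurements minus the true value, and precision can be judged from their spread. The scales, the readings, and the true weight below are hypothetical values chosen only for illustration.

```python
from statistics import mean, stdev

def bias_and_spread(measurements, true_value):
    """Return (bias, spread) for repeated measurements of one object."""
    bias = mean(measurements) - true_value   # systematic error
    spread = stdev(measurements)             # random error (small spread = precise)
    return bias, spread

true_weight = 50.0  # hypothetical true value, kg

# Hypothetical scale A: scattered around the truth (random error only)
scale_a = [47.0, 53.0, 49.0, 51.0, 50.0]
# Hypothetical scale B: tightly clustered but consistently about 2 kg too high
scale_b = [52.0, 52.1, 51.9, 52.0, 52.0]

bias_a, spread_a = bias_and_spread(scale_a, true_weight)
bias_b, spread_b = bias_and_spread(scale_b, true_weight)

print(f"Scale A: bias={bias_a:.2f}, spread={spread_a:.2f}")
print(f"Scale B: bias={bias_b:.2f}, spread={spread_b:.2f}")
```

Scale A is accurate on average but imprecise; scale B is precise but biased.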
What are the 3 major types of bias?
- SELECTION BIAS - related to procedures used to select units for the study —> study groups differ from the source population
- INFORMATION BIAS - misclassification related to the information recorded for the study where units are incorrectly assigned positive/negative exposure or disease
- CONFOUNDING BIAS - some other factor changes or distorts the effect of exposure on the outcome (diseased vs. non-diseased)
What are the consequences of bias like in descriptive and explanatory studies?
DESCRIPTIVE = outcome only affected —> higher or lower estimates of disease frequency
EXPLANATORY = outcome and explanatory variables taken into account —> altered disease frequency moves effect estimate towards or away from the null (no statistical difference)
What does it mean to move toward or away from the null? Which is better?
TOWARD = bias causes the study to underestimate the effect of the exposure/risk factor on disease status (at the extreme, it observes no effect when one exists)
AWAY = bias causes the study to observe that exposure/risk factors have more effect on disease status than they actually do
TOWARD NULL - better to underestimate and do further studies
How is magnitude of bias calculated?
difficult —> sensitivity simulations can help
What is surveillance bias?
selection bias where mild/subclinical disease is more likely to be detected in animals under frequent medical surveillance and/or enrolled in surveillance programs
What is referral bias (admission risk bias/Berkson’s fallacy)?
selection bias where differential referral patterns are a source of bias in hospital-based case-control studies
- is the hospital population representative of the whole population?
- socioeconomic representation in hospital population
What is non-response bias?
selection bias where a non-response or refusal rate above 20-30% may introduce bias
- respondents tend to be only the people who feel strongly about the topic
- high overall response = less bias
What is missing data bias?
selection bias where >20-30% of data is missing and an accurate result cannot be attained
- not enough serum from blood draw to run a test
- missing data does not confirm non-diseased state
What is loss to follow-up bias?
participants drop out of a study, which can alter the remaining group and make it less representative of the actual population
What is selective entry (survival) bias?
subjects enter (or remain in) a study group because of traits related to the outcome; treatments that prolong lifespan also increase the prevalence of disease
- “healthy worker” effect in occupational health studies - those working tend to be more healthy than unemployed
In what 4 ways can selection bias be reduced? How can it NOT be corrected?
- random sampling - assesses probability of bias by distributing risk factors equally between groups
- maximize response rates - make questionnaires appealing so that more participants respond
- minimize withdrawal rates - keep participants in study
- ensure equal responses/withdrawals from exposed/non-exposed and diseased/non-diseased
CANNOT be corrected with analytical techniques
How can selection bias be reduced in observational studies?
consider the forces at play with selecting individuals
- case-control: use incidental cases and get controls from the same source population as the cases
- cohort: persistent follow-up with creative strategies for maintaining full participation
How can selection bias be reduced in controlled trials?
RANDOMIZE allocation to intervention and comparison groups and BLIND recruiters and participants to allocation (exposed vs non-exposed), while minimizing withdrawals and maximizing retention
(randomize and blind…everything should be fine)
What is recall bias?
information bias where cases are better at recalling past exposure compared with non-cases
- Salmonella-positive cases are more likely to remember what they last ate than Salmonella-negative non-cases
What is interview bias?
information bias where interviewers are privy to the hypothesis under investigation
- more likely to ask leading questions to support hypothesis
What is obsequiousness bias (Clever Hans effect)?
information bias where subjects systematically alter responses toward perceived desirable answers
- people commonly change answers about welfare and hygiene/sanitation to match what they think the answers should be
- Hans was a trained horse who could supposedly perform arithmetic, but it was found that he was getting non-verbal cues from his trainer on when to stop stomping his hoof
What is the non-differential consequence of information bias (misclassification) like? How does it compare to null?
systematic errors in one group are independent of the other group: there are equal amounts of systematic error in E regardless of D status and in D regardless of E status —> best possible bias
errs toward null —> decreased power and ability to find an effect
What is the differential consequence of information bias (misclassification) like? How does it compare to null?
systematic error occurs to a greater extent in one group than another —> unequal amounts of systematic error in E and D DEPENDING on D and E status —> worst bias, throws off RR and OR
errs in any direction (toward or away from null) —> unbalanced link to disease status
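The contrast between non-differential and differential misclassification can be sketched with expected counts in a case-control table. The true counts and the sensitivity/specificity values below are hypothetical; the only assumption is that exposure is recorded with a given sensitivity (Se) and specificity (Sp) in each group.

```python
def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio from a 2x2 case-control table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)

def misclassify(exposed, unexposed, se, sp):
    """Expected observed counts when exposure is recorded with error."""
    obs_exposed = se * exposed + (1 - sp) * unexposed
    obs_unexposed = (1 - se) * exposed + sp * unexposed
    return obs_exposed, obs_unexposed

# Hypothetical true counts: (exposed, unexposed)
cases, controls = (60, 40), (30, 70)
or_true = odds_ratio(*cases, *controls)

# Non-differential: same Se/Sp in cases and controls
or_nondiff = odds_ratio(*misclassify(*cases, se=0.8, sp=0.9),
                        *misclassify(*controls, se=0.8, sp=0.9))

# Differential: cases recall exposure better than controls (recall bias)
or_diff = odds_ratio(*misclassify(*cases, se=0.95, sp=0.9),
                     *misclassify(*controls, se=0.6, sp=0.9))

print(f"true OR             = {or_true:.2f}")
print(f"non-differential OR = {or_nondiff:.2f}")  # pulled toward 1 (the null)
print(f"differential OR     = {or_diff:.2f}")     # here pushed away from the null
```

With other differential error rates the estimate could just as easily move toward the null, which is why the direction of differential bias is unpredictable.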
What are 4 ways to reduce information bias (misclassification)?
- E and D status should be assessed independently - should be blind to the status of cohorts, cases, and controls
- use rigorous and valid methods for determining D and E - explicit case definitions, best available test + confirmatory test, measure specific exposures (not general)
- use complete and detailed sources of information - complete exposure histories with as much info as possible
- use objective measures when available - no leading questions, clear cut answers
How can interviews and questionnaires be used to reduce information bias (misclassification)?
- minimize time between diagnosis and questioning
- use validated survey instruments (pilot study to test question clarity and detail level)
- standardized interview protocols with clear guidelines
- well-trained qualified interviewers vs. mail/phone
- state/demonstrate clear confidentiality of information
How can information bias (misclassification) be corrected after the study? Why should this be done carefully? What way cannot be used?
validation study where a sub-sample from the study is used to verify classification of E and D and post-hoc adjustments
very sensitive to changes in estimates —> much better to prevent information bias than to correct it
CANNOT be corrected with analytical techniques alone
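A minimal sketch of a post-hoc adjustment, assuming a validation sub-sample has supplied the sensitivity (Se) and specificity (Sp) of the exposure classification. The counts and the Se/Sp values are hypothetical; the formula simply inverts the expected effect of misclassification on an observed count.

```python
def corrected_exposed(obs_exposed, total, se, sp):
    """Back-calculate the true number exposed from the observed number."""
    if se + sp <= 1:
        raise ValueError("Se + Sp must exceed 1 for the correction to work")
    return (obs_exposed - (1 - sp) * total) / (se + sp - 1)

# Hypothetical group: 100 animals, 52 recorded as exposed
estimate = corrected_exposed(52, 100, se=0.80, sp=0.90)
# Same data, but the validation study had estimated Se slightly lower
estimate_low_se = corrected_exposed(52, 100, se=0.75, sp=0.90)

print(f"corrected exposed (Se=0.80): {estimate:.1f}")
print(f"corrected exposed (Se=0.75): {estimate_low_se:.1f}")
```

A shift of five percentage points in the assumed sensitivity moves the corrected count by several animals, which illustrates why these adjustments are sensitive to the validation estimates.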
How can information bias (misclassification) be reduced in observational studies?
CASE-CONTROL: explicit definitions for cases, determining E status independent from D status, interview as soon as possible
COHORT: determining D status independent from E status, valid method and objective measures for determining D status
How can information bias (misclassification) be reduced in controlled trials?
- blind to intervention allocation (E+ vs E-) to prevent D status from being influenced by E status by interviewers and participants
- valid method and objective measures for determining outcome (D status)
Selection and information bias re-cap:
What are confounders?
a third factor that distorts the true underlying relationship between an exposure and an outcome of interest
How do confounders compare to outcome and exposure?
causally associated with the outcome in non-exposed animals
non-causally associated with exposure and both are on 2 separate causal pathways to the outcome
What are classical confounders? How do they affect the exposure and outcome?
age, sex, breed/species, weight/BCS, location, herd size
will not be affected by either - being raised on grass will not make cows younger
Age-specific comparison of death from all causes for Tolbutamide and Placebo treatment groups:
- OUTCOME: survival during follow-up period = 409
- EXPOSURE: treatment (Tolb. [204] vs Placebo [205])
- CONFOUNDER: age (< 55y = 226; >55y = 183)
How can it be confirmed that age is a confounder?
- must be associated with the outcome in the non-exposed: within the placebo group, the older stratum had a higher value than the younger stratum (18.8 vs 4.2)
- must be associated with exposure: despite randomization, there was still a difference of age by treatment (<55y on Tolb = 52%; <55y Placebo = 59%)
- must be on separate causal pathways: drug doesn’t change age of participants
How can you decide what RR to report on studies with confounders?
compare the Mantel-Haenszel combined OR across the strata of the confounder with the crude OR
- > 20-30% difference = important confounder, should report the combined (adjusted) OR
- < 20-30% difference = not an important confounder, can report the crude estimate for all participants
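The comparison above can be sketched as follows. The stratum counts are hypothetical and were chosen so that the exposure has no effect within either stratum; the relative change is computed against the crude OR, and the 30% cut-off used in the final check is just the upper end of the 20-30% rule of thumb.

```python
def crude_or(strata):
    """Odds ratio after collapsing all strata into one 2x2 table."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Each stratum: (exposed diseased, exposed healthy,
#                unexposed diseased, unexposed healthy) - hypothetical counts
strata = [
    (5, 95, 10, 190),   # e.g. young animals, stratum OR = 1.0
    (60, 140, 15, 35),  # e.g. old animals,   stratum OR = 1.0
]

crude = crude_or(strata)
mh = mantel_haenszel_or(strata)
change = abs(crude - mh) / crude

print(f"crude OR = {crude:.2f}, M-H OR = {mh:.2f}, change = {change:.1%}")
print("report M-H OR" if change > 0.30 else "report crude OR")
```

Here the crude OR suggests an association that disappears once the strata are analysed separately, so the adjusted estimate would be the one to report.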
Tolbutamide and death study, Mantel-Haenszel vs crude OR:
How can confounding bias be controlled before the study begins?
- restriction (exclusion): purposefully restrict study to a specific group of individuals (loses generalizability, or external validity, since only one group is being studied/reported)
- randomization: randomize allocation to E+ and E- groups in a controlled trial to produce very similar groups
How can matching be used in cohort and case-control studies to decrease confounding bias? What are disadvantages of each?
COHORT - match confounders by E (E+ individual is male, find a E- male) —> unable to estimate the effect of the matched factor (male) and may affect global surrogate factors and match-out multiple factors
CASE-CONTROL - match confounders by D (D+ individual is male, find a D- male) —> matching WILL NOT control confounding in these studies since exposure is unknown, but it can increase power
What is the best way to control confounding bias? What are 3 examples?
ANALYTICAL MODELS
- standardization (human) - adjustment to an external standard
- STRATIFICATION - analyze each stratum separately and use Mantel-Haenszel to make a summary across strata
- MULTIVARIABLE - control for multiple factors using linear (continuous) and logistic (dichotomous) regressions
How can analytical (statistical) control for confounding be detected in studies?
- UNIVARIABLE = univariate, unconditional associations, raw, crude, bivariate logistic regression = no control for confounding
- STRATIFICATION: M-H OR, adjusted OR, multivariable = confounding accounted for
- MULTIVARIABLE REGRESSION: multivariable, conditional regression, confounders/factors are included in the same model (already built-in, not reported)
How does confounding compare to interaction?
CONFOUNDING = third factor distorts the true underlying relationship between an exposure and an outcome
INTERACTION = third factor is necessary to explain the relationship between exposure and outcome - outcome DEPENDS on both exposure and interaction factor
What are the 2 types of interactions?
- synergistic (positive) - joint effect is greater than the sum of the independent factor effects (E + I = greater O compared to no I)
- antagonistic (negative) - joint effect is less than the sum of the independent factor effects (E + I = less O compared to no I)
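One way to make "joint effect versus sum of independent effects" concrete is to compare risk differences against a common baseline. This is a sketch on the additive scale; the four risks passed in are hypothetical, and the function name is illustrative only.

```python
import math

def interaction_type(r_none, r_exposure, r_factor, r_both):
    """Classify interaction from risks in the four E/I combinations."""
    effect_e = r_exposure - r_none   # exposure alone
    effect_i = r_factor - r_none     # interacting factor alone
    joint = r_both - r_none          # both present
    expected = effect_e + effect_i   # sum of independent effects
    if math.isclose(joint, expected, abs_tol=1e-12):
        return "no interaction"
    return "synergistic" if joint > expected else "antagonistic"

# Hypothetical risks: baseline 2%, exposure only 5%, factor only 4%
print(interaction_type(0.02, 0.05, 0.04, 0.15))  # joint 13% vs expected 5%
print(interaction_type(0.02, 0.05, 0.04, 0.05))  # joint 3% vs expected 5%
print(interaction_type(0.02, 0.05, 0.04, 0.07))  # joint 5% vs expected 5%
```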
What is the test of homogeneity? What is the consequence of heterogeneity?
formal statistical test to see if interaction exists between strata depending on the participants within them
- P > 0.05 = homogeneity = no difference among strata
- P < 0.05 = heterogeneity = difference among strata
need to use strata-specific estimates (RR or OR for each stratum)
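The cards do not say which homogeneity test is meant. As a stand-in, the sketch below uses Woolf's test (inverse-variance weighted log odds ratios), which is easy to compute by hand. It is restricted to two strata so that the p-value (chi-square with 1 degree of freedom) can be obtained from the standard library, and the counts are hypothetical.

```python
import math

def woolf_homogeneity_two_strata(s1, s2):
    """Woolf test of homogeneity of odds ratios for exactly two strata.

    Each stratum is (a, b, c, d) with OR = a*d / (b*c); no cell may be zero.
    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    log_ors, weights = [], []
    for a, b, c, d in (s1, s2):
        log_ors.append(math.log((a * d) / (b * c)))
        weights.append(1 / (1 / a + 1 / b + 1 / c + 1 / d))  # 1 / var(log OR)
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    chi2 = sum(w * (l - pooled) ** 2 for w, l in zip(weights, log_ors))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival, df = 1
    return chi2, p_value

# Hypothetical strata with clearly different ORs (about 0.58 and 2.67)
chi2_diff, p_diff = woolf_homogeneity_two_strata((20, 80, 30, 70), (40, 60, 20, 80))
# Hypothetical strata with identical ORs
chi2_same, p_same = woolf_homogeneity_two_strata((40, 60, 20, 80), (40, 60, 20, 80))

print(f"different strata: chi2={chi2_diff:.2f}, p={p_diff:.4f}")
print(f"identical strata: chi2={chi2_same:.2f}, p={p_same:.4f}")
```

A p-value below 0.05 in the first comparison points to heterogeneity, so stratum-specific estimates would be reported rather than a single summary.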
Interaction of Neomycin (G-) and Cloxacillin (G+) usage on Nocardia infection (G+):
How can a decision to report confounding and interactions be made?
Does the factor interact with the exposure?
- YES = test homogeneity p < 0.05 (M-H) and report separate measures of association for each level of the factor
- NO = test homogeneity p > 0.05 (M-H)….
Does the factor confound the relationship between exposure and outcome?
- YES = >20-30% change between crude and M-H estimates = report summary measure of association, adjusting for the presence of the factor
- NO = ignore the factor
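The two questions above can be sketched as a small decision function. The function name, the p-values, and the stratum-specific estimates passed in the first and third calls are hypothetical; the 20% cut-off is the lower end of the 20-30% rule of thumb, and the relative change is measured against the crude estimate.

```python
def choose_estimate(p_homogeneity, crude, mh, stratum_estimates,
                    alpha=0.05, threshold=0.20):
    """Decide which measure(s) of association to report."""
    # Step 1: interaction? Heterogeneous strata get separate estimates.
    if p_homogeneity < alpha:
        return ("interaction", stratum_estimates)
    # Step 2: confounding? Compare crude and M-H summary estimates.
    change = abs(crude - mh) / crude
    if change > threshold:
        return ("confounding", mh)
    # Neither: the factor can be ignored and the crude estimate reported.
    return ("no confounding", crude)

# Homogeneous strata, crude 1.56 vs M-H 2.32 (48.7% change)
print(choose_estimate(0.40, 1.56, 2.32, [2.2, 2.4]))
# Homogeneous strata, crude 1.56 vs M-H 1.63 (4.5% change)
print(choose_estimate(0.40, 1.56, 1.63, [1.6, 1.7]))
# Heterogeneous strata: report each stratum separately
print(choose_estimate(0.01, 1.50, 1.40, [0.67, 2.22]))
```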
Interactions vs. Confounding:
What results should be reported?
p > 0.05 —> HOMOGENEOUS = no interaction, may be confounding
crude vs. M-H = |1.56 - 2.32|/1.56 = 0.487 = 48.7%
- > 20% = confirmed confounder, report summary measure of association adjusting for the presence of the factor - 2.32
What results should be reported?
p < 0.05 —> HETEROGENEOUS = interaction
report separate measures of association for each level of the factor - 0.67 for puppy, 2.22 for adult —> age interacts with food to cause obesity
What results should be reported?
p > 0.05 —> HOMOGENEOUS = no interaction, may be confounding
crude vs. M-H = |1.56 - 1.63|/1.56 = 0.045 = 4.5%
- < 20% = factor is not a confounder and can be ignored to report the crude OR - 1.56 (no significant difference compared to 1.63)