Epi Methods 752 Flashcards
Degree to which study is free from bias; inferences from study population reflect inferences that would be observed in target population; prerequisite for external validity
Internal Validity
Includes concealed allocation, masking, common data collection, quality assurance monitoring
Metrics of Study Quality
Degree to which results of study may apply, be relevant, or be generalized to populations/groups that didn’t participate in study; assessing whether internally valid inferences apply to other target populations; representativeness
External Validity
Vague definition of target population, inability to define source population, or problems with process of obtaining study population
Barriers to Internal Validity
Condition –> identify risk factors –> test intervention –> dissemination –> repeat cycle
Process of Studying Health Outcomes
Study design where subsets of defined population identified with exposure to factor(s) hypothesized to influence occurrence of outcome; compare incidences in groups that differ by exposure levels; denominators are typically persons or person-time
Cohort Study
Research question identified, protocol developed, & cohort assembled, then exposure measured, then participants followed for outcomes
Prospective Cohort Study
Exposures & outcome occur and measured for other purpose, then research question identified, protocol developed, & cohort assembled
Retrospective Cohort Study
Cohort that can gain & lose members over time; fixed or dynamic
Open Cohort
Cohort cannot gain members after defined time/event & loses members only to outcome or end of study
Closed Cohort
Evaluate exposure with many outcomes, temporality, calculate risk, time-varying effects; expensive, can take long time to conduct, not efficient for long disease process/rare outcome, may not be warranted for rare exposure
Pros & Cons of Cohort
Enroll all individuals who meet eligibility criteria; non-probabilistic & may not reflect target population
Convenience Sample
All individuals have known probability of selection; must be able to enumerate population
Probability Sample
Probability sample, random start then sample every “nth” unit
Systematic Sample
Probability sample, divide population into homogeneous strata & select random sample within each stratum
Stratified Random Sample
Probability sample, divide population into heterogeneous clusters, randomly sample clusters, & measure within chosen cluster
Cluster Sample
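Sketch of the three probability samples on the cards above, assuming the population can be enumerated as a Python list. Function names, the seed, and the toy population are illustrative assumptions, not part of the course material.

```python
import random

def systematic_sample(units, step, rng):
    """Random start within the first `step` units, then every `step`-th unit."""
    start = rng.randrange(step)
    return units[start::step]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size within every stratum."""
    return {label: rng.sample(members, n_per_stratum)
            for label, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters, then keep every unit in the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {label: list(clusters[label]) for label in chosen}

rng = random.Random(752)  # fixed seed (assumption) so the sketch is reproducible
population = list(range(100))  # hypothetical enumerated population

every_tenth = systematic_sample(population, 10, rng)
by_stratum = stratified_sample(
    {"young": population[:50], "old": population[50:]}, 5, rng)
by_cluster = cluster_sample(
    {"clinic_a": [1, 2, 3], "clinic_b": [4, 5], "clinic_c": [6]}, 2, rng)
```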
Immigrative selection bias by self-referral, immigrative non-response bias, emigrative loss to follow-up
Selection Bias in Cohort Study
Exposure time that doesn’t biologically contribute to outcome (after etiologically relevant time window for exposure); including time in analysis dilutes effect of exposure
Wasted Exposure
Collect follow-up data by linkage to external systems
Passive Follow-Up
Collect follow-up data by interaction with participants or proxies
Active Follow-Up
Enumerate group of at risk individuals followed for outcome
Cohort
Cohort members still at risk for outcome at time of event; aligned by time origin & time metric
Risk Set
Persons credited for time at risk during which effect cannot possibly occur; including time in analysis creates artificially lower event rate; typically occurs when cohort entry dependent on survival
Immortal Person Time
Persons continue to contribute person-time to cohort after death due to imperfect mortality ascertainment; typically occurs in aging cohorts
Ghost Time
Planned experiment designed to assess efficacy of treatment by comparing outcomes in comparable groups receiving intervention or control, participants in both groups enrolled, treated, & followed over same time period
Randomized Clinical Trial
Genuine uncertainty within professional community as to which of 2 treatment arms is superior; justification for randomization
Clinical Equipoise
Process where treatment follows no describable deterministic pattern but a probabilistic pattern; exposure decision removed from participant/provider
Randomization
Avoids selection bias (confounding by indication), groups should be similar by baseline characteristics (exchangeability), valid significance levels for statistical tests, defined time origin
Pros of Randomization
Restriction, restricted/adaptive randomization, or adjustment for baseline characteristics
Control for Confounding in RCT
Process for preventing disclosure of treatment assignments to participants or study staff; prevents immigrative selection bias; occurs at enrollment
Concealment of Random Allocation
Treatment assignment not known after randomization; prevents differential measurement error; can be single, double, or triple; occurs during course of trial
Masking
Each new assignment independent; number of patients & characteristics in groups should be equal in long run (potentially not in small trials)
Simple Randomization
Blocking ensures number of individuals in each treatment arm is equal; stratification ensures number of individuals in each treatment arm is equal by confounder
Restricted Randomization
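Sketch contrasting simple randomization with blocking, assuming two arms labeled "A" and "B" and permuted blocks of even size. The labels, block size, and seed are illustrative assumptions.

```python
import random

def simple_randomization(n, rng):
    """Each assignment is an independent coin flip; arm sizes may differ in small trials."""
    return [rng.choice("AB") for _ in range(n)]

def blocked_randomization(n, block_size, rng):
    """Permuted blocks: every complete block holds equal numbers of A and B."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two equal arms")
    assignments = []
    while len(assignments) < n:
        block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    return assignments[:n]

simple = simple_randomization(20, random.Random(752))
blocked = blocked_randomization(20, 4, random.Random(752))
```

Stratified randomization could be sketched by keeping one such blocked sequence per confounder stratum.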
Current group composition influences next allocation
Adaptive Randomization
Adaptive randomization where software chooses treatment assignment to yield smallest imbalance
Minimization
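Rough sketch of minimization, assuming two arms and an unweighted sum of per-factor imbalances as the score, with ties broken at random. Real trial software may weight factors or add a random element; the function name and data layout here are hypothetical.

```python
import random

def minimization_assign(new_levels, enrolled, rng, arms=("A", "B")):
    """new_levels: dict factor -> level for the incoming participant.
    enrolled: list of (levels_dict, arm) for participants already assigned."""
    def imbalance_if(arm):
        total = 0
        for factor, level in new_levels.items():
            counts = {a: 0 for a in arms}
            for levels, assigned in enrolled:
                if levels.get(factor) == level:
                    counts[assigned] += 1
            counts[arm] += 1  # hypothetically place the new participant in `arm`
            total += max(counts.values()) - min(counts.values())
        return total

    scores = {arm: imbalance_if(arm) for arm in arms}
    best = min(scores.values())
    return rng.choice([arm for arm in arms if scores[arm] == best])

enrolled = [({"sex": "F", "site": "1"}, "A"), ({"sex": "F", "site": "2"}, "A")]
next_arm = minimization_assign({"sex": "F", "site": "1"}, enrolled, random.Random(752))
```

Here arm A would give a total imbalance of 5 and arm B a total of 1, so the new participant goes to B.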
Test result reflects mix of effect size & sample size; doesn’t prove causality, show that factor is confounder, or consider definitions of confounding
Limitations of Statistical Testing
Analysis of RCT according to randomization irrespective of what happens afterwards; keeps randomization intact but may blur treatment effects (conservative)
Intention to Treat Analysis
Analysis of RCT according to treatment participants received; may overestimate treatment effects relative to “real world” effects; often secondary analysis (primary analysis in some non-inferiority trials)
Per Protocol Analysis
Immigrative selection bias (insufficient concealment of random allocation), emigrative selection bias (differential loss to follow-up), information bias (differential measurement error due to insufficient masking), reporting bias
Bias in RCTs
More than 1 treatment, examine treatments independently & together; indicated if 2 treatments act independently or influence each other
Factorial 2x2 Trial
Designed to show test intervention not inferior to comparison by margin of non-inferiority (Δ); tests Ho that test intervention is worse than comparison by at least Δ
Non-Inferiority Trial
Designed to show test intervention equivalent to comparison (-Δ to +Δ)
Equivalence Trial
Tests Ho of no difference between treatment groups
Superiority Trial
Unit of randomization is group not individual; useful when intervention cannot be easily isolated; account for correlations in analysis
Cluster Trial
Participants randomized to treatment, then measurement of outcomes & washout, then receive opposite treatment, then measurement of outcomes; each subject is own control (closest to exchangeability); condition must be stable, intervention must not cause permanent change, outcome must be repeatable, carry-over effects must be small, drop out must be low
Cross-Over Trial
Use accumulating data to decide how to modify aspects of trial
Adaptive Trial
Randomization used to achieve balance across prognostic factors at baseline; equal allocation probabilities; stratified randomization for confounding
Fixed Allocation Rule
Randomization uses interim data to unbalance allocation probabilities in favor of “better” treatments (“playing the winner”)
Adaptive Allocation Rule
Occurs earlier in disease process; should have strong consistent association with clinical outcome (in causal pathway) & yield same inference as outcome
Surrogate Outcome
More frequent events, shorter time; potential inconsistent relationship with outcome, may not be reliable indicator of treatment effects
Pros & Cons of Surrogate Outcome
Outcomes must be of similar importance, incidence, & effect; reduces sample size, useful if no obvious primary outcome, measure for common underlying mechanism, reduces multi-dimensionality
Composite Outcome
Study individuals with & without disease; examine relationship of exposure by comparing diseased & non-diseased subjects with regard to frequency of exposure
Case-Control Study
Individuals with disease, representative of individuals with disease in source population; individuals who could have had disease but didn’t, representative of individuals without disease in source population; do not select based on exposure status
Cases vs. Controls
Shorter time period, assess multiple exposures for 1 outcome, efficient for outcomes with long latency period/rare outcome
Pros of Case-Control Study
Immigrative selection bias due to diagnostic bias, Berkson’s bias, self-selection, non-response, healthy worker, survival bias (often exclude prevalent cases)
Bias in Case-Control Studies
Data collection occurs after outcome has developed (historical information) & at one time; recall bias possible
Measuring Exposure in Case-Control Study
Cases remember exposure differently than controls; if cases remember exposure better & controls underreport exposure –> increase A or decrease B then increase OR
Recall Bias
OR exposure = OR disease = 1/OR non-disease
Invariance of Odds Ratio
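Worked check of the invariance, assuming the usual 2x2 layout (a = exposed cases, b = exposed non-cases, c = unexposed cases, d = unexposed non-cases). The counts are made up for illustration.

```python
import math

a, b, c, d = 40, 20, 60, 80  # hypothetical 2x2 counts

exposure_or = (a / c) / (b / d)    # odds of exposure: cases vs. non-cases
disease_or = (a / b) / (c / d)     # odds of disease: exposed vs. unexposed
nondisease_or = (b / a) / (d / c)  # odds of non-disease: exposed vs. unexposed
cross_product = (a * d) / (b * c)  # the familiar AD/BC form

invariant = (math.isclose(exposure_or, disease_or)
             and math.isclose(disease_or, cross_product)
             and math.isclose(disease_or, 1 / nondisease_or))
```

With these counts, raising a or lowering b raises AD/BC, which matches the direction given on the recall bias card.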
OR approximates IRR when . . . ; sampling of cases & controls must be independent of exposure
Rare Disease Approximation
Select cases & controls to be similar based on strong confounder; not useful to have more than 1:4 ratio of cases to controls
Matched Case-Control Study
Calculate matched OR (otherwise effect underestimated) using conditional logistic regression; cannot examine effect of matched factor(s) but controls for confounding of matched factor(s); avoid over-matching; increases internal validity but may decrease external validity
Matched Case-Control Analysis
Case-comparison at time of event; source population well-characterized; use existing data; often used to assess invasive/expensive exposure
Nested Study
Incidence density sampling to match for time at risk; randomly select controls (still at risk) at each time case occurs; can also match on confounders; efficient for time-varying exposures
Nested Case-Control Study
Select sub-cohort at baseline & analyze as prospective cohort; compare cases to all controls (still at risk) in sub-cohort; cases not in sub-cohort “pop” into analysis just for time of event; efficient for rare outcomes & time-fixed exposures
Case-Cohort Study
Case-only study in which case serves as own matched control; useful for when brief/transient exposure triggers rise in risk of outcome with acute onset; exposure status assessed at different times & compared to exposure status at time of event; analyze using conditional logistic regression
Case-Crossover Study
Examines relationship between outcome & exposure in defined population at 1 particular time; describes prevalence of outcome or exposure; sampling not guided by disease status; useful to determine burden of disease, prevalence of risk factors, or monitor community over time
Cross-Sectional Study
Limited for causal inference, no temporality, information bias (recall bias), selection bias (participation bias, survival bias), reverse causality, relevant exposure window, inefficient for rare exposure or outcome
Limitations of Cross-Sectional Study
Units of analysis are groups not individuals; conclusion may not apply to individuals (ecologic fallacy)
Ecologic Study
Not a checklist for causality; strength, consistency, specificity, temporality, biological gradient, plausibility, coherence, experiment, & analogy
Bradford Hill Criteria
Minimal set of conditions & events sufficient for outcome to occur; conceptualized using Rothman’s pies
Sufficient Cause
Particular type of component cause required for outcome to occur
Necessary Cause
Theoretical comparison group; compare rate of outcome if population exposed to rate of outcome if same population unexposed
Counterfactual
Measured value = true value + error = true value + bias + random error
Measurement Error Model
Systematic difference between true & measured value; assessed by comparing to gold standard
Bias
Error not due to systematic measurement error; assessed by performing repeated measurements on same person/sample
Random Error
Measurement error in variable in question doesn’t depend on levels of other variables (e.g. accuracy for measuring outcome same in exposed & unexposed)
Non-Differential Error
Measurement error in variable in question depends on levels of other variables (e.g. accuracy for measuring outcome different in exposed & unexposed); bias can go in any direction
Differential Error
Measurement error in variable in question not associated with errors in measuring other variables
Independent Error
Measurement error in variable in question associated with errors in measuring other variables (e.g. both variables derived from same questionnaire)
Dependent Error
Sensitivity/specificity, kappa
Quantify Measurement Error for Dichotomous Variables
Spearman correlation coefficient, kappa
Quantify Measurement Error for Categorical Variables
Coefficient of variation, intraclass correlation coefficient
Quantify Measurement Error for Continuous Variables
Variance between individuals/(variance between individuals + variance within individuals)
Intraclass Correlation Coefficient
SD replicates/mean replicates * 100
Coefficient of Variation
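Sketch of the two formulas above. It assumes the variance components for the ICC are already estimated (e.g. from a random-effects model) and uses the sample SD (n - 1 denominator) for the CV; both choices are assumptions.

```python
import statistics

def intraclass_correlation(var_between, var_within):
    """Share of total variance that is due to differences between individuals."""
    return var_between / (var_between + var_within)

def coefficient_of_variation(replicates):
    """SD of replicate measurements as a percentage of their mean."""
    return statistics.stdev(replicates) / statistics.mean(replicates) * 100

icc = intraclass_correlation(var_between=3.0, var_within=1.0)  # hypothetical values
cv = coefficient_of_variation([9, 10, 11])  # hypothetical replicates
```

In this toy example the ICC is 0.75 and the CV is 10%.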
Se + Sp > 1.0, effect estimate attenuated
Informative Test
Se + Sp = 1.0, null effect estimate
Uninformative Test
Se + Sp < 1.0, effect estimate flipped
Misinformative Test
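Sketch of the three cases above, assuming non-differential misclassification of a dichotomous exposure in a case-control table and working with expected (not sampled) cell counts. The counts and Se/Sp pairs are made up.

```python
import math

def misclassify(exposed, unexposed, se, sp):
    """Expected observed counts after imperfect exposure classification."""
    obs_exposed = se * exposed + (1 - sp) * unexposed
    obs_unexposed = (1 - se) * exposed + sp * unexposed
    return obs_exposed, obs_unexposed

def observed_or(a, b, c, d, se, sp):
    """a, c = exposed/unexposed cases; b, d = exposed/unexposed controls.
    Same se and sp in cases and controls, i.e. non-differential error."""
    a_obs, c_obs = misclassify(a, c, se, sp)
    b_obs, d_obs = misclassify(b, d, se, sp)
    return (a_obs * d_obs) / (b_obs * c_obs)

true_or = observed_or(40, 20, 60, 80, se=1.0, sp=1.0)     # no error
attenuated = observed_or(40, 20, 60, 80, se=0.8, sp=0.9)  # Se + Sp > 1
null = observed_or(40, 20, 60, 80, se=0.5, sp=0.5)        # Se + Sp = 1
flipped = observed_or(40, 20, 60, 80, se=0.2, sp=0.1)     # Se + Sp < 1
```

The observed OR moves from the true value toward 1, sits at 1, then falls below 1 as Se + Sp drops.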
Phenomenon that if continuous variable is extreme on first measurement, it will tend to be closer to average on subsequent measurements; may result from intra-individual variability or random error
Regression Toward the Mean
Expression of how close measurement is to true value; opposite of bias; assessed using sensitivity/specificity, correlation coefficient, scatterplot, Bland-Altman plot
Validity
Ability of test to correctly identify those who have disease –> proportion correctly classified as having disease by measure compared with gold standard
Sensitivity
Ability of test to correctly identify those who don’t have disease –> proportion correctly classified as not having disease by measure compared with gold standard
Specificity
Probability of having disease given results of test; depends on sensitivity, specificity, & prevalence of condition in population
Predictive Value
Proportion of persons with positive test result who have condition (per gold standard)
Positive Predictive Value
Proportion of persons with negative test result who don’t have condition (per gold standard)
Negative Predictive Value
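Sketch of the four measures from a 2x2 table against a gold standard, plus PPV recomputed from sensitivity, specificity, & prevalence via Bayes' rule. Function names and counts are illustrative assumptions.

```python
import math

def diagnostic_metrics(tp, fp, fn, tn):
    """tp/fn = diseased with positive/negative test; fp/tn = non-diseased."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def ppv_from_prevalence(se, sp, prevalence):
    """PPV from Se, Sp, and prevalence (Bayes' rule)."""
    true_pos = se * prevalence
    false_pos = (1 - sp) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

m = diagnostic_metrics(tp=90, fp=50, fn=10, tn=850)  # hypothetical counts
ppv_check = ppv_from_prevalence(m["sensitivity"], m["specificity"], prevalence=0.10)
ppv_rare = ppv_from_prevalence(m["sensitivity"], m["specificity"], prevalence=0.01)
```

With the same Se and Sp, PPV drops when prevalence falls from 10% to 1%, which is the prevalence dependence noted on the Predictive Value card.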
Measures how close data is to line of best fit (not line of agreement); measures linear trends
Pearson’s Correlation Coefficient
Special case of Pearson’s correlation for ordinal factor, non-linear relationship, or skewed data; measures increasing or decreasing trends
Spearman’s Rank Correlation Coefficient
Expression for how precise measurements are; assessed by performing repeated measurements & calculating percent agreement, percent positive agreement, kappa, correlation coefficient, coefficient of variation, scatterplot, or Bland-Altman plot
Reliability
Sum of agreement cells/total # individuals; can be heavily weighted by individuals classified as negative on both; does not take chance into account
Percent Agreement
(Observed agreement - expected agreement)/(100-expected agreement)
Kappa
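Sketch of percent agreement & kappa for two raters and a dichotomous rating, using percentages as in the formula above. The cell labels and counts are assumptions: a = both positive, b = rater 1 only, c = rater 2 only, d = both negative.

```python
import math

def percent_agreement(a, b, c, d):
    """Agreement cells over total, as a percentage."""
    return 100 * (a + d) / (a + b + c + d)

def expected_agreement(a, b, c, d):
    """Chance agreement (%) from the raters' marginal totals."""
    n = a + b + c + d
    both_pos = (a + b) * (a + c) / n**2
    both_neg = (c + d) * (b + d) / n**2
    return 100 * (both_pos + both_neg)

def kappa(a, b, c, d):
    observed = percent_agreement(a, b, c, d)
    expected = expected_agreement(a, b, c, d)
    return (observed - expected) / (100 - expected)

po = percent_agreement(20, 5, 10, 65)   # hypothetical counts
pe = expected_agreement(20, 5, 10, 65)
k = kappa(20, 5, 10, 65)
```

Here observed agreement is 85%, expected agreement is 60%, and kappa is 0.625; most of the raw agreement comes from the 65 double negatives.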
Biased estimate of exposure-outcome association resulting from selection of study participants as effect of exposure & outcome; exposure-outcome association conditioned on study participation
Selection Bias
Immigrative selection bias; issue with detection/classification of cases & non-cases
Ascertainment Bias
Ascertainment bias; physician aware of possible associations between exposure & outcome –> follows exposed persons more closely
Diagnostic Bias
Ascertainment bias; combination of exposure & disease increases risk of admission to hospital –> exposure & disease co-occur in hospital setting
Berkson’s Bias
Ascertainment bias; participants had to survive up to certain time to be sampled
Survival Bias
Immigrative selection bias; issue with how individuals end up in study
Participation Bias
Participation bias; volunteers different from non-volunteers
Self-Selection Bias
Participation bias; responders different from non-responders
Non-Response Bias
Participation bias; people who are employed are healthier than unemployed & workers who continue working (more exposed) are healthier than those who stop working
Healthy Worker Effect
Emigrative selection bias with composition of study population changing relative to source population
Differential Loss to Follow-Up
Differential LTFU; susceptible participants more likely to die or drop-out
Depletion of Susceptibles
Differential LTFU; event whose occurrence precludes occurrence of another event or alters probability of occurrence of event
Competing Risks
Differential LTFU; participant not trackable, didn’t adhere to protocol, did not complete follow-up, withdrew, etc.
Dropout
Effect of exposure of interest mixed together with effect of another variable leading to incorrect effect estimate
Mixing of Effects Definition
Factor is associated with exposure, risk factor for outcome, & not intermediate step in causal pathway
Classical Definition
Effect is difference in outcome caused by different exposure states in one study population during one time period; formalized using potential outcomes
Counterfactual Definition
Effect is homogeneous across strata defined by factor & crude effect estimate different from adjusted estimate by > 10%
Collapsibility Definition
Overestimation of effect (away from null); adjustment weakens estimate
Anticonservative Confounding
Underestimation of effect (toward null); adjustment strengthens estimate
Conservative Confounding
Inference crosses null or reaches null
Qualitative Confounding
Confounder-exposure & confounder-outcome associations either both + or -
Positive Confounding
Confounder-exposure association + & confounder-outcome association -, or vice versa
Negative Confounding
Restriction, matching, or randomization
Control for Confounding at Design/Conduct Stage
Sum of stratum-specific rates weighted by person-time distribution of standard population; ∑(Tk)(IkA)/∑Tk; uses distribution of standard population, so standardized rates are comparable across populations
Direct Standardization
Calculate expected number of events by applying stratum-specific rates of standard population to person-time of observed population; observed/expected; uses distribution of population of interest, so ratios from different populations are not directly comparable
Indirect Standardization
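Sketch of both standardization methods, assuming two age strata with rates per person-year. The strata, rates, and person-time are made-up numbers, and the function names are hypothetical.

```python
import math

def directly_standardized_rate(study_rates, standard_person_time):
    """Sum of Tk * Ik over sum of Tk, with Tk taken from the standard population."""
    total = sum(standard_person_time.values())
    weighted = sum(standard_person_time[k] * study_rates[k]
                   for k in standard_person_time)
    return weighted / total

def standardized_ratio(observed_events, standard_rates, study_person_time):
    """Observed events over events expected under the standard population's rates."""
    expected = sum(standard_rates[k] * study_person_time[k]
                   for k in study_person_time)
    return observed_events / expected

study_rates = {"young": 10 / 1000, "old": 40 / 2000}  # events per person-year
study_person_time = {"young": 1000, "old": 2000}
standard_person_time = {"young": 3000, "old": 1000}
standard_rates = {"young": 0.005, "old": 0.02}

dsr = directly_standardized_rate(study_rates, standard_person_time)
smr = standardized_ratio(50, standard_rates, study_person_time)
```

The direct rate is (30 + 20)/4000 = 0.0125 per person-year; the indirect ratio is 50 observed over 45 expected.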
Partition sample according to confounder/EMM, calculate stratum-specific estimates, compare to crude estimate
Stratified Analysis
Weighted average of stratum-specific measures; ∑(lnORk)*(weightk)/∑(weightk); use if p>0.05 for test of heterogeneity
Generalized Inverse Variance Method
Weighted average of stratum-specific measures; ∑(AkDk/Nk)/∑(BkCk/Nk); use if p>0.05 for test of heterogeneity; better statistical properties, but can only use for a small number of categorical confounders
Mantel-Haenszel Method
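Sketch of both pooled estimators for stratified 2x2 tables, each stratum given as (A, B, C, D) with OR = AD/BC. The inverse-variance weights use the Woolf variance of ln(OR), 1/A + 1/B + 1/C + 1/D, which is an assumption about which weights are intended; the counts are made up.

```python
import math

def mantel_haenszel_or(strata):
    """Sum of Ak*Dk/Nk over sum of Bk*Ck/Nk."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

def inverse_variance_or(strata):
    """Weighted average of ln(ORk), with weights 1 / var(ln ORk), then exponentiated."""
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in strata:
        weight = 1 / (1 / a + 1 / b + 1 / c + 1 / d)
        weighted_sum += weight * math.log(a * d / (b * c))
        weight_total += weight
    return math.exp(weighted_sum / weight_total)

strata = [(20, 10, 10, 20), (30, 15, 20, 40)]  # hypothetical; OR = 4 in each stratum
or_mh = mantel_haenszel_or(strata)
or_iv = inverse_variance_or(strata)
```

Because both strata have OR = 4 (homogeneous), both pooled estimates return 4.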
2 or more risk factors modify effect of each other with regard to outcome; heterogeneity of effects
Effect Measure Modifier
Each stratified effect measure suggests increased or decreased risk but of different magnitudes
Quantitative EMM
One stratified effect measure suggests increased risk & other suggests decreased risk, or one is null
Qualitative EMM
IRRxy different from IRRy|not x * IRRx|not y; sub or supra
Multiplicative EMM
IRDxy different from IRDy|not x + IRDx|not y; sub or supra
Additive EMM
Difference of risk differences expressed as proportion of reference risk; measure of departure from additivity obtained from multiplicative models; RR11-RR01-RR10+1
Relative Excess Risk due to Interaction
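Worked example of the RERI formula next to a multiplicative check, assuming relative risks expressed against the doubly unexposed group (RR00 = 1). The RR values are made up to show that the two scales can disagree.

```python
def reri(rr11, rr10, rr01):
    """Departure from additivity: RR11 - RR10 - RR01 + 1 (0 means exactly additive)."""
    return rr11 - rr10 - rr01 + 1

def multiplicative_ratio(rr11, rr10, rr01):
    """Departure from multiplicativity: RR11 / (RR10 * RR01) (1 means exactly multiplicative)."""
    return rr11 / (rr10 * rr01)

additive_departure = reri(rr11=6.0, rr10=2.0, rr01=3.0)
multiplicative_departure = multiplicative_ratio(rr11=6.0, rr10=2.0, rr01=3.0)
```

With these values RERI = 2 (supra-additive) while the multiplicative ratio is 1 (no multiplicative EMM).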
Prevalence of exposure decreases, different sensitivity/specificity pair, determined by specificity for low prevalence vs. sensitivity for high prevalence
Factors Increasing Bias