Midterm Deck: Sessions 1-7 Flashcards
What is considered the purview of environmental epi?
Anything external to your skin about which you have no autonomous choice
Pros to dietary surveys (6)
- ecological overview of consumption vs. crop growth
- easy to do
- cheap to do
- individual level data
- culturally appropriate
- Can play with info over the life course
Cons to dietary surveys (3)
- Recall bias (cases and controls remember exposure differently)
- Measurement error
- Social desirability bias
Pros to crop sampling (4)
- Valid lab data
- Geographical breadth
- Cheap
- Easy
Cons to crop sampling (5)
- Lab Analyses can be expensive
- Could have lab error
- Timing Issues
- Can be ecologic and not specific
- Doesn’t take into account food prep or storage
Pros to Biomarkers (3)
- Know the biologically effective dose
- Gold standard for exposure
- Good quantitative data
Cons to Biomarkers (8)
- Expensive
- Complicated (logistically)
- Ethically questionable
- Invasive (may lead to participant selection bias)
- Timing may affect sampling (level of biomarker may shift in sample over time)
- Measurement Error
- Batch Effects/Freeze/Thaw bias
- Construct problem
What were the three primary exposure sampling methods used to deconstruct the aflatoxin story?
- food intake/dietary survey
- Biomarker Sampling
- Crop Sampling
How does aflatoxin cause cancer?
o Aflatoxin causes cancer by covalently binding to guanine in DNA as the body converts it from a non-water-soluble compound into a water-soluble one → aflatoxin-DNA adduct → the adduct damages DNA and causes a mutation.
• Gene P53 is mutated in ½ of all cancer patients; it’s a key gene in cell replication. Damage to P53 is a key step in carcinogenesis.
• Mutations in P53, specifically guanine mutations in codon 249, were often linked to aflatoxin contamination.
Construct Validity Problem - what is it, and where do we see it?
Construct validity is “the degree to which a test measures what it claims, or purports, to be measuring.”
So, for biomarker tests, you have to be careful that your test is actually measuring what you think it measures… i.e., aflatoxin presence in urine may mean that you metabolize it better, not that you’ve eaten more of it… so you have to be careful about what your measurement is actually telling you in environmental epi.
Latency Period
Def: period of time between onset and diagnosis
o Can involve years or decades → must often assess historical exposures
o Exposures that occur during the latency period may not be relevant → is important to know the natural progression of disease
Name 4 nuances of exposure assessment
1) Latency
2) Incomplete Data
3) Exposure Metric
4) Interactions
Describe “Nuances of exposure assessment” # 1:
Latency: is the period of time between onset and diagnosis
• Can involve years or decades → must often assess historical exposures
• Exposures that occur during the latency period may not be relevant → is important to know the natural progression of disease
Describe “Nuances of exposure assessment” #2:
Incomplete Data
often individual level data is not available, must often work with indexes or exposure scales (job title; proximity to a source)
Describe “nuances of exposure assessment” #3
Exposure Metric Variation:
mean vs. cumulative (i.e. cig pack years over time) vs. peak (acute, when damage only happens above a certain threshold) vs. lagged (exposure today doesn’t impact disease risk tomorrow, it impacts disease risk years from now) exposure
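A minimal Python sketch of the four metrics. The yearly exposure series, the function names, and the 2-year lag are made-up assumptions for illustration, not course data.

```python
# Hypothetical yearly exposure levels (arbitrary units) for one person,
# oldest year first. Each entry covers exactly one year.
yearly_exposure = [2.0, 4.0, 10.0, 3.0, 1.0]

def mean_exposure(series):
    """Average level over the whole period."""
    return sum(series) / len(series)

def cumulative_exposure(series):
    """Level x time summed over all years (analogous to pack-years)."""
    return sum(series)

def peak_exposure(series):
    """Highest single-year level."""
    return max(series)

def lagged_cumulative(series, lag_years):
    """Cumulative exposure that ignores the most recent `lag_years` years."""
    if lag_years <= 0:
        return sum(series)
    return sum(series[:-lag_years])

print(mean_exposure(yearly_exposure))         # 4.0
print(cumulative_exposure(yearly_exposure))   # 20.0
print(peak_exposure(yearly_exposure))         # 10.0
print(lagged_cumulative(yearly_exposure, 2))  # 16.0
```

The same exposure history gives four different numbers, which is why the choice of metric has to match the suspected disease mechanism.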
Describe “nuances of exposure assessment” #4:
Interactions
Failure to consider interactions can hide relationships, because interactions can modify the effect of an exposure on outcomes (i.e., a given level of exposure has a different effect if you do/do not have another issue). Interactions can take place among genetic susceptibility, age, concurrent disease, etc.
Why is it key to take interactions into account when assessing exposure?
This is key because protections should be set up to help the most vulnerable (i.e. those impacted by interactions), not the average person (so if old people are impacted by an exposure more than young, that should be taken into account)
8 ways to assess exposure amounts
o 1) Environmental Monitoring
o 2) Environmental Modeling
o 3) Questionnaires and Job Records
o 4) Biomarkers
o 6) The Exposome
o 7) Complex Mixtures
o 8) Environment Wide Association Studies (EWAS)
Environmental Monitoring Represents & Requires…
• Represents exposure level but not dose absorbed by the individual
• Monitoring is expensive, and requires expert assistance in set-up, quality control, and analyses
* Must make sure the monitoring system is aesthetically and culturally appropriate for the study population.
Ambient Air monitoring is usually… ?
• Ambient is usually for a zone or room while micro-environments might be more important
Personal air monitoring can place ….?
• Personal monitoring can place a high burden on participants and is usually only a snap shot
Occupational routine air monitoring nuances
• In Occupational settings routine monitoring is often non-random & is only in “trouble spots” or “high” areas (so you don’t have a good idea of true exposure)
General Air monitoring stations can…?
• Air monitoring stations can provide ecological data but are often not sited randomly (or placed in the most useful spots)
4 Types of Environmental Modeling
Dispersion Models
Interpolation Models
Land Use Regression
Kriging
What is a dispersion model?
Models the movement of a pollutant in a medium (air, ground water) based upon the physical properties of the pollutant and the medium
• Model derived estimates can be used to replace actual sampling
• Usually used when there is a point source for exposures (i.e. how might a contaminant move through the water supply?)
• Often yields ecological type data for a given area such as a zip code or a census tract
• Dependent on the assumptions of the dispersion model
• Models need to be validated
• Can reduce the number of required samples but still require samples (to validate and prove your model works)
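As one illustration of a dispersion model, here is a sketch of the standard Gaussian plume equation for a continuous point source in air. The card does not name a specific model, so this choice is an assumption. The dispersion widths `sigma_y` and `sigma_z` are passed in directly because they depend on downwind distance and atmospheric stability, which are exactly the kind of model assumptions that need validating against samples. All input values are made up.

```python
import math

def gaussian_plume(q, u, sigma_y, sigma_z, y, z, release_height):
    """Steady-state concentration downwind of a continuous point source.

    q: emission rate (g/s); u: wind speed (m/s);
    sigma_y, sigma_z: crosswind and vertical dispersion widths (m) at the
        downwind distance of interest (assumed known here);
    y: crosswind offset from the plume centerline (m);
    z: receptor height (m); release_height: effective release height (m).
    """
    lateral = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # Second term is the "image source" that models reflection off the ground.
    vertical = (math.exp(-(z - release_height) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + release_height) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical

# Hypothetical source: ground-level receptors on and off the centerline.
on_centerline = gaussian_plume(10.0, 2.0, 20.0, 10.0, y=0.0, z=0.0, release_height=5.0)
off_centerline = gaussian_plume(10.0, 2.0, 20.0, 10.0, y=30.0, z=0.0, release_height=5.0)
print(on_centerline > off_centerline)  # True
```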
What is an interpolation model?
Estimate exposures at non-sampled locations of interest based upon available environmental samples (model exposure at an intermediate point between two points)
• Often used for estimating urban air pollution at given locations of interest (i.e. study subject’s home)
• Dependent on the assumptions in the model
• Can use available monitoring data or structured planned monitoring across a target area (city or region)
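A sketch of one simple interpolation approach, inverse distance weighting (IDW), where closer monitors count more. IDW is an assumed example; the card does not specify a method. The monitor coordinates, readings, and the `power` parameter are hypothetical.

```python
import math

def idw_estimate(monitors, target, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    monitors: list of ((x, y), measured_value); target: (x, y).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in monitors:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target sits exactly on a monitor
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Two hypothetical monitors; the subject's home is halfway between them.
monitors = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
home = (1.0, 0.0)
print(idw_estimate(monitors, home))  # 15.0
```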
What is land use regression?
uses land use data such as highway proximity, pollution point sources, commercial land area, green space and tree cover to determine pollution levels at given points of interest
• The regression model is built and validated using known pollution levels from environmental sampling sites (have monitors set up to establish that the model works)
• The regression is then applied to estimate the pollution levels at non-sampled sites
• NYC has an annual air pollution monitoring program and has built LUR models for the entire city (NYCCAS program)
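The build-then-apply logic can be sketched with a one-predictor least-squares fit. Real LUR models use many land use variables; the single `traffic` predictor and every number below are made-up assumptions.

```python
def fit_line(x_values, y_values):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x):
    """Apply the fitted regression to a non-sampled site."""
    return intercept + slope * x

# Hypothetical monitored sites: traffic score near the site -> measured pollution.
traffic = [1.0, 2.0, 3.0, 4.0]
pollution = [7.0, 9.0, 11.0, 13.0]

intercept, slope = fit_line(traffic, pollution)
# Estimate pollution at a home with no monitor but a known traffic score.
print(predict(intercept, slope, 2.5))  # 10.0
```

In practice some monitored sites would be held out from the fit so the model can be validated against known levels.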
What is kriging?
- Uses the spatial autocorrelation of data from sampled points to estimate the data values at non-sampled points.
- Have used Google Street View to visualize systematically sampled blocks of cities, and have measured the physical disorder on each block. Then used kriging to estimate physical disorder at any location in the city (urban decay is correlated, so use that spatial correlation to determine a given point based upon its surrounding points’ results).
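A small sketch of ordinary kriging in plain Python. The exponential semivariogram, its parameters, and the block scores are assumptions chosen for illustration; a real analysis would fit the semivariogram to the sampled data first.

```python
import math

def semivariogram(h, sill=1.0, range_param=1.0):
    """Assumed exponential semivariogram with no nugget."""
    return sill * (1.0 - math.exp(-h / range_param))

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def ordinary_kriging(points, values, target):
    """Estimate the value at `target` as a weighted sum of sampled values."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    n = len(points)
    # Kriging system: semivariances between samples, plus a row and column
    # of ones that force the weights to sum to 1.
    a = [[semivariogram(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [semivariogram(dist(p, target)) for p in points] + [1.0]
    weights = solve_linear(a, b)[:n]
    return sum(w * z for w, z in zip(weights, values))

# Hypothetical sampled blocks with physical disorder scores.
blocks = [(0.0, 0.0), (2.0, 0.0)]
disorder = [1.0, 3.0]
print(round(ordinary_kriging(blocks, disorder, (1.0, 0.0)), 6))  # 2.0
```

With no nugget term the estimate at a sampled block reproduces that block's measured score, and a point midway between two blocks gets their average.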
Nuances of using Questionnaires to determine exposure
- Can suffer from recall bias and foggy memory
- Often yield exposure indexes or scales, but rarely actual exposure levels (high, medium, low)
- Is also a challenge to measure things people don’t see – i.e. stealth exposures
Nuances of using job records to determine exposure
- Can provide long term exposure info and when combined with monitoring data can provide measures of cumulative exposure→lead to job-exposure matrixes in occupational studies
- Job-exposure matrixes take a lot of time and effort to make
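A sketch of how a job-exposure matrix turns job records into cumulative exposure. The job titles, exposure levels, and work history are hypothetical.

```python
# Hypothetical job-exposure matrix: job title -> average exposure level
# (e.g. mg/m^3), as would be derived from monitoring data.
job_exposure_matrix = {"welder": 5.0, "painter": 2.0, "clerk": 0.5}

# Hypothetical work history from job records: (job title, years in job).
work_history = [("clerk", 2), ("painter", 5), ("welder", 10)]

def cumulative_from_records(history, jem):
    """Sum of (exposure level for the job) x (years in the job)."""
    return sum(jem[job] * years for job, years in history)

print(cumulative_from_records(work_history, job_exposure_matrix))  # 61.0
```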
What is a biomarker?
an analytical target measured in a biological medium. Allows for quantitative data out of biological assays
Biomarker Pros (3)
- Can provide information on internal dose and biologically effective dose
- Not subject to recall bias, and if well done, not subject to information bias at all
- Storage repositories allow for great efficiency and multiple studies
Biomarker Cons (7)
- Missing samples can cause selection bias (i.e. did people decline participation due to not wanting to give a sample, and do those people differ from those who agree to give a sample?)
- Few biomarkers can provide information on historical exposure
- Half life is important to consider and can range from 10 years to two days
- May be adversely affected by disease (case-control and cross-sectional studies); Must be careful of causal chain – is the biomarker impacting the disease or is the disease impacting the biomarker?
- May be adversely affected by long-term storage (nested case-control studies)
- Must be very careful in collection, handling, and storage – consistency is key, and must be able to keep samples viable
- Must be very careful in analysis to make sure batch effects don’t bias data
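Why half-life matters can be shown with simple first-order decay. This is a sketch under the assumption that the biomarker declines exponentially after exposure stops; the time points are illustrative.

```python
def fraction_remaining(time_elapsed, half_life):
    """Fraction of the biomarker left after `time_elapsed`.

    Both arguments must use the same time unit (days, years, ...).
    """
    return 0.5 ** (time_elapsed / half_life)

# A biomarker with a 2-day half-life, sampled 6 days after exposure ended.
print(fraction_remaining(6, 2))    # 0.125
# A biomarker with a 10-year half-life, sampled 10 years later.
print(fraction_remaining(10, 10))  # 0.5
```

A short half-life marker mostly reflects the last few days, so it says little about historical exposure.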
What is the Exposome?
- A proposal by Christopher Wild to develop technology to measure all environmental exposures that a person encounters across the life span (if a genome is all your genes, the exposome is all your exposures)
- A reaction to the growth of genomic and other ‘omic technologies at the expense of understanding the role of environmental exposures
What might exposome technologies include?
It’s not clear, but maybe includes:
• Metabol-omics: measurement of all metabolites of all the chemicals a person was exposed to (all chemicals leave a metabolic trace in body)
• Expression-omics – measures of changes in protein expression as signatures of responses to exposures
• Adduct-omics – measurement of all species of DNA or protein adducts present in a biological sample (i.e. aflatoxin adducts)
• Life-logging and passive monitoring technologies (GPS, passive environmental units)
Exposome Critique
many proposed technologies focus on the internal dose, biologically effective dose, or response to a dose and are conceptually too far removed from actual exposure.
Nuances of Complex Mixtures Studies
- Few environmental exposures occur in isolation → exposed to complex mixtures of chemicals where exposure levels across mixtures are correlated
- One exposure at a time regression isn’t sufficient, but entering multiple exposure variables can lead to issues of multi-collinearity
- How to analyze complex mixtures is a growing field; trying LASSO, variable selection, and machine learning to identify “bad actors” in mixtures.
- Use cluster and principal component analysis to identify salient patterns of exposure/see if they impact outcomes.
What is an Environment Wide Association Study?
- Patel et al
- Used NHANES data
- Looked at 266 environmental factors measured in blood/urine
- Ran a separate logistic regression model for each factor as a predictor of Type II diabetes status; included nutritional factors in analyses as environmental factors
- Any time a chemical was associated with type 2 diabetes in 2+ NHANES cohorts, they agreed it was validated and an exposure that mattered.
- Found 2 chemicals associated with type II diabetes by doing hypothesis-free scope evaluation
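The "associated in 2+ cohorts" validation rule can be sketched as a filter over per-cohort results. The factor names, p-values, and the 0.05 threshold are hypothetical placeholders, not figures from the study, and the regression step itself is not shown.

```python
ALPHA = 0.05  # assumed significance threshold for illustration

# Hypothetical results: factor -> one p-value per survey cohort.
p_values_by_factor = {
    "factor_A": [0.01, 0.03, 0.20],
    "factor_B": [0.04, 0.50, 0.60],
    "factor_C": [0.30, 0.70, 0.90],
}

def validated_factors(p_by_factor, alpha=ALPHA, min_cohorts=2):
    """Keep factors that are significant in at least `min_cohorts` cohorts."""
    validated = []
    for factor, p_values in p_by_factor.items():
        hits = sum(1 for p in p_values if p < alpha)
        if hits >= min_cohorts:
            validated.append(factor)
    return validated

print(validated_factors(p_values_by_factor))  # ['factor_A']
```

Requiring replication across cohorts is one guard against the false positives that come from testing hundreds of factors at once.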
Cohort Studies - what are?
o Cohort Studies: find a bunch of people with an exposure, and follow them forward in time to see if disease occurs.
Cohort Studies Strengths
- Easier to confirm exposure as being prior to disease (temporality is maintained)
- No selection bias
- Can focus on rare exposures such as in occupational cohorts
Cohort Studies Weaknesses
- Not efficient for rare diseases (and many environmentally related diseases are rare)
- Must collect exposure data on many subjects who remain healthy, and have massive data collection processes
- Biomarker Studies can require vast resources.
5 nuances of occupational cohorts
o Goes to where the exposures are – occupational exposures often have exposure levels that are orders of magnitude higher than ambient exposures
o Often there are industrial hygiene records that provide some data on exposure (database on site)
o Exposure assessment can be time consuming, and job/exposure matrixes require large amounts of data
o External control group uses expected rates in the general population as a comparison; healthy worker effect can be an issue
o Internal control group compares the lowest exposure group to highest exposure group; often no non-exposed group on site and aspects of the healthy worker effect can still operate
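The external comparison is commonly summarized as a standardized mortality ratio (SMR): observed deaths in the cohort divided by the deaths expected from general-population rates. The card does not name the SMR, so treat this as one assumed way to do the comparison; all numbers are made up.

```python
# Hypothetical cohort person-years by age stratum, with general-population
# death rates per 1,000 person-years.
strata = [
    {"person_years": 1000, "reference_rate_per_1000": 2},
    {"person_years": 2000, "reference_rate_per_1000": 5},
]
observed_deaths = 9

def expected_deaths(strata):
    """Deaths expected if the cohort had general-population rates."""
    return sum(s["person_years"] / 1000 * s["reference_rate_per_1000"]
               for s in strata)

def smr(observed, strata):
    """Observed / expected; below 1 means fewer deaths than expected."""
    return observed / expected_deaths(strata)

print(expected_deaths(strata))       # 12.0
print(smr(observed_deaths, strata))  # 0.75
```

An SMR below 1 in an exposed workforce is the classic signature of the healthy worker effect: workers start out healthier than the general population used as the reference.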
Case-Control Strengths
- Efficient for studying rare disease
- Potential for more in-depth exposure assessment
- Fast to conduct
Purpose of Controls
• Purpose of controls: to estimate the prevalence of exposure in the source population that generated the cases
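A sketch of how the controls feed the analysis: their exposure prevalence stands in for the source population, and the odds ratio compares exposure odds in cases against that baseline. The counts are hypothetical.

```python
# Hypothetical 2x2 table.
cases_exposed, cases_unexposed = 30, 70
controls_exposed, controls_unexposed = 15, 85

def exposure_prevalence(exposed, unexposed):
    """Share of a group that was exposed."""
    return exposed / (exposed + unexposed)

def odds_ratio(a, b, c, d):
    """a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)

# Controls estimate how common exposure is in the source population.
print(exposure_prevalence(controls_exposed, controls_unexposed))  # 0.15
print(round(odds_ratio(cases_exposed, cases_unexposed,
                       controls_exposed, controls_unexposed), 2))  # 2.43
```

If the controls do not represent the population that produced the cases, the baseline prevalence is wrong and so is the odds ratio.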
Control Selection
- Don’t need to match on everything – just need to match on referral patterns
- Need people who represent the underlying population of interest… i.e. “if they had the disease, would they be in my study?”
Case-Control Weaknesses
- Control Selection is challenging (make sure you agree with assessment that it is a case control study – i.e. ensure you think the control group is good!)
- Not efficient for studying rare exposures
- Exposure assessment is retrospective and often questionnaire based
- Few biomarkers are able to reflect historical exposures (hard to prove temporality)
Case-Case or Case Series Studies Strengths
o No controls
o More precise estimates of gene-environment interactions
o Efficient for studying rare diseases
o Potential for more in-depth exposure assessment
o Relatively fast to conduct
Case-Case or Case Series Studies Weaknesses
o Requires more detailed hypotheses
o Requires some means to define two case groups
o Odds ratios are more difficult to interpret than those generated from a case-control study
o Not efficient for studying rare exposures
o Exposure assessment is retrospective and usually questionnaire based
o Few biomarkers able to reflect historical exposures
Nested Case-control studies
o Optimizes strengths/alleviates weaknesses of case-control and cohort studies.
Cross Sectional Study Strengths
- Less time consuming and costly
* Can be used to screen many exposures and diseases at same time
Cross Sectional study weakness
causal pathway is less clear bc temporality can’t be established
despite causal limitations our understanding of one of the most important disease-environment relationships came out of this (lead/developmental effects)
Ecological Study: Multi-Group/Cross Sectional Ecological Studies Strengths
• Strengths
o Easy access to large amounts of aggregate data
o Cheap
o Fast
o Can identify areas for more intensive investigation and generate hypotheses
Ecological Study: Multi-Group/Cross Sectional Ecological Studies Weaknesses
o Ecological bias
o Difficulty in controlling for confounding
o Cheap cost and speed can lead to issues of multiple comparisons
o Despite weaknesses ecological air studies provided data that influenced current regulations
Ecological Study: Multi-group/cross-sectional studies look at how
– how does rate of disease in area compare with exposure in an area
Ecological Study: Time trend studies look at how
how does exposure variation impact health variation over time? i.e. exposure day 1 impact disease outcome day 2?
Ecological Study: Time trend strengths
o Can make use of readily available data sets
o Each subject or group uses itself at another time as control
o Good for acute effects
Ecological Study: Time trend weaknesses
o Many trends co-vary (such as pollution levels and weather conditions)
o Not good for diseases with long latency periods
o Ecological Bias
o Cyclic things can blur results
Ecological Study: Cluster Study Strengths
o Potential for high exposures, particularly in occupational clusters
o Can be fast and cheap to conduct
o Often identifies area for further analytic study
Ecological Study: Cluster Study Weaknesses
o Texas-sharpshooter effect
o Selection of comparison group can be challenging
o Random distributions can just be lumpy
o Hidden multiple comparisons
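The multiple comparisons problem behind the Texas-sharpshooter effect can be made concrete: scan enough areas, time windows, or diseases and a "significant" cluster turns up by chance. This sketch assumes the comparisons are independent and uses an illustrative 0.05 threshold.

```python
def prob_any_false_positive(num_comparisons, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_comparisons

# One planned comparison versus 20 hidden ones (e.g. 20 towns scanned).
print(round(prob_any_false_positive(1), 3))   # 0.05
print(round(prob_any_false_positive(20), 3))  # 0.642
```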
Difference between a cluster and an epidemic?
An epidemic is a cluster that epidemiologists take seriously
Cluster –> unvalidated –> common
Epidemic –> validated cluster –> rare
Normal Questions to ask during stage one of a cancer investigation
o Rare vs common cancer?
o Age at death?
o Same types of cancer?
o Familial relation?
o Length of time together?
o Is it really cancer?
3 local cluster investigations
- Breast Cancer on UES
- Lung Cancer on Staten Island
- Breast Cancer on Long Island
Key events in practice of cluster investigation
o 1989: National Conference on Clustering of Health Events
o 1990: Conference proceedings published in AJE
o 1990: CDC published guidelines for investigating clusters (“how to cookbook”)
o 2007: CDC addendum on recent experiences
o 2010: Meeting of the Council of State and Territorial Epidemiologists and CDC to update 1990 guidelines
o 2013: Updated CDC Guidelines Published
Traditional Cluster Definition
o Traditional Definition: perceived or real, greater than expected number of cases for a given set of time and/or space coordinates. Clusters occur in space, time, or time & space
Rothman Cluster Definition
o Rothman’s Definition (1990): All epidemiologists investigate clusters, but cases should be clustered around etiological factors. Distinguishes between etiologically relevant clusters (i.e. familial disease) and time/space clusters.
CDC Cluster Definition
o 2013 CDC Definition: a greater than expected number of cancer cases that occur within a group of people in a geographic area over a period of time.
• very cancer focused, as most clusters have to do with cancer, birth defects, or spontaneous abortion
• backs away from Rothman’s POV.
5 reasons that clusters are of interest
o 1) the hope that cases have clustered in time and/or space bc of a common etiologic element that can be pinpointed… we want to identify the cause of the disease
o 2) Can define a new exposure route for an established cause (i.e. we know asbestos causes mesothelioma, so if we see much mesothelioma, we know to look for how people are being exposed to asbestos)
o 3) They generate publicity and fear in the environment → this must be addressed
o 4) Can drive policy debates (if you do find a cause, you need to decide what to do about it)
o 5) They’re a high volume issue… a 2000 paper estimated that there were 1100 reports of cancer clusters to PH officials in 1997.
Seven examples of successful cluster investigations
o Osteosarcoma in watch dial painters
o Three cases of angiosarcoma in one vinyl chloride plant
o Investigation of mesothelioma in small town in Turkey
o Investigation of mesothelioma in NJ town
o Cases of clear cell carcinoma of the vagina
o Outbreak of disease among attendees at an American Legion meeting
o Cases of oral cancer among rural women in southern US
Name the 3 types of clusters
o Reported or Perceived
o Validated Cluster
o Etiologic Cluster
CDC Cluster Investigation Step 1
o Step 1: Initial contact and response
• Collect information from the people or groups reporting the cluster, provide education back to the person
• Most investigations stop at this stage
CDC Cluster Investigation Step 2
o Step 2: Evaluation
• 2a: Preliminary Evaluation
• Computer based evaluation; a quick, rough estimate of the likelihood that an important excess has occurred
• 2b: Case evaluation:
• Verify diagnoses; get out into the field and verify that the reported disease really is the disease (watch for double counting, benign tumors lumped in with sarcomas).
• 2c: Occurrence Evaluation
• Design and perform a thorough investigation to determine if an excess has occurred and to describe the epidemiologic characteristics
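The quick, rough estimate in the preliminary evaluation can be sketched as a Poisson tail probability: given the number of cases expected from background rates, how likely is a count at least as large as the one reported? The Poisson model and the counts below are assumptions for illustration.

```python
import math

def prob_at_least(observed, expected):
    """P(X >= observed) when X is Poisson with mean `expected`."""
    below = sum(math.exp(-expected) * expected ** i / math.factorial(i)
                for i in range(observed))
    return 1.0 - below

# Hypothetical report: 6 cases where background rates predict 2.
observed_cases, expected_cases = 6, 2.0
print(observed_cases / expected_cases)                          # 3.0
print(round(prob_at_least(observed_cases, expected_cases), 4))  # 0.0166
```

A small tail probability alone does not validate a cluster, since the area and time window were usually chosen after the cases were noticed.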
CDC Cluster Investigation Step 3
o Step 3: Major feasibility study
• Determine whether an epidemiologic study linking the disease & suggested exposure is even feasible…is it even possible?