Epi Flashcards
In a case-control study to investigate atresia coli in cattle, it was found that Holstein-Friesian calves were significantly more likely to have atresia coli than all other breeds combined.
a) What statistical test was used to make this determination?
b) What is the null hypothesis for this particular test (you can use mathematical notation or English)?
a) Chi-square
b) No association between breed of cattle and atresia coli.
Women who had been exposed to a pesticide, DDE, were followed for 20 years. At the start of the study period, the women completed a questionnaire and had blood drawn, and the women were classified as having either low dose or high dose exposure. Of the 792 women who had high dose exposure to DDE, 430 were later diagnosed with breast cancer. Of the 3,525 women with low dose exposure to DDE, 1,079 were later diagnosed with breast cancer.
a) What type of study design is this?
b) What is the cumulative incidence of breast cancer for those exposed to high doses of DDE?
c) What measure of association is appropriate to calculate for this study design?
a) Cohort study
b) 430 / 792 = 54%
c) Relative risk (RR)
Women who had been exposed to a pesticide, DDE, were followed for 20 years. At the start of the study period, the women completed a questionnaire and had blood drawn, and the women were classified as having either low dose or high dose exposure. Of the 792 women who had high dose exposure to DDE, 430 were later diagnosed with breast cancer. Of the 3,525 women with low dose exposure to DDE, 1,079 were later diagnosed with breast cancer.
d) Please calculate the Risk Ratio. Please show all calculations, including the 2-by-2 table.
e) Interpret the calculated measure of association.
f) Please calculate the attributable risk percent. You will remember that the attributable risk percent (attributable fraction among the exposed) is calculated by
(Risk_exposed – Risk_unexposed) / Risk_exposed
which can be reduced to
(RR – 1) / RR or (OR – 1) / OR
g) Interpret the attributable risk percent.
d)
                 Breast cancer
                 Present   Absent   Total
High dose DDE    430       362      792
Low dose DDE     1079      2446     3525
Total            1509      2808     4317
Risk (high dose) = 430 / 792 = 0.543
Risk (low dose) = 1079 / 3525 = 0.306
RR = 0.543 / 0.306 = 1.77
e) Women with high dose exposure to DDE had 1.77 times the risk of developing breast cancer compared with women with low dose exposure to DDE.
f) (1.77 – 1) / 1.77 = 43.5%
g) If high dose exposure to DDE were eliminated among the women with high dose exposure, at most 43.5% of the breast cancer cases in that group would be prevented.
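The arithmetic in d) through f) can be checked with a short Python sketch. The function and variable names are illustrative and not part of the original card. The sketch works from unrounded risks, so the attributable risk percent comes out near 43.6% rather than the 43.5% obtained from the rounded RR of 1.77.

```python
def risk_ratio_summary(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return both cumulative incidences, the risk ratio, and the attributable risk percent."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Attributable fraction among the exposed: (RR - 1) / RR, expressed as a percent.
    ar_percent = (rr - 1) / rr * 100
    return risk_exposed, risk_unexposed, rr, ar_percent


# Counts from the DDE cohort: high dose = exposed, low dose = unexposed.
risk_high, risk_low, rr, ar_pct = risk_ratio_summary(430, 792, 1079, 3525)

print(f"Risk (high dose): {risk_high:.3f}")
print(f"Risk (low dose):  {risk_low:.3f}")
print(f"RR:               {rr:.2f}")
print(f"AR%:              {ar_pct:.1f}%")
```

Carrying full precision through to the last step, as here, avoids the small drift that comes from dividing already-rounded risks.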
Mounts et al. conducted a study to determine risk factors for avian influenza A (H5N1) disease in humans in the 1997 Hong Kong outbreak (J Infect Dis. 1999 Aug;180(2):505-8.).
In May 1997, a 3-year-old boy in Hong Kong died of a respiratory illness related to influenza A (H5N1) virus infection, the first known human case of disease from this virus. An additional 17 cases followed in November and December. A ___________ study of 15 of these patients hospitalized for influenza A (H5N1) disease was conducted using controls matched by age, sex, and neighborhood to determine risk factors for disease. Exposure to live poultry (by visiting either a retail poultry stall or a market selling live poultry) in the week before illness began was significantly associated with H5N1 disease (64% of cases vs. 29% of controls, odds ratio, 4.5, P=.045). By contrast, travel, eating or preparing poultry products, recent exposure to persons with respiratory illness, including persons with known influenza A (H5N1) infection, were not associated with H5N1 disease.
a) What study design was used by the investigators?
b) Why did the investigators match the controls by age, sex and neighborhood?
c) Interpret the p-value for the association between exposure to live poultry and infection with H5N1
d) The authors report the p-value, but not the confidence interval. In this study would you expect the confidence interval to be narrow or wide, and why?
a) Case-control
b) To control for potential confounding by those variables.
c) If there really is no association between exposure to live poultry and H5N1 infection in humans, we would expect to see an odds ratio at least as large as 4.5 due to chance alone 4.5% of the time. Because this is below the conventional 5% cutoff, chance is unlikely to account for the findings, and the result is statistically significant.
d) Wide, because the sample size is very small (15 cases).
The following abstract is from a study conducted by Brooker et al. to investigate hookworm infection, anemia and iron deficiency. (Trans R Soc Trop Med Hyg. 2006 Oct 4).
Surprisingly few detailed age-stratified data exist on the epidemiology of hookworm and iron status, especially in Latin America. We present data from a ________________ examining 1332 individuals aged 0-86 years from a community in south-east Brazil for hookworm, anemia and iron deficiency. Sixty-eight percent of individuals were infected with the human hookworm Necator americanus. Individuals from poorer households had significantly higher prevalence and intensity of infection than individuals from better-off households. The prevalence of anemia, iron deficiency and iron-deficiency anemia was 11.8%, 12.7% and 4.3%, respectively. Anemia was most prevalent among young children and the elderly. Univariate analysis showed that hemoglobin and serum ferritin were both significantly negatively associated with hookworm intensity among both school-aged children and adults. Multivariate analysis showed that, after controlling for socio-economic status, iron indicators were significantly associated with heavy hookworm infection. Our results indicate that, even in areas where there is a low overall prevalence of anemia, hookworm can still have an important impact on host iron status, especially in school-aged children and the elderly.
a) What study design is used by the investigators?
b) What is one limitation of this study design?
a) Cross-sectional
b) Can’t determine temporal relationship between potential risk factors and the outcome. Thus, you can’t proceed to assess whether any identified associations are causal.
A case-control study was conducted to examine the association between oral contraceptive (OC) use and myocardial infarction.
Myocardial infarction Controls
Did use OCs 39 24
Did not use OCs 114 154
a) What measure of association is appropriate to calculate for this study design?
b) Please calculate that measure of association. Please show all calculations.
c) The investigators were concerned about potential confounding or effect modification by age. What conditions must be met in order for a variable to be a confounder?
a) Odds ratio (OR)
b) OR = ad / bc = (39 * 154) / (24 * 114) = 2.2
c) Must be associated with both the risk factor and the outcome, but not on the causal pathway from risk factor to outcome.
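The cross-product calculation in b) can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `odds_ratio` and the a/b/c/d cell labels are illustrative conventions, not something the card defines.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 case-control table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# OC use and myocardial infarction: 39 and 24 used OCs, 114 and 154 did not.
crude_or = odds_ratio(39, 24, 114, 154)
print(f"Crude OR = {crude_or:.1f}")
```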
A case-control study was conducted to examine the association between oral contraceptive (OC) use and myocardial infarction.
A stratified analysis was conducted to assess whether age was a confounder or effect modifier.
Age < 40
Myocardial infarction Controls
Did use OCs 21 17
Did not use OCs 26 59
OR=2.8
Age ≥ 40
Myocardial infarction Controls
Did use OCs 18 7
Did not use OCs 88 95
OR=2.8
a) Is age an effect modifier? Why or why not?
b) A Mantel-Haenszel measure of association was determined to be 2.74. Is age a confounder? Why or why not?
c) Which measure(s) of association do you report (please give the numeric value(s))?
d) Interpret your results.
a) Age is NOT an effect modifier because the stratum-specific ORs are the same and therefore the association between OC use and MI is NOT modified by the third variable, age.
b) Yes, age is a confounder because the adjusted OR of 2.74 is appreciably different from the crude OR of 2.2. Thus, age confounds the true relationship between OC use and MI.
c) Report the OR_MH = 2.74
d) Women who have had an MI have 2.74 times the odds of having used OCs compared with women who have not had an MI, controlling for the effects of age.
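The reasoning in a) and b) can be sketched in Python: compute each stratum-specific OR, then compare the crude OR with the adjusted one. The adjusted value of 2.74 is taken from the card as given and is not recomputed here. The 10% threshold is a common rule of thumb and an assumption of this sketch, since the card only says "appreciably different".

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio: a, b = exposed cases/controls; c, d = unexposed cases/controls."""
    return (a * d) / (b * c)


# Stratum tables from the card, as (a, b, c, d).
strata = {
    "Age < 40": (21, 17, 26, 59),
    "Age >= 40": (18, 7, 88, 95),
}
stratum_ors = {label: odds_ratio(*cells) for label, cells in strata.items()}
for label, value in stratum_ors.items():
    print(f"{label}: OR = {value:.1f}")

crude_or = odds_ratio(39, 24, 114, 154)
adjusted_or = 2.74  # Mantel-Haenszel OR as reported on the card

# Relative difference between crude and adjusted estimates.
percent_change = abs(crude_or - adjusted_or) / adjusted_or * 100
THRESHOLD = 10  # assumed rule-of-thumb cutoff, not from the card
print(f"Crude vs adjusted differ by {percent_change:.0f}%")
print("Confounding suggested" if percent_change > THRESHOLD else "Little evidence of confounding")
```

Similar stratum-specific ORs argue against effect modification, while a gap between the crude and adjusted estimates points to confounding.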
What are three ways to control for confounding in epidemiologic studies?
Any three of: randomization, restriction, matching, stratification, multivariate analysis
The following question is on a questionnaire designed to investigate the effect of coffee consumption on cardiovascular disease.
How much coffee do you drink?
i. 1 cup
ii. 2-3 cups
iii. 3-5 cups
iv. More than 5 cups
a) Describe two things that are wrong with this question and its available answers.
b) Re-write the question to correct the issues you identified.
a) No timeframe (per day, per week, etc.)
Not all possible options for answers are listed (“none” is not an option, etc.)
Categories are not mutually exclusive
“Cup” not defined
b) How many 8 oz. cups of coffee do you drink per day?
a. None
b. 1 cup
c. > 1 up to and including 3 cups
d. > 3 up to and including 5 cups
e. More than 5 cups
What is the ecologic fallacy?
Ascribing characteristics and associations demonstrated at the group level to individuals
Fifteen hundred adult males working for Lockheed Aircraft were first examined in 1951 and were classified by diagnosis criteria for coronary artery disease. Every 3 years they were examined for new cases of this disease; attack rates in different subgroups were computed annually. This is an example of a
a) Cross-sectional study
b) Prospective cohort study
c) Retrospective cohort study
d) Ecologic study
e) Case-control study
b) Prospective cohort study
Which of the following is not an advantage of a prospective cohort study?
a) Incidence rates can be calculated
b) Precise measurement of exposure is possible
c) Recall bias is minimized compared with a case-control study
d) Many disease outcomes can be studied simultaneously
e) It usually costs less than a case-control study
e) It usually costs less than a case-control study
One hundred patients with infectious hepatitis and 100 matched neighborhood well controls were questioned regarding a history of eating raw clams or oysters within the preceding 3 months. What kind of study design is this?
a) Cross-sectional study
b) Prospective cohort study
c) Retrospective cohort study
d) Ecologic study
e) Case-control study
e) Case-control study
All of the following are important criteria when making causal inferences except:
a) Replication of findings
b) Temporal relationship
c) Null hypothesis
d) Strength of association
e) Biologic plausibility
c) Null hypothesis
Geographic variations in the incidence of inflammatory bowel disease (IBD) were determined. The incidence of IBD was highest in areas with higher socioeconomic status, the lowest rates of enteric infection, and the highest rates of multiple sclerosis. This is an example of a
a) Cross-sectional study
b) Prospective cohort study
c) Retrospective cohort study
d) Ecologic study
e) Case-control study
d) Ecologic study
A case-control study is characterized by all of the following except:
a) Study participants are selected based on disease status
b) Assessment of past exposure may be biased
c) It is relatively inexpensive compared with most other epidemiologic study designs
d) Incidence rates may be computed directly
e) Definition of cases may be difficult
d) Incidence rates may be computed directly
Define sensitivity and write the calculation.
Sensitivity is a test’s ability to designate an individual with disease as positive.
Sn = True Positives (TP) / Total with Disease (TD)
Define Specificity and write the calculation.
Specificity is the proportion of people without the disease who have a negative test. With a test that is 100% specific, there are no false positives.
Sp = True Negatives (TN) / Total without Disease
Define PPV and write the calculation.
The positive predictive value is the probability that an individual with a positive test result truly has the disease.
PPV = True Positives (TP) / Total Test Positives
Define NPV and write the calculation.
The negative predictive value is the probability that an individual with a negative test result is truly unaffected.
NPV = True Negatives (TN) / Total Test Negatives
You are evaluating a new diagnostic test by comparing it to a gold standard.
                     Gold Standard
                     Positive   Negative
New test  Positive   260        95
          Negative   65         1640
          Total      325        1735
For the following questions, please show all calculations.
a. What is the sensitivity of the test?
b. What is the specificity of the test?
c. What is the predictive value positive of the test?
d. What is the predictive value negative of the test?
e. What is the prevalence of disease in this example (based on the results of the gold standard test)?
f. What happens to the predictive value positive if the prevalence decreases?
a) Sensitivity = TP / Total with disease = 260 / 325 = 80%
b) Specificity = TN / Total without disease = 1640 / 1735 = 95%
c) PVP = TP / Total positives = 260 / 355 = 73%
d) PVN = TN / Total negatives = 1640 / 1705 = 96%
e) 325 / (325 + 1735) = 15.8%
f) If prevalence decreases then PVP decreases
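The calculations in a) through f) can be checked with the Python sketch below. The function names are illustrative. The prevalence values other than the study's own are hypothetical, chosen only to show the direction of the effect described in f), and the second function assumes sensitivity and specificity stay fixed as prevalence changes.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, predictive values, and prevalence from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }


def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """PPV from Bayes' theorem, holding sensitivity and specificity fixed."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# New test vs gold standard: TP = 260, FP = 95, FN = 65, TN = 1640.
m = diagnostic_metrics(tp=260, fp=95, fn=65, tn=1640)
for name, value in m.items():
    print(f"{name}: {value:.1%}")

# Hypothetical lower prevalences, to illustrate answer f).
for prev in (m["prevalence"], 0.05, 0.01):
    ppv = ppv_at_prevalence(m["sensitivity"], m["specificity"], prev)
    print(f"prevalence {prev:.1%} -> PPV {ppv:.1%}")
```

As prevalence falls, false positives make up a larger share of all positive results, so the predictive value positive drops even though the test itself is unchanged.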
You are investigating risk factors for the development of feline vaccine-associated sarcomas and one of your hypotheses is that vaccination against FeLV is a risk factor for sarcoma development. To test this hypothesis, you identify 30 cats with sarcomas and 60 control cats from hospital records. Of the cats with sarcomas, 23 had a previous FeLV vaccination and in the control group, 17 cats had a previous FeLV vaccination. Note that this is hypothetical data only.
a. What type of study design was used here?
b. Set up a 2x2 table for the study and calculate the odds ratio and the relative risk for development of feline vaccine-associated sarcoma after vaccination against FeLV.
c. What is your best estimate of risk for vaccine-associated sarcoma after vaccination against FeLV? Please include the numeric value.
d. What is one potential source of bias in this study?
a) Case-control study
b)
                 Sarcoma
                 Positive   Negative   Total
Vaccinated       23         17         40
Not vaccinated   7          43         50
Total            30         60         90
OR = ad / bc = (23 * 43) / (17 * 7) = 989 / 119 = 8.3
RR = (a / h1) / (c / h2) = (23 / 40) / (7 / 50) = 0.575 / 0.14 = 4.1
c) This is a case-control study, so the OR is the appropriate estimate of risk. OR = 8.3
Cases had 8.3 times the odds of having been vaccinated against FeLV compared with controls.
d)
Selection bias –
a) only hospital cats, may not be representative of larger cat population
b) Were control cats old enough to have developed a sarcoma?
Information bias –
a) were records similar for cases and controls?
b) Data abstraction issues
Confounding – e.g., age
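A short Python sketch may help show why the OR, not the RR, is reported for this design. It computes both measures from the table, then repeats the calculation with twice as many controls at the same exposure proportion. The doubled control group (34 vaccinated, 86 unvaccinated) is hypothetical and added only for illustration; the function names are likewise illustrative.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio: a, b = exposed cases/controls; c, d = unexposed cases/controls."""
    return (a * d) / (b * c)


def naive_risk_ratio(a, b, c, d):
    """'Risk ratio' computed from row totals, which is not valid in a case-control study."""
    return (a / (a + b)) / (c / (c + d))


# Table from the card: 30 cases, 60 controls.
or_60 = odds_ratio(23, 17, 7, 43)
rr_60 = naive_risk_ratio(23, 17, 7, 43)

# Hypothetical: same cases, but 120 controls with the same exposure proportion.
or_120 = odds_ratio(23, 34, 7, 86)
rr_120 = naive_risk_ratio(23, 34, 7, 86)

print(f"60 controls:  OR = {or_60:.1f}, naive RR = {rr_60:.1f}")
print(f"120 controls: OR = {or_120:.1f}, naive RR = {rr_120:.1f}")
```

The OR is unchanged when the investigator samples more controls, while the naive RR shifts with the arbitrary case-to-control ratio.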
BACKGROUND: In May 2003 the Soest County Health Department was informed of an unusually large number of patients hospitalized with atypical pneumonia. METHODS: In exploratory interviews patients mentioned having visited a farmers’ market where a sheep had lambed. Serologic testing confirmed the diagnosis of Q fever. To investigate risk factors for infection we conducted a _____A_____ study (cases were Q fever patients, controls were randomly selected Soest citizens) and a _____B_____ study among vendors at the market. RESULTS: A total of 299 reported Q fever cases was linked to this outbreak. The [first] study identified close proximity to and stopping for at least a few seconds at the sheep’s pen as significant risk factors. Vendors within approximately 6 meters of the sheep’s pen were at increased risk for disease compared to those located farther away. The ewe that had lambed as well as 25% of its herd tested positive for C. burnetii antibodies. CONCLUSION: As a consequence of this outbreak, it was recommended that pregnant sheep not be displayed in public during the 3rd trimester and to test animals in petting zoos regularly for C. burnetii.
a) The first study (labeled A) is an example of a
a. Cross-sectional study
b. Prospective cohort study
c. Retrospective cohort study
d. Ecologic study
e. Case-control study
b) The second study (labeled B) is an example of a
a. Cross-sectional study
b. Prospective cohort study
c. Retrospective cohort study
d. Ecologic study
e. Case-control study
a) e. Case-control study
b) c. Retrospective cohort study
What are advantages of a cross-sectional study?
Used to prove and/or disprove assumptions.
Not costly to perform and does not require a lot of time. Captures a specific point in time. Contains multiple variables at the time of the data snapshot. The data can be used for various types of research.
What are disadvantages to a cross-sectional study?
Cannot be used to analyze behavior over a period of time.
Does not help determine cause and effect. The timing of the snapshot is not guaranteed to be representative. Findings can be flawed or skewed if there is a conflict of interest with the funding source.
What are the advantages of a Prospective cohort study?
They can provide better quality of data on the primary exposure and also on confounding variables
Since exposures are assessed before outcomes occur, they are less prone to bias.
What are the disadvantages of a Prospective cohort study?
They are more expensive and time consuming.
They are not efficient for diseases with long latency.
Losses to follow up can bias the measure of association.
What are the advantages of a Retrospective cohort study?
They are useful for rare exposures, e.g., unusual occupational exposures
They are cheaper and faster than prospective cohort studies
They are more efficient for diseases with a long latency period
What are the disadvantages of a Retrospective cohort study?
Exposure data may be inadequate and there may be inadequate data on confounding factors, such as smoking, alcohol consumption, exercise, other health problems, etc.; old records were not designed to be used for future studies
What are the advantages of an Ecologic study?
The aggregate data used is generally available, so they are quick and inexpensive
They are useful for early exploration of relationships
They can compare phenomena across a wider range of populations and sites.
Some exposures of interest can only be studied with aggregate population level data, such as the effect of smoking bans and rates of heart attacks
What are the disadvantages of an Ecologic study?
Can’t directly link the risk factor to the disease, i.e., it is not clear that the people who ate the most meat were the ones who got colon cancer. This is sometimes referred to as “ecological bias” or the “ecological fallacy.”
No effective way of taking into account, or adjusting for, other factors that influence the outcome (confounding factors).
Ecologic studies can be misleading when evaluating non-linear relationships.
What are the advantages of a Case-control study?
They are efficient for rare diseases or diseases with a long latency period between exposure and disease manifestation.
They are less costly and less time-consuming; they are advantageous when exposure data is expensive or hard to obtain.
They are advantageous when studying dynamic populations in which follow-up is difficult.
What are the disadvantages of a Case-control study?
They are subject to selection bias.
They are inefficient for rare exposures.
Information on exposure is subject to observation bias.
They generally do not allow calculation of incidence (absolute risk).
Epidemiologic models can be useful for all of the following except:
a. Predicting effectiveness of programs
b. Organizing and storing knowledge about a disease process
c. Predicting risk or consequences of disease
d. Identifying an individual’s risk factors for disease
e. Developing policy
d. Identifying an individual’s risk factors for disease
To be effective, surveillance systems should incorporate all of the following except:
a. Generation of information for action
b. Disease eradication
c. Ongoing data collection
d. Systematic data collection
e. Timely information dissemination
b. Disease eradication
Surveillance system design should aim at
a. Maximizing the probability of true early detection
b. Incorporating as many sampling architectures as possible
c. Minimizing the probability of a false-positive alarm
d. All of the above
e. a and c only
e. a and c only
Screening for a disease is important if treatment at the pre-symptomatic stage has a more favorable outcome than treatment initiated once the patient is symptomatic.
T/F?
True
The lead time is defined as the period in the natural history of the disease in which treatment is more effective and/or less difficult to administer.
T/F?
False
The critical point in the natural history of a disease is the point before which treatment is more effective and/or less difficult to administer.
T/F?
True
In order for a screening program to be effective, there does not need to be an accepted treatment for patients identified with the disease.
T/F?
False
If α is our false-positive error rate, or the probability of making a Type I error, and β is our false negative error rate, or the probability of making a Type II error, what is power?
Power is 1 – β, the probability of detecting a difference if one truly exists
For a given α and measure of association, how can an investigator increase the power of a study?
Increase the sample size
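The link between sample size and power can be illustrated with a rough Python sketch. It uses a normal approximation for a two-sided test comparing two proportions with equal group sizes. The proportions (0.30 vs 0.45), the group sizes, and the function name are all hypothetical choices for illustration.

```python
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in proportions (unpooled).
    se = ((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group) ** 0.5
    z_effect = abs(p1 - p2) / se
    # Ignores the negligible chance of rejecting in the wrong direction.
    return NormalDist().cdf(z_effect - z_crit)


# Same alpha and same true difference; only the sample size changes.
sizes = (25, 50, 100, 200, 400)
powers = [power_two_proportions(0.30, 0.45, n) for n in sizes]
for n, pw in zip(sizes, powers):
    print(f"n per group = {n:>3}: power = {pw:.2f}")
```

With alpha and the true difference held fixed, the standard error shrinks as n grows, so power rises toward 1.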
Probability sampling
a. Refers to several sampling strategies
b. Allows investigators to generalize the results from the sample to the population
c. Allows calculation of the standard error of the resulting population estimates
d. All of the above
e. a and b only
d. All of the above
A true lack of association may be difficult or impossible to distinguish from a true association that cannot be detected statistically because of inadequate _________.
a. α (alpha)
b. β (beta)
c. Power
d. Detection rates
e. Error rates
c. Power