Midterm Study Flashcards
In what case does prevalence approximate to (IR)(Disease Duration)?
When the population is in steady state and both incidence rate and disease duration are stable, prevalence / (1 - prevalence) equals (IR)(Disease Duration). When prevalence is less than 0.1, the denominator is close to 1, so prevalence itself approximates (IR)(Disease Duration).
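A minimal sketch of the steady-state relationship, solving P / (1 - P) = IR x D for P. The function name and the input values are hypothetical, chosen only to show how close the shortcut is at low prevalence.

```python
def prevalence_from_incidence(incidence_rate, mean_duration):
    """Steady-state prevalence implied by P / (1 - P) = IR * D."""
    prevalence_odds = incidence_rate * mean_duration
    return prevalence_odds / (1 + prevalence_odds)

# Hypothetical inputs: 0.02 cases per person-year, mean duration of 3 years.
exact = prevalence_from_incidence(0.02, 3)
shortcut = 0.02 * 3  # rare-disease approximation, P ~ IR * D
print(round(exact, 4), round(shortcut, 4))
```

With prevalence this low the two values differ by less than half a percentage point; the gap widens as IR x D grows.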
When is risk (incidence proportion) equal to incidence rate x time?
When incidence proportion is less than 20%, assuming incidence rate does not change over time.
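A rough sketch comparing the shortcut with the exponential formula, risk = 1 - exp(-IR x t). It assumes a constant incidence rate in a closed cohort with no competing risks; the function names and rates are hypothetical.

```python
import math

def risk_exponential(incidence_rate, time):
    """Risk over `time` under a constant incidence rate."""
    return 1 - math.exp(-incidence_rate * time)

def risk_linear(incidence_rate, time):
    """Shortcut: risk ~ IR * time, reasonable only when risk is low."""
    return incidence_rate * time

# Low risk: the two agree closely. Higher risk: the shortcut overshoots.
for rate in (0.01, 0.05):
    print(rate, round(risk_exponential(rate, 10), 3), round(risk_linear(rate, 10), 3))
```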
Does cumulative incidence, or the incidence proportion, have units?
No, proportions do not have units.
What strength does incidence rate have over cumulative incidence?
Incidence rate accounts for the fact that population changes over time, by using person-time as the denominator.
What is the limitation of incidence rate and person-time?
Every unit of person-time is treated as carrying equal risk, when in reality that’s not the case. Different people have different levels of risk at different periods of time.
Prevalence proportion communicates the ____________ rather than the cause.
burden of disease
How is disease survival confounded by time?
Survival can be changed artificially depending on the point at which diagnosis occurs during the span of the disease progression (preclinical, clinical, etc). This is called lead-time bias.
How does simultaneous testing affect sensitivity?
Using two tests simultaneously increases net sensitivity relative to either test on its own. We use this for diseases we really don’t want to miss. It decreases false negatives but lowers net specificity.
Why do we want to use sequential testing?
To decrease false positives! Often used for screening then diagnosis. Has a higher specificity.
PPV increases as _______ goes up.
disease prevalence
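A small sketch of why PPV rises with prevalence, holding the test fixed. The function name and the 90% sensitivity and specificity are hypothetical values for illustration.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value = true positives / all positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same test, two populations: 1% versus 20% prevalence.
print(round(ppv(0.9, 0.9, 0.01), 3))
print(round(ppv(0.9, 0.9, 0.20), 3))
```

With these inputs PPV goes from roughly 8% to roughly 69%, even though the test itself has not changed.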
How do we determine validity of a new test?
We compare its results to those of the existing gold standard, or of an existing but more invasive test.
A low threshold raises sensitivity but lowers specificity (T/F).
True!
What are the formulas for net sensitivity and net specificity in sequential testing?
Net sensitivity = a2 / (a1 + c1) = TP on both tests / (TP + FN) = Positive on both / Real Cases
Net specificity = (d1 + d2) / (b1 + d1) = All TN / (FP1 + TN1) = Negative on either / Non-Cases
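The cell counts above can be sketched as expected values. This assumes only test-1 positives are retested and that the second test performs the same regardless of the first result; the function name and numbers are hypothetical.

```python
def sequential_net(cases, non_cases, se1, sp1, se2, sp2):
    """Net sensitivity and specificity when test 2 is given to test-1 positives."""
    a1 = cases * se1       # true positives on test 1, sent on to test 2
    d1 = non_cases * sp1   # true negatives on test 1, stop here
    b1 = non_cases - d1    # false positives on test 1, sent on to test 2
    a2 = a1 * se2          # cases positive on both tests
    d2 = b1 * sp2          # non-cases cleared by test 2
    net_sensitivity = a2 / cases              # a2 / (a1 + c1)
    net_specificity = (d1 + d2) / non_cases   # (d1 + d2) / (b1 + d1)
    return net_sensitivity, net_specificity

# Hypothetical: 500 cases, 9500 non-cases; test 1 is 70%/80%, test 2 is 90%/90%.
print(sequential_net(500, 9500, 0.7, 0.8, 0.9, 0.9))
```

Net sensitivity here (about 0.63) falls below either test alone, while net specificity (about 0.98) rises above both.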
What is the formula for net sensitivity in simultaneous testing?
(TP1+TP2-TPboth) / Cases = Positive on Either / Real Cases
TP = Cases x Sensitivity
TP both = Cases x Sensitivity 1 x Sensitivity 2
What are the formulas for net specificity in simultaneous testing?
TNboth / non-cases = Negative on Both / Non-Cases
TN both = Non-Cases x Specificity1 x Specificity2
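A minimal sketch of the simultaneous-testing formulas, written as proportions rather than counts. It assumes the two tests err independently within cases and within non-cases; the function name and inputs are hypothetical.

```python
def simultaneous_net(se1, sp1, se2, sp2):
    """Net sensitivity and specificity when positive on either test counts as positive."""
    net_sensitivity = se1 + se2 - se1 * se2  # (TP1 + TP2 - TPboth) / Cases
    net_specificity = sp1 * sp2              # TNboth / Non-Cases
    return net_sensitivity, net_specificity

# Hypothetical: test 1 is 80% sensitive / 60% specific, test 2 is 90% / 90%.
print(simultaneous_net(0.8, 0.6, 0.9, 0.9))
```

Net sensitivity (about 0.98) beats either test alone, at the cost of net specificity (about 0.54).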
What is the point of Rothman’s sufficient component cause model?
It’s a good way to conceptualize causation because it acknowledges that different component causes can work together to produce disease.
Also, the apparent strength of one cause depends on the distribution of the other component causes in the population. A cause looks strong when its complementary component causes are prevalent!
How can population attributable fractions add up to greater than 100%?
A single case can result from multiple component causes acting together, so the same case can be attributed to more than one risk factor.
Latent period is the interval between what and what?
It is the interval between disease initiation (the action of the final component cause, at which point induction time is zero) and the detection of disease.
What is the counterfactual model?
It is an attempt to observe what would happen in a hypothetical situation, counter to fact.
Treatment A has a causal effect on an individual’s outcome Y if: Y(A=1) ≠ Y(A=0)
But we cannot observe the same population under both the factual and the counterfactual condition, so we develop two exchangeable groups such that we can assume any difference between the outcomes is due to the treatment. Each group stands in as the counterfactual for the other.
The validity of our estimates depends upon the validity of the substitution.
All etiologic studies should be designed to estimate _______.
causal contrasts
Cohort studies are intended to estimate causal effects while case-control studies can only estimate associations (T/F)
False! They are all INTENDED to estimate causal effects, through measures of association.
Case-control studies only permit estimation of relative effect measures while cohort studies permit estimation of absolute and relative effect measures (T/F).
True! Absolute measures like risk and rate difference provide a direct estimate of the difference in risk or rate between exposed and unexposed groups. Relative measures, like all the ratios, indicate the strength of association between exposure and outcome.
Which study design gives you best approximation of causal contrast?
RCT effect measures will approximate population causal contrast if randomization was successful and if participants adhere to their assigned treatment and if there was no informative loss to follow-up.
What is the difference between corrected and non-corrected percent agreement?
Percent agreement is (True Positive + True Negative) / Total Observations. The corrected agreement is TP / (Everything but TN), which stops a large number of agreed negatives from inflating the result.
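A quick sketch of both agreement measures from a 2x2 table of two raters (or two tests). The cell labels a, b, c, d and the counts are hypothetical.

```python
def percent_agreement(a, b, c, d):
    """a = both positive, b and c = discordant pairs, d = both negative."""
    overall = (a + d) / (a + b + c + d)
    corrected = a / (a + b + c)  # ignores the both-negative cell
    return overall, corrected

# Hypothetical: 10 agreed positives, 10 disagreements, 80 agreed negatives.
print(percent_agreement(10, 5, 5, 80))
```

Here overall agreement is 90% but corrected agreement is only 50%, because most of the agreement comes from the both-negative cell.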
What are the two prerequisites for an RCT?
Equipoise (a genuine uncertainty about which intervention is superior) and exposures that can be modified by the investigators.
What do investigators have to think of when selecting study participants?
Internal Validity (Exclude those not at risk + Contraindications + Challenges w/ follow-up)
External Validity (Generalizability; it can be lowered by poor internal validity)
Alpha and Power Levels
Type 1 error is rejecting the null when it’s true and Type 2 error is failing to reject the null when it’s false (T/F).
True!
What is the difference between simple and block randomization?
Simple randomization is like assigning random numbers, coin tossing, etc. With smaller sample sizes you can get unequal groups. Block randomization is better for smaller studies where you need equal groups; it divides participants equally into each group in each block.
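A minimal sketch of block randomization with two arms, assuming a fixed block size. The function name, arm labels, and seed are hypothetical; real trials typically use validated allocation software and may vary the block size.

```python
import random

def block_randomize(n_participants, block_size=4, arms=("A", "B"), seed=0):
    """Assign participants in shuffled blocks so arms stay balanced within each block."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    assignments = []
    while len(assignments) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    return assignments[:n_participants]

allocation = block_randomize(12)
print(allocation.count("A"), allocation.count("B"))
```

Every complete block of four contains two of each arm, so group sizes can never drift far apart, unlike a run of coin tosses in a small study.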
When is stratified randomization not possible?
When there are too many strata or the strata are not measurable.
Compare ITT and per-protocol analysis.
ITT includes all participants in the groups to which they were initially assigned, regardless of whether they completed the treatment as per protocol, dropped out, or deviated from the study. This keeps randomization intact and reflects adherence levels that you might actually observe in real life. It evaluates overall effectiveness, not biological efficacy.
Participants who dropped out, missed doses, or deviated from the protocol are excluded from per-protocol analysis, which lets us evaluate biological efficacy without dilution from non-adherent participants. However, it can break randomization and introduce selection bias.
What are factorial studies? Pros and cons?
Factorial studies have participants who are both in the exposed group of one exposure and the control group of another, to study two different exposures. This is more efficient in terms of time and cost, but we need a large sample size.
From cohort studies, we determine __________ or ________ to describe the effects of the exposure.
risk differences; risk ratios
What is the main limitation of retrospective studies?
We lose control over data collection and the design of measured variables.
Open cohorts can grow or decrease over time (T/F).
True. Members can enter and leave, so an open cohort can grow, shrink, or stay roughly steady. Closed cohorts can only decrease over time due to death and loss to follow-up, but they’re easier to keep track of.
What makes absolute measures more informative than relative ones?
When incidence is very small, risk ratios can be very large even though the absolute difference in risk is tiny, so the ratio alone is not useful for interpretation. It can still be calculated.
What is the biggest challenge for case-control studies? What do they measure?
Sampling controls is the biggest challenge. These studies cannot determine risk because they don’t look at the total population at risk. But by comparing the odds of exposure among cases with the odds among controls (up to a 1:4 case-to-control ratio for rare outcomes), we can get an odds ratio.
For a case-control study, how do we analyze an open cohort which is not in steady state?
The poor man’s way is just to assume the changes in the population are linear, and use the midpoint population to estimate the average person-time.
The better way is density sampling, which will allow the odds ratio to approximate the rate ratio of a cohort study because the distributions of person-time are the same.
What does the odds ratio approximate in a case-crossover study?
It can approximate the incidence rate ratio because we are looking at person-time of case periods and control periods.
What are the three ways we can sample a control group for a case-control study?
Exclusive sampling draws from residual non-cases. This method is intended to reflect the risk ratio of a closed cohort study but it’s biased (overestimation) because it ignores that exposure IS correlated with disease. This problem is often avoided when the outcome is rare.
Inclusive sampling is similar to cohort studies because it draws from all participants at the very beginning, also intended to reflect risk ratio. The exposure distribution should accurately reflect the source population. It’s okay if controls become cases.
Density sampling samples controls throughout, at the same time each case occurs. This odds ratio will approximate the rate ratio of a cohort study (person-time). A person can be counted in the analysis as a case and a control, and they can serve as a control for multiple cases if they match the exposure time. The only tough thing about this is getting reliable exposure time data for the cohort.
Name one pro and one limitation of case-crossover studies?
Case and control periods are exchangeable in terms of fixed personal traits that are often very difficult to measure. We control for individual characteristics which don’t change over time, and which may otherwise confound.
But we’re limited by the assumption that the exposure doesn’t have a cumulative effect and also limited to very specific types of research questions.
Why aren’t cross-sectional studies good for examining relationships?
They only use information from a single point in time, and they capture prevalence rather than incidence (different numerators).
What are the absolute and relative measures we can get from a cross-sectional study?
Prevalence Difference (PD), Prevalence Ratio (PR), and Prevalence Odds Ratio (the ratio of two prevalence odds, P / (1 - P)), which we actually don’t use.
Which study design is a good starting point for new hypotheses and useful for descriptive epidemiology, but cannot establish temporal relationships or incidence rates?
Cross-sectional!
Why would we repeat cross-sectional studies in a population?
To measure the impact of public health interventions and examine trends over time, though they usually won’t measure the same people over time.
Ecological studies are especially useful when there is not much variation _____________, but high variation _______________.
within populations; between populations
Describe the concept of ecological fallacy and aggregation bias.
Ecological fallacy is when inferences are incorrectly made about individuals, based on group-level data. Group level trends don’t always apply at the individual level.
Aggregation bias is a synonym of ecological fallacy.
What are the three types of ecological variables?
Aggregate variables summarize the characteristics of individuals within a group.
Environmental measures are physical characteristics of the place in which members of a group live or work. Exposure levels are assumed to be the same across the group, even if that isn’t necessarily accurate.
Global measures represent characteristics of the group that are not reducible to characteristics of individuals.
The target population of a study will be located in the same place as the study participants (T/F).
False, results can often be extrapolated to similar populations across the world.
When do we use cluster randomization?
When interventions cannot be limited to individuals and when outcomes are not independent of others’ treatment status.
What are the three main categories of systematic error?
Confounding, selection bias, and information bias
What is the difference between stepwise and change-in-estimate as methods for identifying confounders?
Stepwise tests associations between variables and the outcome, then adjusts for whichever are statistically significant.
Change-in-estimate adjusts for whichever variables alter the RR, RD, or OR by 10% or more.
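A rough sketch of the change-in-estimate check for a single candidate confounder. Taking the relative change against the adjusted estimate is an assumption (conventions vary), and the function name and estimates are hypothetical.

```python
def change_in_estimate(crude, adjusted, threshold=0.10):
    """True if adjusting for the variable shifts the estimate by the threshold or more."""
    relative_change = abs(crude - adjusted) / abs(adjusted)
    return relative_change >= threshold

# Hypothetical risk ratios before and after adjusting for one variable.
print(change_in_estimate(2.0, 1.5))   # large shift: keep the variable
print(change_in_estimate(2.0, 1.95))  # small shift: may leave it out
```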
What are the three signs of a confounder?
Must be associated with the exposure, a causal risk factor for the outcome, and not a mediating variable between the two.
In a DAG, causes always precede effects (T/F).
True, so there are never cycles (the graph is acyclic).
We control for the maximum set of variables needed to identify an unconfounded effect (T/F).
False. We control for the minimum sufficient set, to avoid overcontrolling and introducing bias.
What happens when you condition on a collider?
A statistical association between the variables leading into it will be induced, opening a new backdoor path through those variables.
What letter can we use to represent unmeasured or unknown confounders?
U
How do we use DAGs to determine the magnitude of the bias?
We cannot. Magnitude and direction are not shown by DAGs.
What are the three ways to control for confounders?
Randomization, Restriction, Matching
What is the difference between stratified and covariate adaptive randomization and which is more effective at reducing systematic error?
They both assign participants to groups based on pre-determined balance. However, imperfect randomization with respect to measured and unmeasured variables means neither is 100% effective.
What is the downside to restriction as a method against confounding?
Lowers generalizability and extends the recruitment process.
What is the downside to matching distributions as a method against confounding?
Can introduce selection bias unless we’re matching on time (density sampling). Typically not worth it.
When does residual confounding occur?
When the defined categories are so broad that variation within them still confounds.
Why should p-values not be the only measure of a study’s findings?
P-values do not take into account the magnitude of the observed effect.
How do we report crude and stratum-specific estimates if they differ?
Report the stratum-specific estimates separately, or pool them into a summary statistic which accounts for some of the confounding.
The Mantel-Haenszel equation is one way to pool them.
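A minimal sketch of the Mantel-Haenszel pooled odds ratio, sum(a x d / n) / sum(b x c / n) across strata. The function name, cell layout, and counts are hypothetical and chosen so each stratum has an odds ratio of 2.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata.

    Each stratum is (a, b, c, d): exposed cases, exposed non-cases,
    unexposed cases, unexposed non-cases.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

strata = [(20, 10, 10, 10), (10, 20, 10, 40)]
crude_or = (30 * 50) / (30 * 20)  # same data collapsed into one table
print(mantel_haenszel_or(strata), crude_or)
```

The pooled estimate is 2.0 while the crude estimate is 2.5, so the stratifying variable was confounding the crude association.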
Selection bias directly affects external validity (T/F).
False, it directly affects internal validity, which in turn influences external validity.
How can conditioning on a variable create selection bias?
Conditioning on a collider (common effect of two other variables) will induce an association between those variables which may not actually exist in the source population.
Describe selection bias in case-control studies using Berkson’s bias.
Choosing controls in a way that is not independent of exposure will introduce selection bias. Berkson’s bias is an example where all participants are drawn from hospitalized patients and the observed association arises only because both the exposure and the disease affect the chance of being hospitalized.
What are the three main ways selection bias is introduced in RCTs?
Self-selection, lack of allocation concealment during recruitment, and differential loss to follow up.
The healthy worker effect is an example of what kind of bias?
Selection bias.
Name three ways to deal with selection bias.
Effective allocation concealment, minimized losses to follow-up, and independent selection of controls in case-control studies.
When do we use sensitivity analyses?
To assess how robust results are to assumptions regarding bias. It gives us adjusted results based on the magnitude of bias we assume is present.
Name two reliability tests.
Inter-rater and test-retest
Non-differential misclassification of a binary variable usually pulls effect measures ____________ the null.
toward
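A small sketch of the bias toward the null, using expected counts from a hypothetical case-control table. Exposure is measured with the same sensitivity and specificity in cases and controls, which is what makes the error nondifferential; all names and numbers are assumptions.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts after imperfect exposure measurement."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    obs_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return obs_exposed, obs_unexposed

def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)

true_or = odds_ratio(60, 40, 30, 70)
cases = misclassify(60, 40, 0.8, 0.9)     # same error rates in cases...
controls = misclassify(30, 70, 0.8, 0.9)  # ...and in controls
observed_or = odds_ratio(*cases, *controls)
print(round(true_or, 2), round(observed_or, 2))
```

The true odds ratio of 3.5 shrinks to about 2.4: still above 1, but closer to the null.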
Differential misclassification happens when the __________ and ___________ of measurement differ between the groups being compared.
sensitivity; specificity
How does information bias impact exposure measurement vs. outcome measurement?
Misclassification of exposure is usually nondifferential in cohort studies and RCTs, because it’s measured independently of the outcome. In case-control studies, though, outcome status can affect exposure ascertainment, so misclassification can be either differential or nondifferential.
For outcome measurement, it’s the opposite and differential misclassification is a lot more common in cohort studies.
Poor recall is something we can assume to be nondifferential (T/F).
True, recall bias is different and happens when accuracy of recalled exposure differs between cases and controls in case-control studies.
When does interviewer bias occur?
When interviewers are not blinded to the outcome status of participants and then treat or question people differently based on it.
How do we combat recall and interviewer bias when measuring exposures?
Don’t rely on self reports! Verify them or use existing cohorts with recorded baseline data.
Blind interviewers, standardize data collection procedures, and ask participants to complete electronic surveys.
What are the three types of information bias that can happen around outcome measurement?
Observer bias (blind them / use multiple), surveillance bias (when you’re more thorough with one group than another), and reporting bias (like recall bias).
Not all research should be done (T/F).
True, some questions cannot be answered well with the data available to you, so be thoughtful in the design of research.
What are some reasons an association can be observed between an exposure and an outcome?
Chance, reverse causality, true causation, or confounding