Problem sets 1-7 Flashcards
What are the advantages of case control studies?
- great in outbreaks, emergencies and resource limited settings
- great for rare diseases
- great for rare exposures and extreme relative risks
- good for rare outcomes
- can explore multiple exposures at once
- can be faster and cheaper than other study designs
- less potential for loss to follow up than cohort studies
What are the disadvantages of case control studies?
- can be difficult to establish temporal relationship
- prone to systematic error, especially recall bias (and other forms of information bias - observer, interviewer, performance)
- prone to issues with selection bias with controls
- can be prone to confounding and reverse causality also
How can you mitigate concerns with confounding in the context of case control studies?
In the design stage, use matching
In the analysis stage, logistic regression (which gives you an adjusted OR score)
How do you correctly word the interpretation of an OR?
The odds of EXPOSURE were OR times (higher or lower) in the CASES than in the CONTROLS
What is the full formula for an OR?
odds in exposed / odds in unexposed = (a/b) / (c/d) = ad/bc = OR
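A minimal sketch of the formula above, assuming the usual 2x2 layout (a = exposed with the outcome, b = exposed without it, c = unexposed with the outcome, d = unexposed without it). The function name and the counts are made up for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    Assumed layout (illustrative labels, not from the flashcards):
        a = exposed, outcome present     b = exposed, outcome absent
        c = unexposed, outcome present   d = unexposed, outcome absent
    """
    odds_exposed = a / b
    odds_unexposed = c / d
    # Algebraically the same as (a * d) / (b * c).
    return odds_exposed / odds_unexposed


# Illustrative counts only, not data from any study.
example_or = odds_ratio(20, 80, 10, 90)
print(round(example_or, 2))  # 2.25
```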
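Outside of a CCS, what can using an OR do (especially for common outcomes)?
Exaggerate the results: the OR overstates the RR when the outcome is common (the two are close only when the outcome is rare)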
In the context of CCS, what core feature is especially important, and why?
Hypothesis generation, because it defines who will be selected
When cases and controls are very similar, it is good because….
It reduces the risk of confounding (but must be careful for over matching)
What is confounding?
- The distortion of the association between an outcome and an exposure, which occurs when study groups differ with respect to other factors that influence the outcome
- A confounding variable cannot be on the causal pathway between exposure and outcome
What is an ecological study?
- a study in which the observations are made at the group level rather than the individual level
What is a particular concern in the context of the interpretation of ecological studies?
- the ecological fallacy
In the context of a survival analysis, what do you need to do to get statistically significant information?
- do the log-rank test to make any inferences and to get a p-value
- do a Cox’s regression to get your hazard ratio
What does randomization control for?
Both known and unknown variables
In the context of an ecological study, what further analysis would help you draw further conclusions?
The correlation coefficient
Regression would help you identify any associations
Why is it important to present an adjusted OR score?
- Adjusted OR scores adjust for confounding of known variables
- These variables depend on the given study, but typically include sex, age, weight, socio-economic status, racial origin
- Adjusted OR scores are vital to present as they present the odds AFTER adjusting for known confounders
- They increase the validity of results
What is the definition of recall bias?
Recall bias is a systematic error caused by differences in the accuracy or completeness of the recollections retrieved (“recalled”) by study participants regarding events or experiences from the past.
What is a prevention strategy for recall bias?
The collection of information from objective sources, such as medical records
What is heterogeneity?
Difference in results
Why is it important to do a meta analysis?
It is important to do a meta analysis as conclusions cannot be drawn from a single intervention or study, as results tend to differ slightly from study to study
Doing a meta analysis applies objective formulas, which can identify the reasons for variation from study to study
- ALWAYS CONTEXTUALISE your answer
Can meta analysis overcome all forms of systematic error?
No, it cannot
Define randomization
The process by which a participant, who meets the necessary selection criteria, has an equal chance of being assigned to either the intervention or control arm of a study. This controls for known and unknown confounders.
Define what an RCT is, and how it is different from other study designs
An RCT is a quantitative, controlled, comparative experiment in which the effects of two or more interventions are studied in a group of participants who are RANDOMLY allocated to one of the intervention groups
What are the two approaches to dealing with participants who are lost to follow up?
per protocol OR intention to treat
What is the definition of a chi-squared test?
A test of association between two categorical variables, based on comparing the observed frequencies with the frequencies expected if there were no association
What are the advantages of cross-sectional studies?
- guide public health decision-making
- informs the health needs of a population
- initial investigation of ideas
- hypothesis generation
- relatively quick
- less expensive than alternatives
- explore multiple exposures at once
- helpful in the decision-making of allocation of resources
What are the disadvantages of cross-sectional studies?
- not the strongest design to provide strong evidence about the CAUSES of health-related OUTCOMES
- Associations identified may be hard to interpret
- not suitable for rare exposures
- not suitable for rare diseases
- not suitable for highly fatal diseases
What is a reference standard?
the best test that we currently have available - the “gold standard”
What is an index test?
The test that we are testing to see if it could replace the reference standard
Define sensitivity
The ability of a test to correctly identify the presence of a disease (the proportion of people WITH the disease who test positive)
Define specificity
The ability of a test to correctly identify the absence of a disease (the proportion of people WITHOUT the disease who test negative)
Define NPV
The percentage of people who test negative for the condition who actually DON’T have the condition (or will not develop it)
Define PPV
The percentage of people who test positive for the condition who actually DO have the condition (or will develop it)
Discuss and define the trade-offs associated with changing the cut-off score in the context of diagnostics
You will have to make trade-offs between sensitivity and specificity; it is crucial to state which one you are prioritising and WHY. Look at the meta analysis table to find the cut-off point at which the test is most specific.
If you increase the specificity of a test, you decrease the sensitivity
As a general rule - for a diagnostic/confirmatory test you want higher specificity, and for screening tests you want higher sensitivity
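The trade-off above can be sketched with a toy example. This assumes a test that is called positive when a score is at or above the cut-off; the scores, the function name and that positivity rule are all invented for illustration.

```python
def sens_spec_at_cutoff(diseased_scores, healthy_scores, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' counts as a positive test.

    The direction of the rule (higher score = more likely diseased) is an assumption.
    """
    true_positives = sum(score >= cutoff for score in diseased_scores)
    true_negatives = sum(score < cutoff for score in healthy_scores)
    sensitivity = true_positives / len(diseased_scores)
    specificity = true_negatives / len(healthy_scores)
    return sensitivity, specificity


# Made-up test scores for people with and without the disease.
diseased = [4, 6, 7, 8, 9]
healthy = [1, 2, 3, 5, 6]

for cutoff in (3, 5, 7):
    sens, spec = sens_spec_at_cutoff(diseased, healthy, cutoff)
    print(f"cut-off {cutoff}: sensitivity {sens:.1f}, specificity {spec:.1f}")
```

With these numbers, raising the cut-off from 3 to 7 moves sensitivity from 1.0 down to 0.6 while specificity rises from 0.4 to 1.0.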
Define what a cohort study is
A study that groups participants by exposure and follows them over time to examine possible relationships between the exposure and an outcome; prospective in nature
Define what a case control study is
A study that groups participants by outcome (cases and controls) and looks back to examine possible relationships between the outcome and an exposure; retrospective in nature
Define what a cross sectional study is
A study that collects health information on a population at a particular point in time; generally used for measuring the prevalence of health outcomes or determinants of health in a population.
How are cohort studies and case control studies different?
Cohort studies define participants by exposure, case control studies define them by outcome
What are the advantages of a cross sectional study
- guide public health decision making
- initial exploration of ideas
- cheaper and faster than other study designs
- helpful to decision making in terms of the allocation of resources
- hypothesis generation
- informing health needs of a population
What are the disadvantages of a cross sectional study
- not the best study design for attributing causes to health-related outcomes
- because E and O are measured at the same time, any associations identified can be hard to interpret
- other prospective study designs that gather data in incident cases are better
- prone to confounding
- not suitable for rare diseases, highly fatal diseases, and diseases with a short duration
How do you correctly word the interpretation of an RR which is LESS than 1?
The risk of OUTCOME was (1 - RR) x 100% LOWER in the PRIMARY INTEREST GROUP (EXPOSED) than in the UNEXPOSED
How do you correctly word the interpretation of an RR which is MORE than 1?
The RISK of OUTCOME was RR times higher in EXPOSED than UNEXPOSED
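A small sketch of the two wordings above, assuming a 2x2 table with a = exposed with the outcome, b = exposed without, c = unexposed with, d = unexposed without. The function names and counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed (assumed 2x2 layout)."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def interpret_rr(rr):
    """Return a sentence in the flashcards' wording for RR below or above 1."""
    if rr < 1:
        percent_lower = (1 - rr) * 100
        return f"The risk of OUTCOME was {percent_lower:.0f}% lower in the EXPOSED than the UNEXPOSED"
    if rr > 1:
        return f"The risk of OUTCOME was {rr:.2f} times higher in the EXPOSED than the UNEXPOSED"
    return "The risk of OUTCOME was the same in the EXPOSED and the UNEXPOSED"


# Illustrative counts only: risk 10/100 in the exposed vs 20/100 in the unexposed.
rr = relative_risk(10, 90, 20, 80)
print(interpret_rr(rr))
```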
What is the importance of having a representative sample?
It helps ensure that your results are generalizable to a population and increases the scientific validity of your results
In the context of an RR and OR, what is the null value?
1
What kind of regression is used for categorical data?
Logistic
What kind of regression is used for continuous data?
Linear
What are the core assumptions of a survival analysis?
1) Survival probabilities are the same for participants who joined the study late and those who joined early. The factors that affect survival are assumed not to change over the study period.
2) Events occur at the times specified (recorded).
3) Censoring does not depend on the outcome of interest. In the Kaplan-Meier method, censoring is INDEPENDENT of the outcome
4) Censoring is similar in all groups
If the confidence interval of an OR or RR does not contain 1, then the result is….
Statistically significant!
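As a rough illustration of checking whether the interval contains the null value, here is a sketch that builds a 95% confidence interval for an OR. It assumes the log (Woolf) method, which the flashcards do not name, and the same 2x2 layout as for the OR formula; the counts are invented.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR with an approximate 95% CI using the log (Woolf) method.

    This method is an assumption of the sketch; other interval methods exist.
    """
    estimate = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


def excludes_null(lower, upper, null_value=1.0):
    """True when the interval does not contain the null value."""
    return not (lower <= null_value <= upper)


for counts in [(20, 80, 10, 90), (40, 60, 10, 90)]:
    estimate, lower, upper = odds_ratio_ci(*counts)
    print(f"OR {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f}),",
          "significant" if excludes_null(lower, upper) else "not significant")
```

In the first table the interval just includes 1, so the result would not be called statistically significant at the 5% level; in the second it excludes 1.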
What are the problems with evidence obtained from individual observational studies?
They do not address unknown confounders
They are liable to several forms of bias: the method of randomization, criteria for eligibility, issues with blinding, analysis, and handling of protocol deviations are all important considerations surrounding study quality.
They cannot prove causation
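What is the formula to calculate sensitivity?
a/(a+c), where a = true positives and c = false negatives
What is the formula to calculate specificity?
d/(b+d), where d = true negatives and b = false positives
What is the formula to calculate PPV?
a/(a+b)
What is the formula to calculate NPV?
d/(c+d)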
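The four formulas can be checked together in a short sketch. It assumes the standard diagnostic 2x2 layout, a = true positives, b = false positives, c = false negatives, d = true negatives; the function name and counts are illustrative only.

```python
def diagnostic_metrics(a, b, c, d):
    """Sensitivity, specificity, PPV and NPV from a diagnostic 2x2 table.

    Assumed layout: a = true positives, b = false positives,
                    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),  # diseased people who test positive
        "specificity": d / (b + d),  # disease-free people who test negative
        "ppv": a / (a + b),          # positive tests that are correct
        "npv": d / (c + d),          # negative tests that are correct
    }


# Made-up counts for illustration.
metrics = diagnostic_metrics(a=90, b=30, c=10, d=170)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```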
Why can you not ascribe equal weights to each study in a meta analysis?
Because each study was conducted in a particular population, where the overall underlying rates differed from each other, and with different individual effects. Some studies are likely to give an answer closer to the “true” effect size
In meta-analysis, what does weighting depend on?
Number of participants
Number of events
Variance
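One common way to turn these quantities into weights is inverse-variance weighting, where each study's weight is 1 divided by the variance of its estimate (larger studies with more events tend to have smaller variances). The sketch below assumes that approach and a fixed-effect model, neither of which is stated in the flashcards; the estimates are invented log odds ratios.

```python
def pooled_fixed_effect(estimates, variances):
    """Inverse-variance weighted average of study estimates (fixed-effect sketch)."""
    weights = [1 / variance for variance in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return pooled, weights


# Hypothetical log odds ratios and their variances from two studies.
log_ors = [0.2, 0.5]
variances = [0.04, 0.16]

pooled, weights = pooled_fixed_effect(log_ors, variances)
print(f"weights: {[round(w, 2) for w in weights]}, pooled log OR: {pooled:.2f}")
```

The more precise study (variance 0.04) gets four times the weight of the less precise one, so the pooled estimate sits much closer to 0.2 than to 0.5.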
What are the 3 types of heterogeneity?
- clinical
- methodological
- statistical
In the context of a meta analysis, what does the chi-squared test do to the null hypothesis?
Tests the Ho that all studies are homogeneous
The higher the I-squared score, the higher…
…the heterogeneity
What I-squared score is considered important?
0-40%: no important heterogeneity
75-100%: considerable heterogeneity
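For reference, I-squared is usually derived from Cochran's Q as I² = max(0, (Q - df) / Q) x 100, with df = number of studies minus 1. The sketch below assumes inverse-variance weights and uses invented study estimates; the function name is hypothetical.

```python
def cochran_q_and_i2(estimates, variances):
    """Cochran's Q and I-squared (as a percentage) for a set of study estimates."""
    weights = [1 / variance for variance in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared is truncated at 0 when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


# Three hypothetical studies with equal variances but quite different estimates.
q, i2 = cochran_q_and_i2([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
print(f"Q = {q:.1f}, I-squared = {i2:.0f}%")
```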
What are the advantages of a cohort study?
- follows the natural course of disease
- great for rare exposures
- allows for the estimation of absolute risks, such as incidence risks (as opposed to CCS, where you can only estimate relative measures)
- multiple outcomes from one exposure can be studied
- cause precedes the effect, an important aspect for ASCRIBING CAUSALITY
What are the disadvantages of a cohort study?
- risk of loss to follow up: calls into question the scientific validity of results
- resource and time-intensive
- not appropriate for rare outcomes
- not suitable for diseases with a long latency period
- possible “study effects”, e.g. performance bias in those being studied
What is healthy entrant bias?
People who have the disease of interest, or symptoms indicative of the start of the disease, are often excluded from cohorts at recruitment
Participants are biased towards healthy individuals
Initial incidence of outcome therefore might be lower in the cohort than the general population
The disease experiences of the cohort and that of the general population may well not be comparable
What is non-response bias?
Non-response bias can occur if those who participated in the study were different from those who did not. Important to note that this is about PARTICIPATION and not SELECTION.
What kinds of systematic error are RCTs liable to? Address and explain
Loss to follow up if there is not proper tracing involved. Information bias if: 1) participants are not blinded (performance bias) 2) assessors or interviewers are not blinded (observer bias and/or interviewer bias). And finally, selection bias (allocation bias) if participants are not properly randomized
What is the prevention strategy for observer bias?
The masking (blinding) of observers to the exposure or treatment status of participants
What is observer bias?
when the investigator is aware of the disease status, treatment group or outcome of the subject and their ability to interview the subject, collect or analyse the data in an unbiased manner is compromised
What is reporting bias?
When participants “collaborate” with researchers and give answers in the direction they perceive are of interest
What is interviewer bias?
When an interviewer takes more detailed notes from, or pays more attention to, cases than controls, so that unequal levels of data are collected for the two groups, which calls into question the validity of the findings
What is a prevention strategy for sampling bias?
Avoid using volunteers and use rigorous inclusion criteria
What is a prevention strategy for recall bias?
Collection of data from medical records (or another objective source of information)
What are some advantages of categorising an exposure variable (originally continuous) into a binary variable?
- Simplifies the statistical analysis and leads to easy interpretation and presentation of results
- As long as the cut off is CLINICALLY REASONABLE
What are some disadvantages of categorising an exposure variable (originally continuous) into a binary variable?
- Some participants in the low risk group could miss out on the potential intervention
- Information could be lost so STATISTICAL POWER to detect a relation between the exposure and outcome variables is reduced
- Dichotomising a variable at the median reduces power by the same amount as would discarding a third of the data
What are the three features of a confounding variable?
A confounding variable is a variable which:
- Does not lie on the causal pathway between E and O
- Is associated with (or a cause of) the exposure
- Is a cause of the outcome of interest
What is the definition of confounding?
A distortion of the association between exposure and outcomes. This occurs when the primary exposure of interest is mixed up with another factor that is associated with the outcome.
What kinds of studies are cohort, case control, ecological and cross-sectional?
OBSERVATIONAL
What kinds of studies are cohort, case control, and cross-sectional?
Observational analytical (because there is a comparator)
With case control studies, it can be difficult to…
Establish a temporal relationship (sequence of events)
This is because the exposure may be the result, not the cause, of the outcome of interest
Case control studies are useful for…
Establishing an association between outcome and exposure
A core strength of cohort studies is…
…that exposure precedes the outcome, an important aspect for ascribing causality
In cross-sectional studies, associations identified…
….may be hard to interpret
RCTs have the only study design which can…
in principle, directly address causal relationships
What kinds of outcomes are cohort studies not suitable for?
Rare outcomes
What kind of bias are case control studies prone to because of the design?
Information bias as the participants are already aware of their health status/conditions
What is the definition of selection bias?
when there is a systematic difference between the characteristics of those selected for the study and those who are not.
In the context of case control studies, randomization decreases….
….sampling bias
What are the advantages of an ecological study design
- Ecological studies are very useful for generating hypotheses
- They are low-cost and convenient
- They are helpful for indicating where further research is needed
- The analysis and presentation of findings is simple
What further analysis could you use for ecological studies?
- correlation coefficient
- regression coefficient
What would be the null hypothesis in the context of using the chi-squared test?
Ho = there is no association between the classification of the two attributes under investigation
What is the chi-squared test based on?
The difference between the observed and expected frequencies
What does the chi-squared test presume of its observations?
That they are independent
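The observed-versus-expected idea can be written out by hand. This is a sketch without a continuity correction, using an invented 2x2 table; the expected count in each cell is (row total x column total) / grand total.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two attributes were unrelated.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Illustrative counts: rows = exposed / unexposed, columns = outcome yes / no.
observed = [[20, 80], [10, 90]]
print(round(chi_squared_statistic(observed), 2))  # 3.92
```

For a 2x2 table there is 1 degree of freedom, and the 5% critical value is 3.84, so a statistic above that would lead to rejecting Ho at the 5% level.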
What if you are comparing more than two means? Which test would you use?
ANOVA (analysis of variance)
What does correlation do?
Measures the strength of linear association between two continuous variables (exposure and outcome), giving us a score between -1 and 1
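A minimal sketch of the calculation, assuming Pearson's correlation coefficient is the one meant here (the flashcard does not name it). The data are made up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical exposure and outcome values.
exposure = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]
print(round(pearson_r(exposure, outcome), 2))
```

A perfectly increasing straight-line relationship gives 1, a perfectly decreasing one gives -1, and values near 0 indicate little linear association.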
What is the full name of Cox’s regression?
Proportional hazards regression
What are the assumptions of Cox’s regression?
Hazards are proportional (the hazard ratio between groups is constant over time), and all censoring is independent of the outcome
What does the K-M curve do?
It is used to estimate the survival function. You would then do the log-rank test to compare survival between 2 groups.
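How the K-M estimate is built up can be sketched as follows: at each time where an event occurs, the running survival probability is multiplied by (1 - events / number at risk), and censored participants simply leave the risk set. The function name and the follow-up data are invented for illustration.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates.

    times:  follow-up time for each participant
    events: 1 if the event occurred at that time, 0 if the participant was censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        events_now = 0
        leaving_now = 0
        # Group everyone who shares this follow-up time.
        while i < len(data) and data[i][0] == current_time:
            events_now += data[i][1]
            leaving_now += 1
            i += 1
        if events_now:
            survival *= 1 - events_now / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving_now  # events and censored both leave the risk set
    return curve


# Five hypothetical participants; the 2nd and 5th are censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
for time, probability in curve:
    print(f"t = {time}: S(t) = {probability:.3f}")
```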
What is regression?
…..a statistical method that can be used to determine the relationship between one or more predictor variables and a response variable.
The response variable is…
….the dependent variable (normally Y!), the variable of interest
What are the assumptions of logistic regression?
- Response variable (dependent variable) is binary (categorical)
- Observations are independent
- There are no extreme outliers
- There is a Linear Relationship Between Explanatory Variables and the Logit of the Response Variable
- Sample size is sufficiently large
What is the outcome of survival analysis
Time to an event
What is Cox’s regression?
Cox regression analysis is a technique for assessing the association between variables and survival time
What does Cox regression do that K-M cannot do?
Create a multivariate analysis
What would you do for further analysis on a K-M curve?
- log rank test to get a p value
- Cox’s proportional hazards regression to get a hazard ratio
How do you word the reasoning for doing a meta analysis?
This meta analysis was performed to produce a summary estimate of the risk (or odds, depending on study type) of OUTCOME amongst EXPOSED
In addition to applying objective formulas, what are two other IMPORTANT reasons for doing a meta analysis?
To highlight the direction of the association between EXPOSURE and OUTCOME
Individual studies may be under powered
Greater sensitivity means….
….lower specificity. MORE FALSE POSITIVES
Greater specificity means…
….lower sensitivity, MORE FALSE NEGATIVES