types of study Flashcards
what does observational epidemiology do?
describes patterns of health and disease without intervening to change the factors which influence them
what do descriptive studies do?
they can measure the burden of illness
what can analytical studies do?
can investigate risk factors for a disease or outcome (does not necessarily mean they are causal)
what does interventional epidemiology do?
assesses the effect of a specific intervention
individual level or community
what needs to be said to define a case
boundaries of the case
unit of analysis
consider in context of the question we want to answer
cross-sectional study
to estimate the frequency of an exposure or outcome at a particular point in time
Uses of cross-sectional study
- Health Service Planning – prevalence of specific outcome in a defined population at point in time
- Assesses a burden of disease and can plan preventative and curative services – not useful for rare diseases
- Generate hypotheses about causes
o association with current risk factors
o association with past exposure or early clinical signs
what can cross-sectional study find
ASSOCIATION not causation
descriptive cross-sectional studies
describe frequency of exposure or outcome in a defined population
analytical cross-sectional studies
simultaneously collect information on both the outcome of interest and potential risk factors in a defined population. Then compare the prevalence of the outcome in the people exposed to each risk factor with the prevalence in those not exposed
steps to making a cross-sectional study
- defining the study question
- defining the target population
- select a study population
- collecting data
* Clear case definition
* Clear exposure definition
- analysing data
* Prevalence
* Prevalence ratio – prevalence of outcome in exposed / prevalence of outcome in unexposed
- interpreting results
* True association or reverse causality?
* Random error
* Bias
* Confounding
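The "analysing data" step can be sketched in a few lines of Python. This is a minimal illustration, not a full analysis: the function names and the survey counts are hypothetical, chosen only to show how prevalence and the prevalence ratio are calculated.

```python
def prevalence(cases: int, total: int) -> float:
    """Proportion of a group that has the outcome at the time of the survey."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cases / total


def prevalence_ratio(cases_exp: int, n_exp: int,
                     cases_unexp: int, n_unexp: int) -> float:
    """Prevalence of outcome in exposed / prevalence of outcome in unexposed."""
    return prevalence(cases_exp, n_exp) / prevalence(cases_unexp, n_unexp)


# Hypothetical survey: 200 exposed (40 with the outcome),
# 800 unexposed (80 with the outcome).
p_exposed = prevalence(40, 200)
p_unexposed = prevalence(80, 800)
pr = prevalence_ratio(40, 200, 80, 800)
print(f"prevalence exposed={p_exposed:.2f}, unexposed={p_unexposed:.2f}, ratio={pr:.2f}")
```

A ratio above 1 only shows an association at one point in time; the interpretation questions in the list (reverse causality, random error, bias, confounding) still apply.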
how do you define a target population
- Population of interest
- Select a sample of population
- Ensure sample representative otherwise selection bias
- Generalizability
- Random sampling – helps ensure sample is representative
potential biases in cross-sectional studies
selection bias - characteristics of those taking part vs those not taking part
information bias - recall bias
how do you minimise selection bias in a cross-sectional study?
think about
* How did the people who are participating in your study get to be where they are? Was it related to exposure? Was it related to disease?
* Are those who ended up in your study representative of the source population? Could their participation relate to exposure? Could it relate to disease?
how do you minimise information bias in a cross-sectional study?
Can minimise this problem by having a strict case definition for the outcome of interest, by using standardised methods of data collection and, if necessary, by ensuring that the researcher who assigns the diagnosis is blinded to (not aware of) exposure status.
Strengths of cross-sectional studies
- Easy and economical
- Provides important information on the distribution and burden of exposures and outcomes- valuable for health-service planning.
- Can be used as the first step in the study of a possible exposure-outcome relationship
Weaknesses of cross-sectional studies
- Measures prevalent rather than incident cases, so is of limited value for investigating aetiological relationships. Any association identified in a cross-sectional study reflects both developing the outcome and staying in the population with the outcome
- Can be difficult to establish the time-sequence of events in a cross-sectional study. The exposure may have occurred as a result of the outcome (reverse causality)
what is survey sampling
a statistical process that involves selecting and surveying individuals from a particular population
can make statements about a population based on a small sample
if a sample is well taken it can be almost as informative as a complete census
Pitfalls in surveys
o Inaccurate data
o Non-coverage – who was missed? what might they be like?
o Non-response – who didn’t reply? why not?
how to simple random sample
- list the group and ensure group is representative of population
* Rare for all subjects to participate
* Non-participants may differ – selection bias
* Report information on non-participants
- generate random numbers
- collect selected individuals
- collect data
Characteristics of a Random Sample
- Not haphazard
- Each subject has equal chance
- Number subjects 1 to n
- Computer generated random numbers
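A minimal sketch of "number subjects 1 to n, then use computer generated random numbers", using Python's standard `random` module. The function name, population size and seed are assumptions made for illustration.

```python
import random


def simple_random_sample(population_size: int, sample_size: int, seed=None) -> list:
    """Number subjects 1..n and draw sample_size of them without replacement.

    Every subject has the same chance of selection. The seed is optional and
    only makes the draw reproducible.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# Hypothetical sampling frame of 500 listed subjects, sample of 20.
selected = simple_random_sample(population_size=500, sample_size=20, seed=1)
print(selected)
```

Recording the seed lets the selection be reproduced and audited, which is one way of showing the sample was not haphazard.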
what is stratified sampling
when the population is divided into groups
people in each group tend to be similar – take a random sample from each group
Allows representation of not only the overall population, but also key subgroups of the population
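A small sketch of stratified sampling, assuming the same sampling fraction is used in every stratum (proportional allocation; other allocations are possible). The helper name and the made-up population of 400 people in two age groups are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(subjects, stratum_of, fraction, seed=None):
    """Split subjects into strata, then take a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: (id, age group); every fourth person is 65+.
people = [(i, "65+" if i % 4 == 0 else "under 65") for i in range(1, 401)]
chosen = stratified_sample(people, stratum_of=lambda p: p[1], fraction=0.1, seed=7)
print({group: len(members) for group, members in chosen.items()})
```

Because each group is sampled separately, the smaller subgroup is guaranteed to appear in the sample.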
ecological study
observational study with populations or groups (instead of individuals) being unit of observation
use of ecological studies
- Describes associations at group level
- Quick and cheap- routine data
- Generates hypotheses- first step
- Some risk factors may not easily be measurable at an individual level: eg environmental pollutants
how is an ecological study done?
- Compares group averages – health and risk factors
- Background risk will practically always differ between groups (groups are mostly not made up randomly)
- Limit inferences from ecological studies
- Explore differences between exposed and non-exposed
But
- Take advantage of natural experiments
- Data that are already available
ecological fallacy definition
An attempt to infer from the ecological level to the individual level is often called an ‘ecological fallacy’
what inferences can ecological studies make?
ecological inferences about effects at the group level – they don’t enable us to make inferences about individual risk
what is routine data?
data collected routinely, in a standardised and consistent way, for administrative purposes rather than targeted in a specific study
what are the advantages and disadvantages of routine data?
Advantages
- Often covers large populations, even whole countries
- Readily available + Cheap as already collected
Disadvantages
- Often not up-to-date given reporting delay and processing time
- Can be incomplete (except for census)
- Variable of interest may not be collected
- Can be influenced by political pressure or financial constraints
use of routine data in epidemiology
- To generate Hypotheses
- To assess a population and describe its baseline characteristics
- To estimate disease prevalence or incidence of events -> sample size calculations when planning detailed studies
what is record linkage
- uses anonymised personal identifiers
- to link
o mortality
o hospital discharge
o prescribing (Tayside)
o laboratory tests (Tayside)
- to examine
o predictors of mortality
o quality of care
o adverse drug reactions
what is a prospective cohort study?
Measures ‘today’s’ risk factors and then follows up
Need to keep good records of both groups into the future
More expensive and time consuming but there is a defined cohort with detailed records
what is a retrospective cohort study?
Measures the outcomes of interest ‘today’
Need to obtain detailed enough records of both groups looking at the past
Get faster answers, cheaper as follow up doesn’t have to be done, need to investigate quality of the past records
what are the main pitfalls in a cohort study?
atypical study group
losses to follow-up
purpose of cohort studies
- Infrequent/unusual exposure e.g. radiation
- Multiple outcomes related to infrequent exposure
- Sometimes for a rare outcome - follow-up power
- Temporal sequence needs to be established
- Interested in risk over time (e.g. change in risk with more years of smoking)
Steps in a cohort study
- Decide the study question
- Define target population
- Select study population
- Measure exposure
- Determine appropriate method to determine outcome
- Consider potential confounders and how you can measure them
- Logistics of follow-up – determine how much is needed
- Determine appropriate statistical analyses for association testing
- Interpret results
how to measure exposure
- Questionnaires
- Laboratory tests/physical measurements
- Medical records
- Occupational records
- Civic/governmental records
important in measuring outcome
Must ensure outcomes are measured in the same fashion for exposed and non-exposed. A tendency to look more closely in the exposed group causes differential bias
1. Clinical assessment
2. Questionnaires
3. Medical records
4. Government/civic records
assumption in cohort studies
participants are identical in all aspects but exposure
what is a confounder
a factor related to both the outcome and the exposure that is not on the causal pathway. You can control for confounders: accurate analysis needs data collected on all confounders, with the analyses adjusted accordingly
how do you make a comparison group in a cohort study
- Compare to a comparable population (age, gender, socio-economic class)
- Use general population data on incidence rates
How to calculate Risk cohort study
Step 1: make a contingency (or 2x2) table
Step 2: what kind of question do you want to answer?
* what is the risk of outcome for those exposed?
* What is the comparative risk of outcome for those who are exposed compared to those who are unexposed?
* Can we calculate a “rate”? This requires a time component… unique advantage of cohort studies is follow-up…
comparison of risk across groups - cohort study
risk ratio (RR) assess the strength of the association – compares incidence of disease between exposed and unexposed
Risk difference (RD):
measures clinical and public health importance of the causal relationship
Risk Ratios
- RR >1 – exposure is harmful
- RR <1 – exposure is protective
- RR = 1 – no association between exposure and outcome
(if risk ratio = 15.6, the risk of disease in the exposed is 15.6 times the risk in those without exposure)
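The 2x2 table calculations can be sketched as below. The cell labels (a, b, c, d), the function names and the counts are assumptions made for illustration: a = exposed with outcome, b = exposed without, c = unexposed with outcome, d = unexposed without.

```python
def cohort_measures(a: int, b: int, c: int, d: int) -> dict:
    """Risk in each group, risk ratio and risk difference from a 2x2 table."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
    }


def incidence_rate(events: int, person_years: float) -> float:
    """A rate needs a time component: events per person-year of follow-up."""
    return events / person_years


# Hypothetical cohort: 100 exposed (30 events), 100 unexposed (10 events).
result = cohort_measures(a=30, b=70, c=10, d=90)
rate_exposed = incidence_rate(events=30, person_years=450.0)
print(result)
print(f"rate in exposed: {rate_exposed * 1000:.1f} per 1000 person-years")
```

The risk ratio describes the strength of the association; the risk difference describes how much extra risk the exposed group carries.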
what is a case control study?
Cases – have disease of interest
Controls – do not have the disease
Where cohort studies measure disease in exposed vs unexposed, case control studies SELECT cases (diseased) and controls (undiseased). Starts with disease then finds exposures
6 steps to a case control study
- decide sample size
- case selection
- control selection
- collect exposure
- data analysis - odds ratio
- deal with threats to validity
how to avoid selection bias?
o Selected cases should be truly representative of all cases in source population, which you sample your control group from
o Clinic based sampling is vulnerable to selection bias, incidence rate calculation is not possible and the generalizability is uncertain compared to population sampling
o Cases must be sampled independently of exposure; if an association is known/suspected, bias can be introduced: intensified monitoring in exposed vs unexposed (higher likelihood of diagnosis in the exposed), or exposure avoidance if at high risk of disease (lower likelihood of exposure in the diseased)
what is selection bias?
the relation between exposure and disease is different for those who participate and those who are theoretically eligible but do not – it arises from the procedures used to select subjects or from factors that influence participation
what is information bias?
Occurs when errors in the measurement of characteristics and consequences of the errors are different for exposed vs unexposed/cases vs controls
when an association between exposure and disease is suspected may lead to
a higher likelihood of diagnosis in the exposed
a lower likelihood of exposure in those with risk factors for the disease (e.g. patients who had an episode of the outcome of interest before = prevalent cases)
what is misclassification bias
true case wrongly labelled control
true control wrongly labelled case eg if knowledge of exposure status influences diagnosis
- Impact
if random - usually weakens observed effect
if associated with exposure – unpredictable
- Minimise – ensure accurate objective diagnosis
what is observer bias
Knowledge of case/control status may influence data collection
Impact – identify more (spurious) risk factors in cases
Minimise by
Use of standardised objective instruments
Blind researchers to case/control status
what is recall bias
Cases and controls recall prior exposures differently
Minimising Recall Bias
Minimise period of recall (if possible)
Measure exposure data objectively - Medical notes or Third-party verification of exposure information
what is survivor bias
If exposure is rapidly fatal what happens to
number in cell a* (patient dies before he/she becomes known as case)
Will detect factors that increase survival among the diseased as risk factors for the disease
what to do if confounding
stratification
Advantage – effectively controls confounding
Disadvantages – reduces power and makes presentation and interpretation complex
Matching cases and controls
Advantage – effectively controls confounding
Disadvantage – matching on several factors makes it hard to find controls and cannot explore association of disease and matched variable
Multivariate methods (logistic regression)
Advantage – can cope with several factors at once and with continuous and categorical data
simple if unmatched
conditional logistic regression for matched
Disadvantage/Caution - May hide effect modification by treating effect modifiers as a nuisance
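To illustrate stratification, the sketch below compares a crude odds ratio with a Mantel-Haenszel pooled odds ratio. Mantel-Haenszel is not named in these notes; it is one common way of combining strata and is used here as an assumption. Each stratum is a hypothetical table (a, b, c, d) = (exposed cases, exposed controls, unexposed cases, unexposed controls).

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata: sum(a*d/n) / sum(b*c/n)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical data split by a confounder (e.g. two age groups).
strata = [(10, 20, 20, 40), (40, 10, 20, 5)]

# Collapsing the strata gives the crude (unadjusted) table.
crude_table = tuple(sum(cells) for cells in zip(*strata))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata)
print(f"crude OR = {crude_or:.3f}, stratified (MH) OR = {adjusted_or:.3f}")
```

In this made-up example the odds ratio is 1 inside each stratum, so the crude association comes entirely from the confounder. If the stratum-specific odds ratios differed markedly from each other, that would suggest effect modification, which a single pooled estimate would hide.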
what is effect modification
a factor which modifies the effect of exposure to a risk factor ie exposure to the risk factor has different effects at different levels of the effect modifier
what is an odds ratio
the odds of exposure in cases divided by the odds of exposure in controls; it approximates the risk of disease given exposure (like the RR in cohort studies) in most circumstances
how to calculate odds ratio
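A minimal sketch of the usual cross-product calculation, assuming a 2x2 table laid out as a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls. The layout, function name and counts are assumptions for illustration.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds of exposure in cases (a/c) divided by odds in controls (b/d).

    Algebraically this is the cross-product (a * d) / (b * c).
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical study: 100 cases (40 exposed), 100 controls (20 exposed).
or_estimate = odds_ratio(a=40, b=20, c=60, d=80)
print(f"odds ratio = {or_estimate:.2f}")
```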
what are genes and how do they work
genes are sections of DNA which determine our characteristics
a single gene can cause disease directly (Mendelian disease), or several genes together can be a contributing factor to disease (polygenic disease)
What genes have to do with disease and disease prevention
genetic variants, Mendelian and polygenic, can increase or decrease risk of disease
people with certain characteristics are more at risk, so some groups have more need for prevention
Principles of research in genetic epidemiology
study of families
- Familial Aggregation Analysis – are relatives of a person with the disease more likely to have the disease than the general population?
- Twin studies – Quantifies the relative contributions of genetic and environmental factors to a disease
- Adoption Studies - An alternative method for establishing the relative contribution of genetic and environmental factors to a disease – Compares disease concordance with biological parents to concordance with adoptive parents
- Segregation Analysis - Analyses the mode of inheritance of a disease and how many genes are involved
The use of genetic knowledge in the field of public health
can focus the advice on people with a higher genetic likelihood for disease
screening can be carried out
genetic testing of family members where prevention is possible
The ethical issues and history surrounding public health and genetics
is it ethical to test family members, etc.; issues like abortion and having children
what is the crude mortality rate
total number of deaths/ total population for specific time period
what does crude mortality rely on
- depends on the age/sex structure of the populations
- country with a higher proportion of old people will have higher number of deaths and higher crude mortality rate
what is standardisation
often used to control for confounding effects of age so that the rates of disease or mortality can be compared in populations with different age structures
direct standardisation calculates
new death rate - if age specific rates had occurred in the standard population
- obtain an age standardised rate which adjusts for the effect of age
to find direct standardisation you need
o the age-specific rates for all populations under study (or the data to calculate them)
o an appropriate standard population with a known age distribution
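Direct standardisation can be sketched as a weighted average: apply the study population's age-specific rates to the standard population's age structure. The age bands, rates and standard population below are hypothetical.

```python
def directly_standardised_rate(age_specific_rates: dict, standard_population: dict) -> float:
    """Deaths expected in the standard population / size of the standard population."""
    expected_deaths = sum(
        age_specific_rates[group] * standard_population[group]
        for group in standard_population
    )
    return expected_deaths / sum(standard_population.values())


# Hypothetical age-specific death rates (per person per year) in the study population.
study_rates = {"0-39": 0.001, "40-64": 0.005, "65+": 0.04}
# Hypothetical standard population with a known age distribution.
standard = {"0-39": 50_000, "40-64": 30_000, "65+": 20_000}

dsr = directly_standardised_rate(study_rates, standard)
print(f"age-standardised rate = {dsr * 1000:.1f} per 1000 per year")
```

Repeating the calculation for a second population with the same standard gives two rates that can be compared without the effect of age structure.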
indirect standardisation finds
number of deaths expected if both populations had the same (standard) age-specific death rates, but kept their real age structure
to find indirect standardisation you need
o Age-specific mortality rates for a standard population
o The age structure of the study populations
o The total number of deaths in the study populations
Standardised mortality ratio (SMR)
a measure expressed as either a ratio or percentage, to quantify an increase or decrease in mortality in a study cohort compared to the general population
SMR = O / E x 100
Compares observed (O) with expected (E) deaths
Input needed
- Number of persons in each age group in the population being studied
- Age specific death rates of the general population by the same age groups
- Observed deaths in the study population
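A short sketch of the SMR calculation from the three inputs listed above. The age bands and all numbers are hypothetical.

```python
def standardised_mortality_ratio(observed_deaths: int,
                                 study_population: dict,
                                 standard_rates: dict) -> float:
    """SMR = O / E x 100.

    E is the number of deaths expected if the study population, with its own
    age structure, experienced the standard age-specific death rates.
    """
    expected_deaths = sum(
        study_population[group] * standard_rates[group]
        for group in study_population
    )
    return observed_deaths / expected_deaths * 100


# Hypothetical study cohort by age group, and general-population death rates.
cohort = {"0-39": 2_500, "40-64": 1_500, "65+": 500}
general_rates = {"0-39": 0.001, "40-64": 0.005, "65+": 0.04}

smr = standardised_mortality_ratio(observed_deaths=36,
                                   study_population=cohort,
                                   standard_rates=general_rates)
print(f"SMR = {smr:.0f}")
```

An SMR above 100 means more deaths were observed than expected from the general-population rates; below 100 means fewer.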
advantages and disadvantages of direct standardisation
Advantages
- Less bias
- Better when comparing 2+ groups with different age distributions (age-specific rates of study populations applied to same standard population)
Disadvantage
- needs age specific rates for study populations and these are not always available or reliable (if based on few cases in some age groups)
- In small subpopulations with few cases the indirect method is preferred over the direct method
advantages and disadvantages of indirect standardisation
Advantages
- only need Total observed cases/deaths in study population and its age structure
Disadvantage
- SMR depends on age distribution of study population
- (since age-specific rates of standard population applied to each study population by age group)
- SMRs can only be compared between populations with similar age structure
what can cause populations to differ in age structure
- Developing and developed countries
- One country at two time periods
- Occupational groups
- Social class
Identifying confounders
1. The confounding factor must be associated with both the risk factor of interest (exposure) and the outcome.
2. The confounding factor must be distributed unequally among the groups being compared (E+ and E-) or (O+ and O-).
3. A confounder cannot be an intermediary step in the causal pathway from the exposure of interest to the outcome of interest.
Dealing with confounding in analysis
- Must collect information on all known potential confounding factors
- Explore for confounding in the analysis
- Practically, difficult to know which are the important confounders so you check magnitude
- Once identified, adjust regression models for confounders (multiple or multivariable regression)
Confounding effects
- Confounding will give you a flawed answer
- May account for all or part of an apparent association
- May cause an overestimate of true association (positive confounding) or an underestimate of the association (negative confounding)
chance - what needs to be determined
- need to determine whether the finding is due to chance or reflects what is being measured
- Null hypothesis (H0) v. alternate hypothesis
how do you do a statistical test
- Propose a hypothesis, and by extension propose the null hypothesis i.e. no effect
- statistical test and calculate P value
- At the conventional 5% level (corresponding to a 95% CI): if p value < 0.05 → conclude chance is unlikely to explain the result, i.e. there is evidence of a real effect
- If p>0.05 cannot exclude chance ie cannot conclude there is real effect
The meaning of p<0.05
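If p=0.05 would get a result (as extreme) 1 time in 20 by chance alone if the null hypothesis were true
If p=0.01 would get a result (as extreme) 1 time in 100 by chance alone if the null hypothesis were true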
what does the confidence interval tell you
- Most of the time (95%) the confidence interval will contain the real value
We know
i) observed treatment difference
ii) our result is affected by chance
We need to know
i) where the true value might lie
CI in different situations
- For difference in mean treatment effect
o if zero within confidence interval – NOT significant
- For ratio measures eg relative risk
o no difference if ratio = 1, so if 1 within confidence interval – NOT significant
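The rule for ratio measures can be sketched with a 95% confidence interval for a risk ratio. The notes do not say how the interval is computed; the log-transformation (large-sample) approximation used here is one common choice and is an assumption, as are the function names and counts. Cells: a = exposed with outcome, b = exposed without, c = unexposed with outcome, d = unexposed without.

```python
import math


def risk_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Return (RR, lower, upper) using a normal approximation on the log scale."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def excludes_no_difference(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """True if the interval does not contain the 'no difference' value."""
    return not (lower <= null_value <= upper)


# Hypothetical cohort: 30/100 events in exposed, 10/100 in unexposed.
rr, lower, upper = risk_ratio_ci(a=30, b=70, c=10, d=90)
significant = excludes_no_difference(lower, upper)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, excludes 1: {significant}")
```

For a difference in means the same check applies with `null_value=0.0`.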
what shows the strength of an association
- relative risk
- hazard ratio
- odds ratio
what shows impact of exposure in the population
attributable risk
population attributable risk
attributable risk % = (Ie – Iu) / Ie x 100 (Ie, Iu – incidence in exposed and unexposed)
what is attributable risk
difference between the incidence in the exposed and that in the unexposed
incidence in exposed – incidence in unexposed: AR = Ie – Iu
other ways to calculate attributable risk
cohort study - attributable risk % = (RR – 1) / RR x 100
- There is no way to calculate AR in a case-control study
However if
- Exposure is not very common
- Risk is not very high
Then – odds ratio approximates relative risk
what is population attributable risk (PAR)
incidence of disease in a population attributable to the risk factor
PAR =
- absolute difference between risk in the total population (It) and unexposed population (Iu)
PAR = It – Iu
PAR = AR x Pe (Pe – prevalence of exposure in whole population)
usefulness of the PAR value
- excess risk of disease in total population attributable to exposure
- reduction in risk achieved if population entirely unexposed
- helps determining exposures relevant to public health in community
PAR % is
- proportion of cases in the population attributable to the exposure
- PAR expressed as a percentage of total risk in population
- Proportion of the disease in the population that could be eliminated if exposure were eliminated
PAR % = (It – Iu) / It x 100
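The attributable-risk measures can be worked through together. The sketch assumes the total population is a mix of exposed and unexposed people, so that It = Pe x Ie + (1 – Pe) x Iu; the function name and the incidences are hypothetical.

```python
def attributable_measures(ie: float, iu: float, pe: float) -> dict:
    """AR, AR %, PAR and PAR % from incidence in exposed (ie),
    incidence in unexposed (iu) and prevalence of exposure (pe)."""
    it = pe * ie + (1 - pe) * iu          # incidence in the total population
    ar = ie - iu                          # attributable risk
    par = it - iu                         # population attributable risk
    return {
        "AR": ar,
        "AR_percent": ar / ie * 100,
        "It": it,
        "PAR": par,
        "PAR_percent": par / it * 100,
    }


# Hypothetical values: 3% incidence in exposed, 1% in unexposed,
# and 25% of the population exposed.
m = attributable_measures(ie=0.03, iu=0.01, pe=0.25)
print({name: round(value, 4) for name, value in m.items()})
```

Under that assumption PAR also equals AR x Pe, which gives a quick check on the arithmetic.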
what is the problem with population attributable risk
Problem – getting the data
* Incidence in the total population
- From routine data
* Incidence among the exposed and non-exposed
- Use data from cohort studies
- Assume it applies to the whole population
causation
implies that there is a true mechanism that leads from exposure to disease
what is a cause
- A factor which increases the frequency of a disease (outcome/event)
Rothman’s model of causation
- No single causes of disease
- Cause of disease – a constellation of components that act together
- 2 types of cause – necessary and sufficient
what is a necessary cause
- An exposure which is necessary for disease to occur
- A necessary cause must always precede the disease – HIV and AIDS
- Necessary cause may not act alone – not everyone with HIV gets AIDS
what is a sufficient cause
- Set of conditions is a sufficient cause when it always produces the outcome
criteria for causal inference
- Strength of the association
- Consistency - replication
- Specificity of the association
- Temporal sequence
- Biological gradient
- Coherence
- Experiment
- Theoretical basis
- Analogy
causality at individual level
- A given sufficient cause requires the joint action of many component factors (component causes)
- Sufficient causes can vary between individuals
- component causes can act far apart in time.
- a strong factor is a component of many causal pies
- blocking the action of any component cause prevents the disease
how are new treatments carefully assessed
randomised controlled trials
- randomisation enables fair comparison
- need a fair outcome assessment
structure of a RCT
PICO
population
intervention
comparator
outcome
what are the common problems with RCTs
unequal at baseline
lack of blinding
loss to follow-up
poor outcome measurement
when is stratified randomisation useful RCT
useful if other risk factors have a strong influence on the outcome
what is restricted randomisation
refers to any procedure used with random assignment to achieve balance between study groups in size or baseline characteristics
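One example of restricted randomisation is permuted-block randomisation, sketched below. This particular method is not named in the notes and is used here as an illustration; the function name, block size and arm labels are assumptions.

```python
import random


def permuted_block_allocation(n_participants: int, block_size: int = 4,
                              arms=("A", "B"), seed=None) -> list:
    """Allocate participants in shuffled blocks so group sizes stay balanced."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        block = list(arms) * (block_size // len(arms))  # equal numbers per arm
        rng.shuffle(block)                              # random order within block
        allocation.extend(block)
    return allocation[:n_participants]


# Hypothetical trial of 20 participants in blocks of 4.
schedule = permuted_block_allocation(20, block_size=4, seed=42)
print("".join(schedule))
```

After every complete block the two groups are the same size, which plain (unrestricted) randomisation does not guarantee in small trials.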
what is a crossover study?
- Each patient gets both treatments, with half receiving A first and half receiving B first
- Patient is own control
o reduced variance
o much smaller sample size
- Requirements
o no carryover A to B or B to A
o no change in disease severity
what is cluster randomisation
- when people are organised in natural groups
- cluster are randomised not individuals
complications
- Need to randomise lots of clusters – if few clusters, could get unequal distribution after randomisation i.e. not the same at baseline
- Need increased sample size (of individuals) – subjects within a cluster are more similar to each other than to subjects in other clusters (measured by the intra-class correlation coefficient) – ideal is lots of small clusters
- Allocation concealment more difficult
- Analysis more complicated – need to take account of cluster design
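The increase in sample size is often summarised by the design effect, 1 + (m – 1) x ICC, where m is the cluster size. The notes do not give this formula; it is added here as the standard approximation for equal-sized clusters, and the trial numbers are hypothetical.

```python
import math


def design_effect(cluster_size: int, icc: float) -> float:
    """Inflation factor for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def inflated_sample_size(n_individual: int, cluster_size: int, icc: float) -> int:
    """Individuals needed under cluster randomisation, rounded up."""
    needed = n_individual * design_effect(cluster_size, icc)
    return math.ceil(round(needed, 6))  # round first to absorb floating-point noise


# Hypothetical: 400 people needed if individually randomised, ICC = 0.05.
n_large_clusters = inflated_sample_size(400, cluster_size=20, icc=0.05)
n_small_clusters = inflated_sample_size(400, cluster_size=5, icc=0.05)
print(n_large_clusters, n_small_clusters)
```

With the same ICC, smaller clusters need fewer individuals in total, which is why lots of small clusters is the ideal.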
what is a stepped wedge design
- Cannot deliver to all clinics at same time
o Randomise clinics into a sequence for treatment
o All clinics receive intervention but not at same time
- Possible uses
o New treatment – clinic by clinic
o Provision of piped water, free school meals or prisoner vocational training (always place by place)
design of stepped wedge
- Time of crossover is randomized
o crossover is unidirectional
- Need to be able to measure outcome on each unit at each time step at same time
- Same disadvantages as cluster randomisation
o best with lots of clusters/steps
o analysis complex
- Units act as their own control
o helps reduce no. of units needed (same as cross-over design)
advantages RCT
- Intervention and control groups will be similar in all respects except the intervention, minimising selection bias and confounding
- If the participants are “blind” to the treatment allocation, reporting bias is minimised; if the investigators are “blind” to the allocation, observer bias is minimised
- RCTs carry less risk of bias and confounding than other study designs and so can provide powerful evidence of a causal relationship between the intervention and the outcome
- Intervention studies are similar to cohort studies in that:
o multiple outcomes can be examined
o the incidence rate of the outcome can be measured
disadvantages of RCT
- Expensive to conduct: they may require a large study team, perhaps at several sites, and may require a long follow-up period.
- In some situations, intervention studies are impossible to conduct for ethical or logistical reasons.
- Recruitment is difficult and time-consuming
- Trials take years to do
what is intention to treat analysis
- Compares outcomes for all randomised individuals – even if they stop taking treatment or drop out of study.
- Assesses the overall effect of assigning a subject to receive a particular intervention.
- ITT analysis is the most important and “safest” analysis, as intervention and control groups are compared as originally randomised.
- More likely to underestimate treatment effect
- Often use a modified ITT analysis – all randomised participants with some follow up data
what is per-protocol analysis
- Opposite of ITT
- Includes only those who finished the trial and took the drugs / underwent the intervention
- Maximises likelihood of showing a difference between groups
- May minimise harms
- May introduce bias due to selective dropout
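The difference between the two analyses can be seen on a toy data set. Everything below is hypothetical: 12 participants per arm, with two treatment-arm participants who stopped treatment and later had the event.

```python
def make(arm: str, completed: bool, event: bool, n: int) -> list:
    """Build n identical hypothetical participant records."""
    return [{"arm": arm, "completed": completed, "event": event} for _ in range(n)]


participants = (
    make("treatment", True, False, 8)
    + make("treatment", True, True, 2)
    + make("treatment", False, True, 2)   # stopped treatment, then had the event
    + make("control", True, False, 7)
    + make("control", True, True, 5)
)


def risk_by_arm(records: list) -> dict:
    """Proportion with the event in each arm, grouped by arm as randomised."""
    risks = {}
    for arm in ("treatment", "control"):
        group = [r for r in records if r["arm"] == arm]
        risks[arm] = sum(r["event"] for r in group) / len(group)
    return risks


# ITT: everyone randomised. Per-protocol: only those who completed.
itt = risk_by_arm(participants)
per_protocol = risk_by_arm([r for r in participants if r["completed"]])
print("ITT:", itt)
print("Per-protocol:", per_protocol)
```

In this example the per-protocol analysis shows a larger gap between arms than the ITT analysis, because the participants who dropped out of the treatment arm are excluded.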