Research methods Flashcards
Blinding vs concealment
Single blind- patient blinded
Double blind- patient and clinician blinded
Allocation concealment- third party/hidden method used to allocate groups, so those recruiting participants cannot know or predict the upcoming allocation. Reduces selection bias by protecting the randomisation sequence. Can be used even when blinding is not possible due to the nature of the intervention.
ITT vs Per protocol
ITT- all participants analysed in the group they were randomised to, regardless of whether they complete treatment.
+ Preserves randomisation
+ greater generalisability, reflects clinical situation
+ maintains sample size
PP- only analyses those who kept to the study protocol
- harder to generalise
- may introduce bias
Strengths and weaknesses of vital statistics
Government collected population level data
+ cheap and easily available
+ mostly complete
+ Contemporary
+ Used for monitoring trends
- Not 100% complete
- Potential for bias (underreporting, post mortem status inflation)
- Can become out of date (census)
Ways to improve routine data quality
Computerise data collection and analysis
Feedback of data to providers
Presentation of data in a variety of ways
Training
Sources of routine stats in England
Census
Mortality stats (ONS)
Morbidity- GP codes, Clinical Practice Research Datalink (CPRD), Hospital Episode Statistics (HES), lab results
National registries (cancer, Congenital abnormalities, Prostheses, transplants), Confidential inquiries
Notifiable diseases, General Lifestyle Survey
ONS Psychiatric Morbidity Survey
The Association of Public Health Observatories in the UK
Dimensions of descriptive epidemiology
Time: e.g. secular trends (decades/centuries), seasonal, epidemics, point events
Place: Where the incidence is high/low
Person: Who is affected? Demographics, occupation, behaviours.
Right censoring
Subjects leaving the at risk population in a cohort study. E.g lost to follow up, die from other diseases.
Left censoring
Subjects joining after the event has occurred. Uncommon, and subjects mostly excluded.
Incidence rate
New cases/ person time at risk
Cumulative incidence
No of new cases/ population at risk
In any given time period.
Assumes a closed population.
e.g attack rate during a pandemic
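A minimal sketch of the two incidence measures above. The function names and the numbers are made up for illustration only.

```python
def incidence_rate(new_cases, person_time_at_risk):
    """New cases per unit of person-time at risk."""
    return new_cases / person_time_at_risk


def cumulative_incidence(new_cases, population_at_risk):
    """Proportion of a closed at-risk population that becomes a case in the period."""
    return new_cases / population_at_risk


# Illustrative (made-up) numbers: 30 new cases over 1,500 person-years,
# and 30 new cases in a closed population of 600 followed for the whole period.
rate = incidence_rate(30, 1500)       # cases per person-year
risk = cumulative_incidence(30, 600)  # proportion of the population
print(f"{rate * 1000:.0f} per 1,000 person-years; cumulative incidence {risk:.0%}")
```

The rate has person-time in the denominator, so it copes with people entering and leaving follow-up; the cumulative incidence is a simple proportion and only makes sense for a closed population.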
Direct standardisation
Age specific mortality rates of study population are KNOWN.
Mapped on to reference population to make the rate comparable for differently structured populations.
Gives an age standardised rate
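A rough sketch of the direct method, assuming three hypothetical age bands and made-up rates and population counts. The study population's age specific rates are weighted by the reference population's age structure.

```python
def directly_standardised_rate(study_rates, reference_population):
    """Weight the study population's age specific rates by the reference age structure."""
    expected_events = sum(
        study_rates[age] * reference_population[age] for age in study_rates
    )
    total_reference = sum(reference_population[age] for age in study_rates)
    return expected_events / total_reference


# Made-up age specific mortality rates (deaths per person per year) in the study population
study_rates = {"0-39": 0.001, "40-64": 0.005, "65+": 0.040}
# Made-up reference (standard) population counts for the same age bands
reference_population = {"0-39": 50_000, "40-64": 30_000, "65+": 20_000}

dsr = directly_standardised_rate(study_rates, reference_population)
print(f"Age standardised rate: {dsr * 1000:.1f} per 1,000")
```

Two populations standardised to the same reference can then be compared without the difference in age structure distorting the result.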
Indirect standardisation
Age specific rates are NOT KNOWN. Often true in smaller populations e.g ethnicities
Apply age specific rates from the reference population to the study population to calculate expected deaths. Compare actual and expected.
Standardised mortality ratio (SMR) = observed deaths / expected deaths
Caution using this in occupational exposures, as the whole population contains the subgroup. Often requires comparison between two groups.
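A minimal sketch of the indirect method, again with hypothetical age bands and made-up numbers. Reference rates are applied to the study population's age structure to get expected deaths, which are compared with the observed deaths as a ratio.

```python
def standardised_mortality_ratio(observed_deaths, reference_rates, study_population):
    """Observed deaths divided by deaths expected if reference rates applied."""
    expected_deaths = sum(
        reference_rates[age] * study_population[age] for age in study_population
    )
    return observed_deaths / expected_deaths


# Made-up age specific mortality rates in the reference population
reference_rates = {"0-39": 0.001, "40-64": 0.005, "65+": 0.040}
# Made-up age structure of the (small) study population
study_population = {"0-39": 2_000, "40-64": 1_000, "65+": 500}

smr = standardised_mortality_ratio(36, reference_rates, study_population)
print(f"SMR: {smr:.2f} (often reported x100)")
```

A ratio above 1 means more deaths were observed than expected from the reference rates; below 1 means fewer.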
YLL and HALE
YLL- Years of Life Lost
Sum of years lost up to 75
Gives more weight to deaths at younger ages, underestimates burden from chronic disease
HALE- Health adjusted life expectancy
Sum of number of life years lived x health state score.
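A small sketch of both sums, assuming the cut-off of 75 given above and made-up ages and health state scores. The function names are hypothetical, and this HALE helper is a simplified per-person sum rather than a full life-table calculation.

```python
def years_of_life_lost(ages_at_death, cutoff=75):
    """Sum of years lost before the cut-off age; deaths at or after it add nothing."""
    return sum(max(0, cutoff - age) for age in ages_at_death)


def health_adjusted_years(periods):
    """Sum of (years lived x health state score), with scores between 0 and 1."""
    return sum(years * score for years, score in periods)


# Made-up ages at death: only deaths before 75 contribute
print(years_of_life_lost([30, 60, 80]))

# Made-up life course: 60 years in full health, 15 years at 0.8, 5 years at 0.5
print(health_adjusted_years([(60, 1.0), (15, 0.8), (5, 0.5)]))
```

The first call shows why YLL is weighted towards early deaths: the death at 30 contributes three times as much as the death at 60, and the death at 80 contributes nothing.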
Attributable risk
Risk that can be attributed to the exposure.
Absolute risk in exposed group- absolute risk in unexposed
e.g Incidence of CHD in smokers vs non smokers
0.1-0.01= Attributable risk of CHD in smokers 0.09
Attributable fraction
What proportion of the disease in the exposed group can actually be blamed on the exposure.
E.g
AR in CHD smokers = 0.09
0.09/0.10= 0.9= 90%
90% of CHD in smokers can be attributed to smoking, as 10% would have occurred anyway.
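The two calculations above can be sketched as follows, using the same CHD figures from the cards (0.10 in smokers, 0.01 in non-smokers). The function names are hypothetical.

```python
def attributable_risk(risk_exposed, risk_unexposed):
    """Excess risk in the exposed group: risk in exposed minus risk in unexposed."""
    return risk_exposed - risk_unexposed


def attributable_fraction(risk_exposed, risk_unexposed):
    """Proportion of the risk in the exposed group that is due to the exposure."""
    return attributable_risk(risk_exposed, risk_unexposed) / risk_exposed


ar = attributable_risk(0.10, 0.01)
af = attributable_fraction(0.10, 0.01)
print(f"Attributable risk: {ar:.2f}; attributable fraction: {af:.0%}")
```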
Population attributable risk
Excess rate of disease in the whole population that is attributable to exposure
Rate in whole pop- rate in unexposed
Population attributable fraction
Effect of exposure on the whole population as a proportion
(Rate in whole pop - rate in unexposed) / rate in whole population, i.e. PAR / rate in whole population
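A sketch of the population-level versions. The 20% exposure prevalence is a made-up assumption used only to build an overall rate from the CHD figures on the earlier cards; in practice the whole-population rate is usually observed directly.

```python
def population_attributable_risk(rate_population, rate_unexposed):
    """Excess rate in the whole population attributable to the exposure."""
    return rate_population - rate_unexposed


def population_attributable_fraction(rate_population, rate_unexposed):
    """PAR as a proportion of the rate in the whole population."""
    return population_attributable_risk(rate_population, rate_unexposed) / rate_population


# Assumed for illustration: 20% of the population smoke,
# with CHD rates of 0.10 in smokers and 0.01 in non-smokers.
prevalence_exposed = 0.20
rate_population = prevalence_exposed * 0.10 + (1 - prevalence_exposed) * 0.01

par = population_attributable_risk(rate_population, 0.01)
paf = population_attributable_fraction(rate_population, 0.01)
print(f"PAR: {par:.3f}; PAF: {paf:.0%}")
```

Unlike the attributable fraction in the exposed group, the PAF depends on how common the exposure is, so a strong but rare exposure can still have a small PAF.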
Risk ratio
Risk of disease in exposed/ risk of disease in unexposed
Calc using 2x2 contingency table
Rate ratio
Incidence in exposed/ incidence in unexposed
Odds ratio
Odds of exposure in diseased / odds of exposure in non-diseased (case control), or odds of disease in exposed / odds of disease in unexposed.
Calc via 2x2 contingency table
(a/c) / (b/d)
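A minimal sketch of the risk ratio and odds ratio from a 2x2 table. The cell layout is an assumption, since the cards do not spell it out: a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease. The counts are made up.

```python
def risk_ratio(a, b, c, d):
    """Risk of disease in exposed divided by risk of disease in unexposed."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """(a/c) / (b/d), which equals the cross-product ratio (a*d) / (b*c)."""
    return (a / c) / (b / d)


# Made-up table: 30 of 100 exposed and 10 of 100 unexposed develop the disease
a, b, c, d = 30, 70, 10, 90
print(f"Risk ratio: {risk_ratio(a, b, c, d):.2f}")
print(f"Odds ratio: {odds_ratio(a, b, c, d):.2f}")
```

The odds ratio is further from 1 than the risk ratio here because the disease is common in this made-up table; the two converge when the disease is rare.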
Reverse causation
Where the outcome causes a change in exposure
e.g
Breast feeding and poor growth in developing countries- actually due to poor weaning
Sleep and QoL
Drugs and psychological harm
Bradford Hill criteria
Criteria to assess causality
Strength of association- The greater the association the more likely it is due to causation (not true in reverse)
Biological plausibility
Consistency of findings
Temporal sequence
Dose response
Specificity - whether the exposure causes one specific outcome rather than many.
Coherence - No conflict with the natural history of the disease
Reversibility- remove risk, disease reduces
Analogy- similar to other established cause-effects
Types of selection bias
Volunteer, Control, Healthy worker effect, follow up bias
Types of measurement bias
Instrument, responder (recall, placebo), observer
Minimising bias
Randomisation
Blinding
Irrelevant factors- collect irrelevant factors to check bias between groups
Repeated measurement - inter observer agreement
Training
Written protocol
Choice of controls
Ease of follow up
High risk cohorts - increase event rate
Duplication/ triangulation
Confounding
A variable that can influence both the dependent variable and independent variable, causing a spurious association.
Residual confounding
Confounding that remains after all known confounders have been dealt with. This can be reduced with randomisation, as unknown confounders are then, on average, equally distributed between groups.
Effect modifiers
Where the effect of the exposure on the outcome is modified by a third variable. e.g. smoking and CHD- the effect of smoking is worse at younger ages, so age is an effect modifier
Results should be analysed separately by age band, alongside a Chi sq test of heterogeneity
Controlling for confounding: Design stage
Randomisation - In large samples this is effective at minimising confounding. But not always possible
Restriction - limit sample to one group e.g to reduce effect of age and ethnicity. Cheap and easy method, less generalisable results, may get residual confounding
Matching- useful in smaller studies, difficult and expensive, no control when factors can’t be matched.
Controlling for confounding: Analysis stage
Stratification- Mantel-Haenszel method. Divide confounders into strata and provide strata specific estimates (with CI), and weighted average of overall effect. Only controls for a few confounders
Standardisation
Multivariate analysis- multiple regression and logistic regression. Transparency lost, but overall preferred method.
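As an illustration of the stratification approach above, here is a rough sketch of the Mantel-Haenszel pooled odds ratio, using the standard formula: the sum of a*d/n across strata divided by the sum of b*c/n. The 2x2 layout (a = exposed cases, b = exposed non-cases, c = unexposed cases, d = unexposed non-cases) and the counts are assumptions for illustration, and confidence intervals are left out.

```python
def stratum_odds_ratio(a, b, c, d):
    """Odds ratio within a single stratum of the confounder."""
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata):
    """Weighted average of the stratum odds ratios across all strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Made-up counts for two strata of a confounder (e.g. younger and older)
strata = [(10, 20, 5, 40), (30, 20, 15, 35)]

for table in strata:
    print(f"Stratum OR: {stratum_odds_ratio(*table):.2f}")
print(f"Mantel-Haenszel pooled OR: {mantel_haenszel_odds_ratio(strata):.2f}")
```

If the stratum specific estimates are similar to each other but differ from the crude estimate, that suggests confounding; if they differ markedly from each other, that suggests effect modification and a pooled estimate is less meaningful.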
Case study/series
Hypothesis formulation, descriptive, individual based.
+ Rapid, low cost
- No causation/analysis
- not generalisable
- No comparison group
- Not assessing disease burden
Ecological studies
Descriptive, hypothesis generating, population level
Compare large groups.
+low costs and quick
- unknown confounders
- Only considers average exposure
- Spatial auto correlation (assumes all areas are independent)
- leakage of exposure through migration
- Not individual, ecological fallacy
Cross sectional
Can be descriptive, analytical and/ or ecological. Hypothesis formulation.
Simultaneous prevalence of exposure and disease
Disease frequency (odds/prevalence)
Sample representative of population.
+ Multiple exposures and outcomes
+quick and cheap
+Useful for rarer diseases
+can detect disease burden
- Prevalence not incidence (Can’t differentiate determinants and survival)
- risk of reverse causation (no temporality)
- Recall bias
Case control studies
Identify disease and exposure of interest. Cases are those with the disease; controls are matched to cases but do not have the disease.
Can be retrospective or prospective. Measured using the odds ratio.
+Rapid and cheap
+ ideal for rarer diseases/outcomes
+ Disease with longer latent periods
+ Can measure large number of potential exposures
- Selection bias
- Hard to assess temporality
- Recall bias
- Poor for rare exposures
- no incidence
- Misclassification of disease/outcome skews results
- Data fishing
Cohort studies
Exposure and outcome identified.
Select cohort of exposed patients, disease status unknown.
Measures incidence, rate, risk, mean, median. RR, AR and survival analysis
+ Temporal
+ Good for rare exposures
+ Multiple effects of single exposure
+ Minimal selection bias in prospective study
+ Good for long latency
- Expensive and time consuming
- Loss to FU
- not good for rare disease
- Retrospective - poor records
- Healthy worker effect
Intervention studies
Exposure is allocated.
Stopping rules: Independent group monitors interim results. Stop for extreme positive or negative results, unblind for serious single events; a high level of significance is required to stop.
Non-compliance can bias the result towards the null
Analysis of frequency, effect, placebo effect and ITT
+high quality
+ Valid
+ Bias minimised if done well
- Generalisability sacrificed
- high cost
- Ethics
- Bias from loss to FU, observation bias, placebo
Can be cross over, cluster and factorial
Limitations in health research context:
Resources, Timescales (prevention), Changes in policy, differences across the country, difficult to study organisational changes.
Small area analysis
+ Large analysis may hide variation at regional level e.g coastal areas
- There may be little variation of exposure at smaller scale
- data errors and chance have a greater effect on results
- Poor quality data available
Validity
Degree to which an instrument measures what it's supposed to measure.
Criterion- Concurrent (compared to gold standard) and predictive.
Face- how well it corresponds to expert opinion
Content- is it representative of the issue
Construct- does it represent the construct.
Improving validity- measure against gold standard, triangulation, address measurement bias.
Reliability
Consistency of an instrument's performance
INTRA - observer- same observer, same subject
INTER observer- multiple observers same subject
Measured with a correlation coefficient/KAPPA. >0.7 is generally deemed reliable.
Equivalence - two instruments
Internal consistency- within the instrument e.g specific questions on a survey
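A minimal sketch of Cohen's kappa for inter observer agreement, assuming two observers each classify the same ten subjects. The ratings are made up and the function name is hypothetical. Kappa is (observed agreement - agreement expected by chance) / (1 - agreement expected by chance).

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions, summed over categories
    expected = sum(counts_a[cat] * counts_b[cat] for cat in counts_a) / n**2
    return (observed - expected) / (1 - expected)


# Made-up ratings of the same ten subjects by two observers
observer_1 = ["y", "y", "y", "y", "n", "n", "n", "n", "y", "n"]
observer_2 = ["y", "y", "y", "n", "n", "n", "n", "y", "y", "n"]

print(f"Kappa: {cohens_kappa(observer_1, observer_2):.2f}")
```

The observers agree on 8 of 10 subjects, but half of that agreement would be expected by chance, which is why kappa comes out lower than the raw 80% agreement.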
Clustered data
Groups/linked data
- Need to adjust sample size to compensate for individuals within a cluster being more similar to each other (ICC- intra cluster correlation coefficient)
- Calc summary stats for each cluster,
- Calc robust standard errors
Using ANOVA
- Random effect models - analyse similarities between individuals within a cluster
Fixed effects- assumptions about the independent variable
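One common way to make the sample size adjustment described above is the design effect, 1 + (m - 1) x ICC, where m is the average cluster size. The card does not name this formula, so treat the sketch below as an assumed illustration with made-up numbers.

```python
import math


def design_effect(cluster_size, icc):
    """Inflation factor for clustered data: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def clustered_sample_size(n_individual, cluster_size, icc):
    """Sample size needed under clustering, rounded up to a whole participant."""
    inflated = n_individual * design_effect(cluster_size, icc)
    # Round first so floating point noise does not push the result up by one
    return math.ceil(round(inflated, 6))


# Made-up example: 200 participants needed if individually randomised,
# clusters of 21 people, ICC of 0.05
print(design_effect(21, 0.05))
print(clustered_sample_size(200, 21, 0.05))
```

Even a small ICC has a large effect when clusters are big, because the inflation scales with cluster size.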
NNT
Reciprocal of absolute risk reduction
How many patients do I need to treat to benefit one patient
+ More intuitive
- not generalisable to populations where the baseline risk of disease differs
- Can only compare NNT for different therapies if baseline disease risk is the same
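A short sketch of the NNT calculation with made-up risks. It also illustrates the baseline risk caveat above: the same 25% relative risk reduction gives very different NNTs at different baseline risks. The function name is hypothetical.

```python
def number_needed_to_treat(risk_control, risk_treated):
    """Reciprocal of the absolute risk reduction (control risk minus treated risk)."""
    absolute_risk_reduction = risk_control - risk_treated
    return 1 / absolute_risk_reduction


# Made-up risks: the treatment cuts risk by a quarter in both populations
high_baseline = number_needed_to_treat(0.20, 0.15)  # ARR of 0.05
low_baseline = number_needed_to_treat(0.04, 0.03)   # ARR of 0.01

print(f"High baseline risk: NNT about {high_baseline:.0f}")
print(f"Low baseline risk: NNT about {low_baseline:.0f}")
```

In practice the NNT is usually reported rounded up to a whole number of patients.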
Time trend analysis
Describing events over time periods, account for seasonality/ change over time, assessment of exposure/outcome over time, Evaluate impact of an intervention, Effect of an unplanned event, projections.
Analysis using moving averages, and segmented regression analysis
Examples: Time series designs ( two time points in series)
Repeated measures before/after, Crossover (baseline, intervention, baseline), At different locations.
- Secular changes - e.g demographics
- Concurrent interventions/events/ exposures
- Latency periods
- Diffuse exposure
- Seasonal changes
- Auto correlation - for some outcomes the value at one time point affects the value at another - needs to be adjusted for in analysis
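A minimal sketch of the moving average mentioned above, assuming a simple unweighted average over a fixed window and made-up monthly counts. Segmented regression is not shown.

```python
def moving_average(values, window):
    """Unweighted average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Made-up monthly event counts
monthly_counts = [10, 12, 14, 16, 18, 20]
print(moving_average(monthly_counts, window=3))
```

A longer window smooths out more short-term and seasonal noise, at the cost of losing points at the ends of the series and blurring sudden changes such as the start of an intervention.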
Probability sampling
Requires a sampling frame (complete list of the population from which the sample is to be drawn), sampling error can be calculated.
Random
Systematic
Stratified
Cluster
Non probability sampling
Convenience
Purposive
Quota
Snowball
Types of randomisation
Simple- unrestricted randomisation. potential to create unequal group sizes
Blocked- set group allocation and block size. Vary block size to vary sequence. Allows for equal groups.
Stratified- randomisation performed within strata
Cluster- groups randomised
Matched pair
Stepped wedge- intervention randomly introduced to all groups over time. Good if intervention is thought to be beneficial.
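To illustrate the blocked method above, here is a rough sketch assuming two arms labelled A and B and a fixed block size of 4. The function name, labels and seed are all hypothetical, and a real trial would use a concealed, independently generated sequence.

```python
import random


def blocked_randomisation(n_blocks, block_size=4, arms=("A", "B"), seed=None):
    """Allocation sequence in which every block contains each arm equally often."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence


allocations = blocked_randomisation(n_blocks=3, block_size=4, seed=42)
print(allocations)
print({arm: allocations.count(arm) for arm in ("A", "B")})
```

Group sizes are equal after every complete block, unlike simple randomisation. With a fixed block size the last allocation in each block is predictable, which is why the card suggests varying the block size.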