7 - Systematic Reviews Flashcards
Define systematic review. Does it include meta analysis?
- application of scientific (or systematic) strategies to limit bias in the gathering, critical appraisal, and synthesis of relevant studies on a specific topic
- may or may not include meta analysis
when is it appropriate to do a systematic review vs meta analysis?
- it is always appropriate to do a systematic review, but not always appropriate to pool data (as in a meta analysis) - ie it may be misleading
define a meta-analysis
- a statistical analysis of results from independent studies (used to produce a single estimate of the treatment effect)
- aka pooling (averaging results together)
- generally weighted by error or sample size within each study - ie a smaller study has less impact on the average than a larger study
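A minimal sketch of the weighting idea above, assuming inverse-variance weights (weight = 1 / SE^2), which is one common way to weight "by error". The function name and the two studies are hypothetical, made up for illustration.

```python
import math

def pooled_estimate(effects, std_errors):
    """Inverse-variance weighted average of study effects."""
    # Smaller standard error (usually a larger study) -> larger weight
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se

# Hypothetical data: a large precise study (0.10) and a small imprecise one (0.50)
effects = [0.10, 0.50]
std_errors = [0.05, 0.20]
pooled, pooled_se = pooled_estimate(effects, std_errors)
print(round(pooled, 3), round(pooled_se, 3))
```

The pooled value lands much closer to 0.10 than to 0.50 because the precise study carries 400 of the 425 total weight units.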
describe a forest plot. size of CIs wrt box size
- represents meta analysis
- on the left list studies in order of date
- plot square size representing the size of the study
- larger box size = larger studies = smaller CIs
- diamond at the bottom represents pooled estimate of effect (weighted either by error or sample size)
what are some reasons to do systematic reviews?
- single studies often do not find significant effects (too small n size, pool for more)
- we can answer questions about subgroups
- can extend the generalizability of results
what is involved in a systematic review?
- formulate the question(s)
- determine criteria for inclusion/exclusion of studies
- conduct literature search
- select relevant studies
- assess quality of included studies
- extract the data
- assess sources of heterogeneity
- analyze and present results
name some criteria for eligibility of studies in SR
- study design, population, intervention, outcome, follow-up length, methodological quality (leave this part out at the beginning though!)
main points about search strategies
- use more than one source (there are peer-reviewed and unpublished sources - ie Cochrane Controlled Trials Register, ClinicalTrials.gov)
main points about selection of relevant studies (pt 1 and pt 2)
Part 1
- pairs of reviewers independently screen titles and abstracts to identify articles that should be reviewed in full text
- first need to define eligibility criteria, then create/pilot the data form, independently review (as exclude or include/review full text), determine/report agreement, unweighted Kappa
Part 2
- expand and further define eligibility criteria
- same steps as before but with full text screening (include/exclude/uncertain), keep data of excluded studies and provide a reason for it, kappa again
what is kappa/how do you calculate kappa?
- kappa lets the readers know how well we agreed - greater value = more agreed = well defined search = more likely to include important studies
- if not to questions, exclude
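The card asks how kappa is calculated but does not give the formula. A minimal sketch, assuming the usual unweighted Cohen's kappa for two reviewers: kappa = (observed agreement - agreement expected by chance) / (1 - agreement expected by chance). The function name and the screening decisions are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters over the same items."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Proportion of items where both raters made the same call
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    # Note: undefined (division by zero) if expected agreement is exactly 1
    return (observed - expected) / (1 - expected)

# Hypothetical title/abstract screening decisions for 10 articles
reviewer_1 = ["include"] * 4 + ["exclude"] * 6
reviewer_2 = ["include"] * 3 + ["exclude"] * 6 + ["include"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 2))
```

Here the reviewers agree on 8 of 10 articles (0.8), chance agreement is 0.52, so kappa is 0.28 / 0.48, roughly 0.58.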
describe how to determine completeness of the systematic review search you did
- look at cited works in articles you included
- look back and basically determine whether you did a good job or not
- look for evidence of publication bias
what is publication bias?
- omission of studies that should be included
- when positive trials are more likely to be published than negative trials it is a publication bias
- when studies are omitted (bc they were never published or maybe we didn’t do a good job in our search)
what is time-lag bias?
- when positive trials are more likely to be published rapidly
what is a language bias?
when positive trials are more likely to be published in English
what is a multiple (duplication) publication bias?
when a positive trial is more likely to be published more than once
what is a citation bias?
when a positive trial is more likely to be cited by others
what does exclusion of small negative studies do to estimation of effect?
it tends to overestimate the treatment effect
what is the most common graphical method to detect publication bias?
funnel plot - plots treatment effect by trial size/standard error
what is effect size again? what are small medium large values?
- can be calculated a number of different ways
- basically a simple way of quantifying the difference between two groups without confounding this with n-size (better than p-values alone)
- 0.8 = large, 0.5 = medium, 0.2 = small (Cohen's conventions)
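One of the "number of different ways" to calculate effect size is Cohen's d: the difference in group means divided by the pooled standard deviation. A minimal sketch under that assumption; the function name and scores are hypothetical.

```python
import math
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical outcome scores
treatment = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
d = cohens_d(treatment, control)
print(round(d, 2))
```

The means differ by 2 and the pooled variance is 10, so d = 2 / sqrt(10), roughly 0.63. The result is in standard deviation units, which is why it is not confounded with n-size the way a p-value is.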
Look at the funnel plots and describe where small negative studies are, compared to larger studies (also label axes)
- y axis = standard error, x is odds ratio or effect size
- small negative studies (right and low)
- larger studies sit higher on the y axis (they have smaller standard errors; this reflects precision, not effect size)
- treatment effects (expressed as an OR, one type of treatment effect) are larger for small studies
- if skewed to the left, missing the negative small studies
Describe the trim and fill method for funnel plots
- trim = take the top part of the triangle that is complete and calculate the OR
- fill = statistically fill in the triangle and calculate that OR
What is quality assessment? Describe. How is external validity addressed?
- indicates the internal validity of each included study
- evidence shows differences in treatment effect for high-quality vs low quality studies (ie low quality bias in favour of treatment)
- external validity would be how you pick which studies to include/how you look at heterogeneity
Why do people like using quality scales?
- avoids thinking (maybe they don’t know how to identify important individual criteria)
- easier to portray than assessing each individual criterion
- exclude studies of low quality
- weight studies according to their quality rating
- explain heterogeneity
what are concerns about the validity of quality scales?
- scoring implies weighting of criteria (do we give more important criteria more weight? did the study give the most important criteria the most weight?)
- scales often include external validity or reporting criteria (should not be included)
describe the importance of individual criteria for including studies in SR
- it is dependent on the influence on internal validity for a specific trial type (ie concealment is more important for RCTs etc)
Describe the results of the Jüni et al. study
- applied 25 quality scales to studies
- found wide range of effect size
- some cases showed no correlation btw quality score and effect size, others showed counter-intuitive results - check out graph (p 10)
Describe the Jadad scale
- look at p 11
- the scale is not perfect (includes some things it should and some it shouldn’t, missing some things too) - ie reporting issues should not be mentioned here
- want to know in terms of internal validity!
- look at second example too
Describe rating quality - what you should do for a quality assessment (determining included studies)
- consider blinding
- create and pilot forms
- duplicate independent ratings (ie work in pairs)
- discussion and consensus
- determine/report agreement (lets readers know how well everyone understood what was going on)
In terms of determining and reporting agreement, what types of analysis are there?
- ICC if quality rating conclusion is a score
- weighted kappa if QR is an ordered category
- kappa if QR is an unordered category
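A minimal sketch of weighted kappa for ordered quality ratings, assuming linear disagreement weights (being off by two categories counts twice as much as being off by one). Categories are coded 0, 1, 2, ...; the function name and ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Linear-weighted kappa for ordinal ratings coded 0..n_categories-1."""
    n = len(rater_a)
    # Observed average disagreement (distance between the two ratings)
    observed = sum(abs(a - b) for a, b in zip(rater_a, rater_b)) / n
    # Disagreement expected by chance from each rater's marginal proportions
    p_a = [rater_a.count(c) / n for c in range(n_categories)]
    p_b = [rater_b.count(c) / n for c in range(n_categories)]
    expected = sum(
        p_a[i] * p_b[j] * abs(i - j)
        for i in range(n_categories)
        for j in range(n_categories)
    )
    return 1 - observed / expected

# Hypothetical quality ratings: 0 = low, 1 = moderate, 2 = high
rater_1 = [0, 0, 1, 1, 2, 2]
rater_2 = [0, 1, 1, 1, 2, 1]
kw = weighted_kappa(rater_1, rater_2, 3)
print(round(kw, 2))
```

Unlike plain kappa, near misses (low vs moderate) are penalized less than far misses (low vs high), which is why it suits ordered categories.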
describe data extraction for included studies
- consider blinding data extractors to author, institution, journal etc
- create and pilot forms
- duplicate independent review (ie things in partners)
- do you need to write to author for more info?
- discussion of results w consensus (ie how do we deal w disagreements - needs to be transparent for readers)
describe the cochrane collaboration tool for assessing risk of bias
- pg 12
What is a meta-analysis (pooled analysis) and when is it appropriate?
- estimates a single treatment effect (therapy) or sensitivity/specificity (diagnostic), estimates of prognosis etc
- if results are not heterogeneous/heterogeneity can be explained (if not, do NOT pool results)
- note that explanations for heterogeneity must be done a priori! (if the hypothesis we came up with doesn’t explain the heterogeneity we should not pool results!)
- see example on page 13
what does heterogeneous mean?
- that the results of the included studies are variable
what are potential sources of heterogeneity? external or internal validity? examples.
- clinical heterogeneity (external) = difference in clinically important features (patient selection, baseline disease severity, administration of intervention, management of outcomes, adverse events, complications, duration of follow ups etc)
- methodological heterogeneity (internal) = refers to differences in methodology (randomization methods, allocation concealment, proportion lost to follow up etc)
what is directionality?
- saying that something has a larger or smaller effect and not just that there is a difference in treatment
- always include directionality when possible
list 2 ways to evaluate heterogeneity
Cochran's chi-square test (Cochran's Q) and I^2 value
what is Cochran's chi-square test
- a heterogeneity test
- tests whether estimates of effect between studies are similar (homogeneous), low power, p<0.05 implies significant heterogeneity (bad)
- this is one case where we want a larger p!
- affected by the number of studies in your review (few studies = low power, so real heterogeneity may be missed; many studies = may flag heterogeneity when there really isn't any of importance)
what is the I^2 value?
- estimates the percentage of total variation that is due to heterogeneity among studies rather than due to chance
- low (25%), medium (50%), high (75%)
- so smaller value is better (means small part of variation is due to heterogeneity or differences btw studies)
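A minimal sketch of how the chi-square heterogeneity statistic (Cochran's Q) and I^2 relate, assuming inverse-variance weights and the usual formula I^2 = (Q - df) / Q, floored at 0. The function name and the three studies are hypothetical.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (Q, degrees of freedom, I^2 as a percentage)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q: weighted squared distance of each study from the pooled estimate
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation beyond what chance (df) would explain
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2

# Hypothetical data: three equally precise studies with different effects
effects = [0.1, 0.3, 0.5]
std_errors = [0.1, 0.1, 0.1]
q, df, i2 = cochran_q_and_i2(effects, std_errors)
print(round(q, 2), df, round(i2, 1))
```

Here Q is about 8 on 2 degrees of freedom, giving I^2 of about 75%, which falls in the "high" band. Q is compared against a chi-square distribution with df = number of studies - 1 to get the p-value.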
list the steps for evaluating heterogeneity
- create hypothesis that can be tested later if heterogeneity is found (a priori!!) - eg studies that included only patients w severe disease will show smaller treatment effects than studies that only included patients w mild/moderate disease
- conduct the analysis and create forest plot
- test your hypothesis (start by separating studies into those that included only severe patients and mild/mod patients)
- re-run analysis and produce new forest plots - see if heterogeneity went away! - if yes, this explains it
- present the results for each group (conclusions/recommendations should be separate for each group)
describe within study variability - fixed/random effects?
- for conducting meta-analysis
- variability if the same study with the same subjects is repeated
- fixed and random effects
describe between-study variability - fixed/random effects?
- variability if the same study is repeated in a different population
- only random effects
explain random vs fixed effects - which is more conservative?
- how we are going to pool the results for a meta analysis (remember signal over noise)
- for random, the denominator includes more sources of noise, therefore tougher to see the signal (this is more conservative so we want this!)
- for fixed effects, does not include between-study variability - can see signal more clearly (it assumes that all studies sample from the same population) so less conservative
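A minimal sketch of why random effects is more conservative: it adds a between-study variance term (tau^2) to each study's own variance, so the pooled standard error gets wider. This assumes the DerSimonian-Laird estimate of tau^2, which is one common choice and not necessarily the method used in class; function names and data are hypothetical.

```python
import math

def pool(effects, std_errors, tau2=0.0):
    """Inverse-variance pooling; tau2 = 0 gives the fixed-effect result."""
    weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

def dersimonian_laird_tau2(effects, std_errors):
    """Between-study variance estimated from Cochran's Q (assumed method)."""
    w = [1.0 / se ** 2 for se in std_errors]
    fixed, _ = pool(effects, std_errors)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - df) / c)

# Hypothetical data: three equally precise but inconsistent studies
effects = [0.1, 0.3, 0.5]
std_errors = [0.1, 0.1, 0.1]
tau2 = dersimonian_laird_tau2(effects, std_errors)
fixed_est, fixed_se = pool(effects, std_errors)
random_est, random_se = pool(effects, std_errors, tau2)
print(round(fixed_se, 3), round(random_se, 3))
```

Both models give the same pooled estimate here, but the random-effects standard error is about twice as large, so the confidence interval is wider and the signal is harder to see.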
describe choice of summary measures for meta-analysis (weighted mean diff or standardized mean diff)
- WMD (OR, RR, mean), means that all the studies pooled into the systematic review used the exact same measure (same instrument/scale) for the outcome (rare to have this)
- SMD (standard deviation units) means you have to place all the scores on a standardized or z-scale and then combine them
how can you use the results of a meta analysis? as a clinician?
- clinical decision-making guidelines
- policy making
- ethical design of future trials
- can improve our generalizability, gives us more applicability, more precision around estimate of effect, can say we are now more certain about these results
- as clinician remember: variation is likely quantitative, not qualitative, can consider results in most relevant subgroups, still requires incorporation of patient preferences (this bullet not talked about much in class!)