Critical appraisal course Flashcards
What is EBM
The conscientious, explicit and judicious use of current best evidence in making decisions about the care of a patient
5 steps of EBM
- Clinical question
- Evidence
- Critical appraisal
- Application
- Implementation and monitoring
Stages of critical appraisal
- The clinical question
- Methodology: study design, recruitment, variables and outcomes
- Results: data analysed and differences between groups assessed for significance
- Applicability
Internal and external validity
Internal: extent to which the results from the study reflect the true results
External: extent to which study results can be generalised
Efficacy and effectiveness
Efficacy is the impact of interventions under optimal (research) setting
Effectiveness is whether the interventions have the intended or expected effect under ordinary clinical settings
The efficacy of an intervention is almost always better than the effectiveness
True
The acceptance of EBM means all clinicians practice the same
False
Patient values have no role to play in EBM
False
The effectiveness is almost always better than the efficacy
False
A research project taking place in the outpatient clinic is almost always going to give effectiveness data rather than efficacy data
True
An RCT can answer any type of clinical question
False
The clinical question determines which study designs are suitable
True
A clinical question can usually only be answered by one type of study design
False
The critical appraisal should start by examining the study design
False
Broad categories of studies and what they can achieve
- Observational descriptive
  - survey, qualitative, case report/series
  - generate hypotheses
- Observational analytical
  - case-control (outcome -> exposure)
  - cohort (exposure -> outcome)
  - test hypotheses
- Experimental
  - RCT, crossover, N of 1
  - test interventions
- Others
  - Ecological: information about the population
  - Pragmatic: real-life environment, e.g. all people in a clinical location (outpatients) randomised to receive a particular treatment. More reflective of everyday practice
  - Economic
  - Systematic review/MA
Case series is observational analytic
False
Case control is observational descriptive
False
Qualitative study
Opinions are elicited from a group of people with emphasis on subjective meaning and experience. Complex issues can be identified
Data gathering and data analysis develop iteratively -> results inform further sampling
Inductive-> knowledge generated through data sampling and gathering as opposed to other scientific methods where research is deductive
Other names for case control
Retrospective or case comparison
Case control- type, advantages and disadvantages
People with the outcome variable are compared to those without it, to determine risk factors they have been exposed to in the past
Adv:
cheap and easy
good for rare outcomes
few subjects required
good for diseases with long duration between exposure and outcome
Dis:
not for rare exposures
recall problems
control groups can be difficult to select
Cohort study- design, retrospective type, other names, adv, disadv
A group of people with exposure are followed up to see the development of an outcome
Also called prospective or follow-up
Retrospective type is using cohort data that already exists (say from 20 years ago, with exposure)
Adv:
good for rare exposures
multiple outcomes
temporal relationship
estimation of outcome incidence rates
Dis:
may take ++time from exposure to outcome
expensive
attrition rates
unsuitable for rare outcomes
Recall bias is a bigger issue for CC or cohort
Case control
Which is better study design for rare exposures
Cohort
Which study design is better when there is a long time from exposure to outcome
CC
Disadvantages of RCT
Expensive
Time consuming
When might you use a crossover trial
When unable to get enough subjects for RCT
Subjects are given one intervention, then switched to the other halfway through
Disadvantages of cross-over trials
The order of interventions might be important
Carryover effects (drugs with long half-lives) or prolonged discontinuation/withdrawal
May be difficult to use historical controls if conditions were different
N of 1 trials
Experimental version of case report
Single person
Given randomised treatments
Report on response
Historical control bias is an issue in which type of study design
Crossover
Which types of biases are an issue in open label
Selection and observation bias
Two sources of methodological error
Bias
Confounding factors
Definition of bias
Any process at any stage of inference, which tends to produce results or conclusions that differ (systematically) from the truth
Not by chance
Researchers must try to reduce bias
categories of bias
- Selection bias= recruitment of sample
- Performance bias= running of the trial
- Observation bias= data collection
- Attrition bias
(performance + observation = measurement bias)
Bias versus confounding
Confounders are real life relationships between variables that already exist, and so are not introduced by the researcher
Selection bias
Error in recruitment of sample population
Introduced by:
Researchers = sampling bias
- admission (Berkson) bias
- diagnostic purity bias
- Neyman bias
- membership bias
- historical control bias
Subjects = response bias
- volunteers differ in some way from the population, e.g. more motivated to improve their health, and adhere ++ to the trial
Types of sampling bias
- Berkson bias
  - sample taken from a hospital setting, so rates/severity of the condition differ from the target population
- Diagnostic purity bias
  - co-morbidities are excluded from the sample population, which therefore does not reflect the complexity of the target population
- Neyman bias
  - prevalence of the condition does not reflect its incidence, due to a time gap between exposure and selection, such that some with the exposure are not selected (have died)
  - example: giving treatment following MI. Some may die shortly after the MI and therefore not be selected, so the selected group already has a better prognosis
- Membership bias
  - members of the group selected may not be representative of the target population
- Historical control bias
  - subjects and controls are chosen across time, so definitions, exposures, diseases and treatments may mean they cannot be compared to one another
Protecting against performance bias
Performance bias = systematic differences in the care provided, apart from the intervention being evaluated
Protected against by standardisation of the care protocol, randomisation and blinding
Types of observation bias
Failure to measure or classify the exposure or disease correctly. Can be due to researcher or participant.
Researcher
- Interviewer (ascertainment) bias
  - when the researcher is not blinded, they may approach the subject differently depending on whether they know the subject is taking the treatment or the placebo
- Diagnostic/exposure suspicion bias
- Implicit review bias
- Outcome measurement bias
- Halo effect: knowledge of patient characteristics influences the impression of the patient with respect to other aspects
Subject
- Recall bias
- Response bias- answers questions in a way they think the researcher would want
- Hawthorne effect- behaving in a way, usually positively, as aware being studied
- Social desirability bias
- Bias to middle and extremes
- Treatment unmasking
Attrition bias
The numbers of individuals dropping out differs significantly between the groups
Those left may not reflect the sample or target population
Intention to treat analysis will need to be conducted
Bias occurs by chance
F
lack of blinding could lead to ascertainment bias
T
What is a confounder, positive and negative
When there is a relationship between two variables that is attributable to, or confounded by, the presence of a third
May make it seem like two variables are associated when they are not (positive confounder, e.g. coffee and lung cancer *smoking: overestimates the association)
Or fail to show an association when there is one (negative confounder, e.g. poor diet and CVD *exercise: underestimates the association)
To be a confounder
- Must be associated with the exposure, but not be a consequence of it
- Must be associated with the outcome, independently of the exposure
Controlling for confounders
- Restriction
  - inclusion and exclusion criteria
- Matching
- Randomisation
Accounting for confounders using statistical methods
- Stratified analyses
  - can only control for a few
- Multivariable analysis
  - accounts for many; need at least 10 subjects per variable in a logistic regression
- If matching was done, McNemar test or conditional logistic regression
Simple randomisation
Subjects are randomised to groups as they enter the trial
Selected independently of each other
Block randomisation
Differs from simple randomisation in that subjects are not allocated independently
Subjects are assigned to “blocks”; as each block fills, its members are distributed evenly between the intervention and control groups
Stratified randomisation
Subgroups are formed in relation to a confounding factor, then in each stratum, block randomisation occurs, so the confounders are equally distributed
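A minimal Python sketch of simple versus block randomisation (the group labels and block size are illustrative, not from the course):

```python
import random

def simple_randomisation(n_subjects):
    # Each subject is allocated independently, so group sizes can differ by chance.
    return [random.choice(["treatment", "control"]) for _ in range(n_subjects)]

def block_randomisation(n_subjects, block_size=4):
    # Each block contains equal numbers of treatment and control slots,
    # so group sizes stay balanced as recruitment proceeds.
    block = ["treatment", "control"] * (block_size // 2)
    allocations = []
    while len(allocations) < n_subjects:
        random.shuffle(block)
        allocations.extend(block)
    return allocations[:n_subjects]

print(simple_randomisation(10))
print(block_randomisation(10))
```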
Concealed allocation and methods
When the allocation sequence remains secret at the point of recruitment, so knowledge of the next assignment cannot influence who is enrolled
Part of the randomisation process
Methods:
- Centrally controlled
- Pharmacy concealed
- Sequentially numbered, opaque, sealed envelopes
- Numbered/coded bottles or containers
Allocating an equal number of subjects to each group is possible with simple randomisation
Yes- possible by chance
Pygmalion effect (Rosenthal effect)
Subjects perform better than others because they are expected to
Power of positive expectations
George Bernard Shaw- Pygmalion
Placebo effect
In healthcare, the pygmalion effect is often called the placebo effect
Latin meaning of placebo
“I shall please”
2 methods to make expectations equal
- Allocation concealment
  - at the time of selection
- Blinding
  - once the subjects start treatment
Problem with single blinding
If the subject is blind, the researcher is still in full possession of the facts
The subject may still be influenced by the behaviour of the researcher who may have expectation about the outcome
Blind assessment
Assessment of the outcome measures during and at the end of the study is made without any knowledge of what the treatment groups are
Placebo factors
Multiple pills
Large pills
Capsules
Reliability definition and subtypes
Consistency of results on repeat measurements by one or more raters over time
- Inter-rater
  - level of agreement by 2+ assessors at the same time
- Intra-rater
  - one rater, same material, different time
- Test-retest
  - level of agreement from initial test results to repeat measures at a later date
- Alternate form reliability
  - reliability of similar forms of the test
- Split-half reliability
  - reliability of a test divided in two, with each half being used to assess the same material under similar circumstances
Quantifying reliability
Compare the proportion of scores which agree, with the proportion that would be expected to agree by chance
Reliability co-efficient
Kappa (Cohen’s) statistic k
Kappa (Cohen's) is used for measures of categorical (qualitative) variables
Measures inter-rater reliability
Also known as chance-corrected proportional agreement statistic
Measures the proportion of agreement over and above that expected by chance
If agreement is no more than expected by chance k=0
To be significant, k> 0.7 is normally necessary
Strength of agreement or association
0 = chance agreement only
<0.2 = poor agreement beyond chance
0.21-0.4 = fair agreement beyond chance
0.41-0.6 = moderate agreement
0.61-0.8 = good agreement
0.81-1.0 = very good agreement
1 = perfect agreement
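A minimal Python sketch of the kappa calculation from a hypothetical two-rater agreement table (counts invented for illustration):

```python
import numpy as np

# Hypothetical agreement table: rows = rater 1, columns = rater 2.
table = np.array([[40, 10],
                  [5, 45]])

n = table.sum()
p_observed = np.trace(table) / n                         # proportion of observed agreement
p_expected = (table.sum(0) * table.sum(1)).sum() / n**2  # agreement expected by chance
kappa = (p_observed - p_expected) / (1 - p_expected)
print(f"kappa = {kappa:.2f}")  # 0.70: good agreement beyond chance
```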
Cronbach’s alpha is
Used in complicated tests with several parts measuring several variables
When you have multiple Likert questions
Internal consistency/reliability
No formal test statistic
>0.5 mod
>0.8 excellent
Intraclass correlation coefficient
Used for tests measuring quantitative variables, such as BP
Validity and subtypes
Extent to which a test measures what it is supposed to measure
- Criterion: predictive, concurrent, convergent, discriminant
- Face
- Content
- Construct
- Incremental
Criterion validity
demonstrates the accuracy of a measure or procedure by comparing it with another measure or procedure that has been demonstrated to be valid
- Predictive: extent to which the test can predict what it theoretically should be able to predict
- Concurrent validity: extent to which the test can distinguish between two groups it theoretically should be able to distinguish
- Convergent validity: the extent to which the test is similar to other tests that it theoretically should be similar to
- Discriminant validity: the extent to which the test is not similar to other tests that it theoretically should not be similar to.
Other types of validity
- Face validity: superficially looks to measure what it should
- Content: measures variables that are related to that which should be measured
- Construct validity: extent to which a test measures a theoretical concept by a specific measuring device or procedure
- Incremental validity: the extent to which the test provides a significant improvement in addition to the use of another approach
Intention to treat analysis
All the subjects are included in the analyses as part of the groups to which they are randomised, regardless of whether they completed the study or not.
Last observation carried forward, disadvantages
Way of accounting for subjects that drop out before the end
Disadvantages:
1. Underestimation of treatment effects
- when the intervention is expected to lead to an improved outcome
2. Overestimation of treatment effects
- when the intervention is expected to slow down a progressively worsening condition
per protocol analysis
only those subjects remaining in the study are used in the analyses
introduces bias through exclusion of participants who dropped out
when is incidence preferred over prevalence and vice versa
When disease is frequent and short duration-> incidence
When long duration, slow, rare-> prevalence more useful to indicate impact of disease on the population
Mortality rate
Type of incidence rate that expresses the risk of death in a population over a period of time
Standardised mortality rate
Adjusted for confounding factors
Standardised mortality ratio
Ratio of observed mortality rate compared to expected mortality rate
Types of data summary
1. Categorical
- Nominal
- Ordinal
2. Quantitative
- Discrete
- Continuous
Categorical data
No numerical value
Not measured on scale
No in between values
- Nominal = unordered
  - binary/dichotomous = only 2 mutually exclusive categories (dead/alive, male/female)
  - multi-category = mutually exclusive categories bearing no relationship to each other (married, engaged, single, divorced)
- Ordinal = numbered
  - order inherent, but not quantified
  - can assume non-parametric
Quantitative data (numerical)
- Discrete
  - counts (number of children, asthma attacks)
- Continuous
  - can take any value within the range of all possible values (age, body weight, height, temperature)
Another term for normal distribution
Gaussian distribution
Statistical tests to describe samples with different data samples: categorical, quantitative
Categorical= mode, frequency
Quantitative=
1. Non-normal distributed-> median, range
2. Normally distributed-> mean SD
Advantages and disadvantages of median
Adv:
robust to outliers
Dis:
does not use all the data
not easy to manipulate mathematically
Explain interquartile range
Median = 2nd quartile (50%)
First quartile is at 25%
Third quartile is at 75%
So interquartile range = 3rd quartile - 1st quartile
the midpoint of a perfect ND also represents
Mean value of the concerned parameter in the population
SD from mean, values contained, adv and disadv
1 SD either side of the mean contains 68% of values
1.96 SD (~2 SD) contains 95% of values
2.58 SD (~3 SD) contains 99% of values
SD = calculated as the square root of the variance
Variance = the sum of the squared differences between each value and the mean, divided by the total number of observations minus 1 (the degrees of freedom)
Adv:
uses all the data, when distribution normal, mean and SD summarise entire distribution
Dis:
vulnerable to outliers, not useful for skewed data
Standard error
SE of a mean is an estimate of the SD that would be obtained from the means of a large number of samples from that population
If we measure heights from a sample of the population, we can calculate a mean. If we take another sample from the same population and measure heights, we can calculate another mean, which is unlikely to be the same as the first. We can carry on recruiting samples and calculating means, giving a series of means. If we plot the frequency of these sample means, we end up with a ND whose mean is the population mean. The spread of the sample means around the population mean is known as the SE.
Confidence intervals
Tells us the range within which a true magnitude of effect lies with a certain degree of assurance, usually 95%
CI for population mean=
mean +/- 1.96 x SE, where SE = SD/sqrt(n)
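A worked Python sketch of the 95% CI formula above, using an invented sample of heights:

```python
import numpy as np

heights = np.array([168, 172, 165, 180, 175, 169, 174, 171, 177, 166])

mean = heights.mean()
sd = heights.std(ddof=1)          # sample SD: variance divided by n - 1
se = sd / np.sqrt(len(heights))   # standard error of the mean
lower, upper = mean - 1.96 * se, mean + 1.96 * se
print(f"mean = {mean:.1f}, 95% CI = {lower:.1f} to {upper:.1f}")
```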
Positive and negatively skewed data
Positively skewed has longer tail to the right
Negatively skewed has longer tail to the left
Null hypothesis
There is no difference, no association between two or more sets of data
Any observed association can occur by chance. The probability of results occurring by chance can be calculated. If unlikely to be due to chance, the null is rejected
Alternate hypothesis
Experimental hypothesis
Probability
Likelihood of an event occurring as a proportion of the total number of possibilities
Expressed as P values
P values
Probability of getting the observed results or more extreme, given a true null hypothesis
Significant= result deemed unlikely to have occurred by chance, thus rejecting the null
<0.05 probability to obtain result by chance is <1 in 20
<0.1 by chance <1 in 10
<0.5 chance 1 in 2
Statistical significance is very sensitive to sample size and study power
Comparing statistical significance and clinical significance
Statistical significance tells us whether the results are likely due to chance
Clinical significance tells us whether the results are worthwhile or even noticeable
One tailed and two tailed significance testing
1-tailed examines in only one direction, ignoring the other
In 2 tails examines both directions
Type 1 error
Null hypothesis rejected when in fact true= false positive
Usually attributable to bias or confounding
Avoided by using statistics to calculate a P value
P value <0.05 signifies null can be rejected. >0.05 null cannot be rejected, therefore minimising type 1 error
Significance level a= pre chosen probability
P value = probability of making type 1 error
Not affected by sample size
More likely with increasing number of tests/end points
Type 2
Null hypothesis is accepted when it is in fact false= false negative
Usually because sample size not big enough
Probability of type 2 error= b
B depends on sample size and alpha
B gets smaller as sample size gets bigger
B gets smaller as the number of tests/end points increases
Power
Probability that a type 2 error will NOT be made in that study
A power of 0.8 generally accepted= probability of finding a real difference when one truly exists
Probability of rejecting the null hypothesis when a true difference exists = 1-B
Unpaired and paired data
Unpaired comes from two different groups/subjects
Paired data comes from the same subjects at different times
To identify type of statistical test to use
- Consider if descriptive, comparing two groups or comparing >2 groups
- Then consider if categorical, non-normal or normal
- Paired or unpaired
Therefore:
1. Descriptive
- categorical = mode, frequency
- non-normal = median, interquartile range
- normal = mean, SD
2. Comparing two groups
- categorical = chi square for large samples, Fisher's exact for small
- non-normal = Mann-Whitney U (unpaired), Wilcoxon signed-rank (paired)
- normal = Student's t-test, either paired or unpaired
3. Comparing >2 groups
- categorical = chi square (unpaired), McNemar's (paired)
- non-normal = Kruskal-Wallis ANOVA (unpaired), Friedman (paired)
- normal = ANOVA (paired or unpaired)
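These tests are all available in scipy.stats; a sketch with invented data showing which function matches each cell of the guide above:

```python
from scipy import stats

group_a = [5.1, 4.8, 6.0, 5.5, 4.9, 5.7]
group_b = [6.2, 5.9, 6.8, 6.1, 6.5, 5.8]
group_c = [5.5, 5.2, 6.1, 5.9, 6.3, 5.6]

# Two unpaired groups
print(stats.ttest_ind(group_a, group_b))     # normal: unpaired t-test
print(stats.mannwhitneyu(group_a, group_b))  # non-normal: Mann-Whitney U

# Two paired groups (same subjects measured twice)
print(stats.ttest_rel(group_a, group_b))     # normal: paired t-test
print(stats.wilcoxon(group_a, group_b))      # non-normal: Wilcoxon signed-rank

# More than two groups
print(stats.f_oneway(group_a, group_b, group_c))           # normal: one-way ANOVA
print(stats.kruskal(group_a, group_b, group_c))            # non-normal: Kruskal-Wallis
print(stats.friedmanchisquare(group_a, group_b, group_c))  # non-normal, paired: Friedman

# Categorical data in a contingency table
table = [[30, 20], [15, 35]]
print(stats.chi2_contingency(table))  # large samples
print(stats.fisher_exact(table))      # small samples
```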
Contingency table
Categorical
Row= treatment groups, exposure
Columns= outcomes/disease status
What would be the appropriate test for comparing 2 groups of unpaired data that is not normally distributed
Mann-Whitney U
What would be the appropriate test for comparing 2 groups of paired data that is not normally distributed
Wilcoxon’s signed-rank test
What would be the appropriate test for comparing > 2 groups of paired categorical
McNemar’s
What would be the appropriate test for comparing more than 2 groups of unpaired data that is not normally distributed
Kruskal-Wallis ANOVA
Measuring BP in a subject before and after is an example of paired
Yes
What would be the appropriate test for comparing 2 groups of unpaired categorical data
Chi square
Risk definition
Risk has the same meaning as probability
Probability is the number of times we believe it is likely to occur divided by the total number of events possible
For the exposed (experimental) group: EER = a/(a+b)
For the control group: CER = c/(c+d)
Absolute risk reduction
CER-EER
Absolute risk difference is the absolute change in risk that is attributable to experimental intervention
Can range from -1 to +1
RR or Risk ratio
Ratio of risk in experimental to risk in control
RR= EER/CER
Assuming outcome is undesirable
If RR= 1, experimental as likely as control
>1 more likely in experimental
<1 less likely in experimental
Relative risk reduction
Proportional reduction in rate of outcomes between experimental and control
RRR = (CER - EER)/CER
NNT
Number needed to be treated compared with control, for one subject to experience beneficial effect
NNT = 1/ARR = 1/(CER - EER)
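A worked Python sketch of these risk measures from a hypothetical trial 2x2 table:

```python
# Hypothetical 2x2 table:
#               outcome+  outcome-
# experimental    a=10      b=90
# control         c=20      d=80
a, b, c, d = 10, 90, 20, 80

eer = a / (a + b)        # experimental event rate = 0.10
cer = c / (c + d)        # control event rate = 0.20
arr = cer - eer          # absolute risk reduction = 0.10
rr = eer / cer           # relative risk = 0.50
rrr = (cer - eer) / cer  # relative risk reduction = 0.50
nnt = 1 / arr            # number needed to treat = 10
print(eer, cer, arr, rr, rrr, nnt)
```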
Why might absolute be valuable when given relative only
Relative can be misleading
Odds
The odds of an event is the ratio of number of times we believe it is likely to occur divided by the number of times it is likely NOT to occur
Someone expecting a baby-
Probability (risk) of it being a girl= 1/2 = 50%
Odds of it being a girl= 1/1, it is as likely to be a girl, as it is not to be a girl
Odds ratio
Odds of the event occuring in one group divided by odds of event in another group
OR = (a/b)/(c/d) = ad/bc
In case control:
exposure is often the presence or absence of a risk factor, and the outcome is disease presence or absence
OR 1= same outcome rates
OR >1 estimated likelihood of developing disease is greater in exposed than not exposed
OR<1 likelihood of disease is less in exposed than unexposed
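A worked Python sketch of the odds ratio from a hypothetical case-control 2x2 table:

```python
# Hypothetical 2x2 table:
#             cases  controls
# exposed      a=30    b=70
# unexposed    c=10    d=90
a, b, c, d = 30, 70, 10, 90

odds_exposed = a / b            # odds of being a case in the exposed
odds_unexposed = c / d          # odds of being a case in the unexposed
odds_ratio = (a * d) / (b * c)  # = odds_exposed / odds_unexposed
print(odds_exposed, odds_unexposed, odds_ratio)  # 0.43, 0.11, 3.86
```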
When do risk ratios and odds ratios differ
In general the OR will always be further from the point of no effect (OR = 1, RR = 1) than the RR
If the event rate increases in the treatment group, OR and RR will both be >1, with OR > RR
If the event rate decreases in the treatment group, both OR and RR will be <1, with OR < RR
Odds of cases in exposed
a/b
Odds of cases in non-exposed
c/d
What does correlation tell us
How strong the association between variables is
Describe scatter graph
Compare data on two variables
Positive correlation= on graph points will slope from bottom left to upper right
Negative correlation= on graph from upper left to lower right
Can be quantified by r= correlation coefficient
Correlation co-efficient
If r is +ve, the variables are directly correlated: as var 1 increases, var 2 increases
If r is -ve, they are inversely correlated: as var 1 increases, var 2 decreases
The closer r is to -1 or +1, the stronger the correlation
r does not reflect the gradient of the line, but how closely the points fall along a line
correlation coefficients used depends on the type of data used: categorical, non-normal or normal
Types of correlation co-efficients
Pearson’s= quantifies relationship between 2 continuous variables, normally distributed
Spearman’s rank = r calculated using ranks; used for non-normal data: two ordinal categorical variables, or one continuous normally distributed variable with one categorical or non-normally distributed variable
Kendall’s correlation (Tau)= used for two categorical or non-normally distributed
Do not establish causality
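All three coefficients are available in scipy.stats; a sketch with invented data:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

print(stats.pearsonr(x, y))    # continuous, normally distributed data
print(stats.spearmanr(x, y))   # rank-based: non-normal or ordinal data
print(stats.kendalltau(x, y))  # alternative rank-based coefficient
```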
Regression
Used to find out how one set of data relates to another
Regression line gives relationship between variables, on a scatter graph
Simple linear regression
Straight line that explains relationship between x and y data sets, so for a given value of x, a y value can be predicted
y = outcome variable (dependent)
x = independent variable
a = intercept of the regression line on the y axis
b = regression coefficient, the slope of the line, gives the strength of association
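A minimal sketch of fitting the line with scipy.stats.linregress (invented data):

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

result = stats.linregress(x, y)
print(f"y = {result.intercept:.2f} + {result.slope:.2f}x")  # a + bx
print(f"r = {result.rvalue:.3f}, p = {result.pvalue:.3g}")

# Predict y for a new x value
print(result.intercept + result.slope * 10)
```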
Multiple linear regression
Regression model in which the outcome variable is predicted from two or more independent variables. The independent variables may be continuous or categorical
If the researcher knows the outcome is likely to be affected by one or more confounders that were not eliminated at the sampling stage, multiple linear regression may be used
Logistic regression
When the outcome variable Y is binary
Cox regression (proportional hazards)
Uses the proportional hazards model to assess survival or other time-related events
Factor analysis
This is used to analyse the interrelationships between a large number of variables, and can be used to explain these variables in terms of underlying factors
Cluster analysis
Multivariate analysis technique that tries to organise information about variables so that relatively homogenous groups, clusters can be formed
ANOVA multivariate extensions
ANCOVA = analysis of covariance, similar to multiple regression
MANOVA = multivariate analysis of variance: multiple dependent variables, multiple hypotheses tested
MANCOVA = multivariate analysis of covariance: multiple dependent variables plus covariates
Critical appraisal in aetiological studies: case-control, cohort
1. Methodology
- clearly defined group?
- except for the exposure/outcome studied, were the groups similar?
  - selection bias, matching, restriction criteria, randomisation
- did the exposure predict the outcome?
  - recall bias is an issue in case-control studies
- was the follow-up complete and of sufficient duration?
  - attrition bias
  - power calculations
  - if too many drop-outs, a Type 2 error may occur
- were the exposures/outcomes measured in the same way in both groups?
2. Results
- what is the RR or OR?
- what is the confidence limit of the estimate?
- NNT/NNTH
- dose-response gradient?
- does the association make biological sense?
3. Applicability
- are my patients similar to the target population?
- are the risk factors similar?
- what are the patient's risks of adverse outcomes?
- should exposure to the risks be stopped or minimised?
Critical appraisal in diagnostic tests
1. Methodology
- was the test applied to an appropriate spectrum of patients?
- were the diagnostic test results compared to a gold standard?
- was the comparison with the gold standard test blind and independent?
2. Results
- is the new test valid?
- is it reliable?
- what was the outcome when patients underwent both the new and the gold standard test?
3. Applicability: can I use this test in caring for my patients?
- is the test acceptable, available, affordable, accurate and precise in this setting?
- will the consequences of the test help your patient?
Contingency table for diagnostic studies
Rows = test result (+/-)
Columns = outcome by gold standard (disease present/absent)
a = true positive
b = false positive
c = false negative
d = true negative
Sensitivity, specificity, PPV, NPV, LR+, LR-, pre-test and post-test probabilities and odds
Sensitivity
Proportion of subjects with disorder who have positive result
a/(a+c) (positive by gold standard)
True positive
Sensitive test when Negative rules Out disorder
SnOut
Sensitive test for screening
Specificity
Proportion of subjects without the disorder who have a negative result = true negative
d/(b+d) (negative by gold standard)
Specific test when Positive rules In disorder
Specific test for diagnosis
Generally what is pre-test probability
Prevalence
Only put a person through a diagnostic test if the post-test probability of a positive result will be greater than the pre-test probability
PPV
Proportion of subjects who have a positive result with the disease
Same as the post-test probability of a positive result
Positive tests = a+b
PPV = a/(a+b)
Want PPV to be substantially higher than pre-test probability
NPV
Proportion of subjects with a negative result who do not have the disorder
NPV = d/(c+d)
LR +
for a positive result
How much more likely is a positive test to be found in a person with , as opposed to without, the condition
sensitivity/(1 - specificity) = true positive rate/false positive rate
LR -
for a negative result
How much more likely is negative test to be found in a person with, as opposed to without, the condition
(1 - sensitivity)/specificity
When <1, a negative test is more likely to come from someone without the disease
Pre-test probability
(a+c)/(a+b+c+d)
Probability that a subject will have the disorder before the test
Pre-test odds
Odds that a subject will have the disorder before the test
Pre-test probability/(1 - pre-test probability)
Post-test odds
Odds that subject has disorder after test
Pre-test odds x LR for +ve
Post-test probability for + results
Probability subject will have disorder after test
Post-test odds/(1 + post-test odds)
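A worked Python sketch chaining the diagnostic measures above from one hypothetical 2x2 table (note the post-test probability of a positive result equals the PPV):

```python
# Hypothetical 2x2 table (rows = test result, columns = gold standard):
#           disease+  disease-
# test +      a=90      b=30
# test -      c=10      d=170
a, b, c, d = 90, 30, 10, 170

sens = a / (a + c)          # sensitivity = 0.90
spec = d / (b + d)          # specificity = 0.85
ppv = a / (a + b)           # positive predictive value = 0.75
npv = d / (c + d)           # negative predictive value = 0.94
lr_pos = sens / (1 - spec)  # LR+ = 6.0
lr_neg = (1 - sens) / spec  # LR- = 0.12

pretest_prob = (a + c) / (a + b + c + d)          # prevalence in this sample
pretest_odds = pretest_prob / (1 - pretest_prob)
posttest_odds = pretest_odds * lr_pos
posttest_prob = posttest_odds / (1 + posttest_odds)
print(posttest_prob, ppv)  # both 0.75
```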
Do sensitivity and specificity depend on prevalence
No
Do PPV and NPV depend on prevalence?
Yes, will change as disorder becomes rarer in the population
PPV will decrease
NPV will increase
Post-test prob will also change
Post test probability of negative test
not the same as NPV (probability of disorder being absent in negative test)
PTP -ve= probability of disorder being present in those with negative result
NPV + PTP -ve = 100%
therefore PTP -ve = 1-NPV
Serial testing and relationship between sensitivity and specificity
leads to increase in specificity and decrease in sensitivity
useful when treatment for disorder is hazardous and inappropriate treatment costs need to be reduced
Receiver operating characteristic curve
The closer the line to top left hand corner, the better the performance of the test will be= true positive high, false positive low
Line of unity- a test that is no better than chance at discriminating individuals with or without disease lies on line of unity
Plots true positive (sens) versus false positive (1-specificity)
The larger the area under the curve, the better the test is
Area of 1= perfect, 0.5 = worthless
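A minimal sketch, assuming scikit-learn is available, of computing an ROC curve and its AUC from invented test scores:

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical: true disease status and a continuous test score per subject.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
score = [0.1, 0.3, 0.2, 0.6, 0.4, 0.5, 0.8, 0.7, 0.9, 0.95]

fpr, tpr, thresholds = roc_curve(y_true, score)     # 1-specificity vs sensitivity
print(f"AUC = {roc_auc_score(y_true, score):.2f}")  # 1 = perfect, 0.5 = worthless
```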
Critical appraisal for treatment studies
1. Methodology
clearly focused clinical question and primary hypothesis?
randomisation process clearly described?
concealed allocation?
groups similar at the start of the study?
groups treated equally apart from the experimental intervention?
blinding used effectively?
trial of sufficient duration?
follow up complete?
intention to treat study?
2. Results
- CER, EER
- ARR, RRR
- RR, OR
- NNT
- precision of the estimate of treatment effect: confidence limits
3. Applicability
pts similar to target population?
were all the relevant outcome factors considered?
will the intervention help your patients?
benefits worth the risks and costs?
patient’s values and preferences been considered?
what alternatives are available?
Prognostic studies
looks at prognostic factors and the likelihood that different outcome events may occur
Most are:
- Cohort
  - most prognostic studies are cohorts
  - one or more groups followed up to see who develops the outcome
  - groups may be classified according to the presence or absence of prognostic factors
- Case-control
  - a group with the outcome is compared with a group without the outcome, for the presence of prognostic factors
Prognostic factors vs risks
Prognostic factors are a characteristic of the patient
RF increase the probability of getting a disease
PF predict the course and outcome of a disease once it has developed
Critical appraisal for prognostic studies
1. Methodology
- was the sample clearly defined?
- was the sample population recruited at a common point in the course of the disease? selection bias?
- was there adjustment for important prognostic factors?
- was the follow-up duration sufficiently long and complete?
- was there blind assessment of objective outcome criteria?
2. Results
- AR/odds
- RR/OR
- survival analysis, survival curves
- precision of prognostic estimates: confidence limits
Survival analysis, disadv
Time between entry into a study and a subsequent occurrence of an event.
Technique used in longitudinal cohort studies, in which one interested in the time interval until an outcome occurs
Disadv:
survival times are unlikely to be normally distributed
unequal follow-up periods
people may leave the study early and be lost to follow-up
Kaplan-Meier survival analysis
Looks at event rates over a study period, rather than a specific time point
data presented in life tables and survival curves
The survival curve will not change at the time of censoring, but only when the next event occurs
median survival time
time at which 50% of the population remain event-free (have survived)
survival time
time from entering the study to developing the endpoint: time to relapse, time to death
survival probability
probability that an individual will not have developed an end point event over a given time duration
Log rank test
compares survival curves (e.g. median survival times) between groups for statistical significance
endpoint probability
1- survival probability
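A minimal sketch, assuming the lifelines package, of a Kaplan-Meier fit and log-rank comparison on invented survival data:

```python
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

# Hypothetical survival times (months); event: 1 = endpoint reached, 0 = censored.
time_a, event_a = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
time_b, event_b = [1, 3, 4, 8, 12, 14], [1, 1, 0, 1, 1, 1]

kmf = KaplanMeierFitter()
kmf.fit(time_a, event_observed=event_a)
print(kmf.median_survival_time_)  # time at which survival probability falls to 0.5

# Log-rank test comparing the two survival curves
result = logrank_test(time_a, time_b, event_observed_A=event_a, event_observed_B=event_b)
print(result.p_value)
```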
cox regression (cox proportional hazards)
method for investigating the effect of several variables upon the time a specified event takes to happen
assumes the effects of the predictor variables upon survival are constant over time and additive on one scale
positive coefficient indicates a worse prognosis
negative coefficient represents a better prognosis
Hazard
instantaneous probability of an end point event in a study
degree of increased or decreased risk of a clinical outcome due to a factor, over a period of time, with various lengths of follow-up
Hazard ratio
comparison of hazards between two groups
<1 = factor decreases the risk of death
>1 = factor increases the risk of death
(statistically significant only if the confidence interval does not cross 1)
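A minimal sketch, assuming the lifelines package, of a Cox proportional hazards fit on invented data (variable names are illustrative):

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical data: survival time, event indicator and two predictors.
df = pd.DataFrame({
    "time":    [5, 8, 12, 3, 9, 15, 7, 11, 2, 14],
    "event":   [1, 1, 0, 1, 1, 0, 1, 0, 1, 1],
    "age":     [70, 65, 50, 80, 60, 45, 72, 55, 78, 52],
    "treated": [0, 0, 1, 0, 1, 1, 0, 1, 0, 1],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
print(cph.summary)  # the exp(coef) column is the hazard ratio for each variable
```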
4 key steps to systematic reviews
- Specifying the question
  - type of study, subjects, inclusions, exclusions, intervention/exposure, outcome
- Identifying studies
  - reproducible, unbiased, comprehensive
- Extracting the data
  - standardised proforma, study methodology details, assessment of study quality
- Interpreting the results
  - fixed or random effects, publication bias, heterogeneity
Why should a meta-analysis be done
Large sample size
Increases power
Reduces risk of Type 2 error
Smaller confidence intervals
Weighted average
An average where the results of some studies make a greater contribution to the total than others
Large weighting when:
larger sample size
higher event rates (estimated more precisely)
Pooled result = a single estimate based on the combined size of all the studies
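A minimal Python sketch of fixed-effect inverse-variance weighting with invented study results:

```python
import numpy as np

# Hypothetical log odds ratios and standard errors from four studies.
log_or = np.array([-0.4, -0.2, -0.5, -0.1])
se = np.array([0.20, 0.10, 0.30, 0.15])

weights = 1 / se**2                                # precise (large) studies weigh more
pooled = (weights * log_or).sum() / weights.sum()  # fixed-effect pooled estimate
pooled_se = np.sqrt(1 / weights.sum())
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled OR = {np.exp(pooled):.2f} (95% CI {np.exp(low):.2f} to {np.exp(high):.2f})")
```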
Forest plot
Axes
- vertical= list of studies
- horizontal=outcome measures, may be odds or risk ratio, means, event rates
Line of no effect: for RR = 1, OR = 1
Size of box= weighting
Variability in MA between studies can be due to
- Chance
  - studies have similar and consistent results, and any differences are due to random variation
  - referred to as homogeneous results
  - as a result of similarities in design/intervention/subjects, these studies merit combination
- Systematic differences
  - differences between studies not due to chance
  - real differences exist between the results of the reviewed studies even after allowing for random variation
  - referred to as heterogeneous results
How to determine heterogeneity
- Forest plot= if CI of studies don’t overlap, likely to be heterogeneity
- Funnel plots, quantified using Cochran’s Q, chi squared, sensitivity analysis, meta-regression
Chi squared statistic on forest plots
Keep null hypothesis in mind
Probability of differences arising from chance= P. To calculate P, chi squared is calculated for meta-analysis
Chi squared, DF, P quoted on forest plot
If P<0.05, the variability is not due to chance, that is, the results are heterogeneous. There is some methodological difference in the way the individual studies were carried out
Identifying heterogeneity quickly
Use statistical tables to look up P values, compare chi squared with its degrees of freedom
If statistic is bigger than DF, then there is evidence of heterogeneity
Z statistic
If Z > 2.2 the results are heterogeneous: the null hypothesis can be rejected
Dealing with heterogeneity
If heterogeneity present, things that can be done
- use a random effects model: assumes the true treatment effects in the individual studies may differ from each other (if homogeneous, use a fixed effects model: every study is evaluating a common treatment effect)
- subgroup analysis
- meta regression
Approaches to publication bias
Prevention:
- trial registers
- trial amnesty
Identification:
- funnel plots
- Galbraith plot
Shape of funnel plot
If SE is on the y axis, larger studies sit at the bottom of the funnel and smaller studies at the top
If 1/SE is used, the opposite is true: larger studies at the top
Asymmetry of funnel plot suggests publication bias
r2
In statistics, the coefficient of determination, denoted R² or r² and pronounced “R squared”, is the proportion of the variance in the dependent variable that is predictable from the independent variable(s).
Purpose for systematic review/meta-analysis
different studies can be formally compared to establish generalisability of findings and consistency of results.
reasons for heterogeneity (inconsistencies) can be identified and a new hypothesis can be generated for different subgroups
Types of selection bias in SR/MA and ways to minimise
- Publication bias
- Language bias
- Indexing bias
- Inclusion bias
- Multiple publication bias
Minimised by a comprehensive search through databases to find relevant studies
How are trials weighted
- Allocation concealment
- Randomisation
(1 and 2 minimise selection bias)
- Blinding (measurement bias)
- ITT analysis (attrition bias)
Types of heterogeneity
- Clinical heterogeneity: introduced due to clinical differences in populations included in the study
- Statistical heterogeneity (can detect using statistics measures)
- Methodological heterogeneity
Detecting heterogeneity
- Eyeballing a forest plot: if the CIs all overlap, no heterogeneity
- Galbraith plot: the ratio of the log odds ratio to its SE for each study, plotted against the reciprocal of the SE. If there is no statistically significant heterogeneity, 95% of studies will lie within a band 2 units above and below the overall log ratio; 5% will be outside, just by chance
- Statistics:
  - chi squared or Q test for heterogeneity
  - df, and the I2 statistic calculated from Cochran's Q, give the extent of heterogeneity
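A minimal Python sketch of Cochran's Q and I2 using the same kind of invented study data:

```python
import numpy as np

# Hypothetical study effects (log odds ratios) and standard errors.
log_or = np.array([-0.4, -0.2, -0.5, -0.1])
se = np.array([0.20, 0.10, 0.30, 0.15])

weights = 1 / se**2
pooled = (weights * log_or).sum() / weights.sum()
q = (weights * (log_or - pooled)**2).sum()  # Cochran's Q
df = len(log_or) - 1
i2 = max(0.0, (q - df) / q) * 100           # % of variability beyond chance
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.0f}%")
```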
Methods to manage heterogeneity
Fixed effects model-> assumes all studies measuring same thing
Random effects-> studies estimating different treatment effects
Sensitivity analysis = check the robustness of results by changing parameters within the study
Data transformation-> continuous to dichotomous
Subgroup analyses
Meta-regression analyses-can test if there’s different treatment effects in different subgroups.
Cost benefit analysis
All effects measured in dollars
Adv:
easy to interpret
when net benefit >0, the new treatment's extra benefits are worth more than the extra cost
Dis:
it is difficult to measure the value of all health outcomes in dollars
possible moral objections, e.g. when value depends on ability to pay
Cost UTILITY analysis
Two effects: quality and length of life
Their product is taken as the QALY
Adv:
outcomes involve both quality and length of life.
QALY is universal, so easily compared
Dis:
QALY measures vary by method
May vary by respondent
Society may value a QALY for different patient group differently
Cost effectiveness analysis
Adv:
One effect measured in “natural units”
incremental cost effectiveness
Dis:
Only one outcome will represent the effect of treatment, however other outcomes may be relevant
Cost-minimisation analysis
Not worried about outcomes
Adv: only need to collect cost data
Dis:
Few treatments have identical outcomes
Researchers likely need to collect the effect data to verify the equal effect assumption
Advantages and limitations of economic analysis
Adv:
Systematic evaluation of costs and consequences
Answers the question of whether the intervention is worth the cost (including when it is cheaper than the comparator), allowing decision makers to render judgments
Better advocacy in health care
Makes decision making explicit
Can be used to guide priority setting
Limitations:
The primary limitation of summarising costs and consequences in an ICER ‘price tag’ is that decision makers, assuming they find the economic evaluation useful, must still decide whether the extra gain is worth the extra cost.
Does not explicitly consider decision makers budget
May over simplify health decisions
CEA is only done when, and calculation
When the intervention is either more expensive and more effective, or less expensive and less effective
ICER = (C1 - C2)/(E1 - E2)
All should have sensitivity analysis done, to test the extent to which changes in the parameters used in the analysis may affect the results obtained
An intervention results in a patient living for an addition 4 years , rather than dying within one year, but where QOL reduced from 1 to 0.6 will generate
- extra years: 4 x 0.6 = 2.4 QALYs
- less 1 year at reduced quality: 1 - 0.6 = 0.4 QALYs
- QALYs generated = 2.4 - 0.4 = 2
Advantages and disadvantages of QALY
Adv:
combine estimate of extra quantity and quality of life provided by the intervention in one measure
Can compare interventions or programs in the same therapy area
Making healthcare decisions and allocation of resources
Setting priorities with respect to healthcare interventions
Disadvantages:
Values assigned may not reflect values of patients receiving treatments
May lack sensitivity within disease area
May be derived from population studies that may not be generalisable to the population you are treating
20-40,000 for 1 QALY gained or one DALY averted is usually considered cost effective
How to calculate incremental net benefit
Cost effectiveness analysis
INB:
(extra effect x willingness to pay) - extra cost
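A worked Python sketch of the ICER and INB formulas with invented costs and effects:

```python
# Hypothetical: new treatment vs comparator.
cost_new, cost_old = 12000, 8000   # total costs
effect_new, effect_old = 3.0, 2.5  # effects in QALYs
willingness_to_pay = 20000         # per QALY

icer = (cost_new - cost_old) / (effect_new - effect_old)
inb = (effect_new - effect_old) * willingness_to_pay - (cost_new - cost_old)
print(icer, inb)  # ICER = 8000 per QALY; INB = 6000 > 0, cost effective at this threshold
```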
Interpreting CEAC
represents uncertainty in cost effectiveness analysis
Indicates the probability that the intervention is cost effective as compared with the alternative.
Bootstrapping
Commonest method used to construct CEACs
Constructing CIs and then visually representing them as a CEAC
Methods in qualitative research
- Grounded theory: used to develop a theory ‘grounded’ in the group's observable experiences
- Phenomenological approach- to gain a better understanding of everyday experiences of a group of people
- Ethnographic approach- learns about culture by observing people from that culture
Data sampling methods in qualitative research
- Purposive: purposefully selecting wide range of informants to explore meanings and also select key informants with important sources of knowledge
- Theoretical sampling: a type of purposive sampling in which the developing theory or explanation guides the process of sampling and data collection. The analyst makes an initial sample, codes, collects and analyses data, and produces a preliminary theory, before deciding which further data to collect
- Convenience sampling
- Snowball: for elusive target populations; participants are asked to identify others with direct knowledge relevant to the study
- Extreme case sampling: participants chosen because their knowledge or experience is atypical or unusual in some way relevant to the study being conducted
Data collection in qualitative, triangulation
- Interviews
- Focus groups
- Participant observation
Triangulation-> multiple data gathering techniques or multiple sources
- investigator
- data
- method
Data saturation
despite further data gathering and analysis, understanding is not developed further. At this point, data collection and sampling ends
Data analysis in qualitative
- Meaning focused: code relevant themes within data. Understand experiences and meaning
- Discovery focussed: analysis of segments of text, coded, sorted and organised, looking for patterns or connections
- Constant comparison: test and re-test
Clear transparent process = audit trail
Minimising bias in qualitative
- Transparency
- Bracketing (exclusion of preconceptions)
- Reflexivity: researchers aware of own preconceptions. Reader can weigh researcher’s role in the conduct of the study
- Member checking: the researcher returns to one or more participants to check the researcher's interpretations of what the participants have said