Critical numbers🔢 Flashcards
Categorical variables can be…
Binary, ordinal and nominal
Numeric variables can be…
Discrete or continuous
What is binary data?
Only two categories, e.g. positive/negative
What is ordinal data?
Categories with a natural order, e.g. stage of cancer
What is nominal data?
Categories with no natural or universally agreed order, e.g. blood group
What is discrete/count data?
Observations can only take certain numerical values, e.g. number of children
What is continuous data?
Observations can take any value within a range, e.g. age/height
What happens when continuous variables are categorised?
The variable type switches from continuous to ordinal, e.g. age in years into age categories
Frequency definition
How often an event occurs in a population group at risk
What is the term for the number of existing cases in a population at a defined timepoint?
Prevalence
What is the term for the number of new cases in a population over a defined period?
Incidence
What is prevalence dependent on?
The incidence and duration of the event
True or false: the term risk can be used to quantify both desirable and undesirable outcomes
True
How do you calculate a proportion?
The number experiencing the event divided by the total (scale 0 to 1), e.g. three people with type I diabetes in a sample of 1000 participants = 3/1000 = 0.003
How to calculate a percentage from a proportion?
Multiply by 100 and report as a percentage (scale 0 to 100%), e.g. 0.003 × 100 = 0.3% of the sample had type I diabetes
How to convert the proportion to the number per quantity of people?
Multiply by the number of participants, e.g. 0.003 × 1000 = 3 cases per 1000 participants
How to calculate rates from number per quantity of people?
Divide by the length of time, e.g. 10 deaths per 1000 people per year
How to calculate odds?
The number or proportion with an event divided by those without the event, e.g. the odds of having type I diabetes in the previous example were:
3/997 ≈ 0.003 (using the actual participant counts)
0.003/0.997 ≈ 0.003 (using the proportions)
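The calculations on the cards above can be sketched in a few lines of Python, using the type I diabetes example (3 cases in a sample of 1000 participants):

```python
# Proportion, percentage, number per 1000 and odds, using the
# type I diabetes example from the cards (3 cases in 1000 participants).
cases = 3
total = 1000

proportion = cases / total          # 3/1000 = 0.003
percentage = proportion * 100       # 0.3% of the sample
per_1000 = proportion * 1000        # 3 cases per 1000 participants
odds = cases / (total - cases)      # 3/997 ≈ 0.003

print(proportion, percentage, per_1000, odds)
```

Note how close the odds (3/997) and the proportion (3/1000) are here: when an event is rare, odds and risk are nearly identical.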
What is the term for the difference in proportions between groups (subtraction)?
Risk difference
What is the term for the risk in one group divided by the risk in another group (division)?
Risk ratio AKA relative risk
What is the term for odds in one group divided by odds in the other (division)?
Odds ratio
What will the risk difference be when there is no difference?
Zero
What will the odds and risk ratios be if there is no difference?
1
What do risk/odds ratios >1 indicate?
A higher risk/odds in the group of interest
What do risk/odds ratios <1 indicate?
A lower risk/odds in the group of interest
How to calculate the number needed to treat (NNT) and what does this mean?
NNT = 1/risk difference = x
This means x patients at risk would need to be treated to prevent 1 additional case of the outcome, e.g. x patients treated with aspirin to prevent 1 additional case of hypertension.
How to calculate the relative risk reduction?
Calculate the risk ratio, subtract it from 1 and multiply by 100 to get the % reduction in risk, e.g. a risk ratio of 0.8 gives (1 − 0.8) × 100 = a 20% reduction
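The comparative measures from these cards can be sketched together. The 2×2 numbers below are invented purely for illustration:

```python
# Risk difference, risk ratio, odds ratio, NNT and relative risk reduction
# for a hypothetical two-group comparison (all numbers invented):
# treated group: 10 events out of 100; control group: 20 events out of 100.
events_t, n_t = 10, 100
events_c, n_c = 20, 100

risk_t = events_t / n_t                            # 0.10
risk_c = events_c / n_c                            # 0.20

risk_difference = risk_t - risk_c                  # -0.10 (lower risk if treated)
risk_ratio = risk_t / risk_c                       # 0.50
odds_ratio = (events_t / (n_t - events_t)) / (events_c / (n_c - events_c))

nnt = 1 / abs(risk_difference)                     # treat 10 to prevent 1 extra event
relative_risk_reduction = (1 - risk_ratio) * 100   # 50% reduction in risk

print(risk_difference, risk_ratio, round(odds_ratio, 3), nnt, relative_risk_reduction)
```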
Why do we need both risk and odds?
Odds can be converted to and from probability: odds = probability/(1 − probability)
Odds have symmetry: the odds of outcome Y are the reciprocal of the odds of outcome not-Y
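Both properties are easy to check numerically. A minimal Python sketch, using a made-up probability of 0.2:

```python
# Converting between probability and odds, and checking the symmetry
# property: the odds of Y are the reciprocal of the odds of not-Y.
def odds_from_prob(p):
    return p / (1 - p)

def prob_from_odds(o):
    return o / (1 + o)

p = 0.2                              # hypothetical probability of an event
o = odds_from_prob(p)                # 0.2/0.8 = 0.25

print(prob_from_odds(o))             # round trip recovers the probability
print(o * odds_from_prob(1 - p))     # odds(Y) x odds(not-Y) = 1
```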
Defining features of the ecological study design
Population-level data
Advantages of ecological study design
· Can look at trends between regions or over time
· Data collection fairly easy (tends to be routinely-collected)
· Fast
· Inexpensive
· Few ethical issues
Disadvantages of ecological study design
· Cannot determine individual-level associations (ecological fallacy)
· Cannot demonstrate cause and effect
· Lack of control over variables collected
Defining features of cross-sectional study design
Outcome and exposure status measured simultaneously
Advantages of cross-sectional study design
· Can look at associations – hypothesis-generating
· Data collection fairly easy
· Can study multiple exposures and outcomes
· Fast
· Inexpensive
Disadvantages of cross-sectional study design
· Cannot demonstrate cause and effect
· Prone to bias and confounding
· Not useful for rare exposures or outcomes
Defining features of case-control study design
Participants selected on the basis of outcome
Advantages of case control study design
· Can look at association between outcome and prior exposures
· Fast
· Inexpensive
· Good for studying rare outcomes
· Can study multiple exposures
Disadvantages of case-control study design
· Cannot determine incidence/risk of outcome
· Limited control over data quality – poor historic records or recall bias
· Retrospective nature limits ability to determine causality
· Not useful for rare exposures
Defining features of cohort study design
Participants selected on the basis of exposure
Advantages of cohort study design
· Can look at incident cases and associations with exposure
· Good for studying rare exposures
· Can study multiple outcomes
· Control over data collected
· Exposure determined before outcome occurs so can demonstrate temporality for potential cause and effect
Disadvantages of cohort study design
· Mostly prospective which can be time consuming
· Risk of loss to follow-up
· Expensive
· Not useful for rare outcomes
Defining features of randomised controlled trials
Participants randomly allocated to interventions then followed up to compare outcomes
Advantages of randomised controlled trials
· Can study intervention effects on outcome(s)
· Random allocation means confounding factors should be evenly distributed
· Control over variables collected
· Comparator group means ability to account for placebo/temporal effects
· Less prone to bias, particularly where blinding and objective outcome assessment used
· Gold standard for establishing causality
Disadvantages of randomised controlled trials
· Time consuming
· Expensive
· Require expertise to run
· Can only be used where ethics and participant willingness permit randomisation to intervention
· Overly strict eligibility criteria may render sample not fully representative of population
What does PICO stand for?
Population
Intervention
Comparator
Outcome
What is strength of association?
The stronger an association, the more likely it is to be causal
What is consistency?
Association shown across different studies in different locations, populations, using different methods, etc
What is specificity?
A specific exposure-outcome relationship, e.g. asbestos and asbestosis
What is temporality?
Exposure must precede outcome
What is a biological gradient?
Dose-response, i.e. an increase in exposure leads to an increase in outcome
What is plausibility?
A biological mechanism that would explain outcome development
What is coherence?
Compatible with existing theories
What is experiment?
Outcome altered by experimentation, e.g. the effect is reversible when the exposure is removed
What is analogy?
Similar cause-effect relationships established elsewhere
What is central tendency?
The average or typical values
What is dispersion?
How spread out the data are around these values
What is range?
What the minimum and maximum observed values are
Different types of average calculations
Mean, median and mode
What is standard deviation?
The dispersion around the mean
Which average is reported with which measure of dispersion?
Mean reported with standard deviation
Median reported with a central range
How to calculate standard deviation?
Calculate difference between each observation and the mean
Square them (to make them all positive values)
Sum them
Divide by the number of observations minus 1
Take the square root (to reverse the earlier squaring)
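The five steps above, written out for a small illustrative sample (the numbers are arbitrary):

```python
from math import sqrt
from statistics import stdev  # used only to check the manual result

data = [2, 4, 4, 4, 5, 5, 7, 9]                   # arbitrary illustrative sample
n = len(data)
mean = sum(data) / n

squared_diffs = [(x - mean) ** 2 for x in data]   # steps 1-2: difference, squared
variance = sum(squared_diffs) / (n - 1)           # steps 3-4: sum, divide by n - 1
sd = sqrt(variance)                               # step 5: square root

print(round(sd, 3))
assert abs(sd - stdev(data)) < 1e-12              # matches the library function
```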
What is variance?
Standard deviation squared
What is the range?
The lowest value to the highest value
What are centiles?
The median is the 50th centile. We can describe spread using centiles around that, e.g. 5th to 95th centile gives the 90% central range.
What is the interquartile range (IQR)?
The 25th to the 75th centile, which gives the central 50% range.
What is normal distribution?
AKA Gaussian distribution or bell-shaped curve
The standard deviation gives us info on the shape of the distribution
What is a skewed distribution?
A distribution that is not symmetric: the mean is pulled out towards the extreme values, so the mean and median differ (e.g. in a right-skewed sample the median is lower than the mean)
Usually report the median and centiles for skewed data
Why do we not just always use the median?
Sometimes we need the mean, e.g. for parametric statistical methods that rely on distributional assumptions
What is a parametric statistical model?
One that makes distributional assumptions, e.g. assumes the data follow a normal distribution
What is a non-parametric statistical model?
One that makes no distributional assumptions (distribution free)
What are the properties of the normal distribution?
Symmetric (mean, median and mode are equal)
The empirical or 68–95–99.7 rule:
68% of values lie within 1 SD of the mean
95% of values lie within 2 SD of the mean
99.7% of values lie within 3 SD of the mean
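The 68–95–99.7 rule can be checked numerically by simulating normally distributed data. A sketch (the seed and sample size are arbitrary):

```python
import random
from statistics import mean, stdev

# Simulate a large normal sample and count how many values fall
# within 1, 2 and 3 standard deviations of the mean.
random.seed(0)
sample = [random.gauss(0, 1) for _ in range(100_000)]
m, s = mean(sample), stdev(sample)

for k, rule in [(1, 68), (2, 95), (3, 99.7)]:
    within = sum(abs(x - m) <= k * s for x in sample) / len(sample)
    print(f"within {k} SD: {within:.1%} (rule says ~{rule}%)")
```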
What do we compare between groups of numeric data?
Compare measures of central tendency
E.g mean of one vs mean of another
What is correlation?
A measure of linear relationship between variables
Quantified by the correlation coefficient r
What does the value of r show?
r is bound between -1 and 1
The closer to |1|, the stronger the correlation
The closer to 0, the weaker the correlation
Can be positive or negative correlation
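A minimal sketch of how r is calculated from its definition, with invented data showing the two extremes:

```python
from math import sqrt

# Pearson's correlation coefficient r, computed from its definition:
# the covariance of x and y divided by the product of their spreads.
def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2 * x + 1 for x in xs]))   # perfectly positive, r ≈ 1
print(pearson_r(xs, [-x for x in xs]))          # perfectly negative, r ≈ -1
```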
What is hypothesis testing?
We can perform a statistical test to determine how likely the result we have observed is ‘real’
Or if it is more likely there is in fact no true difference and we are just seeing chance sampling variation
To do this we test the hypothesis of no difference between groups
We then weigh up the strength of the evidence against that hypothesis
And decide if we should reject that hypothesis
Probability
But first, let’s recap probabilities…
Probability values range from 0 to 1
(though as you’ve seen we often x100 to express as a percentage)
A probability of 0 means an event is impossible
A probability of 1 means an event is certain
So the smaller the probability the less likely the outcome
The probability of heads from a fair coin toss is 1/2 = 0.5
The probability of rolling a 3 on a die is 1/6 = 0.17
The probability of 7 from a single roll is 0/6 = 0
Step 1 of hypothesis testing
Define the null hypothesis:
This is typically the theory we want to disprove
The cynic’s belief
“The null is dull”
We will assume this hypothesis is true until we see sufficient evidence to the contrary
Denoted H0
In our example:
H0 = no difference in mean IQ between groups
Step 2 of hypothesis testing
Define the alternative hypothesis:
This is the opposite theory to the null
Denoted HA or H1
In our example:
HA = there is a difference in mean IQ between groups
Using statistical notation…
H0: μSHEF – μMU = 0
HA: μSHEF – μMU ≠ 0
Describe step 3 of hypothesis testing
Choose a significance level for the test:
This is how we determine whether our result is statistically significant
It is also the probability we make a false positive conclusion and reject the null hypothesis when it is in fact true
So we need to minimise this risk
Typically it is set around 0.05 (so 5%)
Step 4 of hypothesis testing
Perform an appropriate statistical test:
Panic not - we won’t be performing these tests by hand!
But for information… we use the sample data from our two groups to calculate a test statistic
This effectively reduces all of our data down to a single value
We then compare that test statistic against the distribution we would expect under the null hypothesis and work out the probability of our result if the null were true
Hypothesis tests are basically a comparison of that which we observed versus that which we expect under the null
Some examples of statistical tests (journal articles should state which test has been performed)
For numeric outcomes: t-test, ANOVA/ANCOVA, linear regression
For categorical outcomes: Chi-squared test, logistic regression
Step 5 of hypothesis testing
Decision time:
We use the probability value from the statistical test to weigh up the strength of the evidence against the null hypothesis
We call this probability value the p-value
The p-value is the probability of seeing an effect of the observed magnitude or greater if the null hypothesis were true
If the p-value is high the result is probable under the null hypothesis… so it is likely the null hypothesis is true
The smaller the p-value, the less likely it is we would see our observed result under the null hypothesis
If the p-value is smaller than our significance level (so < 0.05 in our example) we reject the null hypothesis and declare the result statistically significant
Example of step 5
In our example we had a mean difference in IQ of 9 points
We performed a statistical test on the difference in means
The p-value from that test was p = 0.003
This means the probability of seeing a difference of 9+ points on the IQ scale under the assumption the two groups are the same is 0.003 (or 0.3%)
So pretty unlikely
0.003 is lower than 0.05 so we reject the null hypothesis of no difference
Our conclusion would be there is evidence to suggest University of Sheffield students have higher average IQ than MadeUp University students
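The five steps can be sketched end-to-end. The lecture doesn't specify which test was run, so this is an illustrative two-sample z-test on invented summary statistics, chosen so the mean difference is 9 IQ points as in the example:

```python
from math import sqrt
from statistics import NormalDist

# A sketch of steps 4-5 as a two-sample z-test on summary statistics.
# The group means, SDs and sizes are invented for illustration.
mean_a, sd_a, n_a = 109, 15, 50
mean_b, sd_b, n_b = 100, 15, 50

se_diff = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)   # SE of the difference in means
z = (mean_a - mean_b) / se_diff                 # the test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided p-value

print(f"z = {z:.1f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis of no difference")
```

With these invented numbers the p-value works out at roughly 0.003, mirroring the lecture example.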
When to use standard deviation vs standard error
The standard Deviation is for Describing
The standard Error is for Estimating
What is standard error?
The standard error indicates how different a sample mean is likely to be from the population mean
It tells us the precision of estimation
The smaller the standard error of the mean, the more precise our estimate of the mean
i.e. the closer it is likely to be to the true population mean
We estimate the standard error using our sample size (n) and standard deviation (SD)
Standard error of the mean = SD/√n
How is precision affected?
This means our precision is affected by these two things: how variable our data are (the SD) and how large our sample is (n)
This makes sense because the less variable the data are, the more precise our estimation.
The more people we sample, the better the representation and therefore the more precise our estimation.
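A sketch of SE = SD/√n on an arbitrary small sample, including a check of how sample size drives precision:

```python
from math import sqrt
from statistics import stdev

# Standard error of the mean = SD / sqrt(n), on an arbitrary sample.
sample = [12, 15, 11, 14, 13, 16, 12, 15]
n = len(sample)
sd = stdev(sample)
sem = sd / sqrt(n)

print(round(sem, 3))
# Precision improves with sample size: quadrupling n halves the SE
# (assuming the SD stays the same).
print(round(sd / sqrt(4 * n), 3))
```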
What is a confidence interval?
A range of values around the sample estimate within which we are reasonably confident the true population value lies
For a mean, a 95% CI is approximately: estimate ± 1.96 × standard error
Statistical significance vs clinical importance
Just because a result is statistically significant doesn’t mean it is clinically important!
Statistical significance just means an observed result is unlikely under the null hypothesis
Clinical importance means the result is practically important/meaningful in the real world
A very large study may find a very small effect to be statistically significant… but is it meaningful?
e.g. a reduction in flu symptoms from 72 to 70 hours? Meh
a reduction from 72 to 24 hours? Noticeable improvement
The magnitude of the effect itself is important and we can use confidence intervals to help judge clinical importance
What is correlation?
Quantifies the linear association between two numeric variables
Variable order doesn’t matter (correlate x~y or y~x)
What is regression?
Allows one variable to be predicted from the other
Order matters (predict y from x)
Can handle multiple predictors (predict y from x1, x2, x3…)
Variables don’t have to be numeric
Regression equation
y = mx + c, or y = a + bx
y and x represent the variables we are looking at
a and b describe the linear relationship between them
What does y represent?
Variable being predicted
The “response”
or “outcome”
or “dependent” variable (its value depends on x)
What does a represent?
The intercept of the regression line
Also known as the ‘constant’
This is the point at which the line crosses the y axis when x = 0
What does b represent?
The regression coefficient
The slope (gradient) of the regression line
The larger the value the steeper the slope
The sign indicates the direction of effect
It is the change in y associated with a unit change in x
What does x represent?
The “predictor”
or “explanatory”
or “independent” variable
It is the variable we are using to predict y
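The fitted values of a and b come from ordinary least squares. A hand-rolled sketch with invented, noise-free data:

```python
# Fitting y = a + bx by ordinary least squares, written out by hand.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx               # the fitted line passes through (mean x, mean y)
    return a, b

# Invented noise-free data following y = 2 + 3x:
xs = [0, 1, 2, 3, 4]
ys = [2, 5, 8, 11, 14]
a, b = fit_line(xs, ys)
print(a, b)   # recovers intercept 2.0 and slope 3.0
```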
What is multiple regression?
More than one predictor
For example, FEV1 is typically predicted from characteristics such as height AND age AND sex AND ethnicity
We can extend our regression model
y = a + b₁x₁ + b₂x₂ + …
FEV1 = a + b₁(height) + b₂(age) + …
Modelling multiple variables together is:
More realistic
More efficient
More accurate
We can include multiple predictors, categorical or numeric, and create more useful models.
Advantages of multiple regression
We can adjust or control for the effects of other variables. Remember confounding from lecture 1? Incorporating multiple variables in a model means we can adjust our variables of interest for the effects of potential confounders.
We can analyse the simultaneous effects of multiple variables on an outcome and look for independent predictors or interaction effects
We can make predictions based on combinations of risk factors – this is essential in clinical prediction modelling
Regression equation for an RCT
Outcome at follow-up = a + b₁(outcome at baseline) + b₂(sex) + b₃(age) + b₄(treatment)
The b₄ coefficient tells us the adjusted treatment effect, e.g. adjusted between-group difference in means.
What is prognostic modelling?
Prognostic modelling uses advanced regression techniques to predict the risk of illness or future course of illness for an individual based on their individual combination of clinical and non-clinical characteristics
Move towards stratified medicine
Informs clinical decision-making
Healthcare professionals use statements about prognosis to:
Inform patients and families about likely future outcomes
Guide decisions regarding course of treatment
Internal validity
What do we want to know when doing critical appraisal?
Did the study address clear research question(s)?
Were appropriate methods used to answer question(s)?
Are the results valid? Low risk of bias? Correctly interpreted?
If so, what are the potential implications for practice?
So we need to consider all aspects of a study:
Question, design, conduct, analysis, interpretation, reporting
Questions needed to critically appraise an article
- Is the research question clear and focussed?
- Is the study design appropriate?
- What are the strengths/weaknesses?
- Is there potential for bias? Has it been addressed?
- Are the analysis methods appropriate?
- Have the results been interpreted correctly?
- Are the findings relevant to your practice?
How to appraise the question
Did the study address a clearly focussed research question/hypothesis?
From the information provided can you identify the PICO/PECO elements?
Population
Intervention or Exposure
Comparator
Outcomes
Appraising the design and conduct
Is the design appropriate given the research question?
Strengths/limitations? Where does it sit in the hierarchy of evidence?
Was the methodology sound? The design may technically be high in the hierarchy but if not well executed, the strength of the evidence may still be low.
E.g. We put RCTs above cohort studies in the hierarchy of evidence
But if an RCT is very poorly conducted and has extremely high risk of bias…
Would we consider it better than very well executed prospective cohort studies?
Further appraisal of design and conduct
Some design aspects to consider are relevant to all studies, e.g.
Is the sample representative of the population of interest?
Have potential sources of bias/confounding been addressed?
Were clearly defined, objective outcome measures used?
Other aspects will differ by study design, e.g. transparency regarding non-compliance or loss to follow-up in an RCT; control selection in observational studies.
If the study was registered prior to being conducted (e.g. published protocol) did the final publication reflect the original plan? If not, were deviations transparently explained?
If a study is well-reported there should be sufficient information to enable replication (or at least references/links provided to said information).
Reporting guidelines
Appraising descriptive statistics
Are all recruited participants accounted for?
Are summary statistics well reported, e.g. appropriate measures of central tendency reported with measures of dispersion?
Were the authors transparent regarding missing or incomplete data? (Look for Ns on tables, flowcharts and graphs or in text)
Other signs of poor reporting, e.g. unclear units of measurement, mislabelling of figures, inconsistent/truncated axes, etc.?
Multiple hypothesis tests on descriptive data?
Appraising inferential statistics
Are results given for all outcomes (if not in main body, in supplementary material or another paper referenced)? Any evidence of ‘cherry picking’?
Any inappropriate data manipulation, e.g. arbitrary categorisation of continuous outcomes or questionable assumptions made?
Were the statistical tests appropriate given the type of data and the research question?
If using NHST were p-values reported correctly? i.e. not just “NS” or “p>0.05”
Were confidence intervals reported?
If reporting risk, is it clear whether absolute or relative?
Any inappropriate reporting, e.g. incident risk from a case-control study?
Was sample size estimation/power calculation performed a priori?
Were assumption checks performed for statistical models? Was goodness of fit determined?
Multiplicity?
Appraising the results
What are the results?
Was the reporting comprehensive?
Are the results believable?
Do they fit with other available evidence?
(Can apply Bradford Hill criteria here)
Do the design and analysis allow for the conclusions drawn, e.g. causality?
Are the results clinically/practically important? Focus on effect size and precision rather than just statistical significance.
Can the findings be used to inform practice?
How are scoping or narrative reviews different from systematic reviews?
- They also summarise available research on a given topic
- The question may be broader
- They do not necessarily follow such strict, standardised, transparent methodology
- They are therefore less rigorous, more subjective and can be more prone to selection bias
What is a systematic review?
A review that summarises the available research on a focused question
It follows strict, standardised, transparent methodology to identify, appraise and synthesise all relevant studies
It is therefore more rigorous, more objective and less prone to selection bias than narrative or scoping reviews
What are the steps of a systematic review?
1) Specify research question (and check recent systematic review doesn’t already exist!)
2) Develop search strategy and inclusion/exclusion criteria
3) Identify relevant studies
4) Assess quality and risk of bias
5) Extract results from each study
6) Pool results
7) Answer research question
(Potential final step: update review at a later date if further primary evidence becomes available)
Outline meta-analysis
- “The analysis of analyses”
- Statistical method for combining evidence from different (separate but related) sources
- In the early 1900s, Karl Pearson used a meta-analytic approach in a BMJ paper
- Methodological and computational advances later increased the use of these approaches
- Meta-analysis is often (but not always) used in systematic reviews
Outline a meta-analysis forest plot
One row per study
The point estimate is shown as a square with size proportional to the size of the study
The horizontal lines are confidence intervals
The x axis is a measure of effect – in this case, the odds ratio
The solid vertical line indicates the line of no effect (the null, so given these are ORs, the null value = 1)
The diamond shows the pooled estimate from the meta-analysis
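The diamond's pooled estimate is typically a weighted average. A minimal sketch of fixed-effect (inverse-variance) pooling, with three invented study results on the log odds ratio scale:

```python
from math import sqrt

# Fixed-effect (inverse-variance) pooling - the kind of calculation behind
# the diamond on a forest plot. The three study estimates (log odds ratios)
# and their standard errors are invented for illustration.
estimates = [0.25, 0.10, 0.40]
std_errors = [0.12, 0.08, 0.20]

weights = [1 / se**2 for se in std_errors]    # more precise studies get more weight
pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
pooled_se = sqrt(1 / sum(weights))

print(f"pooled log(OR) = {pooled:.3f} (SE {pooled_se:.3f})")
```

Note the pooled SE is smaller than any single study's SE, which is why the diamond is narrower than the individual confidence intervals.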
What are fixed effects or random effects?
These are different approaches to meta-analysis (just for info – you won’t need to go into this level of detail).
What is heterogeneity?
A measure of variation between different studies (heterogeneous = different; homogeneous = similar).
Outline sensitivity analysis
Analysis to test the robustness of the findings of primary analysis – looks at the effect of assumptions or variations in approach.
Outline PRISMA
Preferred Reporting Items for Systematic Reviews and Meta-Analyses. These are guidelines aimed to improve the reporting of systematic reviews. First published 2009; updated 2020.