Critical numbers🔢 Flashcards

1
Q

Categorical variables can be…

A

Binary, ordinal and nominal

2
Q

Numeric variables can be…

A

Discrete or continuous

3
Q

What is binary data?

A

Only two categories, e.g. positive/negative

4
Q

What is ordinal data?

A

Categories with a natural order, e.g. stage of cancer

5
Q

What is nominal data?

A

Categories with no natural or universally agreed order, e.g. blood group

6
Q

What is discrete/count data?

A

Observations can only take certain numerical values, e.g. number of children

7
Q

What is continuous data?

A

Observations can take any value within a range, e.g. age or height

8
Q

What happens when continuous variables are categorised?

A

The variable type switches from continuous to ordinal, e.g. age in years grouped into age categories

9
Q

Frequency definition

A

How often an event occurs in a population group at risk

10
Q

What is the term for the number of existing cases in a population at a defined timepoint?

A

Prevalence

11
Q

What is the term for the number of new cases in a population over a defined period?

A

Incidence

12
Q

What is prevalence dependent on?

A

The incidence and duration of the event

13
Q

True or false: the term risk can be used to quantify both desirable and undesirable outcomes

A

True

14
Q

How do you calculate the proportion?

A

The number experiencing the event divided by the total (scale 0 to 1), e.g. three type I diabetics in a sample of 1000 participants = 3/1000 = 0.003

15
Q

How to calculate percentage from proportion?

A

Multiply the proportion by 100 and report it as a percentage (scale 0 to 100%), e.g. 0.003 × 100 = 0.3% of the sample had type I diabetes

16
Q

How to convert the proportion to the number per quantity of people?

A

Multiply by the chosen number of people, e.g. 0.003 × 1000 = 3 cases per 1000 participants

17
Q

How to calculate rates from number per quantity of people?

A

Divide it by the length of time, e.g. 10 deaths per 1000 people per year

18
Q

How to calculate odds?

A

The number (or proportion) with an event divided by the number (or proportion) without the event, e.g. the odds of having type I diabetes in the previous example were:
3/997 ≈ 0.003 (using the actual participant counts)
0.003/0.997 ≈ 0.003 (using the proportions)
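
The proportion, percentage, per-1000 and odds calculations on the cards above can be checked with a short Python sketch. The counts are the ones from the type I diabetes example; the variable names are illustrative only.

```python
# Counts from the type I diabetes example: 3 cases among 1000 participants.
cases = 3
total = 1000

proportion = cases / total                 # scale 0 to 1
percentage = proportion * 100              # scale 0 to 100%
per_1000 = proportion * 1000               # cases per 1000 participants
odds = cases / (total - cases)             # with the event / without the event
odds_from_proportion = proportion / (1 - proportion)  # same odds, from proportions

print(f"proportion = {proportion:.3f}")
print(f"percentage = {percentage:.1f}%")
print(f"per 1000   = {per_1000:.0f}")
print(f"odds       = {odds:.4f}")
```

When the event is rare, as here, the odds and the proportion are almost identical; they move apart as the event becomes more common.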

19
Q

What is the term for the difference in proportions between groups (subtraction)?

A

Risk difference

20
Q

What is the term for the risk in one group divided by the risk in another group (division)?

A

Risk ratio AKA relative risk

21
Q

What is the term for odds in one group divided by odds in the other (division)?

A

Odds ratio

22
Q

What will the risk difference be when there is no difference?

A

Zero

23
Q

What will the odds and risk ratios be if there is no difference?

A

1

24
Q

What do risk/odds ratios >1 indicate?

A

A higher risk/odds in the group of interest

25
Q

What do risk/odds ratios <1 indicate?

A

A lower risk/odds in the group of interest

26
Q

How to calculate numbers needed to treat and what does this mean?

A

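Number needed to treat (NNT) = 1 / risk difference = x, where the risk difference is the absolute difference in risk between the groups
This is the number of patients who would need to be treated to prevent 1 additional case, e.g. x patients at risk would need to be treated with aspirin to prevent 1 additional case of hypertension.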

27
Q

How to calculate risk reduction?

A

Calculate the risk ratio, subtract it from 1, then multiply by 100 to get the % reduction in risk, i.e. (1 − risk ratio) × 100
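
As a rough worked example of the comparison measures on the preceding cards (risk difference, risk ratio, odds ratio, number needed to treat and relative risk reduction), the sketch below uses a made-up two-group trial. The counts are hypothetical and not from the source, and relative risk reduction is taken here as (1 − risk ratio) × 100.

```python
def risk(events, total):
    """Proportion of the group experiencing the event."""
    return events / total


def odds(events, total):
    """Number with the event divided by the number without it."""
    return events / (total - events)


# Hypothetical trial: 20 of 100 controls and 10 of 100 treated have the event.
control_events, control_total = 20, 100
treated_events, treated_total = 10, 100

risk_control = risk(control_events, control_total)
risk_treated = risk(treated_events, treated_total)

risk_difference = risk_control - risk_treated      # 0 means no difference
risk_ratio = risk_treated / risk_control           # 1 means no difference
odds_ratio = odds(treated_events, treated_total) / odds(control_events, control_total)
number_needed_to_treat = 1 / risk_difference
relative_risk_reduction = (1 - risk_ratio) * 100   # % reduction in risk

print(f"risk difference = {risk_difference:.2f}")
print(f"risk ratio      = {risk_ratio:.2f}")
print(f"odds ratio      = {odds_ratio:.2f}")
print(f"NNT             = {number_needed_to_treat:.0f}")
print(f"RRR             = {relative_risk_reduction:.0f}%")
```

Here the risk ratio is below 1, so the treated group (the group of interest) has the lower risk.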

28
Q

Why do we need both risk and odds?

A

Risk is a probability, so we need it to say how likely an outcome is
Odds = risk / (1 − risk)
Odds have symmetry: the odds of outcome Y are the inverse of the odds of outcome not-Y
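
The relationship between risk and odds, and the symmetry of odds, can be illustrated with a small sketch. The function names and the risk of 0.2 are hypothetical choices for illustration.

```python
def odds_from_risk(risk):
    """Odds = risk / (1 - risk)."""
    return risk / (1 - risk)


def risk_from_odds(odds):
    """Inverse conversion: risk = odds / (1 + odds)."""
    return odds / (1 + odds)


risk_y = 0.2                      # hypothetical risk of outcome Y
risk_not_y = 1 - risk_y           # risk of outcome not-Y

odds_y = odds_from_risk(risk_y)
odds_not_y = odds_from_risk(risk_not_y)

print(f"odds of Y     = {odds_y:.2f}")
print(f"odds of not-Y = {odds_not_y:.2f}")
print(f"product       = {odds_y * odds_not_y:.2f}")  # inverse odds multiply to 1
```

Risks do not have this property: the risk of not-Y is 1 minus the risk of Y, not its inverse.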

29
Q

Defining features of the ecological study design

A

Population-level data

30
Q

Advantages of ecological study design

A

· Can look at trends between regions or over time
· Data collection fairly easy (tends to be routinely-collected)
· Fast
· Inexpensive
· Few ethical issues

31
Q

Disadvantages of ecological study design

A

· Cannot determine individual-level associations (ecological fallacy)
· Cannot demonstrate cause and effect
· Lack of control over variables collected

32
Q

Defining features of cross sectional study design

A

Outcome and exposure status measured simultaneously

33
Q

Advantages of cross-sectional study design

A

· Can look at associations – hypothesis-generating
· Data collection fairly easy
· Can study multiple exposures and outcomes
· Fast
· Inexpensive

34
Q

Disadvantages of cross sectional study design

A

· Cannot demonstrate cause and effect
· Prone to bias and confounding
· Not useful for rare exposures or outcomes

35
Q

Defining features of case-control study design

A

Participants selected on the basis of outcome

36
Q

Advantages of case control study design

A

· Can look at association between outcome and prior exposures
· Fast
· Inexpensive
· Good for studying rare outcomes
· Can study multiple exposures

37
Q

Disadvantages of case-control study design

A

· Cannot determine incidence/risk of outcome
· Limited control over data quality – poor historic records or recall bias
· Retrospective nature limits ability to determine causality
· Not useful for rare exposures

38
Q

Defining features of cohort study design

A

Participants selected on the basis of exposure

39
Q

Advantages of cohort study design

A

· Can look at incident cases and associations with exposure
· Good for studying rare exposures
· Can study multiple outcomes
· Control over data collected
· Exposure determined before outcome occurs so can demonstrate temporality for potential cause and effect

40
Q

Disadvantages of cohort study design

A

· Mostly prospective which can be time consuming
· Risk of loss to follow-up
· Expensive
· Not useful for rare outcomes

41
Q

Defining features of randomised controlled trials

A

Participants randomly allocated to interventions then followed up to compare outcomes

42
Q

Advantages of randomised controlled trials

A

· Can study intervention effects on outcome(s)
· Random allocation means confounding factors should be evenly distributed
· Control over variables collected
· Comparator group means ability to account for placebo/temporal effects
· Less prone to bias, particularly where blinding and objective outcome assessment used
· Gold standard for establishing causality

43
Q

Disadvantages of randomised controlled trials

A

· Time consuming
· Expensive
· Require expertise to run
· Can only be used where ethics and participant willingness permit randomisation to intervention
· Overly strict eligibility criteria may render sample not fully representative of population

44
Q

What does PICO stand for?

A

Population
Intervention
Comparator
Outcome

45
Q

What is strength of association?

A

The stronger an association, the more likely it is to be causal

46
Q

What is consistency?

A

Association shown across different studies in different locations, populations, using different methods, etc

47
Q

What is specificity?

A

A specific exposure-outcome relationship, e.g. asbestos and asbestosis

48
Q

What is temporality?

A

Exposure must precede outcome

49
Q

What is a biological gradient?

A

dose-response, i.e. increase in exposure = increase in outcome

50
Q

What is plausibility?

A

biological mechanism that would explain outcome development

51
Q

What is coherence?

A

compatible with existing theories

52
Q

What is experiment

A

Outcome altered with experimentation, e.g. reversible

53
Q

What is analogy?

A

Similar cause-effect relationships established

54
Q

What is central tendency

A

The average or typical values

55
Q

What is Dispersion?

A

How spread out the data are around these values

56
Q

What is range?

A

What the minimum and maximum observed values are

57
Q

Different types of average calculations

A

Mean, median and mode

58
Q

What is standard deviation?

A

The dispersion around the mean

59
Q

Which measure of dispersion is reported with which average?

A

Mean reported with standard deviation
Median reported with a central range

60
Q

How to calculate standard deviation?

A

Calculate difference between each observation and the mean
Square them (to make them all positive values)
Sum them
Divide by the number of observations minus 1
Take the square root (to reverse the earlier squaring)
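
The five steps above can be followed line by line in a short sketch. The data values are hypothetical, and the result is compared against Python's built-in `statistics.stdev`, which also divides by n − 1.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation, following the steps on the card."""
    mean = sum(values) / len(values)
    differences = [v - mean for v in values]        # difference from the mean
    squared = [d ** 2 for d in differences]         # square (all positive)
    total = sum(squared)                            # sum
    variance = total / (len(values) - 1)            # divide by n - 1
    return math.sqrt(variance)                      # square root


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

print(f"by hand:  {sample_sd(data):.3f}")
print(f"built-in: {statistics.stdev(data):.3f}")
```

The value just before the square root is the variance, i.e. the standard deviation squared.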

61
Q

What is variance?

A

Standard deviation squared

62
Q

What is the range?

A

The lowest value to the highest value

63
Q

What are centiles?

A

The median is the 50th centile. We can describe spread using centiles around that, e.g. 5th to 95th centile gives the 90% central range.

64
Q

What is the interquartile range (IQR)?

A

The 25th to the 75th centile, which gives the central 50% range.
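
A minimal sketch of centiles and the IQR using Python's `statistics.quantiles`. The ages are hypothetical, and the `"inclusive"` method is one of several interpolation conventions, so other software may give slightly different values on small samples.

```python
import statistics

# Hypothetical ages (years) for 11 participants.
ages = [23, 25, 29, 31, 34, 36, 41, 45, 52, 60, 78]

# Quartiles: the 25th, 50th (median) and 75th centiles.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")
iqr = q3 - q1

# 99 cut points; index 4 is the 5th centile and index 94 is the 95th.
centiles = statistics.quantiles(ages, n=100, method="inclusive")
c5, c95 = centiles[4], centiles[94]

print(f"median (50th centile): {q2}")
print(f"IQR: {q1} to {q3} (width {iqr})")
print(f"90% central range: {c5} to {c95}")
```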

65
Q

What is normal distribution?

A

AKA Gaussian distribution or bell-shaped curve
Standard deviation gives us info on the shape of the distribution

66
Q

What is skewed distribution?

A

An asymmetric distribution, so the mean and median differ (in the example, the sample has the same mean but the median is lower)
The mean is pulled out towards the extreme values
It allows us to inspect it
Usually report median and centiles with it

67
Q

Why do we not just always use the median?

A

Sometimes need the

68
Q

What is a parametric statistical model

A

Make distributional assumptions

69
Q

What is Non-parametric statistical model?

A

Make no assumptions (distribution free)

70
Q

What are the properties of the normal distribution?

A

Symmetric (mean, median and mode are equal)
The empirical or 68–95–99.7 rule:
68% of values lie within 1 SD of the mean
95% of values lie within 2 SD of the mean
99.7% of values lie within 3 SD of the mean
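
The 68–95–99.7 figures can be reproduced from the standard normal distribution, sketched here with `statistics.NormalDist` from the Python standard library. The helper name `coverage` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def coverage(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {coverage(k) * 100:.1f}%")
```

The rule is a rounding: exactly 95% of values lie within about 1.96 SD of the mean, which is why 1.96 appears in 95% confidence intervals.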

71
Q

What do we compare between groups of numeric data?

A

Compare measures of central tendency
E.g mean of one vs mean of another

72
Q

What is correlation?

A

A measure of linear relationship between variables
Quantified by the correlation coefficient r

73
Q

What does the value of r show?

A

r is bound between -1 and 1
The closer to |1|, the stronger the correlation
The closer to 0, the weaker the correlation
Can be positive or negative correlation
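
For reference, a small sketch of how the correlation coefficient r can be computed, assuming the usual Pearson definition. The example data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


heights = [150, 160, 170, 180, 190]      # hypothetical heights (cm)
weights = [52, 61, 66, 79, 85]           # hypothetical weights (kg)

print(f"r = {pearson_r(heights, weights):.2f}")
print(f"r = {pearson_r(weights, heights):.2f}")  # variable order doesn't matter
```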

74
Q

What is hypothesis testing?

A

We can perform a statistical test to determine how likely the result we have observed is ‘real’
Or if it is more likely there is in fact no true difference and we are just seeing chance sampling variation
To do this we test the hypothesis of no difference between groups
We then weigh up the strength of the evidence against that hypothesis
And decide if we should reject that hypothesis

75
Q

Probability

A

But first, let’s recap probabilities…

Probability values range from 0 to 1
(though as you’ve seen we often x100 to express as a percentage)
A probability of 0 means an event is impossible
A probability of 1 means an event is certain
So the smaller the probability the less likely the outcome
The probability of heads from a fair coin toss is 1/2 = 0.5
The probability of rolling a 3 on a die is 1/6 = 0.17
The probability of 7 from a single roll is 0/6 = 0

76
Q

Step 1 of hypothesis testing

A

Define the null hypothesis:
This is typically the theory we want to disprove
The cynic’s belief
“The null is dull”
We will assume this hypothesis is true until we see sufficient evidence to the contrary
Denoted H0

In our example:
H0 = no difference in mean IQ between groups

77
Q

Step 2 of hypothesis testing

A

Define the alternative hypothesis:
This is the opposite theory to the null
Denoted HA or H1

In our example:
HA = there is a difference in mean IQ between groups
Using statistical notation…

H0: μ_SHEF − μ_MU = 0

HA: μ_SHEF − μ_MU ≠ 0

78
Q

Describe step 3 of hypothesis testing

A

Choose a significance level for the test:
This is how we determine whether our result is statistically significant
It is also the probability we make a false positive conclusion and reject the null hypothesis when it is in fact true
So we need to minimise this risk
Typically it is set around 0.05 (so 5%)

79
Q

Step 4 of hypothesis testing

A

Perform an appropriate statistical test:
Panic not - we won’t be performing these tests by hand!
But for information… we use the sample data from our two groups to calculate a test statistic
This effectively reduces all of our data down to a single value
We then compare that test statistic against the distribution we would expect under the null hypothesis and work out the probability of our result if the null were true
Hypothesis tests are basically a comparison of that which we observed versus that which we expect under the null
Some examples of statistical tests (journal articles should state which test has been performed)
For numeric outcomes: t-test, ANOVA/ANCOVA, linear regression
For categorical outcomes: Chi-squared test, logistic regression

80
Q

Step 5 of hypothesis testing

A

Decision time:
We use the probability value from the statistical test to weigh up the strength of the evidence against the null hypothesis
We call this probability value the p-value
The p-value is the probability of seeing an effect of the observed magnitude or greater if the null hypothesis were true
If the p-value is high the result is probable under the null hypothesis… so there is little evidence against the null hypothesis (this does not prove the null hypothesis is true)
The smaller the p-value, the less likely it is we would see our observed result under the null hypothesis
If the p-value is smaller than our significance level (so < 0.05 in our example) we reject the null hypothesis and declare the result statistically significant

81
Q

Example of step 5

A

In our example we had a mean difference in IQ of 9 points
We performed a statistical test on the difference in means
The p-value from that test was p = 0.003
This means the probability of seeing a difference of 9+ points on the IQ scale under the assumption the two groups are the same is 0.003 (or 0.3%)
So pretty unlikely
0.003 is lower than 0.05 so we reject the null hypothesis of no difference
Our conclusion would be there is evidence to suggest University of Sheffield students have higher average IQ than MadeUp University students
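
To make the logic of steps 4 and 5 concrete, here is a sketch of one possible test, a permutation test on the difference in means. This is a swapped-in technique chosen because it needs no extra libraries; the cards do not say which test was used in the IQ example, and the scores below are invented, so the p-value will not match the 0.003 quoted above.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided p-value for a difference in means under the null.

    Under the null hypothesis the group labels are interchangeable, so we
    shuffle the labels many times and count how often the shuffled
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        difference = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if difference >= observed:
            at_least_as_extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical IQ scores for the two groups.
sheffield = [112, 108, 115, 104, 110, 109, 113, 107]
madeup = [101, 99, 104, 96, 103, 100, 98, 102]

significance_level = 0.05
p_value = permutation_p_value(sheffield, madeup)
print(f"mean difference = {mean(sheffield) - mean(madeup):.2f}")
print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < significance_level else "do not reject H0")
```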

82
Q

When to use standard deviation vs standard error

A

The standard Deviation is for Describing
The standard Error is for Estimating

83
Q

What is standard error

A

The standard error indicates how different a sample mean is likely to be from the population mean

It tells us the precision of estimation

The smaller the standard error of the mean, the more precise our estimate of the mean
i.e. the closer it is likely to be to the true population mean

We estimate the standard error using our sample size (n) and standard deviation (SD)

Standard error of the mean = SD/√n
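
A quick numerical sketch of the formula above. The SD of 15 and the sample sizes are hypothetical values chosen to show how the standard error changes with n.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean = SD / sqrt(n)."""
    return sd / math.sqrt(n)


sd = 15  # hypothetical standard deviation
for n in (25, 100, 400):
    print(f"n = {n:3d}: SE = {standard_error(sd, n):.2f}")
```

Because of the square root, quadrupling the sample size only halves the standard error.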

84
Q

How is precision affected

A

This means our precision is affected by these two things: how variable our data are (the SD) and how large our sample is (n)

This makes sense because the less variable the data are, the more precise our estimation.
The more people we sample, the better the representation and therefore the more precise our estimation.

85
Q

What is a confidence interval?

A
86
Q

Clinical importance of statistical significance

A

Just because a result is statistically significant doesn’t mean it is clinically important!
Statistical significance just means an observed result is unlikely under the null hypothesis
Clinical importance means the result is practically important/meaningful in the real world
A very large study may find a very small effect to be statistically significant… but is it meaningful?
e.g. a reduction in flu symptoms from 72 to 70 hours? Meh
a reduction from 72 to 24 hours? Noticeable improvement
The magnitude of the effect itself is important and we can use confidence intervals to help judge clinical importance

87
Q

What is correlation?

A

Quantifies the linear association between two numeric variables
Variable order doesn’t matter (correlate x~y or y~x)

88
Q

What is regression?

A

Allows one variable to be predicted from the other
Order matters (predict y from x)
Can handle multiple predictors (predict y from x1, x2, x3…)
Variables don’t have to be numeric

89
Q

Regression equation

A

y = mx + c or y = a + bx

y and x just represent the variables we are looking at
a and b explain the linear relationship between them

90
Q

What does y represent

A

Variable being predicted
The “response”
or “outcome”
or “dependent” variable (its value depends on x)

91
Q

what does a represent

A

The intercept of the regression line
Also known as the ‘constant’
This is the point at which the line crosses the y axis when x = 0

92
Q

What does b represent?

A

The regression coefficient
The slope (gradient) of the regression line
The larger the value the steeper the slope
The sign indicates the direction of effect
It is the change in y associated with a unit change in x

93
Q

what does x represent

A

The “predictor”
or “explanatory”
or “independent” variable
It is the variable we are using to predict y
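
To tie the pieces of y = a + bx together, the sketch below estimates a and b from data and uses them to predict y. It assumes ordinary least squares, the usual fitting method, which the cards do not name; the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b in y = a + bx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx               # slope: change in y per unit change in x
    a = mean_y - b * mean_x     # intercept: value of y when x = 0
    return a, b


def predict(a, b, x):
    """Predicted response for a given value of the predictor."""
    return a + b * x


# Hypothetical predictor (x) and response (y) values.
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(x_values, y_values)
print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
print(f"predicted y at x = 6: {predict(a, b, 6):.2f}")
```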

94
Q

What is multiple regression

A

More than one predictor
For example, FEV1 is typically predicted from characteristics such as height AND age AND sex AND ethnicity
We can extend our regression model
y = α + β1x1 + β2x2 + …
FEV1 = a + b1(height) + b2(age) + …
Modelling multiple variables together is:
More realistic
More efficient
More accurate
We can include multiple predictors, categorical or numeric, and create more useful models.

95
Q

Advantages of multiple regression

A

We can adjust or control for the effects of other variables. Remember confounding from lecture 1? Incorporating multiple variables in a model means we can adjust our variables of interest for the effects of potential confounders.
We can analyse the simultaneous effects of multiple variables on an outcome and look for independent predictors or interaction effects
We can make predictions based on combinations of risk factors – this is essential in clinical prediction modelling

96
Q

Regression equation for an RCT

A

Outcome at follow-up = a + b₁(outcome at baseline) + b₂(sex) + b₃(age) + b₄(treatment)
The b₄ coefficient tells us the adjusted treatment effect, e.g. adjusted between-group difference in means.

97
Q

What is prognostic modelling?

A

Prognostic modelling uses advanced regression techniques to predict the risk of illness or future course of illness for an individual based on their individual combination of clinical and non-clinical characteristics
Move towards stratified medicine
Informs clinical decision-making
Healthcare professionals use statements about prognosis to:
Inform patients and families about likely future outcomes
Guide decisions regarding course of treatment

98
Q

Internal validity

A
99
Q

What do we want to know when doing critical appraisal?

A

Did the study address clear research question(s)?
Were appropriate methods used to answer question(s)?
Are the results valid? Low risk of bias? Correctly interpreted?
If so, what are the potential implications for practice?

So we need to consider all aspects of a study:
Question, design, conduct, analysis, interpretation, reporting

100
Q

Questions needed to critically appraise an article

A

- Is the research question clear and focussed?
- Is the study design appropriate?
- What are the strengths/weaknesses?
- Is there potential for bias? Has it been addressed?
- Are the analysis methods appropriate?
- Have the results been interpreted correctly?
- Are the findings relevant to your practice?

101
Q

How to appraise the question

A

Did the study address a clearly focussed research question/hypothesis?
From the information provided can you identify the PICO/PECO elements?

Population
Intervention or Exposure
Comparator
Outcomes

102
Q

Appraising the design and conduct

A

Is the design appropriate given the research question?
Strengths/limitations? Where does it sit in the hierarchy of evidence?
Was the methodology sound? The design may technically be high in the hierarchy but if not well executed, the strength of the evidence may still be low.
E.g. We put RCTs above cohort studies in the hierarchy of evidence
But if an RCT is very poorly conducted and has extremely high risk of bias…
Would we consider it better than very well executed prospective cohort studies?

103
Q

further appraisal of design and conduct

A

Some design aspects to consider are relevant to all studies, e.g.
Is the sample representative of the population of interest?
Have potential sources of bias/confounding been addressed?
Were clearly defined, objective outcome measures used?
Other aspects will differ by study design, e.g. transparency regarding non-compliance or loss to follow-up in an RCT; control selection in observational studies.
If the study was registered prior to being conducted (e.g. published protocol) did the final publication reflect the original plan? If not, were deviations transparently explained?
If a study is well-reported there should be sufficient information to enable replication (or at least references/links provided to said information).

104
Q

Reporting guidelines

A
105
Q

Appraising descriptive statistics

A

Are all recruited participants accounted for?
Are summary statistics well reported, e.g. appropriate measures of central tendency reported with measures of dispersion?
Were the authors transparent regarding missing or incomplete data? (Look for Ns on tables, flowcharts and graphs or in text)
Other signs of poor reporting, e.g. unclear units of measurement, mislabelling of figures, inconsistent/truncated axes, etc.?
Multiple hypothesis tests on descriptive data?

106
Q

Appraising inferential statistics

A

Are results given for all outcomes (if not in main body, in supplementary material or another paper referenced)? Any evidence of ‘cherry picking’?
Any inappropriate data manipulation, e.g. arbitrary categorisation of continuous outcomes or questionable assumptions made?
Were the statistical tests appropriate given the type of data and the research question?
If using NHST were p-values reported correctly? i.e. not just “NS” or “p>0.05”
Were confidence intervals reported?
If reporting risk, is it clear whether absolute or relative?
Any inappropriate reporting, e.g. incident risk from a case-control study?
Was sample size estimation/power calculation performed a priori?
Were assumption checks performed for statistical models? Was goodness of fit determined?
Multiplicity?

107
Q

Appraising the results

A

What are the results?
Was the reporting comprehensive?
Are the results believable?
Do they fit with other available evidence?
(Can apply Bradford Hill criteria here)
Do the design and analysis allow for the conclusions drawn, e.g. causality?
Are the results clinically/practically important? Focus on effect size and precision rather than just statistical significance.
Can the findings be used to inform practice?

108
Q

How are scoping or narrative reviews different from systematic reviews?

A
  • They also summarise available research on a given topic
  • The question may be broader
  • They do not necessarily follow such strict, standardised, transparent methodology
  • They are therefore less rigorous, more subjective and can be more prone to selection bias
109
Q

What is a systematic review?

A

A summary of the available research on a given, clearly focused question
It follows strict, standardised, transparent methodology
It is therefore more rigorous, less subjective and less prone to selection bias than a scoping or narrative review

110
Q

What are the steps of a systematic review?

A

1) Specify research question (and check recent systematic review doesn’t already exist!)
2) Develop search strategy and inclusion/exclusion criteria
3) Identify relevant studies
4) Assess quality and risk of bias
5) Extract results from each study
6) Pool results
7) Answer research question
(Potential final step: update review at a later date if further primary evidence becomes available)

111
Q

Outline meta analysis

A
  • “The analysis of analyses”
  • Statistical method for combining evidence from different (separate but related) sources
  • In the early 1900s, Karl Pearson used a meta-analytic approach in a BMJ paper
  • Methodological and computational advances later increased the use of these approaches
  • Meta-analysis is often (but not always) used in systematic reviews
112
Q

Outline a meta-analysis forest plot

A

One row per study

The point estimate is shown as a square with size proportional to the size of the study

The horizontal lines are confidence intervals

The x axis is a measure of effect, in this case the odds ratio

The solid vertical line indicates the line of no effect (the null, so given these are ORs, the null value = 1)

The diamond shows the pooled estimate from the meta-analysis
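
As an illustration of where the diamond comes from, here is a sketch of one common pooling method, fixed-effect inverse-variance weighting of log odds ratios. The choice of method is an assumption (the cards do not say how the plotted pooled estimate was calculated), and the study values are hypothetical.

```python
import math


def pool_fixed_effect(odds_ratios, standard_errors):
    """Pooled odds ratio and 95% CI by inverse-variance weighting.

    standard_errors are the standard errors of the log odds ratios.
    More precise studies (smaller SE) get more weight.
    """
    log_ors = [math.log(o) for o in odds_ratios]
    weights = [1 / se ** 2 for se in standard_errors]
    pooled_log = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    lower = math.exp(pooled_log - 1.96 * pooled_se)
    upper = math.exp(pooled_log + 1.96 * pooled_se)
    return math.exp(pooled_log), lower, upper


# Hypothetical studies: odds ratio and SE of the log odds ratio for each.
study_ors = [0.80, 0.65, 0.90]
study_ses = [0.20, 0.35, 0.15]

pooled, lower, upper = pool_fixed_effect(study_ors, study_ses)
print(f"pooled OR = {pooled:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print("CI crosses the line of no effect" if lower <= 1 <= upper
      else "CI excludes the line of no effect")
```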

113
Q

What are fixed effects or random effects?

A

These are different approaches to meta-analysis (just for info – you won’t need to go into this level of detail).

114
Q

What is heterogeneity?

A

A measure of variation between different studies (heterogeneous = different; homogeneous = similar).

115
Q

Outline sensitivity analysis

A

Analysis to test the robustness of the findings of primary analysis – looks at the effect of assumptions or variations in approach.

116
Q

Outline PRISMA

A

Preferred Reporting Items for Systematic Reviews and Meta-Analyses. These are guidelines aimed to improve the reporting of systematic reviews. First published 2009; updated 2020.