Statistics Flashcards
p-value
Probability of observed result or one more extreme occurring when the null hypothesis is true.
The probability, assuming the null hypothesis is true, of getting results at least as extreme as those observed. When p is less than the alpha level, the results are statistically significant.
p<0.05
statistically significant, can reject null hypothesis
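A minimal sketch of "observed result or one more extreme", using an exact two-sided binomial test. The function name and the coin-flip numbers (60 heads in 100 flips) are hypothetical, chosen only for illustration.

```python
from math import comb

def binomial_two_sided_p(successes: int, n: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value: total probability, under H0, of every
    outcome that is no more likely than the observed one."""
    def pmf(k: int) -> float:
        return comb(n, k) * p_null**k * (1 - p_null) ** (n - k)

    observed = pmf(successes)
    # Small tolerance so outcomes tied with the observed one are counted.
    return sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))

alpha = 0.05
p = binomial_two_sided_p(60, 100)  # hypothetical data: 60 heads in 100 flips
print(f"p = {p:.4f}; reject H0 at alpha = {alpha}: {p < alpha}")
```

In this made-up example p comes out just above 0.05, so the null hypothesis (a fair coin) would not be rejected at the 5% level.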
95% confidence interval
A range calculated from the sample so that, if the same study were repeated many times, about 95% of the intervals produced would contain the true population mean. (It does NOT mean there is a 95% chance that the population mean lies within this one particular interval.) So if you did that small study a hundred times, about 95 of the 100 confidence intervals would contain the population mean value.
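The "repeat the study many times" reading can be checked with a small simulation. This is a sketch under assumptions: the population mean and SD, the study size, and the use of the normal-approximation interval (mean +/- 1.96 standard errors) are all hypothetical choices.

```python
import random
from statistics import mean, stdev

def ci95(sample):
    """Approximate 95% CI for the mean: point estimate +/- 1.96 standard errors."""
    m = mean(sample)
    se = stdev(sample) / len(sample) ** 0.5
    return m - 1.96 * se, m + 1.96 * se

random.seed(0)                     # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD = 120.0, 15.0   # hypothetical population values
n_studies, n_per_study = 2000, 50

hits = 0
for _ in range(n_studies):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n_per_study)]
    low, high = ci95(sample)
    hits += low <= TRUE_MEAN <= high   # did this study's interval capture the truth?

coverage = hits / n_studies
print(f"{coverage:.1%} of the simulated intervals contained the true mean")
```

The coverage should land close to 95%; each individual interval either contains the true mean or it does not.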
Sample point estimate mean (vs. population mean)
…
Relative risk
Risk of developing disease in the exposed group compared to the risk of developing disease in the unexposed group
How do you calculate relative risk?
RR = [A/(A+B)] / [C/(C+D)]
(proportion of all the exposed who got the disease vs proportion of all the unexposed who got the disease)
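A short sketch of the relative risk formula. It assumes the usual 2x2 table layout, which the cards do not spell out: A = exposed with disease, B = exposed without disease, C = unexposed with disease, D = unexposed without disease. The counts are hypothetical.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = risk in the exposed / risk in the unexposed."""
    risk_exposed = a / (a + b)      # A / (A + B)
    risk_unexposed = c / (c + d)    # C / (C + D)
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 29 of 100 exposed and 20 of 100 unexposed got the disease.
rr = relative_risk(a=29, b=71, c=20, d=80)
print(f"RR = {rr:.2f}")
```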
What types of studies can RR be used in?
Prospective studies
Odds ratio
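Ratio of the odds of something happening in one group vs the odds of it happening in another group (the odds themselves being the probability of something happening divided by the probability of it not happening)
Which studies is odds ratio used in?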
Case-control studies
How do you calculate odds ratio?
OR = (A/C) / (B/D) = AD/BC
(the odds of exposure in those with the disease vs the odds of exposure in those without the disease)
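A short sketch of the odds ratio formula, assuming the usual 2x2 layout (A = exposed cases, B = exposed controls, C = unexposed cases, D = unexposed controls). The layout and the counts are assumptions for illustration.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = odds of exposure among cases / odds of exposure among controls."""
    odds_exposure_cases = a / c
    odds_exposure_controls = b / d
    return odds_exposure_cases / odds_exposure_controls

# Hypothetical case-control study:
# cases: 40 exposed, 50 unexposed; controls: 30 exposed, 60 unexposed.
or_value = odds_ratio(a=40, b=30, c=50, d=60)
print(f"OR = {or_value:.2f}")
```

The same value comes from the cross-product AD/BC, which is often quicker to compute by hand.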
If a disease is really rare, the odds ratio and relative risk actually end up being quite similar. True or false?
True - however, they are not the same thing … and when the outcome is common they can end up being very different
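The rare-disease point can be seen numerically. In this sketch both 2x2 tables are invented, and the A/B/C/D layout (exposed with/without disease, unexposed with/without disease) is an assumption.

```python
def relative_risk(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

# Hypothetical tables: (a, b, c, d)
rare = (2, 998, 1, 999)        # disease in about 0.1-0.2% of each group
common = (600, 400, 300, 700)  # disease in 30-60% of each group

for label, table in (("rare", rare), ("common", common)):
    print(f"{label}: RR = {relative_risk(*table):.2f}, OR = {odds_ratio(*table):.2f}")
```

With the rare outcome the two measures nearly coincide; with the common outcome the odds ratio is noticeably further from 1 than the relative risk.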
Hazard ratio
Broadly equivalent to relative risk (RR), but useful when the risk is not constant over time: it uses data from different time points, where the risk might be changing over the study period. Hazard ratios are usually used in the context of survival, but in statistics survival does not have to mean life or death; the event could be, for example, whether or not a patient got a disease. The hazard ratio takes time into account, whereas relative risk doesn’t.
Relative risk 1.45 in plain language
45% more likely to have outcome X
E.g. one group drank coffee, the other group didn’t drink coffee, outcome is tachycardia, RR is 1.45. How would you explain this?
Patients in the group that drank coffee were 45% more likely to have tachycardia, i.e. the probability of having tachycardia is 45% higher in the group that drank coffee. (Saying 1.45 times more likely to have tachycardia is too complex. If the RR is 5.1, you could say 5 times more likely to have tachycardia: swap from the percentage to the number.)
Case control, looking at exposure to risk factors in patients that had oral cancer. Looking at risk factor chewing tobacco, OR is 1.6. Explain this in plain language?
In those who had oral cancer, the odds of having chewed tobacco were 1.6 times higher than in those who did not have oral cancer.
An OR or RR of 1 means…
there’s no difference
Odds ratio 1.6 in plain language
Odds of exposure to factor X are 1.6 times higher
E.g. RCT, comparing recurrence of cancer following use of a new chemotherapy drug, HR 0.79, explain this in plain language
Those who received the chemotherapy drug were, at any point during the study, 21% less likely to have a cancer recurrence. Similar to RR but taking into account time in the study. A hazard ratio of 1 means there is no difference.
Hazard ratio 0.79 in plain language
At any particular point, group A is 21% less likely to have outcome X
Incidence
Number of new cases of a disease within a specific period of time
Prevalence
Number of cases of disease at a given time
Absolute risk reduction (ARR)
Incidence [group 1] - incidence [group 2]
Number needed to treat (NNT)
1/ARR -> tells you how many people need to be treated with that intervention in order to prevent one outcome occurring
Why is NNT useful?…
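A small sketch of ARR and NNT from trial counts. The trial numbers are hypothetical, and rounding NNT up to a whole patient is a common convention assumed here, not something the cards state. Exact fractions are used to avoid floating-point surprises when rounding up.

```python
from fractions import Fraction
from math import ceil

def absolute_risk_reduction(control_events, control_n, treated_events, treated_n):
    """ARR = incidence in the control group - incidence in the treated group."""
    return Fraction(control_events, control_n) - Fraction(treated_events, treated_n)

def number_needed_to_treat(arr):
    """NNT = 1 / ARR, rounded up to a whole patient (assumed convention)."""
    if arr <= 0:
        raise ValueError("this sketch only handles a positive ARR")
    return ceil(1 / arr)

# Hypothetical trial: 20/200 events on control, 12/200 events on treatment.
arr = absolute_risk_reduction(20, 200, 12, 200)
nnt = number_needed_to_treat(arr)
print(f"ARR = {float(arr):.1%}, NNT = {nnt}")
```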
…
What is relative risk reduction (RRR)?
ARR / incidence [control group] as %
Type I error
Rejecting H0 (statistically significant results) when H0 is actually true - false positive.
Due to bias, confounding, data dredging
Type II error
False negative: wrongly failing to reject the null hypothesis when it is actually false. Its probability is beta = 1 - power.
Due to the sample size being too small or measurement variance being too large.
Beta
Probability of making a type II error (by convention, a beta of 0.2 or less, i.e. a power of 0.8 or more, is usually considered acceptable)
Power
Ability to pick up difference when a difference exists. ‘Ability to reject a false H0’. Probability of not making a type II error.
How can we increase power?
Increase sample size, increase effect size, increase measurement precision
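The effect of these three levers can be sketched with a normal-approximation power formula for comparing two group means. This is an approximation under assumptions (equal group sizes, a common known SD, two-sided test, the negligible opposite tail ignored); the function name and numbers are hypothetical.

```python
from statistics import NormalDist

def approx_power(n_per_group: int, difference: float, sd: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test."""
    z = NormalDist()
    standard_error = sd * (2 / n_per_group) ** 0.5
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(abs(difference) / standard_error - z_crit)

# Hypothetical scenario: true difference 5, SD 10.
for n in (20, 64, 150):
    print(f"n = {n:3d} per group -> power ~ {approx_power(n, 5, 10):.2f}")

# Larger effect or less measurement noise (smaller SD) also raises power.
print(f"bigger effect: {approx_power(20, 10, 10):.2f}")
print(f"smaller SD:    {approx_power(20, 5, 5):.2f}")
```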
Per-protocol analysis
Where you include patients in analysis within the study only if they’ve finished doing the study protocol properly
Advantages of per protocol analysis
Accurate representation of the effect of the intervention because you have only included the people who have properly done the intervention.
Disadvantages of per protocol analysis
Susceptible to attrition bias and exclusion bias
Intention-to-treat analysis
Where you include all the patients who were randomised, analysed in the group they were originally assigned to, whether or not they completed (or even started) the intervention
Advantages of ITT analysis
More accurately reflects results in clinical practice, because in practice patients do not always follow instructions/protocols
Disadvantages of ITT
Not getting a true, accurate estimate of how well the drug actually does in optimal conditions
Null hypothesis
The assumption that any difference between experimental groups is due to chance
Evidence-based medicine
The conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients
Hazard rate
The probability of an endpoint in a time interval divided by the duration of the time interval
Confounder
A confounder has a triangular relationship with the exposure and the outcome (it is associated with both) but is not on the causal pathway. It can make it appear as if there is a direct relationship between the exposure and outcome (positive confounder) or might mask an association that would otherwise have been present (negative confounder)
Methods for dealing with missing data (in ITT analysis)
- Worst-case scenario
- Hot deck imputation: fill in missing values from similar subjects with complete records
- Last observation carried forward
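As an illustration of the last method in the list, here is a minimal sketch of last observation carried forward for one patient's visit series. Representing a missing visit as None, and leaving gaps before the first observation unfilled, are assumptions of this sketch.

```python
def locf(values):
    """Replace each missing value (None) with the most recent observed value.
    Missing values before the first observation are left as None."""
    filled, last_seen = [], None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

# Hypothetical pain scores at five visits; the patient missed visits 2, 3 and 5.
print(locf([5, None, None, 7, None]))
```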
Absolute risk
Incidence of the outcome in one arm = number of participants with the outcome in that arm (control or experimental) / total number of participants in that arm
Relative risk reduction
(Risk in control group - risk in experimental group) / risk in control group
= (CER-EER)/CER, where CER is the control event rate and EER is the experimental event rate
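A brief sketch of the RRR formula, set beside the absolute difference to show why the two can tell different stories. The event rates are hypothetical.

```python
def relative_risk_reduction(cer: float, eer: float) -> float:
    """RRR = (CER - EER) / CER."""
    return (cer - eer) / cer

# Two hypothetical trials with the same RRR but very different baseline risk.
for cer, eer in ((0.10, 0.06), (0.010, 0.006)):
    rrr = relative_risk_reduction(cer, eer)
    absolute_difference = cer - eer
    print(f"CER={cer:.1%} EER={eer:.1%} -> RRR={rrr:.0%}, absolute difference={absolute_difference:.1%}")
```

Both trials show a 40% relative reduction, but the absolute reduction is ten times smaller when the baseline risk is ten times lower.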
Interpreting the standard deviation of data
The smaller the standard deviation, the smaller the sample size needed to detect a given difference
Parametric, paired, 2 groups
Paired t-test
Parametric, paired, > 2 groups
Repeated-measures ANOVA
Parametric, unpaired, 2 groups
Independent t-test
Parametric, unpaired, > 2 groups
One way ANOVA
Non-parametric, paired, 2 groups
Wilcoxon signed rank
Non-parametric, paired, > 2 groups
Friedman test
Non-parametric, unpaired, 2 groups
Mann-Whitney U test
Non-parametric, unpaired, > 2 groups
Kruskal Wallis test
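The eight cards above amount to a lookup on three properties of the data. A sketch of that lookup follows; the function and table names are hypothetical, and for parametric paired data with more than two groups it returns repeated-measures ANOVA, the usual paired counterpart of one-way ANOVA.

```python
# Key: (parametric, paired, more_than_two_groups)
TESTS = {
    (True, True, False): "Paired t-test",
    (True, True, True): "Repeated-measures ANOVA",
    (True, False, False): "Independent t-test",
    (True, False, True): "One-way ANOVA",
    (False, True, False): "Wilcoxon signed rank",
    (False, True, True): "Friedman test",
    (False, False, False): "Mann-Whitney U test",
    (False, False, True): "Kruskal-Wallis test",
}

def choose_test(parametric: bool, paired: bool, groups: int) -> str:
    """Return the conventional test for the given data properties."""
    if groups < 2:
        raise ValueError("need at least two groups to compare")
    return TESTS[(parametric, paired, groups > 2)]

print(choose_test(parametric=False, paired=False, groups=3))
```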
Parametric data is
data that can be assumed to follow a normal distribution. When data sets are large enough, parametric statistical tests can often be employed regardless of normality. Parametric tests are generally considered to have greater statistical power.
Non-parametric data is
data that does not assume a normal distribution. The data is ordinal, ranked, or has outliers that cannot be removed.
Time to event analysis: based on the Kaplan-Meier curve. Can use:
Cox proportional hazards, log-rank or Wilcoxon two-sample test. Cox model is the most used.
Kaplan-Meier curves
These are commonly used to describe survival and compare it between groups. They provide an intuitive graphical representation. They are mainly descriptive. They do not control for covariates and cannot accommodate time-dependent variables.
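How the curve is built can be sketched with the product-limit calculation: at each event time, the running survival probability is multiplied by the fraction of at-risk patients who did not have the event. The function name and the follow-up data are hypothetical; a censored patient counts as at risk up to their censoring time and is then dropped.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.
    events[i] is True if the endpoint occurred at times[i], False if censored."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        n_leaving = sum(1 for ti in times if ti == t)   # events plus censored at t
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Hypothetical follow-up in months; False marks a censored patient.
times = [2, 3, 3, 5, 8, 9]
events = [True, True, False, True, False, True]
for t, s in kaplan_meier(times, events):
    print(f"month {t}: S(t) = {s:.3f}")
```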
Retrospective subgroup analysis
Data dredging means that some associations will crop up due to chance. Dredging: “cherry-picking of promising findings leading to a spurious excess of statistically significant results in published or unpublished literature”.
Kaplan-Meier Survival Plot
A plot that is utilized over the entire study period (no defined timepoint), which tries to account for censored data.
How to test whether two Kaplan-Meier curves are different?
Log-rank test
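A compact sketch of the two-group log-rank test: at each event time it compares the events observed in group A with the number expected if both groups shared one hazard, then refers the resulting statistic to a chi-square distribution with 1 degree of freedom. The function name and data are hypothetical, and a statistics library would normally be used in practice.

```python
from math import erfc, sqrt

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({ti for ti, e, _ in data if e}):          # distinct event times
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk in A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk in B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("not enough information to compare the groups")
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return chi2, p_value

# Hypothetical data: every patient had the event; group A failed much earlier.
chi2, p = logrank_test([1, 2, 3, 4, 5], [True] * 5, [6, 7, 8, 9, 10], [True] * 5)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```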
How to do power calculations
Power is the ability to detect a certain difference if that difference exists. You usually pick a clinically meaningful difference. You need a population mean and standard deviation AND:
▪ The standard deviation of the test group
▪ The clinically meaningful difference for the test group
▪ The significance level (alpha) and the power you want
▪ Then you can calculate the size of the sample you need for that power
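The calculation can be sketched with the standard normal-approximation formula for comparing two means, n per group = 2 (z_alpha/2 + z_beta)^2 SD^2 / difference^2. The assumptions are equal group sizes, a common SD, and a two-sided test; the function name and example values are hypothetical, and dedicated software gives slightly larger numbers because it uses the t distribution.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(difference: float, sd: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per group to detect `difference` between two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(power)            # power = 1 - beta
    n = 2 * (z_alpha + z_beta) ** 2 * sd**2 / difference**2
    return ceil(n)                       # round up to whole participants

# Hypothetical: clinically meaningful difference of 5 units, SD of 10 units.
print(sample_size_per_group(difference=5, sd=10))
print(sample_size_per_group(difference=5, sd=10, power=0.90))
```

Halving the difference you want to detect roughly quadruples the required sample size, since the difference enters the formula squared.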