Statistics Flashcards
Definition of funnel plot
a scatterplot of treatment effect (e.g. OR on x axis) against a measure of study precision (e.g. SEM on y axis)
Commonly used in meta-analyses to assess publication bias
Definition of forest plot
graphical display of estimated results from a number of scientific studies addressing the same question
gives visual suggestion of the amount of heterogeneity
and can show the estimated common effect
Definition of standard deviation
measure of amount of variation or dispersion of a set of values
Definition of per-protocol analysis
only subjects who completed the entire protocol are included in the analysis of a randomised clinical trial
Definition of intention-to-treat (ITT) analysis
all subjects randomised are included in the analysis regardless of whether they completed the study
Definition of clinical equipoise
state of genuine uncertainty on relative value of two interventions being compared in a trial - requirement for RCTs to be ethical
Definition of MHRA Yellow Card scheme
provides early warning that safety of a medicine or medical device may require further investigation
Definition of standard error
SD / square root of (N)
indicates how precisely the sample mean estimates the true population mean
(whereas standard deviation describes how much individual participants in the sample vary from each other)
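Worked example (a minimal sketch, not part of the original cards): the readings are made-up values and `standard_error` is a hypothetical helper name. It applies SD / square root of (N), using the sample standard deviation.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(N)."""
    sd = statistics.stdev(values)  # sample SD (N - 1 in the denominator)
    return sd / math.sqrt(len(values))


# Hypothetical systolic blood pressure readings (mmHg), invented for illustration
readings = [118, 122, 125, 130, 135]
sd = statistics.stdev(readings)
sem = standard_error(readings)
print(f"SD = {sd:.2f}, SEM = {sem:.2f}")  # SD = 6.67, SEM = 2.98
```

The SD describes the spread of the individual readings; the SEM is smaller and shrinks further as N grows.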
Definition of null hypothesis
hypothesis that there is no difference between specified populations (any observed difference is due to chance)
Definition of type I error
falsely rejecting a null hypothesis that is true in the population (false positive)
Definition of type II error
failing to reject null hypothesis that is false in the population (false negative)
Definition of power
probability of picking up a significant difference, if there is one (probability of not making type 2 error (false negative))
1 – probability of Type II error
Definition of p-value
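probability of obtaining a result at least as extreme as the one observed, if the null hypothesis is true
it is NOT the probability that the null hypothesis is true, and it is not the type 1 error rate itself (that is alpha, the threshold the p-value is compared against)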
Definition of confidence interval
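range of values, calculated from the sample, that is likely to contain the true population value
if the study were repeated many times, 95% of the 95% confidence intervals calculated would contain the true value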
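Worked example (a rough sketch): a 95% confidence interval for a mean, assuming a sample large enough for the normal approximation (mean ± 1.96 × standard error) to be reasonable. The name `mean_ci` and the summary figures are illustrative assumptions; small samples would normally use the t-distribution instead.

```python
import math
from statistics import NormalDist


def mean_ci(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    sem = sd / math.sqrt(n)                    # standard error of the mean
    return mean - z * sem, mean + z * sem


# Hypothetical summary statistics: mean 126, SD 10, N = 100
lower, upper = mean_ci(mean=126, sd=10, n=100)
print(f"95% CI: {lower:.2f} to {upper:.2f}")  # 95% CI: 124.04 to 127.96
```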
Definition of a priori
pre-specifying end-points & outcomes of a study to reduce reporting bias
Definition of surrogate endpoint
variable relatively easily measured that predicts a distant outcome of the intervention being tested
Definition of composite outcome
combination of two or more outcomes into single endpoint
Definition of Cohen’s kappa coefficient
statistic used to measure inter-rater reliability (degree of agreement between raters/observers) for qualitative variables, taking into account the agreement expected by chance
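Worked example (a minimal sketch): kappa compares observed agreement with the agreement expected by chance, kappa = (observed - expected) / (1 - expected). The function name `cohens_kappa` and the ratings are made up for illustration, and the sketch assumes two raters scoring the same items.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who rated the same items."""
    n = len(rater_a)
    # Proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 10 scans by two observers
a = ["Y", "Y", "Y", "Y", "Y", "Y", "N", "N", "N", "N"]
b = ["Y", "Y", "Y", "Y", "Y", "N", "N", "N", "N", "Y"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # kappa = 0.58
```

Here the raters agree on 80% of items, but 52% agreement would be expected by chance, so kappa is well below 0.8.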
Definition of absolute risk
probability that an event will occur (incidence)
number of events/total number of people
Definition of absolute risk reduction
difference in rate of events between 2 groups
ARR = AR (C) – AR (T)
Incidence in control - incidence in intervention
Definition of relative risk
risk ratio = relative likelihood of an event occurring in the treatment vs control group throughout study period
RR = AR (T) / AR (C)
cumulative risk
Definition of relative risk reduction
reduction in rate of outcome in treatment group vs. control group
RRR = ARR / AR (C)
Definition of number needed to treat
number of pts needed to treat to prevent 1 additional bad outcome, e.g. death, stroke
NNT = 1 / ARR
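Worked example (a sketch with made-up trial numbers): the formulas for AR, ARR, RR, RRR and NNT from the cards above, applied to one hypothetical trial. `risk_measures` is an illustrative helper name, not a standard function.

```python
def risk_measures(events_control, n_control, events_treated, n_treated):
    """Risk measures for a two-arm trial, using the formulas on the cards."""
    ar_c = events_control / n_control   # absolute risk in control group, AR (C)
    ar_t = events_treated / n_treated   # absolute risk in treatment group, AR (T)
    arr = ar_c - ar_t                   # ARR = AR (C) - AR (T)
    return {
        "AR_C": ar_c,
        "AR_T": ar_t,
        "ARR": arr,
        "RR": ar_t / ar_c,              # RR = AR (T) / AR (C)
        "RRR": arr / ar_c,              # RRR = ARR / AR (C)
        "NNT": 1 / arr,                 # NNT = 1 / ARR
    }


# Hypothetical trial: 20/100 events in control, 15/100 in treatment
for name, value in risk_measures(20, 100, 15, 100).items():
    print(f"{name}: {value:.2f}")
```

With these numbers ARR is 5%, RR is 0.75, RRR is 25% and NNT is 20. In practice NNT is usually rounded up to the next whole patient.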
Definition of hazard rate
probability of the event occurring in the next time interval divided by the length of that time interval
time-sensitive = instantaneous risk
Definition of hazard ratio
relative likelihood of an event occurring in the treatment vs control group at any given point
Definition of logistic regression
statistical analysis method to predict a binary outcome, such as yes or no, based on existing independent variables
Definition of linear regression
regression model that estimates relationship between one independent variable and one dependent variable using a straight line
Definition of chi-squared test
hypothesis test to determine whether observed frequencies are significantly different to expected frequencies if the null hypothesis was true
categorical variables
Definition of t-test
hypothesis test to determine whether means of two groups are significantly different from each other
continuous variables
Definition of ANOVA
hypothesis test to determine whether means of three or more groups are significantly different from each other
continuous variables
Definition of log-rank test
hypothesis test to compare the survival distributions of two samples
Definition of Kaplan Meier curves
curves showing the estimated probability of survival over time, plotted separately for each group of a categorical variable (e.g. treatment vs control)
Definition of Cox proportional hazards regression analysis
survival analysis for both quantitative & categorical variables, which can simultaneously assess the effect of several risk factors on survival time
Definition of correlation coefficient
how closely 2 continuous variables move with each other
Types of correlation coefficient
parametric: Pearson’s R
non-parametric: Spearman’s rank correlation Rho
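Worked example (a minimal sketch): Pearson's coefficient computed from the raw values, and Spearman's computed as Pearson's coefficient on the ranks. The helper names are hypothetical, the data are invented, and the simple ranking assumes there are no tied values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: y rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson r = {pearson_r(x, y):.3f}")       # Pearson r = 0.981
print(f"Spearman rho = {spearman_rho(x, y):.3f}")  # Spearman rho = 1.000
```

Spearman's rho is exactly 1 because the ordering is perfectly preserved, while Pearson's r is slightly lower because the relationship is not linear.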
Definition of receiver-operating characteristic (ROC) curve
a graph showing the performance of a classification model at all classification thresholds. The curve plots two parameters: true positive rate and false positive rate
ROC curve axes
X axis: 1-specificity (false +ves)
Y axis: sensitivity (true +ves)
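Worked example (an illustrative sketch): building the points of a ROC curve by sweeping the classification threshold over a set of test scores. `roc_points`, the scores and the labels are all made up; a label of 1 means disease present, and a score at or above the threshold counts as a positive test.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nobody tests positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical test scores and true disease status
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
for x, y in roc_points(scores, labels):
    print(f"1 - specificity = {x:.2f}, sensitivity = {y:.2f}")
```

Lowering the threshold moves along the curve from (0, 0) towards (1, 1): sensitivity rises, but so does the false positive rate.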
Funnel plot axes
X axis: study outcome, e.g. OR
Y axis: study precision, e.g. SEM
Measure of forest plot heterogeneity
I squared
How do you calculate relative risk?
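RR = [A/(A+B)] / [C/(C+D)]
where A = exposed with disease, B = exposed without disease, C = not exposed with disease, D = not exposed without disease
(those who got the disease in all exposed vs those who got the disease in all not exposed)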
What types of studies can RR be used in?
Prospective studies
Odds ratio
Ratio of the odds of an outcome in those with a particular exposure to the odds of the outcome in those without the exposure
Which studies is odds ratio used in?
Case-control studies
How do you calculate odds ratio?
OR = (A/C) / (B/D) = AD/BC
Odds of exposure in the cases, vs odds of exposure in the controls
(equivalently, the odds of getting the disease when exposed vs the odds of getting the disease when not exposed)
If a disease is really rare, the odds ratio and relative risk actually end up being quite similar. True or false?
True - with a rare disease the two are close. However, they are not the same thing, and the more common the outcome, the further the odds ratio lies from 1 compared with the relative risk
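Worked example (a sketch with invented 2x2 tables): RR and OR calculated for a rare and a common outcome. The sketch assumes A = exposed with disease, B = exposed without disease, C = not exposed with disease, D = not exposed without disease; the function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    """RR = risk in exposed / risk in not exposed."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (A/C) / (B/D), which simplifies to AD / BC."""
    return (a * d) / (b * c)


# Hypothetical rare outcome: 2/1000 exposed vs 1/1000 not exposed
rare = (2, 998, 1, 999)
# Hypothetical common outcome: 60/100 exposed vs 30/100 not exposed
common = (60, 40, 30, 70)

for name, table in [("rare", rare), ("common", common)]:
    print(f"{name}: RR = {relative_risk(*table):.2f}, OR = {odds_ratio(*table):.2f}")
# rare: RR = 2.00, OR = 2.00
# common: RR = 2.00, OR = 3.50
```

Both tables have the same relative risk, but the odds ratio only stays close to it when the outcome is rare.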
When would hazard ratio be used?
Useful when the risk is not constant with respect to time - it uses data from different time points where the risk might be changing over a period of time
Relative risk 1.45 in plain language
45% more likely to have outcome X
Case control, looking at exposure to risk factors in patients that had oral cancer. Looking at risk factor chewing tobacco, OR is 1.6. Explain this in plain language?
In those who had oral cancer, the odds of chewing tobacco were 1.6 times higher than those who did not have oral cancer.
Odds ratio 1.6 in plain language (compared to RR 1.6)
Odds ratio of 1.6 means the odds of disease is 60% higher in exposed people
Whereas risk ratio of 1.6 means exposed people are 60% more likely to be diseased
Hazard ratio 0.79 in plain language
At any particular point, group A is 21% less likely to have outcome X
Incidence
Number of new cases of a disease within a specific period of time
Prevalence
Number of cases of disease at a given time
Number needed to treat (NNT)
1/ARR -> tells you how many people need to be treated with that intervention in order to prevent one outcome occurring
What is relative risk reduction (RRR), and how does it compare with ARR?
ARR / incidence [control group] as %
RR of 0.8 = RRR of 20%
Relative risk reduction (RRR) refers to the percentage decrease in risk achieved by the group receiving the intervention vs. the group that did not receive the intervention (the control group). Absolute risk reduction (ARR) refers to the actual difference in risk between the treated and the control group.
What are the causes of type 1 error?
bias
confounding
data dredging
What are causes of type 2 error?
Sample size too small
Measurement variance being too large
Beta
Probability of making a type II error (conventionally set at 0.2 or lower, i.e. power of at least 0.8)
(alpha is the probability of making a type I error)
How can we increase power?
Increase sample size
Increase effect size
Increase measurement precision
Advantages of per protocol analysis
Accurate representation of the effect of the intervention because you have only included the people who have properly done the intervention.
Disadvantages of per protocol analysis
Susceptible to attrition bias and exclusion bias
Advantages of ITT analysis
More accurately reflects results in clinical practice, because in practice patients do not always follow instructions/protocols
More generalisable
Disadvantages of ITT analysis
Not getting a true, accurate estimate of how well the drug actually does in optimal conditions
Imputed values may be inaccurate
Null hypothesis
The assumption that any difference between experimental groups is due to chance
Methods for dealing with missing data (in ITT analysis)
Worst-case scenario
Hot deck imputation: fill in missing values from similar subjects with complete records
Last observation carried forward
Standard deviation of data interpretation
The smaller the standard deviation, the smaller the sample size needed to estimate the mean with a given precision
Parametric, paired, 2 groups
Paired t-test
Parametric, paired, > 2 groups
Repeated measures ANOVA
Parametric, unpaired, 2 groups
Independent t-test
Parametric, unpaired, > 2 groups
One way ANOVA
Non-parametric, paired, 2 groups
Wilcoxon signed rank
Non-parametric, paired, > 2 groups
Friedman test
Non-parametric, unpaired, 2 groups
Mann-Whitney U test
Non-parametric, unpaired, > 2 groups
Kruskal Wallis test
Parametric data is
data that are assumed to follow a normal distribution. When data sets are large enough, parametric statistical tests can often be used even if the data are not normally distributed. Parametric tests are generally considered to have greater statistical power.
Non-parametric data is
data that are not assumed to follow a normal distribution, e.g. data that are ordinal, ranked, or have outliers that cannot be removed.
Time to event analysis: based on Kaplan-Meier curve. Can use:
Cox proportional hazards, log-rank or Wilcoxon two-sample test. Cox model is the most used.
Retrospective subgroup analysis
Data dredging means that some associations will crop up due to chance. Dredging: “cherry-picking of promising findings leading to a spurious excess of statistically significant results in published or unpublished literature”.
How to compare whether two Kaplan-Meier curves are different?
Log-rank test
How to do power calculations
Power is the ability to detect a given difference if that difference exists. You usually pick a clinically meaningful difference. You need the expected mean and standard deviation in the population, the chosen alpha and the desired power, AND:
▪ The standard deviation of the test group
▪ The clinically meaningful difference for the test group
▪ Then you can calculate the sample size you need for that power
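Worked example (a sketch under stated assumptions): one common textbook approximation for comparing the means of two equal-sized groups is n per group = 2 × (z for alpha + z for power)² × SD² / difference². This assumes a two-sided test, the same SD in both groups and a normal approximation; `sample_size_per_group` and the input figures are illustrative, and real trials would normally use dedicated software.

```python
import math
from statistics import NormalDist


def sample_size_per_group(sd, difference, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / difference ** 2
    return math.ceil(n)  # round up to whole participants


# Hypothetical inputs: SD of 10 mmHg, clinically meaningful difference of 5 mmHg
print(sample_size_per_group(sd=10, difference=5))  # 63
```

Halving the clinically meaningful difference roughly quadruples the required sample size, and asking for higher power also increases it.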
Effect size
effect size is the magnitude of the difference between groups.
The absolute effect size is the difference between the average, or mean, outcomes in two different intervention groups.
Nominal data
a type of qualitative data which groups variables into categories
e.g. hair colour
Ordinal data
a kind of qualitative data that groups variables into ordered categories.
e.g. range of income, or level of education
Interval data
a data type which is measured along a scale, in which each point is placed at equal distance from one another
e.g. temperature in degrees Celsius, time of day
Ratio data
a form of quantitative (numeric) data measured on a scale with equal intervals and a true zero
e.g. height, weight
Paired vs unpaired samples
Paired means that both samples consist of the same test subjects
Unpaired means that both samples consist of distinct test subjects
Alpha level
also known as the significance level
is the probability of rejecting the null hypothesis when it is true
type 1 error - false positive
Inferential testing
Inferential statistics allow you to test a hypothesis or assess whether your data is generalizable to the broader population
Correlation
Correlation is a statistical measure that expresses the extent to which two variables change together at a constant rate.
Regression
a statistical technique that relates a dependent variable to one or more independent (explanatory) variables
Correlation vs regression
Correlation and regression are techniques used to analyze the relationship between two quantitative variables.
While correlation measures the strength of a linear relationship between two variables, regression measures how those variables affect each other using an equation.
Degrees of freedom
degrees of freedom in a statistical calculation represent how many values involved in a calculation have the freedom to vary
calculated to help ensure the statistical validity of chi-squared tests or t-tests etc
Validity
Statistical validity can be defined as the extent to which drawn conclusions of a research study can be considered accurate and reliable from a statistical test
Accuracy
Accuracy is how close a given set of measurements (observations or readings) is to the true value
Precision
the agreement among repeated measurements of the same variable.
Variance
a statistical measurement of the spread between numbers in a data set
the average of the squared distances of each number in the set from the mean (the square of the standard deviation)
Placebo
a substance that has no therapeutic effect, used as a control in testing new drugs.
What is within participant comparison
Participants are assessed before and after an intervention
Analysis is of the same participant
What is an N-of-1 trial
A single subject trial where an individual is the sole observation
Provides optimal intervention for an individual (e.g. optimal dose)
What is a factorial design
Study that investigates multiple independent variables on an outcome measure (both separately and combined)
Number needed to treat
Number of participants who need to take a medication/have an intervention (compared with the control) for one additional participant to benefit
Is 1/ARR
Sensitivity
How well the test is able to detect those with the disease
True Positive (correctly detected with disease) / (True Positive + False Negative) (total with disease)
Specificity
How well the test is able to rule out those without the disease
True Negative (correctly detected without disease) / (True Negative + False Positive) (total without disease)
Positive predictive value
The percentage of people that test positive, that truly have the disease
True Positive (correctly detected with disease) / (True Positive + False Positive) (total that tested positive)
Negative predictive value
The percentage of people that test negative, that truly do NOT have the disease
True Negative (correctly detected without disease) / (True Negative + False Negative) (total that tested negative)
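Worked example (a minimal sketch with made-up counts): sensitivity, specificity, PPV and NPV calculated from the four cells of a diagnostic 2x2 table. `diagnostic_metrics` is a hypothetical helper name.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Test performance measures from true/false positives and negatives."""
    return {
        "sensitivity": tp / (tp + fn),  # detected among all with disease
        "specificity": tn / (tn + fp),  # ruled out among all without disease
        "PPV": tp / (tp + fp),          # truly diseased among positive tests
        "NPV": tn / (tn + fn),          # truly disease-free among negative tests
    }


# Hypothetical screening study: 100 people with disease, 900 without
results = diagnostic_metrics(tp=90, fp=45, fn=10, tn=855)
for name, value in results.items():
    print(f"{name}: {value:.1%}")
```

With these counts sensitivity is 90% and specificity 95%, yet PPV is only about 67% because the disease is uncommon in the tested group.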
Number needed to harm (NNH)
derived statistic that tells us how many patients must receive a particular treatment for 1 additional patient to experience a particular adverse outcome.
Lower NNT and higher NNH values are associated with a more favorable treatment profile
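Worked example (a sketch with invented figures): NNH is commonly calculated as 1 / absolute risk increase, where the absolute risk increase is the adverse event rate in the treated group minus the rate in the control group. `number_needed_to_harm` is an illustrative name.

```python
def number_needed_to_harm(events_treated, n_treated, events_control, n_control):
    """NNH = 1 / (adverse event risk in treated - risk in control)."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    absolute_risk_increase = risk_treated - risk_control
    return 1 / absolute_risk_increase


# Hypothetical adverse event counts: 8/200 on treatment vs 4/200 on control
nnh = number_needed_to_harm(8, 200, 4, 200)
print(f"NNH = {nnh:.0f}")  # NNH = 50
```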
Outlier
An outlier is an observation that lies an abnormal distance from other values in a random sample from a population
Extreme values that stand out greatly from the overall pattern of values in a dataset