Stats Flashcards
The p value
the probability of obtaining a result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
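As a worked illustration of the definition above, here is a minimal sketch that computes an exact two-sided p value for a coin-flip experiment. The scenario (9 heads in 10 flips, null hypothesis of a fair coin) and the function name `binomial_two_sided_p` are assumptions chosen for the example, not part of the original card.

```python
from math import comb

def binomial_two_sided_p(heads: int, flips: int) -> float:
    """Exact two-sided p value under the null hypothesis of a fair coin (p = 0.5)."""
    # How far the observed count is from the count expected under the null.
    observed_gap = abs(heads - flips / 2)
    # Count every outcome that is at least as extreme as the observed one.
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_gap
    )
    # Under a fair coin, each of the 2**flips sequences is equally likely.
    return extreme / 2 ** flips

p_value = binomial_two_sided_p(heads=9, flips=10)
print(round(p_value, 4))  # 0.0215
```

The outcomes at least as extreme as 9 heads are 0, 1, 9 and 10 heads, which is 22 of the 1024 possible sequences.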
Type I error
the null hypothesis is rejected when it is true, i.e. showing a difference between two groups when none exists (a false positive). The probability of a type I error is the significance level (α).
Type II error
the null hypothesis is not rejected when it is false, i.e. failing to spot a difference when one really exists (a false negative)
Power of a study
probability of (correctly) rejecting the null hypothesis when it is false
* power = 1 - the probability of a type II error
* power can be ↑ by increasing the sample size
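The link between power, type II error and sample size can be sketched numerically. The example below assumes a one-sided, one-sample z-test with a known population SD, which is a simplification picked for illustration; the function name `z_test_power` and the effect size of 0.5 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect: float, sd: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided, one-sample z-test (population SD assumed known)."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)      # rejection threshold under the null
    shift = effect * sqrt(n) / sd        # centre of the test statistic if the effect is real
    type_ii = z.cdf(critical - shift)    # probability of failing to reject (type II error)
    return 1 - type_ii                   # power = 1 - P(type II error)

# Same effect size, growing sample size: power rises with n.
for n in (10, 40, 160):
    print(n, round(z_test_power(effect=0.5, sd=1.0, n=n), 3))
```

When the true effect is zero, the function returns alpha: the only way to reject is a type I error.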
Correlation tests
* Parametric (normally distributed data): Pearson’s coefficient
* Non-parametric: Spearman’s coefficient
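To make the difference between the two coefficients concrete, the sketch below implements both from scratch, computing Spearman's coefficient as Pearson's coefficient on the ranks. The helper names (`pearson`, `ranks`, `spearman`) and the cubic example data are assumptions for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def ranks(values):
    """Rank from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's coefficient: Pearson's coefficient applied to the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]   # perfectly monotonic, but not linear
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Because the relationship is monotonic, Spearman's coefficient is 1, while Pearson's coefficient falls below 1 because the relationship is not a straight line.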
Parametric tests
* Student’s t-test - paired or unpaired
* Pearson’s product-moment coefficient - correlation
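The paired and unpaired forms of the t-test differ in what they treat as the unit of analysis. The sketch below computes only the t statistic for each; turning it into a p value needs the t distribution, which the Python standard library does not provide. The function names, the equal-variance assumption in the unpaired version and the blood-pressure-style numbers are all made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance

def paired_t(before, after):
    """t statistic for paired data: a one-sample test on the within-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))   # df = n - 1

def unpaired_t(group_a, group_b):
    """t statistic for two independent groups, assuming equal variances."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))   # df = na + nb - 2

before = [120, 132, 128, 141, 135]   # hypothetical measurements on five subjects
after = [115, 130, 121, 137, 133]    # the same five subjects, measured again

print(round(paired_t(before, after), 2))
print(round(unpaired_t(after, before), 2))
```

On these numbers the paired statistic is much further from zero than the unpaired one, because pairing removes the between-subject variation.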
Non-parametric tests
- Mann-Whitney U test - unpaired data
- Wilcoxon matched-pairs - compares two sets of observations on a single sample
- Chi-squared test - used to compare proportions or percentages
- Spearman, Kendall rank – correlation
- McNemar’s test is used on nominal data to determine whether the row and column marginal frequencies are equal
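As one example from the list above, a chi-squared comparison of two proportions can be sketched for a 2x2 table. The version below uses Pearson's statistic without a continuity correction; the function name `chi_squared_2x2` and the counts are hypothetical.

```python
from math import erfc, sqrt

def chi_squared_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shortcut for a 2x2 table; equivalent to summing (O - E)^2 / E over the four cells.
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so its upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

# Rows: two groups; columns: responded / did not respond (made-up counts).
stat, p_value = chi_squared_2x2(30, 20, 18, 32)
print(round(stat, 2), round(p_value, 3))
```

Here the proportions being compared are 30/50 and 18/50. For small expected counts, an exact test would normally be preferred to this approximation.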
Funnel Plot
primarily used to demonstrate the existence of publication bias in meta-analyses. Funnel plots are usually drawn with treatment effects on the horizontal axis and study size on the vertical axis.
Central Limit Theorem (CLT)
the sampling distribution of the mean tends towards a normal distribution as the sample size increases, irrespective of the distribution of the population from which the samples were drawn.
The mean of the random sampling distribution of means is equal to the mean of the original population
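A small simulation can make both statements tangible. The sketch below draws repeated samples from a strongly skewed population (exponential with mean 1, chosen as an assumption for the example) and summarises the resulting sample means; the seed, sample size and number of samples are arbitrary.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)            # fixed seed so the run is repeatable
sample_size, n_samples = 50, 2000

# Population: exponential with rate 1 (mean 1, SD 1), which is strongly right-skewed.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(n_samples)
]

print(round(mean(sample_means), 3))    # should be close to the population mean, 1
print(round(stdev(sample_means), 3))   # should be close to 1 / sqrt(50), about 0.141
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the individual observations are far from normal.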
Confidence Interval (CI)
describes a range of values around a sample estimate (e.g. a mean, an odds ratio or a standard deviation) within which the true population value is expected to lie, with a stated level of confidence.
* 95% CI → 5% chance that the true mean value for the variable lies outside the range
* 95% CI = mean ± 1.96 × SE (Standard Error), approximately mean ± 2 × SE
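The formula for a 95% CI of a mean can be applied directly. This is a minimal sketch using the normal-approximation multiplier 1.96; for small samples a larger multiplier from the t distribution would usually be used instead. The function name `ci95` and the readings are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def ci95(values):
    """Approximate 95% CI for the mean: mean ± 1.96 × SE (normal approximation)."""
    m = mean(values)
    se = stdev(values) / sqrt(len(values))   # standard error of the mean
    return m - 1.96 * se, m + 1.96 * se

readings = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0]   # made-up measurements
low, high = ci95(readings)
print(round(low, 2), round(high, 2))
```

The interval is centred on the sample mean and narrows as the sample size grows, because the SE shrinks.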
Normal Distribution
also known as the Gaussian or ‘bell-shaped’ distribution. It describes the spread of many biological and clinical measurements
Standard deviation
The standard deviation (SD) is a measure of how far, on average, each observation in a sample lies from the sample mean
* SD = square root (variance)
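The relationship between SD and variance can be sketched by hand. The example assumes the sample variance (n - 1 denominator); the function name `sample_sd` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean

def sample_sd(values):
    """Sample SD: square root of the sample variance (n - 1 denominator)."""
    m = mean(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return sqrt(var)

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_sd(data), 3))  # 2.138
```

The mean of `data` is 5 and the squared deviations sum to 32, so the variance is 32 / 7 and the SD is its square root. The standard library's `statistics.stdev` gives the same result.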
Positively skewed distribution
mean > median > mode
Negatively skewed distribution
mean < median < mode
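The two orderings can be checked on a small made-up dataset. In the sketch below, `right_skewed` has a long tail to the right and `left_skewed` is its mirror image; both lists are assumptions chosen so that the pattern is easy to see.

```python
from statistics import mean, median, mode

right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]   # positively skewed: long tail to the right
left_skewed = [16 - v for v in right_skewed]     # negatively skewed: long tail to the left

for data in (right_skewed, left_skewed):
    print(mean(data), median(data), mode(data))
```

In the first list the few large values pull the mean above the median, which in turn sits above the mode; in the mirrored list the order reverses.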
The Standard Error of the Mean (SEM)
is a measure of the spread expected for the mean of the observations, i.e. how ‘accurate’ the calculated sample mean is as an estimate of the true population mean
* SEM = SD / square root (n), where n is the sample size
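The SEM is conventionally computed as the sample SD divided by the square root of the sample size. The sketch below shows how it shrinks as observations are added; the function name `sem` and the data are hypothetical.

```python
from math import sqrt
from statistics import stdev

def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))

small = [4.8, 5.1, 5.0, 4.7]
large = small * 4     # the same values repeated: four times as many observations

print(round(sem(small), 4), round(sem(large), 4))
```

With four times as many observations and a similar spread, the SEM roughly halves, so the sample mean becomes a more precise estimate of the population mean.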