Biostatistics Flashcards
What are the properties of a normal distribution?
- Mean = Median = Mode
- Symmetric about the mean, so the lower and upper quartiles are equidistant from it
- About 68% (roughly 2/3) of data lie within 1 SD of the mean. About 95% of data lie within 2 SD (more precisely, 1.96 SD).
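The coverage figures can be checked numerically. This is a minimal sketch using Python's standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1).
z = NormalDist(mu=0.0, sigma=1.0)

# Proportion of values lying within 1 SD and within 2 SD of the mean.
within_1sd = z.cdf(1) - z.cdf(-1)
within_2sd = z.cdf(2) - z.cdf(-2)

# Number of SDs either side of the mean that encloses exactly 95% of values.
z_95 = z.inv_cdf(0.975)

print(f"Within 1 SD: {within_1sd:.4f}")
print(f"Within 2 SD: {within_2sd:.4f}")
print(f"95% of values lie within {z_95:.2f} SD of the mean")
```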
What is a range?
Difference between the highest and lowest value.
What is sample variance?
Variance = Σ(x - mean)² / (n - 1), where x is each data value and n is the sample size.
What is standard deviation (SD)?
SD = √Variance
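A short sketch of the range, sample variance and SD calculations, assuming a small made-up sample (the values in `data` are hypothetical). The hand calculation is cross-checked against the standard library.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

n = len(data)
mean = sum(data) / n

# Range: difference between the highest and lowest value.
data_range = max(data) - min(data)

# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

# Cross-check against the standard library's sample statistics.
assert math.isclose(variance, statistics.variance(data))
assert math.isclose(sd, statistics.stdev(data))

print(f"range={data_range}, variance={variance:.3f}, SD={sd:.3f}")
```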
What types of data are there?
- Continuous
- Discrete
- Ordinal
- Nominal
What is continuous data?
Data that can take any value on a continuous scale with no set intervals, so there are infinitely many possible values.
What is discrete data?
Data that can take only a countable set of distinct values (e.g. whole-number counts).
What is ordinal data?
- Values can be ranked on a finite scale (e.g. 1-10), but the gaps between ranks are not necessarily equal.
- Usually analysed with non-parametric tests.
What is nominal data?
- Qualitative
- Non-numerical and categorical (e.g. hair colour, sex…)
- Non-parametric
What are the features of a box & whisker plot?
- Central line = Median
- The mean, if shown, is usually marked with a cross
- The box spans the interquartile range (lower quartile to upper quartile)
- Outliers are indicated as points beyond the whiskers
What is the coefficient of variation?
Coefficient of variation = SD/mean (often expressed as a percentage)
What is the standard error of the mean (SEM)?
- The standard deviation of the sample means about the population mean.
- SEM = σ/√n, where σ is the population SD and n is the sample size. In practice σ is estimated by the sample SD.
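A minimal sketch of estimating the SEM from a single sample, assuming the sample SD is used in place of σ (the values in `sample` are hypothetical).

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

# The sample SD stands in for the unknown population SD (sigma).
s = statistics.stdev(sample)
n = len(sample)

# SEM = SD / sqrt(n): it shrinks as the sample size grows.
sem = s / math.sqrt(n)

print(f"SD={s:.3f}, n={n}, SEM={sem:.3f}")
```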
What is a confidence interval?
A 95% confidence interval is a range calculated from the sample such that, if the study were repeated many times, about 95% of the intervals calculated this way would contain the true mean.
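The "95% of the time" interpretation can be illustrated by simulation. This is a sketch under stated assumptions: normally distributed data, a known population SD (so a z-based interval is valid), and made-up values for `TRUE_MEAN`, `TRUE_SD`, `N` and `REPEATS`.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 50.0, 10.0  # hypothetical population
N, REPEATS = 30, 2000            # sample size and number of repeated studies

# Half-width of a 95% interval when the population SD is known.
z_crit = NormalDist().inv_cdf(0.975)
half_width = z_crit * TRUE_SD / sqrt(N)

covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    sample_mean = fmean(sample)
    # Does this study's interval contain the true mean?
    if sample_mean - half_width <= TRUE_MEAN <= sample_mean + half_width:
        covered += 1

coverage = covered / REPEATS
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

With a small sample and an unknown SD, the critical value would normally come from the t-distribution rather than the normal distribution.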
What is a null hypothesis (H0)?
Statement claiming that there is no difference between 2 populations.
What is an alternative hypothesis (H1)?
Statement claiming that there is a difference between 2 populations.
What types of alternative hypotheses are there?
- Mean of A ≠ Mean of B
- Mean of A > Mean of B
- Mean of A < Mean of B
What is a test statistic?
A numerical value calculated from a set of data and used to decide whether to reject, or fail to reject, the null hypothesis.
What is a type I error?
- When a false positive is obtained.
- H0 is rejected when it was true.
What is the probability of obtaining a type I error (α)?
- α = Significance level of a test
- For a test with a 5% significance level, the chance of a type I error when H0 is true is 5%.
What is a type II error?
- When a false negative is obtained.
- H0 is not rejected when it is actually false.
What is the probability of obtaining a type II error (β)?
- β depends on the sample size, as well as on the size of the true difference, the variability of the data and the significance level.
What is the power of a study?
- The probability that H0 is correctly rejected.
- Power = 1 - β.
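A rough sketch of how power depends on sample size, assuming the simplest case: a two-sided one-sample z-test with a known SD. The function name `z_test_power` and the example effect size are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known SD) to detect `effect`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # How many standard errors the true mean lies from the null value.
    shift = effect * sqrt(n) / sd
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(-z_crit + shift) + z.cdf(-z_crit - shift)


for n in (10, 30, 100):
    power = z_test_power(effect=0.5, sd=1.0, n=n)
    print(f"n={n:3d}  power={power:.3f}  beta={1 - power:.3f}")
```

When the true effect is zero, the function returns α, since every rejection is then a type I error.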
What is the significance of power?
- If a statistically non-significant result was obtained and the power was large, then there is probably no real difference.
- If a statistically non-significant result was obtained and the power was small, then there are 2 possibilities:
1. There was no real difference.
2. There was a real difference but it was not detected.
What is multiple testing?
When multiple statistical tests are being applied to multiple sets of data (relating to the same parameters) simultaneously.
What is the problem with multiple testing?
The more tests are carried out, the greater the chance that at least one turns out to be significant just by chance (a type I error).
How do we correct for multiple testing?
- Use the Bonferroni correction, whereby the new significance level = original significance level (α)/number of tests.
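A small sketch of the Bonferroni correction. The p-values are hypothetical, and the family-wise error rate formula assumes the tests are independent.

```python
alpha = 0.05
p_values = [0.001, 0.012, 0.030, 0.048, 0.200]  # hypothetical results
m = len(p_values)

# Chance of at least one false positive across m independent tests at alpha.
fwer_uncorrected = 1 - (1 - alpha) ** m

# Bonferroni: divide the significance level by the number of tests.
alpha_bonferroni = alpha / m

significant_uncorrected = [p for p in p_values if p < alpha]
significant_bonferroni = [p for p in p_values if p < alpha_bonferroni]

print(f"Family-wise error rate without correction: {fwer_uncorrected:.3f}")
print(f"Corrected significance level: {alpha_bonferroni:.3f}")
print(f"Significant before correction: {significant_uncorrected}")
print(f"Significant after correction:  {significant_bonferroni}")
```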
What is an unpaired t-test used for?
Used to compare the means of 2 independent groups of measurements.
What are the assumptions required for an unpaired t-test?
- Observations are independent
- Data in each group are normally distributed
- Variance is the same in both groups
What is the name given to 2 sets of data with the same variance?
Homoscedastic
How can homoscedasticity be determined?
Using Levene’s test, or an F-test for equality of variances
How are the degrees of freedom calculated for a t-test?
- Unpaired t-test: Degrees of freedom = Total sample size (n1 + n2) - 2
- Paired t-test: Degrees of freedom = Number of pairs - 1
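A minimal sketch of where the degrees of freedom enter an unpaired t-test, assuming equal variances (pooled estimate). The function name `pooled_t` and the two groups of measurements are hypothetical. The p-value is omitted because the standard library has no t-distribution.

```python
from math import sqrt
from statistics import fmean, variance


def pooled_t(a, b):
    """Return (t statistic, degrees of freedom) for an equal-variance unpaired t-test."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their own degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (fmean(a) - fmean(b)) / se, df


group_a = [5.1, 4.9, 5.6, 5.8, 6.0]  # hypothetical measurements
group_b = [4.2, 4.8, 4.4, 5.0, 4.6]

t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```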
What is a paired t-test used for?
Used to compare the means of 2 sets of paired measurements, e.g. results taken from the same subjects before and after a specific intervention.
What are the types of pairing?
- Self-pairing
- Natural pairing (e.g. siblings)
- Matched pairs (sex, size, weight…)
What are the assumptions required for a paired t-test?
- Unbiased selection
- The differences between pairs are normally distributed
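A paired t-test works on the within-pair differences. The sketch below assumes self-pairing (the same subjects measured twice); the function name `paired_t` and the measurements are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    # Reduce each pair to a single difference.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference.
    se = stdev(diffs) / sqrt(n)
    return fmean(diffs) / se, n - 1


before = [10, 12, 14]  # hypothetical measurements before the intervention
after = [11, 14, 15]   # same subjects after the intervention

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```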
How is variance analysed?
- There are several sets of data, each corresponding to a different group.
- The mean of each set is calculated, and the squared deviations of the data from their own group mean are summed across all groups (A, the within-group variation).
- Data from all sets are merged into one big set of data.
- The squared deviations of this merged set from the overall mean are summed (B, the total variation).
- If A ≈ B, group means are practically equal to each other. If B is much larger than A, the group means differ.
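The comparison of A and B can be sketched with sums of squared deviations, which is how one-way ANOVA is usually computed (an assumption about what the card intends). The three groups are hypothetical.

```python
from statistics import fmean

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical groups

all_values = [x for g in groups for x in g]
grand_mean = fmean(all_values)

# A: variation of each value about its own group mean.
ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

# B: variation of every value about the overall mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# The gap between B and A is the variation between group means.
ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)

k, n = len(groups), len(all_values)
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))

print(f"A (within) = {ss_within:.2f}, B (total) = {ss_total:.2f}")
print(f"Between = {ss_between:.2f}, F = {f_stat:.2f}")
```

A large F statistic means the between-group variation is large relative to the within-group variation.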
What are one-way ANOVA tests used for?
Used to test whether there are any significant differences between the means of three or more groups of data classified by one input variable (factor).
What are two-way ANOVA tests used for?
Used to test whether there are any significant differences between the means of multiple groups of data classified by two input variables (factors).
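What are Wilcoxon signed-rank tests used for?
Used to determine whether there is a significant difference between a pair of related (paired) sets of data that are ranked or not normally distributed. The Wilcoxon rank sum test, by contrast, is for independent groups and is equivalent to the Mann-Whitney U test.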
What are Mann-Whitney U tests used for?
Used to determine whether there is a significant difference between the distributions (often summarised as medians) of 2 sets of independent data that are ranked or not normally distributed.
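A minimal sketch of the U statistic itself, computed by counting pairs. The function name `mann_whitney_u` and the two samples are hypothetical; the p-value step is omitted.

```python
def mann_whitney_u(a, b):
    """U statistic for sample a: pairs (x, y) with x > y count 1, ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


sample_a = [1.1, 2.3, 3.8, 4.0]  # hypothetical independent samples
sample_b = [2.9, 5.2, 6.1]

u_a = mann_whitney_u(sample_a, sample_b)
u_b = mann_whitney_u(sample_b, sample_a)

# The two U values always add up to the number of possible pairs.
print(f"U_a = {u_a}, U_b = {u_b}, pairs = {len(sample_a) * len(sample_b)}")
```

Only the ordering of the values matters, not their size, which is why the test does not rely on normality.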
What are Kruskal-Wallis one-way ANOVA tests used for?
Extension of the Mann-Whitney U test. Used to determine whether there is a significant difference between the distributions (often summarised as medians) of three or more sets of independent data that are ranked or not normally distributed.
What are Friedman two-way ANOVA tests used for?
Used to determine whether there is a significant difference between the distributions of three or more sets of related data that are ranked or not normally distributed.
What is Spearman’s rank correlation coefficient used for?
To measure the strength and direction of a monotonic relationship between two variables using their ranks. Suitable for ordinal data or data that are not normally distributed.
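A short sketch of Spearman's coefficient using the rank-difference formula, which assumes there are no tied values. The helper names `ranks` and `spearman_rho` and the data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)); no ties allowed."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


dose = [10, 20, 30, 40, 50]           # hypothetical paired observations
response = [1.2, 0.9, 3.5, 2.8, 9.0]

rho = spearman_rho(dose, response)
print(f"Spearman's rho = {rho:.2f}")
```

Because only ranks are used, any strictly increasing relationship (linear or not) gives a coefficient of 1.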
What are the advantages of odds ratios compared to relative risk?
- Odds ratios can be determined for case-control (retrospective) studies, where relative risk cannot be calculated directly
- Odds ratios can easily be adjusted for covariates (e.g. using logistic regression)
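A small sketch of both measures from a 2x2 table. The counts are hypothetical and are treated as cohort data so that the relative risk is also meaningful.

```python
# Hypothetical 2x2 table.
#              cases   non-cases
# exposed        a=20      b=80
# unexposed      c=10      d=90
a, b = 20, 80
c, d = 10, 90

# Relative risk: risk in the exposed divided by risk in the unexposed.
# Needs the group totals, so it only makes sense for cohort-style data.
relative_risk = (a / (a + b)) / (c / (c + d))

# Odds ratio: odds of disease in the exposed over odds in the unexposed.
# The cross-product form does not depend on how many cases and controls were sampled.
odds_ratio = (a * d) / (b * c)

print(f"Relative risk = {relative_risk:.2f}")
print(f"Odds ratio    = {odds_ratio:.2f}")
```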
What can statisticians help accomplish?
- Obtain relevant sample size
- Help with study design
- Conduct analysis
- Get paper published