Statistical tests Flashcards
T test (independent)
Parametric test
Compare means between two independent groups
Outcome: continuous
groups: 2
samples are independent
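A minimal sketch of the pooled-variance (Student's) t statistic, assuming equal variances in the two groups. The function name and the data are made up for illustration.

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Return (t, degrees of freedom) for two independent samples."""
    na, nb = len(a), len(b)
    # Pooled sample variance: assumes both groups share a common variance.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2

group_a = [5.1, 4.9, 5.6, 5.8, 6.0]  # made-up measurements
group_b = [4.1, 4.4, 4.0, 4.8, 4.7]
t, df = independent_t(group_a, group_b)
print(round(t, 2), df)
```

The t value is then compared against a t distribution with df degrees of freedom.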
T test (paired)
Parametric test
Compare means before and after treatment in the same group
outcome: continuous
Groups: 2
samples are paired
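A small sketch of the paired t statistic: it is a one-sample t test on the within-subject differences. Function name and data are illustrative only.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, degrees of freedom) computed on the paired differences."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1

before = [10, 12, 9, 11, 13]  # made-up scores before treatment
after = [12, 13, 11, 14, 14]  # same subjects after treatment
t, df = paired_t(before, after)
print(round(t, 2), df)
```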
Z test (2 sample)
Parametric
Compare means of two independent groups when sample sizes are large (n > 30 per group) or the population variances are known
outcome: continuous
groups: 2
samples are independent
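A sketch of the two-sample z statistic, assuming the samples are large enough that the sample variances can stand in for the population variances. The data are generated deterministically just to have n = 40 per group.

```python
import math
from statistics import NormalDist, mean, variance

def two_sample_z(a, b):
    """Return (z, two-sided p-value) using the standard normal distribution."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

group_a = [50 + (i % 7) for i in range(40)]  # made-up values, n = 40
group_b = [48 + (i % 7) for i in range(40)]
z, p = two_sample_z(group_a, group_b)
print(round(z, 2), p)
```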
ANOVA (one way)
Parametric
Compare means across multiple independent groups
outcome: continuous
groups: more than 2
samples are independent
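A minimal sketch of the one-way ANOVA F statistic: between-group variance divided by within-group variance. The function name and the three small groups are made up.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, (df_between, df_within))."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k)), (k - 1, n - k)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up data
f_stat, dfs = one_way_anova_f(groups)
print(f_stat, dfs)
```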
Repeated measures ANOVA
Parametric
Compare means across multiple conditions in the same subjects
outcome: continuous
groups: more than 2
samples are paired
Chi Square
Non parametric
Test for independence between categorical variables when samples are large (expected count of at least 5 in each cell)
outcome: categorical
groups: 2 or more
samples are independent
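A sketch of the chi-square statistic for a contingency table, including the expected counts that the sample-size rule refers to. The table is made up and no continuity correction is applied.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

table = [[20, 30], [30, 20]]  # made-up 2x2 counts
stat, df, expected = chi_square(table)
print(stat, df, expected)
```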
Fisher's exact test
non parametric
Test for independence in small-sample categorical data (any expected count <5)
outcome: categorical
groups: 2
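A sketch of a two-sided Fisher's exact p-value for a 2x2 table [[a, b], [c, d]], computed from the hypergeometric distribution. It sums the probabilities of all tables at least as unlikely as the observed one, which is one common two-sided convention. The counts are made up.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability of x in the top-left cell with all margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small tolerance so equally likely tables are not lost to rounding.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

p = fisher_exact_two_sided(3, 1, 1, 3)  # made-up small-sample table
print(round(p, 4))
```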
McNemar’s test
non parametric
Test for paired categorical data (e.g., pre/post intervention)
outcome: categorical (binary)
groups: 2
samples are paired
non paired version is chi square test or Fisher's exact test
assumptions:
- one dichotomous (two-category) nominal dependent variable and one independent variable with two related groups.
- The two categories of the dependent variable must be mutually exclusive.
- The sample must be a random sample.
e.g. whether individuals are a smoker or non-smoker before and after an intervention
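A sketch of the McNemar statistic for the smoker example. Only the discordant pairs (people who changed status) enter the calculation. The counts are made up, and no continuity correction is applied.

```python
import math
from statistics import NormalDist

def mcnemar(b, c):
    """b and c are the two discordant pair counts. Returns (statistic, p-value)."""
    stat = (b - c) ** 2 / (b + c)
    # A chi-square variable with 1 df is a squared standard normal.
    p = 2 * (1 - NormalDist().cdf(math.sqrt(stat)))
    return stat, p

quit_smoking = 15     # smoker before, non-smoker after (made-up)
started_smoking = 5   # non-smoker before, smoker after (made-up)
stat, p = mcnemar(quit_smoking, started_smoking)
print(stat, round(p, 4))
```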
Wilcoxon signed rank test
non parametric
Compare ranks or medians between two paired groups
outcome: ordinal or continuous
groups: 2
samples are paired
parametric version is paired t test
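A sketch of the Wilcoxon signed-rank statistics: rank the absolute differences, then sum the ranks of the positive and negative differences separately. Zero differences are dropped and ties get the average rank. Names and data are illustrative.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of positions i+1..j+1
        i = j + 1
    return ranks

def wilcoxon_signed_rank(before, after):
    """Return (W+, W-): rank sums of positive and negative differences."""
    diffs = [y - x for x, y in zip(before, after) if y != x]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

before = [10, 12, 9, 11, 13, 8]   # made-up paired data
after = [12, 11, 12, 15, 13, 13]
w_plus, w_minus = wilcoxon_signed_rank(before, after)
print(w_plus, w_minus)
```

The smaller of the two sums is usually taken as the test statistic.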
Mann-Whitney U test (aka Wilcoxon rank-sum test)
non parametric
Compare medians of two independent groups when data is non-normally distributed
outcome: ordinal or continuous
groups: 2
data are not paired
parametric version is Student's independent samples t test
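A sketch of the U statistic using its pair-counting definition: U for group a is the number of (a, b) pairs in which the a value is larger, with ties counting as half. Names and data are made up.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) by counting pairwise wins; ties count 0.5."""
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return u_a, u_b

group_a = [3, 5, 8]      # made-up data
group_b = [4, 6, 7, 9]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```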
Kruskal-Wallis Test
non parametric test
Compare multiple independent groups when data is non-normally distributed
outcome: ordinal or continuous
groups: more than 2
groups are independent
parametric version is ANOVA
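A sketch of the Kruskal-Wallis H statistic: pool all values, rank them, and compare the rank sums per group. This version assumes there are no tied values (no tie correction), and the data are made up.

```python
def kruskal_wallis_h(groups):
    """H statistic; valid as written only when no values are tied."""
    pooled = sorted(x for g in groups for x in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # assumes no ties
    n = len(pooled)
    total = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # made-up data
h = kruskal_wallis_h(groups)
print(round(h, 2))
```

H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom.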
Friedman test
non parametric
Compare repeated measures in non-normally distributed data
outcome: ordinal or continuous
groups: more than 2
data are paired
repeated measures ANOVA is the parametric version
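A sketch of the Friedman statistic: rank the conditions within each subject, then compare the rank sums per condition. It assumes no ties within a subject's row, and the data are made up.

```python
def friedman_q(rows):
    """rows: one list per subject, one value per condition (no ties in a row)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0] * k
    for row in rows:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank  # rank of condition j within this subject
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

rows = [[10, 12, 15], [8, 11, 13], [9, 10, 14], [7, 12, 11]]  # made-up data
q = friedman_q(rows)
print(q)
```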
Pearson coefficient
parametric
Assess linear correlation between two variables
both variables: continuous
each subject contributes one (x, y) pair; subjects are independent of each other
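A minimal sketch of the Pearson correlation coefficient computed from sums of squares. Function name and data are illustrative.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient for paired lists x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]  # made-up data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))
```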
Spearman correlation
non parametric
assess monotonic relationship between two variables (monotonic = as one variable increases, the other consistently increases or consistently decreases, not necessarily in a straight line)
both variables: ordinal or continuous
parametric version = Pearson correlation
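A sketch of Spearman's rho using the rank-difference formula, which holds when there are no ties. The cubic example shows a relationship that is perfectly monotonic but not linear. Names and data are made up.

```python
def spearman_rho(x, y):
    """Spearman's rho via 1 - 6*sum(d^2)/(n(n^2-1)); assumes no tied values."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for position, i in enumerate(order, start=1):
            result[i] = position
        return result

    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

x = [1, 2, 3, 4, 5]
rho_up = spearman_rho(x, [v ** 3 for v in x])  # increasing, non-linear
rho_down = spearman_rho(x, [9, 7, 6, 2, 1])    # consistently decreasing
print(rho_up, rho_down)
```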
Linear regression
Parametric
Predict a continuous outcome from independent variables using a straight-line equation. The gradient shows the change in y for every 1-unit increase in x
dependent variable: continuous
assumptions:
- Linear relationship
- Normality of residuals
- Constant variance
- Independent observations
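A minimal least-squares sketch for a single predictor, showing where the gradient and intercept of the straight line come from. Function name and data are illustrative.

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return slope, intercept

x = [1, 2, 3, 4, 5]  # made-up data
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
# Residuals are what the normality and constant-variance assumptions refer to.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
print(round(slope, 2), round(intercept, 2))
```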
Logistic Regression
parametric
estimates the probability of an event occurring, such as voted or didn’t vote, based on a given data set of independent variables.
outcome variable: binary
predictors can be continuous or categorical
output is on the log(odds) scale; exponentiating a coefficient gives the odds ratio
Assumptions:
- Binary outcome variable
- Linearity of log odds and independent variables
- Independent Observations (not paired or matched)
- No multicollinearity (independent variables should not be highly correlated with each other)
- Large sample size with no significant outliers
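A sketch of how log odds convert to a probability and an odds ratio. The intercept and coefficient are hypothetical values, not the result of fitting any real data.

```python
import math

def predicted_probability(intercept, coef, x):
    """Convert the linear predictor (log odds) into a probability."""
    log_odds = intercept + coef * x
    return 1 / (1 + math.exp(-log_odds))

intercept, coef = -2.0, 0.5      # hypothetical fitted values
odds_ratio = math.exp(coef)      # odds multiply by this per 1-unit increase in x
p_at_4 = predicted_probability(intercept, coef, 4)
p_at_5 = predicted_probability(intercept, coef, 5)
print(round(odds_ratio, 3), p_at_4, round(p_at_5, 3))
```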
Cox regression
investigating the association between the survival time of patients (time to event data) and one or more predictor variables (categorical or continuous)
Outcome: time to event data
coefficients are on a log scale (log hazard ratios)
output after exponentiating is the hazard ratio (interpreted similarly to a relative risk)
assumptions:
- proportional hazards
- independent observations
- no relationship between probability of being censored and the event of interest
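A sketch of turning a log-scale Cox coefficient into a hazard ratio with an approximate 95% confidence interval. The coefficient and standard error are hypothetical, and the interval assumes the usual normal approximation on the log scale.

```python
import math

def hazard_ratio_with_ci(coef, se, z=1.96):
    """Return (hazard ratio, lower, upper) from a log hazard ratio and its SE."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)

# Hypothetical coefficient and standard error for one predictor.
hr, lower, upper = hazard_ratio_with_ci(0.405, 0.2)
print(round(hr, 2), round(lower, 2), round(upper, 2))
```

An interval that excludes 1 suggests the hazard differs between the groups being compared.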
Z test of proportions
Compare proportions between two independent groups when sample sizes are large
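A sketch of the two-proportion z test using the pooled proportion for the standard error. Function name and counts are made up.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for events x out of n in each group."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))

z, p = two_proportion_z(60, 100, 40, 100)  # made-up counts
print(round(z, 3), round(p, 4))
```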
Multiple linear regression
An extension of linear regression where multiple independent variables predict a dependent variable. Can be used to adjust the model for confounders. If an association seen in single regression becomes smaller or non-significant in multiple regression, this suggests the factor has less effect on the outcome once the other variables are considered
Poisson regression
Type of regression used for outcomes which are counts of independent events (rates). Coefficients are on a log scale; exponentiating a coefficient gives a rate ratio.
Can test to see if predictors have significant impact on outcome.
predictors can be continuous or categorical
assumptions:
- response variable is count data
- observations are independent (not paired)
- counts follow the Poisson distribution, so the mean and the variance are equal
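A sketch of two quick calculations related to this card: a crude rate ratio between two groups, and a variance-to-mean ratio as a rough check of the equal mean and variance assumption. All numbers are made up, and this is not a fitted Poisson model.

```python
import math
from statistics import mean, variance

def rate_ratio(events_a, time_a, events_b, time_b):
    """Ratio of event rates (events per unit of follow-up time)."""
    return (events_a / time_a) / (events_b / time_b)

def dispersion_ratio(counts):
    """Sample variance / mean; values near 1 are consistent with Poisson counts."""
    return variance(counts) / mean(counts)

rr = rate_ratio(30, 1000, 15, 1000)   # made-up events and person-time
log_rr = math.log(rr)                 # the scale regression coefficients are on
ratio = dispersion_ratio([0, 1, 2, 3, 4, 2, 1, 3])  # made-up counts
print(rr, round(log_rr, 3), round(ratio, 3))
```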
Regression
Regression is a statistical method that can be used to determine the relationship between one or more predictor variables and a response variable.