Biostatistics Flashcards
Quantitative/Continuous
Variables that can theoretically take any value between two points (decimal values possible)
t-test, ANOVA, linear regression
Categorical/Discrete
No intermediate values possible (no decimal points)
Chi square, logistic regression (Cohen’s Kappa)
Central Tendency
Most frequently used is mean (sensitive to outliers)
Mean, Median, Mode
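A small sketch of why the mean is called sensitive to outliers. The data are made up for illustration; only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical data: five typical values plus one extreme outlier (40).
values = [2, 3, 3, 4, 5, 40]

mean_value = statistics.mean(values)      # pulled upward by the outlier
median_value = statistics.median(values)  # middle of the sorted values
mode_value = statistics.mode(values)      # most frequent value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

The single outlier drags the mean well above the median and mode, which stay near the bulk of the data.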
Measures of dispersion (3)
Range (highest value minus lowest value)
Interquartile range (75th percentile minus 25th)
Variance (spread of data around the mean)
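The three measures of dispersion above can be computed as in the sketch below. The data are hypothetical, and quartile conventions differ between software packages, so the interquartile range may not match other tools exactly.

```python
import statistics

# Hypothetical measurements.
values = [2, 4, 4, 5, 7, 9, 11, 14]

# Range: highest value minus lowest value.
data_range = max(values) - min(values)

# Interquartile range: 75th percentile minus 25th percentile.
# statistics.quantiles with n=4 returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Sample variance: average squared distance from the mean (n - 1 denominator).
sample_variance = statistics.variance(values)

print("range:", data_range)
print("IQR:", iqr)
print("variance:", sample_variance)
```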
Dependent
Outcome of interest, in relation to the independent variable
Independent
Risk factors or indicators of disease, exposure
Parametric
Make an assumption about the underlying distribution
Nonparametric
Do not assume a specific underlying distribution, therefore considered robust
MED - Minimum Expected Difference
The smallest measured difference between comparison groups that the investigator would like this study to detect
P-value
Probability of observing a result at least as extreme as the one found if the null hypothesis is true (i.e., by chance alone); when it is less than 0.05, the result is generally considered statistically significant
Power
Ability to find a difference when there truly is a difference (conventionally 80%); probability of rejecting the null hypothesis when it is false
Confidence interval
A range expected to contain the true population parameter that you are measuring, at a stated confidence level (e.g., 95%)
Measure of statistical significance and precision
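As a rough illustration, a 95% confidence interval for a mean can be sketched as mean ± z × standard error. This uses a normal approximation, which is an assumption made here for simplicity; with small samples a t critical value would normally replace z. The sample values are hypothetical.

```python
import math
import statistics

# Hypothetical sample of measurements.
sample = [120, 125, 130, 118, 122, 127, 133, 121]

n = len(sample)
mean = statistics.mean(sample)

# Standard error of the mean: sample standard deviation / sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(n)

# z critical value for a two-sided 95% interval (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = mean - z * standard_error
ci_high = mean + z * standard_error

print("mean:", mean)
print("95% CI (normal approximation):", (ci_low, ci_high))
```

A narrower interval indicates a more precise estimate; an interval for a difference that excludes 0 corresponds to statistical significance at the matching level.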
Type 1 error
False discovery (false positive): when we reject a true null hypothesis
Type 2 error
False negative: when we fail to reject a null hypothesis that is false
t-test
Comparing means of two samples for a statistically meaningful difference
Outcomes have to be continuous
One sample t-test
Compares the mean of a sample to some specified value
Two sample t-test
Compares the means of two different independent samples; assumes normality and equal variance, but does not require equal sample sizes (Ex. Compare class 2020 to 2021)
Paired t-test
Used for before and after tests on the same subjects, or for differences between matched pairs (e.g., matched experimental and placebo subjects)
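The t statistics behind the t-test cards above can be sketched with the standard library. The function names and data are illustrative only; the two-sample version is the pooled-variance (equal variance) form, and turning a t statistic into a p-value additionally needs the t distribution, which is not shown here.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t statistic comparing a sample mean to a specified value mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


def two_sample_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def paired_t(before, after):
    """Paired t statistic: a one-sample test on the within-pair differences."""
    differences = [y - x for x, y in zip(before, after)]
    return one_sample_t(differences, 0.0)


# Hypothetical before/after scores for the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]

print("paired t:", paired_t(before, after))
print("two-sample t (treating groups as independent):", two_sample_t(after, before))
```

On the same numbers, the paired statistic is much larger than the independent one because pairing removes between-subject variation.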
Pearson’s correlation
Measure of strength of linear relationship
r (the correlation coefficient, not the slope) of 0 means no linear relationship, but a nonlinear relationship may still exist
Positive, negative, no correlation, or a nonlinear relationship with r near 0
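A minimal sketch of Pearson's r, written out by hand so the formula is visible. The data are made up to show that r of 0 does not necessarily mean there is no relationship: the curved example is perfectly determined by x yet has no linear trend.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]  # perfect positive linear relationship
curved = [v ** 2 for v in x]     # perfect, but nonlinear, relationship

r_linear = pearson_r(x, linear)
r_curved = pearson_r(x, curved)

print("r for linear data:", r_linear)
print("r for curved data:", r_curved)
```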
Cohen’s Kappa
Used to measure observer agreement (inter-examiner reliability)
Want value to be close to 1
Between people for consistency
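Cohen's kappa compares observed agreement with the agreement expected by chance: kappa = (observed − expected) / (1 − expected). The sketch below uses two hypothetical examiners rating the same ten cases; the function name and data are illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings from two examiners on ten cases.
examiner_1 = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "yes", "no"]
examiner_2 = ["yes", "yes", "yes", "no", "no", "no", "no", "yes", "yes", "no"]

kappa = cohens_kappa(examiner_1, examiner_2)
print("kappa:", kappa)
```

Here the examiners agree on 8 of 10 cases, but half of that agreement is expected by chance, so kappa is lower than the raw 80% agreement.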
Chi square
Compares the observed with the expected
Goodness of fit
Independence and Homogeneity
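A sketch of how the chi-square statistic compares observed with expected counts in a test of independence. The 2×2 table is hypothetical; expected counts come from the row and column totals.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows = exposed / unexposed, columns = disease / no disease.
table = [[30, 10],
         [20, 40]]

chi2 = chi_square_statistic(table)
degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)

print("chi-square:", chi2)
print("degrees of freedom:", degrees_of_freedom)
```

With 1 degree of freedom, a statistic above roughly 3.84 corresponds to p < 0.05.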
ANOVA
When two or more means are being compared
Extends the t-test to more groups or factors
Isolates and assesses the contribution of categorical independent variables to variation in the mean of a continuous dependent variable
One way ANOVA
Only one factor separates the observations into K groups
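The F statistic for a one way ANOVA can be sketched as the ratio of between-group to within-group variation. The three groups below are hypothetical, and the function name is illustrative.

```python
import statistics


def one_way_anova_f(groups):
    """F statistic for a one way ANOVA on a list of groups (lists of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical outcome values for K = 3 groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

f_statistic = one_way_anova_f(groups)
print("F:", f_statistic)
```

A large F means the group means differ by more than the scatter within groups would suggest.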
Two way ANOVA
Two categorical independent variables influence a continuous outcome variable
Simple linear regression
The outcome is continuous and there is only one continuous predictor (x -> y)
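A minimal least-squares sketch of simple linear regression with one predictor x and one continuous outcome y. The data and the function name are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical predictor and outcome values.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(x, y)
print("slope:", slope)
print("intercept:", intercept)
print("predicted y at x = 6:", intercept + slope * 6)
```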
Multiple linear regression
Multiple predictors and a continuous outcome (x1, x2, x3, etc. -> y)
Logistic regression
When the outcome is categorical
Used to predict the probability of occurrence
Binary outcome: one or the other
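A sketch of how a logistic regression model turns a predictor into a probability of occurrence. The intercept and slope below are made-up coefficients, not fitted values; fitting them from data requires an iterative procedure that is not shown.

```python
import math


def logistic(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1 / (1 + math.exp(-z))


def predicted_probability(x, intercept, slope):
    """Probability of the outcome for predictor value x under the model."""
    return logistic(intercept + slope * x)


# Hypothetical coefficients (assumed for illustration, not estimated).
intercept = -4.0
slope = 0.1

for age in (20, 40, 60):
    print(age, predicted_probability(age, intercept, slope))

# exp(slope) is the odds ratio for a one-unit increase in the predictor.
odds_ratio = math.exp(slope)
print("odds ratio per unit:", odds_ratio)
```

To classify as one outcome or the other, the predicted probability is typically compared with a cutoff such as 0.5.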
Non-parametric tests
Based on ranks and used when distribution of the data is unknown (cannot make assumptions)
Non-parametric test examples
Wilcoxon signed-rank test
Wilcoxon rank-sum test
Kruskal-Wallis test
Spearman rank correlation
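To illustrate what "based on ranks" means, the sketch below computes the Spearman rank correlation by replacing each value with its rank (ties get the average rank) and then correlating the ranks. The function names and data are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    denominator = math.sqrt(sum((a - mean_x) ** 2 for a in rx)
                            * sum((b - mean_y) ** 2 for b in ry))
    return numerator / denominator


# Hypothetical data: y increases with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman_rho(x, y)
print("ranks with a tie:", average_ranks([10, 20, 20, 30]))
print("Spearman rho:", rho)
```

Because only the ordering matters, any consistently increasing relationship gives a rho of 1, whatever the shape of the underlying distribution.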