Biostatistics Flashcards
random variables
a variable whose observed values may be considered outcomes of an experiment and whose values cannot be anticipated with certainty before the experiment is conducted
discrete variables
random variable that can only take a limited number of values within a given range (e.g. nominal (unordered, no relative severity, e.g. gender); ordinal (ranked in specific order with no consistent level of magnitude between ranks e.g. NYHA classification))
Continuous variables
random variables that can take on any value within a given range.
Interval: data ranked in a specific order with consistent change in magnitude b/t units but the zero point is arbitrary (e.g. degrees Fahrenheit)
Ratio: like interval but w/absolute zero (e.g. degrees Kelvin, HR, time)
Descriptive statistics
used to summarize and describe data that are collected or generated in research
i.e. visual methods of describing data, measures of central tendency, measures of data spread or variability
visual methods of describing data
frequency distribution, histogram, scatterplot
Measures of central tendency
mean, median, mode
Measures of data spread or variability
standard deviation, range, percentiles
Inferential statistics
conclusions or generalizations made about a population from the study sample.
Normal (Gaussian) distribution
most common model for population distributions; symmetric or “bell-shaped” frequency distribution
Kolmogorov-Smirnov test
formal statistical test (as opposed to a visual check) of whether data follow a Gaussian distribution
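A minimal sketch of this kind of normality check in Python with scipy.stats.kstest (the sample data are made up; estimating the mean/SD from the data strictly calls for the Lilliefors variant):
    import numpy as np
    from scipy import stats
    data = np.random.default_rng(0).normal(loc=120, scale=15, size=50)  # hypothetical measurements
    stat, p = stats.kstest(data, 'norm', args=(data.mean(), data.std(ddof=1)))  # sample vs. fitted normal
    print(stat, p)  # a large p-value gives no evidence against normality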
Parametric tests
assume that data (i.e. parent population) have an underlying distribution that is normal or close to normal and that variances are homogeneous between the groups investigated
nonparametric tests
used when data are not normally distributed or do not meet other criteria for parametric tests
One-sample Student t-test
parametric test that compares the mean of the study sample with the population mean
Two-sample, independent samples, or unpaired Student t-test
parametric test that compares the means of two independent samples.
F test
formal test for differences in variances, used alongside the two-sample, independent samples, or unpaired Student t-test
Paired Student t-test
parametric test; compares the mean difference of paired or matched samples. A related samples test.
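A hedged sketch of the three t-test variants with scipy.stats (the blood-pressure numbers are invented for illustration):
    from scipy import stats
    baseline_sbp = [142, 138, 150, 145, 139, 148]   # hypothetical systolic BPs
    treated_sbp  = [135, 130, 141, 138, 133, 140]   # same patients after treatment
    control_sbp  = [141, 137, 149, 144, 140, 147]   # an independent control group
    stats.ttest_1samp(baseline_sbp, popmean=140)    # one-sample: sample mean vs. population mean
    stats.ttest_ind(treated_sbp, control_sbp)       # unpaired: two independent samples (equal_var=False gives Welch's test if variances differ)
    stats.ttest_rel(baseline_sbp, treated_sbp)      # paired: matched/related samples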
Analysis of variance
Parametric test; more generalized version of the t-test that can apply to more than two groups
One-way ANOVA, aka single-factor ANOVA
Analysis of variance; compares the means of 3 or more groups in a study. An independent samples test.
Two-way ANOVA
Analysis of variance; additional factors added to one-way ANOVA
Repeated-measures ANOVA
Analysis of variance; related samples test
Analysis of covariance
provides a method to explain the influence of a categorical variable (independent variable) on a continuous variable (dependent variable) while statistically controlling for other variables (confounding)
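A minimal sketch, assuming three made-up treatment groups, of a one-way ANOVA with scipy and an ANCOVA-style model with statsmodels (all variable names and values are hypothetical):
    import pandas as pd
    from scipy import stats
    import statsmodels.api as sm
    import statsmodels.formula.api as smf
    a = [5.1, 4.8, 5.5, 5.0]; b = [6.2, 6.0, 5.9, 6.4]; c = [7.1, 6.8, 7.3, 7.0]
    stats.f_oneway(a, b, c)   # one-way ANOVA: compares the means of the 3 independent groups
    # ANCOVA: categorical group effect on a continuous outcome, controlling for a covariate (age)
    df = pd.DataFrame({'outcome': a + b + c,
                       'group': ['A']*4 + ['B']*4 + ['C']*4,
                       'age': [54, 61, 47, 58, 50, 63, 55, 49, 60, 52, 57, 66]})
    sm.stats.anova_lm(smf.ols('outcome ~ C(group) + age', data=df).fit(), typ=2)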
Wilcoxon rank sum and Mann-Whitney U test
nonparametric tests that compare two independent samples (related to a t-test)
Kruskal-Wallis one-way ANOVA by ranks
nonparametric test that compares 3 or more independent groups (related to one-way ANOVA); post hoc testing is needed to identify which groups differ
Sign test & Wilcoxon signed rank test
nonparametric tests that compare 2 matched or paired samples (related to a paired t-test)
Friedman ANOVA by ranks
nonparametric test that compares 3 or more matched/paired groups
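A hedged sketch of these rank-based tests in scipy.stats (illustrative ordinal-style scores):
    from scipy import stats
    grp1 = [3, 1, 4, 2, 5, 3]; grp2 = [4, 5, 6, 5, 4, 6]; grp3 = [6, 7, 5, 7, 6, 8]
    stats.mannwhitneyu(grp1, grp2)             # 2 independent samples (Mann-Whitney U / rank sum)
    stats.kruskal(grp1, grp2, grp3)            # 3+ independent groups (Kruskal-Wallis)
    stats.wilcoxon(grp1, grp2)                 # 2 matched/paired samples (Wilcoxon signed rank)
    stats.friedmanchisquare(grp1, grp2, grp3)  # 3+ matched/paired groups (Friedman)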
Chi-square test
nominal data; compares expected and observed proportions between 2 or more groups. Test of independence and goodness of fit.
Fisher exact test
nominal data; specialized version of the chi-square test used when any group (cell) contains fewer than 5 expected observations
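A minimal sketch of both tests on a hypothetical 2x2 table (rows = treatment/control, columns = event/no event):
    from scipy import stats
    table = [[10, 90],   # treatment: 10 events, 90 without
             [20, 80]]   # control:   20 events, 80 without
    stats.chi2_contingency(table)   # chi-square test of independence (expected counts >= 5 here)
    stats.fisher_exact(table)       # exact alternative, preferred when expected counts are small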
McNemar
nominal data; paired samples
Mantel-Haenszel
nominal data; controls for the influence of confounders
Type I decision error
probability of making this error is defined as α (alpha); α is usually set to 0.05, meaning that 5% of the time a researcher will conclude a statistical difference exists when one does not actually exist
p-value
the calculated probability of obtaining the observed result, or one more extreme, if the null hypothesis is true; compared against α to judge whether a type I error is acceptably unlikely
Type II decision error
concluding that no difference exists when one truly does; the probability of making this error is termed β (beta), usually set between 0.10 and 0.20
Power
the probability of making a correct decision when the null hypothesis is false; the ability to detect differences between groups if differences actually exist (power = 1 - β)
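One way α, β, and power fit together computationally, sketched with statsmodels (the standardized effect size of 0.5 is an assumption chosen only for illustration):
    from statsmodels.stats.power import TTestIndPower
    # sample size per group for an unpaired t-test with alpha = 0.05 and power = 0.80 (beta = 0.20)
    n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
    print(n_per_group)   # roughly 64 per group for a medium standardized effect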
Correlation
examines the strength of the association between two variables; does not assume one variable predicts the other
Regression
examines the ability of one or more variables to predict another variable
Pearson correlation
the strength of the relationship b/t 2 variables that are normally distributed, ratio or interval scaled, and linearly related is measured with a correlation coefficient; the degree of association b/t 2 variables
Spearman rank correlation
nonparametric test that quantifies the strength of an association b/t 2 variables but does not assume a normal distribution of continuous data; can be used for ordinal data or nonnormally distributed continuous data
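A short sketch contrasting the two correlation measures in scipy.stats (made-up paired measurements):
    from scipy import stats
    dose     = [10, 20, 30, 40, 50, 60]
    response = [12, 25, 31, 45, 48, 70]
    stats.pearsonr(dose, response)    # linear association; assumes normal, interval/ratio-scaled data
    stats.spearmanr(dose, response)   # rank-based; ordinal or non-normally distributed data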
Kaplan-Meier method
uses survival times to estimate the proportion of people who would survive a given length of time under the same circumstances
Log-rank test
compares the survival distributions b/t 2 or more groups
Cox proportional hazards model
survival analysis; most popular method to evaluate the impact of covariates; reported (graphically) like Kaplan-Meier
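A hedged sketch of all three survival tools, assuming the third-party lifelines package; follow-up times and event flags (1 = event observed, 0 = censored) are invented:
    import pandas as pd
    from lifelines import KaplanMeierFitter, CoxPHFitter
    from lifelines.statistics import logrank_test
    t_a = [5, 8, 12, 20, 24]; e_a = [1, 1, 0, 1, 0]   # treatment arm: months, event flags
    t_b = [3, 6, 7, 10, 15];  e_b = [1, 1, 1, 0, 1]   # control arm
    KaplanMeierFitter().fit(t_a, event_observed=e_a)  # Kaplan-Meier survival estimate
    logrank_test(t_a, t_b, event_observed_A=e_a, event_observed_B=e_b)  # compares survival distributions
    df = pd.DataFrame({'time': t_a + t_b, 'event': e_a + e_b,
                       'treated': [1]*5 + [0]*5,
                       'age': [60, 54, 71, 65, 58, 62, 59, 68, 55, 63]})
    CoxPHFitter().fit(df, duration_col='time', event_col='event')  # covariate-adjusted hazard model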
ARR (absolute risk reduction)
the absolute difference in rates of an outcome between treatment and control groups in a clinical trial. Example: a hypothetical clinical trial compares the effect of a new statin and placebo on the incidence of stroke; over the course of the study, the incidence of stroke is 4% with the statin and 6% with placebo, so the absolute risk reduction with the statin is 2%.
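The same arithmetic as a one-line sketch (rates taken from the example above):
    risk_treated, risk_control = 0.04, 0.06
    arr = risk_control - risk_treated   # absolute risk reduction = 0.02, i.e. 2%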
bias
Bias: flaws in the design or operation of a study that lead to overestimation of the efficacy of treatment; bias can more easily be introduced into studies that are not blinded, and there are many different ways in which it can be introduced.
Publication bias: investigators tend not to publish studies with negative outcomes, which can lead to overestimation of efficacy in meta-analysis when studies with positive outcomes are overly represented.
Recall bias: people may remember things differently than how they occurred.
Selection bias: differences between treatment and control groups that result from the way patients were selected; randomization and blinding should help prevent selection bias.
Relative risk reduction
1 - RR
Relative risk
compares the risk of an event in individuals with a particular characteristic to the risk of that event in individuals without that characteristic. In a clinical trial, this would be the risk of the outcome in the treatment group divided by the risk of the outcome in the control group.
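Continuing the hypothetical statin example from the ARR card, a quick sketch of relative risk and relative risk reduction:
    risk_treated, risk_control = 0.04, 0.06
    rr  = risk_treated / risk_control   # relative risk ≈ 0.67
    rrr = 1 - rr                        # relative risk reduction ≈ 0.33, i.e. 33%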
Odds Ratio
the odds of exposure in cases divided by the odds of exposure in controls. It is analogous to relative risk but, unlike relative risk, it can be used in case-control studies.
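A minimal sketch of the odds-ratio arithmetic on a hypothetical case-control 2x2 table:
    a, b = 30, 70   # cases: exposed, unexposed
    c, d = 15, 85   # controls: exposed, unexposed
    odds_ratio = (a / b) / (c / d)   # equivalently (a*d)/(b*c) ≈ 2.43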
NNT/NNH
NNT (number needed to treat) is the reciprocal of the absolute risk reduction with drug treatment (1 divided by the absolute risk reduction); NNH (number needed to harm) is the corresponding reciprocal of the absolute risk increase for an adverse outcome. They have less potential to be misleading because they are based on absolute risk.
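Continuing the statin example, the NNT arithmetic as a sketch:
    arr = 0.06 - 0.04
    nnt = 1 / arr   # 50 patients treated to prevent one additional stroke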