Statistics Flashcards
Hypothesis-generating study designs
- observational
- survey
- case report/series
Hypothesis testing study designs
Experimental
- randomized
Observational
- cross-sectional
- case-control
- cohort
- other
Meta-analysis
Pooled data from multiple studies, including observational studies
Cross-sectional
Single point in time
Temporal trends
Cohort
Observation starts at the exposure, and participants are followed for outcomes
Estimates incidence/rate of exposures and outcomes
Case-control
Compares the frequency of exposure between patients who have and have not experienced the outcome of interest
Search for risk factors
Case-report
Highlight an unusual procedure or event
Continuous variable
Can take any value within a specified range of possibilities
Ex: Age, length of stay
Categorical variables
Have discrete values
Ex: binary (sex), ordinal (ordered categorical variables such as cancer stage), nominal (unordered categorical variables such as race)
Time-to-event variables
Two variables: a continuous variable that measures the time interval from an established start point (ex: date of diagnosis) to a failure event (ex: death), and a binary variable that indicates whether the failure event occurred
Ex: long-term survival
Measurement of continuous variables
Mean (for normally distributed data)
Median (for skewed data)
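A minimal sketch of why the median is preferred for skewed data. The length-of-stay values are hypothetical and chosen only to include one outlier.

```python
from statistics import mean, median

# Hypothetical lengths of stay in days; a single 30-day stay skews the data to the right.
length_of_stay = [2, 3, 3, 4, 5, 30]

avg = mean(length_of_stay)    # pulled upward by the 30-day stay
mid = median(length_of_stay)  # stays close to the typical patient

print(f"mean = {avg:.2f} days, median = {mid} days")
```

Here the mean lands above five of the six observations, while the median still describes a typical stay.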
Descriptive statistics for continuous variables
Unpaired t-test
Paired t-test
ANOVA
Multivariate regression model for continuous variables
Linear
Need 10-15 observations per variable
Measurement of categorical variables
Proportion
Descriptive statistics for categorical variables
Chi-squared test
Mantel-Haenszel odds ratio
Multivariate regression model for categorical variables
Logistic
Need at least 10 events and an equivalent number of nonevents per variable
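The rule of thumb above can be turned into quick arithmetic. This is a rough sketch: the function name and the example counts are hypothetical, and the rule is a heuristic rather than a formal sample-size calculation.

```python
def max_logistic_variables(events: int, nonevents: int, per_variable: int = 10) -> int:
    """Rough upper bound on model variables under the 10-events-per-variable rule."""
    # The less frequent of events and nonevents is the limiting count.
    limiting = min(events, nonevents)
    return limiting // per_variable

# Hypothetical study: 45 deaths among 500 patients.
print(max_logistic_variables(events=45, nonevents=455))  # 4
```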
Measurement for time-to-event variables
Kaplan-Meier
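A minimal sketch of the Kaplan-Meier estimate, written from scratch to show how the time variable and the binary event indicator are combined. The function name and the follow-up data are hypothetical; a real analysis would normally use a dedicated survival library.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    times  -- follow-up time for each patient
    events -- 1 if the failure event occurred at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        failures = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # failures plus censored at t
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up in months; 0 marks a censored patient.
times = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(f"month {t}: S(t) = {s:.3f}")
```

Censored patients do not lower the curve, but they shrink the number at risk for later time points.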
Descriptive statistics for time-to-event variables
Log-rank test
Multivariate regression model for time-to-event variables
Cox proportional hazards
Unpaired t-test
Compare 2 independent groups with continuous outcome variables
Paired t-test
Compare 2 dependent groups with continuous outcome variables
ANOVA
Compare more than 2 groups with continuous outcome variables
Chi-squared
Compare distributions of 2 or more groups with categorical outcome variables (sex, mortality)
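A minimal sketch of the Pearson chi-squared statistic for a 2x2 table, assuming no continuity correction. The function name and the counts are hypothetical. With 1 degree of freedom the p-value can be obtained from the standard library's `math.erfc`.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 df) for the table
                 outcome yes   outcome no
        group 1       a            b
        group 2       c            d
    """
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical mortality: 30 of 100 in group 1 versus 15 of 100 in group 2.
stat, p = chi_squared_2x2(30, 70, 15, 85)
print(f"chi-squared = {stat:.2f}, p = {p:.3f}")
```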
Fisher exact
Compare distributions of 2 or more groups with categorical outcome variables with small sample size
Log-rank test
Compare 2 groups with time-to-event outcome variables
Alpha (type 1) error
Observe a difference when one does not exist
False-positive
Beta (type 2) error
No difference is observed when one actually exists
False-negative
Insufficient power to detect true differences, directly related to sample size
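A small simulation can make the two error types concrete. This is a sketch under simplifying assumptions: both groups are normally distributed with a known SD of 1, a two-sided z-test is used instead of a t-test, and the function name, effect size, and sample sizes are hypothetical.

```python
import random
from statistics import NormalDist

def rejection_rate(true_diff, n_per_group, alpha=0.05, sims=2000, seed=0):
    """Fraction of simulated two-group studies in which p < alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    se = (2 / n_per_group) ** 0.5  # standard error of the difference in means (SD = 1)
    rejections = 0
    for _ in range(sims):
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        treated = [rng.gauss(true_diff, 1) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        p = 2 * (1 - std_normal.cdf(abs(diff / se)))
        if p < alpha:
            rejections += 1
    return rejections / sims

# No true difference: every rejection is a false positive (type 1 error).
alpha_rate = rejection_rate(true_diff=0.0, n_per_group=20)
# True difference of 0.5 SD: the rejection rate is the power (1 - beta).
power_small = rejection_rate(true_diff=0.5, n_per_group=20)
power_large = rejection_rate(true_diff=0.5, n_per_group=100)

print(f"type 1 error rate: {alpha_rate:.3f}")
print(f"power with n=20:   {power_small:.3f}")
print(f"power with n=100:  {power_large:.3f}")
```

The first rate should sit near the chosen alpha, and the power should rise with the larger sample, which is the link between type 2 error and sample size.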
Confidence interval (CI)
Differences between groups are reported as an estimated ratio or an absolute difference
Odds ratio/relative risk (ratio measures): if the CI includes 1, no statistically significant difference
Absolute difference (such as absolute risk reduction): if the CI includes 0, no statistically significant difference
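The reading rule can be sketched as a small check: ratio measures (odds ratio, relative risk) are compared against 1, and absolute differences against 0. The function name and the example intervals are hypothetical.

```python
def ci_excludes_null(ci_lower, ci_upper, measure):
    """True if the confidence interval excludes the no-difference value.

    measure -- "ratio" (odds ratio, relative risk) or "difference" (absolute difference)
    """
    if measure == "ratio":
        null_value = 1.0
    elif measure == "difference":
        null_value = 0.0
    else:
        raise ValueError("measure must be 'ratio' or 'difference'")
    return not (ci_lower <= null_value <= ci_upper)

print(ci_excludes_null(0.80, 1.90, "ratio"))       # False: interval includes 1
print(ci_excludes_null(0.02, 0.11, "difference"))  # True: interval excludes 0
```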
Wide confidence interval
Lack of precision
Tight confidence interval
Minimal uncertainty
PICOT framework to summarize research question
- Population in the study
- Independent variables (intervention/exposure, covariates)
- Comparator group, if applicable
- Outcome, end point (dependent variable)
- Time frame of outcomes assessment
Confounder
Measured or unmeasured variable associated with the exposure of interest and associated with the outcome
Generalizability
Ability to take research findings and apply them to clinical practice
- Is this reproducible in a clinical setting? In my patient population?
Bradford Hill criteria for causality
Strength of association
Consistency: do all or most studies indicate that A causes B?
Specificity
Temporality: if A causes B, then A must precede B. Just because A precedes B, A does not necessarily cause B.
Biological gradient (dose-response): the more a person is exposed to A, the more likely they will get disease B
Plausibility: there should be a reasonable biological mechanism to explain why A causes B
Coherence: should make sense with what we already know about A and B
Experiment
Analogy
Wilcoxon test
Used to compare an ordinal variable, such as satisfaction scores, between 2 paired samples (before and after treatment)
Kruskal-Wallis test
Used for ordinal data from 3 or more groups
Relative Risk Reduction formula
(Incidence in unexposed - incidence in exposed) / incidence in unexposed
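A worked instance of the formula, using hypothetical incidences of 20% in the unexposed group and 15% in the exposed group:

```python
def relative_risk_reduction(incidence_unexposed, incidence_exposed):
    """(Incidence in unexposed - incidence in exposed) / incidence in unexposed."""
    return (incidence_unexposed - incidence_exposed) / incidence_unexposed

# Hypothetical incidences: 20% unexposed, 15% exposed.
rrr = relative_risk_reduction(0.20, 0.15)
print(f"RRR = {rrr:.2f}")  # RRR = 0.25
```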
Risk ratio
Used in cohort studies and RCTs
Data are collected prospectively
Calculate incidences and incidence rates and compare these as risk ratios
Odds ratios
Used in case-control studies
Only the prevalence of exposure among cases and controls can be calculated, not incidence
Also used to summarize data from cohort studies and RCTs
True or False: PPV and NPV vary depending on the prevalence of disease
True
As disease prevalence increases, more people actually have the disease (increase in TP) and fewer people do not have the disease (decrease in TN)
Increase in TP signifies a higher PPV. Decrease in TN signifies a lower NPV
True or False: sensitivity and specificity vary depending on prevalence of a disease
False
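A minimal sketch of the two cards above: sensitivity and specificity are held fixed while prevalence changes, and PPV and NPV are recomputed. The function name and the 90% test characteristics are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    tp = sensitivity * prevalence              # true positives (as a proportion)
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

# Same hypothetical test (90% sensitive, 90% specific) at three prevalences.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence = {prevalence:.2f}  PPV = {ppv:.3f}  NPV = {npv:.3f}")
```

PPV climbs and NPV falls as prevalence rises, even though the test itself has not changed.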
Student’s t-test
Test for normally distributed continuous variables
Mann-Whitney test
Used for non-normally distributed continuous variables
Absolute Risk Reduction
Difference in rates between the control group and the experimental group
Incidence in unexposed - incidence in exposed
Relative Risk
Risk (probability) of disease when exposed to a certain agent relative to the risk of disease when not exposed to the same agent
Incidence in exposed / incidence in unexposed
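A short sketch that computes absolute risk reduction and relative risk from the same counts. The cohort numbers are hypothetical.

```python
def incidence(events, total):
    """Proportion of a group that develops the outcome."""
    return events / total

# Hypothetical cohort: 15 of 100 exposed and 20 of 100 unexposed develop the disease.
incidence_exposed = incidence(15, 100)
incidence_unexposed = incidence(20, 100)

arr = incidence_unexposed - incidence_exposed  # absolute risk reduction
rr = incidence_exposed / incidence_unexposed   # relative risk

print(f"ARR = {arr:.2f}, RR = {rr:.2f}")  # ARR = 0.05, RR = 0.75
```

With these numbers the relative risk reduction is 1 - RR = 0.25, so the three measures describe the same two incidences from different angles.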
Odds ratio
Measure of association between an exposure and an outcome
(Diseased in exposed / healthy in exposed) / (diseased in not exposed / healthy in not exposed)
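A worked instance of the odds ratio as the odds of disease in the exposed divided by the odds of disease in the not exposed. The 2x2 counts are hypothetical.

```python
def odds_ratio(diseased_exposed, healthy_exposed, diseased_unexposed, healthy_unexposed):
    """Odds of disease in the exposed divided by odds of disease in the not exposed."""
    odds_exposed = diseased_exposed / healthy_exposed
    odds_unexposed = diseased_unexposed / healthy_unexposed
    return odds_exposed / odds_unexposed

# Hypothetical 2x2 table: 15 diseased / 85 healthy exposed, 20 diseased / 80 healthy not exposed.
oratio = odds_ratio(15, 85, 20, 80)
print(f"OR = {oratio:.2f}")  # OR = 0.71
```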
Hazard ratio
Hazard rate in one exposure group relative to the hazard rate in another exposure group