Statistics and Study Design Flashcards
What are descriptive studies vs analytical studies?
Descriptive studies: describe characteristics of a population/phenomenon (descriptive surveys, case reports, cross-sectional studies)
Analytical studies: test hypotheses, determine associations (observational - survey, cohort, case-control vs experimental - RCT)
- Analytical studies employ inferential statistical tools
Data Types
- Nominal
- Ordinal
- Interval
- Ratio scale
Nominal - includes dichotomous/binary (pregnancy yes/no) and categorical
Ordinal - ranked (e.g. OHSS severity)
Interval - units with linear relationship to each other but NO absolute zero (e.g. temperature in Celsius or Fahrenheit)
Ratio scale - has a true (absolute) zero, so ratios between values are meaningful (e.g. Kelvin temperature scale, weight in lbs, age)
Line charts are useful for…
Bar charts are useful for…
Pie charts are useful for…
Scatter plots are useful for…
Line charts - Seeing trends over time
Bar charts - Categories
Pie charts - Showing parts of a whole
Scatter plots - showing the relationship between two variables and how the data are distributed
What defines a normal distribution?
Observations are independent with “thin” tails (few extremes)
Central limit theorem
Sums (or means) of many independent events will converge towards a normal distribution even if their underlying distributions are not normal
Define SD in a normal distribution
Within 1 SD of the mean - 68.3%
Within 2 SD of the mean - 95.4%
Within 3 SD of the mean - 99.7%
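A minimal sketch for checking these coverage figures, assuming a perfectly normal distribution. The helper name `coverage_within` is made up for illustration; it relies on the identity P(|Z| <= k) = erf(k / sqrt(2)).

```python
import math

def coverage_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of the mean."""
    # For a standard normal Z: P(|Z| <= k) = erf(k / sqrt(2))
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {coverage_within(k) * 100:.1f}%")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```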
Scope of data collection
- Census
- Sample or “study population”
Census - data compiled about the total population (e.g. SART)
Sample - the process of drawing a sample contains an element of randomness, so the sample may not truly represent the overall population
Law of large numbers
If the sample is large enough, the distribution of the sample (and its mean) approximates the distribution of the population from which it was derived
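A small simulation can make this concrete. This is only an illustrative sketch: the "population" is a fair six-sided die (true mean 3.5), and the function name `sample_mean` and the fixed seed are assumptions chosen for reproducibility.

```python
import random
from statistics import fmean

def sample_mean(n: int, seed: int = 0) -> float:
    """Mean of n simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return fmean(rng.randint(1, 6) for _ in range(n))

# The sample mean drifts toward the population mean (3.5) as n grows.
for n in (10, 1_000, 100_000):
    print(n, round(sample_mean(n), 3))
```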
Type 1 error vs Type 2 error
Type 1: null hypothesis is falsely rejected (false positive), alpha “arrogant”
Type 2: null hypothesis fails to be rejected and an actual relationship between populations is missed (false negative), beta “bashful”
Statistical calculations:
- Sensitivity
- Specificity
- PPV
- NPV
- Accuracy
- Precision
- ROC curve
SENSITIVITY (true positive rate): test’s ability to correctly identify those with the condition:
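- True positive (TP) / (TP + false negative (FN)), where TP + FN = everyone who truly has the condition
SPECIFICITY (true negative rate) = true negative (TN) / (TN + false positive (FP)), where TN + FP = everyone who truly lacks the condition
PPV: TP / (TP + FP), where TP + FP = total test positives; the probability that a positive test accurately indicates presence of the condition
NPV: TN / (TN + FN), where TN + FN = total test negatives; the probability that a negative test accurately indicates absence of the condition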
ACCURACY = the proportion of all test results that are correctly identified, both positive and negative. Represents the overall effectiveness of the test across all outcomes. Also, how close is the measured value to the true value/standard?
- (TP + TN) / total population
PRECISION = the proportion of positive identifications that were correct (i.e. PPV). Also, the consistency of repeated measurements, indicating how close the measurements are to each other, regardless of their proximity to the actual/real value.
ROC curve: (plot of true positive rate as a function of false positive rate)
- Random classifier is a straight diagonal line
- A better test is shifted up and to the left
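The formulas on this card can be sketched from the four cells of a 2x2 table. The function name `diagnostic_metrics` and the counts below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute test-performance measures from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # precision
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }

# Hypothetical screening test applied to 1000 people, 100 of whom have the condition.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example sensitivity is 0.90 but PPV is only 0.75, which shows how a low prevalence pulls PPV down even when the test itself performs well.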
Measurement error types
- Random (noise)
- Systematic (bias)
- Missing data (either random or bias)
- Censoring (how was data excluded from analysis?)
p-value
Probability of observing results at least as extreme as those obtained, assuming the null hypothesis is true
Lower p-value (<0.05 typically) indicates the observed data are unlikely under the null hypothesis, providing significant evidence against it
Statistical power
- Factors influencing?
- Relationship to type 2 error?
Probability that a test will correctly reject a false null hypothesis (detect a true effect)
Factors influencing power:
- Significance level (alpha)
- Sample size
- Effect size
- Variance
HIGH power REDUCES risk of type 2 error
POWER (typically 0.8) = 1- beta (beta=likelihood of type 2 error, typically 0.2)
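A rough sketch of how these factors interact, assuming a two-sided z-test comparing two group means with a known common SD and equal group sizes (a simplification; real power calculations usually use t-based or simulation methods). The function name `approx_power` is illustrative.

```python
import math
from statistics import NormalDist

def approx_power(effect: float, sd: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)        # critical value for the chosen alpha
    se = sd * math.sqrt(2 / n_per_group)     # SE of the difference in means
    # Ignores the negligible chance of rejecting in the wrong direction.
    return z.cdf(abs(effect) / se - z_crit)

power = approx_power(effect=0.5, sd=1.0, n_per_group=64)
print(f"power = {power:.2f}, beta = {1 - power:.2f}")
```

With these assumed inputs power comes out at about 0.81. Raising the sample size or effect size raises power; lowering alpha or increasing variance lowers it.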
Power calculation considerations
- For non-inferiority trials
- For superiority trials
Must be done a priori (prior to study), not post-hoc (after study)
Cannot do power calculation without prior info (effect size, variance) (e.g. pilot study)
Non-inferiority vs superiority design will affect power calculations:
- Non-inferiority trials are used in lieu of “equivalence” trials (proving exact equivalence would need infinite subjects): the non-inferiority margin is typically set as a pre-specified limit on the upper bound of the 95% CI for the difference in outcomes between groups
- Superiority trials are typically placebo trials and require fewer subjects
Standard deviation equation
Variance =
Standard error =
SD = square root [sum ((difference between each observation and the mean) ^2) / size of population] (divide by N - 1 instead when estimating from a sample)
Variance = SD ^2
Standard error = SD / square root (N)
*Appropriate for normal distributions, but not for skewed data
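The three formulas can be worked through on a small made-up data set. The population form (divide by N) follows the card; the sample SD (divide by N - 1) is shown alongside for comparison.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
n = len(data)
mean = statistics.fmean(data)

# Population SD: square root of the mean squared deviation from the mean.
sd_pop = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
variance = sd_pop ** 2
standard_error = sd_pop / math.sqrt(n)

# Sample SD divides by (n - 1) instead of n.
sd_sample = statistics.stdev(data)

print(mean, sd_pop, variance)
print(round(standard_error, 4), round(sd_sample, 4))
```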
What is a limitation of observational studies?
Higher risk of confounding variables
What is a limitation of experimental studies?
May not always replicate real-world conditions
Effect size
- Clinical vs statistical significance
- Cannot generally be estimated precisely unless population studied is very LARGE and statistical power is very HIGH
- Rough estimate, frequently represented as a 95% CI
- Overestimation of effect size is more likely than underestimation
Overestimation of effect size ___ as power decreases
increases
Confidence interval
Acknowledges that the mean of the sample population (considered a “point estimate”) will usually differ from the true mean of the population
Gives a range that is expected to contain the actual population mean at a stated confidence level (e.g. 95%)
Error can be reduced by increasing sample size
The higher the confidence level (99% vs 90%), the broader the interval
Width depends on the standard error (SD / square root (N)), so it is tied to both variance and sample size. Assumes the sampling distribution will approximate a normal distribution
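A minimal sketch of a confidence interval for a mean, assuming the normal approximation (mean ± z × standard error). The function name `mean_ci` and the data are hypothetical; for small samples a t critical value would be more appropriate than z.

```python
import math
from statistics import NormalDist, fmean, stdev

def mean_ci(sample, confidence: float = 0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    point_estimate = fmean(sample)
    se = stdev(sample) / math.sqrt(n)             # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%
    return point_estimate - z * se, point_estimate + z * se

sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # made-up measurements
for level in (0.90, 0.95, 0.99):
    low, high = mean_ci(sample, level)
    print(f"{int(level * 100)}% CI: {low:.2f} to {high:.2f}")
```

The printed intervals widen as the confidence level rises from 90% to 99%.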
Relative risk (risk ratio)
Probability of outcome in an exposed group vs UNexposed group
Measures the association between exposure and outcome
RR>1: exposure is associated with increased risk of the outcome
RR=1: no association
RR<1: exposure is a protective factor
Odds ratio
Quantifies strength of association between 2 events (A and B)
Ratio of the odds of A in the presence of B : odds of A in the absence of B
Which statistic is used in case-control studies?
Odds ratio
The __ approximates the relative risk when the outcome is rare
Odds ratio
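A short sketch comparing the two measures on a 2x2 table, using the conventional cell labels a, b, c, d (an assumption here, as are the counts). It shows the odds ratio tracking the relative risk when the outcome is rare and drifting away when it is common.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """a/b = exposed with/without outcome; c/d = unexposed with/without outcome."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds of the outcome in the exposed divided by odds in the unexposed."""
    return (a / b) / (c / d)

rare = (20, 980, 10, 990)       # outcome in 2% of exposed vs 1% of unexposed
common = (400, 600, 200, 800)   # outcome in 40% of exposed vs 20% of unexposed

print(relative_risk(*rare), round(odds_ratio(*rare), 3))      # RR 2.0, OR about 2.02
print(relative_risk(*common), round(odds_ratio(*common), 3))  # RR 2.0, OR about 2.667
```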
What are the advantages/disadvantages of applying parametric statistics?
Parametric statistics: assume data are drawn from a specific probability distribution (e.g. normal distribution)
ex: t-test, ANOVA, regression
Advantage: provides estimates of population parameters (mean, variance)
Disadvantage: unreliable if assumptions are violated
Disadvantage: not suitable for all types of data
t-test can be used to determine…
one-sample t-test
independent samples t-test
paired t-test
If the means of 2 sets of data are significantly different from each other
one-sample t-test: compares mean of a single sample to a known mean (or expected value) to see if the sample mean significantly differs from the known mean
independent samples t-test: compares the means of 2 independent groups to determine if there is a statistically significant difference between them
paired t-test aka dependent t-test: compares means of 2 dependent groups, e.g. before/after intervention
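The three t statistics can be sketched with the standard library alone. Assumptions: the independent-samples version below is Welch's form (it does not assume equal variances), the function names and data are made up, and only the test statistic is computed, not the p-value.

```python
import math
from statistics import fmean, stdev

def one_sample_t(sample, known_mean: float) -> float:
    """t = (sample mean - known mean) / standard error of the mean."""
    return (fmean(sample) - known_mean) / (stdev(sample) / math.sqrt(len(sample)))

def welch_t(x, y) -> float:
    """Independent-samples t statistic without assuming equal variances."""
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (fmean(x) - fmean(y)) / se

def paired_t(before, after) -> float:
    """Paired t-test is a one-sample t-test on the within-pair differences."""
    diffs = [a - b for b, a in zip(before, after)]
    return one_sample_t(diffs, 0.0)

before = [10, 12, 9, 11, 13]   # hypothetical measurements before an intervention
after = [12, 14, 10, 13, 16]   # same subjects afterwards

print(round(paired_t(before, after), 3))  # treats the data as paired
print(round(welch_t(after, before), 3))   # treats the same data as independent
```

On the same numbers the paired statistic is much larger than the independent one, because pairing removes between-subject variability. For p-values, SciPy's `scipy.stats.ttest_1samp`, `ttest_ind`, and `ttest_rel` cover the same three cases.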
What is a t-distribution?
When is it used?
Used when the population SD is UNKNOWN and estimated from the sample
Similar to normal distribution, but with fatter tails. Exact shape depends on sample size and degrees of freedom
Approaches the normal distribution as sample size grows (roughly > 30 observations)
z statistic
- What is it?
- When can it be used?
- What is it used for?
Measures the number of SD a data point or sample mean is from the population mean
Population distribution is normal
SD should be known
(if unknown, use t-statistic)
Used in statistical hypothesis testing and CI estimation
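A brief sketch of the z statistic for a single data point and for a sample mean, plus the matching two-sided p-value. It assumes a normal population with known SD; the function names and numbers are illustrative.

```python
import math
from statistics import NormalDist

def z_score(x: float, pop_mean: float, pop_sd: float) -> float:
    """Number of SDs a single observation lies from the population mean."""
    return (x - pop_mean) / pop_sd

def z_for_sample_mean(sample_mean: float, pop_mean: float, pop_sd: float, n: int) -> float:
    """z statistic for a sample mean: divide by the standard error, not the SD."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

z = z_for_sample_mean(sample_mean=103, pop_mean=100, pop_sd=15, n=100)
print(z, round(two_sided_p(z), 4))  # 2.0 0.0455
```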
What is non-parametric statistics?
Branch of statistics not solely based on population parameters (mean, variance)
Non-parametric statistics: either distribution-free or based on a distribution with unspecified parameters (used for non-normal distributions, or when parametric assumptions are violated)
Includes descriptive statistics, statistical inference (categorical, ordinal data)
Non-parametric tests
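- Wilcoxon rank-sum test (2 independent groups)
- Wilcoxon signed-rank test (paired data)
- Mann-Whitney U test (equivalent to the Wilcoxon rank-sum test)
- Kruskal-Wallis test (3 or more independent groups)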
Chi-square test
Fisher’s exact test
Compares categorical variables
Fisher’s exact test used with small sample sizes, 2x2, uses OR
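Both tests can be sketched for a 2x2 table with the standard library. Assumptions: the chi-square statistic is computed without continuity correction, the Fisher p-value is two-sided (summing all tables no more probable than the observed one), and the function names and counts are made up.

```python
from math import comb

def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    rows, cols = [a + b, c + d], [a + c, b + d]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat

def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value from the hypergeometric distribution."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(k: int) -> float:
        # Probability of a table with k in the top-left cell, margins fixed.
        return comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)

    p_obs = prob(a)
    k_min, k_max = max(0, c1 - r2), min(r1, c1)
    # Small tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9))

print(chi_square_2x2(3, 1, 1, 3))              # 2.0
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))  # 0.4857
```

With expected counts this small (all equal to 2), the chi-square approximation is unreliable, which is the situation where Fisher's exact test is preferred.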
What test is used for multiple comparison testing?
ANOVA (parametric): comparing multiple groups to each other
Only indicates that a significant difference between groups exists (or does not exist); does not reveal which groups differ
What is considered “significant” correlation?
Rule of thumb: correlation coefficient > 0.3 (whether it is statistically significant also depends on sample size)
Regression
Examines relationship between 1+ independent variable & dependent variable (how y changes based on x)
Best fit line/curve
Simple linear - single independent and single dependent variable
Multiple linear - multiple independent variables
Non-linear (includes logistic)
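Simple linear regression can be sketched as an ordinary least-squares fit of a best-fit line. The function name `fit_line` and the data points are hypothetical.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]              # independent variable
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # dependent variable
slope, intercept = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2))  # 1.99 0.05
```

The slope reads as the expected change in y for a one-unit change in x.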
Logistic regression
A type of non-linear regression
Dependent variable is categorical/binary
“Adjusted” odds ratio (adjusts for specific confounding variables)
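A small sketch of how a fitted logistic regression coefficient relates to an odds ratio: the odds ratio for a one-unit increase in a predictor is exp(coefficient). The coefficient values below are invented for illustration, not fitted from data.

```python
import math

def odds_ratio_from_coefficient(beta: float) -> float:
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(beta)

def predicted_probability(intercept: float, beta: float, x: float) -> float:
    """Logistic model: probability of the binary outcome at predictor value x."""
    return 1 / (1 + math.exp(-(intercept + beta * x)))

intercept, beta = -2.0, 0.693   # hypothetical fitted values
print(round(odds_ratio_from_coefficient(beta), 2))  # 2.0

# Cross-check: the ratio of odds at x = 1 vs x = 0 equals exp(beta).
p0 = predicted_probability(intercept, beta, 0)
p1 = predicted_probability(intercept, beta, 1)
print(round((p1 / (1 - p1)) / (p0 / (1 - p0)), 2))
```

When the model includes other covariates, each coefficient's odds ratio is "adjusted" for those covariates.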