Statistics Flashcards
Sensitivity
Ability of a test to correctly identify patients with the disease
True positive rate =
True positives / (True positives + False negatives)
SnNout: in a highly Sensitive test, a Negative result rules out the disease
Specificity
Ability of a test to correctly identify patients without the disease
True negative rate =
True negatives / (True negatives + False positives)
SpPin: in a highly Specific test, a Positive result rules in the disease
Positive predictive value
Probability that a person who tests positive actually has the disease
PPV = True positives / (True positives + False positives)
Negative predictive value
Probability that a person who tests negative does not have the disease
NPV = True negatives / (True negatives + False negatives)
Likelihood Ratio
How much a positive test result raises the odds that the patient has the disease (positive likelihood ratio)
LR+ = Sensitivity / (1 - Specificity)
Accuracy
(True positives + True negatives) / Total number tested
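The diagnostic test formulas above can be checked against a single 2x2 table. This is a minimal sketch; the function name and the counts are hypothetical and chosen only for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return common diagnostic test measures from 2x2 table counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),            # positive predictive value
        "npv": tn / (tn + fn),            # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts: 100 diseased and 900 healthy people tested.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on how common the disease is in the tested group, whereas sensitivity and specificity do not.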
Null hypothesis
Any difference between study groups is due to chance, i.e. there is no true difference
Alternate Hypothesis
Two study groups have a true difference
Type 1 error (alpha)
False positive
Probability of finding an effect that is not really there (rejecting a true null hypothesis)
Type 2 error (Beta)
False negative
Probability of missing an effect that is really there
Power
Ability to detect a true difference in outcome between two arms
Probability a type II error will not occur
Power = 1 - Beta
* Beta is conventionally set at 0.2, on the premise that a type 1 error is 4 x as serious as a type 2 error: beta = 4 x alpha (4 x 0.05 = 0.2)
Effect size
The quantitative measure of the magnitude of the difference between groups
P-value
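Probability of obtaining results at least as extreme as those observed, given that the null hypothesis is true
<0.05 is conventionally taken as statistically significant: a result this extreme would arise by chance less than 1 time in 20 if the null hypothesis were true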
How to calculate sample size?
- acceptable level of significance
- power of the study
- expected effect size
- underlying event rate in population (prevalence)
- standard deviation in population
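As an illustration of how these inputs combine, here is a sketch of one common textbook formula for comparing two means with equal group sizes and a two-sided test: n per group = 2 x ((z for alpha/2 + z for power) x SD / effect size)^2. The function name and the numbers are assumptions for illustration, and comparing proportions (where the event rate matters) uses a different formula.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(alpha, power, effect, sd):
    """Approximate subjects needed per group to compare two means.

    Assumes a two-sided test, equal group sizes, a common standard
    deviation, and a normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance level
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return ceil(n)  # round up to whole subjects


# Hypothetical inputs: detect a 5-unit difference when the SD is 10 units.
n_per_group = sample_size_two_means(alpha=0.05, power=0.80, effect=5.0, sd=10.0)
print(n_per_group)
```

A smaller expected effect size or a larger standard deviation increases the required sample size.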
Prevalence
Proportion of population with disease at a given time point
= number of existing cases / population
Incidence
Rate of occurrence of new cases over a period of time
= number of new cases (in a given period of time) / population at risk
Absolute risk
Incidence of the outcome in a group
= number of events during follow-up / number of persons event-free at the start
Relative risk
= Absolute risk in exposed group / Absolute risk in control group
Absolute risk reduction
Change in risk of outcome after intervention
= AR of control group - AR of experimental group
Relative Risk Reduction
= Absolute risk reduction / Absolute risk of control group
Number needed to treat
Number of subjects who must be treated for one extra person to experience the benefit
= 1 / Absolute risk reduction
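The risk measures above can be worked through on one hypothetical trial. This is a minimal sketch; the function name and the event counts are made up for illustration.

```python
def risk_measures(events_control, n_control, events_treated, n_treated):
    """Compute risk measures for a two-arm trial from event counts."""
    ar_control = events_control / n_control   # absolute risk, control arm
    ar_treated = events_treated / n_treated   # absolute risk, treated arm
    arr = ar_control - ar_treated             # absolute risk reduction
    return {
        "ar_control": ar_control,
        "ar_treated": ar_treated,
        "relative_risk": ar_treated / ar_control,
        "arr": arr,
        "rrr": arr / ar_control,              # relative risk reduction
        "nnt": 1 / arr,                       # number needed to treat
    }


# Hypothetical trial: 20/100 events in the control arm, 10/100 with treatment.
trial = risk_measures(events_control=20, n_control=100,
                      events_treated=10, n_treated=100)
for name, value in trial.items():
    print(f"{name}: {value:.2f}")
```

In practice the number needed to treat is usually rounded up to the next whole person.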
Odds ratio
Odds = Probability of event / Probability of non-event
Odds ratio = Odds of event in exposed group / Odds of event in unexposed group
Used in cohort or case-control studies
When does Odds ratio = Relative Risk?
When the incidence of the disease is very small (the odds ratio then approximates the relative risk)
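A quick numerical check of why the odds ratio approximates the relative risk only for rare outcomes. The function name and both 2x2 tables are hypothetical.

```python
def odds_ratio_and_relative_risk(a, b, c, d):
    """Return (odds ratio, relative risk) from a 2x2 table.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    """
    relative_risk = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a / b) / (c / d)
    return odds_ratio, relative_risk


# Rare outcome: risks of 0.2% vs 0.1%.
rare_or, rare_rr = odds_ratio_and_relative_risk(a=2, b=998, c=1, d=999)
# Common outcome: risks of 40% vs 20%.
common_or, common_rr = odds_ratio_and_relative_risk(a=400, b=600, c=200, d=800)

print(f"rare:   OR={rare_or:.3f}  RR={rare_rr:.3f}")
print(f"common: OR={common_or:.3f}  RR={common_rr:.3f}")
```

Both tables have the same relative risk, but the odds ratio stays close to it only in the rare-outcome table.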
What is a ROC curve?
Receiver operating characteristic curve: a graphical plot showing the diagnostic ability of a binary classifier as its decision threshold is varied
What are the x and y axes of a ROC curve?
X: false positive rate (1- specificity)
Y: true positive rate (sensitivity)
How to interpret a ROC curve?
The better the test, the closer its curve lies to the upper left corner of the graph
What is AUC?
Area under the curve: used to summarize the performance of the test across all thresholds
>0.8 = good discrimination
<0.6 = poor discrimination
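One way to see what the AUC measures: it equals the probability that a randomly chosen diseased patient has a higher test score than a randomly chosen healthy one, with ties counted as half. The sketch below computes it by comparing every pair; the function name and scores are hypothetical, and a statistics library would normally be used for real data.

```python
def auc(scores_diseased, scores_healthy):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0   # diseased patient correctly ranked higher
            elif d == h:
                wins += 0.5   # ties count as half
    return wins / (len(scores_diseased) * len(scores_healthy))


# Hypothetical test scores (higher = more suggestive of disease).
diseased = [0.9, 0.8, 0.7, 0.4]
healthy = [0.6, 0.3, 0.2, 0.1]
area = auc(diseased, healthy)
print(f"AUC = {area:.4f}")
```

An AUC of 0.5 corresponds to a test that ranks patients no better than chance.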
What are types of statistical data?
Numerical
Categorical
Ordinal
What are types of numerical data?
- Discrete: can be counted
- Continuous: cannot be counted, only measured; can take any value within an interval of the real number line
What is categorical data?
Qualitative data
What is ordinal data?
Categorical data in which the order of the categories is meaningful
Types of qualitative data
- Nominal
- Ordinal
What are the statistical tests to analyze significance of categorical data?
- Unpaired:
  - Large sample: Chi-square test
  - Small sample: Fisher’s exact test
- Paired: McNemar’s test
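For the unpaired, large-sample case, the chi-square statistic compares observed counts with the counts expected if the two variables were independent. The sketch below is for a 2x2 table only, without continuity correction; the function name and the counts are hypothetical. A common rule of thumb, stated here as an assumption rather than a fixed rule, is to prefer Fisher’s exact test when any expected count is below 5.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Chi-square test of independence for a 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value, smallest_expected_count).
    No continuity correction is applied.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    statistic = 0.0
    min_expected = float("inf")
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            statistic += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value, min_expected


# Hypothetical data: 30/100 events in group A vs 15/100 in group B.
stat, p, smallest = chi_square_2x2(a=30, b=70, c=15, d=85)
print(f"chi-square = {stat:.3f}, p = {p:.4f}, smallest expected = {smallest}")
```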
What is parametric data?
Continuous data that is normally distributed (Gaussian)
How to test whether a set of data is parametric?
- Visualization
- Skewness
- Formal test for normality
What are the statistical tests used to compare means for parametric data?
- one sample: one sample t-test
- two groups: t-test (unpaired or paired)
- more than two groups: ANOVA
What are the statistical tests used to compare groups for non-parametric data?
- one sample: Wilcoxon signed-rank test
- two groups: Mann-Whitney U test
- more than two groups:
  - unpaired: Kruskal-Wallis test
  - paired: Friedman’s test
List the nine Hill’s Criteria for causality
- Strength
- Consistency
- Specificity
- Temporality
- Biological gradient
- Plausibility
- Coherence
- Experimental evidence
- Analogy