Biostatistics Flashcards
Collects data from a group of people to assess frequency of disease (and related risk factors) at a particular point in time
-asks “What is happening?”
Cross-sectional study
A cross-sectional study can show risk factor association with disease, but does not establish
Causality
Compares a group of people with a disease to a group of people without a disease
- Looks for prior exposure or risk factor
- Asks “What happened?”
Case-control study
Case-control studies measure
Odds ratio
Compares a group with a given exposure or risk factor to a group without such exposure.
-Looks to see if exposure affects the likelihood of disease
Cohort study
Asks “Who will develop the disease?”
Prospective cohort
Asks “Who developed the disease? the exposed or unexposed?”
Retrospective cohort
What type of study would say “Patients with COPD had higher odds of a history of smoking than those without COPD”?
Case-control Study
What type of study would say “Smokers had a higher risk of developing COPD than nonsmokers”?
Cohort Study
Compares the frequency with which both monozygotic or both dizygotic twins develop the same disease
-Measures heritability and influence of environmental factors
Twin concordance study
Compares siblings raised by biological vs. adoptive parents
-measures heritability and influence of environmental factors
Adoption study
Experimental study involving humans. Compares therapeutic benefits of two or more treatments or of treatment and placebo
-Study quality improves with double blinding and randomization
Clinical Trial
Refers to the additional blinding of the researchers analyzing the data
Triple-blinding
Requires a small number of healthy volunteers and assesses safety, toxicity, pharmacokinetics, and pharmacodynamics
Phase I trial
Requires a small number of patients with the disease of interest and assesses treatment efficacy, optimal dosing, and adverse effects
Phase II trial
Requires a large number of patients randomly assigned either to the treatment under investigation or to the best available treatment (or placebo)
-compares new treatment to current standard of care
Phase III trial
Requires postmarketing surveillance of patients after treatment is approved
- detects rare or long-term adverse effects
- can result in treatment being withdrawn from market
Phase IV trials
Sensitivity and specificity are fixed properties of a test, whereas PPV and NPV vary depending on
Disease prevalence
Proportion of all people with the disease who test positive, or the probability that when the disease is present, the test is positive
Sensitivity
Highly sensitive tests are used for screening in diseases with
Low prevalence
Desirable for ruling out disease and indicates a low false-negative rate
Highly sensitive test
Proportion of all people without the disease who will test negative for the disease
Specificity
Desirable for ruling in a disease and indicates a low false-positive rate
Highly specific test
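How can we remember what sensitive and specific tests are used for?
1.) Sensitive: SN-N-OUT (a highly SeNsitive test, when Negative, rules OUT disease)
2.) Specific: SP-P-IN (a highly SPecific test, when Positive, rules IN disease)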
Used for confirmation after a positive screening test
Highly specific test
Proportion of positive test results that are true positives
Positive predictive value (PPV)
Proportion of negative results that are true negatives
Negative predictive value (NPV)
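The four definitions above all come from the same 2x2 table of test results versus true disease status. A minimal Python sketch, assuming hypothetical counts (the function name and the numbers are made up for illustration):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 counts.

    tp = true positives, fp = false positives,
    fn = false negatives, tn = true negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives among all with the disease
        "specificity": tn / (tn + fp),  # negatives among all without the disease
        "ppv": tp / (tp + fp),          # true positives among all positive tests
        "npv": tn / (tn + fn),          # true negatives among all negative tests
    }


# Hypothetical sample: 100 people with the disease, 900 without.
metrics = diagnostic_metrics(tp=90, fp=90, fn=10, tn=810)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and specificity are read down the "disease" columns, while PPV and NPV are read across the "test result" rows.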
High prevalence means what for PPV and NPV?
High prevalence = High PPV and Low NPV
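To see why prevalence moves PPV and NPV while sensitivity and specificity stay fixed, the sketch below holds a hypothetical test at 90% sensitivity and 90% specificity and varies only the prevalence. The function name and all numbers are assumptions for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence              # fraction of population: true positives
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    return tp / (tp + fp), tn / (tn + fn)


# Same hypothetical test, three different populations.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

As prevalence rises, PPV climbs and NPV falls, even though the test itself has not changed.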
Looks at new cases
Incidence
Looks at all current cases
Prevalence
What is the incidence rate?
# of new cases / # of people at risk (during a specified time period)
What is the prevalence?
# of existing cases / Total # of people
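A short worked example of the two ratios, as a sketch with made-up numbers: a hypothetical town of 10,000 people, 500 of whom already have the disease at the start of the year, with 95 new cases over the year.

```python
def incidence_rate(new_cases, people_at_risk):
    """New cases divided by the number of people at risk during the period."""
    return new_cases / people_at_risk


def prevalence(existing_cases, total_people):
    """Existing cases divided by the total population at a point in time."""
    return existing_cases / total_people


total_people = 10_000
existing_cases = 500
# People who already have the disease are not at risk of becoming a new case.
people_at_risk = total_people - existing_cases
new_cases = 95

print(f"incidence:  {incidence_rate(new_cases, people_at_risk):.3f}")
print(f"prevalence: {prevalence(existing_cases, total_people):.3f}")
```

The denominators differ: incidence counts only people who could still become a case, while prevalence counts everyone.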
Prevalence is greater than incidence for
Chronic diseases
Odds ratios are typically used in
Case-control studies
Relative risks are typically used in
Cohort studies
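Both measures can be computed from the same 2x2 table of exposure versus disease. The sketch below uses the conventional cell labels a, b, c, d and hypothetical counts; neither appears in the cards.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a = exposed with disease,   b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease.
    """
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """Risk of disease in the exposed divided by risk in the unexposed."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Hypothetical counts: 100 exposed and 100 unexposed people.
a, b, c, d = 30, 70, 10, 90
print(f"OR = {odds_ratio(a, b, c, d):.2f}")
print(f"RR = {relative_risk(a, b, c, d):.2f}")
```

An OR or RR of 1 means no association. When the disease is rare, the OR approximates the RR.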
The difference in risk between exposed and unexposed groups, or the proportion of disease occurrences that are attributable to the exposure
Attributable risk
The proportion of risk reduction attributable to the intervention as compared to a control
Relative risk reduction (RRR)
What is the formula for Relative risk reduction?
1-RR
The difference in risk attributable to the intervention as compared to a control
Absolute Risk Reduction (ARR)
Number of patients who need to be treated for 1 patient to benefit
Number needed to treat (NNT)
Number of patients who need to be exposed to a risk factor for 1 patient to be harmed
Number needed to harm (NNH)
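The risk measures above are all derived from two event rates. The sketch below assumes the standard formulas NNT = 1/ARR and NNH = 1/attributable risk, which the cards do not state explicitly. The function names and event rates are hypothetical.

```python
def treatment_measures(control_risk, treatment_risk):
    """Return RR, RRR, ARR, and NNT for an intervention that lowers risk."""
    rr = treatment_risk / control_risk
    arr = control_risk - treatment_risk  # absolute risk reduction
    return {
        "rr": rr,
        "rrr": 1 - rr,    # relative risk reduction
        "arr": arr,
        "nnt": 1 / arr,   # number needed to treat
    }


def number_needed_to_harm(exposed_risk, unexposed_risk):
    """NNH = 1 / attributable risk."""
    attributable_risk = exposed_risk - unexposed_risk
    return 1 / attributable_risk


# Hypothetical trial: 8% of controls and 4% of treated patients have the event.
for name, value in treatment_measures(0.08, 0.04).items():
    print(f"{name}: {value:.2f}")

# Hypothetical exposure: 21% risk if exposed, 1% if unexposed.
print(f"nnh: {number_needed_to_harm(0.21, 0.01):.2f}")
```

A large RRR can hide a small ARR when the baseline risk is low, which is why NNT is built on the absolute difference.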
The higher the precision, the higher the
Statistical power (1-β)
Random error decreases the
Precision
Systematic error decreases the
Accuracy
Error in assigning subjects to a study group resulting in an unrepresentative sample
-most commonly a sampling bias
Selection bias
What are three types of selection bias?
Berkson bias, healthy worker effect, and non-response bias
When the study population selected from the hospital is less healthy than the general population
Berkson bias
When the study population is healthier than the general population
Healthy worker effect
When the participating subjects differ from nonrespondents in meaningful ways
Non-response bias
Awareness of disorder alters recall by subjects
-common in retrospective studies
Recall bias
When patients with a disease recall exposure after learning of similar cases
Recall bias
When information is gathered in a systematically distorted manner
Measurement bias
When subjects in different groups are not treated the same
Procedure bias
When the researcher’s belief in the efficacy of a treatment changes the outcome of that treatment (aka Pygmalion effect)
Observer-expectancy bias
When a factor is related to both the exposure and the outcome but not on the causal pathway
-occurs when multiple factors distort or confuse effect of exposure on outcome
Confounding
When early detection is confused with increased survival
Lead-time bias
What are the three measures of central tendency?
- Mean
- Median
- Mode
What are the measures of dispersion?
- Range
- Variance
- Standard deviation
- Standard Error
- Confidence intervals
What is most affected by outliers? Least affected?
1. Most = mean
2. Least = mode
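A quick way to check this is to add one extreme value to a small data set and recompute the three measures. The data below are hypothetical.

```python
import statistics

values = [2, 3, 3, 4, 5]        # hypothetical data
with_outlier = values + [100]   # same data plus one extreme value

for label, data in (("original", values), ("with outlier", with_outlier)):
    print(
        f"{label}: mean={statistics.mean(data)}, "
        f"median={statistics.median(data)}, "
        f"mode={statistics.mode(data)}"
    )
```

The mean jumps from 3.4 to 19.5, the median shifts only from 3 to 3.5, and the mode stays at 3.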
How much variability exists from the mean in a set of values
Standard deviation
An estimate of how much variability exists between the sample mean and the true population mean
Standard error of the mean
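The two quantities are linked by the sample size. The sketch below assumes the standard relationship SEM = SD / sqrt(n), which the cards do not spell out, and uses hypothetical data.

```python
import math
import statistics

values = [4, 8, 6, 5, 3, 7, 9, 6]  # hypothetical sample

sd = statistics.stdev(values)        # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(len(values))    # standard error of the mean

print(f"mean = {statistics.mean(values)}")
print(f"SD   = {sd:.3f}")
print(f"SEM  = {sem:.3f}")
```

Because n is in the denominator, the SEM shrinks as the sample grows, while the SD describes the spread of the values themselves.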
What are three types of non-normal distributions?
- bimodal
- positive skew
- negative skew
Suggests two different populations
bimodal data
What are the characteristics of a positive skew?
Mean > median > mode
What are the characteristics of a negative skew?
Mean < median < mode
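The ordering of mean, median, and mode under skew can be checked on small hypothetical data sets. The second list below is simply the mirror image of the first.

```python
import statistics

# Hypothetical data with a long right tail (positive skew).
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image of the same data: long left tail (negative skew).
negative_skew = [16 - x for x in positive_skew]

for label, data in (("positive skew", positive_skew), ("negative skew", negative_skew)):
    print(
        f"{label}: mean={statistics.mean(data)}, "
        f"median={statistics.median(data)}, "
        f"mode={statistics.mode(data)}"
    )
```

In both cases the mean is pulled furthest toward the tail, with the median sitting between the mean and the mode.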
Says there is no association between the disease and the risk factor
Null hypothesis
Says there is some association between the disease and the risk factor
Alternative hypothesis
Stating that there is an effect or difference when none actually exists
Type I (α) error (i.e. false positive)
Stating that there is no effect or difference when one actually exists
Type II (β) error (i.e. false negative)
There is less than a 5% chance that the data will show something that does not actually exist if
P is less than 0.05
The probability of making a type II error
β
The probability of rejecting the null hypothesis when it is false
Power (1-β)
Ranges of values within which the true mean of the population is expected to fall, with a specific probability
Confidence interval
What is the z value for a
1. 95% CI
2. 99% CI
1. z = 1.96
2. z = 2.58
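These z values are used to build a confidence interval around a sample mean. The sketch below assumes the standard formula CI = mean ± z × SEM, which the cards do not state, and uses hypothetical data. For small samples a t value is normally used in place of z.

```python
import math
import statistics


def confidence_interval(values, z=1.96):
    """Return (lower, upper) bounds of mean ± z * SEM."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean - z * sem, mean + z * sem


values = [4, 8, 6, 5, 3, 7, 9, 6]  # hypothetical sample

low95, high95 = confidence_interval(values, z=1.96)
low99, high99 = confidence_interval(values, z=2.58)
print(f"95% CI: {low95:.2f} to {high95:.2f}")
print(f"99% CI: {low99:.2f} to {high99:.2f}")
```

The 99% interval is wider than the 95% interval: more confidence costs precision.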
There is no significant difference if the 95% confidence interval for a mean difference between 2 variables includes
Zero
We do not reject the null hypothesis if the 95% CI for OR or RR includes
1
If the CIs between two groups do not overlap, then
Statistically significant difference exists
If the CIs between two groups overlap, then
Usually no significant difference exists
Checks differences between means of 2 groups
t-test
Checks differences between means of 3 or more groups
Analysis of Variance (ANOVA)
Checks differences between 2 or more percentages or proportions of categorical outcomes (not mean values)
Chi-squared
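As an illustration of comparing proportions, the sketch below computes the chi-squared statistic for a 2x2 table by hand. It assumes the standard formula (sum of (observed − expected)² / expected, without a continuity correction), which the cards do not give. The counts and function name are hypothetical.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows = exposed / unexposed, columns = disease / no disease.
observed = [[30, 70], [10, 90]]
chi2 = chi_squared_statistic(observed)
print(f"chi-squared = {chi2:.2f}")

# For a 2x2 table (1 degree of freedom), 3.84 is the critical value at p = 0.05.
print("significant at 0.05" if chi2 > 3.84 else "not significant at 0.05")
```

Here 30% versus 10% with 100 people per group gives a statistic well above the critical value, so the two proportions differ significantly.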
Is always between +1 and -1. The closer the absolute value is to 1, the stronger the linear correlation between the two variables
Pearson correlation coefficient (r)
What is the coefficient of determination?
r^2
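A minimal sketch of computing r and r^2 by hand, using the standard Pearson formula (covariance divided by the product of the standard deviations). The function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]   # hypothetical paired measurements
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(f"r   = {r:.3f}")
print(f"r^2 = {r ** 2:.3f}")
```

r^2 is the proportion of the variance in one variable that is explained by its linear relationship with the other.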