Biostatistics Flashcards
distribution terms
mean
median
mode
skew
mean
average value of a dataset
calculated by summing all values and dividing by the number of values
mean limitations
misleading in skewed distributions or distributions with outliers
median
middle value when a dataset is ordered from lowest to highest
when is median ideal
skewed distributions as it is not influenced by outliers
mode
the value that occurs most frequently in a dataset
ideal for categorical (nominal) data, where the mean and median are not defined
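worked example: mean vs median vs mode with an outlier
A minimal sketch of how one extreme value moves the mean but barely moves the median or mode. The length-of-stay numbers are invented for illustration and do not come from any dataset.

```python
from statistics import mean, median, mode

# Hypothetical hospital length-of-stay values in days (made up for illustration).
stays = [2, 3, 3, 4, 5]
stays_with_outlier = stays + [60]  # one extreme value added

for label, data in [("no outlier", stays), ("with outlier", stays_with_outlier)]:
    # The mean uses every value, so it is pulled toward the outlier.
    # The median and mode depend on order and frequency, so they barely change.
    print(f"{label}: mean={mean(data):.2f}, median={median(data)}, mode={mode(data)}")
```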
skew
describes asymmetry in a distribution
positive skew
the right tail (higher values) is longer
many low values and a few extremely high values
mean > median > mode
negative skew
left tail (lower values) is longer
many high values and a few extremely low values
mean < median < mode
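worked example: ordering of mean, median, and mode under skew
A rough sketch, assuming two small invented datasets that mirror each other. The helper skew_direction is hypothetical and only compares the mean with the median; it is a rule of thumb, not a formal skewness coefficient, and the ordering can fail for some distributions.

```python
from statistics import mean, median, mode

def skew_direction(values):
    """Rule-of-thumb check (hypothetical helper): compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "positive"
    if m < med:
        return "negative"
    return "roughly symmetric"

# Invented data: many low values and one very high value (long right tail).
right_tailed = [1, 1, 1, 2, 2, 3, 4, 10]
# Mirror image: many high values and one very low value (long left tail).
left_tailed = [11 - x for x in right_tailed]

for name, data in [("right_tailed", right_tailed), ("left_tailed", left_tailed)]:
    print(name, skew_direction(data), mean(data), median(data), mode(data))
```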
incidence
number of new cases of a condition arising in a population at risk during a given period
useful for assessing risk and evaluating interventions aimed at preventing disease
prevalence
total disease cases (new + pre-existing) in a population at one point in time divided by the total population
useful for planning health resource allocation and understanding disease burden
impacted by disease duration and survival rates (longer duration or better survival raises prevalence)
point prevalence
percentage of people with the condition at one specific point in time
better reflects the burden of chronic conditions
lifetime prevalence
percent of individuals that ever had the condition at some point in their life
higher than point prevalence for chronic conditions
sensitive to survivorship and disease duration
key differences incidence vs prevalence
incidence assesses new case development over time
prevalence assesses existing disease cases at one time point
incidence excludes pre-existing cases, prevalence includes them
incidence assesses risk, while prevalence assesses burden
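worked example: incidence vs prevalence
A minimal sketch with invented counts. It assumes incidence is expressed as a proportion (new cases divided by people at risk at the start of the period), which is one common convention; the function names are hypothetical.

```python
def point_prevalence(existing_cases, population):
    """All current cases (new + pre-existing) divided by the total population."""
    return existing_cases / population

def cumulative_incidence(new_cases, population_at_risk):
    """New cases during the period divided by people at risk at its start."""
    return new_cases / population_at_risk

# Hypothetical town: 10,000 people, 200 already have the condition on day 1,
# and 50 new cases are diagnosed over the following year.
population = 10_000
existing_cases = 200
new_cases = 50

prevalence = point_prevalence(existing_cases, population)
# People who already have the condition are not at risk of becoming new cases.
incidence = cumulative_incidence(new_cases, population - existing_cases)

print(f"point prevalence on day 1: {prevalence:.2%}")
print(f"one-year cumulative incidence: {incidence:.2%}")
```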
sensitivity
proportion of people with the disease who test positive on the assessment
conceptualized as the true positive rate
sensitivity formula
sensitivity = true positives / (true positives + false negatives)
high sensitivity
correctly identifies a high proportion of people who actually have the disease (few false negatives)
sensitivity example
a Lyme disease screening test with 95% sensitivity would correctly identify 95% of people with Lyme disease
specificity
defined as the proportion of people without the disease who test negative on the assessment
also conceptualized as the true negative rate
specificity formula
specificity = true negatives / (true negatives + false positives)
high specificity
correctly rules out most people who do not have the disease (few false positives)
specificity example
a cognitive screening test for dementia with 98% specificity would generate few false positive results, correctly identifying 98% of patients without dementia as testing negative
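worked example: sensitivity and specificity from test counts
A short sketch applying the two formulas to invented counts chosen to match the 95% and 98% figures in the examples. The function names are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """True positive rate: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """True negative rate: TN / (TN + FP)."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical validation sample: 100 people with the disease, 100 without.
tp, fn = 95, 5   # diseased people who test positive / negative
tn, fp = 98, 2   # healthy people who test negative / positive

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```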
positive predictive value (PPV)
defined as the probability that a person with a positive test result truly has the underlying disease
positive predictive value depends on
sensitivity, specificity, and disease prevalence
formula for positive predictive value
PPV = true positives/(true positives + false positives)
high positive predictive value
high probability of reflecting the true presence of disease
positive predictive value example
if a suicide risk screening test has a PPV of 90%, then 90% of patients screening positive are truly at high risk for suicide
negative predictive value
probability that a person with a negative test result truly does NOT have the underlying disease
negative predictive value depends on
sensitivity, specificity, and disease prevalence
negative predictive value formula
NPV = true negatives / (true negatives + false negatives)
high negative predictive value
a negative result reliably rules out the presence of disease
negative predictive value example
if a screening test for CJD has an NPV of 97%, only 3% of patients screening negative actually have CJD (few false negatives among negative results)
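worked example: how prevalence changes PPV and NPV
A sketch of why predictive values depend on prevalence as well as on sensitivity and specificity. It rewrites the PPV and NPV formulas in terms of population fractions; the function names and the two prevalence values are assumptions chosen for illustration.

```python
def ppv(sens, spec, prevalence):
    """PPV = TP / (TP + FP), expressed as fractions of the whole population."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sens, spec, prevalence):
    """NPV = TN / (TN + FN), expressed as fractions of the whole population."""
    true_neg = spec * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    return true_neg / (true_neg + false_neg)

# Same hypothetical test (95% sensitive, 98% specific) in two settings.
for prevalence in (0.01, 0.20):
    print(
        f"prevalence={prevalence:.0%}: "
        f"PPV={ppv(0.95, 0.98, prevalence):.3f}, "
        f"NPV={npv(0.95, 0.98, prevalence):.3f}"
    )
```

With the test held fixed, PPV rises and NPV falls as the condition becomes more common.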
case report/series
detailed description of a single clinical case or small group of cases
mainly descriptive with no comparisons to a control group
used to illustrate unique cases without evidence of causality
generates hypotheses that can be tested with more rigorous study designs
case report/series example
a report of an individual patient diagnosed with Wilson’s disease that describes their symptoms, diagnosis, and treatment response
case-control study
compares cases (with an outcome) to controls (without outcome) to identify factors associated with the outcome
case-control study design
retrospective design: starts with the outcome and then investigates exposures
case-control study design useful for
studying rare diseases or outcomes with long latency periods
case-control study primary statistics
odds ratios quantifying the level of association
case-control study example
a study comparing the prevalence of chemical exposure at Camp Lejeune between patients diagnosed with Parkinson’s disease and healthy controls without the diagnosis
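worked example: odds ratio from a case-control 2x2 table
A minimal sketch of the odds ratio calculation. The counts are invented and do not describe any real study; the function name is hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls

# Hypothetical study: 100 cases and 100 controls.
# 30 of the cases and 10 of the controls report the exposure.
or_estimate = odds_ratio(
    exposed_cases=30, unexposed_cases=70,
    exposed_controls=10, unexposed_controls=90,
)
print(f"odds ratio = {or_estimate:.2f}")
```

An odds ratio above 1 suggests the exposure is associated with higher odds of the outcome; it does not by itself show causation.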
cross-sectional study
analyzes the relationship between exposures and outcomes at a single point in time
cross-sectional study useful for
estimating disease prevalence and studying multiple outcomes at once
cross-sectional study cannot determine
temporal sequence between exposure and outcome
cross-sectional study primary statistics
prevalence ratios/odds ratios
cross-sectional study example
a study surveying the prevalence of essential tremor in octogenarians at a single point in time
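worked example: prevalence ratio vs prevalence odds ratio
A sketch comparing the two cross-sectional statistics on the same invented survey counts. The function names are hypothetical. When the outcome is common, the two measures can differ noticeably.

```python
def prevalence_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Prevalence in the exposed group divided by prevalence in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

def prevalence_odds_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    odds_exposed = cases_exposed / (n_exposed - cases_exposed)
    odds_unexposed = cases_unexposed / (n_unexposed - cases_unexposed)
    return odds_exposed / odds_unexposed

# Hypothetical one-time survey: 40 of 200 exposed and 20 of 400 unexposed
# respondents currently have the condition.
pr = prevalence_ratio(40, 200, 20, 400)
por = prevalence_odds_ratio(40, 200, 20, 400)
print(f"prevalence ratio = {pr:.2f}, prevalence odds ratio = {por:.2f}")
```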
cohort study
follows a population over time (usually prospectively) to quantify outcome risk
groups are defined by exposure status
cohort study establishes
temporal relationship between predictors and outcomes
cohort study compared to cross-sectional study
more expensive and time-intensive
cohort study primary statistics
risk ratios quantifying relative risk
cohort study example
a multi-year study following a group of children into adulthood to track rates of diagnosis of multiple sclerosis and to identify predictive factors
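worked example: risk ratio from cohort follow-up
A minimal sketch of the relative risk calculation, assuming complete follow-up of two exposure groups. The counts are invented and the function name is hypothetical.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk of the outcome in the exposed group divided by risk in the unexposed group."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 1,000 exposed and 1,000 unexposed people followed over time.
# 30 exposed and 10 unexposed participants develop the outcome.
rr = risk_ratio(30, 1000, 10, 1000)
print(f"risk ratio = {rr:.1f}")
```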
randomized controlled trial (RCT)
gold standard experimental study in which participants are randomly allocated to study groups
highest internal validity because randomization minimizes selection bias and confounding
randomized controlled trial establishes
causality between intervention and outcome
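worked example: random allocation to study groups
A simplified sketch of random allocation using a shuffle. The function name, participant IDs, and seed are assumptions for illustration; real trials typically use more elaborate schemes (for example block or stratified randomization with concealed allocation).

```python
import random

def randomize(participant_ids, seed=None):
    """Shuffle participants and split them into two arms (hypothetical helper)."""
    rng = random.Random(seed)        # seeded so the allocation can be reproduced
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}

# Hypothetical trial with 20 participants labelled 1 through 20.
groups = randomize(range(1, 21), seed=42)
print("intervention:", sorted(groups["intervention"]))
print("control:", sorted(groups["control"]))
```

Because chance alone decides the arm, known and unknown participant characteristics tend to balance across groups as the sample grows.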