Biostatistics Flashcards
distribution terms
mean
median
mode
skew
mean
average value of a dataset
calculated by summing all values and dividing by the number of values
mean limitations
misleading in skewed distributions or distributions with outliers
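A minimal sketch of the calculation and of the outlier problem, assuming a made-up list of hospital stays in days (the data and the helper name `arithmetic_mean` are illustrative only):

```python
def arithmetic_mean(values):
    """Sum all values and divide by how many there are."""
    return sum(values) / len(values)


# Hypothetical lengths of stay, in days.
stays = [2, 3, 3, 4, 5]
# The same data plus one extreme value.
stays_with_outlier = stays + [43]

print(arithmetic_mean(stays))               # 3.4
print(arithmetic_mean(stays_with_outlier))  # 10.0
```

A single extreme value roughly triples the mean here, even though five of the six stays are 5 days or fewer.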
median
middle value when a dataset is ordered from lowest to highest (the average of the two middle values when the number of values is even)
when is median ideal
skewed distributions or data with outliers, as it is largely unaffected by extreme values
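A short sketch comparing the two measures, using Python's standard `statistics` module and made-up incomes in thousands (the numbers are illustrative assumptions):

```python
from statistics import mean, median

# Hypothetical incomes, in thousands.
incomes = [28, 31, 35, 40, 42]
# The same data plus one extreme value.
incomes_with_outlier = incomes + [900]

print(median(incomes))               # 35
print(median(incomes_with_outlier))  # 37.5 (average of 35 and 40)

print(mean(incomes))                 # 35.2
print(mean(incomes_with_outlier))    # about 179.3
```

The median shifts only slightly when the outlier is added, while the mean moves far away from every typical value.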
mode
the value that occurs most frequently in a dataset
most useful for categorical (nominal) data, where a mean or median cannot be computed; not influenced by outliers
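As a small illustration, the standard `statistics` module can find the mode of non-numeric data and report ties. The blood types and scores below are made-up example data:

```python
from statistics import mode, multimode

# Hypothetical categorical data: blood types of six patients.
blood_types = ["O", "A", "O", "B", "O", "A"]
print(mode(blood_types))    # O

# multimode returns every value tied for most frequent.
scores = [1, 1, 2, 2, 3]
print(multimode(scores))    # [1, 2]
```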
skew
describes asymmetry in a distribution
positive skew
the right tail (higher values) is longer
many low values and a few extremely high values
mean > median > mode
negative skew
left tail (lower values) is longer
many high values and a few extremely low values
mean < median < mode
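The typical ordering of mean, median, and mode under each kind of skew can be checked on two small made-up datasets. This is a sketch with illustrative numbers; real data do not always follow the ordering strictly:

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: many low values, one very high value.
right_skewed = [1, 1, 1, 2, 2, 3, 4, 10]
# Mirror image (11 - x): many high values, one very low value.
left_skewed = [11 - x for x in right_skewed]

for name, data in [("positive skew", right_skewed),
                   ("negative skew", left_skewed)]:
    print(name, "mean:", mean(data), "median:", median(data), "mode:", mode(data))
```

In the first dataset the long right tail pulls the mean above the median and the mode; in the mirrored dataset the long left tail pulls the mean below both.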
incidence
number of new cases of a condition arising in a population at risk during a given period
useful for assessing risk and evaluating interventions aimed at preventing disease
prevalence
total disease cases (new + pre-existing) in a population at one point in time divided by the total population
useful for planning health resource allocation and understanding disease burden
influenced by disease duration and survival: longer duration or improved survival raises prevalence
point prevalence
percentage of people with the condition at one specific point in time
better reflects the burden of chronic conditions
lifetime prevalence
percent of individuals that ever had the condition at some point in their life
at least as high as point prevalence, since it also counts people whose condition has resolved
sensitive to survivorship and disease duration
key differences incidence vs prevalence
incidence assesses new case development over time
prevalence assesses existing disease cases at one time point
incidence excludes pre-existing cases, prevalence includes them
incidence assesses risk, while prevalence assesses burden
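A worked sketch of the distinction, assuming a made-up closed population followed for one year with no deaths, recoveries, or migration (all numbers and variable names are illustrative):

```python
# Hypothetical closed population followed for one year.
population = 1000
existing_cases_at_start = 50   # pre-existing cases
new_cases_during_year = 20     # cases that developed during follow-up

# Incidence uses only people at risk (those free of disease at the start).
at_risk = population - existing_cases_at_start
cumulative_incidence = new_cases_during_year / at_risk

# Prevalence counts every existing case at one time point.
prevalence_at_start = existing_cases_at_start / population
# Assumes nobody died, recovered, or left during the year.
prevalence_at_end = (existing_cases_at_start + new_cases_during_year) / population

print(f"incidence over the year: {cumulative_incidence:.3f}")  # 0.021
print(f"prevalence at start:     {prevalence_at_start:.3f}")   # 0.050
print(f"prevalence at end:       {prevalence_at_end:.3f}")     # 0.070
```

The pre-existing cases are excluded from the incidence denominator but included in both prevalence figures.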
sensitivity
proportion of people with the disease who test positive on the assessment
conceptualized as the true positive rate
sensitivity formula
sensitivity = true positives / (true positives + false negatives)
high sensitivity
correctly identifies a high proportion of people who actually have the disease (few false negatives)
sensitivity example
Lyme disease screening test with 95% sensitivity would correctly identify 95% of people with Lyme disease
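The formula can be applied to counts directly. In this sketch the function name `sensitivity` and the counts (100 people with Lyme disease, 95 of whom test positive) are illustrative assumptions chosen to match the 95% example:

```python
def sensitivity(true_positives, false_negatives):
    """True positive rate: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical: 100 people with the disease, 95 test positive, 5 are missed.
print(sensitivity(true_positives=95, false_negatives=5))  # 0.95
```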
specificity
defined as the proportion of people without the disease who test negative on the assessment
also conceptualized as the true negative rate
specificity formula
specificity = true negatives / (true negatives + false positives)
high specificity
correctly rules out most people who do not have the disease (few false positives)
specificity example
a cognitive screening test for dementia with 98% specificity would generate few false positive results, correctly identifying 98% of patients without dementia as testing negative
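The same kind of sketch for specificity, with an illustrative function name and made-up counts (1000 people without dementia, 980 of whom test negative) chosen to match the 98% example:

```python
def specificity(true_negatives, false_positives):
    """True negative rate: TN / (TN + FP)."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical: 1000 people without the disease, 980 test negative,
# 20 are falsely flagged as positive.
print(specificity(true_negatives=980, false_positives=20))  # 0.98
```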
positive predictive value (PPV)
defined as the probability that a person with a positive test result truly has the underlying disease
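From counts, this probability is the share of all positive results that are true positives. The helper name `positive_predictive_value` and the counts below are illustrative assumptions:

```python
def positive_predictive_value(true_positives, false_positives):
    """Share of positive test results that are true positives: TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)


# Hypothetical: 115 positive results, of which 95 are true positives.
ppv = positive_predictive_value(true_positives=95, false_positives=20)
print(round(ppv, 3))  # 0.826
```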