Stats Flashcards
Count
Counts cannot be compared directly b/c they arise from populations of different sizes
Use when important to public health or to allocate resources
Ratio
Shows relative size of 2 values
Proportion
Numerator is subset of denominator
Dimensionless
Between 0 and 1
Rate
a/(a+b) over an amount of time (can be a proportion, always a ratio)
Incidence
Frequency of the occurrence of new cases over a specified period of time
Measures appearance of disease
Cumulative incidence
Risk or probability of an individual getting a disease
Proportion: # of new cases of disease/# at risk at beginning of follow up or over a specified time period
Fixed populations
Incidence rate
# of new cases/sum of disease-free person-time over a specified time period
Takes into account population differences in periods of follow up
Person-time at risk
Sum of disease-free time in population
- Add individual risk periods (exact)
- Use average number of people multiplied by study duration
- Use average duration per person
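A minimal sketch of cumulative incidence vs. incidence rate, using the exact method (adding individual risk periods). The five-person cohort and its follow-up times are made up for illustration.

```python
# Hypothetical cohort: (years of disease-free follow-up, developed disease?)
follow_up = [(2.0, True), (5.0, False), (3.5, True), (5.0, False), (4.5, False)]

new_cases = sum(1 for _, diseased in follow_up if diseased)
at_risk_at_start = len(follow_up)

# Cumulative incidence: new cases / number at risk at start (a proportion)
cumulative_incidence = new_cases / at_risk_at_start

# Person-time at risk: add up each individual's disease-free time (exact method)
person_time = sum(years for years, _ in follow_up)

# Incidence rate: new cases / person-time at risk
incidence_rate = new_cases / person_time

print(cumulative_incidence)  # 0.4
print(incidence_rate)        # 0.1 (cases per person-year)
```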
Prevalence
Proportion of people in a population w/ the disease at a specified point in time
Measures existing disease
Describes health burden
Point prevalence
Proportion: # of existing cases/total population at a specified point in time
Period prevalence
Proportion: (# of existing cases + # of cases that occur during the interval)/population at midpoint of interval or avg population size
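A small sketch of point vs. period prevalence. All counts are hypothetical, and the population is assumed stable, so the midpoint population equals the starting population.

```python
# Hypothetical counts for one calendar year
existing_cases_at_start = 30    # people already ill at the start of the interval
new_cases_during_year = 20      # cases that occur during the interval
population_at_start = 1000
population_at_midpoint = 1000   # assumption: stable population

# Point prevalence: existing cases / total population at one point in time
point_prevalence = existing_cases_at_start / population_at_start

# Period prevalence: (existing + new cases) / midpoint population
period_prevalence = (
    existing_cases_at_start + new_cases_during_year
) / population_at_midpoint

print(point_prevalence)   # 0.03
print(period_prevalence)  # 0.05
```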
Prevalence-Incidence relationship
Prevalence depends on incidence and disease duration
P = I × D (prevalence = incidence × average duration of disease)
If a disease is of short duration, I ~ P
If a disease is chronic, P > I
Prefer incidence b/c interested in etiology and you don’t want to vary too many factors at the same time (birth defect problem)
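A rough sketch of the prevalence-incidence relationship. The function name and numbers are hypothetical, and the relation is only an approximation that assumes a steady state and low prevalence.

```python
def approx_prevalence(incidence_rate, mean_duration_years):
    """Approximate prevalence as incidence rate x average disease duration.

    Assumption: steady state and low prevalence.
    """
    return incidence_rate * mean_duration_years


incidence = 0.01  # hypothetical: 10 new cases per 1,000 person-years

# Same incidence, different durations
one_year_illness = approx_prevalence(incidence, 1)    # P equals I
chronic_illness = approx_prevalence(incidence, 20)    # P is much larger than I

print(one_year_illness, chronic_illness)
```

Holding incidence fixed, the chronic disease has the larger prevalence, which is why prevalence alone says little about etiology.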
Binary data
One of two answers
Nominal data
Categorical data w/ no order
Ordinal data
Categorical data w/ order
Continuous data
Data measured continuously or on integer scale
Frequency distribution
Means of describing categorical data
Must add up to 100%
Mean
Average
Limitations: sensitive to extreme values, not ideal for skewed data
Median
Middle value
Mode
Most frequently occurring value
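A quick sketch of mean, median, and mode using Python's standard `statistics` module. The data are made up; the second list adds one extreme value to show why the mean is sensitive to outliers and the median is not.

```python
import statistics

values = [2, 3, 3, 4, 5]        # hypothetical data
with_outlier = values + [100]   # same data plus one extreme value

print(statistics.mean(values))    # 3.4
print(statistics.median(values))  # 3
print(statistics.mode(values))    # 3

# One extreme value drags the mean but barely moves the median
print(statistics.mean(with_outlier))    # 19.5
print(statistics.median(with_outlier))  # 3.5
```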
Variance
Average of square of deviations about the sample mean
S^2 = sum((xk - xbar)^2)/(n-1)
Negative skew
Outlying values on low end; long tail to the left (hump is on right)
Positive skew
Outlying values on high end; long tail to the right (hump is on left)
Standard deviation
Square root of variance
Std = sqrt(sum((xk - xbar)^2)/(n-1))
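A worked sketch of the sample variance and standard deviation formulas (n - 1 in the denominator), checked against the standard library. The sample is hypothetical.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
n = len(data)
xbar = sum(data) / n             # sample mean

# Sample variance: sum of squared deviations about the mean, divided by n - 1
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance
s = math.sqrt(s2)

# statistics.variance and statistics.stdev use the same n - 1 definitions
assert math.isclose(s2, statistics.variance(data))
assert math.isclose(s, statistics.stdev(data))

print(round(s2, 3), round(s, 3))
```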
Normal distribution
Theoretical probability distribution that is symmetric about its mean and is “bell” shaped
Mean = Median = Mode
Standard normal distribution
Specific distribution with mean of 0 and Std of 1
68% of data w/in 1 std
95% of data w/in 2 std
99.7% of data w/in 3 std
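A short check of the 68 / 95 / 99.7 figures for a standard normal (mean 0, standard deviation 1), using `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal: mean 0, std 1

# Probability of falling within k standard deviations of the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(k, round(p, 3))
# 1 0.683
# 2 0.954
# 3 0.997
```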
Shapiro-Wilk test
Null hypothesis = data are normally distributed
p < 0.05 means data are NOT normally distributed, reject null hypothesis
Screening
Presumptive identification of unrecognized disease or condition by application of tests, examination, or other procedures
Attempts to classify asymptomatic people as likely or unlikely to have disease
Goal is to delay onset of symptoms and prolong survival
Only done for apparently healthy (asymptomatic) people
Primary Prevention
prevent disease before it starts
Secondary Prevention
delay symptoms
Tertiary Prevention
slow disease progression
Lead Time
duration of time by which diagnosis is advanced as a result of screening
Validity
Does the test measure what it’s supposed to measure?
Bullseye
Internal validity
Does the test measure what it’s supposed to measure?
External validity
Generalizability, how well does the result generalize to the population?
Reliability
Does the test give the same result over and over?
Sensitivity
Sensitivity = a / (a + c)
Number of people who screen positive amongst those who actually have the disease
Increase to prevent disease transmission
Sensitivity + FN rate = 1
Specificity
Specificity = d / (b + d)
Number of people who screen negative amongst those who don’t have disease
Increase for fatal disease w/ no treatment
Specificity + FP rate = 1
True positive
Individuals who test positive and have disease
True negative
Individuals who test negative and don’t have disease
False positive
Individuals who test positive and don’t have disease
Increased w/ increasing sensitivity
FP rate = b / (b + d)
Specificity + FP rate = 1
False negative
Individuals who test negative and have disease
Increased w/ increasing specificity
FN rate = c / (a + c)
Sensitivity + FN rate = 1
Overall Accuracy
Assesses proportion of true test results among all test results
Overall accuracy = (a + d) / (a + b + c + d) = (TP + TN) / (TP + FP + TN + FN)
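A minimal sketch of the 2x2 screening measures, with a = true positives, b = false positives, c = false negatives, d = true negatives. The counts are hypothetical.

```python
# Hypothetical 2x2 screening table
#                 disease   no disease
# test positive    a = 90     b = 50
# test negative    c = 10     d = 850
a, b, c, d = 90, 50, 10, 850

sensitivity = a / (a + c)             # screen positive among the diseased
specificity = d / (b + d)             # screen negative among the non-diseased
fp_rate = b / (b + d)                 # complement of specificity
fn_rate = c / (a + c)                 # complement of sensitivity
accuracy = (a + d) / (a + b + c + d)  # true results among all results

print(round(sensitivity, 3))  # 0.9
print(round(specificity, 3))  # 0.944
print(round(fp_rate, 3))      # 0.056
print(round(fn_rate, 3))      # 0.1
print(round(accuracy, 3))     # 0.94
```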
Positive predictive value
Number of people w/ true disease who tested positive divided by number of people who tested positive
Likelihood of having true disease if you test positive
PPV = a / (a + b)
Negative predictive value
Number of truly non-diseased people who tested negative divided by number of people who tested negative
Likelihood of not having disease if you test negative
NPV = d / (c + d)
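A short sketch of the predictive values, which read the 2x2 table by row (test result) rather than by column (disease status). The counts are hypothetical.

```python
# Hypothetical 2x2 table: a = TP, b = FP, c = FN, d = TN
a, b, c, d = 90, 50, 10, 850

ppv = a / (a + b)  # among those who test positive, share who have disease
npv = d / (c + d)  # among those who test negative, share who are disease-free

print(round(ppv, 3))  # 0.643
print(round(npv, 3))  # 0.988
```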