Module 8 Flashcards
Descriptive epidemiology
Describes characteristics of a population
- Health needs
- Health events
- Health outcomes
Inferential epidemiology
Compares two or more populations for differences or similarities
Sampling
Done when data can’t be collected from entire population
- Subset of population provides estimate
- Sample is drawn from a sampling frame or list of those available to be sampled (ex: phone book)
Sampling methods
- Convenience sample
- Probability sample
- Systematic sampling
- Stratified random sampling
Convenience sample
Nonrandom selection, e.g., first 50 to enter clinic
Probability sample
- Uses some random mechanism to draw sample from sampling frame
- Each member of the sampling frame is equally likely to be chosen
Systematic sampling
e.g., Randomly choose one of first 20 patient charts, then every 20th chart thereafter to get a 5 percent sample
Stratified random sampling
Stratify sample into categories (e.g., age within gender) and then randomly sample from within each category
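The four sampling methods above can be sketched in Python as follows. This is a minimal illustration rather than a prescribed procedure: the `patients` sampling frame, the function names, and the fixed random seed are all assumptions made for the example.

```python
import random

def convenience_sample(frame, n):
    """Nonrandom: take the first n units that show up (e.g., first 50 to enter clinic)."""
    return frame[:n]

def simple_random_sample(frame, n, rng):
    """Probability sample: every unit in the frame is equally likely to be chosen."""
    return rng.sample(frame, n)

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k units, then every k-th unit thereafter."""
    start = rng.randrange(k)
    return frame[start::k]

def stratified_random_sample(frame, stratum_of, n_per_stratum, rng):
    """Group units into strata, then draw a simple random sample within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(units, min(n_per_stratum, len(units)))
            for name, units in strata.items()}

rng = random.Random(8)  # fixed seed (an assumption) so the sketch is repeatable

# Hypothetical sampling frame: 400 patient charts as (chart_id, sex) pairs.
patients = [(i, "F" if i % 2 else "M") for i in range(1, 401)]

first_50 = convenience_sample(patients, 50)
random_20 = simple_random_sample(patients, 20, rng)
every_20th = systematic_sample(patients, 20, rng)  # 20 of 400 charts = 5 percent
by_sex = stratified_random_sample(patients, lambda p: p[1], 10, rng)
```

Only the convenience sample involves no random mechanism, so it alone is not a probability sample.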
Statistical measures of effect
- Significance tests
- The p value
- Confidence Interval
Levels of measurement: 2 classes of data
1. Continuous
2. Categorical
Continuous variable
- Distance between points meaningful
- Variable can take any value between points
- Age, height, weight, blood pressure
Categorical variable
Take values in fixed number of categories
- Ordinal—categories can be ordered in some way (ex: patient satisfaction —from not satisfied to very satisfied)
- Nominal—categories are “qualitative” (race, gender)
Descriptive statistics for continuous variables
1. Measures of central tendency
2. Measures of dispersion/variation
Measures of central tendency
1. Mean: average value
2. Median: middle value; half of the observations fall below it, half above
Measures of dispersion/variation
- Standard deviation: typical (roughly average) distance that observations fall from the mean
- Variance: square of standard deviation
- Range: distance from lowest to highest value
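As a quick illustration of these measures, the sketch below computes them with Python's standard `statistics` module. The blood pressure readings are made-up values chosen for the example, and the sample (n - 1) versions of variance and standard deviation are assumed.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg); not real data.
systolic_bp = [118, 122, 125, 130, 135, 140, 161]

# Measures of central tendency
mean_bp = statistics.mean(systolic_bp)
median_bp = statistics.median(systolic_bp)

# Measures of dispersion/variation
variance_bp = statistics.variance(systolic_bp)  # sample variance (divides by n - 1)
sd_bp = statistics.stdev(systolic_bp)           # square root of the sample variance
range_bp = max(systolic_bp) - min(systolic_bp)  # lowest to highest value

print(mean_bp, median_bp, round(sd_bp, 1), round(variance_bp, 1), range_bp)
```

Here the single high reading pulls the mean (133) above the median (130), which is why both measures are usually reported together.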
Descriptive statistics for categorical variables
- Frequency distribution presented graphically
- Proportion: number with attribute/total #
- Rate: a proportion multiplied by a constant (e.g., per 1,000 or per 100,000)
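The proportion and rate definitions above amount to simple arithmetic. A minimal sketch, with hypothetical counts and function names invented for the example:

```python
def proportion(number_with_attribute, total):
    """Number with the attribute divided by the total number observed."""
    return number_with_attribute / total

def rate(number_with_attribute, total, per=1000):
    """A proportion multiplied by a constant, e.g. 'per 1,000 persons'."""
    return proportion(number_with_attribute, total) * per

# Hypothetical example: 30 people with the attribute out of 1,200 surveyed.
p = proportion(30, 1200)
r = rate(30, 1200, per=1000)
print(f"proportion = {p:.3f}, rate = {r:.0f} per 1,000")
```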
Inferential statistics
Compare two or more samples for some characteristic
Tests specific hypotheses regarding populations
-Two-sided hypothesis: makes no assumptions regarding which population has the higher (or lower) value
-One-sided hypothesis: assumes one population has a higher or lower value
p Value
Probability of seeing differences at least as large as those observed if only random chance were at work (i.e., if the null hypothesis were true)
- Statistically significant: p is below the chosen significance level (commonly 0.05)
Null hypothesis
States that there is no difference among the groups being compared
*Underlying all statistical tests is a null hypothesis
Significance tests
Used to decide whether to reject or fail to reject a null hypothesis
Significance level
Chance of rejecting the null hypothesis when it is actually true
Confidence intervals
The point estimate +/- some quantity
An alternative to the hypothesis test
Provides a range in which the true value will probably lie
Depend on:
- Variability: higher variability gives a wider CI
- Sample size: smaller sample gives a wider CI
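To see how variability and sample size drive the width of a confidence interval, here is a minimal sketch for the mean of a continuous variable. It assumes a normal approximation (interval = estimate +/- z times the standard error); the function names and input numbers are hypothetical.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, level=0.95):
    """Normal-approximation CI for a mean: estimate +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

def width(ci):
    """Distance between the lower and upper limits."""
    return ci[1] - ci[0]

base = mean_confidence_interval(mean=133, sd=15, n=100)
more_variable = mean_confidence_interval(mean=133, sd=30, n=100)  # higher variability
smaller_sample = mean_confidence_interval(mean=133, sd=15, n=25)  # smaller sample

for label, ci in [("base", base), ("more variable", more_variable),
                  ("smaller sample", smaller_sample)]:
    print(f"{label}: {ci[0]:.1f} to {ci[1]:.1f} (width {width(ci):.1f})")
```

Doubling the standard deviation doubles the width, and cutting the sample to a quarter of its size also doubles it, because the margin scales with sd / sqrt(n).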
Statistical power
Ability of a study to demonstrate an association if one exists; probability that we will correctly determine that the null is false and reject it
Determined by:
-Frequency of the condition under study
-Magnitude of the effect
-Study design
-Sample size
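The influence of effect magnitude and sample size on power can be illustrated with a rough calculation. The sketch below assumes a two-sided z test comparing the means of two equal-sized groups with a common, known standard deviation; that test, the function name, and the numbers are assumptions chosen for the example, and real power calculations depend on the study design.

```python
import math
from statistics import NormalDist

def approx_power_two_means(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z test for a difference in two means.

    effect: true difference in means; sd: common standard deviation (assumed known).
    """
    nd = NormalDist()
    se = sd * math.sqrt(2 / n_per_group)  # standard error of the difference
    z_crit = nd.inv_cdf(1 - alpha / 2)    # critical value for the chosen alpha
    shift = abs(effect) / se              # true effect in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

small_n = approx_power_two_means(effect=5, sd=10, n_per_group=20)
larger_n = approx_power_two_means(effect=5, sd=10, n_per_group=64)
larger_effect = approx_power_two_means(effect=10, sd=10, n_per_group=20)
no_effect = approx_power_two_means(effect=0, sd=10, n_per_group=64)

print(round(small_n, 2), round(larger_n, 2), round(larger_effect, 2), round(no_effect, 2))
```

With no true effect, the chance of rejecting the null falls back to the significance level, which ties power to the significance-level card above.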
Two-sample T test
Compare mean values of a continuous variable from 2 groups
Need to know:
-Mean of each group
-Size of each group
-Variance for each group
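The three inputs listed on this card (mean, size, and variance of each group) are enough to compute the test statistic. The sketch below uses Welch's unequal-variance form, which is one common variant of the two-sample t test (the pooled-variance form is the other); the card does not say which is intended. The group summaries are hypothetical.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    v1 = var1 / n1  # squared standard error contributed by group 1
    v2 = var2 / n2  # squared standard error contributed by group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical summaries: mean systolic BP, variance, and size for two groups.
t_stat, df = welch_t(mean1=138.0, var1=225.0, n1=50,
                     mean2=131.0, var2=196.0, n2=50)
print(f"t = {t_stat:.2f} on about {df:.0f} degrees of freedom")
```

Turning the t statistic into a p value requires the t distribution, which is not in Python's standard library; a t table or a statistics package would be used for that step.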
Z test for difference in proportion
Compare proportion with attribute from two populations
Need to know:
-Proportion with attribute in each population
-Size of each population
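A minimal sketch of this test, using only the inputs listed above. It assumes the usual pooled-proportion standard error and a two-sided p value from the normal distribution; the counts and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Z test for a difference in proportions.

    x1, x2: number with the attribute in each group; n1, n2: group sizes.
    Returns the z statistic and the two-sided p value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # proportion under the null of no difference
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60 of 200 with the attribute vs. 40 of 200.
z, p_value = two_proportion_z_test(60, 200, 40, 200)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

In this made-up example the p value falls below 0.05, so the null hypothesis of equal proportions would be rejected at that significance level.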
Study designs (overview)
- Experimental studies
- Observational studies
– Descriptive studies: cross-sectional surveys
– Analytic studies: many ecologic studies, case-control studies, cohort studies
Descriptive studies
– Used to identify a health problem that may exist
– Characterize the amount and distribution of disease
Analytic studies
– Follow descriptive studies
– Used to identify the cause of the health problem
Validity for etiologic inference
#1 Experimental study
#2 Prospective cohort study
#3 Retrospective cohort study
#4 Nested case-control study
Ecologic studies
Correlations are obtained between exposure rates and disease rates among different groups or populations
*Unit of analysis is the group, not individual
Ecologic fallacy
Observations made at the group level may not represent the exposure-disease relationship at the individual level
*Occurs when incorrect inferences about the individual are made from group level data
Advantages of ecologic studies
– Quick, simple, inexpensive
– Good approach for generating hypotheses when a disease is of unknown etiology
Disadvantages of ecologic studies
– Ecological fallacy
– Imprecise measurement of exposure and disease
Cross-sectional study (PREVALENCE STUDY)
o Survey done at particular point in time
o Exposure and disease measures obtained at individual level
o Exposure and disease outcome determined simultaneously
o Cases of disease are prevalent (not incident)
o Single period of observation
o Both probability and non-probability sampling used
Uses of cross-sectional studies
– Hypothesis generation
– Intervention planning
– Estimation of the magnitude and distribution of a health problem
Limitations of cross-sectional studies
– Do not provide incidence data
– Cannot study low prevalence diseases
– Cannot determine temporality of exposure & disease
Case-control studies
- Compare persons with disease (cases) with those without disease (controls)
- Explore whether differences between cases and controls result from exposures to risk factors.
- Useful when population is not well-defined
Classification of case-control studies
- When disease is identified—PRESENT
- When exposure/treatment is recognized—PAST
- When analysis is conducted—PRESENT
When to use case-control studies
- Exposure data are difficult/expensive to obtain
- Disease is rare
- Disease has long induction/latent period
- Little is known about disease
- Underlying population is dynamic
Advantages of case-control studies
– Tend to use smaller sample sizes than surveys or prospective studies.
– Quick and easy to complete.
– Cost effective.
– Useful for studies of rare diseases
Limitations of case-control studies
– Provide indirect estimate of risk
– Timing of exposure-disease relationship difficult to determine
– Representativeness of cases/controls often unknown
Selecting cases for CC Studies
o Cases usually have the disease
o Define cases specifically
o Signs/symptoms
o Clinical exams
o Diagnostic tests
Sources of cases for CC studies
o Clinic patient rosters
o Death certificates
o Surveys
o Cancer/birth defect registries
Sources of controls for CC studies
o Population controls
o Hospital/clinic controls
o Illnesses must be unrelated to exposure
o Illnesses must have same referral pattern to HCF
o Dead controls—some cases deceased
o Friend/spouse/relative controls
Sources of exposure info for CC studies
o In-person/telephone interview
o Questionnaires
o Existing datasets: medical records, pharmacy databases, registry, employment/insurance/birth/death records
o Biological specimens—biomarkers
Case control assumptions
o Frequency of disease in the population is small
o Cases/controls are representative
o Can’t calculate relative risk (RR) directly
Relative risk in case control
RR = number of times more likely exposed persons are to get the disease than unexposed persons
Odds ratio
An estimate of relative risk
o When cases/controls are representative
o If disease prevalence is small
OR = 1 implies no association
o If > 1.0, then increased risk
o If < 1.0, then decreased risk
OR provides good approx of risk when:
– Controls are representative of a target population
– Cases are representative of all cases
– The frequency of disease in the population is small
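The rare-disease condition can be checked numerically. The sketch below computes the odds ratio and the relative risk from the same 2x2 table, once for a rare disease and once for a common one. Both tables are hypothetical full-population counts invented for the example; in an actual case-control study only the odds ratio could be computed.

```python
def odds_ratio(a, b, c, d):
    """OR from a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    """
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """RR = risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical populations of 10,000 exposed and 10,000 unexposed (rare disease),
# and 1,000 exposed and 1,000 unexposed (common disease).
rare = (20, 9980, 10, 9990)
common = (400, 600, 200, 800)

for label, table in [("rare", rare), ("common", common)]:
    print(f"{label}: OR = {odds_ratio(*table):.2f}, RR = {relative_risk(*table):.2f}")
```

The relative risk is 2 in both tables, but the odds ratio stays close to 2 only when the disease is rare; for the common disease it overstates the relative risk.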