Ch 5 & 6 & 7 study design, descriptive epidemiology, analytical Flashcards
• identify the principal epidemiological study designs
descriptive and analytical (ecological, cohort, cross sectional, case control)
• outline the steps for developing a study protocol
- study design
- study area and population
- eligibility criteria
- sample size calculation
- sampling methods
- data collection
- data/ statistical analysis
- ethics
• evaluate the role of different sampling strategies in obtaining representative data
simple random sampling, systematic random sampling, stratified random sampling, cluster sampling, multi-stage sampling
• recognize different data types and how best to report them
quantitative (discrete, continuous)
qualitative (binomial, ordered categorical, unordered categorical)
• describe the basic approaches to analyzing and interpreting epidemiological data
proportions
regression
meta-analysis
What are 2 important concepts related to sample size calculation?
Statistical power & precision.
Statistical power is the probability of detecting an effect if it is real.
Statistical precision (significance) is the probability of detecting an effect when it is not real, i.e. a false positive.
Typically what power and precision are most common?
To increase the power and precision would require what?
It is typical to aim for a sample size with 80–90% power and 5% precision (significance) to detect a valid estimate of effect.
Larger sample sizes.
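The link between power, precision, and sample size can be sketched with a short calculation. This is a minimal sketch, assuming the standard normal-approximation formula for comparing two proportions; the function name and the example proportions (10% vs 20%) are hypothetical and not taken from the cards.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, power=0.80, alpha=0.05):
    """Approximate sample size per group to detect p1 vs p2.

    Assumes a two-sided test and the normal approximation:
    n = (z_alpha + z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance ("precision")
    z_beta = NormalDist().inv_cdf(power)           # statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Illustrative values only: outcome in 10% of unexposed vs 20% of exposed.
n_80 = sample_size_two_proportions(0.10, 0.20, power=0.80)
n_90 = sample_size_two_proportions(0.10, 0.20, power=0.90)
print(n_80, n_90)
```

Raising the power from 80% to 90%, or trying to detect a smaller difference, increases the required sample size.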
What are examples of sampling approaches?
Simple random sampling– random selection (shuffling cards and picking a card)
Systematic sampling– selecting samples at regular intervals, e.g. to sample a proportion of 20% of kids, select every 5th kid
Stratified sampling– distinct groups with expected different outcomes, random sampling within each group/ strata
Cluster sampling– uses hierarchical structure of study group (randomly selecting households, then testing each person in the selected households)
Multi-stage sampling– uses hierarchical structure, but more than one stage of sampling (like randomly selecting schools, then randomly selecting kids in that school)
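The five approaches above can be compared side by side in code. This is a rough sketch, assuming a made-up sampling frame of 100 children in 5 schools; all function names and the data are hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 100 children, 20 in each of 5 schools.
children = [{"id": i, "school": f"school_{i // 20}"} for i in range(100)]


def simple_random(frame, n):
    # Every unit has the same chance of being selected.
    return rng.sample(frame, n)


def systematic(frame, fraction):
    # Random start, then every k-th unit (k = 5 for a 20% sample).
    step = round(1 / fraction)
    start = rng.randrange(step)
    return frame[start::step]


def stratified(frame, key, n_per_stratum):
    # Split into strata, then sample randomly within each stratum.
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    return [u for group in strata.values() for u in rng.sample(group, n_per_stratum)]


def cluster(frame, key, n_clusters):
    # Randomly pick clusters, then take everyone in the chosen clusters.
    clusters = sorted({u[key] for u in frame})
    chosen = set(rng.sample(clusters, n_clusters))
    return [u for u in frame if u[key] in chosen]


def multi_stage(frame, key, n_clusters, n_per_cluster):
    # Stage 1: randomly pick clusters. Stage 2: random sample within each.
    clusters = sorted({u[key] for u in frame})
    sample = []
    for c in rng.sample(clusters, n_clusters):
        members = [u for u in frame if u[key] == c]
        sample.extend(rng.sample(members, n_per_cluster))
    return sample


print(len(simple_random(children, 10)))
print(len(systematic(children, 0.20)))
print(len(stratified(children, "school", 4)))
print(len(cluster(children, "school", 2)))
print(len(multi_stage(children, "school", 2, 5)))
```

The only difference between the cluster and multi-stage functions is the second stage: cluster sampling keeps every member of a selected school, while multi-stage sampling draws a further random sample within it.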
What are the different types of data that can be collected?
Indirect data– records, registries
Direct data– surveys, blood specimens
Environmental/ climate data– GPS, rainfall
Define– null hypothesis
Statistical tests start with a null hypothesis that there is no difference between two measures
What does the p-value tell us?
What does a high/ low p-value tell us?
Tells us the probability of seeing a difference at least as large as the one observed by chance alone, i.e. if the null hypothesis were true. A larger p-value may reflect a lack of power; it doesn't mean a true difference doesn't exist, just that the probability of chance is high. Whereas a low p-value means the difference observed is unlikely to be due to chance alone, so it is more likely to be true.
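A small calculation can show why a large p-value may simply reflect low power. This is a minimal sketch, assuming a two-sided z-test for the difference between two proportions (one of several possible tests); the function name and counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(cases_a, n_a, cases_b, n_b):
    """Two-sided p-value for H0: both groups share the same proportion.

    Assumes the normal approximation and a pooled proportion that is
    neither 0 nor 1 (otherwise the standard error would be zero).
    """
    p_a, p_b = cases_a / n_a, cases_b / n_b
    pooled = (cases_a + cases_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same observed difference (15% vs 10%), different sample sizes.
p_small_study = two_proportion_p_value(15, 100, 10, 100)
p_large_study = two_proportion_p_value(150, 1000, 100, 1000)
print(p_small_study, p_large_study)
```

In this made-up example the observed proportions are identical in both studies, yet only the larger study gives a p-value below 0.05.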
Define – correlation coefficient.
What symbol is often used to denote this?
What does a negative value mean?
Correlation coefficient is the strength and direction of the relationship between two variables. Denoted by an “r”, and can be any value between -1 and +1. If r is negative, then as one variable increases, the other decreases. If r is positive, then both variables increase together. r = 0 means there’s no (linear) relationship, i.e. the variables are not associated.
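The sign and range of r can be checked with a few lines of code. This is a minimal sketch, assuming the Pearson correlation coefficient (the cards do not name a specific coefficient); the function name and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


exposure = [1, 2, 3, 4, 5]
rises_together = [2, 4, 6, 8, 10]   # outcome rises with exposure
moves_opposite = [10, 8, 6, 4, 2]   # outcome falls as exposure rises

print(pearson_r(exposure, rises_together))
print(pearson_r(exposure, moves_opposite))
```

The first pair gives an r at the positive end of the range and the second an r at the negative end, matching the interpretation in the card.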
What is the function of regression analysis?
To compare multiple categories or look at a continuous exposure. Tries to describe how the variables are related and to predict the outcome.
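Both roles, describing a relationship and predicting an outcome, can be sketched with the simplest case. This is a minimal sketch, assuming simple linear regression fitted by ordinary least squares with one continuous exposure; the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for outcome y on exposure x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x_new, slope, intercept):
    # Use the fitted line to predict the outcome at a new exposure level.
    return intercept + slope * x_new


# Made-up data: the outcome rises by 2 units per unit of exposure.
exposure = [0, 1, 2, 3]
outcome = [1, 3, 5, 7]

slope, intercept = fit_line(exposure, outcome)
print(slope, intercept, predict(4, slope, intercept))
```

The slope describes how the outcome changes per unit of exposure, and the fitted line is then reused for prediction.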
What is a meta-analysis used for?
A way to get a combined estimate of effect by pooling several studies of the same exposure-outcome association. This increases statistical power and gives a value that’s closer to the true value.
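One common way to combine study estimates is inverse-variance weighting. This is a minimal sketch, assuming a fixed-effect model (the cards do not say which pooling method is meant); the function name and the study values are hypothetical.

```python
import math


def pooled_estimate(estimates, standard_errors):
    """Fixed-effect inverse-variance pooled estimate and its standard error.

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


# Made-up log relative risks and standard errors from three studies.
log_rr = [0.20, 0.40, 0.30]
se = [0.10, 0.20, 0.15]

pooled, pooled_se = pooled_estimate(log_rr, se)
print(pooled, pooled_se)
```

The pooled standard error is smaller than that of any single study, which is the sense in which combining studies increases statistical power.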
What is the goal of descriptive epidemiology?
Measure the frequency of outcomes and distribution in a population over time.
Identifies the:
Who
Where
When
- Where health resources should be allocated
- Identify patterns or anomalies
Define– secular trends
What can cause secular trends?
Refer to changes expected to be sustained for a long period of time.
Can be due to:
- public health interventions
- emerging threats
- change in awareness and data collection (ascertainment bias) and definitions
Define– standardization of a population
What are the 2 ways this can be done?
Equalizing the data with respect to an outcome modifying factor.
Direct
Indirect
What is needed in order to do direct standardization?
How is it done?
Need to know the outcome frequencies of each subgroup/ strata of the populations being compared.
These stratum-specific frequencies are applied to a “standard population”, and a weighted frequency is calculated to come up with a “standardized frequency”.
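The weighting step in direct standardization can be written out in a few lines. This is a minimal sketch, assuming two age strata; the function name, rates, and standard population are made up for illustration.

```python
def directly_standardized_rate(stratum_rates, standard_population):
    """Weight each stratum-specific rate by the standard population's size."""
    total = sum(standard_population.values())
    weighted_cases = sum(
        stratum_rates[stratum] * standard_population[stratum]
        for stratum in standard_population
    )
    return weighted_cases / total


# Hypothetical stratum-specific outcome frequencies in the study population.
study_rates = {"young": 0.001, "old": 0.010}

# Hypothetical standard population used as the common set of weights.
standard = {"young": 7000, "old": 3000}

# (0.001 * 7000 + 0.010 * 3000) / 10000, i.e. 37 expected cases per 10,000.
print(directly_standardized_rate(study_rates, standard))
```

Applying the same standard population to every group being compared removes differences that are due only to population structure.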
What information is needed for “indirect standardization”?
How is this calculated?
We know the population structure and the total number of cases (we don’t know the stratum-specific frequencies).
Stratum-specific frequencies from a comparison population are applied to the study population’s structure to calculate the expected number of cases. The observed number of cases is then divided by the expected number to give a “standardized ratio” (SR). SRs of different populations shouldn’t be compared unless the populations have similar structures.
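The expected-cases calculation for indirect standardization can be sketched as follows. This is a minimal sketch, assuming two age strata; the function name, reference rates, population sizes, and observed case count are made up for illustration.

```python
def standardized_ratio(observed_cases, reference_rates, study_population):
    """Observed cases divided by the cases expected under reference rates."""
    expected_cases = sum(
        reference_rates[stratum] * study_population[stratum]
        for stratum in study_population
    )
    return observed_cases / expected_cases


# Hypothetical stratum-specific frequencies from the comparison population.
reference_rates = {"young": 0.001, "old": 0.010}

# Hypothetical study population: structure known, stratum rates unknown.
study_population = {"young": 5000, "old": 5000}

# Expected cases = 0.001 * 5000 + 0.010 * 5000 = 55; observed total = 66.
sr = standardized_ratio(66, reference_rates, study_population)
print(sr)
```

An SR above 1 means more cases were observed than expected from the comparison population's rates, and an SR below 1 means fewer.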