Ch 5 & 6 & 7 study design, descriptive epidemiology, analytical Flashcards
• identify the principal epidemiological study designs
descriptive and analytical (ecological, cohort, cross sectional, case control)
• outline the steps for developing a study protocol
- study design
- study area and population
- eligibility criteria
- sample size calculation
- sampling methods
- data collection
- data/ statistical analysis
- ethics
• evaluate the role of different sampling strategies in obtaining representative data
simple random sampling, systematic random sampling, stratified random sampling, cluster sampling, multi-stage sampling
• recognize different data types and how best to report them
quantitative (discrete, continuous)
qualitative (binomial, ordered categorical, unordered categorical)
• describe the basic approaches to analyzing and interpreting epidemiological data
proportions
regression
meta-analysis
What are 2 important concepts related to sample size calculation?
Statistical power & precision.
Statistical power is the probability of detecting an effect if it is real.
Statistical precision (significance) is the probability of detecting an effect if it is not real, i.e. a false positive.
Typically what power and precision are most common?
To increase the power and precision would require what?
It is typical to aim for a sample size with 80–90% power and 5% precision (significance) to detect a valid estimate of effect.
Larger sample sizes.
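A rough sketch of how power and significance drive sample size. It assumes a comparison of two independent proportions and the common normal-approximation formula; the function name and the example proportions (20% vs 10%) are hypothetical and not from the cards.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect p1 vs p2.

    Assumes two independent groups, a two-sided significance level
    `alpha`, and the normal approximation to the binomial.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to define an effect")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical outcome frequencies: 20% in one group, 10% in the other.
n_80 = sample_size_two_proportions(0.20, 0.10, power=0.80)
n_90 = sample_size_two_proportions(0.20, 0.10, power=0.90)
print(n_80, n_90)  # the 90% power design needs more participants
```

Raising the power, lowering alpha, or looking for a smaller difference all push the required sample size up.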
What are examples of sampling approaches?
Simple random sampling– random selection (shuffling cards and picking a card)
Systematic sampling– selecting samples at regular intervals, e.g. every fifth child on a list to sample 20% of kids
Stratified sampling– distinct groups with expected different outcomes, random sampling within each group/ strata
Cluster sampling – uses hierarchical structure of study group (households, and then also testing each person in that household)
Multi-stage sampling– uses hierarchical structure, but more than one stage of sampling (like randomly selecting schools, then randomly selecting kids in that school)
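The sampling approaches above can be sketched with Python's standard `random` module. This is only an illustration: the function names, the list of 100 child IDs, and the school groupings are all made up for the example.

```python
import random


def simple_random_sample(population, n, rng):
    """Every individual has the same chance of being selected."""
    return rng.sample(population, n)


def systematic_sample(population, interval, rng):
    """Random starting point, then every `interval`-th individual."""
    start = rng.randrange(interval)
    return population[start::interval]


def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample drawn separately within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def two_stage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Multi-stage: randomly pick clusters, then individuals within them.

    (Cluster sampling would instead keep everyone in the chosen clusters.)
    """
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}


rng = random.Random(42)          # fixed seed so the sketch is repeatable
children = list(range(1, 101))   # hypothetical sampling frame of 100 IDs

srs = simple_random_sample(children, 20, rng)
systematic = systematic_sample(children, 5, rng)  # every 5th child = 20%
strata = {"urban": children[:60], "rural": children[60:]}
stratified = stratified_sample(strata, 10, rng)
schools = {"A": children[:30], "B": children[30:60], "C": children[60:]}
two_stage = two_stage_sample(schools, 2, 3, rng)
```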
What are the different types of data that can be collected?
Indirect data– records, registries
Direct data– surveys, blood specimens
Environmental/climate data– GPS, rainfall
Define– null hypothesis
Statistical tests start with a null hypothesis that there is no difference between two measures
What does the p-value tell us?
What does a high/low p-value tell us?
Tells us the probability of seeing the observed difference (or a more extreme one) by chance alone if the null hypothesis is true. A large p-value may reflect a lack of power; it doesn’t mean a true difference doesn’t exist, just that chance can’t be ruled out as an explanation. A low p-value means the observed difference is unlikely to be due to chance alone.
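A small sketch of why a large p-value can reflect lack of power rather than no effect. It assumes a two-sided z-test for the difference between two proportions (normal approximation); the test choice, function name, and counts are illustrative assumptions, not from the cards.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(cases_a, n_a, cases_b, n_b):
    """Two-sided p-value for H0: both groups share one true proportion."""
    p_a, p_b = cases_a / n_a, cases_b / n_b
    pooled = (cases_a + cases_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence against H0
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same observed proportions (30% vs 15%), very different sample sizes.
small_study = two_proportion_p_value(6, 20, 3, 20)
large_study = two_proportion_p_value(60, 200, 30, 200)
print(small_study, large_study)
```

With these made-up counts the small study gives a p-value above 0.05 and the large study one below it, even though the observed difference is identical.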
Define – correlation coefficient.
What symbol is often used to denote this?
What does a negative value mean?
The correlation coefficient measures the strength and direction of the relationship between two variables. It is denoted by “r” and can take any value from -1 to +1. If r is negative, one variable decreases as the other increases. If r is positive, both variables increase together. r = 0 means there is no relationship, i.e. the variables are not associated.
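As an illustration, r can be computed by hand. This sketch assumes the Pearson correlation coefficient (the cards do not say which coefficient is meant) and uses made-up data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either variable never changes.
    return cov / math.sqrt(var_x * var_y)


exposure = [1, 2, 3, 4, 5]                    # hypothetical values
rises_with = [2, 4, 6, 8, 10]                 # increases with exposure
falls_with = [10, 8, 6, 4, 2]                 # decreases as exposure increases

r_positive = pearson_r(exposure, rises_with)  # close to +1
r_negative = pearson_r(exposure, falls_with)  # close to -1
```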
What is the function of regression analysis?
To compare multiple categories or look at a continuous exposure. Tries to describe how the variables are related and to predict the outcome.
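A minimal sketch of the idea for one continuous exposure, assuming simple linear regression fitted by least squares (only one of several regression types). The function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for outcome = intercept + slope * exposure."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted outcome for a given exposure level."""
    return intercept + slope * x


# Hypothetical data: exposure level and measured outcome.
exposure = [0, 1, 2, 3, 4]
outcome = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(exposure, outcome)
predicted_at_5 = predict(slope, intercept, 5)
```

The slope describes how the outcome changes per unit of exposure, and the fitted line can then be used to predict the outcome at exposure levels that were not observed.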
What is a meta-analysis used for?
A way to get a combined estimate of effect by pooling several studies of the same exposure-outcome association. This increases statistical power and gives an estimate that is likely to be closer to the true value.
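One way to picture the pooling, assuming a fixed-effect, inverse-variance weighted average (the cards do not name a method, so this is only one common option). The study estimates and standard errors below are invented.

```python
import math


def pooled_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted average of study estimates.

    More precise studies (smaller standard error) get more weight.
    Returns the pooled estimate and its standard error.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return pooled, pooled_se


# Hypothetical log risk ratios from three studies and their standard errors.
log_rr = [0.30, 0.10, 0.25]
se = [0.20, 0.15, 0.30]

pooled, pooled_se = pooled_fixed_effect(log_rr, se)
```

The pooled standard error is smaller than that of any single study, which is the gain in power and precision the card describes.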
What is the goal of descriptive epidemiology?
Measure the frequency of outcomes and distribution in a population over time.
Identifies the:
Who
Where
When
- Where health resources should be allocated
- Identify patterns or anomalies
Define– secular trends
What can cause secular trends?
Refer to changes expected to be sustained for a long period of time.
Can be due to:
- public health interventions
- emerging threats
- change in awareness and data collection (ascertainment bias) and definitions
Define– standardization of a population
What are the 2 ways this can be done?
Equalizing the data with respect to an outcome modifying factor.
Direct
Indirect
What is needed in order to do direct standardization?
How is it done?
Need to know the outcome frequencies of each subgroup/ strata of the populations being compared.
These stratum-specific frequencies are applied to a “standard population”, and a weighted average is calculated to give a “standardized frequency”.
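A worked sketch of direct standardization with two age strata. All rates, population sizes, and names are made up for illustration.

```python
def directly_standardized_rate(stratum_rates, standard_population):
    """Weighted average of stratum-specific rates.

    Weights are the sizes of each stratum in the standard population.
    """
    total = sum(standard_population.values())
    weighted = sum(stratum_rates[s] * standard_population[s]
                   for s in standard_population)
    return weighted / total


# Hypothetical stratum-specific outcome frequencies in two towns.
town_a = {"young": 0.001, "old": 0.010}
town_b = {"young": 0.002, "old": 0.008}

# Hypothetical standard population shared by both comparisons.
standard = {"young": 7000, "old": 3000}

rate_a = directly_standardized_rate(town_a, standard)  # (7 + 30) / 10000
rate_b = directly_standardized_rate(town_b, standard)  # (14 + 24) / 10000
```

Because both towns are weighted by the same standard population, the two standardized frequencies can be compared directly.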
What information is needed for “indirect standardization”?
How is this calculated?
We know population structure and total number of cases (we don’t know strata specific frequencies).
Stratum-specific frequencies from a comparison population are applied to the study population’s structure to calculate the expected number of cases. The observed number of cases divided by the expected number gives a “standardized ratio” (SR). SRs of different populations shouldn’t be compared unless the populations have similar structures.
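A worked sketch of indirect standardization, again with invented numbers and names: the study population's structure and total cases are known, and the stratum-specific frequencies come from a comparison population.

```python
def standardized_ratio(observed_cases, study_population, reference_rates):
    """Observed cases divided by the cases expected under reference rates.

    Returns (standardized ratio, expected number of cases).
    """
    expected = sum(reference_rates[s] * study_population[s]
                   for s in study_population)
    return observed_cases / expected, expected


# Hypothetical study population structure and total observed cases.
study_population = {"young": 4000, "old": 1000}
observed = 21

# Hypothetical stratum-specific frequencies from the comparison population.
reference_rates = {"young": 0.001, "old": 0.010}

sr, expected = standardized_ratio(observed, study_population, reference_rates)
# expected = 4 + 10 = 14 cases, so SR = 21 / 14 = 1.5
```

An SR above 1 means more cases were observed than expected from the comparison population's frequencies.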
Which analytical study is best for group-level analysis, and can show that an association exists and give preliminary data?
Ecological study.
Which analytical study is best for identifying possible epidemiological associations rapidly, and can show that an association exists and give preliminary data?
Cross-sectional study.
Which analytical study is best for following the natural course of exposure-outcome progression, and provides more evidence of causality but is more complicated and expensive to carry out?
Cohort study.
Which analytical study is best for new and rare outcomes, and provides more evidence of causality but is more complicated and expensive to carry out?
Case-control study.
What is the purpose of analytical studies?
To detect and confirm hypotheses about certain associations in epidemiology.
Factors to consider when selecting a type of analytical study to carry out?
- existing info about the association of interest
- level at which the exposure is found
- expected frequency of outcome
- urgency of identifying the cause
- constraints on time, budget, staff
What are the key features of an ecological study?
- population based
- prevalence of cases
- group level effects
- measure correlation or any relative risk
What are the key features of cross-sectional study?
- outcome and exposure measured simultaneously
- prevalent cases
- measure association as prevalence ratio (or odds ratio if outcome is rare)
What are the key features of cohort study?
- Exposure determined before cases detected
- Incident cases
- Measure association as risk ratio or incidence rate ratio
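The two cohort measures of association can be sketched as follows. The counts and person-years are hypothetical, and the function names are just for this example.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


def incidence_rate_ratio(cases_exposed, time_exposed,
                         cases_unexposed, time_unexposed):
    """Incidence rate (cases per unit of person-time) in exposed vs unexposed."""
    rate_exposed = cases_exposed / time_exposed
    rate_unexposed = cases_unexposed / time_unexposed
    return rate_exposed / rate_unexposed


# Hypothetical cohort: 1000 exposed and 1000 unexposed people at baseline.
rr = risk_ratio(30, 1000, 10, 1000)              # 3% vs 1% risk

# Same cases, but using person-years of follow-up as the denominator.
irr = incidence_rate_ratio(30, 2500, 10, 2000)   # 12 vs 5 per 1000 person-years
```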
What are the key features of case-control study?
- Participants selected by outcome; exposure determined subsequently
- Prevalent cases
- Measure association as odds ratio of exposure
What are the advantages of an ecological study?
- Relatively easy to collect data (routine)
- Rapid
- Relatively inexpensive
What are the advantages of a cross-sectional study?
- Easy to collect data
- Rapid
- Inexpensive
- Good for fixed exposures (e.g. genetic)
What are the advantages of a cohort study?
- Low probability of selection bias
- Low probability of confounding
- Multiple outcomes
What are the advantages of a case-control study?
- Can be rapid (good for outbreaks)
- Relatively inexpensive
- Efficient for rare outcomes, or long period between exposure and outcome
- Multiple exposures
What are the disadvantages of an ecological study?
- High probability of confounding
- Medium probability of selection bias
- Cannot show causality at the individual level (‘ecological fallacy’)
What are the disadvantages of a cross-sectional study?
- High probability of information bias (possibility of ‘reverse causality’)
- Medium probability of confounding
What are the disadvantages of a cohort study?
- Medium to high probability of loss to follow-up (for long studies)
- Logistically difficult
- Expensive
- Time-consuming
What are the disadvantages of a case-control study?
- High probability of selection and information bias
- Medium probability of confounding
- No data on frequency of outcome
Define– ecological fallacy
Drawing conclusions about individual-level associations from group-level data. It assumes each individual is the “average person”.
What are the main reasons to undertake an ecological study?
- group level data available
- exposure can only be measured at group level (air pollution)
- data difficult to measure at individual level (alcohol consumption)
- to study group-level effects (public health interventions)
What are some limitations of ecological study?
- prone to bias and confounding
- reasons for collecting the data may not be the same reasons for conducting a study
- different parts of an area collect differently
- population moves
- diagnostic criteria change over time
- can’t reliably compare these data with other places because they may have been collected differently
Can causality be inferred on a cross-sectional study?
No. Exposure and outcome are measured at the same time, so we cannot say whether the exposure occurred before the outcome. This type of study gives no information on the temporality of events.
What are cross-sectional studies usually used for?
- generating research hypotheses, to look more into whether there is a causality that is suggested by this type of study
- health service planning and monitoring
- measuring prevalent cases
- frequency of chronic outcomes
- markers of previous infections (antibody testing)
Define– reverse causality
When the exposure changes over time in response to the outcome (e.g. a disease causing a change in behavior), so the outcome is influencing the exposure rather than the other way round.
Ex– people with colon cancer may adapt the foods they eat to reduce discomfort, so current diet would not be a reasonable proxy for previous diet as a risk factor.
What type of bias are case control studies susceptible to?
Selection bias– systematic difference between the characteristics of the individuals sampled and the population from which individuals were selected.
Ex. saying exercise has health benefits, when the people who choose to exercise may inherently differ from those who don’t.
Information bias– misclassification of exposure or outcome. Examples are observer bias (systematically rounding measurements up or down), responder bias (recall bias), and measurement bias (biased questionnaires).
What outcome measure is used for case-control studies?
odds ratio
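A short sketch of the odds ratio of exposure from a 2x2 table. The counts are invented and the function name is hypothetical.

```python
def odds_ratio(cases_exposed, cases_unexposed,
               controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls


# Hypothetical case-control study: 100 cases and 100 controls.
#              exposed   unexposed
# cases           40        60
# controls        20        80
or_exposure = odds_ratio(40, 60, 20, 80)  # (40/60) / (20/80) = 8/3
```

An odds ratio above 1 suggests exposure is more common among cases than controls; a value of 1 suggests no association.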
What is overmatching?
Matching cases and controls on too many characteristics. The groups then aren’t different enough in their exposure to reveal an association between exposure and outcome.