Observational study designs Flashcards
Define prevalence
Measures the frequency of “cases” in a given population at a designated time
Requires a suitable denominator (e.g. GP registered patients)
Expressed as a percentage
What are cross sectional studies?
- Used to measure prevalence by testing individuals in a population individually
- Can also measure exposures
- Numerator (number of people with diagnosis)/ denominator (total population)
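The numerator/denominator calculation above can be sketched in a few lines. The function name and the counts are assumptions made up for illustration, not figures from the cards.

```python
def prevalence(cases: int, population: int) -> float:
    """Return prevalence as a percentage: cases / population * 100."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= cases <= population:
        raise ValueError("cases must be between 0 and population")
    return 100 * cases / population


# Hypothetical example: 150 diagnosed patients among 5,000 GP registered patients.
example = prevalence(150, 5000)
print(f"Prevalence: {example:.1f}%")
```

The denominator check reflects the point that prevalence needs a suitable, well-defined population to be meaningful.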
What is the difference between point and period prevalence?
Point prevalence- prevalence at a single moment in time
Period prevalence- prevalence over a defined period of time; used for conditions that fluctuate, e.g. hay fever
Give three strengths of cross sectional studies
- Measure prevalence and thus disease burden in whole populations and subpopulations
- Can compare prevalence in exposed and non exposed to risk factors
- Quick and inexpensive
- Can be used to initially explore a hypothesis, prior to another study type
Give three weaknesses of cross sectional studies
- Not suitable for rare diseases
- Not suitable for diseases of short duration
- Cannot separate cause and effect as they are measured at the same time
- Cannot measure rate of new cases arising and any changes therein
What are cohort studies?
- They measure incidence by following a group of people over time and recording the onset of a disease/health event
- Incidence of disease is compared among those exposed and unexposed to a risk factor
What is incidence?
The number of instances of illness/disease onset, in a given period in a defined population
- the numerator is the number of new events in a population; the denominator is the average number of persons exposed to risk during this period
Consider the direction of association in relative risk
What does a relative risk of the following mean:
a) <1
b) RR=1
c) >1
a) Risk in exposed group less than the risk in non-exposed group. The exposure may be protective against the disease (negative association)
b) Risk in exposed group equal to risk in non-exposed group. No association
c) Risk in exposed group greater than the risk in non-exposed group. The exposure may be a risk factor for disease (positive association)
What do the following relative risk values mean about the strength of association?
a) RR=1.5
b) RR= 3.0
c) RR= 0.8
a) risk of outcome 50% greater in exposed than unexposed group
b) risk in exposed is 3x unexposed
c) risk of outcome 20% lower in exposed than unexposed.
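The readings above follow from simple arithmetic on the RR: the percentage change in risk is the distance of the RR from 1. A minimal sketch, where the function names and the cohort counts are illustrative assumptions:

```python
import math


def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """RR = incidence in exposed / incidence in unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def describe_rr(rr):
    """Turn an RR into the plain-language reading used on the cards."""
    if math.isclose(rr, 1.0):
        return "no association: risk is equal in exposed and unexposed"
    change = abs(rr - 1) * 100  # percentage difference from the unexposed risk
    if rr > 1:
        return (f"positive association: risk is {change:.0f}% higher "
                f"in exposed ({rr:.1f}x unexposed)")
    return f"negative association: risk is {change:.0f}% lower in exposed"


for rr in (1.5, 3.0, 0.8, 1.0):
    print(f"RR={rr}: {describe_rr(rr)}")

# Hypothetical cohort: 30 of 200 exposed and 10 of 100 unexposed develop the disease.
print(round(relative_risk(30, 200, 10, 100), 2))
```

Note that these readings describe the point estimate only; whether the association is statistically convincing is a separate question.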
State 3 strengths of cohort studies
- Able to calculate incidence and relative risk
- Can offer some evidence of cause- effect relationship
- Can identify more than one disease related to single exposure
- Good when exposure is rare
- Minimises selection and information bias
State 3 weaknesses of cohort studies
- Potential for losses to follow up
- Often requires large sample, can take time to complete
- Less suitable for rare diseases
- Expensive
- If retrospective, data availability and quality may be poor
What are case-control studies?
- two groups of participants are selected: those with the condition (cases) and those without (controls)
- controls selected to be as similar as possible to the cases (e.g. age, gender, occupation, stage of illness)
- variables not of interest are matched at selection (potential confounders)
- exposures of interest are not matched
- past exposures in both groups are measured
State 3 weaknesses of case control studies
- cannot calculate prevalence or incidence
- less suitable for rare exposures
- can be hard to ensure exposure occurred before onset
- retrospective data availability and quality may be poor
- a suitable control group may be difficult to find
State 3 strengths of case control studies
- can offer some evidence of cause-effect relationship
- can identify multiple exposures (both positive and negative associations, interactions)
- good when disease is rare
- minimises selection and information bias
- retrospective: cheaper and typically shorter in duration
What is the difference between risk and odds?
What is an odds ratio?
Risk= outcome of interest/total number of all possible outcomes
Odds= outcome of interest/ outcomes not of interest
Odds ratio= odds in exposed/ odds in non-exposed
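The three definitions above differ only in their denominators, which a short sketch makes concrete. The function names and counts are made up for illustration.

```python
def risk(events, non_events):
    """Risk = outcomes of interest / all possible outcomes."""
    return events / (events + non_events)


def odds(events, non_events):
    """Odds = outcomes of interest / outcomes not of interest."""
    return events / non_events


def odds_ratio(exposed_events, exposed_non_events,
               unexposed_events, unexposed_non_events):
    """Odds ratio = odds in exposed / odds in non-exposed."""
    return (odds(exposed_events, exposed_non_events)
            / odds(unexposed_events, unexposed_non_events))


# Hypothetical counts: 20 of 100 exposed and 10 of 100 non-exposed have the outcome.
print(f"Risk in exposed: {risk(20, 80):.2f}")
print(f"Odds in exposed: {odds(20, 80):.2f}")
print(f"Odds ratio:      {odds_ratio(20, 80, 10, 90):.2f}")
```

For the same 20 events, the risk divides by all 100 people while the odds divide by the 80 without the outcome, so the odds are always the larger number.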
Define relative risk
= incidence of disease in exposed divided by incidence of disease in unexposed
the risk is
How does relative risk compare to odds ratio?
Risk is calculated using the total population at risk of developing the disease. No one has the disease at the outset
In a CCS, participants are selected on basis of having a disease (or not)
- therefore we don’t know the size of population at risk/the absolute risk of developing disease
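- The OR exaggerates the RR: it lies further from 1 than the RR, most noticeably when the outcome is common, and approximates the RR when the outcome is rare. Don’t use them interchangeably, although one can be converted to the other.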
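A small numerical sketch of how the OR relates to the RR. It uses made-up cohort-style counts (where both measures can be calculated, unlike in a real case-control study), and the function name is an assumption for illustration. In these examples the OR sits further from 1 than the RR, and the gap nearly disappears when the outcome is rare.

```python
def rr_and_or(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return (relative risk, odds ratio) from full cohort-style counts."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)
    return risk_exposed / risk_unexposed, odds_exposed / odds_unexposed


# Hypothetical scenarios: (exposed cases, exposed total, unexposed cases, unexposed total)
scenarios = {
    "rare outcome, harmful exposure": (2, 1000, 1, 1000),
    "common outcome, harmful exposure": (60, 100, 30, 100),
    "common outcome, protective exposure": (30, 100, 60, 100),
}

for label, counts in scenarios.items():
    rr, odds_ratio = rr_and_or(*counts)
    print(f"{label}: RR={rr:.2f}, OR={odds_ratio:.2f}")
```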
How do you calculate relative risk?
Incidence of adverse outcome in sample with one intervention/Incidence of adverse outcome in sample with another intervention
What is a p value?
When will you see it?
The probability that the difference observed could have occurred by chance if the groups compared were really alike.
e.g. p = 0.05 = 1/20
Results of a comparative statistical test (t test/chi squared) have p values
What are confidence intervals?
A confidence interval is a range of values that is expected, with a given level of confidence (e.g. 95%), to contain the true value of a variable.
When do we need confidence intervals?
- Measures of effect (group comparisons)
- Population estimates (single population parameters)
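As an illustration of a confidence interval around a single population estimate, here is a minimal sketch for a proportion such as prevalence. It assumes the normal-approximation (Wald) method, which is only one of several ways to build such an interval and behaves poorly for small samples or proportions near 0 or 1. The function name and counts are made up.

```python
import math


def proportion_ci(cases, n, z=1.96):
    """Approximate CI for a proportion: p +/- z * sqrt(p * (1 - p) / n).

    z=1.96 corresponds to a 95% interval under the normal approximation.
    """
    p = cases / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error


# Hypothetical survey: 150 cases found among 5,000 people sampled.
lower, upper = proportion_ci(150, 5000)
print(f"Estimate: {150 / 5000:.1%}, 95% CI: {lower:.1%} to {upper:.1%}")
```

A larger sample shrinks the standard error, so the interval narrows around the same estimate.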
Define sensitivity
Proportion of people with disease who correctly test positive
A/(A+C) on 2x2 table for diagnostic test
Define specificity
Proportion of people without disease who test negative
D/(D+B) on 2x2 table for diagnostic tests
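The two formulas can be sketched using the 2x2 cell labels from the cards, on the assumption that A = true positives, B = false positives, C = false negatives and D = true negatives. The counts are made up for illustration.

```python
def sensitivity(a, c):
    """A / (A + C): true positives over everyone with the disease."""
    return a / (a + c)


def specificity(d, b):
    """D / (D + B): true negatives over everyone without the disease."""
    return d / (d + b)


# Hypothetical 2x2 table for a diagnostic test:
#                  disease   no disease
# test positive    A = 90    B = 50
# test negative    C = 10    D = 850
a, b, c, d = 90, 50, 10, 850

print(f"Sensitivity: {sensitivity(a, c):.1%}")
print(f"Specificity: {specificity(d, b):.1%}")
```

Each measure is calculated within a single column of the table, which is why neither changes when the ratio of diseased to non-diseased people changes.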
Typically, a hypothesis is that we will see a difference between two groups because of different interventions.
What is a type 1 error?
We observed a difference when there wasn’t really one e.g. our intervention was significantly better in our study, but this effect does not actually exist
The null hypothesis will be wrongly rejected
Typically, a hypothesis is that we will see a difference between two groups because of different interventions.
What is a type 2 error?
We didn’t observe a difference when there actually was one e.g. our interventions looked equivalent but actually the new intervention is better
The null hypothesis is wrongly accepted (i.e. not rejected when it is false)
What is the difference between the significance level and the power of the study?
The significance level is the type 1 error rate we are willing to accept, i.e. how often we are comfortable making a type 1 error. Usually 5%
Power is the complement of the type 2 error rate, i.e. the probability that a test will not miss an effect when an effect truly exists. The type 2 error rate is conventionally set at 0.2, so power tends to be set at 1 minus 0.2: 0.8 or 80%
What is a positive predictive value?
What is a negative predictive value?
What do these values depend on?
Positive Predictive Value (PPV) = likelihood patient with positive test result actually has the disease
Negative Predictive Value (NPV) = likelihood patient with negative test result does not have the disease
Predictive values depend on the sensitivity and specificity of the test – and prevalence of the disease
What is the relationship between prevalence, PPV and NPV?
Why is this?
As prevalence increases, PPV increases and NPV decreases
Because underlying frequency of disease has increased in the given population
*sensitivity and specificity are independent of prevalence
What do you tell the woman with a positive mammogram?
Positive predictive value
- The likelihood of having the disease given that you have tested positive
“The probability that you really have the disease is 8.3%”
What is the difference between PPV/NPV and sensitivity/specificity?
PPV and NPV depend on the prevalence of a condition: they give the likelihood that a test result is correct for a patient in a given population, whereas sensitivity and specificity are independent of prevalence.
What do you tell the woman with a negative mammogram?
“The probability that, following your negative result, you do not have the disease is 99.9%”
What happens to PPV/NPV as prevalence:
a) increases
b) decreases
a) PPV increases, NPV decreases
b) PPV decreases, NPV increases
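The dependence of PPV and NPV on prevalence can be sketched with Bayes' theorem. The sensitivity and specificity of 90% and the three prevalence values are assumptions chosen for illustration; the cards do not state which inputs lie behind their figures.

```python
def ppv(sens, spec, prev):
    """P(disease | positive test) = true positives / all positives."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


def npv(sens, spec, prev):
    """P(no disease | negative test) = true negatives / all negatives."""
    true_neg = spec * (1 - prev)
    false_neg = (1 - sens) * prev
    return true_neg / (true_neg + false_neg)


# Assumed test characteristics, held fixed while prevalence varies.
SENS, SPEC = 0.9, 0.9

for prev in (0.01, 0.10, 0.50):
    print(f"prevalence={prev:.0%}: "
          f"PPV={ppv(SENS, SPEC, prev):.1%}, NPV={npv(SENS, SPEC, prev):.1%}")
```

With these assumed inputs, a prevalence of 1% gives a PPV of about 8.3% and an NPV of about 99.9%, which happen to match the figures quoted in the mammogram cards. As prevalence rises, the PPV climbs and the NPV falls, even though sensitivity and specificity stay fixed.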