Semester 2 - Lecture Revision Flashcards
What is the definition of epidemiology?
Epidemiology is the study of the distribution and determinants of disease in a population
What relevant questions are we asking in epidemiology, which can aid us as clinicians?
- How often does the disease occur in the population?
- How reliable is that diagnostic test I have used to test that patient?
- What is the likely outcome for this patient, given their disease stage and risk factors? - prognosis
- What factors might increase the risk or severity of disease for this patient?
- What treatments are available and effective to treat this patient?
What is a cross-sectional study?
In a cross-sectional study, data are collected from each subject at one point in time
At a given point (or short period) in time - how many people have this disease and how many don’t
Information can be gathered about outcomes (e.g. diseases, infections, or other conditions of interest), about exposures (e.g. smoking, diet), or about both outcomes and exposures at the same time.
What is a cohort study?
Here the starting point is disease-free people, and we start by determining their exposure status and then following-up this population to see what happens to them over time
Moving forward in time - how does ‘x’ exposure change the risk/incidence of ‘y’ disease - compare the exposed and unexposed groups.
What is a case-control study?
Start with a population of people with the disease (cases) and people without the disease (controls) - both need to be drawn from the same population
Work back in time to look at their exposure levels - to find potential risk factors
What is an intervention study?
Intervention studies study the effects of exposures, so here the outcomes of interest are compared between the different exposure groups.
And in an intervention study, the exposure is something that a researcher or clinician has ‘intervened’ to give one arm of the study population - purposeful allocation of exposure
Why do we need to critically appraise studies?
What is critical appraisal?
- What is the research question and what study design was used?
- Who were the participants included in the study? How were they identified for inclusion? We would want to establish what the exposure and outcome were and how these were measured, and how the data were analysed.
- Are participants in the study similar to the patients that you see? Were all important outcomes considered, i.e. were harms as well as benefits considered?
Are there critical appraisal tools available online that we can use?
There are two UK based websites that provide critical appraisal tools for free, and together provide checklists for all the study designs that we are covering throughout this course.
- Critical Appraisal Skills Programme (CASP)
- Centre for Evidence-Based Medicine (CEBM)
Why is disease frequency important to epidemiology?
Measures of disease frequency help us to quantify how common a disease/outcome is in a specific population within a specific time frame.
What three variables are needed to calculate disease frequency?
So underpinning all measures of disease frequency is the need for information on the …
- Number of cases of disease
- The size of the population
- The timeframe in which these have been measured.
Example - 10 cases of Covid-19 per 100,000 of the population in December
Example shown - cases divided by population, multiplied by 1,000 to get cases per 1,000 of the population
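The scaling step can be sketched in Python. The function name `cases_per` and the figures used are illustrative assumptions, not values from the lecture.

```python
def cases_per(cases: int, population: int, per: int = 1000) -> float:
    """Return disease frequency expressed as cases per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * per


# Hypothetical figures: 50 cases in a population of 20,000
print(cases_per(50, 20_000))            # cases per 1,000
print(cases_per(50, 20_000, 100_000))   # cases per 100,000
```

The time frame is not part of the arithmetic, so it has to be stated alongside the result.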
What are the two main types of disease frequency measurements?
Prevalence - looks at the frequency of existing cases of disease
Incidence - looks at the frequency of new cases of disease
Within incidence, there are two different measures of frequency – firstly we have risk and secondly we have rates
What is the definition of prevalence and how is it calculated?
Prevalence - quantifies the proportion of the study population who have a disease at a specific point in time, and therefore provides an estimate of the probability that an individual will have the disease at that point in time.
Note - The results are often multiplied by 100 so as to be expressed as a percentage.
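A minimal sketch of the calculation, assuming the standard formula (existing cases divided by the total study population at that point in time). The survey numbers are hypothetical.

```python
def prevalence(existing_cases: int, population: int) -> float:
    """Proportion of the study population with the disease at one point in time."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= existing_cases <= population:
        raise ValueError("existing_cases must be between 0 and population")
    return existing_cases / population


# Hypothetical survey: 120 existing cases among 4,000 people examined
p = prevalence(120, 4_000)
print(f"Prevalence: {p:.3f} ({p * 100:.1f}%)")
```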
What is the definition of risk and how is it calculated?
Risk is the proportion of people who become diseased during a specific time period and is calculated as…
Number of new cases of disease during a defined period divided by the population initially at risk of developing the disease
This provides an estimate of the risk that an individual will develop a disease over a specified period of time.
What is the definition of a rate and how is it calculated?
When calculating rates, what does total person-time at risk refer to (denominator)?
Within a pre-set amount of time, e.g. a year, we look at how much time (e.g. months) each person is disease-free but at risk. Once someone gets the disease they no longer contribute to total person-time at risk and instead are counted as 1 new case (numerator)
Example time period of 1 year, looking at malaria incidence.
Are prevalence, risk and rates all proportions?
No - Prevalence and risk are both proportions in that they will always vary between 0 and 1
Rates are not proportions, with the denominator being the total follow-up time in the study population. When calculating or reporting rates it is important to specify the time unit - i.e. is your rate per person-year, per person-month or per person-day?
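The difference between risk and rate can be sketched on a small cohort. The five follow-up records are hypothetical, and the sketch assumes each person stops contributing person-time once they become a case.

```python
# Hypothetical 12-month follow-up: (months at risk, became a case?)
cohort = [
    (12, False),
    (12, False),
    (6, True),    # became a case after 6 months, stops contributing time
    (3, True),    # became a case after 3 months
    (12, False),
]

new_cases = sum(1 for _, is_case in cohort if is_case)
person_months = sum(months for months, _ in cohort)

# Risk: new cases / population initially at risk (a proportion, 0 to 1)
risk = new_cases / len(cohort)

# Rate: new cases / total person-time at risk (not a proportion)
rate_per_person_month = new_cases / person_months
rate_per_100_person_years = rate_per_person_month * 12 * 100

print(f"Risk over 12 months: {risk:.2f}")
print(f"Rate: {rate_per_100_person_years:.1f} per 100 person-years")
```

Changing the time unit changes the number reported for the rate but not for the risk, which is why the unit must always be given.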
How do you calculate odds?
Odds can be calculated for both prevalence and risk
Instead of the denominator being the population at risk of disease, the denominator for odds is the population without the disease
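A short sketch comparing a proportion with the corresponding odds. The counts are hypothetical and the function name `odds` is an assumption for illustration.

```python
def odds(cases: int, population: int) -> float:
    """Odds = people with the disease / people without the disease."""
    if not 0 <= cases <= population:
        raise ValueError("cases must be between 0 and population")
    non_cases = population - cases
    if non_cases == 0:
        raise ValueError("odds are undefined when nobody is disease-free")
    return cases / non_cases


# Hypothetical: 20 cases among 100 people
proportion = 20 / 100          # prevalence or risk: denominator is everyone
disease_odds = odds(20, 100)   # odds: denominator is the 80 without disease
print(proportion, disease_odds)
```

When the disease is rare the two values are close, because almost the whole population is disease-free.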
What is the burden of disease? Why is it important?
Unfortunately, the world does not have unlimited resources for tackling disease and protecting health. This means that difficult decisions have to be made about how to divide up a finite resource among competing health issues and problems.
The Global Burden of Disease approach is an attempt to grapple with some of these challenges – aid in the allocation of resources
What is a Disability adjusted life year? What components make it up? How can it be used?
Measure - Disability adjusted life years (DALY)
1 DALY = 1 lost year of healthy life
DALY is made up of 2 components:
- Years of Life Lost (YLL) - a measure of premature mortality – comparing age of death to the benchmark highest life expectancy country (currently Japan – e.g. 86 years old)
- Years of Life lived with a Disability (YLD) - a measure of morbidity
Can be used in different ways…
a) Compare different diseases and their burdens
b) Compare the total burden of disease in a country
c) Compare disease burden between different time points – has it changed
How is years of life lost (YLL) calculated?
YLL = N x L, where:
* N = number of deaths
* L = standard life expectancy at age of death, in years (benchmark life expectancy minus the average age at death, e.g. 86 minus the average age at death from COPD)
If the average age of death from COPD in a given population is 65 years and there are 100,000 COPD deaths per year in that population:
* YLL = (86 – 65) x 100,000
* YLL = 21 x 100,000
* YLL = 2,100,000
How is years of life lived with a disability calculated?
YLD = I x DW x L, where:
* I = number of incident cases
* DW = disability weighting
* L = average duration of the case until remission or death (years)
If there are 100,000 cases of severe COPD in a population and severe COPD has a Disability Weighting (DW) of 0.383 and an average duration (L) of 5 years:
* YLD = 100,000 x 0.383 x 5
* YLD = 191,500 years of life lived with a disability
Once you have calculated YLL and YLD, how can you calculate the total burden of disease?
Burden of disease = YLL + YLD
Burden of severe COPD = YLL + YLD
Burden = 2,100,000 + 191,500
Burden = 2,291,500 DALYs
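The COPD calculation can be sketched in Python using the numbers from the worked example. The function names are illustrative assumptions, and the simple YLL form used here (deaths multiplied by benchmark life expectancy minus mean age at death) follows the worked example rather than any official Global Burden of Disease method.

```python
def years_of_life_lost(deaths: int, benchmark_life_expectancy: float,
                       mean_age_at_death: float) -> float:
    """YLL = N x L, with L = benchmark life expectancy - mean age at death."""
    return deaths * (benchmark_life_expectancy - mean_age_at_death)


def years_lived_with_disability(incident_cases: int, disability_weight: float,
                                duration_years: float) -> float:
    """YLD = I x DW x L."""
    return incident_cases * disability_weight * duration_years


yll = years_of_life_lost(100_000, 86, 65)
yld = years_lived_with_disability(100_000, 0.383, 5)
dalys = yll + yld

print(f"YLL = {yll:,.0f}, YLD = {yld:,.0f}, DALYs = {dalys:,.0f}")
```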
How are the disability weightings calculated?
In the earliest global burden of disease studies, a panel of clinical experts got together to decide all of this. They assigned a disability weighting to a long list of different disease states. Each state was assigned a weighting of between 0 (perfect health) and 1 (equivalent to death)
Members of the public were later incorporated into the weighting to make it fairer - based on a study in which people were given hypothetical scenarios describing a person with condition X and a person with condition Y, and had to choose between them.
What is likely the greatest problem/fault with calculating disease burden?
Perhaps the greatest weakness is the reliability of the underlying data on which estimates depend.
How are simple diagnostic test studies conducted?
We take a sample of people or patients. We conduct the index test, which is our new test that we are assessing.
And then we conduct the reference test, which is the best available way to identify whether people have or do not have a disease
What we end up with for each individual in our study population, is a result for the reference test and a result for the index test
Example diagnostic test study - a rapid oral test for HIV called OraQuick Advance (index test) vs. the gold-standard Western Blot (reference test). What are the true/false positives and negatives?
Data from the study shown in a 2x2 table
We can see that 11 of the 13 participants who were HIV+ on the Western Blot, tested HIV positive using the OraQuick test - true positives
2 out of the 13 participants who were HIV+ on the Western Blot were incorrectly identified as HIV- - false negatives.
Of the 1061 participants that were HIV- according to our reference test, 1060 were negative using the OraQuick test - these are our true negatives.
1 person was HIV- on the Western Blot but tested HIV+ using the OraQuick test - false positive.
What is sensitivity? Using the HIV data, how would you calculate sensitivity?
The closer the sensitivity is to 100%, the better the test is at ruling out disease when the test is negative - i.e. the test is good at picking up true positive results.
What is specificity? Using the HIV data, how would you calculate specificity?
The closer the specificity is to 100%, the better the test is at ruling in a disease when it is positive - i.e. the test is good at picking up true negative results.
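A sketch of both calculations using the OraQuick counts given above, assuming the standard definitions: sensitivity = true positives / (true positives + false negatives), and specificity = true negatives / (true negatives + false positives).

```python
# 2x2 counts from the OraQuick vs. Western Blot example
tp, fn = 11, 2       # Western Blot positive: test positive / test negative
tn, fp = 1060, 1     # Western Blot negative: test negative / test positive

sensitivity = tp / (tp + fn)   # proportion of people with disease who test positive
specificity = tn / (tn + fp)   # proportion of people without disease who test negative

print(f"Sensitivity: {sensitivity:.1%}")
print(f"Specificity: {specificity:.1%}")
```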
What is positive predictive value? Using the HIV data, how would you calculate PPV?
Out of the positive results given by the new test, how many are true positives?
It is important to note that positive predictive values are affected by the sensitivity and specificity of the test, as well as the prevalence of disease in the population.
The positive predictive value will increase with increased prevalence of disease
What is negative predictive value? Using the HIV data, how would you calculate NPV?
Like the positive predictive value, the negative predictive value is affected by the sensitivity and specificity of the test, as well as the prevalence of disease in the population.
The negative predictive value decreases with increased prevalence of disease.
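The effect of prevalence can be sketched by holding the OraQuick sensitivity and specificity fixed and varying prevalence. This assumes the standard definitions PPV = TP / (TP + FP) and NPV = TN / (TN + FN). The prevalence values in the loop are hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Directly from the study counts (TP=11, FP=1, TN=1060, FN=2)
ppv_study = 11 / (11 + 1)
npv_study = 1060 / (1060 + 2)
print(f"Study PPV: {ppv_study:.1%}, study NPV: {npv_study:.1%}")

# Same test, hypothetical prevalences
sens, spec = 11 / 13, 1060 / 1061
results = []
for prev in (0.001, 0.012, 0.10):
    ppv, npv = predictive_values(sens, spec, prev)
    results.append((prev, ppv, npv))
    print(f"prevalence {prev:.3f}: PPV {ppv:.3f}, NPV {npv:.4f}")
```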
How are the likelihood ratios for positive and negative results calculated?
The likelihood ratio for positive results is calculated by taking the ratio of the sensitivity of a test to 100 minus the specificity (both expressed as percentages)
If this ratio is greater than one, then the test result is associated with disease – and we can see for the OraQuick test that having a positive test result is associated with disease with a likelihood ratio for positive results of 846. Essentially, people with HIV are 846 times more likely to have a positive test than someone without HIV.
Likelihood ratio for negative results - This is calculated by taking the ratio of 100 minus the sensitivity to the specificity for a test.
If this ratio is less than one, then the test result is associated with not having disease - and we can see for the OraQuick test that having a negative test result is associated with not having disease, with a likelihood ratio for negative results of 0.15.
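A sketch of both ratios from the OraQuick counts. One caveat, offered as an observation rather than a correction: the figure of 846 appears to come from percentages rounded to one decimal place (84.6 and 99.9). Using the unrounded counts gives a larger positive likelihood ratio, because the result is very sensitive to the tiny value of 100 minus the specificity.

```python
tp, fn, tn, fp = 11, 2, 1060, 1

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

# Unrounded proportions
lr_positive = sensitivity / (1 - specificity)
lr_negative = (1 - sensitivity) / specificity

# Same formulas with percentages rounded to one decimal place
lr_positive_rounded = 84.6 / (100 - 99.9)
lr_negative_rounded = (100 - 84.6) / 99.9

print(f"LR+ unrounded: {lr_positive:.0f}, rounded inputs: {lr_positive_rounded:.0f}")
print(f"LR- unrounded: {lr_negative:.2f}, rounded inputs: {lr_negative_rounded:.2f}")
```

Either way the conclusion is the same: a positive result is strongly associated with disease and a negative result with its absence.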
What is the difference between internal and external validity?
What are four questions you should ask to assess the validity of a diagnostic study?
Recap - using this table calculate sensitivity, specificity, positive and negative predictive values.
What is an ROC curve used for when evaluating diagnostic tests?
Representation of diagnostic test performance
Plot sensitivity vs. 1-specificity
Perfect test - area under the curve = 1
Test as good as chance - AUC = 0.5
How is a reference range obtained?
We measure the analyte of interest in a sufficiently large sample of healthy individuals.
This is our reference population, and it should take into account factors such as age, sex, and ethnicity. We plot the values obtained against their frequency of occurrence.
For analytes that follow a Gaussian distribution, the normal range is defined as the mean ± 2 standard deviations - this encompasses approximately 95% of the values in the reference population
How are reference ranges decided when you have a skewed distribution?
The same principle applies - we use the 2.5th and 97.5th percentiles as the cut-off points for our reference range, so it still contains the central 95% of values.
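Both approaches can be sketched with Python's standard `statistics` module. The measurements are hypothetical, and a real reference range would need a far larger sample than this.

```python
import statistics


def gaussian_reference_range(values):
    """Mean +/- 2 standard deviations (for roughly Gaussian data)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - 2 * sd, mean + 2 * sd


def percentile_reference_range(values):
    """2.5th and 97.5th percentiles (usable for skewed data)."""
    # n=40 gives cut points every 2.5%, so the first and last cut points
    # are the 2.5th and 97.5th percentiles.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


# Hypothetical analyte measurements from healthy individuals
sample = [4.1, 4.4, 4.6, 4.7, 4.9, 5.0, 5.0, 5.1,
          5.2, 5.3, 5.5, 5.6, 5.8, 6.0, 6.4]

print(gaussian_reference_range(sample))
print(percentile_reference_range(sample))
```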
What is the reference change value/critical difference?
What are the definitions of sensitivity and specificity?
Sensitivity measures how good the test is at picking up an individual with the disease.
Specificity measures how good the test is at identifying the absence of disease.
Apart from quantifying disease frequency/burden, what else is epidemiology interested in?
The other aim is to investigate the effect of potential risk factors on the frequency of disease
We want to assess whether a disease is more common in one exposed population compared with another unexposed population.
What is the difference between binary, categorical and numerical data?
First two columns - there are only two response options - this is known as binary data, and is commonly what we work with in epidemiological analysis.
Third column - categorical - more than two response options - if the categories can be listed in order, as we can here, then this is known as ordered categorical data. If the categories may be listed in any order, then this is called unordered categorical data.
Fourth column - numerical or quantitative data - numerical data can be classified as either discrete (whole numbers) or continuous.
What does the term association refer to?
Association can be used to describe the relationship between two variables, in which a change in one variable is associated with change in the other.
Look at the association between exposure and outcome
How we measure association depends on the type of data we have for our exposure and outcome
What are the three ways we can compare the frequency of disease between an exposed and unexposed group?
Measures of effect compare the frequency of disease in exposed and unexposed groups.
Risk ratio - risk of disease in the exposed group divided by the risk of disease in the unexposed group
Risk difference – also known as the attributable risk - is the excess risk in the exposed group that we estimate is due to being exposed – so to calculate this we take away the risk in the unexposed group from that in the exposed group
Attributable risk percent (attributable fraction) - calculated as the attributable risk divided by the risk of disease among the exposed - interpreted as the % of the outcome which is attributable to the exposure amongst the exposed group.
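The three measures can be sketched together. The cohort counts are hypothetical and the function name is an assumption. The third value returned is the attributable risk divided by the risk in the exposed, expressed as a percentage.

```python
def measures_of_effect(cases_exposed: int, n_exposed: int,
                       cases_unexposed: int, n_unexposed: int):
    """Return (risk ratio, risk difference, attributable risk % among exposed)."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    risk_ratio = risk_exposed / risk_unexposed
    risk_difference = risk_exposed - risk_unexposed
    attributable_percent = risk_difference / risk_exposed * 100
    return risk_ratio, risk_difference, attributable_percent


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease
rr, rd, ar_pct = measures_of_effect(30, 100, 10, 100)
print(f"Risk ratio: {rr:.1f}")
print(f"Risk difference: {rd:.2f}")
print(f"Attributable risk among the exposed: {ar_pct:.1f}%")
```

A risk ratio of 1 or a risk difference of 0 would indicate no association between exposure and outcome.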
What are three things to consider when reviewing epidemiological results?
What is the impact of…
1. Chance
2. Bias
3. Confounding variables
What is the p-value?
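The probability of seeing an association at least as strong as the one observed if there were truly no association - used to judge whether the association seen could have arisen by chance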
What confidence interval is typically used when reporting results?
95% confidence interval
The larger our sample size, the narrower the confidence interval, as we can be more certain in our observed estimate.
What are the two types of bias that can be introduced into a study?
Different types of bias can be introduced as part of the study design and can lead to an incorrect estimate of the effect of an exposure on an outcome.
Selection bias refers to error due to systematic differences in characteristics between those who take part in a study and those who do not, such that either the comparison groups are not comparable or the people recruited into the study are somehow different from the population that should have been included in the study.
Information bias is any error in the measurement of exposure or outcome that results in systematic differences in the accuracy of information - measurement error
What is a confounder?
A third factor that is independently associated with both our exposure of interest and our outcome, but is not on the causal pathway.
How can the level of association be described for continuous and categorical data?
A correlation coefficient (also called r in statistical notation) can be estimated to describe the association between two continuous variables such as weight and blood pressure.
When the outcome of interest is a categorical variable, the distribution of exposures/risk factors can be compared in people with and without the outcome using statistical tests such as t-tests or chi-squared tests
Is there a statistical test for causality?
It is important to be aware that there is not a statistical test for causality and that all the relevant evidence needs to be considered to review the strength of support for a causal relationship
Causality is easier to establish for communicable disease - Henle-Koch postulates
Harder to establish for non-communicable disease
What are the 9 Bradford Hill criteria for causality?
Does association imply causality?
No!
What is the gold standard for studying causal effect?
What is prognosis?
Prognosis is an opinion, based on medical experience, of the likely course of a medical condition
What are different measures of prognosis?
Most commonly we measure prognosis in terms of survival, whether that be disease-free or overall survival.
What are the three ways we measure prognosis?
What are absolute rates? - measure of prognosis?
Absolute rates are probably the most commonly reported outcome measure in prognostic studies.
This is the percentage of patients with a given outcome at a particular point in time.
Examples
1. Median survival - time point at which 50% of patients have experienced the outcome
2. Five-year survival - % of people surviving after 5 years
Simple but does not convey information well
What are relative rates? - measure of prognosis?
Relative rates - allow us to determine the risk of a particular individual developing an outcome of a disease compared to another individual, based on particular characteristics.
Reported as an odds ratio or a hazard ratio from a multivariable model.
Example
Patients undergoing surgery in low-income settings were more likely to develop a surgical site infection than those in high-income settings - odds ratio of 1.6, i.e. 60% increased odds
What are survival curves? - measure of prognosis?
Survival curves - allow us to graphically determine the cumulative events over time.
In the left graph, the y-axis shows the outcome, which here is cumulative survival, where one means all patients are alive and zero means that all patients unfortunately have died. The x-axis shows time.
This allows us for each time interval to determine the probability of patients surviving a particular disease.
Right graph - effective way of comparing two different treatments
What are the three types of bias that can influence prognosis?
What is lead time bias?
A new diagnostic tool allows for earlier diagnosis, which makes survival after diagnosis look longer and so appears to improve prognosis - in reality the course of the disease is unchanged, we are just diagnosing earlier
What variables do we consider when estimating prognosis?
What are some important considerations when using prognostic measures?
What are clinical prediction rules? What are they used for?
Usually they’re a collection of variables which are combined together and modelled in a particular population to predict an outcome or disease.
So for example, we may have a prediction model which contains five variables - used to predict disease severity on admission to hospital
Used for…
1. Aid in medical decision making
2. Supplement clinical investigations and/or additional investigations
3. Help stratify patients into groups
4. Provide thresholds for drugs and treatment
How are clinical prediction tools created?
How can we measure the performance of a clinical risk tool?
AUROC - sometimes known as concordance, C-statistic or discrimination - is the most commonly used metric to measure performance.
This is a classic graph showing ROC curves - comparing the area under each curve (AUROC) allows for comparison of risk tools
X-axis - false positive rate
Y-axis - true positive rate
The dashed line shows an AUROC (or AUC) value of 0.5 - if a score runs along this line, then it is no better than chance at predicting the outcome in this set of patients
Why is calibration of a clinical risk tool important?
The reason calibration is important is that it estimates and demonstrates the risk profile across the entire cohort.
Example
If we see on the left-hand side of this curve, for patients with a predicted probability of 0.0 to 0.1, they’re calibrated quite well (i.e. they run across this dotted line, which means that the predicted versus observed risk is about the same). But as we move up in terms of predicted and observed probability of the outcome, we can see that it’s skewed to the right.
Now, this is important because what we need to know for a tool is does it over or underestimate the risk of the outcome?
What are the phases of developing a risk score?
What is a clinical trial?
In clinical trials we undertake a scientific experiment where we intervene to allocate our study population to be exposed or unexposed and, if this is done well, this should mean that the exposed and unexposed groups are comparable across a range of characteristics.
Clinical trials are often referred to as intervention studies or randomised controlled trials.
What is the PICO model used to assess clinical trials?
PICO stands for patient or population, intervention, comparison and outcome.
What should we do when we want to define a population for a clinical trial?
Selection bias occurs if the study population does not reflect a representative sample of the target population
What should we think about when looking at the intervention used in a clinical trial?
Type of intervention - doesn’t need to be a new medication
Specifics of intervention - dose, frequency, etc.
It is also important to consider whether the trial is aiming to measure efficacy or effectiveness of an intervention
Efficacy - Efficacy can be defined as the performance of an intervention under ideal and controlled circumstances, such as a clinical trial.
Effectiveness - effectiveness refers to performance under real-world conditions or in average clinical conditions - a trial that measures this is referred to as a pragmatic trial
What should we think about when looking at the randomisation used in a clinical trial?
Big thing - Randomisation!
Allocation to treatment or control arms - ensures fair comparison and avoids selection bias
Many potential methods
What should we think about when looking at the comparison used in a clinical trial?
The choice of a control intervention is a critical factor in the design, conduct and interpretation of the clinical trial.
A comparator could be…
1. No intervention
2. It could be a placebo or sham treatment which is an inactive therapy made to look identical to the active therapy.
3. Or the comparator could be another active intervention or the usual treatment
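Three types of comparison to an active intervention
- Superiority - the new intervention is better than the comparator
- Equivalence - the new intervention is not too different from the comparator
- Non-inferiority - the new intervention is not much worse than the comparator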
Note - trials are usually double blind - the trial participant and investigator/evaluator do not know which treatment a participant is receiving
What should we think about when looking at the outcome used in a clinical trial?
When reading or designing a clinical trial we need to be clear on what the primary and secondary outcomes are to answer the trial objectives.
These are outcomes that are measured during the course of the trial, and they define and answer the trial questions.
In general, a single outcome should be used to answer the primary objective, but there may be a number of additional outcomes (including on potential side effects of the intervention) and these are generally referred to as secondary outcomes.
What is protective efficacy? How is it calculated?
Specific measure that is often calculated for trials of preventive interventions - the protective efficacy
We can then calculate the protective efficacy (or effectiveness), by subtracting the risk in the intervention group from the risk in the placebo group, and then dividing this by the risk in the placebo group
Example - paracetamol reduced the number of headaches by 60%.
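A minimal sketch of the calculation. The trial counts are hypothetical and were chosen only so that the result matches the 60% in the example.

```python
def protective_efficacy(risk_placebo: float, risk_intervention: float) -> float:
    """(risk in placebo group - risk in intervention group) / risk in placebo group."""
    if risk_placebo <= 0:
        raise ValueError("risk in the placebo group must be greater than zero")
    return (risk_placebo - risk_intervention) / risk_placebo


# Hypothetical trial: 50 of 200 on placebo and 20 of 200 on treatment have the outcome
risk_placebo = 50 / 200
risk_intervention = 20 / 200
pe = protective_efficacy(risk_placebo, risk_intervention)
print(f"Protective efficacy: {pe:.0%}")
```

This is the same quantity as 1 minus the risk ratio (intervention vs. placebo).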
Does more healthcare expenditure equal better healthcare?
More expenditure does not necessarily equal better healthcare.
Plus there is a law of diminishing returns.
So what is absolutely clear is that we need a method for efficiently allocating our resources, if we want to maximise health gains for the population from our health care budget
In health economics, what does scarcity, utility and opportunity cost refer to?
Is opportunity cost easily identifiable when making decisions in healthcare?
A key feature of opportunity cost in healthcare is that it is not readily identifiable against individuals.
As clinicians, we’re not forced to choose between individuals. We simply make a decision about whether or not to allocate a resource
What are the macro and micro levels of health economics?
Macro level or aggregate level - This is thinking about how much healthcare we should provide for the population and how we organise care at a system level to maximise health
At the micro or individual level, this is thinking about individual patients, or indeed an average patient out of a population - we’re generally making a trade-off between different treatment options and ultimately we’re deciding who should gain from investment decisions and who shouldn’t, who should live and die.
What is the key tool that we use to measure opportunity cost?
So in health economics, the key focus of what we’re trying to achieve is measuring opportunity cost (i.e. valuing health care).
The key tool of the trade that we use for this is cost-effectiveness analysis.
We think about cost-effectiveness with respect to a threshold, which is otherwise known as a willingness to pay threshold.
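If a technology is more cost-effective than this threshold (i.e. it costs less per unit of health gained than we are willing to pay), we should be willing to pay for it.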
But if something is less cost-effective than a threshold, then it’s considered of insufficient value or incurs too much opportunity cost to be worthwhile in the NHS
How does the cost-effectiveness threshold change across time?
Over time the bar for cost-effectiveness rises as we adopt new, more cost-effective technologies.
We can see that in order for a new technology to be considered cost-effective, it has to, as a minimum, be more cost-effective than our current least cost-effective technology.
What is the definition of the cost effectiveness threshold?
The cost-effectiveness threshold is the maximum amount the health service will pay per unit of health gained
How are quality adjusted life years calculated?
We can combine length of life and quality of life into a quality adjusted life year or QALY, which is simply an expected lifespan, multiplied by the quality of life weight averaged over that lifespan
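A minimal sketch of the QALY calculation. It assumes the quality of life weight is on a 0 to 1 scale where 1 is full health, which is the usual convention but is not stated in the notes. The patient values are hypothetical.

```python
def qalys(expected_years: float, quality_weight: float) -> float:
    """QALYs = expected lifespan x average quality-of-life weight."""
    if expected_years < 0:
        raise ValueError("expected_years cannot be negative")
    # Assumption: weights run from 0 (equivalent to death) to 1 (full health)
    if not 0 <= quality_weight <= 1:
        raise ValueError("quality_weight must be between 0 and 1")
    return expected_years * quality_weight


# Hypothetical: 10 expected years at an average quality weight of 0.7
print(f"{qalys(10, 0.7):.1f} QALYs")
```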
What is the incremental cost-effectiveness ratio? How is it calculated?
The incremental cost-effectiveness ratio, or ICER, compares the cost-effectiveness (cost per QALY) of a new intervention with the current standard of care
The ICER equals (new costs minus current standard care costs) divided by (new health benefits minus current standard health benefits).
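A sketch of the ICER calculation and its comparison with a willingness to pay threshold. All costs, QALYs and the threshold value are hypothetical. The simple comparison at the end only applies when the new intervention is both more costly and more effective than current care.

```python
def icer(cost_new: float, cost_current: float,
         qalys_new: float, qalys_current: float) -> float:
    """Incremental cost per QALY gained of a new intervention vs. current care."""
    extra_qalys = qalys_new - qalys_current
    if extra_qalys == 0:
        raise ValueError("ICER is undefined when there is no difference in benefit")
    return (cost_new - cost_current) / extra_qalys


# Hypothetical: new treatment costs 12,000 and yields 6.5 QALYs,
# current care costs 8,000 and yields 6.0 QALYs
ratio = icer(12_000, 8_000, 6.5, 6.0)

threshold = 20_000  # hypothetical willingness to pay per QALY
is_cost_effective = ratio <= threshold
print(f"ICER: {ratio:,.0f} per QALY, cost-effective: {is_cost_effective}")
```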
What are the different types of quantitative and qualitative data we can collect?
For continuous data…
1. How is central tendency measured?
2. How is dispersion summarised?
3. How is it represented?
For non-continuous data…
1. How is it summarised?
2. How is it represented?