1a Epidemiology Flashcards

1
Q

What is routine epidemiological data?

A

Non-targeted information that is obtained in a standardised and consistent manner

2
Q

Give some examples of routine epidemiological data.

A

Demographic data from census and population registers
Death certificates
Cancer registrations
Birth registrations
Congenital malformations registrations
Infectious disease notifications
Hospital episode data
Health surveys
Royal College of General Practitioners weekly returns

3
Q

Where are most health statistics in England published?

A

The Office for National Statistics (ONS)

4
Q

What is Demography?

A

The scientific study of population statistics, including their size, structure, dispersal and development

5
Q

Why is demographic data important?

A

They form the baseline count of the total population being studied. Reliable denominators are necessary for the calculation of the various measures of disease frequency, including incidence and prevalence

6
Q

What is the primary source of demographic data in the UK?

A

The national census

7
Q

How often is the national census conducted, and when was the last one?

A

Every 10 years - 2021

8
Q

What data is collected in the national census?

A

Population counts by age and sex
Ethnicity
Country of birth
Accommodation
Education
Employment
Long-term illness

9
Q

What are the problems with the national census?

A

Data is collected by each of the government census agencies (England and Wales, Scotland and Northern Ireland) separately.

Infrequent data collection means data can be outdated

Data may be incomplete for some population sub-groups, for example, those in hard-to-reach communities

10
Q

What are annual mid-year population estimates and how are they created?

A

Annual mid-year population estimates are estimated using the most recent census data but then accounting for births, deaths, net migration, and ageing of the population.

They are produced by the Office for National Statistics.

11
Q

What are the disadvantages of annual mid-year population estimates?

A

In areas with high migration rates (e.g. Urban areas) there may be potential inaccuracies in the estimate.

12
Q

How is mortality data collected in England?

A

Mortality information is derived from the registration of deaths certified by an attending medical practitioner or coroner (Procurator Fiscal in Scotland).

Death certificates include information on the immediate and underlying causes of death, age at death, sex, address and occupation.

Death certificates are sent to the Office for National Statistics (ONS), where the underlying cause of death is classified according to the Tenth Revision of the International Classification of Diseases (ICD-10), resulting in a complete and continuous set of mortality data.

13
Q

What mortality data is published in the UK and where is it published?

A

Annual mortality statistics for England and Wales are published by the Office for National Statistics (ONS). Published reports include:

Mortality Statistics - Deaths Registered in England and Wales
Presents statistics on deaths occurring annually in England and Wales. Data includes death counts and rates by sex, age-group, and underlying cause.

Child Mortality Statistics - Childhood, Infant and Perinatal Deaths in England and Wales
Presents detailed analyses of all stillbirths, infant and perinatal deaths and data on deaths of children < 16 years by cause of death, sex and age-group.

14
Q

What are the disadvantages of mortality data in the UK?

A

Risk of poor diagnostic accuracy

Varied certifying experience of the attending medical practitioner

Possible incorrect classification and coding of the death certificate

15
Q

What are some examples of morbidity data in the UK?

A

Cancer statistics registrations
National congenital anomaly and rare disease registrations
Statutory notifications of infectious disease
Laboratory reporting of microbiological data
General practitioner clinical codes (SNOMED CT)
Hospital episode statistics
Data from health surveys (for example, Health Survey for England)
Royal College of General Practitioner Research and Surveillance Centre weekly reports

16
Q

How is cancer data in the UK collected and published?

A

Cancer registrations in England are conducted by eight independent regional registries that collect data on cancer cases in their regions. Regional registries supply a standard dataset (Cancer Outcomes and Services Dataset) monthly to the National Cancer Registration Service run by Public Health England for the provision of cancer statistics.

These data are published annually by the ONS (2 years after the year in which the cancer was diagnosed) in:

Cancer Statistics: Registration Series
Cancer Registrations Wales
Cancer Registrations Scotland
International Agency for Research on Cancer (IARC)

17
Q

What are the issues with cancer data in the UK?

A

Cancer registries may differ with respect to methods of data collection, completeness of registrations or recording of data items.

Submission of data to the registries is voluntary.

Possible misclassification of cancer cases or changes in coding systems over time may affect the reliability of the data particularly when examining trends over time.

19
Q

What are Statutory Notifiable Diseases and how are they collected and published in the UK?

A

Certain infectious diseases must be notified to the proper officer of the relevant local authority.

All diagnostic laboratories in England must notify Public Health England (PHE) when a notifiable organism is confirmed.

Reports of notifications of infectious diseases (NOIDs) are published weekly and annually by PHE.

20
Q

What are vital statistics and how are they collected and published in the UK?

A

Vital statistics are a set of data collected about “vital” life events, including live birth and stillbirth rates, fertility rates, maternity statistics, death registrations and causes of death.

Tables for England and Wales are produced by the ONS and are available for local authorities, health authorities and wards, with raw data being held by NHS Digital.

21
Q

What is The Health Survey for England?

A

The Health Survey for England (HSE) was established in 1991 by the Department of Health and Social Care, and is now carried out in conjunction with NatCen Social Research (an independent social research agency). It comprises a series of annual surveys about the nation’s health. The HSE is designed to be representative of the general population in England and aims to provide a measure of the health status of the population. Annual surveys cover the adult population aged 16 and older living in private households. Children have been included in the survey since 1995.

Each year the survey focuses on different demographic groups or diseases and their risk factors, and looks at health indicators including cardiovascular disease, physical activity and eating habits. In addition to completing a health questionnaire, those surveyed are followed up with a nurse visit, during which physical measurements (such as blood pressure and lung function) and samples of blood and saliva are collected.

22
Q

What are the strengths of routine epidemiological data?

A

Readily available
Low cost
Useful for establishing baseline characteristics
Useful for identifying cases in a case-control study
Useful for generating aetiological hypotheses
Useful for deriving expected numbers in a cohort study, or as a source for ascertaining outcomes in a cohort study
Useful for examining trends in disease over time and by place

23
Q

What are the weaknesses of routine epidemiological data?

A

Lack of completeness, with potential bias
Often poorly presented and analysed
Where there are small numbers of cases, it may be possible to identify individuals, threatening confidentiality
Data may not be collected in a uniform way across the entire population
Techniques of data collection may vary geographically, e.g. recording data, coding
Equivalent data not always available for all countries
Delay between collection and publication

24
Q

What are the common patterns of disease incidence in relation to geographic place?

A

Variations in disease incidence by place fall under three main headings:

Broad geographical differences – sometimes related to factors such as climate, or social and cultural habits. Some cancers show marked geographical differences in incidence.

Local differences – distribution of a disease may be limited by the localisation of the cases, for example a contaminated water supply.

Variations within a single institution – variations in attack rates by hospital ward, for example, may help identify possible sources or routes of spread of a gastrointestinal infection.

25
Q

What are the common patterns of disease incidence with time?

A

There are three broad patterns of variation in disease incidence with time:

Secular (long-term) trends – changes in disease incidence over a number of years that do not conform to an identifiable cyclical pattern. For example, the secular trend in mortality from TB in England shows a steady decline over many years. However, this does not give any indication of the cause of the decline.

Periodic changes including seasonality – regular or cyclical changes in incidence, for example in infectious diseases. Cases of influenza typically reach a peak in the winter months.

Epidemics – strictly speaking, an epidemic is a temporary increase in the incidence of a disease in a population

26
Q

In what ways can individual factors impact disease incidence?

A

Modifiable Risk Factors
Occupation
Marital status
Behavioural habits
Lifestyle

Non-modifiable risk factors
Age
Gender
Ethnic group

27
Q

What is prevalence?

A

Prevalence is the proportion of a population who have a disease/condition in a given time period.

There are two types of prevalence:
Point prevalence
Period prevalence

28
Q

What are the types of prevalence?

A

Point prevalence
Period prevalence

29
Q

What is point prevalence?

A

The proportion of people in a defined population who have the disease at a single point in time

Point prevalence = Number of cases at a single point in time/Number of persons in the defined population at that point in time

The point in time that point prevalence refers to should always be clearly stated. Prevalence is a proportion, so has no units.

30
Q

What is the point prevalence of hypertension in a town where, of its 10,000 female residents on January 1st 2016, 1,000 have hypertension?

A

Point prevalence = Number of cases at a single point in time/Number of persons in the defined population at that point in time

The prevalence of hypertension among women in the town on this date is calculated as:

1,000/10,000 = 0.1 or 10%

The point in time that point prevalence refers to should always be clearly stated. Prevalence is a proportion, so has no units.
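As an illustration only (not part of the original card), the same calculation can be sketched in Python using the figures above:

# Point prevalence sketch using the hypertension example above
cases_at_time_point = 1_000        # women with hypertension on 1 January 2016
population_at_time_point = 10_000  # women resident in the town on that date

point_prevalence = cases_at_time_point / population_at_time_point
print(point_prevalence)            # 0.1, i.e. 10% (a proportion, so no units)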

31
Q

What is period prevalence?

A

Period prevalence is the number of individuals identified as cases during a specified period of time, divided by the total number of people in that population.

32
Q

What is the difference between point prevalence and period prevalence?

A

Point prevalence is the proportion of people in a defined population who have the disease at a single point in time.

Period prevalence is the number of individuals identified as cases during a specified period of time, divided by the total number of people in that population.

33
Q

What is incidence?

A

Incidence is a measure of the number of new cases of a disease (or another health outcome) that develop in a population of individuals at risk, during a specified time period.

There are three main types of incidence:
Risk (or cumulative incidence)
Incidence (or incidence rate)
Odds

34
Q

What are the types of incidence?

A

There are three main types of incidence:
Risk (or cumulative incidence)
Incidence (or incidence rate)
Odds

35
Q

What is Risk (or cumulative incidence) and how is it calculated?

A

Risk (also known as cumulative incidence) refers to the occurrence of risk events, such as disease or death, in a group studied over time.

It is the proportion of individuals in a population initially free of disease who develop the disease within a specified time interval. Incidence risk is expressed as a percentage (or, if small, as “per 1000 persons”).

Risk = Number of new cases of a disease in a specified time period/Population at risk

Population at risk = The number of persons at risk but without the disease at the beginning of the time period
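A minimal Python sketch of the risk calculation above; the counts are hypothetical and only for illustration:

# Risk (cumulative incidence) sketch with hypothetical counts
new_cases = 50               # new cases arising during the specified time period
population_at_risk = 2_000   # disease-free individuals at the start of the period

risk = new_cases / population_at_risk
print(f"Risk = {risk:.3f} ({risk * 1000:.0f} per 1,000 persons over the period)")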

36
Q

What assumptions are made about a risk (or cumulative incidence) population?

A

Cumulative incidence assumes that the population at risk is a closed population. This means that the entire population at risk is followed up for the entire specified time period for the development of the outcome under investigation.

37
Q

What is a closed population in an epidemiological study?

A

A population where every individual is followed up for the entire specified time period and no further participants join the study population i.e. there are no dropouts or new participants

38
Q

What is a dynamic population in an epidemiological study?

A

A population where individuals can leave or join the study population. i.e. there are dropouts or where extra individuals join.

Causes of dropouts:
Some may develop the outcome of interest
Lost during follow-up
Refusal to continue to participate in the study
Migration
Death

39
Q

What is Incidence (or incidence rate or rate) and how is it calculated?

A

The incidence rate is one of 3 measures of incidence. It measures the frequency of new cases of a disease in a population but considers the sum of the time that each participant remained under observation and at risk of developing the outcome under investigation. This helps to account for the varying time periods of follow-up.

Incidence rate = Number of new cases in a given time period/Total person-time at risk

Total person-time at risk = The sum of each individual’s time at risk (i.e. the length of time they were followed up in the study). It is commonly expressed as person-years at risk.

The incidence rate is the rate of contracting the disease among those still at risk. When a study subject develops the disease, dies or leaves the study, they are no longer at risk and will no longer contribute to person-time units at risk.

40
Q

What is person-time at risk, and when is it used?

A

The cumulative time spent “at risk” by the individuals taking part in the study, expressed in units such as person-years.

In a dynamic epidemiological study population, individuals in the group may have been at risk for different lengths of time, so instead of counting the total number of individuals in the population at the start of the study, the time each individual spends in the study before developing the outcome of interest needs to be calculated.

This is used to calculate the incidence rate.

41
Q

What is the incidence rate of hypertension according to the results of this 5-year study?

Participant 1
Time spent in the study: 5 years
Status: Still in study

Participant 2
Time spent in the study: 4.5 years
Status: Developed hypertension

Participant 3
Time spent in the study: 3.5 years
Status: Developed hypertension

Participant 4
Time spent in the study: 1.5 years
Status: Developed hypertension

Participant 5
Time spent in the study: 3.5 years
Status: Lost to follow up

A

Incidence rate = Number of new cases of hypertension in the 5-year period/ Total person-time at risk during the 5 year period

Incidence rate = 3/18

Incidence rate = 0.167 per person-year (or 16.7 per 100 person-years)
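The same calculation, sketched in Python with the follow-up times listed in the question:

# Incidence rate sketch for the 5-year hypertension study above
follow_up_years = {1: 5.0, 2: 4.5, 3: 3.5, 4: 1.5, 5: 3.5}   # participant: years at risk
participants_with_hypertension = {2, 3, 4}                    # participants who became cases

total_person_years = sum(follow_up_years.values())            # 18 person-years
new_cases = len(participants_with_hypertension)               # 3 cases

incidence_rate = new_cases / total_person_years
print(f"{incidence_rate:.3f} per person-year")                # 0.167 per person-year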

42
Q

What is odds and how is it calculated?

A

Odds is a measure of incidence.

Odds = Number of new cases of a disease in a specified time period/Number of people still disease free at the end of that time period

Instead of using the number of individuals who are disease-free at the start of the study (as is the case for risk), odds are calculated using the number who are disease-free at the end of the time period.
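A minimal Python sketch of odds as a measure of incidence; the counts are hypothetical:

# Odds sketch with hypothetical counts
new_cases = 50                # new cases during the time period
disease_free_at_end = 1_950   # still disease-free at the end of the period

odds = new_cases / disease_free_at_end
print(round(odds, 4))         # 0.0256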

43
Q

What is the difference between risk (cumulative incidence) and incidence rate (rate) in the following example?

A

Risk = number of (new) observed cases/number at risk (disease free) at the start

Rate = number of observed cases/person time (years) at risk

In the example, there are two deaths and a sample size of 7.

The total person time of follow-up is 2 years for individuals 1, 4 and 7; one-and-a-half years for persons 2 and 6 (person 6 was lost to follow-up after one-and-a-half years); and half a year for persons 3 and 5 (person 5 was lost to follow-up after half a year). In total this equates to 10 person years.

Risk = 2 / 7 = 0.29

Rate = 2 / (2 + 1.5 + 0.5 + 2 + 0.5 + 1.5 + 2) = 2/10 = 0.2 deaths/person-year

Note that the rate has units (cases per person per year), whereas risk does not, as it is a simple probability or proportion.

44
Q

What is the difference between risk (cumulative incidence), incidence (incidence rate) and odds?

A

All three are measures of incidence.

Risk - The proportion of individuals in a population initially free of disease who develop the disease within a specified time interval.

Incidence Rate - The number of new cases in a specified time interval divided by the total person-time at risk, i.e. it takes into account the sum of the time that each participant remained under observation and at risk of developing the outcome under investigation.

Odds - The odds of disease. Instead of using the number of individuals who are disease-free at the start of the study, odds are calculated using the number who are disease-free at the end of the time period.

45
Q

What is the relationship between incidence and prevalence?

A

The relationship between incidence and prevalence can be expressed as;

P = I x D

(P = Prevalence, I = Incidence Rate, D = Average duration of the disease)

Explanation:
If the incidence of a disease is low but the duration of the disease (i.e. the time until recovery or death) is long, the prevalence will be high relative to the incidence. An example of this would be diabetes.

Conversely, if the incidence of a disease is high and the duration of the disease is short, the prevalence will be low relative to the incidence. An example of this would be influenza.

A change in the duration of a disease, for example, the development of a new treatment that prevents death but does not result in a cure, will lead to an increase in prevalence without affecting incidence.

Fatal diseases, or diseases from which a rapid recovery is common, have a low prevalence, whereas diseases with a low incidence may have a high prevalence if they are incurable but rarely fatal and have a long duration.
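A one-line arithmetic sketch of the P = I x D relationship, using hypothetical figures:

# Hypothetical illustration of prevalence = incidence rate x average duration
incidence_rate = 0.002    # 2 new cases per 1,000 person-years
average_duration = 10     # average duration of the disease, in years

prevalence = incidence_rate * average_duration
print(prevalence)         # 0.02, i.e. roughly 2% of the population affected at any one time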

46
Q

How do you calculate a sex-specific mortality rate?

A

(The number of deaths in a specific sex group in a 1 year period/mid-year population of that sex group)*1000

47
Q

How do you calculate a birth rate?

A

(Number of births per year/mid-year population)*1000

48
Q

How do you calculate an age-specific mortality rate?

A

(The number of deaths in a specific age group in a 1 year period/mid-year population of that age group)*1000

49
Q

How do you calculate a fertility rate?

A

(The number of live births in a year/Mid-year population of women aged 15-44)*1000

50
Q

How do you calculate an infant mortality rate?

A

(The number of deaths in those < 1 year of age/number of live births in a year)*1000

51
Q

How do you calculate a perinatal mortality rate?

A

(The number of stillbirths plus deaths in the first week of life (< 7 days) in a year/number of live births and stillbirths in that year)*1000

52
Q

How do you calculate a neonatal mortality rate?

A

(The number of deaths in those under 28 days in a year/number of live births in a year)*1000

53
Q

How do you calculate a case fatality rate?

A

(The number of deaths from a specific disease in a given time period/number of diagnosed cases of that disease in the same period)*100, usually expressed as a percentage

54
Q

What are measures of effect size?

A

Measures of effect are used in epidemiological studies to assess the strength of an association between a risk factor and the subsequent occurrence of disease. This is done by comparing the incidence of disease in a group of persons exposed to a potential risk factor with the incidence in a group who have not been exposed.

Measures of effect size can be relative or absolute.

55
Q

What are the different measures of effect size?

A

Relative measures (also called measures of relative risk):
Risk ratio
Rate Ratio
Odds ratio

Absolute measures:
Attributable Risk (Risk Difference)
Attributable Risk Percentage

56
Q

What are the measures of incidence, prevalence, effect size and population impact?

A

Incidence:
Risk (cumulative incidence)
Incidence (incidence rate)
Odds

Prevalence:
Point prevalence
Period prevalence

Effect Size:
Relative measures (also called relative risk measures):
Risk ratio
Rate Ratio
Odds ratio
Absolute measures:
Attributable Risk (Risk Difference)
Attributable Risk Percentage

Population impact:
Population-attributable risk/rate
Population attributable risk fraction

57
Q

What is the difference between relative and absolute measures of effect size?

A

Relative measures reflect the increase in the frequency of a disease in one population (e.g. exposed) versus another (e.g. not exposed), which is treated as the baseline.

Absolute measures indicate exactly what impact a disease will have on a population, in terms of numbers or proportion affected by being exposed.

For example, a study finds that having several CT head scans in childhood results in a three-fold increase in the risk of developing brain cancer as an adult. This sounds like a large increase, but because the baseline risk is very small, the absolute risk increase is also small (say, an increase of 0.5 cases per 10,000 children), meaning one additional case of brain cancer per 20,000 children scanned.

58
Q

What are measures of relative risk?

A

Measures of relative risk is the collective name given to the measures of relative effect size.

This includes risk ratio, rate ratio and odds ratio.

59
Q

How do you interpret relative measures of effect size (also called relative risk)?

A

Measures of effect, such as the risk ratio, provide assessments of aetiological strength, or the strength of association between a risk factor and an outcome.

Relative risk of 1: The incidence of disease in the exposed and unexposed groups is identical. I.e. there is no association observed between the disease and risk factor/exposure.

Relative risk >1: The risk of disease is greater among those exposed and indicates an increased risk among those exposed to the risk factor compared with those unexposed (also called positive association).

Relative risk <1: The risk of disease is lower among those exposed and indicates a decreased risk among those exposed to the risk factor compared with those unexposed (also called negative association).

60
Q

What is the difference between risk ratio, rate ratio and odds ratio?

A

They are all relative measures of effect size, however, each uses a different measure of incidence to measure the difference between groups (risk, incidence rate and odds)

Risk Ratio (aka relative risk): The risk of developing disease in the exposed group divided by risk in the unexposed group

Rate Ratio: The ratio of the rate of an event in one group (exposure or intervention) to that in another group (control).

Odds ratio: The odds of an event (e.g. disease) occurring given a certain exposure vs. the odds of an event in the absence of that exposure.

61
Q

How to calculate a risk ratio?

A

Risk = Number of new cases of a disease in a specified time period/Population at risk.

To calculate a risk ratio you simply calculate the ratio of risks between the exposed group and the unexposed group.

This can be done using a 2x2 contingency table:

                Outcome    No outcome    Total
Exposure           a           b          a+b
No exposure        c           d          c+d
Total             a+c         b+d       a+b+c+d

Risk ratio = [a/(a+b)] / [c/(c+d)]
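A minimal Python sketch of the risk ratio calculated from a 2x2 table; the cell counts are hypothetical:

# Risk ratio from a 2x2 table (hypothetical counts)
a, b = 30, 970   # exposed: with and without the outcome
c, d = 10, 990   # unexposed: with and without the outcome

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
risk_ratio = risk_exposed / risk_unexposed
print(round(risk_ratio, 2))   # 3.0: the risk is three times higher in the exposed group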

62
Q

How do you calculate an odds ratio?

A

Odds = Number of new cases of a disease in a specified time period/Number of people still disease-free at the end of that time period

To calculate an odds ratio you calculate the ratio of the odds of the outcome in the exposed group to the odds of the outcome in the unexposed group.

This can be done using a 2x2 contingency table:

                Outcome    No outcome    Total
Exposure           a           b          a+b
No exposure        c           d          c+d
Total             a+c         b+d       a+b+c+d

Odds ratio = (a/b) / (c/d) = (a x d) / (b x c)
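The same style of sketch for the odds ratio, using the same hypothetical 2x2 table as in the risk ratio example above:

# Odds ratio from a 2x2 table (hypothetical counts)
a, b = 30, 970   # exposed: with and without the outcome
c, d = 10, 990   # unexposed: with and without the outcome

odds_ratio = (a * d) / (b * c)
print(round(odds_ratio, 2))   # 3.06, close to the risk ratio because the outcome is rare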

63
Q

What is the difference between Attributable Risk (Risk Difference) and Attributable Rate (Rate Difference), and how do you know which to use?

A

The attributable risk and attributable rate are both measures of absolute effect.

They tell us exactly how many more people are affected in the exposed group than in the unexposed group, giving the result as the excess risk (or rate) caused by the exposure in the exposed group.

Attributable risk = Incidence risk in exposed - incidence risk in unexposed

Attributable rate = Incidence rate in exposed - incidence rate in unexposed

Which you pick will depend on the study design used, as well as whether the person-time at risk is known (as this is needed to calculate the rate).

64
Q

What is attributable risk and how do you calculate it?

A

The attributable risk (AR) is a measure of absolute effect.

It tells us exactly how many more people are affected in the exposed group than in the unexposed group, giving the excess risk caused by exposure in the exposed group.

Attributable risk = Incidence risk in exposed - incidence risk in unexposed

For example, in a cohort study, the AR is calculated as the difference of incidence risks.

An AR indicates the number of cases of the disease among the exposed that can be attributed to the exposure.

65
Q

What is an attributable rate and how do you calculate it?

A

The attributable rate is a measure of absolute effect.

It tells us exactly how many more people are affected in the exposed group than in the unexposed group, giving the excess rate caused by exposure in the exposed group.

Attributable rate = Incidence rate in exposed - incidence rate in unexposed

For example, in a cohort study, the AR is calculated as the difference in incidence rates.

An AR indicates the number of cases of the disease among the exposed that can be attributed to the exposure.

66
Q

What is the attributable risk percentage (aka attributable fraction)?

A

The attributable risk percentage expresses the attributable risk as the proportion of disease cases in the exposed group that is attributable to the exposure. This can be given as a fraction (attributable fraction) or a percentage (attributable risk percentage).

I.e. the proportion of additional cases in the exposed group.

Attributable risk percentage = ((Risk in exposed group - Risk in unexposed group)/Risk in exposed group)*100
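A minimal Python sketch of the attributable risk and attributable risk percentage; the risks are hypothetical:

# Attributable risk (AR) and attributable risk percentage (hypothetical risks)
risk_exposed = 0.030     # incidence risk in the exposed group
risk_unexposed = 0.010   # incidence risk in the unexposed group

attributable_risk = risk_exposed - risk_unexposed     # 0.020, i.e. 20 per 1,000
ar_percent = attributable_risk / risk_exposed * 100   # share of cases in the exposed attributable to exposure
print(f"AR = {attributable_risk:.3f}, AR% = {ar_percent:.1f}%")   # AR = 0.020, AR% = 66.7%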

67
Q

What are measures of population impact and what are the different types?

A

Measures of population impact estimate the expected impact (i.e. extra disease) in a population that can be attributed to the exposure.

There are two main measures of population impact:
The population attributable risk
The population attributable risk fraction

68
Q

What are the uses of measures of population impact?

A

Measures of population impact can:

Estimate how much of the disease in the population is caused by the risk factor

Estimate the expected impact on a population of removing or changing the distribution of risk factors in that population

Compare the whole population with the unexposed group (whereas measures of effect size compare the exposed with the unexposed)

69
Q

What are the types of measures of population impact?

A

Population attributable risk/rate
Population attributable risk fraction

70
Q

What is Population attributable risk/rate and how is it calculated?

A

The Population attributable risk/rate (PAR) is a measure of population impact.

The population attributable risk (PAR) is the absolute difference between the risk (or rate) in the whole population and the risk (or rate) in the unexposed group.

It is used to estimate the excess rate of disease in the total study population that is attributable to the exposure. It provides a measure of the public health impact of the exposure in the population (assuming that the association is causal).

PAR = Risk (or rate) in the total population - Risk (or rate) in the unexposed

71
Q

What is the population attributable risk fraction and how is it calculated?

A

The population attributable risk fraction (PAF) is a measure of population impact.

The PAF is the proportion of all cases in the whole study population (exposed and unexposed) that may be attributed to the exposure, as follows:

PAF = Population attributable risk/overall rate in the total population
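A minimal Python sketch of the PAR and PAF; the rates are hypothetical:

# Population attributable risk (PAR) and population attributable risk fraction (PAF)
rate_total_population = 0.015   # rate in the whole study population (hypothetical)
rate_unexposed = 0.010          # rate in the unexposed group (hypothetical)

par = rate_total_population - rate_unexposed   # 0.005, i.e. 5 per 1,000
paf = par / rate_total_population              # proportion of all cases attributable to the exposure
print(f"PAR = {par:.3f}, PAF = {paf:.0%}")     # PAR = 0.005, PAF = 33%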

72
Q

What are the problems with measures of population impact?

A

They assume that all of the association between the risk factor and disease is causal.

The results can vary according to how common exposure to the risk factor is in the population.

73
Q

What is standardisation of data and why is it done?

A

The comparison of crude mortality or morbidity rates is often misleading because the populations being compared may differ significantly with respect to certain underlying characteristics, such as age, sex, race or socio-economic status.
For example, an older population will have a higher overall mortality rate when compared to a younger population.

In reality, crude overall rates are simply a weighted average of the individual category-specific rates within a population. As such, where a locality has a large elderly population, the older age categories will carry greater weight than the younger age categories, giving the impression that the death rate in this area is unacceptably high, particularly in comparison with a youthful town (e.g. a university town).

Standardisation adjusts the crude overall rates to allow for a direct comparison.

74
Q

What are the methods of data standardisation?

A

There are three main methods of standardisation commonly used in epidemiological studies. These include:

Present and compare the age-specific rates (or whichever variable you want to standardise)
Direct Standardisation
Indirect Standardisation

75
Q

What is the category-specific rate method of standardising data?

A

The category-specific rate method is a simple way of standardising data and results so that they are directly comparable.

This simply involves presenting and comparing the age-specific rates.

76
Q

What are the strengths and weaknesses of the category-specific rate method of standardising data and what are the alternative options?

A

Strengths:
Simple
Quick to do
Allows for a more comprehensive comparison of mortality or morbidity rates between two or more populations

Weaknesses:
As the number of stratum-specific rates being compared increases, the volume of data being examined may become unmanageable.

Alternatives:
It may therefore be more useful to combine category-specific rates into a single summary rate that has been adjusted to take into account the population’s age structure or another confounding factor. This is achieved by using direct or indirect methods of standardisation.

77
Q

What is the difference between direct and indirect standardisation and how do you know which to use?

A

Direct and indirect standardisation are the two main methods of standardisation.

Direct standardisation uses the category-specific rates (for example age-specific mortality) from both populations and applies these to a standard reference population. This allows you to work out what the mortality rate for this reference population would be, based on each population’s mortality rates, and you can then compare these numbers. The ratio of two directly standardised rates is called the Comparative Incidence Ratio or Comparative Mortality Ratio.

In indirect standardisation, you do the reverse. You find a reference standard of category-specific rates (for example age-specific mortality rates), and calculate what their expected mortality rate should be by applying these standard values to the populations in question. This calculated expected rate can then be compared with the overall observed rates. The ratio of two indirectly standardised rates is called the Standardised Incidence Ratio or the Standardised Mortality Ratio.

In general, direct standardisation is used when category-specific rates for both sets of data (e.g. age-specific rates) are available, and the indirect method is used when category-specific rates are unavailable. Indirect standardisation is also more appropriate for use in studies with small numbers or when the rates are unstable.

78
Q

What is direct standardisation?

A

Direct standardisation is one of the two main types of standardisation.

The direct method of standardisation produces ‘age-adjusted rates’ that are derived by applying the category-specific mortality rates of each population to a single standard population. This ‘standard population’ may be the distribution of one of the populations being compared or may be an outside standard population such as the European Standard Population or the WHO’s World Standard Population.

79
Q

What is indirect standardisation?

A

Indirect standardisation is one of the two main types of standardisation.

In indirect standardisation, you take a known set of category-specific rates (from either one of the populations being compared, or from a standard population) and apply these to the structure of each of the populations being compared.

This calculated expected rate can be compared with the overall observed rates to give a standardised morbidity/mortality ratio (SMR). Note that the SMR is always expressed as a percentage.

80
Q

What are the steps of indirect standardisation?

A

1) Identify a standard reference for category-specific death rates, either from a reference or from one of the populations if you have this available.

2) Calculate the expected number of deaths in each stratum, by applying the reference category-specific rates to the study population.

3) Calculate the total number of expected deaths by summing the number of expected deaths in each stratum.

4) Calculate the standardised mortality ratio (SMR) – the ratio between the observed and expected number of deaths (always expressed as a percentage)

81
Q

Use indirect standardisation to compare mortality in the two countries below. How do you interpret the results?

Country A:
Age group: (0-29) Number of deaths: (7000) Population: (6,000,000) Rate per 1000: (1.2)
Age group: (30-59) Number of deaths: (20,000) Population: (5,500,000) Rate per 1000: (3.6)
Age group: (60+) Number of deaths: (120,000) Population: (2,500,000) Rate per 1000: (48)
Total: Number of deaths: (147,000) Population: (14,000,000) Rate per 1000: (10.5)

Country B:
Age group: (0-29) Number of deaths: (6300) Population: (1,500,000) Rate per 1000: (4.2)
Age group: (30-59) Number of deaths: (3000) Population: (550,000) Rate per 1000: (5.5)
Age group: (60+) Number of deaths: (6000) Population: (120,000) Rate per 1000: (50)
Total: Number of deaths: (15,300) Population: (2,170,000) Rate per 1000: (7)

Hypothetical Standard Population:
0-29 - 100,000
30-59 - 65,000
60+ - 20,000
Total - 185,000

A

1) Identify a standard reference for category-specific death rates, either from a reference or from one of the populations if you have this available.

While a reference is not given in the question, you are able to use one of the countries as your reference (in this case we use country A)
0-29: 0.0012
30-59: 0.0036
60+: 0.048

2) Calculate the expected number of deaths in each stratum.

Country A:
0-29: 0.0012 x 6,000,000 = 7,200
30-59: 0.0036 x 5,500,000 = 19,800
60+: 0.048 x 2,500,000 = 120,000

Country B:
0-29: 0.0012 x 1,500,000 = 1,800
30-59: 0.0036 x 550,000 = 1,980
60+: 0.048 x 120,000 = 5,760

3) Calculate the total number of expected deaths by summing the number of expected deaths in each stratum.

Country A = 7,200 + 19,800 + 120,000 = 147,000
Country B = 1,800 + 1,980 + 5,760 = 9540

4) Calculate the SMR – the ratio between the observed and expected number of deaths. This needs to be in percentage form.

Country A:
SMR = (Observed 147,000/Expected 147,000) x 100 = 100%

Country B:
SMR = (Observed 15,300/Expected 9,540) x 100 = 160%

Interpretation:
The number of observed deaths in Country B is 60% higher than what we would expect if Country B had the same mortality experience as Country A.
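The same indirect standardisation, sketched in Python with the figures from this card (Country A supplies the reference rates):

# Indirect standardisation sketch using the card's figures
reference_rates = {"0-29": 0.0012, "30-59": 0.0036, "60+": 0.048}             # Country A
country_b_population = {"0-29": 1_500_000, "30-59": 550_000, "60+": 120_000}
observed_deaths_b = 15_300

expected_deaths_b = sum(reference_rates[age] * country_b_population[age]
                        for age in reference_rates)                           # 9,540
smr_b = observed_deaths_b / expected_deaths_b * 100
print(f"Expected deaths = {expected_deaths_b:.0f}, SMR = {smr_b:.0f}%")       # SMR = 160%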

82
Q

What are the steps of direct standardisation?

A

1) Identify a standard population for which relevant stratum-specific data are available

2) Calculate the number of stratum-specific expected deaths for each data set.

3) Calculate the total number of expected deaths by summing all the values from the stratum-specific calculations.

4) Calculate the age-standardised rate by dividing the total number of expected deaths by the total standard population size.

5) Calculate the Comparative Mortality Ratio (CMR).

83
Q

Use direct standardisation to compare the age-standardised mortality rates of the two countries below. How do you interpret the results?

Country A:
Age group: (0-29) Number of deaths: (7000) Population: (6,000,000) Rate per 1000: (1.2)
Age group: (30-59) Number of deaths: (20,000) Population: (5,500,000) Rate per 1000: (3.6)
Age group: (60+) Number of deaths: (120,000) Population: (2,500,000) Rate per 1000: (48)
Total: Number of deaths: (147,000) Population: (14,000,000) Rate per 1000: (10.5)

Country B:
Age group: (0-29) Number of deaths: (6300) Population: (1,500,000) Rate per 1000: (4.2)
Age group: (30-59) Number of deaths: (3000) Population: (550,000) Rate per 1000: (5.5)
Age group: (60+) Number of deaths: (6000) Population: (120,000) Rate per 1000: (50)
Total: Number of deaths: (15,300) Population: (2,170,000) Rate per 1000: (7)

Hypothetical Standard Population:
0-29 - 100,000
30-59 - 65,000
60+ - 20,000
Total - 185,000

A

Step 1) Identify a standard population for which relevant stratum-specific data are available

This is given to you in the question

Step 2) Calculate the number of stratum-specific expected deaths for each data set.

For each age stratum of each population being compared, multiply the age-specific mortality rate by the size of the standard population for that stratum. This gives you the number of deaths one would expect in the standard population if it had the same mortality rates as your study population.

Country A:
0.0012 x 100,000 = 120
0.0036 x 65,000 = 234
0.048 x 20,000 = 960

Country B
0.0042 x 100,000 = 420
0.0055 x 65,000 = 357.5
0.05 x 20,000 = 1,000

Step 3) Calculate the total number of expected deaths by summing all the values from the stratum-specific calculations, above. This gives the total number of deaths that would be expected in the standard population if it had the same mortality rate as your study population.

Country A:
120 + 234 + 960 = 1314
Country B
420 + 357.5+1000 = 1777.5

Step 4) Calculate the age-standardised rate by dividing the total number of expected deaths by the total standard population size.

Country A:
1,314/185,000 = 7.1 per 1,000 pyrs
Country B:
1,777.5/185,000 = 9.6 per 1,000 pyrs

Interpretation:
After controlling for the confounding effects of age, the mortality rate in Country B is 35% higher than in Country A.
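The same direct standardisation, sketched in Python with the figures from this card:

# Direct standardisation sketch using the card's figures
standard_population = {"0-29": 100_000, "30-59": 65_000, "60+": 20_000}
rates_a = {"0-29": 0.0012, "30-59": 0.0036, "60+": 0.048}
rates_b = {"0-29": 0.0042, "30-59": 0.0055, "60+": 0.05}

def age_standardised_rate(rates, standard):
    expected_deaths = sum(rates[age] * standard[age] for age in standard)
    return expected_deaths / sum(standard.values())

rate_a = age_standardised_rate(rates_a, standard_population) * 1000   # 7.1 per 1,000
rate_b = age_standardised_rate(rates_b, standard_population) * 1000   # 9.6 per 1,000
print(round(rate_a, 1), round(rate_b, 1), round(rate_b / rate_a, 2))  # CMR = 1.35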

84
Q

What is the Comparative Mortality Ratio (CMR)?

A

The Comparative Mortality Ratio (CMR) is the ratio of two directly standardised rates. It gives a single summary measure that reflects the difference in mortality between the two populations.

It is calculated by dividing the overall age-standardised rate in, say, country B by the rate in country A.

85
Q

What is a standardised mortality ratio (SMR)?

A

A ratio between two indirectly standardised rates. It gives a single summary measure that reflects the difference in mortality between the two populations. It is always expressed as a percentage.

86
Q

What is the comparative mortality ratio between country B (with an age-standardised mortality rate of 9.6 per 1,000) and country A (with an age-standardised mortality rate of 7.1 per 1,000)?

A

Comparative Mortality Ratio = 9.6/7.1 = 1.35

87
Q

Use the data below to explain the role that standardisation plays in data comparison.

Country A:
Age group: (0-29) Number of deaths: (7000) Population: (6,000,000) Rate per 1000 person-years: (1.2)
Age group: (30-59) Number of deaths: (20,000) Population: (5,500,000) Rate per 1000 person-years : (3.6)
Age group: (60+) Number of deaths: (120,000) Population: (2,500,000) Rate per 1000 person-years: (48)
Total: Number of deaths: (147,000) Population: (14,000,000) Rate per 1000 person-years: (10.5)

Country B:
Age group: (0-29) Number of deaths: (6300) Population: (1,500,000) Rate per 1000 person-years: (4.2)
Age group: (30-59) Number of deaths: (3000) Population: (550,000) Rate per 1000 person-years: (5.5)
Age group: (60+) Number of deaths: (6000) Population: (120,000) Rate per 1000 person-years: (50)
Total: Number of deaths: (15,300) Population: (2,170,000) Rate per 1000 person-years: (7)

A

The overall crude mortality rate is higher for country A (10.5 deaths / 1,000 person-years) compared with country B (7 deaths / 1,000 person-years), despite the age-specific mortality rates being higher among all age groups in country B.

The reason for the differences is that these two populations have markedly different age structures. Country A has a much older population than Country B. For example, 18% of the population in country A are aged over 60 years compared with just 5.5% of the population in country B.

Standardisation allows us to compare these populations and see what the adjusted mortality rate, taking into account these population differences, will be.

88
Q

What are the issues of data standardisation?

A

Standardised rates are used for the comparison of two or more populations; they represent a weighted average of the age-specific rates taken from a ‘standard population’ and are not actual rates.

Certain data is required to perform standardisation. For example, the direct method of standardisation requires that the age-specific rates for all populations are available and the indirect method of standardisation requires the total population size for each category.

As the choice of a standard population will affect the comparison between populations, it should always be stated clearly which standard population has been applied.

89
Q

What are the different types of data?

A

Categorical:
Nominal
Ordinal
Binary

Numeric:
Discrete
Continuous

90
Q

What is nominal data?

A

A type of categorical data without an order.

Examples include blood groups (O, A, B, AB), eye colour and marital status.

91
Q

What is ordinal data?

A

A type of categorical data, where categories have an innate order in which they can be ranked. The “distances” between the different groups can be variable.

Examples include stages of breast cancer.

92
Q

What is binary data?

A

Binary, or dichotomous, data is a type of categorical data where there are only two possible outcomes.

Examples include Yes/No or True/False or “survived” and “not survived”.

93
Q

What is discrete data?

A

Discrete data is a type of numerical data.

It can only take fixed values. Examples include shoe size or number of people.

94
Q

What is continuous data?

A

Continuous data is a type of numerical data.

It can take any value, frequently within a given range. Examples include weight and length (where the range would be from zero to, theoretically, infinity).

95
Q

What are the different types of data scale and what do they mean?

A

Nominal - Naming variables in no particular order e.g. Eye colour

Ordinal - Ranking variables with an inherent order e.g. Breast Cancer Staging

Interval - Variables measured on a scale with equal, meaningful intervals between values but no true zero. An example is temperature measured in degrees Celsius: the difference between 10°C and 20°C is the same as the difference between 30°C and 40°C, so the differences are meaningful. However, 20°C is not twice as hot as 10°C, so the ratios are not meaningful.

Ratio - Ranking variables on a scale with measurable intervals. Ratio data have a true zero and both differences and ratios are meaningful. An example is weight. The difference between 1kg and 2kg is the same as the difference between 3kg and 4kg. In addition, 2kg is twice as much as 1kg, and 10kg is twice as much as 5kg – so ratios are meaningful.

96
Q

What is years of life lost (YLL)?

A

A summary measure of premature mortality.

It estimates the years of potential life lost due to premature deaths taking into account the age at which deaths occur, giving greater weight to deaths at a younger age and lower weight to deaths at an older age.

97
Q

What are the uses of “years of life lost”?

A

You can calculate the YLL of a specific cause of death as a proportion of the total YLL lost in the population due to premature mortality.

This can be used in public health planning to:

Compare the relative importance of different causes of premature deaths within a given population
Set priorities for prevention
Compare the premature mortality experience between populations.

98
Q

How is “Years of life lost” calculated?

A

YLL is calculated by summing, over each age at death from 1 to 74 years, the number of deaths at that age multiplied by the number of years of life remaining up to the age of 75 years.

The upper age limit of 75 years approximates life expectancy in a given population; any upper age limit could potentially be used.

Deaths at age < 1 year are excluded as they are often related to causes originating in the perinatal period, such as congenital anomalies or prematurity.

99
Q

How would you calculate the “Years of life lost” contribution for 10 children who died at the age of 1 year?

A

Number of deaths at the age of 1 year x The number of years lost had each individual lived to the age of 75
= 10 x 74 years
= 740 years
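The same arithmetic, sketched in Python (the upper age limit of 75 years is taken from the card above):

# Years of life lost sketch for the example above
deaths_by_age = {1: 10}   # 10 deaths at age 1 year

yll = sum(count * (75 - age) for age, count in deaths_by_age.items())
print(yll)                # 740 years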

100
Q

What is the “Crude years of life lost rate” and how is it calculated?

A

An expression of the years of life lost (YLL) value relative to the total population aged under 75 years.

It effectively converts the YLL into a rate per a given number of persons (for example, per 10,000 population).

101
Q

How do you calculate the crude years of life lost rate?

A

Crude Years of life lost rate = (Years of life lost/population under 75 years) x 10,000

102
Q

What is disease burden?

A

Disease burden is the impact of a health problem on a given population.

It can be measured using a variety of indicators such as mortality, morbidity or financial cost.

Measuring this allows the burden of disease to be compared between different areas, for example, regions, towns or electoral wards.

103
Q

What are the different measures of disease burden?

A

Multiple different measures can be used to monitor disease burden, such as mortality, morbidity or “years of life lost”.

However, the two best measures are:
Quality-Adjusted Life-Years (QALY)
Disability-Adjusted Life-Years (DALY)

These two are best as they allow direct comparison of the burden of different diseases and take into account both death and morbidity in a single measure.

104
Q

What are Quality-Adjusted-Life Years (QALY)?

A

Quality-Adjusted Life-Years (QALY) are a measure of the life expectancy corrected for the loss of quality of that life caused by diseases and disabilities.

QALY take into account both quantity and the quality of life generated by a healthcare intervention.

A year of life in perfect health is given a QALY value of 1, whilst death is given a value of 0.

105
Q

What are Disability-Adjusted Life-Years (DALY)?

A

Disability-Adjusted Life-Years (DALY) reflect the potential years of life lost due to premature death (YLL) and equivalent years of ‘healthy’ life lost by virtue of being in states of poor health or disability. These disabilities can be physical or mental.

One DALY can be thought of as one lost year of a ‘healthy’ life.

106
Q

What is the Global Burden of Disease Study?

A

The most well-known assessment of disease burden is the Global Burden of Disease (GBD) Study carried out by the World Health Organisation.

This is a regularly updated study that looks to provide age- and sex-stratified estimates of the burden of 333 leading causes of death and disability globally and for 195 countries and regions.

The study started in 1990 and was most recently updated in 2016.

107
Q

What is the purpose of measuring disease burden?

A

Prioritising actions in health and the environment
Planning for preventive action
Assessing performance of healthcare systems
Comparing action and health gain
Identifying high-risk populations
Planning for future needs
Setting priorities in health research

108
Q

What are the causes of variation in an epidemiological study?

A

Measurement errors:
Random error (chance)
Systematic error (bias)
Misclassification (information bias)

Sampling errors

109
Q

What is measurement error?

A

One of the causes of variation within a study.

Measurement error is variability in a study caused by a lack of validity or reliability in the method used to measure the exposures or outcomes.

For example, this could include a faulty blood pressure cuff, or an observer who does not know how to use one properly.

110
Q

What is validity?

A

The degree to which an instrument is capable of accurately measuring what it intends to measure. For example, how well a questionnaire measures the exposure or outcome in a prospective cohort study, or the accuracy of a diagnostic test.

There are 4 main types of validity:
Construct validity
Content validity
Face validity
Criterion validity

Assessing validity requires that an error-free reference test or ‘gold standard’ is available to which the measure can be compared.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
111
Q

What are the types of validity?

A

There are 4 main types of validity:
Construct validity
Content validity
Face validity
Criterion validity

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
112
Q

What is construct validity?

A

The extent to which the instrument specifically measures what it is intended to measure, and avoids measuring other things.

For example, a measure of intelligence should only assess factors relevant to intelligence and not, for instance, whether someone is a hard worker. Construct validity subsumes the other types of validity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
113
Q

What is content validity?

A

Content validity describes whether an instrument is systematically and comprehensively representative of the trait it is measuring. For example, a questionnaire aiming to score anxiety should include questions aimed at a broad range of features of anxiety.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
114
Q

What is face validity?

A

Face validity is the degree to which a test is subjectively thought to measure what it intends to measure. In other words, does it “look like” it measures what it is supposed to? The subjective opinion for face validity can come from experts, from those administering the instrument, or from those using the instrument.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
115
Q

What is criterion validity?

A

Criterion validity involves comparing the instrument in question with another criterion which is taken to be representative of the measure. This can take the form of concurrent validity (where the instrument results are correlated with those of an established, or gold standard, instrument), or predictive validity (where the instrument results are correlated with future outcomes, whether they be measured by the same instrument or a different one).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
116
Q

How do you assess validity?

A

Validity is measured by sensitivity and specificity.

These can be calculated by two main methods:
Comparing the test with the best available clinical assessment. For example, a self-administered psychiatric questionnaire may be compared with the majority opinion of an expert psychiatric panel.

Testing its ability to predict some other relevant finding or event, such as the ability of glycosuria (glucose in the urine) to predict an abnormal glucose tolerance test, or of a questionnaire to predict future illness.

The results can then be plotted in a 2x2 contingency table, classifying individuals as positive or negative for the outcome, first on the basis of the survey or new instrument, and then according to the reference test.

                 Ref Test Pos   Ref Test Neg   Total
New Test Pos     a              b              a+b
New Test Neg     c              d              c+d
Total            a+c            b+d

Sensitivity = a/(a+c) - a sensitive test detects a high proportion of the true cases

Specificity = d/(b+d) - a specific test has few false positives

Systematic error = (a+b)/(a+c) - the ratio of the total number of positives on the new test to the total number of positives on the reference test; a ratio above 1 indicates that the new test systematically over-counts positives, and a ratio below 1 that it under-counts them.

Positive predictive value = a/(a+b) - the proportion of test positives that are truly positive.
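
As a rough illustration, the sketch below works through these formulae for one hypothetical 2x2 table (the counts a, b, c and d are made up for the example):

```python
# Hypothetical 2x2 counts: new test vs reference ("gold standard") test.
a = 80   # new test positive, reference positive (true positives)
b = 30   # new test positive, reference negative (false positives)
c = 20   # new test negative, reference positive (false negatives)
d = 870  # new test negative, reference negative (true negatives)

sensitivity = a / (a + c)            # proportion of true cases detected
specificity = d / (b + d)            # proportion of non-cases correctly negative
ppv = a / (a + b)                    # proportion of test positives that are true positives
systematic_error = (a + b) / (a + c) # >1: new test over-counts positives; <1: under-counts

print(f"Sensitivity={sensitivity:.2f}, Specificity={specificity:.2f}, "
      f"PPV={ppv:.2f}, Systematic error ratio={systematic_error:.2f}")
# Sensitivity=0.80, Specificity=0.97, PPV=0.73, Systematic error ratio=1.10
```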

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
117
Q

How can the validity of a test be improved?

A

Training observers and considering the setting of observation
Ensure an appropriate and representative sample, and consider the effect of reflexivity (the effect of observation and the observer on participants)
Ensure the results of observations are accurately recorded, for example by having two observers, or by recording spoken responses
Triangulate responses by repeating observations, or by assessing the outcome of interest with additional instruments

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
118
Q

What is Reliability?

A

Reliability, also known as reproducibility, refers to the consistency of the performance of an instrument over time and among different observers.

A highly reliable measure produces similar results under similar conditions so, all things being equal, repeated testing should produce similar results.

There are 4 main methods of testing the reliability of an instrument:
Inter-rater (or inter-observer) reliability
Intra-rater (or intra-observer) reliability
Inter-method reliability
Internal consistency reliability

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
119
Q

What is Inter-rater (or inter-observer) reliability?

A

The degree of agreement between the results when two or more observers administer the instrument on the same subject under the same conditions.

Inter-rater reliability can be measured using Cohen’s kappa (k) statistic. Kappa indicates how well two sets of (categorical) measurements compare.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
120
Q

What is Intra-rater (or intra-observer) reliability and how is it measured?

A

Also called repeatability or test-retest reliability

This describes the agreement between results when the instrument is used by the same observer on two or more occasions (under the same conditions and in the same test population).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
121
Q

What is Cohen’s kappa (k) and how is it interpreted?

A

Cohen’s kappa (k) is a measure of inter-rater reliability.

It is more robust than simple percentage agreement as it accounts for the possibility that a repeated measure agrees by chance.

Kappa values range from -1 to 1, where values ≤0 indicate no agreement other than that which would be expected by chance, and 1 is perfect agreement.

Values above 0.6 are generally deemed to represent good agreement.
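
A minimal sketch of how kappa can be computed by hand from two raters' categorical judgements, using the standard formula kappa = (observed agreement − expected chance agreement) / (1 − expected chance agreement); the ratings below are hypothetical:

```python
# Minimal sketch of Cohen's kappa for two raters' yes/no ratings (hypothetical data).
from collections import Counter

rater1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater2 = ["yes", "no",  "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

n = len(rater1)
p_observed = sum(r1 == r2 for r1, r2 in zip(rater1, rater2)) / n

# Expected agreement by chance, from each rater's marginal proportions
counts1, counts2 = Counter(rater1), Counter(rater2)
categories = set(rater1) | set(rater2)
p_expected = sum((counts1[cat] / n) * (counts2[cat] / n) for cat in categories)

kappa = (p_observed - p_expected) / (1 - p_expected)
print(f"Observed={p_observed:.2f}, Expected={p_expected:.2f}, Kappa={kappa:.2f}")
# Observed=0.80, Expected=0.52, Kappa=0.58 for these made-up ratings
```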

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
122
Q

What are the disadvantages of Cohen’s Kappa (k)?

A

It can underestimate agreement for rare outcomes
It requires the two raters to be independent

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
123
Q

What is Inter-method reliability?

A

Also known as equivalence.

This is the degree to which two or more instruments, that are used to measure the same thing, agree on the result.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
124
Q

How can a test’s reliability be improved?

A

Training of observers
Clear definitions of terminology, criteria and protocols
Regular observation and review of techniques
Identifying causes of discrepancies and acting on them

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
125
Q

How are reliability and validity related?

A

What may be valid for a group or a population may not be so for an individual in a clinical setting. When the reliability or repeatability of the test is poor, the validity of the test for a given individual may also be poor.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
126
Q

What is Internal consistency reliability and how is it measured?

A

This is the degree of agreement, or consistency, between different parts of a single instrument.

Internal consistency is measured using Cronbach’s alpha (α) – a statistic derived from pairwise correlations between items that should produce similar results.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
127
Q

What is generalisability?

A

Also known as external validity.

The extent to which the findings of a study are applicable to other settings.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
128
Q

What makes a result generalisable?

A

To be generalisable, the results must first have a suitable level of internal validity; beyond that, generalisability is a judgement based on:

The characteristics of the participants (including the demographic and clinical characteristics, as affected by the source population, response rate, inclusion criteria, etc.)
The setting of the study
The interventions or exposures studied

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
129
Q

What elements make a study less generalisable?

A

Restrictions within the original study (eligibility criteria),
Pre-test/post-test effects (where cause-effect relationships within a study are only found when pre-tests or post-tests are also carried out).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
130
Q

What is Cronbach’s alpha (α) and how is it interpreted?

A

Cronbach’s alpha (α) is a measure of internal consistency.

The usual range for the alpha will be zero to one, with values above 0.7 generally deemed acceptable, and a figure of one indicating perfect internal consistency.

A negative value will occur if the choice of items is poor and there is an inconsistency between them, or the sampling method is faulty.
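
A minimal sketch of the usual item-variance formula for alpha, α = (k/(k−1)) × (1 − Σ item variances / variance of total scores), applied to a small hypothetical set of questionnaire responses:

```python
# Minimal sketch of Cronbach's alpha for a questionnaire with k items
# (hypothetical scores; rows = respondents, columns = items).
import numpy as np

scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
])

k = scores.shape[1]
item_variances = scores.var(axis=0, ddof=1)      # variance of each item
total_variance = scores.sum(axis=1).var(ddof=1)  # variance of respondents' total scores

alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha = {alpha:.2f}")  # about 0.94 for these made-up responses
```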

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
131
Q

What is random error?

A

One of the causes of variation within a study.

Random error (also called chance) is the variation in the study caused by chance differences between the recorded and true values.

These variations may arise from unbiased measurement errors (e.g. weight of an individual can vary between measurements due to limited precision of scales) or biological variation within an individual (e.g. blood pressure or body temperature, which are likely to vary between measurements).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
132
Q

What is systematic error?

A

One of the causes of variation within a study.

Systematic error (also called bias) is variation in a study caused by a consistent difference between the recorded value and the true value across a series of observations, which results in some individuals being systematically misclassified.

For example, if the height of an individual is always measured when the person is wearing the same shoes, the measurement will be consistent, but the results will have a systematic bias.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
133
Q

What is misclassification in relation to variation within a study?

A

One of the causes of variation within a study.

Misclassification, also called information bias, refers to variation in a study caused by the misclassification of an individual, value or attribute into a category other than that to which it should be assigned.

This misclassification can be either differential or non-differential.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
134
Q

What is non-differential misclassification and what does it do to your results?

A

Non-differential (random) misclassification is a type of misclassification.

This involves the misclassification of variables with equal probability in all study participants, regardless of the groups being compared. That is, the probability of exposure being misclassified is independent of disease status and the probability of disease status being misclassified is independent of exposure status.

Non-differential misclassification increases the similarity between the exposed and non-exposed groups, and may result in an underestimate (dilution) of the true strength of an association between exposure and disease.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
135
Q

What is differential misclassification and what does it do to your results?

A

Differential (non-random) misclassification occurs when the proportion of subjects being misclassified differs between the study groups.

That is, the probability of exposure being misclassified is dependent on disease status, or the probability of disease status being misclassified is dependent on exposure status.

The direction of bias arising from differential misclassification may be unpredictable but, where it is known and quantifiable, differential misclassification may be compensated for in the statistical analysis.

Differential misclassification may be introduced in a study as a result of:

Recall bias (differences in the accuracy of recollections by study participants)
Observer/interviewer bias

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
136
Q

What is the difference between differential and non-differential misclassification?

A

In non-differential misclassification, the misclassification is the same between the study groups, whereas in differential misclassification the two groups are misclassified differently.

Non-differential misclassification only results in an underestimation of the study results whereas differential misclassification may result in an under- or overestimation of the true association.

Differential misclassification is considered to be worse than Non-differential misclassification.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
137
Q

What is Sampling Error and how may it affect your results?

A

One of the causes of variation within a study.

Sampling error is the variation in a study that occurs when the sample selected does not (by chance alone) represent the entire population of data.

Sampling errors may result in:
Type I error (α) - Rejecting the null hypothesis when it is true (a “false positive”)
Type II error (β) - Failing to reject the null hypothesis when it is false (a “false negative”)

138
Q

How is sampling error measured?

A

This difference between the sample taken and the true population is referred to as the sampling error.

The variability in the sampling error is measured by the standard error.

The likely size of the sampling error in a study is expressed using confidence intervals and p-values.
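
A minimal sketch of how the standard error of a sample mean and an approximate 95% confidence interval (normal approximation) might be computed; the blood pressure readings below are hypothetical:

```python
# Minimal sketch: standard error of a sample mean and an approximate 95%
# confidence interval (hypothetical systolic blood pressure readings).
import statistics

sample = [128, 135, 121, 140, 132, 126, 138, 130, 124, 136]

n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / n ** 0.5   # standard error of the mean

# Approximate 95% CI using the normal approximation (mean +/- 1.96 x SE)
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se
print(f"mean={mean:.1f}, SE={se:.1f}, 95% CI=({ci_low:.1f}, {ci_high:.1f})")
```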

139
Q

How do you reduce sampling error?

A

Sampling error decreases as the sample size increases.

Sampling error cannot be fully eliminated, only reduced.

140
Q

In what ways can a subject cause variation in their own results?

A

Differences in results obtained from the same subject on different occasions may be due to several factors, including:

Physiological changes – e.g. blood pressure, pulse
Factors affecting response to a question – e.g. rapport with the interviewer
Changes because the participant is aware they are being studied – e.g. courtesy bias, giving the answer they believe the interviewer wants to hear

141
Q

In what way can an observer cause variation in results?

A

Variations in recording observations arise for several reasons including bias, errors, and lack of skill or training. There are two principal types:

Intra-observer variation - Inconsistency of an observer in recording repeat results
Inter-observer variation - Failure of different observers to record the same results

142
Q

In what way can technical issues cause a variation in results?

A

Technical equipment may give incorrect results for several reasons, including:

The method is unreliable – e.g. peak flow rate in asthma
Faults in the test system – e.g. defective instruments, poor calibration
Absence of an accurate test

143
Q

How do you reduce variability in measurements?

A

Prior to starting data collection, careful thought should be given to potential sources of error, bias and variation in measurements, and every effort made to minimise them.

Principles of avoiding unnecessary variation include:
Using clearly defined diagnostic criteria
Observing participants under similar biological/environmental conditions
Training observers
Blinding observers and participants to the study hypothesis
Using calibrated, easy-to-use equipment
Employing standardised measurement methods
Piloting questionnaires to identify ambiguous questions

Furthermore, when data are processed, sensitivity analyses should be conducted and presented to test how robust the study findings are to variations in, for example, classifications or assumptions.

144
Q

What are the possible causes for an observed association?

A

Chance (random error)
Bias (systematic error)
Confounding
Reverse causality
True causality

145
Q

What is reverse causality?

A

Where an association between an exposure and an outcome is not due to direct causality from exposure to outcome, but rather because the defined “outcome” actually results in a change in the defined “exposure”.

For example, a study may find an association between using recreational drugs (exposure) and poor mental wellbeing (outcome) and thus conclude that using drugs is likely to impair wellbeing. A reverse causation explanation could be that people with poor mental wellbeing are more likely to use recreational drugs as, say, a means of escapism.

146
Q

What can you do to assess if there is true causality in a relationship?

A

Apply the Bradford Hill criteria.

These criteria are widely used in epidemiology as a framework with which to assess whether an observed association is likely to be causal.

147
Q

What are the Bradford Hill criteria?

A

Strength of association – The stronger the association, or magnitude of the risk, between a risk factor and outcome, the more likely the relationship is thought to be causal.

Consistency – The same findings have been observed among different populations, using different study designs and at different times.

Specificity – There is a one-to-one relationship between the exposure and outcome. Note that this is uncommon in reality.

Temporal sequence – The exposure must precede outcome (to exclude reverse causation).

Biological gradient – Changes in the intensity of the exposure result in a change in the severity or risk of the outcome (i.e. a dose-response relationship).

Biological plausibility – There is a potential biological mechanism which explains the association.

Coherence – The relationship found agrees with the current knowledge of the natural history/biology of the disease.

Experiment – Removal of the exposure alters the frequency of the outcome.

Analogy – The relationship is in line with (i.e. analogous to) other established cause-effect relationships. For example, knowing of the teratogenic effects of thalidomide, we may accept a cause-effect relationship for a similar agent based on slighter evidence.

148
Q

What are the potential weaknesses of the Bradford Hill Criteria?

A

Rothman pointed out multiple weaknesses in the Bradford Hill Criteria:

The criteria were not designed to be a checklist, however, they are often used this way.

“Strength of association” does not account for the fact that not every component cause will have a strong association with the disease it produces. The strength of an association also depends on the prevalence of other factors.

“Specificity” suggests that a relationship is more likely to be causal if the exposure is related to a single outcome; however, a cause may have many effects, for example, smoking.

“Biological gradient” suggests that the plausibility of a causal association is increased if a dose-response curve can be demonstrated. However, such relationships may also result from confounding or other biases.

The only criterion that Rothman considered as a true causal criterion is ‘temporality’, however, it can be difficult to ascertain the time sequence for cause and effect.

149
Q

What is bias?

A

Bias is any systematic error in a study that results in an incorrect estimate of the true effect of an exposure on the outcome of interest.

150
Q

What are the main types of bias?

A

Information Bias:
Observer Bias
Interviewer Bias
Recall Bias
Social Desirability Bias
Performance Bias
Detection Bias

Selection Bias:
Sampling Bias
Loss to follow up
Allocation Bias

151
Q

What is Observer Bias?

A

A type of information bias.

Observer bias results from an investigator’s prior knowledge of the hypothesis under investigation or knowledge of an individual’s exposure or disease status.

This may influence the way information is collected, measured or interpreted by the investigator for each of the study groups.

For example, in a trial of a new medication to treat hypertension, if the investigator is aware which treatment arm participants were allocated to, this may influence their reading of blood pressure measurements. Observers may underestimate the blood pressure in those who have been treated, and overestimate it in those in the control group.

152
Q

What is Interviewer Bias?

A

A type of information bias.

Interviewer bias occurs when an interviewer asks leading questions that may systematically influence the responses given by interviewees.

153
Q

How do you minimise interviewer bias?

A

Blind observers to the hypothesis, exposure and outcome where possible.
Development of a protocol for the collection, measurement and interpretation of information.
Use of standardised questionnaires or calibrated instruments, such as sphygmomanometers.
Training of interviewers.

154
Q

What is Recall Bias?

A

A type of information bias.

Recall bias occurs when the information provided on exposure differs between the cases and controls. For example, an individual with the outcome under investigation (a case) may report their exposure experience differently from an individual without the outcome (a control).

This is more common in case-control studies, where exposure data are collected retrospectively. The quality of the data is therefore determined to a large extent by the participant’s ability to accurately recall past exposures, and those with the outcome may be more prone to remember the exposure.

155
Q

How do you minimise recall bias?

A

Collecting exposure data from work or medical records.
Blinding participants to the study hypothesis.

156
Q

What is Social Desirability Bias?

A

A type of information bias.

Social desirability bias occurs where respondents to surveys tend to answer in a manner they feel will be seen as favourable by others, for example by over-reporting positive behaviours or under-reporting undesirable ones.

157
Q

What is Performance Bias?

A

A type of information bias.

Performance bias refers to when study personnel or participants modify their behaviour/responses when they are aware of group allocations.

158
Q

What is Detection Bias?

A

A type of information bias.

Detection bias occurs when the way in which outcome information is collected differs between groups.

159
Q

How do you limit detection bias?

A

Blind outcome assessors

160
Q

What is instrument bias?

A

A type of information bias.

Instrument bias refers to where an inadequately calibrated measuring instrument systematically over/underestimates measurement.

161
Q

How do you limit instrument bias?

A

Use standardised, calibrated instruments.

162
Q

What is Sampling Bias?

A

A type of selection bias.

Sampling bias describes the scenario in which some individuals within a target population are more likely to be selected for inclusion than others.

For example, if participants are asked to volunteer for a study, it is likely that those who volunteer will not be representative of the general population, threatening the generalisability of the study results.

Volunteers tend to be more health conscious than the general population.

163
Q

What is Loss to follow up bias?

A

A type of selection bias.

Loss to follow-up is a particular problem associated with cohort studies.

Bias may be introduced if the individuals lost to follow-up differ with respect to the exposure and outcome from those persons who remain in the study.

164
Q

What is attrition bias?

A

A type of selection bias.

Attrition bias is a particular problem associated with randomised control trials.

It is caused by the differential loss of participants from groups.

165
Q

What is Allocation Bias?

A

A type of selection bias.

Allocation bias occurs in controlled trials when there is a systematic difference between participants in study groups (other than the intervention being studied). This can be avoided by randomisation.

166
Q

How do you limit allocation bias?

A

Random Allocation

167
Q

What is information bias?

A

Information bias is one of the two main categories of bias.

Information bias results from systematic differences in the way data on exposure or outcome are obtained from the various study groups. This may mean that individuals are assigned to the wrong outcome category, leading to an incorrect estimate of the association between exposure and outcome.

Examples of information bias include:
Observer Bias
Interviewer Bias
Recall Bias
Social Desirability Bias
Performance Bias
Detection Bias

168
Q

What is selection bias?

A

Selection bias is one of the two main categories of bias.

Selection bias occurs when there is a systematic difference between either:
Those who participate in the study and those who do not (affecting generalisability).
Those in the treatment arm of a study and those in the control group (affecting comparability between groups).

Examples of selection bias include:
Sampling Bias
Loss to follow up
Allocation Bias

169
Q

Which type of studies are most and least affected by selection bias?

A

Case-control studies are the most affected. Controls need to be drawn from the same population as the cases (so they are representative of the population which produced the cases), however sampling from hospitals or clinics is common and results in the individuals being selected as controls being unrepresentative of the population that produced the cases.

Cohort studies are less affected than case-control, as participants are chosen before they develop the outcome.

170
Q

Which type of studies are the most and least affected by selection bias?

A

Case-control studies are the most affected. Controls need to be drawn from the same population as the cases (so they are representative of the population which produced the cases), however sampling from hospitals or clinics is common and results in the individuals being selected as controls being unrepresentative of the population that produced the cases.

Cohort studies are less affected than case control, as participants are chosen before they develop the outcome. However, they are still prone to “loss to follow up” bias, as those exposed may be more forthcoming and may be all found in the same place (for example all in one factory) and so are easier to follow up.

Randomized control trials are supposedly the least affected as everything is randomised, however, refusals to participate in a study, or subsequent withdrawals, may affect the results if the reasons are related to both exposure and outcome.

171
Q

What is The healthy worker effect?

A

The healthy worker effect is a potential form of selection bias specific to occupational cohort studies caused by the fact that working populations are inherently more healthy than the general population (because they are fit enough to work).

This means that mortality or morbidity rates in the occupation group cohort may be lower than in the population as a whole.

In order to minimise this, comparison groups should be selected from other groups of workers.

172
Q

What is a confounder?

A

Confounding occurs when an observed association is distorted because the exposure is also correlated with another risk factor, masking or distorting the true effect of the exposure. This other risk factor is also associated with the outcome, independently of the exposure under investigation.

An unequal distribution of this additional risk factor between the study groups results in confounding.

In order for a variable to be considered as a confounder:
The variable must be independently associated with the outcome (i.e. be a risk factor).
The variable must also be associated with the exposure under study in the source population.
The variable should not lie on the causal pathway between exposure and disease.

For example:
A study found alcohol consumption to be associated with the risk of coronary heart disease (CHD). However, smoking may have confounded the association between alcohol and CHD.
Smoking is a risk factor in its own right for CHD, so is independently associated with the outcome, and smoking is also associated with alcohol consumption because smokers tend to drink more than non-smokers.
Controlling for the potential confounding effect of smoking may in fact show no association between alcohol consumption and CHD.

173
Q

What are the possible effects of confounding?

A

An observed association when no real association exists.
No observed association when a true association does exist.
An underestimate of the association (negative confounding).
An overestimate of the association (positive confounding).

174
Q

How can you control for confounding factors?

A

At the design stage:
Randomisation
Restriction
Matching

At the analysis stage:
Stratification
Multivariable analysis
Standardisation

175
Q

How does randomisation reduce confounding factors?

A

Complete randomisation would mean that all potentially confounding variables, both known and unknown, should be equally distributed between the study groups.

This is generally seen as the best method for removing confounders in the design stage.

176
Q

How does matching reduce confounders?

A

Matching involves selecting controls so that the distribution of potential confounders (e.g. age or smoking status) is as similar as possible to that amongst the cases. In practice this is only utilised in case-control studies, but it can be done in two ways:

Pair matching - selecting for each case one or more controls with similar characteristics (e.g. same age and smoking habits)
Frequency matching - ensuring that as a group the cases have similar characteristics to the controls

177
Q

How does restriction reduce confounding factors?

A

Restriction limits participation in the study to individuals who are similar in relation to the confounder. For example, if participation in a study is restricted to non-smokers only, any potential confounding effect of smoking will be eliminated. However, a disadvantage of restriction is that it may be difficult to generalise the results of the study to the wider population if the study group is homogenous.

178
Q

How does stratification reduce the effect of confounding?

A

Stratification allows the association between exposure and outcome to be examined within different strata of the confounding variable, for example by age or sex.

The strength of the association is initially measured separately within each stratum of the confounding variable.

Assuming the stratum specific rates are relatively uniform, they may then be pooled to give a summary estimate as adjusted or controlled for the potential confounder.

An example is the Mantel-Haenszel method. One drawback of this method is that the more the original sample is stratified, the smaller each stratum will become, and the power to detect associations is reduced.

It should be noted that this method should not be used to look for confounders, but instead to reduce the effect of confounders you already suspect.
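
A minimal sketch of the Mantel-Haenszel pooled odds ratio, OR = Σ(a·d/n) / Σ(b·c/n) summed over strata of the suspected confounder; the stratum counts below are hypothetical:

```python
# Minimal sketch of a Mantel-Haenszel pooled odds ratio across strata of a
# suspected confounder (hypothetical counts). Each stratum is a 2x2 table:
# a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls.

strata = [
    {"a": 30, "b": 70, "c": 20, "d": 80},   # e.g. smokers
    {"a": 10, "b": 40, "c": 15, "d": 85},   # e.g. non-smokers
]

numerator = sum(s["a"] * s["d"] / (s["a"] + s["b"] + s["c"] + s["d"]) for s in strata)
denominator = sum(s["b"] * s["c"] / (s["a"] + s["b"] + s["c"] + s["d"]) for s in strata)

or_mh = numerator / denominator
print(f"Mantel-Haenszel pooled OR = {or_mh:.2f}")  # about 1.61 for these made-up strata
```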

179
Q

How does Statistical modelling reduce the effect of confounding?

A

Statistical modelling (e.g. multivariable regression analysis) is used to control for more than one confounder at the same time, and allows for the interpretation of the effect of each confounder individually. It is the most commonly used method for dealing with confounding at the analysis stage.

It should be noted that this method should not be used to look for confounders, but instead to reduce the effect of confounders you already suspect.

180
Q

How does standardisation reduce the effect of confounding?

A

Standardisation accounts for confounders (generally age and sex) by using a standard reference population to negate the effect of differences in the distribution of confounding factors between study populations.

It should be noted that this method should not be used to look for confounders, but instead to reduce the effect of confounders you already suspect.

181
Q

What is Residual confounding?

A

Residual confounding occurs when all confounders have not been adequately adjusted for, either because they have been inaccurately measured, or because they have not been measured (for example, unknown confounders). An example would be socioeconomic status, because it influences multiple health outcomes but is difficult to measure accurately.

182
Q

What is interaction (effect modification)?

A

Interaction occurs when the direction or magnitude of an association between two variables varies according to the level of a third variable (the effect modifier).

For example, aspirin can be used to manage the symptoms of viral illnesses, such as influenza. However, whilst it may be effective in adults, aspirin use in children with viral illnesses is associated with liver dysfunction and brain damage (Reye’s syndrome). In this case, the effect of aspirin on managing viral illnesses is modified by age.

Where interaction exists, calculating an overall estimate of an association may be misleading.

183
Q

How do you adjust for the effect of interaction in your study results?

A

Unlike confounding, interaction is a biological phenomenon and should not be statistically adjusted for.

A common method of dealing with interaction is to analyse and present the associations for each level of the third variable.

Interaction can be confirmed statistically, for example using a chi-squared test to assess for heterogeneity in the stratum-specific estimates. However, such tests are known to have a low power for detecting interaction and a visual inspection of stratum-specific estimates is also recommended.

184
Q

What are descriptive studies?

A

Descriptive studies are exploratory in purpose: they start with no hypothesis and instead highlight patterns of disease and associated factors.

185
Q

What are the types of descriptive study?

A

Case reports (a report of a single case of an unusual disease or association)
Case series (a description of several similar cases)
Cross-sectional studies
Ecological Studies

186
Q

What are ecological studies?

A

A type of descriptive study.

They examine populations, or groups, as the unit of observation instead of individuals.

Ecological studies are particularly useful to conduct when individual-level data would either be difficult or impossible to collect, such as the effect of air pollution or of legislation.

Examples of the use of ecological studies include:
Correlating population disease rates with factors of interest, such as healthcare use
Demonstrating changes in mortality over time (time series)
Comparing the prevalence of a disease between different regions at a single point in time (geographical studies)

187
Q

What are the advantages and disadvantages of case reports and case series?

A

Advantages:
Cheap
Easy to complete

Disadvantages:
No comparison (control) group
Cannot be tested for statistical associations
Especially prone to publication bias (particularly where case reports/series describe the effectiveness of an intervention)

188
Q

What are the advantages and disadvantages of ecological studies?

A

Advantages
Exposure data is often readily available at the area level.
Differences in exposure between areas may be bigger than at the individual level, and so are more easily examined.
Geographical information systems can be utilised to examine the spatial framework of disease and exposure.

Disadvantages:
Ecological fallacy
Potential systematic differences between areas in recording disease frequency e.g. Coding differences
Potential systematic differences between areas in the measurement of exposures.
Lack of available data on confounding factors.

189
Q

What is the purpose of case reports?

A

Generating hypotheses of possible causes or determinants of disease.
Identifying novel associations

190
Q

What is the purpose of case series?

A

Generating hypotheses of possible causes or determinants of disease.

191
Q

What is the purpose of ecological studies?

A

Generating hypotheses of possible causes or determinants of disease.
Comparing countries or regions
Studying group-level effects (for example, the correlation between death rates from cardiovascular disease and cigarette sales per capita).

192
Q

What is the ecological fallacy?

A

The ecological fallacy is an error in the interpretation of the results of an ecological study, where conclusions are inferred about individuals from the results of aggregate data.

This fallacy occurs as the individual members of a group do not all have the average characteristics of the group as a whole, thus any association observed between variables at the group level do not apply to the individuals.

For example, it has been observed that the number of televisions per capita is negatively associated with the rate of deaths from heart disease. However, it would be an ecological fallacy to infer that individuals who do not own televisions are more likely to die from heart disease.

193
Q

Why does the ecological fallacy occur?

A

It is not possible to link exposure with disease in individuals - those with disease may not be the same people in the population who are exposed.
The data used may have originally been collected for other purposes.
Use of average exposure levels may mask more complicated relationships with the disease, such as the J-shaped relationship between alcohol consumption and heart disease.
Inability to control for confounding.

194
Q

What is Small Area Analysis (SAA)?

A

Small-area analysis (SAA) is the examination of data for groups, such as towns, which tend to be more homogenous in character compared with larger populations that are likely to be more diverse.

What counts as a “small area” is not strictly defined, however, they are typically regions for which data such as healthcare usage are readily available, for example, a county, city or postcode area.

195
Q

What are the uses of Small Area Analysis?

A

The main use is to assess service utilisation for health services research
Evaluating descriptive statistics
Comparing small areas
Increasing knowledge of an area – E.g. socioeconomic variations across geographic areas
Informing public policy
Supporting decision-making – Resource allocation

196
Q

Why may healthcare services be utilised differently in different areas?

A

Different levels of illness
Differences in illness behaviour
Diagnostic decisions of physicians or treatment decisions by physicians
Different availability of resources.

197
Q

Where is small area analysis data published?

A

Small-area statistics are published by the ONS as Super Output Areas (SOAs)

198
Q

What are super output areas (SOAs)?

A

SOAs are the way that small area analysis data is published by the ONS in the UK.

SOAs are aggregates of adjacent output areas (OAs), which each have similar social characteristics. There are two categories of SOA, each of a different size. This permits a choice of scale for the publication of data while minimising the risk that local data could be disclosive.

SOAs provide a basis for comparison across the country because the units are all a similar size.

The ONS has defined 34,378 ‘Lower Layer’ SOAs (LSOAs, typically contain 4 to 6 OAs, with a population of around 1500) in England and Wales and 7,193 larger ‘Middle Layer’ SOAs (MSOAs, built from groups of LSOAs, with an average population of 7200). There are no “Upper Layer” SOAs.

199
Q

What data is typically included in small-area analysis?

A

Censuses and population estimations
Administrative records such as birth and death notifications
Hospital Episode Statistics and other healthcare utilisation data

Perinatal mortality
A&E attendances
Cancer deaths
Attempted suicides
Vaccination uptake

200
Q

What are the stages of small-area analysis?

A

Method:
Identify/define geographic boundaries of the areas of interest
Estimate resources allocated to the population of each area (e.g. number of hospital beds)
Calculate crude or age-adjusted (using indirect standardisation) utilisation rates. (Rates represent events, not persons, and patients repeatedly using services are counted each time).

Analysis:
Direct comparisons of rates between the areas of interest, to identify areas of high need
Correlation analyses to establish general relationships between health indicators and social and economic characteristics

201
Q

What are the problems with small-area analysis?

A

Small sample sizes:
Prone to sample not being representative
May lack sufficient data for event analysis
Errors have a greater impact
Particular events are more likely to disrupt trends

Difficulty allocating events to an area (e.g. lack of patient postcode)

Absence of denominator data (e.g. population size)

Populations may differ in their structure and size, making direct comparisons difficult.

Changing geographical boundaries of small areas undermines the consistency of historical data.

Case definitions used in healthcare records may change over time, or vary by locality.

Routine data may not be available

Cases may be identifiable in small data sets, threatening patient confidentiality.

202
Q

In which types of studies may data be clustered?

A

Randomised control trials (called clustered RCTs)
Longitudinal studies (Cohort, Panel, Record linkage)
Deriving subjects for surveys

203
Q

How is data clustered in randomised control studies?

A

Patients are nested within larger clusters, or groups, such as GP practices, hospitals or communities. The intervention is applied at the cluster level, while the outcomes are measured at the patient level.

204
Q

What is a clustered randomised control trial?

A

A randomised control trial where the patients are clustered into larger groups for practical reasons.

Examples:
Randomising clustered family units when assessing a dietary intervention in an RCT (limits the chance of different family members having to have separate diets)
Randomising clustered GP practices when assessing interventions to tackle smoking (limits requirement of GP practice to offer two different interventions)

205
Q

What are the advantages of clustered data in an epidemiological study?

A

The effects of interventions applied at the cluster level might be greater (e.g. social networks reinforcing health promotion).
Reduces the risk of contamination (e.g. Patients in different treatment arms discuss their respective interventions).
More suitable for interventions where a population or group is the unit of randomisation and intervention
May be cheaper
May be quicker
Can be more practical for certain types of study

206
Q

What are the disadvantages of clustered data in an epidemiological study?

A

More complex design to take account of intra-cluster correlation (ICC)
More complex analysis because there are two levels of inference rather than one - the cluster level and the individual level
Greater sample size is needed to achieve sufficient statistical power
Greater sample size requires more costs/resources/time
Requires necessary skills in design and analysis
May be more complex to assess generalisability – for example are the results applicable to clusters, individuals or both?

207
Q

What is Intra-cluster correlation coefficient (ICC) or (ρ)?

A

A measure of the relatedness, or similarity, of clustered data. It is depicted by the Greek letter rho – ρ.

208
Q

How is Intra-cluster correlation coefficient (ICC) or (ρ) interpreted?

A

Values of ρ range from 0 to 1 in human studies, and as the ICC increases the more individuals within the individual clusters resemble one another.

A very small value for ρ implies that the within-cluster variance is much greater than the between-cluster variance, and a ρ of 0 shows that there is no correlation of responses within a cluster.

If ρ = 1, all responses within a cluster are identical and the effective sample size is reduced to the number of clusters rather than the number of individuals

If ρ = 0, there is no correlation of responses within a cluster, and individuals within and amongst the group are independent with respect to that variable

209
Q

What is the interaction between sample size and Intra-cluster correlation coefficient (ICC) or (ρ)?

A

A higher value of the ICC (ρ) indicates that clusters are internally homogeneous (i.e. the individuals within each cluster are very similar to one another).

As the ICC increases, the sample size required to detect a significant difference for the variable under investigation increases.

This is because as clusters become more similar, there is a net loss of data. For example, if a trial includes four GP practices, each enrolling 25 patients, there are 100 subjects in the study. However, from a statistical perspective, similarities between subjects in the same cluster effectively reduce the number of participants in the trial.

As the ICC gets larger, there are fewer subjects enrolled “statistically”.

210
Q

How do you decide on the required sample for a clustered randomised control trial?

A

As the Intra-cluster correlation coefficient increases, the sample size required to detect a significant difference for the variable under investigation increases. Thus this needs to first be estimated before a clustered RCT is done.

The ICC can be estimated through a pilot study. The ‘design effect’ (DE) can then be used to estimate the extent to which the sample size should be inflated to accommodate the homogeneity in the clustered data:

DE = 1+(n-1)ρ

n = average cluster size
ρ = ICC for the desired outcome

The DE can then be used to calculate the ‘effective sample size’ (ESS). This is the ‘real’ sample size in a clustered trial, compared with the number of participants actually enrolled in the study. It is calculated using the formula below:

ESS = (m*k)/DE

m = number of subjects in a cluster
k = number of clusters
DE = Design effect
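
A minimal sketch of these two formulae in code; the helper function names are arbitrary, and the example reuses the figures from the worked example in the next card:

```python
# Minimal sketch of the design effect (DE) and effective sample size (ESS)
# for a clustered trial, using the formulae above.

def design_effect(avg_cluster_size, icc):
    """DE = 1 + (n - 1) * rho"""
    return 1 + (avg_cluster_size - 1) * icc

def effective_sample_size(cluster_size, n_clusters, icc):
    """ESS = (m * k) / DE"""
    de = design_effect(cluster_size, icc)
    return (cluster_size * n_clusters) / de

# Example: 4 GP practices, 25 patients each, ICC = 0.017
print(round(effective_sample_size(cluster_size=25, n_clusters=4, icc=0.017)))  # prints 71
```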

211
Q

What is the effective sample size of the following randomised control trial:

The trial includes four GP practices, each enrolling 25 patients

ICC (ρ) = 0.017

A

Effective sample size = (m*k)/(1+p(m-1))

ESS = (25*4)/(1+0.017(25-1))

ESS = 100/1.408

ESS = 71

While the study had a population of 100 participants, in reality, due to the similarity between clusters (GP practices), the effective sample size is 71.

212
Q

How does the analysis stage differ in a clustered study from a non-clustered study?

A

Cluster-robust standard errors are a form of standard error that account for the effects of clustering, generating larger values with subsequently wider confidence intervals and more conservative p values.

Regression models can be adapted to account for clustering, using either fixed effects models (where the cluster itself is included as a factor within a standard regression model) or random effects models (which account for the similarities between individuals within clusters in a multilevel model).

213
Q

What is number needed to treat (NNT)?

A

The number of patients that need to be treated in order for one to benefit.

It summarises the results of a clinical trial in a single figure that is easily understood by both doctors and patients.

Similar measures are reported in vaccination and screening studies (number needed to vaccinate/screen) and their values are derived in a similar fashion to the NNT.

214
Q

How do you calculate the number needed to treat (NNT)?

A

The number needed to treat can be calculated using a 2x2 contingency table. There are also methods available for deriving NNT using odds ratios and relative risk reduction (however these are not included in the purple book)
                 Outcome    No Outcome   Total
Exposure         a          b            a+b
No Exposure      c          d            c+d
Total            a+c        b+d

Absolute risk reduction (ARR) = Control event rate (CER) - Experimental event rate (EER)

CER = c/(c+d)
EER = a/(a+b)

NNT = 1/ARR

When reporting NNTs, the control treatment and intervention should be outlined, as well as the dose and duration of the intervention, the outcome, and the period over which observations were made as NNT is time specific.
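
A minimal sketch of the calculation from hypothetical 2x2 counts (chosen so that the event rates mirror the worked example in the next card); the rounding-up step reflects the usual convention of reporting NNT as a whole number of patients:

```python
# Minimal sketch of an NNT calculation from 2x2 counts (hypothetical figures).
import math

a, b = 34, 66   # treatment (exposure) group: outcome / no outcome
c, d = 40, 60   # control group: outcome / no outcome

cer = c / (c + d)        # control event rate
eer = a / (a + b)        # experimental event rate
arr = cer - eer          # absolute risk reduction

nnt = math.ceil(1 / arr) # conventionally rounded up to a whole number of patients
print(f"CER={cer:.2f}, EER={eer:.2f}, ARR={arr:.2f}, NNT={nnt}")
# CER=0.40, EER=0.34, ARR=0.06, NNT=17
```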

215
Q

What is the number needed to treat in the following example?

A study looking into the effects of carvedilol lasted 58 months. All-cause mortality was 34% in the treatment group and 40% in the control group.

A

NNT = 1/Absolute risk reduction (ARR)

ARR = 40% - 34% = 0.40 - 0.34 = 0.06

NNT = 1/0.06

NNT = 16.7, which is rounded up to 17

216
Q

How do you interpret the number needed to treat?

A

The higher the NNT, the less effective the treatment, because more people need to receive the treatment for one person to benefit.

The ideal NNT would be 1, where all the patients in the treatment group have benefitted, but no one has in the control arm.

If the drug or intervention is harmful the NNT will be negative. This is sometimes referred to as the ‘Number Needed to Harm’ (NNH). This can also be used to describe adverse effects, for example as a result of the treatment under study.

NNT needs to be interpreted in light of the clinical context. For example, an NNT of between 2 and 5 would normally indicate an effective therapy, such as a pain killer for acute pain. On the other hand, an NNT of 1 might be seen when treating a sensitive bacterial infection with antibiotics, whilst an NNT of 40+ might still be beneficial in other situations where the clinical endpoint is severe, such as using aspirin to prevent a heart attack.

It is also important to distinguish between treatments and preventative (prophylactic) measures. In trials of prophylactic treatments, fewer events will occur in the treatment arm than in the control group, so (Pa – Pc), the difference in event rates between the active and control arms, and hence the NNT, will be negative. The calculated NNT value, without the sign, can still be presented. Alternatively, in the case of preventive measures, the denominator of the formula can be rearranged to provide an NNT with a positive sign, i.e. 1/(Pc – Pa).

217
Q

What are the advantages of the Number needed to treat statistic?

A

Easy to interpret
Succinctly summarises the results of a trial
Useful to inform decision-making about individual patients and treatment options
Easy to calculate

218
Q

What are the disadvantages of the Number needed to treat statistic?

A

Can be tricky when generalising to populations and when comparing NNTs between studies.
Only based on the most probable value in a normally distributed population (does not take into account an individual patient’s risk).
Clinical significance is subject to interpretation (an NNT of 100 might be seen by some doctors as a worthwhile benefit, whereas others will consider it only moderate).

219
Q

What are Time-trend studies?

A

A form of longitudinal ecological study that provides a dynamic view of a population’s health status through the routine collection of population-level data.

Observations are recorded for each group at equal time intervals, for example monthly. Examples of measurements include the prevalence of disease, levels of pollution, or mean temperature in a region.

This allows for the examination of trends and changes as well as finding hypotheses on the reasons for changes.

220
Q

What is the use of time trend studies?

A

Patterns of change in an indicator over time – for example whether usage of a service has increased or decreased over time, and if it has, how quickly or slowly the increase or decrease has occurred

Comparing one time period to another time period – for example, evaluating the impact of a smoking cessation programme by comparing smoking rates before and after the event. This is known as an interrupted time series design.

Comparing one geographical area or population to another – for example, comparing changes in rates of cardiovascular deaths between the UK and India.

Making future projections – for example to aid the planning of healthcare services by estimating likely resource requirements

221
Q

What analytical steps are done in time-trend studies?

A

Plot the observations of interest by year (or some other time period deemed appropriate).
Examine the observations in tabular form.
Time series analysis

Other more detailed approaches (detail not required):
Regression analysis (if the trend can be assumed to be linear)
Mann-Kendall test (a non-parametric method which can be used for non-linear trends)

222
Q

What is time series analysis?

A

A particular collection of specialised regression methods that illustrate trends in time series data. These incorporate information from past observations and past errors in those observations into the estimation of predicted values.

There are three types of modelling used to analyse time series data, together they form the ARIMA models:
Autoregressive (AR) models (using previous observations to predict future observations)
Integrated (I) models (adjusting for underlying trends over time)
Moving average (MA) models (using previous prediction errors to predict future observations)
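
As a rough illustration, and assuming the Python statsmodels library is available (its statsmodels.tsa.arima.model.ARIMA class), an ARIMA(1,1,1) model could be fitted to a short hypothetical annual series along these lines:

```python
# Hedged sketch: fitting a simple ARIMA model to a hypothetical annual series,
# assuming the statsmodels library is installed.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

counts = np.array([120, 132, 129, 141, 150, 148, 161, 170, 168, 181], dtype=float)

# order=(p, d, q): p autoregressive terms, d differencing steps, q moving-average terms
model = ARIMA(counts, order=(1, 1, 1))
result = model.fit()

print(result.forecast(steps=3))   # projected values for the next three time points
```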

223
Q

What should be included in the presentation of time-series data?

A

Graphical plots displaying the observed data over time
Any statistical methods used to transform the data
Average percent change
Moving averages
An interpretation of the trends seen

224
Q

What are moving averages and how are they calculated?

A

Moving averages (or rolling averages) provide long-term trends whilst smoothing out any short-term fluctuations.

To calculate a moving average, each value is averaged with its neighbouring values across the chosen window of years.

For example, for a 3-yearly rolling average, you take the year in question together with the year before and the year after, and average these three values. This is then repeated across the whole dataset.

When using moving averages to smooth data, be careful not to average too many years’ worth of data for each calculation (e.g. using 10-year moving averages), as you risk over-smoothing the line and losing potentially important trends.

These can then be tabulated and presented. Plotting a moving average should give smoother results than the original dataset.
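
A minimal sketch of a 3-year centred moving average, using the figures from the worked example in the next card:

```python
# Minimal sketch of a 3-year centred moving (rolling) average,
# using the cost figures from the worked example in the next card.

years = [2010, 2011, 2012, 2013, 2014, 2015]
costs = [80, 100, 120, 101, 94, 111]

window = 3
half = window // 2
for i, year in enumerate(years):
    if i < half or i >= len(costs) - half:
        print(f"{year}: cannot be calculated")   # no full window at the ends
    else:
        avg = sum(costs[i - half:i + half + 1]) / window
        print(f"{year}: {avg:.0f}")
```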

225
Q

What is the rolling average in the following example?

Year 2010, Cost 80
Year 2011, Cost 100
Year 2012, Cost 120
Year 2013, Cost 101
Year 2014, Cost 94
Year 2015, Cost 111

A

2010 = Cannot be calculated
2011 = (80+100+120)/3 = 100
2012 = (100+120+101)/3 = 107
2013 = (120+101+94)/3 = 105
2014 = (101+94+111)/3 = 102
2015 = Cannot be calculated

226
Q

What are the risks of using time-trend data?

A

Data on exposure and outcome may be collected in different ways for different populations
Migration of populations between any groups during the study period may dilute any difference between the groups
Even within a single population, there may be underlying changes, such as in age structure, which affect the outcome
Seasonal variation can result in fluctuations which affect the outcome trend (although this can be accounted for during analysis)
Routine data sources may have been collected for other purposes
Ecological studies do not allow us to answer questions about individual risks

227
Q

What is a sample?

A

A subset of the total population which researchers use to draw conclusions about the larger group.

228
Q

What are the advantages and disadvantages of a smaller sample?

A

Advantages:
Lower cost
Lower workload
Easier to get high-quality information.

Disadvantages:
May not have a large enough sample size to detect a true association

229
Q

What are the methods of sampling?

A

Probability Sampling Methods:
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Clustered Sampling

Non-Probability Sampling Methods:
Convenience Sampling
Quota Sampling
Judgement Sampling
Snowball Sampling
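
A minimal sketch of three of the probability sampling methods listed above (simple random, systematic and stratified), drawn from a hypothetical numbered sampling frame; the frame, strata and sample sizes are all illustrative assumptions:

```python
# Minimal sketch of three probability sampling methods on a hypothetical
# sampling frame of 1000 numbered individuals.
import random

random.seed(42)
frame = list(range(1, 1001))          # the sampling frame
n = 50                                # desired sample size

# Simple random sampling: every individual has an equal chance of selection
simple_random = random.sample(frame, n)

# Systematic sampling: every (x/n)th individual after a random start
interval = len(frame) // n
start = random.randrange(interval)
systematic = frame[start::interval][:n]

# Stratified sampling: sample separately within strata (two hypothetical strata here)
strata = {"men": frame[:600], "women": frame[600:]}
stratified = {name: random.sample(members, n // 2) for name, members in strata.items()}

print(len(simple_random), len(systematic), sum(len(s) for s in stratified.values()))
```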

230
Q

What is Simple Random Sampling and what are its advantages and disadvantages?

A

A type of Probability Sampling where each individual is chosen entirely by chance and each member of the population has an equal chance, or probability, of being selected. One way of obtaining a random sample is to give each individual in a population a number, and then use a table of random numbers to decide which individuals to include.

Advantages:
Simple

Disadvantages:
You may not select enough individuals with your characteristic of interest, especially if that characteristic is uncommon.
Difficult to define a complete sampling frame (set from which you draw your sample)
Inconvenient to contact participants

231
Q

What is Systematic Sampling and what are its advantages and disadvantages?

A

A type of probability sampling where individuals are selected at regular intervals from the sampling frame. The intervals are chosen to ensure an adequate sample size. If you need a sample size n from a population of size x, you should select every x/nth individual for the sample.

Advantages:
More convenient than simple random sampling
Easy to administer

Disadvantages:
May lead to bias if there is an underlying pattern to the sampling frame

232
Q

What is Stratified Sampling and what are its advantages and disadvantages?

A

A type of probability sampling where the population is divided into subgroups (or strata) who all share a similar characteristic. The study sample is then obtained by taking equal sample sizes from each stratum.

You may choose non-equal sample sizes from each stratum. For example, if there are three hospitals with different numbers of staff, then you may choose the sample numbers from each hospital proportionally to ensure a more realistic estimation of the health outcomes of staff.

It is used when you expect the measurement of interest to vary between the different subgroups, and we want to ensure representation from all the subgroups. For example, in a study of stroke outcomes, we may stratify the population by sex, to ensure equal representation of men and women. If this is done it needs to be accounted for in the analysis stage.

Advantages:
Improves the accuracy and representativeness of the results by reducing sampling bias

Disadvantages:
Requires knowledge of the appropriate characteristics of the sampling frame
Can be difficult to decide which characteristic(s) to stratify by.
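
A minimal Python sketch of proportional stratified sampling (not part of the original card; the hospitals, staff numbers and sample size are hypothetical):

# Sample from each stratum in proportion to its size.
import random

strata = {  # hospital -> list of staff identifiers
    "hospital_A": [f"A_{i}" for i in range(600)],
    "hospital_B": [f"B_{i}" for i in range(300)],
    "hospital_C": [f"C_{i}" for i in range(100)],
}
total = sum(len(staff) for staff in strata.values())
n = 100     # overall sample size

sample = []
for name, staff in strata.items():
    n_stratum = round(n * len(staff) / total)   # allocation proportional to stratum size
    sample.extend(random.sample(staff, n_stratum))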

233
Q

What is Clustered Sampling and what are its advantages and disadvantages?

A

A type of probability sampling where subgroups of the population, rather than individuals, are used as the sampling unit; these subgroups are called clusters. Clusters are then randomly selected.

Clusters are usually already defined, for example, individual GP practices or towns could be identified as clusters.

In single-stage cluster sampling, all members of the chosen clusters are then included in the study.

In two-stage cluster sampling, a selection of individuals from each cluster is then randomly selected for inclusion.

Clustering should be taken into account in the analysis.

Advantages:
Efficient

Disadvantages:
Increased risk of bias if the chosen clusters are not representative of the population
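
A minimal Python sketch of two-stage cluster sampling (not part of the original card; the GP practices and patient lists are hypothetical):

# Stage 1: randomly select clusters; stage 2: randomly select individuals within each chosen cluster.
import random

clusters = {f"practice_{p}": [f"patient_{p}_{i}" for i in range(200)] for p in range(1, 41)}

chosen_practices = random.sample(list(clusters), k=5)
sample = []
for practice in chosen_practices:
    sample.extend(random.sample(clusters[practice], k=20))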

234
Q

What is Convenience Sampling and what are its advantages and disadvantages?

A

A type of non-probability sampling where participants are selected based on availability and willingness to take part.

Advantages:
Useful results can be obtained easily

Disadvantages:
Risk of volunteer bias
Sample may not be representative of other characteristics, such as age or sex.

235
Q

What is Quota Sampling and what are its advantages and disadvantages?

A

A type of non-probability sampling where interviewers are given a quota of subjects of a specified type to attempt to recruit. For example, an interviewer might be told to go out and select 20 adult men, 20 adult women, 10 teenage girls and 10 teenage boys. Ideally, quotas would proportionally represent the characteristics of the underlying population.

Advantages:
Straightforward

Disadvantages:
Risk of volunteer bias
The chosen sample may not be representative of other characteristics that weren’t considered

236
Q

What is Judgement Sampling (aka purposive sampling) and what are its advantages and disadvantages?

A

Also called selective, subjective or purposive sampling. A type of non-probability sampling that relies on the judgement of the researcher when choosing who to ask to participate. Researchers may thus implicitly choose a “representative” sample to suit their needs, or may specifically approach individuals with certain characteristics.

Judgement sampling has the advantage of being time- and cost-effective to perform whilst resulting in a range of responses (particularly useful in qualitative research). However, in addition to volunteer bias, it is prone to errors of judgement by the researcher, and the findings, whilst potentially broad, will not necessarily be representative.

Advantages:
Time-effective
Cost-effective
Useful for qualitative research
Can result in a wide range of responses

Disadvantages:
Risk of volunteer bias
Prone to errors of judgement by the researcher
Findings may not be representative of the underlying population

237
Q

What is Snowball Sampling and what are its advantages and disadvantages?

A

A type of non-probability sampling where existing subjects are asked to nominate further subjects known to them, so the sample increases in size like a rolling snowball.

Advantages:
Useful for hard-to-reach groups
Good when the sampling frame is difficult to identify

Disadvantages:
Risk of volunteer bias
Risk of selection bias as acquaintances of previous participants are more likely to be similar to them.

238
Q

What are the potential sources of bias when sampling a population?

A

Deviating from pre-agreed sampling rules
Omitting people in hard-to-reach groups
Replacing selected individuals
Low response rates
Using out-of-date lists as the sampling frame

239
Q

What is a survey?

A

A study design that collects the same data on each case in the sample, producing a standard set of data for each subject that can be analysed statistically to look for patterns and relationships between the variables assessed

240
Q

What is the role of surveys?

A

A large, representative survey sample can produce useful, generalisable data, which can in turn generate hypotheses for further research.

Surveys cannot answer questions about causation.

241
Q

What are the steps in designing and conducting a survey?

A
  1. Establish the goals of the survey, i.e. what you want to learn
  2. Determine your sample, including who is in your target population and how many people you will interview
  3. Choose the interview methodology:
    Personal face-to-face interviews
    Telephone surveys
    Mail surveys
    Internet surveys
  4. Design your survey
  5. Pilot the questionnaire with the target group
  6. Conduct the interviews and enter the data
  7. Analyse the data and write the report

242
Q

What is the purpose of piloting a survey before releasing it?

A

Reveals unanticipated problems with layout, question-wording, instructions, etc.

Tests whether the questionnaire can be administered in a reasonable amount of time.

Helps to rephrase or re-structure questions if the range of responses is inadequate.

Determines whether the questionnaire is culturally acceptable to study participants

Determines whether it generates reliable and consistent answers.

243
Q

What documentation may be required for a survey?

A

The questionnaire (or document used to record responses).

Other possible documents:
Manuals for both the interviewers and their supervisors
Documentation for training
Guidelines for sampling

244
Q

How do you design a good survey?

A

KISS - Keep it short and simple

Ensure that it measures what it claims to measure and is valid and reliable.

Overall Design:
Language should be clear and simple, with short sentences.
Abbreviations and jargon should be avoided
Language should be appropriate for the target audience
Professional production methods will also convey the impression that the questionnaire is important
Start with non-threatening, interesting items, and include the most important questions in the first half
Questions should be grouped into coherent categories
It should be easy to navigate
There should be suitable space for responses
The question format should be varied to prevent participants producing repetitive answers as their attention wanes (known as habituation).

245
Q

What are the two main types of questions that may be included in a survey, and what are the pros and cons of each?

A

There are two main types of questions:
Closed questions (‘Yes/No’ or a multiple choice format).
Open questions (Allow the respondent to answer freely).

Where appropriate, closed questions should also include an open-ended question following ‘Other’ or ‘Don’t Know’ responses to overcome the bias of unexpected answers not being included.

Closed Questions Pros/Cons:
Provides a set of standard responses that enable researchers to produce aggregated data quickly.
May not include all potential responses (biased).

Open Questions Pros/Cons:
Richer responses
More complex analysis required

246
Q

How do you decide on the topics you will ask about in a survey?

A

May be done purely theoretically, for example, by covering the issues you think are important
Using a focus group or interviews to determine what is important to the target group
Identifying important variables via a literature search

247
Q

In what ways can you maximise the response rate to a survey?

A

Notifying participants in advance with a letter of introduction outlining the purpose of the study
Using a clear and simple layout
Using clear and concise questions which avoid the use of technical jargon, and long, leading or negative questions
Inclusion of a stamped addressed envelope if conducting a postal survey, or collection of questionnaires if feasible
Ensuring anonymity where possible, especially if the questionnaire includes sensitive items
Follow-up of non-responders by telephone or letter
Rewards for completing the questionnaire, such as a free gift or donation to charity

248
Q

What are the causes of participant inaccuracy in surveys?

A

Excess mental demands – for example, difficulty understanding the question, difficulty in recalling moods and events over time

Biases in answering the question – for example, social desirability (seeking to present oneself in the best light), recall bias, or end avoidance (respondents choosing not to give extreme answers on a continuous scale).

249
Q

What is disease prognosis?

A

A prediction of the course of a disease following its onset. It specifically refers to the possible outcomes of a disease (e.g. death, chance of recovery, recurrence) and the frequency with which these outcomes can be expected to occur.

250
Q

What are prognostic factors?

A

Characteristics of a particular patient that can be used to more accurately predict that patient’s eventual outcome.

Common examples include:
Demographic (e.g. age)
Behavioural (e.g. alcohol consumption, smoking)
Disease-specific (e.g. tumour stage)
Co-morbid (e.g. other conditions accompanying the disease in question)

Prognostic factors need not necessarily cause the outcomes, just be associated with them strongly enough to predict their development

251
Q

What is the difference between risk factors and prognostic factors?

A

Prognostic factors and risk factors occur at different stages on the disease spectrum: risk factors are present before the development of a disease, whereas prognostic factors may either have been present before the onset of the disease under investigation (e.g. sex, smoking behaviour) or have developed afterwards (e.g. tumour size, high white cell count).
Study patients are different – in prognostic studies, they have already developed the disease of interest
Risk and prognosis describe different outcomes – the onset of disease versus a range of disease consequences
Variables associated with an increased risk of developing a disease are not necessarily the same as those that indicate a worse prognosis or outcome.

252
Q

What is a prognostic study and what are their features?

A

A study where patients with a particular illness are identified, followed forward in time, and their outcomes measured. Conditions associated with the outcome are identified; these are known as prognostic factors. The best design for a prognostic study is a cohort study as it is unethical to randomise patients to different prognostic factors.

Features:
Begin at a defined point of time in the disease course, follow up patients for an adequate period of time, and measure all relevant outcomes.
The study population should include all those with a disease in a defined population, for example all those on a disease register
Patients should all be followed up from the same defined point in the disease course to ensure a precise estimate of prognosis
Patients must be followed up for long enough so that most important outcomes have occurred
Prognosis estimates should include all aspects of a disease that are important to patients, including pain and disability, not just death or recovery.

253
Q

How do you appraise a prognostic study?

A

Laupacis and colleagues provide a helpful guide to reviewing prognostic studies, which includes:

Was there a representative and well-defined sample of patients at a similar point in the course of the disease?
Was follow-up sufficiently long and complete?
Were objective and unbiased outcome criteria used?
How large is the likelihood of the outcome events occurring in a specified period of time?
Were the study patients similar to my own?
Are the results useful for reassuring or counselling patients?

254
Q

What are the advantages and disadvantages of prognostic studies?

A

Advantages:
Can facilitate clinical decision-making
Facilitate patient education and counselling
May enable subgroups of patients to be identified who are at particular risk of specific disease outcomes (improving future study designs and analysis of clinical trials through risk stratification).

Disadvantages:
There is a large variation in the quality of prognostic studies published
The results may not be generalisable to local settings

255
Q

What is the name of the set of ethical principles that guide medical professionals and researchers when performing medical research involving humans, and who made it?

A

The Declaration of Helsinki
The World Medical Association

256
Q

What are the elements of the Declaration of Helsinki?

A

Safeguarding research subjects
Informed consent
Minimising risk
Adhering to an approved research plan/protocol.

257
Q

What are the ethical principles you should consider when performing epidemiological studies?

A

The Declaration of Helsinki:
Safeguarding research subjects
Informed consent
Minimising risk
Adhering to an approved research plan/protocol

Do no harm
Conflicts of interest
Scientific misconduct
Suitable Publication

258
Q

What is informed consent (in relation to research)?

A

Informed consent is a process by which the risks, benefits, and expectations of a research project are disclosed to a participant in order for them to make an informed decision about whether to participate.

Informed consent includes three key components:
Information - Did they have adequate information regarding risks, burdens and benefits to make an informed choice?
Understanding - Did they understand enough to make a reasoned choice based on that information (capacity)?
Voluntariness - Was the agreement a voluntary decision on the part of a capable person, made without external pressure or coercion?

Where informed consent has been obtained, it must be clearly documented.

Extras:
Whilst some reimbursements such as travel costs may be reasonable, paying participants to take part may not be.
For some epidemiological studies, particularly case-control studies and historical cohort studies, non-disclosure of the full aims of the study may be permissible, because full disclosure of the study hypothesis could bias the investigation.

259
Q

What is the principle of “Do no harm”?

A

A key ethical principle of medicine and epidemiology is the moral obligation to cause no harm to participants (non-maleficence), whether physical or psychological.

Although the risk in an epidemiological investigation is usually minimal, one of the main risks is data breaches.

260
Q

How should data be stored during epidemiological studies?

A

Data should only be stored with personal identifiers if absolutely necessary
Identifiable information should never be stored on computers outside research establishments
Files containing personal identifiers (name, security numbers, addresses, telephone numbers, etc) should be stored in locked cabinets.

261
Q

What is a conflict of interest?

A

A conflict of interest is a situation in which a researcher has, or appears to have, a private or personal interest, for example, a financial investment, sufficient to influence the objective exercise of their professional judgement.

Researchers must disclose actual, apparent or potential conflicts of interest to their colleagues, the ethics committee and subsequently to a journal publishing their work. All sponsorship of research should also be publicly acknowledged.

262
Q

How does publishing of research relate to ethical principles in epidemiological studies?

A

Research results should be published in an appropriate journal without undue delay.
As a general rule, research findings should be subject to independent peer review prior to publication or submission to the media.
The non-publication of research with “negative” findings (results which fail to reject a study’s null hypothesis) is also seen as unethical.

263
Q

What is scientific misconduct?

A

Any large-scale study provides ample opportunity for the data to be manipulated. Scientific misconduct is defined as any interpretation of the data that is not reached in good faith and with the aim of objectivity.

External pressures to publish and to obtain research funding are strong risk factors for scientific misconduct.

264
Q

What is the Basic reproduction number (R0)?

A

The basic reproduction number (R0) is used to measure the transmission potential of a disease. It is the average number of secondary infections produced by a typical case of an infection in a population where everyone is susceptible.

For example, if the R0 for measles in a population is 15, then we would expect each new case of measles to produce 15 new secondary cases (assuming everyone around the case was susceptible).

R0 excludes new cases produced by the secondary cases.

265
Q

What factors affect the basic reproduction number (R0)?

A

The rate of contacts in the host population
The probability of infection being transmitted during contact
The duration of infectiousness.

266
Q

What is the minimum value of the basic reproduction number (R0) required for an epidemic?

A

Greater than 1 so that the number of cases is increasing

267
Q

What is effective reproductive number (R)?

A

In most cases, not all contacts will be susceptible to every infection. Some contacts will be immune, for example due to prior infection which has conferred life-long immunity, or as a result of previous immunisation.

Therefore, not all contacts will become infected and the average number of secondary cases per infectious case will be lower than the basic reproduction number.

In this case, the effective reproduction number (R) is used instead of the basic reproduction number (R0).

The effective reproductive number (R) is the average number of secondary cases per infectious case in a population made up of both susceptible and non-susceptible hosts.

The effective reproduction number can be estimated by the product of the basic reproductive number and the fraction of the host population that is susceptible (x). So:

R = R0x

For example, if R0 for influenza is 12 in a population where half of the population is immune, the effective reproductive number for influenza is 12 x 0.5 = 6. Under these circumstances, a single case of influenza would produce an average of 6 new secondary cases.

To successfully eliminate a disease from a population, R needs to be less than 1.

268
Q

How do you interpret effective reproductive number (R)?

A

R>1, the number of cases will increase, such as at the start of an epidemic

R=1, the disease is endemic

R<1, there will be a decline in the number of cases.

269
Q

How do you calculate effective reproductive number (R)?

A

The effective reproduction number can be estimated by the product of the basic reproductive number and the fraction of the host population that is susceptible (x).

R = R0*x

For example, if R0 for influenza is 12 in a population where half of the population is immune, the effective reproductive number for influenza is 12 x 0.5 = 6. Under these circumstances, a single case of influenza would produce an average of 6 new secondary cases.

270
Q

What is the effective reproductive number (R) in the examples below and how do you interpret this result?

R0 for influenza is 12 in a population where half of the population is immune.

A

R=R0*x

R=12*0.5=6

The effective reproductive number for influenza is 6. Under these circumstances, a single case of influenza would produce an average of 6 new secondary cases.

271
Q

What is herd immunity?

A

Herd immunity occurs when a significant proportion of the population (or the herd) have been vaccinated (or are immune by some other mechanism), resulting in protection for susceptible (e.g. unvaccinated) individuals.

The larger the number of people who are immune in a population, the lower the likelihood that a susceptible person will come into contact with the infection. It is more difficult for diseases to spread between individuals if large numbers are already immune as the chain of infection is broken.

272
Q

What is the herd immunity threshold?

A

The proportion of a population that needs to be immune in order for an infectious disease to become stable in that community.

It is also the proportion of immune people in a population required for R to be less than or equal to 1.

273
Q

How do you calculate herd immunity threshold?

A

HIT = (R0-1)/R0 or 1-(1/R0)
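
A minimal Python sketch combining this formula with the effective reproduction number from the earlier influenza example (R0 = 12, half the population immune); the rounding is only for display:

R0 = 12
susceptible_fraction = 0.5

R = R0 * susceptible_fraction   # effective reproduction number, R = R0 * x
HIT = 1 - 1 / R0                # herd immunity threshold, 1 - (1/R0)

print(R)              # 6.0 -> each case produces an average of 6 secondary cases
print(round(HIT, 2))  # 0.92 -> roughly 92% of the population must be immune when R0 = 12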

274
Q

What is an epidemic?

A

An increase in the frequency of occurrence of a disease in a population above its baseline, or expected level, in a given time period.

The number of cases and time period are often unspecified.

For certain diseases, the term is defined quantitatively and a threshold is selected above which the term ‘epidemic’ is applied. For example, the Royal College of General Practitioners (RCGP) has defined the baseline threshold for ‘normal seasonal activity of influenza’ as 30 to 200 GP consultations for influenza-like illness per week per 100,000 population. The epidemic threshold would be reached if the number of consultations surpassed 200 per week per 100,000.

275
Q

What is Critical Community Size (CCS)?

A

The total population size needed to sustain an outbreak once it has appeared.

276
Q

What is the Outbreak Threshold?

A

The number of infected individuals that are needed to ensure that an outbreak is unlikely to go extinct without intervention.

277
Q

What is an epidemic curve?

A

A graph that shows the distribution of new cases of an infectious disease over time, plotted by date (or time) of onset of illness.

The time interval used for plotting onset of illness is determined by the incubation period of the disease

278
Q

What are the uses of epidemic curves?

A

Determine the type of epidemic (continuous source, point source, propagated)
Determine the difference between the maximum and minimum incubation period
Estimate the likely time of exposure, and thus help focus investigation on a particular time period
Determine the incubation period in cases where the time of exposure is known
Identify outliers

279
Q

What is an index case?

A

The index case is the term given to the first recognised case, or cases, in an outbreak.

This index case may not actually be the primary case (the original case of the outbreak), which may in fact only be identified later in the investigation.

280
Q

What does the term primary case mean in relation to an epidemic or outbreak?

A

The original case of an outbreak is labelled as the primary case.

Secondary cases contract the infection from primary cases, tertiary cases contract theirs from secondary cases, and so on.

281
Q

What is the generation time?

A

The duration from the onset of infectiousness in the primary case to the onset of infectiousness in a secondary case (infected by the primary case).

282
Q

What is Exception Reporting in relation to disease outbreaks?

A

Infectious disease surveillance ensures that the frequency of certain diseases or symptoms is monitored. If there is an abrupt increase in the frequency of a particular disease, outside of predefined limits, it will be flagged as an “exception” and thus functions as an early indicator that further investigation is required.

283
Q

What is a significant cluster in relation to disease outbreaks?

A

A cluster, or significant cluster, is an aggregation of cases related in time or place that is suspected to be greater than the number expected (although the “expected” number may not be known).

The term can relate to both communicable and non-communicable diseases.

284
Q

How are significant clusters identified in relation to disease outbreaks?

A

Significant clusters can be identified using spot maps (where each case is represented on a map by a coloured dot), although such maps may show apparent “clusters” in areas that are densely populated (and thus would have a higher number of expected cases).

Alternatively, maps that colour areas in different shades depending on the rate of disease in each area can be used, although if the defined areas are too large it will mask real clusters.

285
Q

What is a systematic review?

A

A review of a clearly formulated question that uses systematic and explicit methods to identify, select, and critically appraise relevant research and to collect and analyse data from studies that are included in the review

286
Q

What are the steps in a systematic review?

A

Defining an appropriate clinical question
Searching the literature
Assessing the studies for eligibility, quality and findings
Combining the results to provide a ‘bottom line’
Placing the findings in context

287
Q

What are the possible data sources you should use in a systematic review?

A

Medline database
Cochrane controlled clinical trials register
Other medical and paramedical databases
Foreign language literature
Grey literature (academic theses, internal reports, non-peer reviewed journals, pharmaceutical industry files)
References (and references of eligible references, etc.) listed in primary sources
Other unpublished sources known to experts in the field (seek by personal communication)
Raw data from published trials (seek by personal communication)

288
Q

What are the advantages of a systematic review?

A

Provides a summary of multiple studies
Can improve understanding of inconsistencies
Limited bias in identifying and rejecting studies
Provide a more precise and reliable estimate of effect.
Can establish generalisability of multiple findings
Clinical and methodological heterogeneity can be identified and new hypotheses generated about specific subgroups
Useful for evidence-based decision making
Helps define the limits of what is known and unknown
Helps to formulate hypotheses for further investigation

289
Q

What are the disadvantages of a systematic review?

A

Results may not apply to an individual patient
A systematic review may be done badly
Inappropriate aggregation of studies that differ in terms of the intervention used or the patients included can lead to important effects being drowned out
Findings may not be in harmony with the findings from large-scale clinical trials

290
Q

What is a meta-analysis?

A

A statistical technique used to combine and summarise the results of several independent studies that addressed the same hypothesis or clinical question in the same way.

291
Q

What is the difference between a meta-analysis and a systematic review?

A

A systematic review is a study type that looks to synthesise multiple pieces of research, whereas a meta-analysis is a statistical method of synthesising the data.

A systematic review often involves performing a meta-analysis.

292
Q

What results should be published in a meta-analysis?

A

The numerical data to provide a single estimate of effect
Inclusion criteria
Sample size
Baseline patient characteristics
Withdrawal rate
Results of primary and secondary endpoints of all the studies included

293
Q

How are meta-analyses performed?

A

The overall effect of an intervention is calculated using weighted averages of the results from multiple trials.

The weighting given to individual studies is based on the inverse variance of the effect size, which itself is largely a function of the sample size. So larger studies tend to result in a smaller variance, and thus contribute more to the final meta-analysis than smaller studies with a larger variance.

There are two broad types of meta-analysis models used: fixed effects and random effects.

Fixed effects meta-analyses are used when each of the included studies is thought to be clinically and methodologically similar (i.e. they are relatively homogenous and are thus each measuring the same – or fixed – effect).

Random-effects meta-analyses are used where there is heterogeneity between included studies, and these are more conservative – giving wider confidence intervals for the final pooled estimate and larger p values.
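
A minimal Python sketch of the fixed-effect, inverse-variance weighted pooling described above (not part of the original card; the log odds ratios and standard errors are hypothetical):

# Pool study estimates by weighting each with the inverse of its variance.
import math

studies = [        # (log odds ratio, standard error) for three hypothetical trials
    (-0.35, 0.20),
    (-0.10, 0.15),
    (-0.25, 0.30),
]

weights = [1 / se ** 2 for _, se in studies]          # weight = 1 / variance
pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

ci_low, ci_high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(math.exp(pooled), math.exp(ci_low), math.exp(ci_high))  # pooled OR and its 95% CI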

294
Q

How are the results of a meta-analysis presented?

A

The results of a meta-analysis are plotted on a forest plot. These show the effect estimates (such as an odds ratio or relative risk) from each individual study as a shaded square, where the size of the square is proportional to its weighting, along with its confidence interval.

The pooled estimate is given at the bottom as a diamond, where the middle of the diamond represents the pooled effect size and the edges delineate the pooled confidence interval.
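
A rough matplotlib sketch of this layout (not part of the original card; the odds ratios, confidence intervals and weights are made up, and the pooled diamond is simplified to a diamond-shaped marker):

import matplotlib.pyplot as plt

labels  = ["Trial 1", "Trial 2", "Trial 3", "Pooled"]
ors     = [0.70, 0.90, 0.78, 0.82]
ci_low  = [0.47, 0.67, 0.43, 0.68]
ci_high = [1.04, 1.21, 1.41, 0.99]
weights = [30, 45, 25, 100]          # % weight; used to scale the marker size

fig, ax = plt.subplots()
ys = range(len(labels), 0, -1)       # plot the studies top-to-bottom
for y, or_, lo, hi, w, label in zip(ys, ors, ci_low, ci_high, weights, labels):
    marker = "D" if label == "Pooled" else "s"
    ax.plot([lo, hi], [y, y], color="black")                       # confidence interval
    ax.plot(or_, y, marker, markersize=6 + w / 10, color="black")  # square sized by weight
ax.axvline(1.0, linestyle="--", color="grey")                      # line of no effect
ax.set_yticks(list(ys))
ax.set_yticklabels(labels)
ax.set_xlabel("Odds ratio")
plt.show()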

295
Q

What are the advantages of meta-analysis?

A

Provide an objective appraisal of the available evidence
Provide a more precise estimate of a treatment effect
May explain heterogeneity between results

296
Q

What are the disadvantages of meta-analysis?

A

Poorly conducted meta-analyses may be biased due to the exclusion of relevant studies or inclusion of inadequate studies.
May be subject to publication bias
Bias may be introduced if all relevant studies are not included.
Study heterogeneity may limit the generalisability of the results of a meta-analysis.

297
Q

What is a bibliographic database?

A

A repository of bibliographic or publication records. It provides an index of journal articles from multiple journals, and includes citations, abstracts and often a link to the full text. Databases are available online, so they can be updated regularly and easily accessed

Example - Medline Database

298
Q

What is the Medline database?

A

Medline is perhaps the best known bibliographic database, and can be accessed free of charge via several online portals including PubMed. It is compiled by the National Library of Medicine of the United States and in 1997 was thought to have included around 30-40% of the 10 million biomedical articles that had been published.

299
Q

What is Embase?

A

Embase, published by Elsevier, is another biomedical database. Embase is more comprehensive on pharmacological literature and alternative therapies than Medline.

300
Q

Give examples of common bibliographical databases?

A

Medline

Embase

CINAHL (Cumulative Index of Nursing and Allied Health Literature) – indexes nursing and allied health journals

Cochrane Library – includes Cochrane reviews, and Cochrane’s central register of controlled trials (CENTRAL), as well as health technology assessments and economic evaluations.

Google Scholar – as well as journals and conferences papers, this includes books, dissertations, technical reports and patents

PsycINFO – indexes psychological, social and behavioural science articles from the 1880s onwards

Scopus – includes peer-reviewed journals in the scientific, technical, medical and social sciences

Web of Science – includes coverage of the sciences, social sciences, arts and humanities

301
Q

What are the limitations of electronic bibliographical databases?

A

Databases may not contain the most recent references
Search results from bibliographic databases depend on the search strategy used and the quality of the indexing.
Obtaining a comprehensive selection of references can involve searching several databases because their coverage varies and no single database accesses all available literature
Most databases only include published articles; it is necessary to search separately for grey literature
There is often a bias towards citations written in English

302
Q

What is grey literature?

A

Literature that is produced by governments, academics, business and industry but which is not controlled by commercial publishers. A “commercial publisher” has since been specified as one whose primary activity is publishing.

Grey literature has also been broadly defined to include everything except peer-reviewed books and journals accepted by Medline. It has not been published in a conventional way, and can be difficult to identify and obtain through the usual routes, and for this reason it is known as ‘grey literature.’

303
Q

What are the challenges in finding grey literature?

A

Comes from a wide range of material, including government publications, reports, statistical publications, newsletters, fact sheets, working papers, technical reports, conference proceedings, theses, policy documents, protocols and bibliographies.
Material is disseminated quickly, often in limited numbers, and seldom undergoes any formal publication process.
Basic information such as author, publication date or publishing body may not be easily discerned, making it difficult to locate and then cite documents.
Low print runs may also make it difficult to locate (less of an issue with internet publishing)
Government or organisational reports, for example, are seldom linked from websites indefinitely (“link rot”).
The copyright status of potentially usable grey literature may prevent digital archiving and access.

304
Q

Which organisations produce grey literature?

A

Government health agencies, such as the Centers for Disease Control and Prevention (CDC) and the National Institutes of Health (NIH) in the United States, and the UK Department of Health.
Universities and other research centres
International agencies, e.g. World Health Organization (WHO) and UNAIDS.
Non-profit organisations, e.g. the Nuffield Trust, an independent UK health charity
Health institutes, e.g. the National Institute for Health and Care Excellence (NICE)
Think tanks, e.g. The King’s Fund library database

305
Q

Which internet sources may help you to find grey literature?

A

OpenGrey – a database of grey literature in Europe
GreyLit – The Grey Literature Report database, which includes relevant literature in health and science policy, and public health
GreyNet – an independent organisation promoting access to grey literature, which provides links to sources of grey literature
Google Scholar – includes citations from the grey literature as well as those from academic publishers

306
Q

What is publication bias?

A

A bias where studies with positive results are more likely to be published. They are also more likely to be published rapidly, in English language journals, and be cited more by other authors

307
Q

Why does publication bias occur?

A

Scientists are less inclined to submit negative results for publication
The attitude amongst journal editors that positive results make better articles.

308
Q

How can the effects of publication bias be minimised in a systematic review?

A

Thorough literature searching, including the inclusion of the grey literature in systematic reviews.
Attempt to quantify how big a problem publication bias is in the field of your systematic review.

309
Q

How can the level of publication bias be ascertained?

A

Construct a funnel plot
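
A minimal matplotlib sketch of a funnel plot (not part of the original card; the effect sizes and standard errors are hypothetical):

# Scatter each study's effect estimate against its standard error; a symmetrical
# inverted funnel suggests little publication bias, while asymmetry suggests it may be present.
import matplotlib.pyplot as plt

log_odds_ratios = [-0.4, -0.2, -0.1, 0.0, 0.1, -0.3, -0.5, 0.2]
standard_errors = [0.30, 0.15, 0.10, 0.05, 0.20, 0.25, 0.35, 0.12]

fig, ax = plt.subplots()
ax.scatter(log_odds_ratios, standard_errors)
ax.invert_yaxis()                              # most precise studies (smallest SE) at the top
ax.axvline(0.0, linestyle="--", color="grey")  # reference line at no effect
ax.set_xlabel("Log odds ratio")
ax.set_ylabel("Standard error")
plt.show()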

310
Q

What is evidence based medicine?

A

The process of turning clinical problems into questions and then systematically locating, appraising, and using contemporaneous research findings as the basis for clinical decisions.

311
Q

Which type of research should be used in evidence-based medicine?

A

EBM is not restricted to randomised controlled trials and meta-analyses, but should involve appraising the best evidence available with which to answer clinical questions

312
Q

What are the steps in evidence-based medicine?

A

Formulating answerable clinical questions – this may relate to diagnosis, prognosis, treatment, iatrogenic harm, quality of care, or health economics
Systematic retrieval of the best evidence available.
Critically appraising the evidence (determining the validity and applicability).
Applying the evidence (directly in patient care, or in the development of protocols and guidelines).
Evaluating performance.

313
Q

What are the advantages of evidence-based medicine?

A

Allows clinicians and patients access to the most recent clinical knowledge
Can be learnt by people from different backgrounds and at any stage in their careers.
Has the potential to improve continuity and uniformity of care through the development of common approaches and guidelines
Can help providers make better use of limited resources by enabling them to evaluate clinical- and cost-effectiveness of treatments and services

314
Q

What are the disadvantages of evidence-based medicine?

A

Takes time to both learn and practice
Establishing the infrastructure for practicing EBM costs money, for example buying and maintaining suitable computer systems
Exposes gaps in the evidence. However, this can also be helpful in generating local and national research projects
Electronic databases are not comprehensive and are not always well indexed
Can be adversely affected by publication bias, or by a lack of evidence
Can be mistaken for a drive towards formulaic medicine. A clinician should listen to the patient, use their own clinical judgement, and be mindful of the best available evidence, so that the optimum management plan is identified for that patient.

315
Q

What is the hierarchy of evidence?

A

The hierarchy indicates the relative weight that can be attributed to a particular study design. Generally, the higher up a methodology is ranked, the more robust it is assumed to be.

At the top end lies the meta-analysis – synthesising the results of a number of similar trials to produce a result of higher statistical power.

At the other end of the spectrum lie individual case reports, thought to provide the weakest level of evidence.

316
Q

What is the generally accepted order of study types in the evidence hierarchy?

A

Systematic reviews and meta-analyses
Randomised controlled trials
Cohort studies
Case-control studies
Cross-sectional surveys
Case series and case reports

317
Q

What are the disadvantages of the hierarchy of evidence?

A

Techniques lower down the ranking are not always superfluous. For example, the link between smoking and lung cancer was initially discovered via case-control studies.
Although randomised controlled trials (RCTs) are considered more robust, it would in many cases be unethical to perform an RCT.
The hierarchy is also not absolute. A well-conducted observational study may provide more compelling evidence about a treatment than a poorly conducted RCT.
The hierarchy focuses largely on quantitative methodologies.

318
Q

What is the alternative to the evidence hierarchy?

A

The GRADE system

319
Q

What is the GRADE system for analysing evidence and how does it work?

A

A method of assessing the importance of a piece of research for evidence-based medicine.

It classifies the quality of evidence not only based on the study design, but also the potential limitations and, conversely, the positive effects found.

For example, an observational study would start off as being defined as low-quality evidence. However, it can be downgraded to “very low” quality if there are clear limitations in the study design, or upgraded to “moderate” or “high” quality if it shows a large magnitude of effect or a dose-response gradient.

320
Q

What is Cochrane?

A

An international and independent non-profit organisation established in 1993 aimed at providing up-to-date, accurate information about the effects of healthcare available worldwide. Cochrane produces and disseminates systematic reviews of healthcare interventions and diagnostic tests, and promotes the search for evidence in the form of clinical trials and other interventional studies

321
Q

What is the organisational structure of the Cochrane organisation?

A

There are 14 Cochrane Centres and 19 Regional Branches worldwide.

Those who prepare the reviews are mostly healthcare professionals and researchers who volunteer to work in one of the 53 Cochrane Review Groups, each of which covers a different subject area within healthcare, such as vascular disease (Cochrane Vascular) and inflammatory bowel disease (Cochrane IBD).

Each Review Group has an editorial team overseeing the preparation and maintenance of the reviews, as well as application of the rigorous quality standards for which Cochrane Reviews have become known.

Cochrane also has 16 Methods Groups, each covering different aspects of review methodology. These include the Adverse Effects Methods Group and the Bias Methods Group.

Eleven Cochrane Field Groups are responsible for the dissemination of Cochrane Reviews, and these include contributions from clinicians, academics, consumers and students.

One of these Field Groups – the Cochrane Consumer Network – helps incorporate patient perspectives into the review process and provides jargon-free “plain language summaries” of Cochrane Reviews.

322
Q

What are Cochrane Reviews?

A

The primary output of Cochrane is the Cochrane Database of Systematic Reviews, which is contained within the Cochrane Library.

Cochrane Reviews are systematic assessments of evidence of the effects of healthcare interventions and diagnostic tests, intended to help people to make informed decisions about healthcare based on the best available research evidence.

The reviews seek to investigate the effects of interventions for prevention, treatment and rehabilitation in healthcare settings.

Most are based on randomised controlled trials, but other types of evidence may also be taken into account, if appropriate.

323
Q

What is the Cochrane library?

A

The Cochrane Library consists of six healthcare databases, containing Cochrane reviews, systematic reviews and other clinical trials.

A further 7th database also provides information about Cochrane groups.

The databases within the Cochrane Library include:
Cochrane Database of Systematic Reviews (CDSR)
The Cochrane Central Register of Controlled Trials (CENTRAL)
Health Technology Assessment Database (HTA)
NHS Economic Evaluation Database (EED)
Database of Abstracts of Reviews of Effectiveness (DARE)
Cochrane Methodology Register (CMR)
About The Cochrane Collaboration (the seventh database, which provides information about Cochrane groups)

324
Q

What challenges does Cochrane face?

A

Relies on volunteers to produce the reviews
It is difficult to ensure a uniformly high standard of work
Difficulty in disseminating the results of reviews and convincing healthcare professionals
The Cochrane Library is not freely available, and the cost limits access.

325
Q

What is Genetic epidemiology?

A

Epidemiological studies focussing on familial, and in particular genetic, determinants of disease and the joint effects of genetic and non-genetic determinants.

326
Q

What is Association analyses in genetic epidemiology?

A

Studies that seek to prove that across a study population, a particular genetic exposure is consistently associated with an observed disease.

327
Q

What are twin studies?

A

Twin studies were one of the earliest forms of genetic study; they involve comparing both monozygotic (identical) twins and dizygotic (non-identical) twins to estimate the relative contributions of genes and the environment to specific traits.

Monozygotic twins share the same genetic material whereas dizygotic twins, like other siblings, have only 50% of their genes in common.

If identical twins are more likely to develop an outcome of interest than non-identical twins, it suggests that genes contribute to the outcome.

Monozygotic twins serve as excellent subjects for controlled experiments because they share prenatal environments and those reared together also share common family, social, and cultural environments.

328
Q

What are the limitations of twin studies?

A

Have to use twins
Have the potential to over- or underestimate the role of genetics, because of the challenges of quantifying environmental influences.

329
Q

What are linkage studies?

A

Studies which aim to identify broad genomic regions that might contain a disease gene using the two concepts:
Linkage
Linkage disequilibrium

These two principles allow two major types of linkage analysis to occur:
Parametric linkage analysis
Model-free (non-parametric) linkage analysis

330
Q

What is linkage in genetics?

A

Two genetic loci are linked if they are transmitted together from parent to offspring more often than would be expected under independent inheritance.

The closer together two loci are on a chromosome, the less likely it is that a recombination event (crossing over during meiosis) will occur between them; thus the closer they are, the more “linked” they are.

331
Q

What is linkage disequilibrium in genetics?

A

Two genetic loci are in linkage disequilibrium if, across the population as a whole, they are found together on the same haplotype (the group of genes inherited from a single parent) more often than expected.

In general, two loci in linkage disequilibrium will also be linked, but the reverse is not necessarily true.

332
Q

What is recombination?

A

The rearrangement of genetic material produced by the crossing over and rejoining of chromosomal segments during Meiosis.

333
Q

What is Parametric linkage analysis?

A

The analysis of how genetic loci co-segregate in pedigrees or family units.

The main output of these studies is the recombination fraction (the probability of recombination between two loci at meiosis). By genotyping genetic markers and studying their segregation through pedigrees, it is possible to infer their position relative to each other on the genome. This can then be used to map genetic markers or disease loci.

Linkage is usually reported using a LOD (logarithm of the odds) score, which takes into account the recombination fraction and chromosomal positions. Large positive LOD scores are evidence for linkage and negative scores are evidence against it.

334
Q

What is Model-free (non-parametric) linkage analysis?

A

Linkage analysis used for multifactorial diseases, where several genes (and environmental factors) might contribute to disease risk and there is no disease model available.

These use various methods to test whether identical by descent (IBD) sharing at a locus is greater than expected under the null hypothesis of no linkage.

The rationale is that, between affected relatives, excess sharing of haplotypes that are identical by descent (IBD) in the region of a disease-causing gene would be expected, irrespective of the mode of inheritance.

Linkage is usually reported using a LOD (logarithm of the odds) score, which takes into account the recombination fraction and chromosomal positions. Large positive LOD scores are evidence for linkage and negative scores are evidence against it.

335
Q

What are Genetic association studies?

A

Studies that aim to detect associations between one or more genetic polymorphisms and a trait, for example a disease. Association differs from linkage in that the same allele (or alleles) is associated with the trait in a similar manner across the whole population, while linkage allows different alleles to be associated with the trait in different families.

Familiar epidemiological study designs such as case-control or cohort designs are often used for genetic association studies and the data are analysed much the same way. Risk factors or exposures such as smoking are replaced by the presence or absence of a particular genetic polymorphism.

For example, using a case-control study design, where disease cases and controls are compared for the proportions of each which have a certain polymorphism under investigation.

In genome-wide association studies (GWAS), a large number (>300,000) of genetic variants are simultaneously compared between groups, using a hypothesis-free approach, to look for significant associations.

336
Q

Why might there be an association between a polymorphism and a trait in a population?

A

Direct association – the polymorphism has a causal role
Indirect association – the polymorphism has no causal role but is associated with a nearby causal variant
Confounded association – the association is due to some underlying stratification or admixture of the population, requiring further investigation

337
Q

What are Mendelian randomisation studies?

A

If a genetic variant has an effect on a modifiable risk factor, which itself alters disease risk, then it follows that the genetic variant should also be related to disease risk.

Where this is the case, the genetic variant can be used as a proxy for the risk factor, and there is thus no need to measure the risk factor.

Natural experiments can thus be conducted using an individual’s genetics to assign them to risk groups (e.g. those with, or without a specific genetic variant), and measuring the resulting outcome. This is known as Mendelian randomization.

The advantage of this approach is that, unlike modifiable risk factors, an individual’s genetics are not affected by potential confounders – so gene variant-disease associations are more likely to be causative.

338
Q

How do you approach assessing the quality of a genetic association study?

A

Questions to ask:
What are we hoping for from an association study?
How good a candidate is the gene in question?
How strong is the case for the variants that have been typed?
How appropriate are the samples typed?
Is the study size large enough?
How good is the genotyping?
How appropriate is the analysis?
How appropriate is the interpretation?

339
Q

What are the common problems with genetic association studies?

A
340
Q

What is the difference between risk (cumulative incidence) and incidence (or rate/incidence rate)?

A

Risk (cumulative incidence) is the proportion of an initially disease-free population that develops the disease over a specified time period:

Risk = number of new cases / number of people at risk at the start of the period

The incidence rate takes into account the time each person spends at risk:

Incidence rate = number of new cases / total person-time at risk

Risk is a dimensionless proportion that must always be quoted with its time period, whereas the incidence rate is expressed per unit of person-time (e.g. per 1,000 person-years).