5 Epidemiology and biostatistics Flashcards

1
Q

Explain what a case-control study is, and what its goal is

A

Case–control studies are analytical epidemiological studies that aim to investigate the association between a disease and its suspected causes. They are usually cross-sectional or retrospective in nature.

The first step is to identify a “case” accurately.

In case–control studies, people with an outcome (an infection or a disease) are identified and their medical and social history is examined retrospectively in an attempt to identify exposure to potential infectious agents or risk factors. A matched control group free from the disease or infection is also identified and data are collected from them in an identical fashion. The two sets of data are compared to determine whether the disease group was exposed in significantly higher numbers to the suspected risk factors than the control group.

2
Q

What are disadvantages of case-control studies?

A

◆ It is not possible to calculate the true incidence and relative risk. The results should be expressed as odds ratios.
◆ The study design inevitably means that data are collected retrospectively, so the information may not be available or may be of poor quality.
◆ If the diagnosis is rare, a sufficiently large number of study subjects is needed in order to detect an association.

3
Q

What are advantages of case-control studies?

A

◆ These studies are relatively quick and cheap to perform.
◆ Case–control studies are useful for investigating rare diseases.
◆ Case–control studies can be used to evaluate interventions.

4
Q

How does a case-control study differ from a cohort study?

A

In a case–control study, subjects are enrolled based on whether or not they have a disease.

In a cohort study, subjects are included in the study based on their exposure and are then followed for the development of disease.

Case–control study is the method most commonly used to investigate outbreaks because it is relatively inexpensive to conduct, is usually of short duration, and requires relatively few study subjects.

5
Q

Explain what a cohort study is, and what its goal is

A

Cohort studies are observational studies usually carried out over a number of years, and designed to investigate the aetiology of diseases or outcomes. The aim of such studies is to investigate the link between a hypothetical cause and a defined outcome.

Prior to undertaking a cohort study, investigators should seek statistical advice regarding the number of subjects needed in each group. Cohort studies originate with a hypothesis that the outcome (an infection or a disease) is caused by exposure to an infectious agent or event (risk factor).

Subjects exposed to the suspected risk factor (cases) and similar groups that have not been exposed (controls) are identified (Figure 5.2). Often, a complete population sample (cohort) is followed prospectively over a period of time (usually a number of years) to identify the incidence of the outcome in both groups. These results are then analysed to determine whether the group exposed to the risk factor has a higher incidence of disease than the group not exposed.

Cohort studies are usually prospective, but they can be performed retrospectively if there is a clearly documented point of first exposure.

6
Q

What are advantages of cohort studies?

A

◆ The prospective design of the ‘standard’ cohort study provides an opportunity for accurate data collection that is not normally available from retrospective studies.
◆ The incidence, relative risk, and attributable risk can be calculated from the results.
◆ An estimate of the time from exposure to disease development is possible.
◆ Occasionally, cohort studies can be performed retrospectively and can thus be cheaper and less time-consuming.

7
Q

What are disadvantages of cohort studies?

A

◆ Time-consuming and costly (unless the outcome has a high incidence and short latent period).
◆ Long studies inevitably increase the drop-out rates.
◆ Cohort studies are not useful for investigating rare diseases, as large numbers of subjects are required.

8
Q

What are cross-sectional (prevalence) surveys?

A

Cross-sectional studies are descriptive studies in which a sample population’s status is determined for the presence or absence of exposure and disease at the same time. These surveys take a ‘snapshot’ of the population and thus detect the presence of disease at a point in time (prevalence) as opposed to the frequency of onset of the disease (incidence).

The cases in a specified population can either be calculated during a given period of time (period prevalence) or at a given point in time (point prevalence).

9
Q

What is an incidence rate?

A

Number of new cases in a time period, divided by the population at risk, multiplied by a constant.

The constant is chosen for convenience, e.g. 1000 to give a rate per 1000 device-days.
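
As a rough illustration of this formula, here is a minimal Python sketch. The function name and the figures (6 new infections over 2400 device-days) are made up for the example and do not come from the cards.

```python
def incidence_rate(new_cases, population_at_risk, constant=1000):
    """New cases in the period, divided by the population at risk, times a constant."""
    return new_cases / population_at_risk * constant

# Hypothetical figures: 6 new infections observed over 2400 device-days
rate = incidence_rate(6, 2400)
print(f"{rate:.1f} new cases per 1000 device-days")
```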

10
Q

What is a prevalence rate?

A

A prevalence rate is used to describe the current status of active disease. It takes the number of active (new and old) cases of disease at any one time as the numerator and the exposed population at that point as the denominator. The cases in a specified population can be counted either during a given period of time (period prevalence) or at a given point in time (point prevalence). It is sometimes helpful to review the incidence and prevalence simultaneously.

Number of current cases at a point in time or in a time period, divided by the population at risk, multiplied by a constant.

The constant is chosen for convenience, e.g. 1000 to give a rate per 1000 device-days.
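
The difference between point and period prevalence can be sketched in Python as follows. The ward size, the case counts, and the choice of 100 as the constant (giving a percentage) are hypothetical values chosen only for illustration.

```python
def prevalence_rate(current_cases, population_at_risk, constant=100):
    """Current (new and old) cases divided by the population at risk, times a constant."""
    return current_cases / population_at_risk * constant

# Hypothetical ward of 200 patients
point_prevalence = prevalence_rate(8, 200)    # cases active on the survey day
period_prevalence = prevalence_rate(14, 200)  # cases active at any time during the month

print(f"point prevalence:  {point_prevalence:.1f} per 100 patients")
print(f"period prevalence: {period_prevalence:.1f} per 100 patients")
```

Period prevalence is never lower than point prevalence for the same population, because it counts every case that was active at any moment in the period.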

11
Q

What is attack rate?

A

Attack rate is another type of incidence rate that is expressed as cases per 100 population (or as a percentage). It is used to describe the new and recurrent cases of disease that have been observed in a particular group during a limited time period in special circumstances, such as during an epidemic.

Attack rate = (number of new and recurrent cases that occur in the population in a specified time / population at risk) x 100
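
A minimal Python sketch of the attack rate calculation, assuming a made-up outbreak in which 30 of 120 people at an event became ill:

```python
def attack_rate(cases, population_at_risk):
    """New and recurrent cases in the period per 100 population at risk (a percentage)."""
    return cases / population_at_risk * 100

# Hypothetical outbreak: 30 of 120 attendees became ill
rate = attack_rate(30, 120)
print(f"attack rate: {rate:.0f}%")
```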

12
Q

Measures of association are used during outbreak investigations to evaluate the relationship between exposed and unexposed populations. These statistical measures can express the strength of association between a risk factor (exposure) and an outcome (disease).

What are ways to express risk?

A

relative risk

relative risk reduction

absolute risk reduction

odds ratio

For outbreak situations, relative risk or odds ratio are used
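
The cards do not define the two "reduction" measures, so the sketch below relies on their usual textbook definitions (an assumption, not something stated in the source): absolute risk reduction is the difference in risk between the groups, and relative risk reduction is that difference as a fraction of the risk in the comparison group. The risks of 20% and 15% are hypothetical.

```python
def absolute_risk_reduction(risk_control, risk_treated):
    """Difference in risk between the comparison group and the treated group."""
    return risk_control - risk_treated

def relative_risk_reduction(risk_control, risk_treated):
    """Risk difference expressed as a fraction of the comparison group's risk."""
    return (risk_control - risk_treated) / risk_control

# Hypothetical risks: 20% without the intervention, 15% with it
arr = absolute_risk_reduction(0.20, 0.15)
rrr = relative_risk_reduction(0.20, 0.15)
print(f"ARR = {arr:.2f}, RRR = {rrr:.2f}")
```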

13
Q

What is risk ratio?

A

The risk ratio is the ratio of the attack rate (or risk of disease) in the exposed population to the attack rate (or risk of disease) in the unexposed population.

If the value of the risk ratio (relative risk) is equal to 1, the risk is the same in the two groups and there is no evidence of association between the exposure and outcome.

If the risk ratio is greater than 1, the risk is higher for the exposed group and exposure may be associated with the outcome.

If the risk ratio is less than 1, the risk is lower for the exposed group and the exposure may possibly protect against the outcome.
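
The calculation and its interpretation can be sketched in Python. The function name and the outbreak figures (40 of 50 exposed people ill, 10 of 50 unexposed people ill) are invented for the example.

```python
def risk_ratio(exposed_ill, exposed_total, unexposed_ill, unexposed_total):
    """Attack rate in the exposed group divided by attack rate in the unexposed group."""
    attack_rate_exposed = exposed_ill / exposed_total
    attack_rate_unexposed = unexposed_ill / unexposed_total
    return attack_rate_exposed / attack_rate_unexposed

# Hypothetical outbreak: 40 of 50 who ate a dish fell ill, against 10 of 50 who did not
rr = risk_ratio(40, 50, 10, 50)
print(f"risk ratio = {rr:.1f}")  # a value above 1 suggests the exposure may be associated with illness
```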

14
Q

What is relative risk?

A

Relative risk provides an estimate of the chance that an exposed individual will develop an illness, complication, or response to therapy in comparison with a non-exposed individual.

15
Q

What is absolute risk?

A

The absolute risk is the risk in the exposed and the non-exposed groups as a whole, whereas the individual risk is the risk calculated according to the level of exposure. However, one should remember that these chances have been calculated from observations on large groups of patients, and the result for the group as a whole may not automatically apply to the patient who is presently sitting in front of you.

16
Q

What is an odds ratio?

A

The odds ratio is similar to the risk ratio except that the odds, instead of the risk (attack rates), are used in the calculation.

It is the ratio of the odds of having a risk factor if the disease is present to the odds of having the risk factor if the disease is absent.

If the odds ratio is equal to 1, the odds of disease are the same whether or not the exposure is present (i.e. there is no evidence of association between the exposure and disease).

If the odds ratio is greater than 1, the odds of disease are higher for the exposed group and the exposure is probably associated with the disease.
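
A minimal Python sketch of the odds ratio for a case–control study, using a hypothetical two-by-two table (30 exposed and 10 unexposed cases; 15 exposed and 25 unexposed controls). The function and argument names are illustrative only.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls

# Hypothetical table: 30 of 40 cases were exposed, 15 of 40 controls were exposed
result = odds_ratio(30, 10, 15, 25)
print(f"odds ratio = {result:.1f}")
```

This is the same quantity as the cross-product (30 x 25) / (10 x 15) of the two-by-two table.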

17
Q

What is the definition of bias?

A

Bias refers to errors in study design and execution, and to interpretation and implementation of its results, which systematically influence the eventual outcome for the patient. Bias occurs in both quantitative and qualitative research and it can occur at any stage from conception of a study through to marketing and implementation of its results. Bias can be deliberate or unintentional.

The perfect study is one that is both accurate and precise without bias. An accurate study may be imprecise but not biased. A biased study can be precise but still be inaccurate.

18
Q

What are different types of bias?

A

Selection bias - e.g. not randomising, or deliberately omitting a patient group

Information bias - data not collected in the same way in each group

Statistical bias - using the wrong analytical method

19
Q

What are confounders?

A

Confounders are factors extraneous to the research question that are determinants of the outcome of the study. If they are unevenly distributed between the groups they can influence the outcome.

A confounder need not be causal; it might be just a correlate of a causal factor. For example, age is associated with a host of disease processes but it is only a marker for underlying biological processes that are causally responsible for these diseases.

Similarly, the Broad Street water pump disabled by John Snow in Soho was not the cause of the cholera, just the conduit that delivered the causal agent.

20
Q

What are ways of dealing with confounders?

A

exclusion criteria
stratified sampling
using the correct analysis
randomisation of patients

21
Q

Measures of dispersion describe the distribution of values in a data set around the mean

What are common ways to describe dispersion?

A

range

deviation

variance

standard deviation

22
Q

What is the definition of each of these terms?

range

deviation

variance

standard deviation

A
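  • range - difference between the highest and lowest values in the data set
  • deviation - difference between an individual value and the mean value for the data set
  • variance - the average of the squared deviations from the mean; measures spread around the mean of a distribution
  • standard deviation - the square root of the variance; reflects the distribution of values around the mean, in the same units as the data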
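
These four measures can be computed with Python's standard `statistics` module, as in the sketch below. The data set is an arbitrary example, and the population forms (`pvariance`, `pstdev`) are an assumption; for a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` would be used instead.

```python
import statistics

values = [3, 4, 5, 5, 5, 6, 7]  # arbitrary example data

value_range = max(values) - min(values)    # highest value minus lowest value
mean = statistics.mean(values)
deviations = [v - mean for v in values]    # each value minus the mean
variance = statistics.pvariance(values)    # mean of the squared deviations
std_dev = statistics.pstdev(values)        # square root of the variance

print(value_range, deviations, round(variance, 3), round(std_dev, 3))
```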
23
Q

standard deviation - reflects distribution of values around the mean

How is this normally represented?

A

Bell-shaped curve

68% of results fall within one standard deviation of the mean

95.5% of results fall within two standard deviations of the mean

99.7% of results fall within three standard deviations of the mean
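
The proportions for a normal distribution can be checked numerically. The sketch below uses the standard identity that the fraction within k standard deviations of the mean is erf(k / √2); the function name is just for illustration. It gives roughly 68%, 95% and 99.7% for one, two and three standard deviations.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {fraction_within(k):.2%}")
```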
24
Q

What is the null hypothesis?

A

The investigator makes a hypothesis about the different data sets.

The null hypothesis predicts that the two sets of data are not different.
An arbitrary limit of 0.05 (5%) is set for the P value.

If the data give a P value <0.05, the null hypothesis is rejected and we can be more confident that there is a true difference between the data sets.
However, be aware that a difference at least this large would still arise by chance up to 5% of the time if the null hypothesis were true.

25
Q

Hypothesis testing

Errors can occur if there is bias or there are confounding factors

What are:

Type I (alpha) errors

Type II (beta) errors

A

Type I (alpha) errors - error occurs when an investigator states that there is an association when in fact there is no association, i.e. the investigator rejects a true null hypothesis.

Type II (beta) errors - error occurs when the investigator states that there is no association when in fact there is an association, i.e. the investigator fails to reject a null hypothesis that is actually false. Although these errors are not always avoidable, the likelihood of making a type II error can be minimized by using a larger sample size. By choosing the statistical cut-off level, the investigator decides before beginning the study what probability of committing a type I error can be accepted (usually 5%).
26
Q

What basic statistical test is used in outbreak situations?

A

The chi-square test is commonly used in outbreak investigations to evaluate the probability that observed differences between two populations, such as cases and controls, could have occurred by chance alone if an exposure is not truly associated with disease.

It is calculated by using two-by-two contingency tables (Table 5.1). Because it takes a lot of patience to calculate chi-squares by hand, most investigators opt to use a computer with a statistical software package. The chi-square test can be used if the number of subjects in a study is approximately 30 or more.
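
For readers who want to see the arithmetic, here is a minimal Python sketch of the Pearson chi-square statistic for a two-by-two table, without continuity correction. The table values are hypothetical, and the P value uses the fact that, for one degree of freedom, the upper tail of the chi-square distribution equals erfc(√(χ²/2)). In practice a statistical package would normally be used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical table: rows are cases and controls, columns are exposed and unexposed
chi2, p = chi_square_2x2(30, 10, 15, 25)
print(f"chi-square = {chi2:.2f}, P = {p:.4f}")
```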

27
Q

Outbreak situation: you intend to use a chi-squared test

However, there is a very small population size (<5)

What statistical analysis should be used?

A

The Fisher exact test is used for evaluating two-by-two contingency tables, and is a variant of the chi-square test.

The Fisher exact test is the preferred test for studies with few subjects.

The formula for the Fisher exact test calculates the P value directly, so a table of chi-squares is not needed. However, to obtain the P value for the study, one must calculate the probability of the observed table and then add to it the probabilities of all other possible tables (with the same row and column totals) that are equally or less likely.

This calculation should be done with the aid of a computer.
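
A minimal Python sketch of one common two-sided form of this calculation is shown below. It keeps the row and column totals fixed, computes the hypergeometric probability of each possible table, and sums the probabilities of tables no more likely than the observed one. The function names and the example table are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def table_probability(x):
        # Probability of x in the top-left cell when all margins are fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    probabilities = [table_probability(x) for x in range(lowest, highest + 1)]
    # Small tolerance so tables exactly as likely as the observed one are included
    return sum(p for p in probabilities if p <= observed * (1 + 1e-9))

# Hypothetical small study: 3 of 4 exposed people ill, 1 of 4 unexposed people ill
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.3f}")
```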

28
Q

What is the difference between using a P value and a confidence interval?

A

A P value only states whether a result is statistically significant, not whether it is clinically significant.

A confidence interval helps in judging clinical significance, because it gives a range of plausible values for the true value (usually the mean).
For a 95% confidence interval, intervals calculated in this way contain the true value 95% of the time.
A narrow confidence interval indicates a precise estimate (not necessarily a strong association), and therefore provides more overall information than a P value.
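
As an illustration, a confidence interval for a mean can be sketched in Python using the normal approximation (mean ± z × standard error). This is an assumption made for simplicity; with small samples a t distribution is usually preferred. The sample values are invented.

```python
import math
import statistics

def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return mean - z * standard_error, mean + z * standard_error

sample = [4.1, 5.0, 5.2, 4.8, 5.5, 4.9, 5.1, 4.7, 5.3, 5.4]  # hypothetical measurements
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```

A larger sample or less variable data gives a smaller standard error and therefore a narrower interval.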

29
Q

Calculating confidence intervals requires a representative sample from a normally distributed population

What does this mean?

A

Standard bell-shaped (normal) curve

A normal distribution is one in which the values are evenly distributed both above and below the mean. In a precisely normal distribution, the mean, mode, and median are all equal. For example, for the population of 3, 4, 5, 5, 5, 6, 7, the mean, mode, and median are all 5.
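
The example population can be checked with Python's standard `statistics` module:

```python
import statistics

population = [3, 4, 5, 5, 5, 6, 7]

mean = statistics.mean(population)
median = statistics.median(population)
mode = statistics.mode(population)

print(mean, median, mode)  # all three are 5 for this population
```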

30
Q

What is the definition of sensitivity?

What is the definition of specificity?

A

Sensitivity - number of patients with the disease who test positive, out of the total number of patients with the disease

Specificity - number of patients without the disease who test negative, out of the total number of patients who do not have the disease
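
Both definitions can be sketched in Python from the four cells of a test-versus-disease table. The counts used here are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of patients with the disease who test positive."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of patients without the disease who test negative."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical evaluation: 100 patients with the disease, 200 without
sens = sensitivity(true_positives=90, false_negatives=10)
spec = specificity(true_negatives=160, false_positives=40)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```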

31
Q

What is positive predictive value?

A

Percentage of all test-positives, who have the disease

True positive / (true positive + false positive) x 100

32
Q

What is negative predictive value?

A

Percentage of all test-negatives, who do not have the disease

True negative / (true negative + false negative) x 100
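
The positive and negative predictive values can be sketched together in Python. The counts are hypothetical (90 true positives, 40 false positives, 160 true negatives, 10 false negatives).

```python
def positive_predictive_value(true_positives, false_positives):
    """Percentage of all test-positives who have the disease."""
    return true_positives / (true_positives + false_positives) * 100

def negative_predictive_value(true_negatives, false_negatives):
    """Percentage of all test-negatives who do not have the disease."""
    return true_negatives / (true_negatives + false_negatives) * 100

ppv = positive_predictive_value(true_positives=90, false_positives=40)
npv = negative_predictive_value(true_negatives=160, false_negatives=10)
print(f"PPV = {ppv:.1f}%, NPV = {npv:.1f}%")
```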

33
Q

Statistical process control charts are used to monitor parameters over time, e.g. PCR results

The mean of the results (minimum 10-12) is usually plotted on a chart, with rules for when to flag any potential errors

Where are these data points set?

A

warning limits - two standard deviations above or below mean
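
A minimal Python sketch of setting warning limits from baseline results and flagging later results that fall outside them. The baseline values, the new results, and the use of the sample standard deviation are assumptions made for illustration.

```python
import statistics

def warning_limits(baseline):
    """Lower and upper warning limits: mean minus and plus two standard deviations."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return mean - 2 * sd, mean + 2 * sd

# Hypothetical baseline of 12 control results
baseline = [24.8, 25.1, 25.0, 24.9, 25.3, 25.2, 24.7, 25.0, 25.1, 24.9, 25.0, 25.0]
lower, upper = warning_limits(baseline)

new_results = [25.1, 24.9, 26.2]
flagged = [r for r in new_results if r < lower or r > upper]
print(f"warning limits: {lower:.2f} to {upper:.2f}; flagged: {flagged}")
```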