Tutorial 1 - Tools of the trade: understanding and interpreting the findings commonly reported in papers Flashcards

1
Q

Why is it not ideal to sample the whole population

A

We cannot measure every individual: it would be time consuming and expensive, and is virtually impossible.

2
Q

How do we get around not being able to sample the entire population

A

We take samples of the population.

Generalise and interpret the results to make conclusions for the whole population.

3
Q

What does statistics help to overcome

A

Sources of variation

4
Q

Is one sample relevant on its own

A

No

We need to use statistical tools to generalise for the whole population

5
Q

Describe confidence intervals

A

The PCT’s estimate of their population’s smoking prevalence is 28% from their sample, but there will be some uncertainty around this estimate. We express this uncertainty using a 95% confidence interval (95% CI) around the estimate, e.g. 19% to 37%. This means that if we repeated the sampling 100 times and calculated a 95% CI each time, we would expect about 95 of the 100 intervals to contain the true prevalence of smoking in the PCT.
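As a rough illustration, the interval quoted above can be reproduced with the normal-approximation (Wald) formula for a proportion. The card does not state the sample size, so `n = 100` is an assumption made only for this sketch, and `wald_ci` is a hypothetical helper name.

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """Approximate 95% CI for a proportion using the normal approximation."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error

# n = 100 is an assumed sample size; the source gives only the 28% estimate.
lower, upper = wald_ci(0.28, 100)
print(f"95% CI: {lower:.2f} to {upper:.2f}")  # 95% CI: 0.19 to 0.37
```

A larger assumed `n` would shrink the standard error and so narrow the interval.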

6
Q

How do we assess whether differences are due to chance (sampling error) or to a real difference in prevalence?

A

This is done statistically by setting up a null hypothesis of no difference and looking for evidence against it: how likely is it that our two samples would differ by at least as much as 28% and 21% if the two true underlying prevalences were the same? We then choose the appropriate statistical test (e.g. chi-squared test to compare the two proportions) to get this likelihood, which is the P value. The lower the P value, the less likely that our estimated difference is a chance finding. Suppose the P value was 0.014. Convention has it that if P<0.05 (and this is an arbitrary cut-off!) then we can reject the null hypothesis and conclude that the smoking prevalence fell after the campaign. Such a result is called statistically significant.

7
Q

Are statistically significant results more or less likely with small sample size than with large sample sizes?

A

The larger the sample size, the more information we have and the less uncertainty there is. For a given true difference, a statistically significant result is therefore more likely with a large sample than with a small one.

8
Q

What is the formula for the chi squared test

A

χ² = Σ (Observed − Expected)² / Expected, summed over every cell of the table
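A minimal sketch of this formula applied to a 2×2 table follows. The counts are hypothetical (1,000 people surveyed before and 1,000 after a campaign, chosen to give 28% and 21% smokers), no continuity correction is applied, and the P value uses the fact that for 1 degree of freedom the upper tail of the chi-squared distribution equals `erfc(sqrt(x / 2))`.

```python
import math

def chi_squared_2x2(table):
    """Return (statistic, p_value) for a 2x2 table [[a, b], [c, d]] with 1 df."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))  # valid for 1 df only
    return statistic, p_value

# Hypothetical counts: [smokers, non-smokers] before and after the campaign.
table = [[280, 720], [210, 790]]
statistic, p_value = chi_squared_2x2(table)
print(f"chi-squared = {statistic:.3f}, P = {p_value:.5f}")
```

With these assumed counts the statistic is about 13.2, so the P value falls well below the conventional 0.05 cut-off.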

9
Q

How do we calculate the odds ratio

A

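First calculate the odds in each group, e.g. the odds of exposure is the number of people who have been exposed divided by the number of people who have not been exposed. The odds ratio is then the ratio of the two odds, e.g. the odds of exposure in the cases divided by the odds of exposure in the controls.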

10
Q

Which type of studies can be used to calculate odds ratio

A

The relative risk can be calculated from cohort studies, since the incidence of disease in the exposed and non-exposed is known. In case-control studies, however, the subjects are selected on the basis of their disease status (a sample of subjects with a particular disease (cases) and a sample of subjects without that disease (controls)), not on the basis of exposure. Therefore, it is not possible to calculate the incidence of disease in the exposed and non-exposed individuals. It is, however, possible to calculate the odds of exposure. The odds ratio (of exposure) is the ratio between two odds, e.g. the odds of exposure in the cases divided by the odds of exposure in the controls.
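The calculation described here can be sketched as follows. The counts are invented for illustration and `odds_ratio` is a hypothetical helper, not something taken from the tutorial.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure in the cases divided by odds of exposure in the controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls

# Hypothetical case-control data: 100 cases and 100 controls.
result = odds_ratio(cases_exposed=30, cases_unexposed=70,
                    controls_exposed=15, controls_unexposed=85)
print(f"Odds ratio = {result:.2f}")  # Odds ratio = 2.43
```

An odds ratio above 1, as here, means exposure was more common among the cases than among the controls.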

11
Q

Explain how an odds ratio may be a good estimate of the relative risk

A

This ratio is the measure reported in case-control studies instead of the relative risk. It can be mathematically shown that the odds ratio of exposure is generally a good estimate of the relative risk. An odds ratio of 1 tells us that exposure is no more likely in the cases than controls (which implies that exposure has no effect on case/control status); an odds ratio greater than 1 tells us that exposure is more likely in the case group (which implies that exposure might increase the risk of the disease). An odds ratio less than 1 tells us that exposure is less likely in the case group (which implies that exposure might have a protective effect).

12
Q

How do we calculate relative risk

A

the relative risk is used as a measure of association between an exposure and disease. It is the ratio of the incidence rate in the exposed group and the incidence rate in the non-exposed group.

13
Q

How do we interpret relative risk

A

A value of 1.0 indicates that the incidence of disease in the exposed and the unexposed are identical and thus the data shows no association between the exposure and the disease. A value greater than 1.0 indicates a positive association or an increased risk among those exposed to a factor. Similarly, a relative risk less than 1.0 means there is an inverse association or a decreased risk among those exposed, i.e. the exposure is protective.

14
Q

Describe attributable risk and fraction

A

The attributable risk for lung cancer in smokers is the rate of lung cancer amongst smokers minus the rate of lung cancer amongst non-smokers (i.e. the risk difference). It gives an indication of how many extra cases for which the exposure is responsible, making the important assumption that the relation between the exposure and the disease is causal (i.e. not explained by other confounding factors – see below). The attributable risk and related measures are typically used to help guide policymakers in planning public health interventions.

15
Q

Describe the structure of a null hypothesis for a case-control study

A

That the odds of taking HRT in women who had had an MI are the same as the odds of taking HRT in women who had not had an MI, i.e. the odds ratio equals 1. This would mean that taking HRT does not affect your chances of getting an MI (at least in the age range of those studied here)

16
Q

What does a 0.72 odds ratio mean in words

A

The odds of taking HRT in women who have had an MI is 0.72 times the odds of taking HRT in women without an MI.
The risk of acute myocardial infarction is reduced by ~30% in women who currently or recently used HRT.

17
Q

How do we define a normal distribution

A

By its mean and standard deviation (SD)

18
Q

How do we design a null hypothesis when comparing means

A

That the mean systolic BP of the two groups would be the same after six months of follow up: i.e. after six months, the mean BP of the intervention group minus the mean BP of the control group would be 0.

19
Q

How do we assess whether to accept or reject a null hypothesis when comparing means

A

Yes – the CI does not include 0, the value expected under the null hypothesis. The mean difference of 4.3 mmHg is unlikely to be due to chance, suggesting a genuine difference between the two groups. However, this difference might be explained by other factors (bias and confounding).

20
Q

What is the purpose of randomisation

A

To try to ensure that possible confounders were equally distributed across the groups. This randomisation is one method for controlling for unknown confounding variables at the design stage of a study.

21
Q

What is the purpose of stratification, give an example

A

All the people with diabetes were divided evenly (with each person allocated at random to a particular group) between the groups so that any treatment effect would not be due to a disproportionate number of people with diabetes, who have a higher BP on average than people without, in one group or the other. This is an example of controlling for confounding in the design stage.

22
Q

What can an incidence in a sample tell you about the incidence in the population

A

Because breast cancer risk varies with age, these data cannot tell us anything about the incidence of breast cancer in the whole female population of the UK. However, assuming that a fairly random selection of women in this age group attend the screening, these data can provide a pretty good estimate of the incidence in women aged between 50 and 70.

23
Q

Why may the odds ratio and relative risk values sometimes be the same

A

The odds ratio is an estimate of the relative risk, and for rare outcomes (such as cancers), these measures of effect will give similar estimates of risk. However, this is not the case for common outcomes, or for rare outcomes studied over very long periods of time.
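To see why rarity matters, the sketch below computes both measures from the same cohort-style counts. All numbers are hypothetical: both scenarios have a relative risk of 2, but the outcome is rare in one and common in the other.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Incidence in the exposed divided by incidence in the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Odds of disease in the exposed divided by odds of disease in the unexposed."""
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)
    return odds_exposed / odds_unexposed

# Hypothetical cohorts of 1,000 exposed and 1,000 unexposed people.
rare = (2, 1000, 1, 1000)        # outcome affects 0.2% vs 0.1%
common = (400, 1000, 200, 1000)  # outcome affects 40% vs 20%

for label, counts in (("rare", rare), ("common", common)):
    print(f"{label}: RR = {relative_risk(*counts):.3f}, OR = {odds_ratio(*counts):.3f}")
```

In the rare scenario the two measures almost coincide (about 2.00 each), while in the common scenario the odds ratio (about 2.67) overstates the relative risk of 2.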

24
Q

How would you describe in words a relative risk value

A

If you are exposed to aromatic amines at work your risk of getting bladder cancer is 297 times higher than if you were not exposed to aromatic amines.

25
Q

How would you interpret a population excess fraction value

A

Assuming causality, 98.7 percent of the bladder cancer cases in the study population can be attributed to occupational exposure to aromatic amines.

26
Q

What is population excess fraction influenced by

A

The population excess fraction is not just influenced by the relative risk associated with exposure, it is also dependent on the prevalence of exposure in the population being studied, as well as on the underlying incidence of the disease in the population.
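One common way to express this dependence is Levin's formula, PEF = p(RR − 1) / (1 + p(RR − 1)), where p is the prevalence of exposure in the population. The card does not say which formula the tutorial uses, so the sketch below is an assumption, and the prevalence and relative risk values are hypothetical.

```python
def population_excess_fraction(prevalence, relative_risk):
    """Levin's formula: share of all cases in the population attributable to exposure."""
    excess = prevalence * (relative_risk - 1)
    return excess / (1 + excess)

# Same hypothetical relative risk of 4, two different exposure prevalences.
for prevalence in (0.1, 0.5):
    pef = population_excess_fraction(prevalence, relative_risk=4)
    print(f"prevalence {prevalence:.0%}: PEF = {pef:.1%}")
```

With the relative risk held at 4, raising the exposure prevalence from 10% to 50% raises the fraction from about 23% to 60%.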

27
Q

What is the most useful measure of risk – the relative or the absolute (excess fraction) risk

A

It depends on what you want to know. If you want to identify the risk factors for a disease, the relative measure tells you what you need to know (i.e. how many times more likely are the exposed compared with the non exposed people to develop the outcome of interest). If you can establish that an exposure is causally associated with a disease, you can then think about the impact of exposure on the incidence of the disease in the population (attributable or excess risk).

28
Q

What are the limitations of relative and absolute risk

A

There are several limitations to reporting population attributable risks and excess fractions. You need to know that the exposure is causally related to the outcome of interest; you need to know that there is no bias or confounding that might influence the risk measure; and if you want to extrapolate the findings from a specific cohort, you also need to be sure that the study population is generalisable to the wider population. It is difficult to satisfy all these criteria, and as a result attributable risks should be looked upon as a best guess of the impact of exposure. You should also be aware that by calculating population excess fractions individually (e.g. for smoking, occupational exposure, diet), and ignoring the fact that many risk factors interact with each other, the percentages can add up to more than 100%. In the occupational cohort above, 98.7% of the bladder cancer cases were attributed to aromatic amine exposure; however 50.7% of the same cases were attributed to smoking!

29
Q

Describe a cohort study

A

Start with the exposed and unexposed. Follow over time, see incidence of disease. Determine risk factors.

30
Q

Describe case-control

A

Individuals with the disease (cases) are compared with individuals without it (controls). Ask both groups about past exposures and calculate the odds ratio (OR).

31
Q

How do you express attributable risk as a fraction

A

For example, if 20 out of 100 smokers got lung cancer (in a given period of time) compared with 5 out of 100 non-smokers, the relative risk (see below) would be 20/5 = 4, but the attributable risk would be (20 – 5)/100 = 15 per 100. This may also be expressed as an excess fraction; 15 per 100/20 per 100 = 75%. Of the 20 cases of lung cancer in the smoking population, 15 of them (75%) could be attributed to smoking.
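The arithmetic in this example can be checked with a short script. The counts are the ones given in the card; the variable names are only illustrative.

```python
# Figures from the worked example: lung cancer cases per 100 people.
risk_smokers = 20 / 100
risk_non_smokers = 5 / 100

relative_risk = risk_smokers / risk_non_smokers      # ratio of the two risks
attributable_risk = risk_smokers - risk_non_smokers  # risk difference
excess_fraction = attributable_risk / risk_smokers   # share of exposed cases due to exposure

print(f"Relative risk: {relative_risk:.1f}")
print(f"Attributable risk: {attributable_risk * 100:.0f} per 100")
print(f"Excess fraction: {excess_fraction:.0%}")
```

This reproduces the values in the card: a relative risk of 4, an attributable risk of 15 per 100 and an excess fraction of 75%.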

32
Q

What is attributable risk useful for

A

The attributable risk is especially useful in evaluating the impact of introduction or removal of risk factors. Its value indicates the number of cases of the disease among the exposed group that could be prevented if the exposure were completely eliminated.

33
Q

What is meant by confounding

A

a possible explanation for the study finding if confounding variables have not been taken into account in the study.

34
Q

What is meant by confounding variable

A

a factor that is associated with both the exposure and outcome of interest. Common confounders include age, smoking, socio-economic deprivation. Smoking is a confounder because smoking tends to be more prevalent in people exposed to non-tobacco-related toxins and carcinogens, and also more prevalent in people with a range of diseases.

35
Q

Describe matching

A

a method for “controlling for” (i.e. effectively removing) the effect of confounding at the design stage of a case-control study; controls are selected to have a similar distribution of potentially confounding variables to the cases, e.g. they are said to be “matched” for sex if there are similar proportions of men and women in both groups.

36
Q

Describe standardisation

A

a method for controlling the effect of confounding at the analysis stage of a study. Used to produce a Standardised Mortality Ratio, a commonly used measure in epidemiology.
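As a sketch of how a Standardised Mortality Ratio is usually obtained by indirect standardisation: apply reference death rates to the study population's age structure to get the expected deaths, then divide the observed deaths by that figure and multiply by 100. The age bands, rates and counts below are entirely hypothetical.

```python
# Hypothetical reference death rates and study population sizes per age band.
reference_rates = {"under 50": 0.001, "50-69": 0.005, "70 and over": 0.02}
study_population = {"under 50": 2000, "50-69": 1000, "70 and over": 500}
observed_deaths = 21  # hypothetical total deaths seen in the study population

# Deaths we would expect if the study population had the reference rates.
expected_deaths = sum(reference_rates[band] * study_population[band]
                      for band in reference_rates)

smr = observed_deaths / expected_deaths * 100
print(f"Expected deaths: {expected_deaths:.1f}, SMR: {smr:.1f}")
```

An SMR above 100 means more deaths were observed than the age structure alone would predict.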

37
Q

Describe restriction

A

a method for controlling the effect of confounding at the design stage of a study, e.g. by including patients in a clinical trial only between the ages of 18 and 65 without pre-existing illness so that the results of the trial are not confused (‘confounded’) by different levels of age or morbidity in the two treatment groups.

38
Q

Describe odds

A

the odds is another way to express probability e.g. the odds of exposure is the number of people who have been exposed divided by the number of people who have not been exposed. The mathematical relationship between odds and probability is: Odds = probability / (1 – probability)
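The relationship can be sketched in both directions. The function names are hypothetical, and the inverse, probability = odds / (1 + odds), follows from rearranging the formula above.

```python
def probability_to_odds(probability):
    """Odds = probability / (1 - probability)."""
    return probability / (1 - probability)

def odds_to_probability(odds):
    """Inverse relationship: probability = odds / (1 + odds)."""
    return odds / (1 + odds)

odds = probability_to_odds(0.2)
print(f"Probability 0.2 -> odds {odds:.2f}")                       # odds 0.25
print(f"Odds 0.25 -> probability {odds_to_probability(0.25):.2f}")  # probability 0.20
```

A probability of 0.2 (1 in 5) therefore corresponds to odds of 0.25 (1 to 4).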