Topic 8: Multiplicity Flashcards

1
Q

What is multiplicity

A

Multiplicity is the term to describe the increased risk of false positive (type 1) conclusions that arises when multiple statistical tests are carried out on a data set.

2
Q

Why does multiplicity occur?

A

Doing multiple statistical tests for multiple hypotheses means you have multiple effect estimates and multiple p-values. Each test is performed with a small chance of error, and because you look across all the tests to reach a conclusion, the type 1 error rate increases.

3
Q

What is the family-wise error rate?

A

The probability of making at least one false positive (type 1) error across the family of tests contributing to your conclusion. It increases as more tests contribute.

4
Q

What is the family-wise type 1 error?

A

The chance of a false positive conclusion in the trial as a whole

5
Q

What is required of the family-wise type 1 error rate in order for trial outcomes to influence practice?

A

It needs to be controlled.

6
Q

What is the p-value

A

The p-value arising from a statistical test is the probability of observing a difference at least as large as the one seen if the null hypothesis were true, that is, if chance alone were at work.

7
Q

For two tests, each with a 95% probability of no error, what is the overall chance of error?

A

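Chance of no error in either test: 0.95 * 0.95 = 0.9025 (assuming the tests are independent). So the overall chance of at least one error is 1 - 0.9025 = 0.0975, i.e. 9.75% (increased from 5%).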
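
The same arithmetic extends to any number of tests. The sketch below assumes the tests are independent and each uses the same per-test error rate; the function name `family_wise_error` is illustrative only.

```python
def family_wise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests,
    each run at a per-test type 1 error rate of alpha."""
    chance_of_no_error = (1 - alpha) ** n_tests
    return 1 - chance_of_no_error


# Two tests at 5% each reproduces the 9.75% figure on this card.
for n in (1, 2, 5, 10):
    print(n, round(family_wise_error(0.05, n), 4))
```

With ten tests at 5% each, the overall chance of at least one false positive is already about 40%.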

8
Q

For multiplicity, is it the number of p-values contributing to the conclusion or the number of p-values calculated

A

p-values contributing to your conclusion

9
Q

Is multiplicity an issue when you have multiple outcomes

A

Not if there is still only one hypothesis test contributing to the conclusion.

10
Q

What is data-dredging

A

A lot of tests carried out in order to try and find something important

11
Q

How can we avoid accusations of data dredging?

A

Don’t do any unplanned statistical tests. Do no more and no fewer tests than those planned in the trial protocol.

12
Q

Which tests in a trial need to be reported

A

All of them, regardless of whether they have positive outcomes or not.

13
Q

Do secondary analysis tests need to be reported too, even if they don’t feed into the main conclusion?

A

Yes

14
Q

What can only reporting positive outcomes lead to

A

Reporting bias

15
Q

How does reporting bias relate to multiplicity

A

It may not be clear whether the results reported cover all the tests conducted, so it’s unclear how much of what has been reported has occurred by chance simply because of excessive testing.

16
Q

When is multiplicity an issue in a trial with multiple outcomes

A

If only one or the other outcome is required to be significant, as opposed to both needing to be significant.

17
Q

When is multiplicity NOT an issue in a trial with multiple outcomes

A

When both are required to be significant, or if they are ordered so that the second is only tested if the first is significant. This means that the conclusion is only based on one result: effectiveness on both outcomes. Multiplicity is not an issue since the overall chance of error is not inflated.

18
Q

What is a hierarchical drug trial

A

When you’re comparing two doses of the same drug, but the lower dose would only be considered if the higher dose was found to be effective.

19
Q

What are two ways to overcome multiplicity for having multiple outcomes

A

Use a hierarchical strategy: make conclusions in a set, ordered way rather than conducting statistical tests for all hypotheses. Alternatively, use a composite outcome.
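
One way to picture the hierarchical strategy is a fixed-sequence test. The sketch below assumes the hypotheses are listed in their pre-specified order and that each is tested at the full alpha; the function name and the p-values are hypothetical.

```python
def hierarchical_test(p_values_in_order, alpha=0.05):
    """Test hypotheses in their pre-specified order, each at the full alpha.

    Testing stops at the first non-significant result, so every later
    hypothesis is left untested and is not rejected.
    """
    rejected = []
    for p in p_values_in_order:
        if p > alpha:
            break  # first failure: nothing further down the order is tested
        rejected.append(True)
    not_tested = len(p_values_in_order) - len(rejected)
    return rejected + [False] * not_tested


# The third p-value is tiny, but it is never reached because the second fails.
print(hierarchical_test([0.01, 0.20, 0.001]))  # [True, False, False]
```

Because a later hypothesis can only be claimed after every earlier one has succeeded, no splitting of alpha is needed in this sketch.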

20
Q

What is a composite outcome

A

Multiple end-points combined

21
Q

What is the issue with using composite outcomes to avoid multiplicity

A

They may be hard to interpret in terms of what the treatment differences are, what it relates to, and what the main driver of the treatment difference is.

22
Q

Why is using multiple treatment arms beneficial

A

You can answer two questions with only 50% more patients by having two interventions and a control - efficient.

23
Q

When is multiplicity an issue in trials with multiple treatment arms

A

When the conclusions simultaneously make reference to more than one treatment comparison - the chance of a false positive claim is increased.

24
Q

In what kind of multiple treatment arm trial is multiplicity particularly an issue

A

When arms are added and removed throughout the trial, so you don’t know at the start which treatment arms will be used.

25
Q

How can you get around multiplicity in a multiple arm trial

A

Use a hierarchical design

26
Q

Give an example of a time where you might repeat analysis at different time points

A

A trial that has an interim analysis and then a final analysis.

27
Q

Give 3 reasons you might stop a trial early

A

Overwhelming evidence of improvement in efficacy. A lack of efficacy. Futility - a small chance the trial will show efficacy if you continue.

28
Q

What is an adaptive trial

A

A trial in which changes can be made as it progresses: interim looks at the data are used to make changes and to check the original assumptions that were used to power the trial.

29
Q

In what phases are adaptive trials more common

A

Earlier phase trials

30
Q

Why is multiplicity an issue when repeating analyses at different time points (interim analyses)?

A

The more looks at the data, the more tests, and the greater the chance of making a type 1 error.

31
Q

What does testing for interactions do to the power of the test

A

Decreases the power of each test - need more participants for equivalent power.

32
Q

Why do subgroup treatment estimates have wider confidence intervals than overall treatment effect

A

Each subgroup contains fewer patients than the trial as a whole, and the study has been powered for the overall treatment effect. You normally need more patients to detect an interaction term than you would for a standard trial.

33
Q

What did subgroup analyses use before using treatment interactions for analysis

A

Run the same analysis separately in each subgroup and compare the p-values from the same test across the different populations.

34
Q

How do you avoid claims of data dredging when doing subgroup analyses

A

Pre-specify the groups used in subgroup analyses and have them be based on clinical rationale - avoids suspicion that you have done lots of tests and selected the strongest one.

35
Q

What does it mean that subgroup analyses are considered hypothesis generating

A

May be used to direct subsequent research

36
Q

How is multiplicity corrected

A

Splitting the type 1 error rate between analyses so the overall type 1 error doesn’t exceed the planned level.

37
Q

What are the two ways of splitting type 1 error to control for multiplicity

A

Equally, so that each test has an equal chance of significance, or unequally, so that less of the error is used on some analyses than others.
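
Both kinds of split can be sketched with one small helper. This is a simple weighted allocation of alpha, offered as an illustration only: the function name `split_alpha` and the weights are hypothetical, and the alpha-spending methods used for interim analyses in practice are more involved and are not shown here.

```python
def split_alpha(alpha, weights):
    """Share the overall type 1 error between analyses in proportion to weights.

    The per-analysis levels always add up to the overall alpha.
    """
    total = sum(weights)
    return [alpha * w / total for w in weights]


# Equal split across two analyses: 0.025 each.
equal = split_alpha(0.05, [1, 1])

# Unequal split (hypothetical weights): roughly 0.01 for the first
# analysis and 0.04 for the second.
unequal = split_alpha(0.05, [1, 4])

print(equal, unequal)
```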

38
Q

What kind of trials usually split the alpha equally

A

Multiple endpoints, multiple treatments and subgroup analyses

39
Q

Which kind of trials usually split alpha unequally

A

Multiple time points. It is usually useful to use less error on interim analysis than final analysis - also means that only extremely certain results at interim analysis will stop a trial for efficacy.

40
Q

True/False: Correcting multiplicity has no effect on the power

A

False

41
Q

Give 3 corrections to control for type 1 error, when the hypotheses are of equal importance

A

Bonferroni, Holm, Hochberg

42
Q

What does the Bonferroni correction do?

A

Splits the type 1 error equally across the tests: with n tests, each p-value is compared against alpha/n.
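
A minimal sketch of the Bonferroni rule, assuming a list of p-values and an overall alpha of 0.05. The function name and the example p-values are hypothetical, and treating a p-value exactly equal to the threshold as significant is a convention chosen here.

```python
def bonferroni(p_values, alpha=0.05):
    """Compare every p-value against the same threshold, alpha / n."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Three tests, so each is compared against 0.05 / 3 (about 0.0167).
print(bonferroni([0.01, 0.04, 0.03]))  # [True, False, False]
```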

43
Q

What is the disadvantage of the Bonferroni correction

A

Overly conservative.

44
Q

What does the Holm correction do

A

A sequentially rejective, stepwise multiple test procedure based on the Bonferroni correction. Order the p-values from smallest to largest. Compare the most significant p-value against alpha/n (the Bonferroni threshold), then the second most significant against alpha/(n-1), the third against alpha/(n-2), and so on, rejecting each null hypothesis whose p-value is less than its threshold. As soon as we fail to reject a null hypothesis, stop and do not reject any of the remaining null hypotheses.
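
A minimal sketch of the Holm procedure as it is usually defined: work up from the smallest p-value, and once a p-value fails its threshold, none of the remaining hypotheses is rejected. The function name and p-values are hypothetical, and counting a p-value exactly equal to its threshold as significant is a convention chosen here.

```python
def holm(p_values, alpha=0.05):
    """Holm step-down procedure; returns reject decisions in the input order."""
    n = len(p_values)
    # Indices of the p-values, from most to least significant.
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for step, idx in enumerate(order):
        # Thresholds run alpha/n, alpha/(n-1), ..., alpha/1.
        if p_values[idx] <= alpha / (n - step):
            reject[idx] = True
        else:
            break  # first failure: the remaining hypotheses are not rejected
    return reject


print(holm([0.01, 0.04, 0.02]))  # [True, True, True]
print(holm([0.01, 0.04, 0.03]))  # [True, False, False]
```

In the first example a plain alpha/n rule would reject only the hypothesis with p = 0.01, since 0.02 and 0.04 both exceed 0.05/3. Holm rejects all three.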

45
Q

True/False: The Holm correction is uniformly more powerful than the Bonferroni correction

A

True

46
Q

Which of the Holm and Hochberg corrections is a step-up procedure and which is step-down?

A

Holm is step-down (it starts from the smallest p-value); Hochberg is step-up (it starts from the largest p-value).

47
Q

How does the Hochberg Correction work

A

Order the p-values from smallest to largest. Take the least significant (largest p) and compare it to alpha. Take the 2nd largest p and compare it to alpha/2; in general, the i-th smallest of n p-values is compared to alpha/(n-i+1). When a p-value is less than its alpha threshold, comparison stops: the null hypothesis for that p-value and all null hypotheses with smaller p-values are rejected.
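
A minimal sketch of the Hochberg procedure as it is usually defined: work down from the largest p-value, comparing it to alpha, the next largest to alpha/2, and so on. The function name and p-values are hypothetical, and counting a p-value exactly equal to its threshold as significant is a convention chosen here.

```python
def hochberg(p_values, alpha=0.05):
    """Hochberg step-up procedure; returns reject decisions in the input order."""
    n = len(p_values)
    # Indices of the p-values, from least to most significant.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    reject = [False] * n
    for step, idx in enumerate(order):
        # Thresholds run alpha/1, alpha/2, ..., alpha/n.
        if p_values[idx] <= alpha / (step + 1):
            # Reject this hypothesis and every one with a smaller p-value.
            for j in order[step:]:
                reject[j] = True
            break
    return reject


print(hochberg([0.01, 0.04, 0.03]))  # [True, True, True]
print(hochberg([0.06, 0.01, 0.30]))  # [False, True, False]
```

In the first example the largest p-value (0.04) is already below alpha, so all three null hypotheses are rejected at the first step.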