Topic 8: Multiplicity Flashcards by Chloe Gaul

What is multiplicity

Multiplicity is the term to describe the increased risk of false positive (type 1) conclusions that arises when multiple statistical tests are carried out on a data set.

Why does multiplicity occur?

Doing multiple statistical tests for multiple hypotheses means you have multiple effect estimates and multiple p values. Each test is performed with a small chance of error and you look across tests to make a conclusion, which increases the type one error rate.

What is family wise error rate

The probability of making at least one type 1 error (false positive) across the family of tests that contribute to your conclusion.

What is the family wise type 1 error

The chance of a false positive conclusion in the trial as a whole

What is required of the family wise type 1 error rate in order for trial outcomes to influence practice

It needs to be controlled.

What is the p-value

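The p-value arising from a statistical test is the probability of observing a difference at least as large as the one observed if the null hypothesis (no true difference) were true. It is not the probability that the observed difference is due to chance.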

For two tests, each with a 95% prob of no error, what is the overall chance of error

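0.95 * 0.95 = 0.9025 is the chance that neither test gives an error (assuming the tests are independent). So the overall chance of at least one error is 1 - 0.9025 = 0.0975, i.e. 9.75% (increased from 5%).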
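
The same arithmetic extends to any number of tests. The Python sketch below is illustrative only: the function name is made up for this card, and it assumes the tests are independent and each is run at the same alpha.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent tests,
    each carried out at significance level alpha (independence is assumed)."""
    # (1 - alpha) ** n_tests is the chance that no test gives a false positive.
    return 1 - (1 - alpha) ** n_tests


# Two tests at alpha = 0.05 give 0.0975, matching the card above.
for k in (1, 2, 5, 10):
    print(k, round(familywise_error_rate(0.05, k), 4))
```

The error rate climbs quickly: with ten independent tests at 5% it is roughly 40%.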

For multiplicity, is it the number of p-values contributing to the conclusion or the number of p-values calculated

p-values contributing to your conclusion

Is multiplicity an issue when you have multiple outcomes

Not if there is still only one hypothesis test contributing to the conclusion.

What is data-dredging

A lot of tests carried out in order to try and find something important

How can we avoid accusations of data dredging

Don’t do any unplanned statistical tests. Do no more and no fewer tests than those planned in the trial protocol.

Which tests in a trial need to be reported

All of them, regardless of whether they have positive outcomes or not.

Do secondary analysis tests need to be reported too even if they don’t feed into the main conclusion

Yes

What can only reporting positive outcomes lead to

Reporting bias

How does reporting bias relate to multiplicity

It may not be clear whether the reported results cover all the tests conducted, so it's unclear how much of what has been reported has occurred by chance simply because of excessive testing.

When is multiplicity an issue in a trial with multiple outcomes

If only one or the other outcome is required to be significant, as opposed to both needing to be significant.

When is multiplicity NOT an issue in a trial with multiple outcomes

When both are required to be significant, or if they are ordered so that the second is only tested if the first is significant. This means that the conclusion is only based on one result: effectiveness on both outcomes. Multiplicity is not an issue since the overall chance of error is not inflated.

What is a hierarchical drug trial

When you’re comparing two doses of the same drug, but the lower dose would only be considered if the higher dose was found to be effective.

What are two ways to overcome multiplicity for having multiple outcomes

Hierarchical strategy: you make conclusions in a set, ordered way rather than conducting statistical tests for all hypotheses. Could also use a composite outcome.

What is a composite outcome

Multiple end-points combined

What is the issue with using composite outcomes to avoid multiplicity

They may be hard to interpret in terms of what the treatment differences are, what it relates to, and what the main driver of the treatment difference is.

Why is using multiple treatment arms beneficial

You can answer two questions with only 50% more patients by having two interventions and a control - efficient.

When is multiplicity an issue in trials with multiple treatment arms

When the conclusions simultaneously make reference to more than one treatment comparison - the chance of false positive claim is increased

In what kind of multiple treatment arm trial is multiplicity particularly an issue

When arms are added and removed throughout the trial, so you don’t know at the start which treatment arms will be used.

How can you get around multiplicity in a multiple arm trial

Use a hierarchical design

Give an example of a time where you might repeat analysis at different time points

A trial that has an interim analysis and then a final analysis.

Give 3 reasons you might stop a trial early

Overwhelming evidence of improvement in efficacy. A lack of efficacy. Futility - a small chance the trial will show efficacy if you continue.

What is an adaptive trial

A trial in which changes can be made as it runs: interim looks at the data are used to make changes and to check the original assumptions that were used to power the trial.

In what phases are adaptive trials more common

Earlier phase trials

Why is multiplicity an issue with repeating analysis at different time points/interim analyses.

The more looks at the data, the more tests, and the greater the chance of making a type 1 error.

What does testing for interactions do to the power of the test

Decreases the power of each test - need more participants for equivalent power.

Why do subgroup treatment estimates have wider confidence intervals than overall treatment effect

Each subgroup contains only a fraction of the patients, and the study has been powered for the overall treatment effect. You normally need more patients to detect an interaction term than you would for a standard trial.

What did subgroup analyses use before using treatment interactions for analysis

Do a set of analyses on each subgroup and compare the p-values from the same test on different populations.

How do you avoid claims of data dredging when doing subgroup analyses

Pre-specify the groups used in subgroup analyses and have them be based on clinical rationale - avoids suspicion that you have done lots of tests and selected the strongest one.

What does it mean that subgroup analyses are considered hypothesis generating

May be used to direct subsequent research

How is multiplicity corrected

Splitting the type 1 error rate between analyses so the overall type 1 error doesn't exceed the planned level.

What are the two ways of splitting type 1 error to control for multiplicity

Equally so that each test has an equal chance of significance, or so that less of the error is used on some analyses than others.

What kind of trials usually split the alpha equally

Multiple endpoints, multiple treatments and subgroup analyses

Which kind of trials usually split alpha unequally

Multiple time points. It is usually useful to use less error on interim analysis than final analysis - also means that only extremely certain results at interim analysis will stop a trial for efficacy.

True/False: Correcting multiplicity has no effect on the power

False

Give 3 corrections to control for type 1 error, when the hypotheses are of equal importance

Bonferroni, Holm, Hochberg

What does the Bonferroni correction do

Splits the type 1 error equally across the tests: each of n tests is compared against alpha/n.
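
As a rough illustration, the equal split can be written in a few lines of Python. This is a sketch rather than a reference implementation: the function name and the example p values are invented for this card, and rejection is taken to mean p at or below alpha/n.

```python
def bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Every test is compared against the same threshold, alpha / n.
    """
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Hypothetical p values from three tests; the threshold is 0.05 / 3, about 0.0167.
print(bonferroni([0.01, 0.02, 0.04]))  # [True, False, False]
```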

What is the disadvantage of the Bonferroni correction

Overly conservative.

What does the Holm correction do

A sequentially rejective, stepwise multiple test procedure based on the Bonferroni correction. Order the p values from smallest to largest. Compare the most significant (smallest) p value against alpha/n, the Bonferroni threshold, and reject its null hypothesis if p is below it. Then compare the second most significant against alpha/(n-1), the third against alpha/(n-2), and so on. As soon as we fail to reject a null hypothesis, stop: that hypothesis and all remaining ones are not rejected.
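
A minimal Python sketch of Holm's procedure as it is usually stated: thresholds alpha/n, alpha/(n-1), and so on, stopping at the first p value that misses its threshold, with that hypothesis and all later ones not rejected. The function name and example p values are made up for illustration, and rejection is taken to mean p at or below the threshold.

```python
def holm(p_values, alpha=0.05):
    """Return a list of booleans (in the original order): True where the
    null hypothesis is rejected by Holm's procedure."""
    n = len(p_values)
    # Indices of the p values, ordered from smallest to largest p.
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for rank, idx in enumerate(order):  # rank 0 is the smallest p value
        if p_values[idx] <= alpha / (n - rank):
            reject[idx] = True
        else:
            break  # first failure: this and all larger p values are not rejected
    return reject


# Hypothetical p values: 0.01 <= 0.05/3, 0.02 <= 0.05/2, 0.04 <= 0.05/1.
print(holm([0.01, 0.02, 0.04]))  # [True, True, True]
```

With these p values a plain alpha/3 threshold would reject only the first hypothesis, which shows why Holm is never less powerful than Bonferroni.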

True/False: The Holm correction is uniformly more powerful than the Bonferroni correction

True

Which of the Holm and Hochberg procedures is a step-up procedure and which is step-down

Holm is step down (it starts from the smallest p value), Hochberg is step up (it starts from the largest p value).

How does the Hochberg Correction work

Order the p values from smallest to largest. Take the least significant (largest p) and compare it to alpha. Then take the second largest and compare it to alpha/2; in general the i-th smallest p is compared to alpha/(n-i+1). As soon as a p is below its threshold, the comparison stops: the null hypothesis for that p is rejected, along with all null hypotheses with smaller p values.
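
A minimal Python sketch of the Hochberg procedure in its usual form, where the largest p value is compared against alpha itself and the i-th smallest against alpha/(n-i+1). The function name and example p values are made up for illustration, and rejection is taken to mean p at or below the threshold.

```python
def hochberg(p_values, alpha=0.05):
    """Return a list of booleans (in the original order): True where the
    null hypothesis is rejected by Hochberg's procedure."""
    n = len(p_values)
    # Indices of the p values, ordered from smallest to largest p.
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    # Work from the largest p value (rank n-1) down to the smallest (rank 0).
    for rank in range(n - 1, -1, -1):
        idx = order[rank]
        # The 0-based rank corresponds to i = rank + 1, so the threshold
        # alpha / (n - i + 1) becomes alpha / (n - rank).
        if p_values[idx] <= alpha / (n - rank):
            # Reject this hypothesis and every hypothesis with a smaller p.
            for j in order[: rank + 1]:
                reject[j] = True
            break
    return reject


# Hypothetical p values: the largest, 0.04, is already below alpha = 0.05,
# so all three null hypotheses are rejected.
print(hochberg([0.01, 0.04, 0.03]))  # [True, True, True]
```

For the same three p values, the step-down approach stops after rejecting only the smallest one, because 0.03 is above alpha/2.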

(47 cards)