Type 1 and 2 error Flashcards
Statistical error
We can never be completely certain that we are right when we reject or fail to reject the null hypothesis
Type 1 error = rejecting the null hypothesis when it is TRUE
Saying that the means are different when they are the same (FALSE POSITIVE)
Type 2 error = failing to reject the null hypothesis when it is FALSE
Saying the means are the same when they are different (FALSE NEGATIVE)
Type 1 error = FALSE POSITIVE
problems
- Overcall positive results
- Identify a treatment effect when one doesn't exist
- Waste of time and effort on further development of an ineffective drug
- Consequences for patients
Type 2 error = FALSE NEGATIVE
problems
- Overcall negative results
- Fail to identify a treatment effect when one does exist
- Reject/lose a potentially effective treatment
- Waste resources used so far on drug development
Type 1 error rate
how more likely and how to reduce
Rejecting the null hypothesis when it is true
- Designated by α (alpha), usually set to 0.05
- Implying that it is acceptable to have a 5% probability of incorrectly rejecting a true null hypothesis
Type 1 errors more likely with:
• Multiple tests – if we do 20 tests at α = 0.05, on average one will falsely reject a true null hypothesis
• Higher alpha values
Reduced by:
• Pre-study analysis design – avoid multiple testing
• Setting a lower alpha, e.g. 0.01
• Reporting p values to 3 decimal places to give accurate probability estimate
Type 2 error rate
how more likely and how to reduce
FAILING to reject the null hypothesis when it is false
Denoted by the Greek letter β (beta)
The 'POWER' of a test is 1 − β
• Power = likelihood of a statistical test detecting an effect when there is one
• Greater power = less likely to be a false negative result
Type 2 errors more likely with:
• Small samples
• Small effect size - hard to detect
Reduced by:
• Large sample size
• Larger effect size (choosing an outcome where you can measure better effect size)
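The effect of sample size and effect size on power can be illustrated with a quick Monte Carlo simulation. This is a minimal sketch, assuming a two-sample z-test with known unit variance in both groups; the function name is illustrative, not from any library.

```python
import math
import random
from statistics import NormalDist

def simulate_power(n, effect_size, alpha=0.05, trials=2000, seed=1):
    """Monte Carlo estimate of power: the proportion of simulated
    trials in which a two-sample z-test (unit-variance normal data,
    true mean difference = effect_size) rejects the null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(effect_size, 1.0) for _ in range(n)]
        diff = sum(b) / n - sum(a) / n
        se = math.sqrt(2.0 / n)  # standard error with known unit variance
        if abs(diff) / se > z_crit:
            rejections += 1
    return rejections / trials
```

Running this with a fixed effect size shows power rising as n grows (fewer type 2 errors), and with fixed n shows power rising as the effect size grows.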
Multiplicity
Performing many statistical tests on one clinical trial
Increases the risk of type 1 error (alpha)
• False positive result
• Rejecting null hypothesis when it is actually true
• Set for a single comparison at p<0.05
Risk of type 1 error calculation
Calculated by:
[1 − (1 − α)^n] where n is the number of tests
Type 1 error rate of <0.05 is accepted for a single test
Inappropriate for multiple tests
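The formula above can be computed directly. A minimal sketch (the function name is illustrative):

```python
def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one false positive (type 1 error)
    across n independent tests, each run at level alpha:
    FWER = 1 - (1 - alpha)^n."""
    return 1.0 - (1.0 - alpha) ** n_tests
```

For example, 20 tests at α = 0.05 give a roughly 64% chance of at least one false positive, and 30 tests give roughly 79%, which is why a per-test threshold of 0.05 is inappropriate for multiple tests.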
Multiplicity in clinical trials
5 examples
Multiple treatments
• More than 2 groups (drugs, doses, combinations)
Multiple endpoints
• Several outcomes of interest
Repeated measurements
• Measurements at multiple time points
Subgroup analyses
• Tests whether individuals with certain characteristics benefit more than those without (e.g. demographics, lifestyle)
Interim analyses
• Analysis of data conducted before data collection is complete, i.e. during the trial, e.g. for ethical and economic reasons
Dealing with multiplicity
- Make fewer comparisons
- Pre-define/prioritize the comparisons
- Adjust the p value
Make fewer comparisons - dealing with multiplicity
- MULTIPLE TREATMENTS: use analysis of variance (a single omnibus test compares all treatments at once rather than making multiple comparisons)
- MULTIPLE ENDPOINTS: use a single summary statistic, e.g. a questionnaire with many questions but one overall score
- MULTIPLE ENDPOINTS: use a composite endpoint, e.g. MACE – the occurrence of any major heart problem such as stroke or heart attack all counts as one endpoint
- REPEATED MEASUREMENTS: do the analysis at a predefined timepoint, use a summary measure (e.g. area under the curve), or use a statistical mixed model
Pre-define/prioritize - dealing with multiplicity
Multiple treatments
– Pre-define the most important comparison
Multiple endpoints
– Specify primary and secondary endpoints in advance
• Study is powered to detect primary endpoint and outcome judged on the significance of the primary endpoint
Subgroup analyses
– Predefine a limited number of subgroups to be analyzed
adjust the p value - dealing with multiplicity
Type 1 error rate is inflated by multiple tests
– Reduce the p value threshold for individual tests
– Overall level of significance can be kept at 0.05 for entire series of tests
e.g. Bonferroni correction
– divide 0.05 by number of tests done to set significance level for each subtest
– e.g. for 5 related tests set a (risk of false positive result) at 0.05/5 = 0.01
– Very conservative, tends to overcorrect and increase the risk of a false negative result
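The Bonferroni correction can be sketched in a few lines (function names are illustrative; real analyses would typically use a statistics package):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction:
    divide the overall alpha by the number of tests."""
    return alpha / n_tests

def bonferroni_significant(p_values, alpha=0.05):
    """Flag which p values remain significant after Bonferroni
    correction for the whole family of tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]
```

For 5 related tests this sets each per-test threshold at 0.05/5 = 0.01, so a p value of 0.02 that would be "significant" on its own is no longer counted, keeping the overall error rate near 0.05 (at the cost of being conservative).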
No significance testing for baseline data
can be avoided to reduce multiplicity
Multiple tests will generate false positive results
• e.g. 30 comparisons; 79% chance of false positive result
Differences may be clinically important but not statistically significant
• negative tests may be falsely reassuring
Comparisons not testing a useful scientific hypothesis
repeated measurements
Outcome variable measured two or more times for each participant over a period of time
e.g. before, during and after
How to compare repeated measures between groups?
Compare final measurement?
• Wastes a lot of valuable information
Compare every timepoint?
• Multiple comparisons – risk type 1 error
Some kind of regression?
• Correlation structure leads to bias
Summary measure approach?
• Each measure has limitations
WE CAN USE A REPEATED MEASURES MODEL (ANOVA) or summary measures
Tracking
- Baseline characteristics influence PK/PD, so measurement values vary from low to high
- Values tend to track for an individual, e.g. start high, stay high; start low, stay low
- There is strong correlation between repeated measures – a 'correlation structure' – hence we can't do ordinary regression
• ANOVA (analysis of variance test) is an OMNIBUS TEST
- OMNIBUS test – tests everything at once (the variance of all variables) – avoids the risk of multiplicity
- However, the output just tells us there is a difference – it doesn't tell us what is different (which time points differ?)
• A post hoc test
can tell us what is different – you can do this with estimated marginal means
- Multiplicity is acceptable for post hoc tests because you have already shown there is a difference between the time points as a whole
- Post hoc testing is exploratory analysis, not your primary outcome
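The omnibus idea can be sketched by hand for a one-way layout: a single F statistic compares all groups at once instead of running many pairwise tests. This is an illustrative hand-rolled sketch; a real repeated-measures analysis would use dedicated software (e.g. a mixed model) that handles the correlation structure.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA omnibus test:
    the ratio of between-group to within-group mean squares.
    A large F suggests at least one group mean differs."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group sum of squares: spread of observations around their group mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within
```

One F test replaces many pairwise comparisons, which is exactly how the omnibus approach avoids inflating the type 1 error rate; post hoc comparisons then follow only if F is significant.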
• Estimated marginal means
an estimate of the means rather than the actual calculation of them
• We can compare them and compare the main effects
• The means are estimated from the regression model rather than calculated from data
• These are inferential stats not descriptive
• Means for groups adjusted for means of other factors in the model
• Also referred to as least square means
descriptive stats vs Least square means
- Least squares mean just means the means have been estimated from the model
- Primary outcome is NOT significant
- Why are there no p values for secondary outcomes? Because the primary outcome is not significant, so you don't explore statistics on the secondary outcomes and don't give p values
Summary measure approach pros
- Summarises all the information as a single statistic
- Reduced multiplicity
- Avoids the problem of correlation structure
- Makes interpretation easier
summary measure for repeated time measurements approach examples and limitations
- Mean (central level of efficacy of the outcome variable); limitation: sensitive to missing data
- Maximum (describes maximum drug concentration)
- Time to maximum (describes speed of drug action); limitation: sensitive to missing data
- Area under curve (assesses overall concentration of drug); limitation: ignores within-subject variation
- Percentage of time above/below a certain value (assesses time that the drug is effective)
- Number of occasions above or below a certain value (assesses frequency of fluctuations); limitation: many time points needed for a stable estimate
- Rate of change (rate of change in the outcome variable); limitation: coefficients are measured with varying levels of precision
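As a worked example of one summary measure, the area under the curve for a single participant's repeated measurements can be computed with the trapezoidal rule (the function name is illustrative):

```python
def auc_trapezoid(times, values):
    """Area under the curve by the trapezoidal rule: sum the area of
    the trapezoid between each pair of adjacent time points. Collapses
    a participant's repeated measurements into one summary statistic."""
    return sum((t1 - t0) * (v0 + v1) / 2.0
               for (t0, v0), (t1, v1) in zip(zip(times, values),
                                             zip(times[1:], values[1:])))
```

Each participant's measurement series is reduced to one number, so groups can then be compared with a single test, avoiding both multiplicity and the correlation-structure problem.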