Finals Review Flashcards
Describe the primary goals of Phase I drug trials.
- Goal: Determine which dose of drug is safe and most likely to show benefit
- Estimate the largest dose that can be given before patients experience unacceptable toxicity (the maximum tolerated dose, MTD)
- Start with a low dose and escalate until a prespecified level of toxicity is achieved
Describe what is meant by the efficacy/toxicity trade-off.
A higher dose will likely have higher efficacy but also higher toxicity, so a Phase I trial looks for a dose that is high enough to be effective without causing too many toxicity events (the MTD).
In a 3+3 design, how is the maximum tolerated dose determined?
Treat 3 participants at dose level k and count dose-limiting toxicities (DLTs)
- If 0 DLTs, escalate to dose level k+1
- If 2+ DLTs, de-escalate to dose level k-1
- If 1 DLT, treat 3 additional participants at dose level k
– If 1 in 6 has a DLT, escalate to dose level k+1
– If 2+ in 6 have a DLT, de-escalate to dose level k-1
- MTD is the highest dose level at which 0 or 1 DLT is observed (repeat as needed)
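The decision rules above can be sketched as a small function. This is only an illustration of the rules as written on this card; the function name and the "expand" label for "treat 3 more at the same dose" are made up for the example.

```python
from typing import Optional


def three_plus_three_decision(dlts_first3: int, dlts_next3: Optional[int] = None) -> str:
    """Return the next action at one dose level under the 3+3 rules.

    dlts_first3: number of DLTs among the first 3 participants.
    dlts_next3:  number of DLTs among the 3 additional participants,
                 or None if the cohort has not been expanded.
    """
    if not 0 <= dlts_first3 <= 3:
        raise ValueError("dlts_first3 must be between 0 and 3")

    if dlts_next3 is None:
        if dlts_first3 == 0:
            return "escalate"      # go to dose level k+1
        if dlts_first3 == 1:
            return "expand"        # treat 3 more at dose level k
        return "de-escalate"       # 2+ DLTs: go to dose level k-1

    if dlts_first3 != 1:
        raise ValueError("a cohort is only expanded after exactly 1 DLT in the first 3")
    if not 0 <= dlts_next3 <= 3:
        raise ValueError("dlts_next3 must be between 0 and 3")

    total = dlts_first3 + dlts_next3
    # 1 in 6 -> escalate; 2 or more in 6 -> de-escalate
    return "escalate" if total == 1 else "de-escalate"


print(three_plus_three_decision(1))      # first cohort had 1 DLT
print(three_plus_three_decision(1, 0))   # expanded cohort: 1 DLT in 6
```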
What are some strengths and limitations of the 3+3 design?
Strengths: Easy, small sample, doesn’t require statistician
Limitations:
o Ignores dose history other than previous 3 patients
o Imprecise and inaccurate MTD estimation
o Low probability of selecting true MTD
o High variability in MTD estimates
o Dangerous outcomes
What are some strengths and limitations of the Continual Reassessment Method (CRM)?
Strengths: Relatively precise way to determine the MTD
Limitations: Requires a statistician and modelling, requires assumptions
**ADD TO THIS
Write down the null and alternative hypothesis for a futility trial
- The null and alternative hypotheses are reversed relative to a standard trial, and alpha is set higher (flip alpha and beta, so alpha is about 0.2 and beta about 0.05)
Continuous (assuming a higher mean is better):
- Futility Trial. Null: New Tx Mean >= Control Mean + Delta
- Futility Trial. Alt: New Tx Mean < Control Mean + Delta
- Futility Trial. Reject null: New Tx is futile; Do not move forward. Reject is bad.
- Standard. Null: New Tx Mean = Control Mean
- Standard. Alt: New Tx Mean > Control Mean
- Standard. Reject null: New Tx is effective. Reject is good.
Binary (assuming a lower proportion is better, e.g., an event rate):
- Futility Trial. Null: New Tx Prop <= Control Prop - Delta
- Futility Trial. Alt: New Tx Prop > Control Prop - Delta
- Futility Trial. Reject null: New Tx is futile; Do not move forward. Reject is bad.
- Standard. Null: New Tx Prop = Control Prop
- Standard. Alt: New Tx Prop < Control Prop
- Standard. Reject null: New Tx is effective. Reject is good.
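A minimal sketch of the continuous futility test, assuming a normal approximation and a known standard error for the difference in means. The function name and the example numbers are hypothetical; alpha = 0.20 follows the "alpha about 0.2" note above.

```python
from statistics import NormalDist


def futility_z_test(mean_new, mean_control, se_diff, delta, alpha=0.20):
    """One-sided z-test for a futility design (higher mean is better).

    H0: mean_new - mean_control >= delta   (new Tx is at least delta better)
    H1: mean_new - mean_control <  delta
    Rejecting H0 means the new treatment is declared futile.
    """
    z = (mean_new - mean_control - delta) / se_diff
    p_value = NormalDist().cdf(z)  # small when the observed gain falls well short of delta
    return z, p_value, p_value < alpha


# Hypothetical numbers: no observed improvement, target improvement delta = 3
z, p, futile = futility_z_test(mean_new=10.0, mean_control=10.0, se_diff=1.0, delta=3.0)
print(z, p, futile)
```

Note that "reject" here is the bad outcome for the treatment, which is the reverse of the standard test.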
What are some specific features of futility designs that typically result in a smaller sample size compared to conventional efficacy designs?
- One-sided test (or a single arm compared with historical controls)
- Looking to detect larger differences
- We are less concerned about concluding that an ineffective treatment may be effective (a type I error in a conventional design, a type II error in the futility formulation), because treatments that are not found futile in phase II go on to be tested in phase III trials, which use smaller error probabilities at the expense of larger sample sizes
- Lower power (higher beta), smaller sample size
- **ADD TO THIS AND MAKE SURE IT’S CORRECT (but basically, proving futility is a smaller claim than proving efficacy so it does not require as many participants)
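As a rough check on the one-sided and higher-alpha points, the usual normal-approximation sample size formula for comparing two means, n per arm = 2 * (sigma * (z_alpha + z_beta) / delta)^2, can be evaluated under both setups. The specific delta, sigma, and power values below are assumptions chosen for illustration, and power is held equal in both designs to isolate the effect of alpha and sidedness.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sigma, alpha, power, two_sided):
    """Normal-approximation sample size per arm for a two-arm comparison of means."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    return ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


# Assumed illustration: same detectable difference and same power in both designs
conventional_n = n_per_arm(delta=0.5, sigma=1.0, alpha=0.05, power=0.80, two_sided=True)
futility_n = n_per_arm(delta=0.5, sigma=1.0, alpha=0.20, power=0.80, two_sided=False)
print(conventional_n, futility_n)
```

Designing for a larger delta shrinks n further, since n scales with 1 / delta^2.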
Describe what the differences are in Type I and Type II errors between conventional designs and futility designs.
Conventional, Type I error (alpha): concluding an ineffective therapy is effective (falsely rejecting H0).
Conventional, Type II error (beta): concluding an effective therapy is ineffective (failing to reject a false H0).
Futility, Type I error (alpha): concluding an effective therapy is ineffective (falsely rejecting H0 and calling an effective treatment futile).
Futility, Type II error (beta): concluding an ineffective therapy may be effective (failing to reject a false H0).
For futility we have a higher alpha and lower beta
In cluster randomized trials, assume there is a positive correlation in the primary outcome for individuals within a cluster. Describe the impact on the alpha level if you ignore the design effect and analyze as if they are independent.
- The type I error rate (alpha) will increase: p-values are biased downward, so the false-positive rate is inflated
- POSITIVE INTRACLASS CORRELATION REDUCES VARIATION AMONG MEMBERS OF THE SAME GROUP, so each extra member of a cluster adds less information than an independent observation would; an analysis that ignores this underestimates the variance, which inflates the Z/T statistic and decreases the p-value
If you ignore the correlation, you divide by a variance estimate that is too small, so the t/z statistic is larger (more extreme) and the p-value is smaller. The design effect inflates the variance to correct the p-value.
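A small sketch of the correction, assuming equal cluster sizes so that the design effect is 1 + (m - 1) * ICC, where m is the cluster size. The function names and example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation factor for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc


def two_sided_p(z):
    return 2 * (1 - NormalDist().cdf(abs(z)))


def naive_vs_adjusted_p(z_naive, cluster_size, icc):
    """Compare the p-value that ignores clustering with the design-effect-adjusted one."""
    deff = design_effect(cluster_size, icc)
    z_adjusted = z_naive / sqrt(deff)  # variance is inflated by deff, so z shrinks by sqrt(deff)
    return two_sided_p(z_naive), two_sided_p(z_adjusted)


# Assumed example: clusters of 20 with ICC = 0.05
p_naive, p_adjusted = naive_vs_adjusted_p(z_naive=2.2, cluster_size=20, icc=0.05)
print(design_effect(20, 0.05), p_naive, p_adjusted)
```

In this example the naive analysis looks significant at 0.05 while the adjusted one does not, which is the inflated type I error described above.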
Based on a figure of point estimates and confidence intervals, be able to explain whether one would conclude superiority, noninferiority, inconclusive, or inferior treatment.
Blue area is the region of non-inferiority. The CI will usually be fairly wide because of the small sample size, so unless the sample size is very large or the non-inferiority margin delta is too large, scenario “D”, where the entire CI lies between 0 and delta, is unlikely.
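One way to practice reading such a figure is to write the rules down explicitly. The sketch below assumes the plotted quantity is (new minus active control) for an unfavorable outcome such as mortality, so negative values favor the new treatment and the margin delta is positive. The labels and cut rules are one common convention, not necessarily the exact one used in the figure.

```python
def classify_ci(lower, upper, margin):
    """Classify a confidence interval for (new - control) in an unfavorable outcome.

    margin is the non-inferiority margin delta (> 0).
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if lower > upper:
        raise ValueError("lower must not exceed upper")

    if upper < 0:
        return "superior"        # whole CI favors the new treatment
    if upper < margin:
        return "non-inferior"    # CI excludes a difference as large as the margin
    if lower > margin:
        return "inferior"        # whole CI is worse than the margin
    return "inconclusive"        # CI crosses the margin


print(classify_ci(-0.08, -0.01, margin=0.05))
print(classify_ci(-0.03, 0.04, margin=0.05))
print(classify_ci(-0.02, 0.09, margin=0.05))
print(classify_ci(0.06, 0.12, margin=0.05))
```

Under these rules a CI lying entirely between 0 and delta is labeled non-inferior even though it also excludes 0.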
- Be able to define what the margin of non-inferiority is. Also, be able to comment on the choice of this margin compared to an active control’s effect. For example, if an active control has a delta difference with placebo, in a non-inferiority trial for a new treatment compared to that active control, describe how the margin of non-inferiority may compare to this delta and the rationale.
- Margin of non-inferiority (delta): specified in the protocol; the maximum difference in response between the two interventions that is still considered clinically acceptable (e.g., ½ or 1/3 of the active control’s established superiority over placebo), so that the new treatment retains a certain proportion of the active control’s efficacy
• Placebo – 40 % mortality
• AC – 30% mortality
• Margin – ½ of the 10% benefit, so we accept 5% more mortality, i.e., up to 35%
- Non-inferiority is comparing to an active control, but we need to consider the effectiveness of the active control as compared to the placebo in choosing the margin
- Trial conduct must be very rigorous: clear protocol, carefully conducted (minimal drop-out, non-compliance, and missing data). Participants who are not taking their treatment look more like “placebo”, which makes the two arms look more alike. ESPECIALLY IMPORTANT FOR NON-INFERIORITY: IN AN EFFECTIVENESS (SUPERIORITY) TRIAL THIS BIASES TOWARD NO DIFFERENCE, WHICH IS CONSERVATIVE, BUT IN A NON-INFERIORITY TRIAL “NO DIFFERENCE” IS THE CONCLUSION WE ARE TRYING TO REACH. This matters because it makes you more likely to call a truly inferior treatment non-inferior, i.e., it INCREASES THE PROBABILITY OF A FALSE NON-INFERIORITY CONCLUSION.
- Have to determine the margin of non-inferiority delta – maximum difference between two interventions that is considered clinically acceptable
- Placebo 40% mortality; Active control 30% mortality; Margin: 35%-30%=5% or 37%-30%=7%? Definitely cannot go to 10% margin because then you’re at placebo. Need to establish what is acceptable.
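The margin arithmetic in the mortality example can be written out as follows. This is a sketch of the "give up a fixed fraction of the active control's benefit" approach; the function name is made up for the example.

```python
def ni_margin(placebo_rate, active_rate, fraction_given_up):
    """Non-inferiority margin as a fraction of the active control's benefit over placebo.

    Returns (margin, highest acceptable event rate for the new treatment).
    """
    if not 0 < fraction_given_up < 1:
        raise ValueError("fraction_given_up must be strictly between 0 and 1")
    benefit = placebo_rate - active_rate       # active control's effect vs placebo
    margin = fraction_given_up * benefit
    return margin, active_rate + margin


# Placebo 40% mortality, active control 30% mortality
print(ni_margin(0.40, 0.30, 1 / 2))   # give up half of the 10-point benefit
print(ni_margin(0.40, 0.30, 1 / 3))   # stricter: give up a third
```

A fraction of 1 is rejected because the acceptable rate would then equal the placebo rate.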
- Describe why it’s reasonable to plan for testing non-inferiority and then superiority, but not the other way around.
CANNOT test superiority first and then non-inferiority
• Non-inferiority will have a smaller sample size
o Looking for a smaller effect size for non-inferiority than superiority (superiority is non-inferiority + effectiveness)
Superiority margin is from intervention vs placebo
• Larger
Non-Inferiority is for Intervention vs control
• Smaller
o Non-inferiority is one-sided, while standard/superiority will be two-sided.
• Test order also has to do with type I error. The tests are sequential: a test has to pass before moving on to the next one.
o If non-inferiority is tested first, it acts as the gatekeeper: once it passes, superiority can be tested, so both non-inferiority and superiority can be detected.
o If superiority is tested first and fails, we would not move on to non-inferiority. This is due to type I error control: we can only move on to the next sequential test if the current test is successful.
• The power issue has to do with the margin and the effect size being smaller (i.e., harder to detect) for superiority, so the sample size must be higher for superiority than for non-inferiority. WILL ALSO BE UNDERPOWERED FOR ONE OF THE TESTS IF THE DELTAS ARE NOT EQUAL.
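The gatekeeping order can be sketched as a fixed-sequence procedure. Assumptions in this example: the estimate is (new minus control) for an unfavorable outcome, so negative favors the new treatment; both tests are one-sided z-tests at the same alpha (0.025 is an assumed value); the function name is hypothetical.

```python
from statistics import NormalDist


def noninferiority_then_superiority(diff, se, margin, alpha=0.025):
    """Fixed-sequence testing: non-inferiority first, superiority only if it passes.

    diff:   estimated (new - control) difference in an unfavorable outcome
    se:     standard error of diff
    margin: non-inferiority margin delta (> 0)
    Returns (noninferior, superior).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)

    # Gatekeeper. H0: diff >= margin vs H1: diff < margin
    noninferior = (diff - margin) / se < -z_crit
    if not noninferior:
        return False, False          # stop: superiority is never tested

    # Second step at the same alpha. H0: diff >= 0 vs H1: diff < 0
    superior = diff / se < -z_crit
    return True, superior


print(noninferiority_then_superiority(diff=-0.06, se=0.02, margin=0.05))
print(noninferiority_then_superiority(diff=0.00, se=0.02, margin=0.05))
print(noninferiority_then_superiority(diff=0.04, se=0.02, margin=0.05))
```

Because the second test is only reached when the first succeeds, no alpha is split between the two tests in this sketch.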
Describe the intention-to-treat principle.
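“As randomized, so analyzed”: analyze every participant in the arm to which they were randomized, regardless of the treatment actually received. Any departure from this needs to be specified and justified.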
Describe the possible effect of dichotomization of a time to event/continuous outcome on the power of a trial.
Loss of power (would need to increase the sample size to compensate). The trade-off is that the results can be easier to interpret.
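The loss of power can be seen with a normal-approximation calculation. The sketch assumes a continuous outcome that is Normal(0, sigma) in the control arm and Normal(delta, sigma) in the new-treatment arm, dichotomized as "above the cutoff" versus "not". All names and numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

N = NormalDist()


def power_continuous(delta, sigma, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means."""
    z = delta / (sigma * sqrt(2 / n_per_arm))
    return N.cdf(z - N.inv_cdf(1 - alpha / 2))


def power_dichotomized(delta, sigma, n_per_arm, cutoff, alpha=0.05):
    """Approximate power after replacing the outcome with 'value > cutoff'."""
    p_control = 1 - N.cdf(cutoff / sigma)
    p_new = 1 - N.cdf((cutoff - delta) / sigma)
    se = sqrt((p_control * (1 - p_control) + p_new * (1 - p_new)) / n_per_arm)
    z = (p_new - p_control) / se
    return N.cdf(z - N.inv_cdf(1 - alpha / 2))


# Same trial, analyzed two ways (assumed: delta = 0.4 SD, 100 per arm, cutoff midway)
pc = power_continuous(delta=0.4, sigma=1.0, n_per_arm=100)
pd = power_dichotomized(delta=0.4, sigma=1.0, n_per_arm=100, cutoff=0.2)
print(round(pc, 2), round(pd, 2))
```

With the same participants, the dichotomized analysis has noticeably lower power, so matching the original power would take more participants.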
Describe what is meant by sub-group analyses.
- Who does the treatment work best for
- NIH mandates comparison of sex and racial/ethnic groups
- Usually done to see if the treatment is effective in specific groups (usually for marginally unsuccessful results)
- ALWAYS test for interaction with treatment
• Don’t interpret the main effects when you’re looking at the interaction
- Still considered post-hoc even when pre-specified
- Only do subgroup analysis if an interaction effect is present (the interaction test has lower power, so its p-value will usually be higher; a larger alpha is often used for the interaction test to maintain power)
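A minimal sketch of an interaction test between two subgroups, assuming the subgroups are disjoint (so their treatment-effect estimates are independent) and that a normal approximation is adequate. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interaction_z_test(effect_a, se_a, effect_b, se_b):
    """Test whether the treatment effect differs between two independent subgroups."""
    z = (effect_a - effect_b) / sqrt(se_a ** 2 + se_b ** 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Assumed example: effect looks "significant" in subgroup A (z = -2.5)
# but not in subgroup B (z = -0.5)
z, p = interaction_z_test(effect_a=-0.10, se_a=0.04, effect_b=-0.02, se_b=0.04)
print(z, p)
```

Here the interaction test is not significant at 0.05, so "significant in A, not in B" is not by itself evidence that the treatment works differently in the two groups.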