Covariates and confounders Flashcards
Covariates
Baseline characteristics of participants that explain part of the variability in outcome
Why does this matter in clinical trials?
Imbalances between trial groups may bias estimate of treatment effect
• This could be positive (overestimate) or negative (underestimate)
• e.g. if drug X works in men but not women, and there are more men in the treatment group than in the placebo group, this may overestimate the overall treatment effect
Differences in treatment effect between subgroups may be clinically important
• e.g. if drug X works in men but not women this is useful information in healthcare
Confounders
- Variables related to both intervention and outcome but not on the causal pathway
- Distort the effect of the intervention on the outcome, create bias
Why does this matter in clinical trials?
• Imbalances between trial groups may bias estimate of treatment effect
• This could be positive (overestimate) or negative (underestimate)
• e.g. patients taking drug Y were more likely to develop abnormal liver function tests than patients taking placebo
• On investigation more patients in the treatment group drank alcohol to excess than those in the placebo group
• On subgroup analysis, stratified by alcohol consumption, there was no difference in liver function tests between treatment and placebo groups; alcohol was a confounder that had positively biased the estimate of the safety outcome
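The drug Y example can be sketched with numbers. The counts below are hypothetical, invented only to show how a crude comparison can suggest harm while each alcohol stratum shows none; the names `strata`, `relative_risk` and `pooled` are illustrative.

```python
# Hypothetical counts, invented for illustration only:
# (patients with abnormal LFTs, total patients) per arm within each stratum.
strata = {
    "excess alcohol": {"drug": (30, 60), "placebo": (10, 20)},
    "no excess alcohol": {"drug": (4, 40), "placebo": (8, 80)},
}

def relative_risk(drug, placebo):
    """Risk in the drug arm divided by risk in the placebo arm."""
    return (drug[0] / drug[1]) / (placebo[0] / placebo[1])

def pooled(arm):
    """Add up events and totals for one arm across all strata."""
    return (
        sum(s[arm][0] for s in strata.values()),
        sum(s[arm][1] for s in strata.values()),
    )

# Crude analysis ignores alcohol; stratified analysis looks within each stratum.
crude_rr = relative_risk(pooled("drug"), pooled("placebo"))
stratum_rr = {name: relative_risk(s["drug"], s["placebo"]) for name, s in strata.items()}

print(f"crude relative risk: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"{name}: relative risk {rr:.2f}")
```

In this made-up data the drug arm contains more heavy drinkers, so the crude relative risk is well above 1 even though the relative risk within each stratum is 1.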
Managing covariates and confounders
Reduce bias of treatment effect
– Randomization
• Stratification by most influential covariates or confounders
– Adjust analysis for covariates or confounders
• Stratification or regression
• Reduces their impact on estimates of treatment effect
Identify significant covariates
– Subgroup analysis
• Identify subgroups of participants with different benefits/risks from treatment
Stratification
- Sorting participants into groups e.g. by confounders or covariates
- Can stratify at randomization or, if this fails to balance the groups, during analysis
• Stratification in analysis
– Investigate effects between intervention and outcome within subgroups
• Covariate – relationship will persist
• Confounder – relationship will disappear
Regression
• Unadjusted analysis
– Estimate of treatment effect with no account taken of covariates or confounders
• Adjusted analysis
– Estimate of treatment effect taking into account covariates and confounders
• Ideally pre-specified to reduce potential bias by ‘fishing’
• Achieved using regression models
– Factors chosen for adjustment
• Strong predictors of outcome e.g. correlated with r>0.50
• Clear ‘large’ imbalance despite randomization
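A minimal sketch of unadjusted vs adjusted analysis, assuming ordinary least squares on a continuous outcome (the cards do not say which regression model is meant). The data are hypothetical and noise-free: the true treatment effect is -2, but the treated group has higher baseline severity. The helpers `solve` and `ols` are illustrative, not from any library.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients: intercept first, then one per predictor column."""
    rows = [[1.0] + list(values) for values in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: treatment (1/0), baseline severity, and outcome.
treatment = [1, 1, 1, 1, 0, 0, 0, 0]
severity = [3, 4, 5, 6, 1, 2, 3, 4]  # imbalanced despite randomization
outcome = [10 - 2 * t + 1.5 * s for t, s in zip(treatment, severity)]

unadjusted = ols([treatment], outcome)[1]          # treatment coefficient only
adjusted = ols([treatment, severity], outcome)[1]  # treatment, adjusted for severity

print(f"unadjusted treatment effect: {unadjusted:.2f}")
print(f"adjusted treatment effect:   {adjusted:.2f}")
```

Here the unadjusted estimate has the wrong sign because severity is both imbalanced and a strong predictor of outcome; adjusting for it recovers the effect built into the data.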
Binary clinical trial end points
Events
Disease onset or flare
• e.g. heart or asthma attack, epileptic seizure, arthritis flare, stroke, COVID infection
Healthcare contact or not, may specify:
• Scheduled or unscheduled
• GP, hospital, emergency room
Survival or death, may specify:
• Disease-free survival, overall survival
Composite endpoints – occurrence or not
• e.g. Major adverse cardiac events (MACE) – first occurrence of any of nonfatal stroke, nonfatal myocardial infarction or cardiovascular death
Risk
Probability of an event occurring over a pre-specified time interval
Absolute risk (AR)
• Number of people with event/total number of people
Relative risk (ratio)
- Absolute risk INTERVENTION group/absolute risk COMPARATOR group
- No difference: relative risk = 1; treatment better than control (fewer events in the intervention group): relative risk < 1
Absolute risk reduction (ARR)
- Absolute risk COMPARATOR group MINUS absolute risk INTERVENTION group
- No difference ARR = 0
Relative risk reduction (RRR)
• (AR COMPARATOR - AR INTERVENTION)/AR COMPARATOR
• No difference RRR = 0
Number needed to treat
• 1/absolute risk reduction (if ARR is expressed as a percentage, use 100/ARR)
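The risk measures on these cards can be worked through in a few lines. This is a sketch with hypothetical counts (10/100 events on intervention, 20/100 on comparator); the function name `risk_measures` is illustrative.

```python
def risk_measures(events_intervention, n_intervention, events_comparator, n_comparator):
    """Compute the risk measures from event counts in two trial arms."""
    ar_i = events_intervention / n_intervention  # absolute risk, intervention
    ar_c = events_comparator / n_comparator      # absolute risk, comparator
    arr = ar_c - ar_i                            # absolute risk reduction
    return {
        "AR intervention": ar_i,
        "AR comparator": ar_c,
        "relative risk": ar_i / ar_c,
        "ARR": arr,
        "RRR": arr / ar_c,
        "NNT": 1 / arr,  # ARR as a proportion; undefined when ARR is 0
    }

# Hypothetical trial: 10/100 events on intervention, 20/100 on comparator.
measures = risk_measures(10, 100, 20, 100)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these numbers the relative risk is 0.5, the ARR is 0.10 (10%), the RRR is 0.5 and the NNT is 10.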
95% confidence interval
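If the study were repeated many times, 95% of the intervals calculated this way would contain the true value (often loosely read as a 95% chance that the true value is between these numbers)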
Descriptive vs inferential statistics
• Descriptive statistics
– Summarize characteristics of a dataset
• e.g. mean, standard deviation
– No uncertainty
• Inferential statistics
– Calculated from descriptive statistics
• Estimate of population values
• Allow hypothesis testing
– Uncertainty
Estimates
- Point estimate – single value estimate of a parameter (e.g. sample mean as point estimate of population mean)
- Interval estimate – a range of values within which the parameter is expected to lie (confidence intervals for example)
Confidence intervals
- Interval estimate – a range of values where the parameter is expected to lie
- Used with a point estimate to show how uncertain that estimate is
• Confidence levels
– The percentage probability of the interval containing the parameter
• 95% level most commonly used
• 5% is generally agreed to be an acceptable level of uncertainty
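A minimal sketch of a point estimate with its 95% confidence interval, assuming a normal approximation (for a sample this small a t-distribution would usually be preferred). The sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample (e.g. systolic blood pressure in mmHg), invented for illustration.
sample = [120, 132, 128, 141, 135, 126, 130, 138, 124, 133]

point_estimate = mean(sample)                        # descriptive: sample mean
standard_error = stdev(sample) / len(sample) ** 0.5  # uncertainty of that mean
z = NormalDist().inv_cdf(0.975)                      # about 1.96 for a 95% level

lower = point_estimate - z * standard_error
upper = point_estimate + z * standard_error
print(f"mean {point_estimate:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

The sample mean is the point estimate of the population mean; the interval around it is the interval estimate.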
rate
frequency of an event occurring per unit of time (e.g. events per person-year)
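One common way to estimate a rate is events divided by total follow-up time. The sketch below uses hypothetical follow-up data and contrasts the rate with the simple risk, which ignores how long each participant was followed.

```python
# Hypothetical follow-up (years) and event flags (1 = event, 0 = no event).
follow_up_years = [11, 9, 4, 6, 10]
events = [0, 1, 1, 0, 0]

risk = sum(events) / len(events)           # proportion of participants with an event
rate = sum(events) / sum(follow_up_years)  # events per person-year of follow-up

print(f"risk: {risk:.2f}")
print(f"rate: {rate:.3f} events per person-year")
```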
hazard
instantaneous event rate
probability of an event at a particular time point, given that the participant has been event-free up to that point
Kaplan-Meier
estimates survival over time from time-to-event data
plotted as a survival curve or as cumulative incidence (1 - survival)
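A minimal sketch of the Kaplan-Meier product-limit calculation, assuming each participant has a follow-up time and a flag for event (1) or censoring (0). The data and the function name `kaplan_meier` are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)  # still under follow-up at t
        had_event = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - had_event / at_risk          # product-limit step
        curve.append((t, survival))
    return curve

# Hypothetical follow-up times (months); 1 = event, 0 = censored.
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: survival {s:.3f}, cumulative incidence {1 - s:.3f}")
```

Censored participants count in the at-risk total until they leave follow-up, which is how the method uses incomplete follow-up.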
log rank test
compares survival distribution of 2 groups
takes the whole follow up period into account
purely a test of significance
doesn’t estimate size of difference between groups
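A sketch of the two-group log-rank test, assuming the standard observed-minus-expected form with a chi-square statistic on 1 degree of freedom. The data and the function name `log_rank` are hypothetical; a real analysis would normally use a statistics package.

```python
import math

def log_rank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for ti, _, _ in data if ti >= t)               # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)         # events at t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # upper tail, chi-square with 1 df
    return chi_square, p_value

# Hypothetical groups: A has early events, B has late events (all observed).
chi_square, p_value = log_rank([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
print(f"chi-square {chi_square:.2f}, p = {p_value:.4f}")
```

The output is only a test statistic and p value; it says whether the survival distributions differ, not by how much.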
cox regression
hazard model
calculates hazard ratio and CI
hazard ratio
relative risk of an endpoint occurring at any one time
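A rough sketch of how a hazard ratio and its CI could come out of a Cox model with a single 0/1 covariate (1 = intervention), assuming Newton-Raphson on the partial likelihood with Breslow handling of ties. The data and the function name `cox_binary` are hypothetical; this is for illustration, not a substitute for a statistics package.

```python
import math

def cox_binary(times, events, group, iterations=25):
    """Return (hazard ratio, lower 95% limit, upper 95% limit) for a 0/1 covariate."""
    def score_and_information(beta):
        gradient = information = 0.0
        for t_i, e_i, x_i in zip(times, events, group):
            if not e_i:
                continue  # censored participants only contribute to risk sets
            at_risk = [x for t, x in zip(times, group) if t >= t_i]
            weights = [math.exp(beta * x) for x in at_risk]
            mean_x = sum(w * x for w, x in zip(weights, at_risk)) / sum(weights)
            gradient += x_i - mean_x
            information += mean_x * (1 - mean_x)  # variance of a 0/1 covariate
        return gradient, information

    beta = 0.0  # log hazard ratio, starting from "no difference"
    for _ in range(iterations):
        gradient, information = score_and_information(beta)
        beta += gradient / information  # Newton-Raphson update
    _, information = score_and_information(beta)
    se = 1 / math.sqrt(information)
    return math.exp(beta), math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)

# Hypothetical data: follow-up time, event flag, group (0 = comparator, 1 = intervention).
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 1, 1, 1, 0, 1, 1, 1, 0]
group = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]

hazard_ratio, ci_low, ci_high = cox_binary(times, events, group)
print(f"hazard ratio {hazard_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

A hazard ratio below 1 means events tend to occur later in the intervention group; with so few participants the confidence interval is wide.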
Cox vs binary regression
Cox regression (hazard ratio) gives the same kind of output as binary regression (risk ratio)
However, the hazard ratio takes into account all of the points on the survival curve, whereas binary regression only looks at the risk at the end of the study
When would you use binary and when would you use cox regression?
- In some data sets patients take part for different lengths of time
- e.g. patient X is followed for 11 years and patient Z for 9 years, which complicates a simple comparison
- The hazard ratio compares the probability of death at any one time point, so it handles this well
- Binary regression only looks at the risk at the end of the study, so the risk ratio does not use that richer data
You would probably use Cox regression when you have:
• Different entry points
• Long study
• Different follow-up periods
Subgroup analysis
- A special form of stratification
- Gives information about variation in efficacy/safety that can improve treatment decisions
- Hypothesis generating
Subgroup analyses problems and solutions
multiple comparisons - limit subgroups and adjust p value
underpowered - pre-plan subgroups
not the primary focus of randomisation, so participants may be unbalanced within subgroups - stratify randomisation by planned subgroups
risk of bias - pre-specify
overinterpreted
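The cards say only "adjust p value" for multiple comparisons. One simple option, assumed here, is the Bonferroni correction: divide the significance level by the number of subgroup tests. The subgroup names and p values below are hypothetical.

```python
# Hypothetical unadjusted p values from four pre-specified subgroup tests.
p_values = {"men": 0.03, "women": 0.20, "age < 65": 0.004, "age >= 65": 0.60}

alpha = 0.05
threshold = alpha / len(p_values)  # Bonferroni: stricter cut-off per test

# Equivalent view: multiply each p value by the number of tests, capped at 1.
adjusted = {name: min(1.0, p * len(p_values)) for name, p in p_values.items()}
significant = [name for name, p in p_values.items() if p < threshold]

print(f"per-test threshold: {threshold}")
for name, p in adjusted.items():
    print(f"{name}: adjusted p = {p:.3f}")
print("significant after adjustment:", significant)
```

In this made-up example the "men" subgroup would look significant at 0.05 but does not survive the adjustment.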