Stats questions Flashcards

Question 1

Q

How would you handle missing data?

Answer

A

- Exclude records with missing values from subgroup analyses.
- Missing values for the risk factors were imputed with multiple imputation by chained equations (MICE), creating 10 imputed datasets, and Rubin’s rules were used to combine the model estimates across the datasets. The imputation model included all risk factors and outcomes used in the analyses, including time to death/censoring.
- A further limitation of using routinely collected national data is the impact of missing data. Although data completeness was generally good, around 16% of patients had no recorded performance status and 15% had missing pretreatment staging. The proportion of patients with missing pretreatment staging was higher in the IBD group (20.8% vs. 14.6%). A multiple imputation model including all risk factors and outcomes was used to maximize the analysis cohort and reduce bias.
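
As a revision aid, the pooling step in Rubin's rules can be sketched in a few lines. This is a minimal sketch, not the analysis code behind the answer above: the function name `pool_rubin` and the example numbers are hypothetical, and it assumes one coefficient estimate and its squared standard error from each imputed dataset.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error of that coefficient in each dataset
    Returns (pooled_estimate, total_variance, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total, sqrt(total)


# Hypothetical log odds ratios and variances from 10 imputed datasets
est = [0.41, 0.39, 0.44, 0.40, 0.42, 0.38, 0.43, 0.41, 0.40, 0.42]
var = [0.010, 0.011, 0.010, 0.012, 0.010, 0.011, 0.010, 0.010, 0.011, 0.010]
pooled, total_var, se = pool_rubin(est, var)
print(pooled, se)
```

The total variance adds the between-imputation spread to the average within-imputation variance, so the pooled standard error reflects the extra uncertainty caused by the missing values.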

Question 2

Q

What are the implications of incomplete data linkage?

Question 3

Q

What are some potential confounders?

What might you include in logistic regression and competing risk models?

Answer

A

Gender
Age (in years, modelled as age and age²)
IMD quintile
Charlson comorbidity score (0, 1, ≥2)
ECOG performance status
Tumour site
Admission type
American Society of Anesthesiologists (ASA) score
TNM/pathological stage
Interaction between age and distant metastases (age again in years, modelled as age and age²)
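
To make the age terms concrete, the sketch below shows how the quadratic age term, the age-by-metastases interaction, and the Charlson grouping might be coded before fitting a model. The helper names (`age_terms`, `charlson_band`) and column names are hypothetical, and the sketch assumes distant metastases is recorded as a yes/no indicator; it is not the original analysis code.

```python
def charlson_band(score):
    """Group a Charlson comorbidity score into the bands 0, 1, >=2."""
    if score < 0:
        raise ValueError("Charlson score cannot be negative")
    if score == 0:
        return "0"
    if score == 1:
        return "1"
    return ">=2"


def age_terms(age_years, distant_metastases):
    """Build the linear and quadratic age terms plus their interactions
    with a distant-metastases indicator (True/False)."""
    mets = 1 if distant_metastases else 0
    age_sq = age_years ** 2
    return {
        "age": age_years,
        "age_sq": age_sq,
        "age_x_mets": age_years * mets,    # zero when no distant metastases
        "age_sq_x_mets": age_sq * mets,    # zero when no distant metastases
    }


# Hypothetical patient: 70 years old, distant metastases, Charlson score 3
row = age_terms(70, True)
row["charlson"] = charlson_band(3)
print(row)
```

Including age² lets the effect of age on the outcome curve rather than stay linear, and the interaction terms let that curve differ for patients with distant metastases.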