Survival Analysis II Flashcards

1
Q

What is survival analysis?

A

Statistical method used to analyse time-to-event data, estimating cumulative incidence and hazard functions

2
Q

Why extend Kaplan-Meier analysis to regression models?

A

Regression models help adjust for multiple explanatory variables, including continuous ones

3
Q

What is the most commonly used regression model for time-to-event data?

A

Cox Proportional Hazards Regression

4
Q

What is the hazard rate?

A

Instantaneous failure rate at a given time, given that the event has not yet occurred
Accounts for the fact that rates change over time: the hazard is treated as a continuous function of time ‘t’, h(t), giving the rate at a specific time point

5
Q

How does the hazard function vary?

A

The hazard function can be constant, increasing, or decreasing over time, depending on the event (e.g., disease progression)

6
Q

What is a hazard ratio (HR)?

A

Compares the hazard rates of two groups and is interpreted similarly to odds, risk, or rate ratios

7
Q

When is it reasonable to calculate an HR?

A

When the proportional hazards assumption holds, meaning the effect of each covariate on the hazard remains constant over time

8
Q

What is the key assumption of the Cox model?

A

Proportional hazards - HRs are constant over time

9
Q

What is the general form of the Cox model?

A

hi(t) = h0(t) × exp(β1xi1 + β2xi2 + … + βnxin)

10
Q

What do the terms represent in the Cox model?

A
  • hi(t): Hazard function for individual i (e.g., the hazard of dying at time ‘t’, which depends on the baseline hazard function and the covariates)
  • h0(t): Baseline hazard function (can take any form and is estimated from the data - non-parametric)
  • x1, x2, …, xn: Covariates
  • β1, β2, …, βn: Estimated effects of covariates (assumed constant over time - proportional hazards assumption; parametric)
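
A minimal sketch of how these terms combine, assuming made-up coefficients and baseline hazard values (the function name cox_hazard and all numbers are illustrative, not from a fitted model). It shows that the ratio of two individuals' hazards does not depend on the baseline hazard:

```python
import math

def cox_hazard(baseline_hazard, betas, covariates):
    """Hazard for one individual: h_i(t) = h_0(t) * exp(sum of beta_k * x_ik)."""
    linear_predictor = sum(b * x for b, x in zip(betas, covariates))
    return baseline_hazard * math.exp(linear_predictor)

# Hypothetical coefficients (log hazard ratios), not taken from any real fit:
# one for sex (1 = woman, 0 = man) and one for age in decades.
betas = [math.log(1.83), math.log(0.79)]

woman = [1, 4.0]  # woman aged 40
man = [0, 4.0]    # man aged 40

# Try several assumed values of the baseline hazard h_0(t).
ratios = [cox_hazard(h0, betas, woman) / cox_hazard(h0, betas, man)
          for h0 in (0.01, 0.05, 0.20)]
print([round(r, 2) for r in ratios])
```

Whatever value h0(t) takes, it cancels in the ratio, which is why the Cox model can leave the baseline hazard unspecified and still report a single HR per covariate.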
11
Q

Scenario: What does an HR of 1.89 for women vs. men mean? Including age decreases the HR to 1.83, what does this mean? (failure = treatment failure)

A

Women had an 89% increased hazard of treatment failure compared to men
After adjusting for age, the HR for women decreases slightly to 1.83, suggesting age confounded the association, but only weakly

12
Q

Scenario: What does an HR of 0.79 for age (per 10 years) indicate? (failure = treatment failure)

A

Each additional 10 years reduces the hazard of treatment failure by 21%
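
As a rough illustration, an HR quoted per 10 years can be rescaled to other age differences, assuming (as the Cox model does for a continuous covariate entered linearly) that the effect is linear on the log hazard scale. The helper rescale_hr is a hypothetical name used only for this sketch:

```python
import math

def rescale_hr(hr, from_units, to_units):
    """Convert an HR quoted per `from_units` of a covariate to an HR per `to_units`."""
    log_hr_per_unit = math.log(hr) / from_units
    return math.exp(log_hr_per_unit * to_units)

hr_per_10_years = 0.79
hr_per_1_year = rescale_hr(hr_per_10_years, 10, 1)
hr_per_20_years = rescale_hr(hr_per_10_years, 10, 20)

print(f"per 1 year:   {hr_per_1_year:.3f}")
print(f"per 20 years: {hr_per_20_years:.3f}")
```

The 20-year HR is simply 0.79 squared (0.6241), because hazard ratios multiply rather than add.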

13
Q

What are the two key assumptions in Cox regression?

A
  1. Non-informative censoring (censoring independent of event occurrence)
  2. Proportional hazards (hazard ratios remain constant over time)
14
Q

How can proportional hazards be checked graphically?

A
  • Kaplan-Meier curves by groups
  • Log-log plots (parallel lines indicate proportional hazards)
15
Q

What is a formal test for proportional hazards?

A

Schoenfeld residuals test in Stata

16
Q

What are two strategies when proportional hazards don’t hold?

A
  1. Stratified Cox regression (separate baseline hazard functions for groups)
  2. Introduce time-varying covariates (interaction between covariate and time)
17
Q

How can follow-up time be split?

A

Divide into periods where hazards remain proportional (e.g., first 5 years vs. later periods)

18
Q

Why might time-varying covariates be needed?

A

Some variables (e.g., CD4 count, employment status) change over time and affect risk

19
Q

How are time-updated covariates handled in survival data?

A

Split records to reflect changes over time

20
Q

How is the baseline hazard function estimated in Stata?

A

stcox with basesurv(), basehc(), and basech() options

21
Q

How can the survival function be plotted?

A

stcurve, survival in Stata, optionally specifying covariate values e.g.:
stcurve, survival at1(age=4 basecd4=2.00) at2(age=4 basecd4=3.50)

22
Q

Main takeaways:

A
  • Cox models provide HRs adjusting for multiple covariates
  • Proportional hazards assumption must be verified
  • Non-informative censoring is crucial
  • Time-varying covariates and stratification can handle non-proportional hazards
  • Baseline hazard function can be estimated and visualised
23
Q

What is the Kaplan-Meier method?

A

Estimates the survival function (and hence cumulative incidence) from time-to-event data, accounting for differing follow-up periods and the fact that not everyone experienced the event
It is not easy to include continuous explanatory variables with the Kaplan-Meier method
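
A minimal sketch of the Kaplan-Meier (product-limit) calculation on made-up data, to show how censored people still count in the risk set until they leave. The function name kaplan_meier and the data are assumptions for illustration only:

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each person
    events : 1 if the event was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Everyone still under follow-up at time t is at risk.
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if failed:
            survival *= 1 - failed / at_risk
            curve.append((t, survival))
    return curve

# Made-up follow-up data (years); 0 marks a censored observation.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(t, round(s, 3))
```

The person censored at 2 years contributes to the denominator at times 1 and 2 but not afterwards, which is how differing follow-up is accounted for.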

24
Q

Properties of Cox Proportional Hazard model:

A
  • Concerned with the hazard of an event occurring, dependent on the explanatory variables in the model
  • Regression coefficients from the Cox model are on the log scale
  • Exponentiate coefficient to obtain a HR (sometimes known as rate ratios or risk ratios)
  • Hazards need to be proportional
25
Q

What does an incidence rate represent?

A

The propensity to experience an event, assuming the rate of events remains constant

26
Q

What is a hazard rate?

A
  • The hazard rate/hazard function is the instantaneous rate of failure at a given time t, conditional on survival up to that time.
    It represents the likelihood that an event (e.g., failure, death) occurs in an infinitesimally small time interval, given that it has not yet happened.
  • The hazard rate can vary over time and is typically expressed as a continuous function ℎ(𝑡)
27
Q

In what circumstance may assuming a constant rate be insufficient?

A

Risk of death after a major operation - risk is highest in the first 24 hours after surgery, then drops sharply, and later increases again
As we need to know how the rate varies, the hazard function is preferred

28
Q

Definition of hazard function:

A

Hazard function at time, t, λ(t) or h(t):
h(t) = lim_δt→0 [P(t ≤ T < t + δt | T ≥ t) / δt]
Where:
- Numerator: Probability of the event happening in the small window from t to t + δt, given you’re still alive at time t and the event hasn’t occurred up to this point
- Denominator: δt, the width of that window (PYs of follow-up), which is shrunk towards zero
- Also known as hazard rate, instantaneous death rate, intensity rate, force of mortality
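
To make the limit concrete, the sketch below assumes a Weibull survival function S(t) = exp(-t^1.5) (an arbitrary choice for illustration) and shows the conditional probability divided by δt approaching the exact hazard as the window shrinks:

```python
import math

SHAPE = 1.5  # assumed Weibull shape; > 1 gives a hazard that rises with time

def survival(t):
    return math.exp(-t ** SHAPE)

def hazard_exact(t):
    # For this Weibull, h(t) = SHAPE * t^(SHAPE - 1).
    return SHAPE * t ** (SHAPE - 1)

def hazard_approx(t, dt):
    """P(t <= T < t + dt | T >= t) / dt for a finite window dt."""
    conditional_prob = (survival(t) - survival(t + dt)) / survival(t)
    return conditional_prob / dt

t = 2.0
for dt in (1.0, 0.1, 0.001):
    print(dt, round(hazard_approx(t, dt), 4))
print("exact", round(hazard_exact(t), 4))
```

With a wide window the approximation is poor, since the risk set shrinks within the window; it converges only as δt tends to zero.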

29
Q

How can the hazard rate in two groups be compared?

A

HR e.g., how much higher is the hazard rate in one group compared to another group?

30
Q

What’s the issue with the proportional hazards assumption?

A

E.g., among newborns, the difference in death rates between boys and girls is negligible; it increases later in life before easing off again. The HR then also depends on time t, which the standard Cox model can’t handle

31
Q

In what ways is the Cox model flexible and in what ways is it not?

A

It lets the rate vary over time, but is strict in the proportional hazards assumption

32
Q

CHD: How would you interpret an HR of 1.69 for heavy smokers vs. non-smokers?

A

This HR is assumed to be constant over time.
At any time point, the hazard of an event for heavy smokers is 1.69 times the hazard for a non-smoker.
The hazard rate (“risk”) of a CHD event for heavy smokers is increased by 69% (compared to non-smokers)

33
Q

Cox model and shape of the underlying hazard function:

A

The Cox model does not make any underlying assumptions about the shape of the underlying hazard function (can be a straight line, up, down etc.). It estimates the underlying hazard function from the data. It assumes proportional hazards for covariates - behaves more like a median than a mean as it doesn’t assume any distribution (non-parametric)

34
Q

Cox regression model (log scale)

A

log[hi(t)] = log[h0(t)] + β1xi1 + β2xi2 + β3xi3 + … + βnxin

35
Q

What type of model is the Cox regression model?

A

Semi-parametric

36
Q

Why is the Cox regression model known to be semi-parametric?

A

Baseline hazard function can take any shape, but we assume the HRs are fixed (one number for the whole follow-up period) and don’t depend on ‘t’

37
Q

What columns are required for survival analysis?

A

Time and event (failure)

38
Q

When first setting up survival analysis, what should you do?

A

Draw a Kaplan-Meier plot with a log-rank test (if applicable) to check for significant differences between groups

39
Q

What happens if the 95% CI crosses one?

A

The difference is not statistically significant at the 5% level (an HR of 1, i.e. no difference, is compatible with the data)
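
A small sketch of how a 95% CI for an HR is usually obtained from a coefficient and its standard error, and how to check whether it includes 1. The coefficients and standard errors below are hypothetical, chosen only for illustration:

```python
import math

def hr_with_ci(beta, se, z=1.96):
    """Return (HR, lower, upper): the Wald 95% CI is built on the log scale."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def ci_includes_one(lower, upper):
    return lower <= 1.0 <= upper

# Hypothetical log hazard ratios and standard errors, not real results.
examples = {
    "covariate A": (math.log(1.89), 0.20),
    "covariate B": (math.log(1.41), 0.25),
}
for name, (beta, se) in examples.items():
    hr, lower, upper = hr_with_ci(beta, se)
    print(name, round(hr, 2), round(lower, 2), round(upper, 2),
          ci_includes_one(lower, upper))
```

In this made-up example the interval for covariate A excludes 1 while the interval for covariate B includes it, so only A would be called significant at the 5% level.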

40
Q

Scenario (HIV treatment failure): HR for women vs. MSM = 1.89 and HR for MSW vs. MSM = 1.41 - how to interpret?

A

Women have an 89% increased hazard of treatment failure (at any time during the follow up), compared to men who have sex with men (MSM) (hazard ratio [HR] = 1.89)
Men who have sex with women (MSW) have a 41% increased rate of treatment failure, compared to MSM (HR = 1.41)
These are unadjusted results…

41
Q

What needs to happen for a variable to be considered a confounder?

A

Associated with both the independent variable (IV) and the outcome, and not on the causal pathway between them

42
Q

How would we know if a variable is a strong confounder?

A

If the association between an IV and an outcome changes substantially after adjusting for it
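
One informal way to quantify the change is the percentage change in the HR after adjustment. The sketch below uses the HRs for women from this deck (1.89 unadjusted, 1.83 age-adjusted); the often-quoted threshold of roughly 10% is a rule of thumb assumed here, not something stated in these cards:

```python
def percent_change(unadjusted_hr, adjusted_hr):
    """Relative change in the HR after adjustment, as a percentage."""
    return 100 * (adjusted_hr - unadjusted_hr) / unadjusted_hr

change = percent_change(1.89, 1.83)
print(f"{change:.1f}%")  # prints -3.2%
```

A change of about 3% is small, which fits with age being at most a weak confounder in that scenario.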

43
Q

Scenario: HR for women vs. MSM = 1.83, MSW vs. MSM = 1.45 (event: treatment failure), age adjusted - how to interpret?

A

After controlling for the effects of age, women have 1.83 times the hazard of treatment failure compared to MSM
Similarly, MSW have 1.45 times the hazard of treatment failure compared to MSM

44
Q

Stata command for Cox regression:

A

stcox <predictor(s)>

45
Q

What command gives the coefficients (log hazard ratios) instead of HRs in Cox regression?

A

stcox <predictor(s)>, nohr

46
Q

What do decisions on what covariates to include primarily depend on?

A

Clinical and theoretical knowledge of the subject

47
Q

What does unadjusted mean?

A

Univariable (looking at the effect of each variable separately)

48
Q

What are the implications of the Cox model being semi-parametric?

A

No assumptions made about the form of the baseline hazard
However, there are still assumptions:
- Non-informative censoring: censoring should be independent of event of interest
- Hazards are proportional over time for each stratum in the model

49
Q

How can you test proportional hazards assumption? (Graphical methods)

A

1a) Comparison of Kaplan-Meier estimates by group
1b) Plot estimated log cumulative baseline hazards for each group against time (curves should be parallel)
1c) Plot minus the log cumulative baseline hazard for each group against log survival time (easier to judge if lines are parallel)

50
Q

Testing ph assumption through 1c)

A

Plot minus the log cumulative hazard for each group against the log of survival time
Need to look for parallel lines
Command: stphplot, strata(<strata>) adjust(<predictor(s)>)

51
Q

What does part of the command do for 1c)?

A

stphplot: Plots -ln{-ln(survival)} curves for each category of a nominal or ordinal covariate vs ln(analysis time).
These are often referred as “log-log” plots
adjust(): Presents curves adjusted to the average values
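
A sketch of why parallel lines indicate proportional hazards: if S1(t) = S0(t)^HR, then the -ln{-ln(survival)} curves differ by the constant ln(HR) at every time. The Weibull baseline and the HR of 1.89 below are assumptions for illustration only:

```python
import math

def baseline_survival(t, shape=1.3):
    """Assumed Weibull baseline, S0(t) = exp(-t^shape); any baseline would do."""
    return math.exp(-t ** shape)

def minus_log_minus_log(s):
    return -math.log(-math.log(s))

HR = 1.89  # hazard ratio between the two groups, held constant over time

gaps = []
for t in (0.5, 1.0, 2.0):
    s_ref = baseline_survival(t)
    s_exposed = s_ref ** HR  # proportional hazards: S1(t) = S0(t)^HR
    gap = minus_log_minus_log(s_ref) - minus_log_minus_log(s_exposed)
    gaps.append(gap)
    print(round(math.log(t), 3), round(gap, 4))
```

The vertical gap is the same at every time point, so the two curves are parallel; curves that converge, diverge or cross suggest the HR is changing over time.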

52
Q

How can you test proportional hazards assumption? Formal test

A

2a. Include an interaction between a covariate and a function of time - log(time) often used, but could be any function of analysis time
2b. Based on residuals e.g., Stata has a test based on a type of residual known as Schoenfeld residuals

53
Q

Interpreting results from 2a:

A

If the interaction between the covariate and time is significant, the HR for that covariate must be changing over time (assumption violated).
A large p-value indicates no evidence that the assumption has been violated

54
Q

Interpreting results from 2b:

A

Tests whether the (scaled) Schoenfeld residuals are unrelated to time, i.e. have zero slope - p-values greater than 0.05 indicate no evidence against proportional hazards

55
Q

Stata command for testing ph assumptions through 2a:

A

stcox <predictor(s)>, tvc(<covariate>) texp(log(_t))

56
Q

What do the parts of the Stata command for testing ph assumptions through 2a mean?

A

tvc (“Time-varying covariates”): Specifies those variables that vary continuously with respect to time i.e., introduces an interaction between covariate and time
texp: Specifies the function of analysis time that should be multiplied by the covariates

57
Q

Stata command for testing ph assumption through 2b:

A

stcox <predictor(s)>, schoenfeld(sch) scaledsch(sca)
estat phtest, log detail

58
Q

What do the parts of the Stata command mean for 2b?

A

estat: Post-estimation command
Need to include “detail” otherwise will only give the global test value

59
Q

How can you stratify the analysis when ph assumption is not met (for categorical variables)?

A

1) Fit separate models for each stratum
2) Allow baseline hazards to vary by group, but assume covariate effects are the same across strata. This allows the underlying hazard function to differ and be non-proportional across groups

60
Q

When ph assumption not met: Method 1

A
  • Both baseline hazard rate and HRs vary by group
  • Will obtain a separate HR for each stratum
    e.g., fit separate Cox models for MSM, MSW and women
61
Q

When ph assumption not met: Method 2

A
  • Underlying baseline hazard is estimated for each stratum
  • Obtain one HR for each covariate
  • There should be no significant interactions between covariate and stratum variable
62
Q

Command for when ph assumption doesn’t hold (method 2)

A

stcox <predictor(s)>, strata(<stratum>)

63
Q

Properties of non-informative censoring assumption

A

Those who drop out of the study are similar to those who remain, both in characteristics and in the chances of having the outcome
All-cause mortality:
- Assumes end of study date not related to mortality
Incident disease ascertained from follow-up visits:
- Assumes drop out from study not related to incident treatment failure (frequently might not hold as those that drop out might have worse health)
- Administrative censoring (follow-up ends as database is closed for analysis) is usually non-informative

64
Q

What does it mean if censoring is informative?

A

Censoring is not independent of event of interest

65
Q

How is employment an example of a time-varying covariate?

A

E.g., a person is employed for 2 years then becomes unemployed. They have a CHD event 6 months later. Employment status is a time-varying covariate

66
Q

How to analyse employment as a time-varying covariate?

A

Person’s record needs to be split into 2 records in the data - one for when they are employed and one for when they are unemployed
E.g. emp status can be coded as 1 (employed) and 2 (unemployed) with a time and event indicator for each one
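
A minimal sketch of the record split, using the example of someone employed for 2 years who then becomes unemployed and has a CHD event 6 months later. The function split_record and its argument names are hypothetical, chosen only for this illustration:

```python
def split_record(entry, exit_time, event, change_time, before, after):
    """Split one person's follow-up at the time a covariate changes.

    Returns rows of (start, stop, event, covariate value). The event can only
    belong to the final interval, so the earlier row is always censored.
    """
    if change_time >= exit_time:
        # The change happens after follow-up ends: one row, old value.
        return [(entry, exit_time, event, before)]
    if change_time <= entry:
        # The change happened before follow-up starts: one row, new value.
        return [(entry, exit_time, event, after)]
    return [
        (entry, change_time, 0, before),
        (change_time, exit_time, event, after),
    ]

# Employed (1) for 2 years, then unemployed (2); CHD event 6 months later.
rows = split_record(entry=0.0, exit_time=2.5, event=1,
                    change_time=2.0, before=1, after=2)
for row in rows:
    print(row)
```

The first row contributes 2 years of event-free time as employed; the second contributes 0.5 years as unemployed and carries the event.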

67
Q

Survival function definition:

A

Survival function at time t, S(t):
S(t) = P(T > t)
Where T is the time the event occurs
Note: h(t) = -d/dt(log S(t))

68
Q

Cox regression model for survivor function

A

Si(t) = [S0(t)]^exp(β1xi1+β2xi2+β3xi3+…+βnxin)
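
A short sketch of this relationship: an individual's survival is the baseline survival raised to the power exp(linear predictor). The baseline survival values and the HR of 2 are made up for illustration, and individual_survival is a hypothetical helper name:

```python
import math

def individual_survival(baseline_survival, betas, covariates):
    """S_i(t) = S_0(t) ** exp(linear predictor)."""
    linear_predictor = sum(b * x for b, x in zip(betas, covariates))
    return baseline_survival ** math.exp(linear_predictor)

# Hypothetical baseline survival at three time points and one log hazard ratio.
baseline = [0.95, 0.85, 0.70]
betas = [math.log(2.0)]  # HR of 2 for the exposed group

exposed = [individual_survival(s0, betas, [1]) for s0 in baseline]
unexposed = [individual_survival(s0, betas, [0]) for s0 in baseline]
print([round(s, 4) for s in exposed])
print([round(s, 4) for s in unexposed])
```

With all covariates at zero the individual's survival equals the baseline; with an HR of 2 the baseline survival is squared, so the exposed curve always lies below it.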

69
Q

What can we do to follow-up time if ph assumption doesn’t hold?

A

Split follow-up times e.g., first 5 years, next 5-10 years - if hazards proportional within these bands

70
Q

Alternative survival models if ph assumption not met:

A

Accelerated failure time model, proportional odds model and a Cox regression model with a time-dependent variable

71
Q

What is a competing risk?

A

Consider a study in which the outcome is time to death from an AIDS-related condition
Those that die from non-AIDS causes cannot then go on to experience the event of interest; it is a competing risk