Survival Analysis Flashcards
What is survival analysis?
- set of statistical methods for analyzing data where the outcome variable is the time until the occurrence of an event of interest
- events could be death, occurrence of disease, married, divorced, etc.
- time can be measured in days, weeks, years
When would “time-to-event” be particularly interesting?
- time until tumour recurrence
- time until cardiovascular death after treatment
- time until AIDS for HIV+ patients
- time until medical school admissions, etc. (other life events)
How are time-to-event studies normally carried out?
- prospective cohort studies
- participants followed over a specified period of time and the focus is on the time at which the event of interest occurs
What is time-to-event?
- variable measuring the elapsed time from a particular starting point to a particular event
What makes survival analysis different?
- survival times are positive numbers (and their distribution is usually right-skewed)
- the probability of surviving past a certain point in time may be of more interest than the “expected” time of event
- the hazard function, used for regression in survival analysis, lends more insight into the “picture” of failure mechanism (can see how people move through the study)
- censoring (some people will reach the end and not experience the event, drop out, or withdraw- “censored”)
What is censoring?
- a particular type of missing data
- observations are called “censored” when the information about their survival time is incomplete
- typically we see “right censoring”
- information about how long they truly survived since we started watching them is unknown
- means that participant does not have event before study ends, lost to follow up, withdraws
- means they did not have the event while observed but we do not know what happened afterwards
What is right censoring?
- people who make it without experiencing the event
- assume that this person’s survival is at least as long as the duration of the study

What is the dependent variable in survival analysis?
-time-to-event and event status (event or no event)
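A minimal sketch of how this two-part outcome could be stored, assuming each participant has a start year, a last-observed year, and a flag for whether the event was seen. The names `Observation` and `make_observation` and the records are hypothetical, for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    time: float  # elapsed time from the starting point to event or censoring
    event: bool  # True = event observed, False = right-censored


def make_observation(start, last_seen, had_event):
    """Build the (time, status) outcome from hypothetical follow-up years."""
    if last_seen < start:
        raise ValueError("last_seen must not precede start")
    return Observation(time=last_seen - start, event=had_event)


# Hypothetical participants: (start year, last year observed, event seen?)
records = [(2001, 2003, True), (2002, 2005, False), (2001, 2005, False)]
observations = [make_observation(*r) for r in records]

for obs in observations:
    status = "event" if obs.event else "censored"
    print(f"followed for {obs.time} years, {status}")
```

A censored participant still contributes a time; the status flag records that the true survival time is at least that long.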
What is survival function?
- for every time, the probability of surviving (not experiencing the event) up to that time
- probability of surviving past a given time period
- estimated as: # of subjects surviving past a given time period / # of subjects in the study at the start of the time period
- as t ranges from 0 to infinity, survival function never increases (survival can’t be more than 1)
- at t=0, probability of survival is 1: S(0)=1
- at t=infinity, S(infinity)=0
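The properties above can be checked on a small example. This is a sketch under a simplifying assumption that is not part of the cards: there is no censoring and everyone is followed from time 0, so S(t) is simply the fraction of subjects still event-free after t. The function name and the event times are made up.

```python
def empirical_survival(event_times, t):
    """Fraction of subjects whose event time is strictly greater than t.

    Assumes no censoring: every subject's event time is known.
    """
    if not event_times:
        raise ValueError("need at least one subject")
    return sum(1 for x in event_times if x > t) / len(event_times)


# Hypothetical event times (in years) for five subjects
times = [1, 2, 2, 3, 5]
curve = [empirical_survival(times, t) for t in range(0, 6)]

for t, s in enumerate(curve):
    print(f"S({t}) = {s:.2f}")
```

The printed values start at 1, never increase, and reach 0 once every subject has had the event.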
What is the hazard function?
-the potential that the event will occur, per time-unit, given that an individual has survived up to that specified time
Estimate the survival functions and interpret them

S(t) 2001 = 0.9: 0.9 probability of surviving through 2001
S(t) 2002 = 0.83: if you survive up until the beginning of 2002, 0.83 probability of surviving until the end of 2002
S(t) 2003 = 0.63: same interpretation, given survival up until the beginning of 2003
S(t) 2004 = 0.11
S(t) 2005 = 0
If you have a 90% chance of surviving to the end of 2001 and, having survived 2001, an 83% chance of surviving to the end of 2002, what is your cumulative probability of surviving to the end of 2002?
- 0.9 x 0.83 = 0.747
- conditional probability; in order to survive until the end of 2002 you have to survive until the end of 2001
- cumulative gives the probability of surviving both time periods
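The same multiplication extends to every year. A short sketch, assuming all of the yearly values listed earlier are conditional probabilities of surviving that year given survival to its start (the cards state this explicitly only for 2002 and 2003):

```python
from itertools import accumulate
from operator import mul

# Yearly conditional survival probabilities from the cards
# (assumed to be conditional on being alive at the start of each year)
conditional = {2001: 0.9, 2002: 0.83, 2003: 0.63, 2004: 0.11, 2005: 0.0}

# Cumulative survival is the running product of the conditional values
cumulative = dict(zip(conditional, accumulate(conditional.values(), mul)))

for year, s in cumulative.items():
    print(f"cumulative survival to end of {year}: {s:.3f}")
```

For 2002 this reproduces the 0.9 x 0.83 calculation, and the running product can only stay level or fall from one year to the next.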

What are assumptions and limitations of a survival curve?
- assumes everyone is recruited at the start of 2001 and followed-up until they die
- in reality, people are recruited at different times so we calculate survival time from recruitment until event occurs or subject is censored
- reframe starting points to be the same and see how long people survived past year 0 even if they were recruited at year 2
- if a subject survives to the end of a period then she is considered to survive past that period (ex: calendar year of 2007 ends on Dec 31st 2007; subject is assumed to survive past the period and be alive in 2008)

How are survival times adjusted for people who enter the study late and can only be followed up for, for example, 1 year?
- when calculating the probability of surviving past 1 year of follow-up, include all participants who have at least 1 year in the study
S(t) 1 year= 0.525
- for year 2, remove people who entered with only 1 year of possible follow-up
S(t) 2 years= 0.461
S(t) 3 years= 0.655

How can this data be interpreted in terms of time-to-event?

- event: pregnancy
- outcome: time to pregnancy
- non-smoking women get pregnant more quickly, so after any given cycle a larger share of smokers will still not be pregnant

What is used to determine if two curves are different?
- log rank test
- p value less than 0.05 means statistically significant
What do survival curves tell us?
- at any given point in time, what is the difference in event rate between two groups of interest
- if the curves lie on top of each other, there is no difference in rate of events between groups
- use this to calculate hazard ratio (interpreted the same way as other ratios)
How can we determine if the two survival curves are different from each other?
- statistically compare them using log rank test
- if p value is less than 0.05 then the curves are statistically significantly different
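A minimal sketch of a two-group log rank test using only the standard library. It compares observed and expected event counts in the first group at each event time and refers the result to a chi-square distribution with 1 degree of freedom. The function name and the data are hypothetical; a real analysis would normally use a statistics package.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log rank test. Returns (chi-square statistic, p value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    obs_minus_exp = 0.0
    variance = 0.0
    # Loop over each distinct time at which at least one event occurred
    for t in sorted({x for x, e, _ in data if e}):
        n_a = sum(1 for x, _, g in data if x >= t and g == 0)  # at risk in A
        n_b = sum(1 for x, _, g in data if x >= t and g == 1)  # at risk in B
        d_a = sum(1 for x, e, g in data if x == t and e and g == 0)
        d_b = sum(1 for x, e, g in data if x == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n  # observed minus expected in A
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))

    if variance == 0:
        raise ValueError("not enough information to compare the groups")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical data: group A has its events early, group B late (1 = event)
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

With these made-up times the groups separate completely, so the p value falls below 0.05; two identical groups would give a statistic of 0 and a p value of 1.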
What is cox regression? Hazard function?
- way to estimate the association of independent variables and time-to-event
- hazard function: instantaneous rate (not a probability) at which the event occurs at time t, given that the individual has survived to time t
What are the numerator and denominator of the hazard function?
- instantaneous rate of occurrence of an event
- numerator: conditional probability that the event will occur in the interval given that it has not occurred before
- denominator: the width of the interval
- denominator: the width of the interval
- dividing one by the other gives us rate of event occurrence per unit of time
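A sketch of that division for a single interval, assuming the conditional probability is estimated as events divided by the number still at risk at the start of the interval. The function name and the counts are hypothetical.

```python
def interval_hazard(n_at_risk, n_events, width):
    """Approximate hazard over one interval: conditional probability / width."""
    if n_at_risk <= 0 or width <= 0:
        raise ValueError("n_at_risk and width must be positive")
    # Numerator: probability of the event in the interval, given survival so far
    conditional_prob = n_events / n_at_risk
    # Denominator: the width of the interval, giving a rate per unit of time
    return conditional_prob / width


# Hypothetical interval: 100 people at risk, 5 events, half a year long
rate = interval_hazard(n_at_risk=100, n_events=5, width=0.5)
print(f"hazard is about {rate:.2f} events per person-year")
```

Because the result is a rate per unit of time rather than a probability, it can exceed 1 when the interval is short.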
What are assumptions made for cox regression?
- hazard functions are proportional; at any point in time, the ratio between the groups’ hazards is the same (the estimated effect has to hold across all points in time)
- when the curves cross, the proportional hazards assumption is violated (then the model can’t be used as is)
- when there is an effect of time, the proportional hazards assumption is violated (over time, the width of the difference between the curves increases; can test for this using Schoenfeld residuals)

What happens if we violate proportional hazards?
- conduct regression stratified by time
- conduct regression separately by group
- include an interaction term for time x group
How are dropouts accommodated?
- life table: based on predetermined study periods (e.g. per cycle, per year)
- Kaplan-Meier (KM): each event starts a new period (KM maximizes use of information because the data are used to define the periods)
- KM has become preferred method in literature
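A minimal sketch of the Kaplan-Meier (KM) calculation, in which each distinct event time starts a new period and censored participants count as at risk until the time they leave. The function name and the data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each participant
    events : 1 if the event was observed, 0 if right-censored
    """
    pairs = list(zip(times, events))
    survival = 1.0
    curve = []
    for t in sorted({x for x, e in pairs if e}):
        # Everyone still under observation at time t is at risk,
        # including participants who are censored later on
        at_risk = sum(1 for x, _ in pairs if x >= t)
        n_events = sum(1 for x, e in pairs if x == t and e)
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up: censored at times 2 and 4 (event flag 0)
times = [1, 2, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(f"S({t}) = {s:.3f}")
```

The censored participants never trigger a drop in the curve, but they shrink the at-risk count for later periods, which is how dropouts are accounted for.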
**go through slideshow examples to calculate parts of each table
What can be observed in this survival curve?

- the two surgery options typically follow the same curve until approx 2 years after diagnosis
- minimally invasive surgery survival is worse than open surgery
- HR is 1.65: over the 4 years after diagnosis, the hazard (rate) of death is 65% higher with the minimally invasive procedure than with open surgery