Unit 4: Survival Analysis Flashcards
What is Survival Analysis?
Survival Analysis, or Time-to-Event Analysis, is the study of the time until an event of interest occurs.
What are two common characteristics of Survival Analysis data?
- Waiting times (i.e., time-to-event) generally have a right-skewed distribution
- Some observations are incomplete, or censored - in other words, the event of interest is not observed during the study period
Define censored data
Censored data are incomplete observations: the subject's exact event time is not observed during the study period.
What are the three types of censored data?
- Left-censoring
- Right-censoring
- Interval censoring
(Censored observations can still contribute to the estimates of survival – up until the time at which they are censored)
Define right-censored data
Right-Censoring: the event occurs after the last time the subject is observed
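As a small illustration of the right-censoring definition above, the sketch below turns true event times into what a study would actually record. The function name `observe`, the study end time, and the data are all hypothetical, chosen only for this example; `None` stands for a subject who never has the event.

```python
def observe(event_time, study_end):
    """Return (observed_time, event_observed) for one subject.

    If the event happens after follow-up ends (or never), the subject is
    right-censored: we only know the event time exceeds study_end.
    """
    if event_time is not None and event_time <= study_end:
        return event_time, True
    return study_end, False


# Hypothetical true event times in years (None = event never occurs)
true_event_times = [2.0, 5.5, None, 9.0]
study_end = 6.0

observations = [observe(t, study_end) for t in true_event_times]
# The last two subjects are recorded as censored at 6.0 years
print(observations)
```

The pair (time, event indicator) is the usual way right-censored data are stored before fitting a survival model.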
Define left-censored data
Left-Censoring: the event occurs before the first time the subject is observed
Define interval censoring
Interval Censoring: the event occurs between two observation times
What is the survival function?
S(t) = Pr(T > t), where T is the time until the event of interest occurs
The survival function provides the probability that an individual will survive beyond a given time t - in other words, the probability that an individual does not experience the event of interest until after time t.
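To make S(t) = Pr(T > t) concrete, here is a minimal sketch that estimates it as the fraction of subjects whose event time exceeds t. It assumes there is no censoring at all, which is rarely true in practice; the function name and the data are hypothetical.

```python
def empirical_survival(event_times, t):
    """Estimate S(t) = Pr(T > t) as the share of event times greater than t.

    Only valid when every event time is fully observed (no censoring).
    """
    return sum(1 for x in event_times if x > t) / len(event_times)


# Hypothetical, fully observed event times (e.g. months)
event_times = [1, 3, 4, 7, 10]

s_at_4 = empirical_survival(event_times, 4)  # 2 of 5 subjects survive beyond 4
print(s_at_4)
```

When some observations are censored, this simple proportion is no longer appropriate and an estimator built for censored data is needed.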
What is the Kaplan-Meier Estimator?
An estimator of the survival function that is specifically designed for right-censored data and assumes uninformative censoring (the censoring is unrelated to the event time).
(Censored observations can still contribute to the estimates of survival – up until the time at which they are censored)
This is also known as the product-limit method.
It is non-parametric.
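The product-limit idea can be sketched in a few lines: at each distinct event time, multiply the running survival estimate by (1 - deaths / number at risk). This is a minimal illustration, not a replacement for a statistics package; the function name `kaplan_meier` and the data are hypothetical, and `events` uses 1 for an observed event and 0 for a right-censored observation.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose observed time equals t
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Censored subjects counted in the risk set up to t, then drop out
        n_at_risk -= leaving
    return curve


# Hypothetical data: subjects at times 2 and 5 are censored
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

Note how the subject censored at time 2 still counts in the risk set at time 1, but not at time 3: censored observations contribute up until the time at which they are censored.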
What does it mean to use nonparametric survival analysis techniques?
It means that we do not make any assumptions about the distribution of the outcome of interest (e.g., chi-square, t, normal, etc.).
What 3 types of missing data might a study have?
Types of Missing Data:
1. Missing completely at random (MCAR)
The probability that a predictor value is missing is unrelated to that value or to the value of any other variable.
2. Missing at random (MAR)
The probability that a predictor value is missing is unrelated to that value after controlling for another variable.
3. Missing not at random (MNAR)
The probability that a predictor value is missing depends on the missing value itself.
What are 4 approaches to missing data?
Missing Data Approaches:
- Complete-case analysis
- Available-case analysis
- Mean substitution
- Imputation (typically for missing Y)
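The first three approaches in the list above can be sketched on a tiny hypothetical dataset. The column names, values, and helper function names are made up for illustration; `None` marks a missing value.

```python
# Hypothetical records with missing values
rows = [
    {"age": 50, "bmi": 24.0},
    {"age": 61, "bmi": None},
    {"age": None, "bmi": 30.0},
    {"age": 45, "bmi": 27.0},
]


def complete_cases(rows):
    """Complete-case analysis: keep only rows with no missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def available_case_mean(rows, column):
    """Available-case analysis: use every row where this column is observed."""
    observed = [r[column] for r in rows if r[column] is not None]
    return sum(observed) / len(observed)


def mean_substitute(rows, column):
    """Mean substitution: replace missing values with the observed mean."""
    mean = available_case_mean(rows, column)
    return [{**r, column: mean if r[column] is None else r[column]} for r in rows]


print(len(complete_cases(rows)))         # rows usable in a complete-case analysis
print(available_case_mean(rows, "bmi"))  # mean BMI from the three observed values
print(mean_substitute(rows, "bmi")[1])   # second row with BMI filled in
```

Complete-case analysis discards two of the four rows here, while available-case analysis keeps three values per column; mean substitution keeps every row but understates the variability of the filled-in column.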
What test can you use to compare survival curves for different groups of observations (e.g. treatment vs placebo)?
The Log Rank Test
Hypotheses:
H0: S1(t) = S2(t) for all t
H1: S1(t) ≠ S2(t) for at least one t
*Use SAS PROC LIFETEST for the log rank test
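For intuition about what the log rank test computes, here is a minimal two-group sketch in Python using only the standard library. It is not how SAS implements the test; the function name and the data are hypothetical. At each event time it compares the observed events in group 1 with the number expected under H0, then refers the resulting statistic to a chi-square distribution with 1 degree of freedom.

```python
import math


def logrank_test(times1, events1, times2, events2):
    """Two-group log rank test. Returns (chi-square statistic, p-value)."""
    all_times = list(times1) + list(times2)
    all_events = list(events1) + list(events2)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for x in times1 if x >= t)  # at risk in group 1
        n2 = sum(1 for x in times2 if x >= t)  # at risk in group 2
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for these data")
    chi2 = observed_minus_expected ** 2 / variance
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical data: 1 = event observed, 0 = right-censored
treatment_times = [6, 7, 10, 15, 19, 25]
treatment_events = [1, 1, 0, 1, 1, 0]
placebo_times = [1, 2, 3, 5, 8, 9]
placebo_events = [1, 1, 1, 1, 1, 1]

chi2, p_value = logrank_test(
    treatment_times, treatment_events, placebo_times, placebo_events
)
print(chi2, p_value)
```

A small p-value is evidence against H0, i.e. evidence that the two survival curves differ.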
How can you check the proportional hazards assumption?
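Plot the log-log survival (LLS) curves, log(-log S(t)), for each group. If the lines in the LLS plot are roughly parallel and do not cross, the proportional hazards assumption is reasonable.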
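Why the log-log plot works can be shown numerically. Under proportional hazards with hazard ratio HR, the survival functions satisfy S2(t) = S1(t)^HR, so log(-log S2(t)) = log(HR) + log(-log S1(t)): the two curves differ by a constant vertical gap and therefore never cross. The sketch below checks this with hypothetical survival probabilities and an assumed hazard ratio of 2.

```python
import math


def log_log_survival(s):
    """Return log(-log(s)) for a survival probability strictly between 0 and 1."""
    if not 0.0 < s < 1.0:
        raise ValueError("survival probability must be strictly between 0 and 1")
    return math.log(-math.log(s))


assumed_hazard_ratio = 2.0  # hypothetical value for illustration

# Hypothetical survival probabilities for group 1 at four time points
group1_survival = [0.9, 0.7, 0.5, 0.3]
# Group 2 survival implied by proportional hazards
group2_survival = [s ** assumed_hazard_ratio for s in group1_survival]

gaps = [
    log_log_survival(s2) - log_log_survival(s1)
    for s1, s2 in zip(group1_survival, group2_survival)
]
print(gaps)  # every gap equals log(2) up to rounding
```

In real data the gap will not be perfectly constant; the question is whether the curves look approximately parallel.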
What is the proportional hazard regression?
The PH regression model is a survival analysis technique that can account for multiple covariates. The PH regression model allows for the interpretation of a single variable, controlling for the effects of the other covariates in the model.
The validity of the proportional hazards regression relies on the condition that the hazards are, in fact, proportional among the groups - this can be checked by looking at the correlation of the Schoenfeld residuals with time (a correlation near zero is consistent with proportional hazards).
What is the hazard ratio?
The hazard ratio is a factor that compares the rate of event occurrence between two groups. The null value 1 indicates the hazard is the same for the groups of interest.
How can you obtain the hazard ratios?
Exponentiate the parameter estimates from the PH regression model: HR = exp(beta).
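A short sketch of the exponentiation step, with a Wald-type 95% confidence interval added for context. The coefficient and standard error below are hypothetical numbers, not output from any real model, and the function names are made up for this example.

```python
import math


def hazard_ratio(beta):
    """Convert a PH regression coefficient to a hazard ratio."""
    return math.exp(beta)


def hazard_ratio_ci(beta, standard_error, z=1.96):
    """Wald interval: exponentiate beta +/- z * SE (z = 1.96 for about 95%)."""
    return (
        math.exp(beta - z * standard_error),
        math.exp(beta + z * standard_error),
    )


# Hypothetical parameter estimate and standard error for a treatment indicator
beta_hat = -0.5
se_hat = 0.2

hr = hazard_ratio(beta_hat)
lower, upper = hazard_ratio_ci(beta_hat, se_hat)
print(hr, lower, upper)
```

A coefficient of 0 gives a hazard ratio of 1 (no difference); a negative coefficient gives a hazard ratio below 1, meaning a lower hazard in the group coded 1.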
When calculating the probability of the event within the first x years, what do you do?
1 - p(x), where p(x) is the estimated probability of surviving beyond x years