Duration Models Flashcards

1
Q

Why duration models?

A

Because they explicitly model time. Timing matters: the question is often not “if” an event will happen but “when” it will happen.

2
Q

Two related questions of interest:

A

1) Analysis and prediction of whether and when an event will happen.
2) Analysis of the effects of covariates on whether and when the event happens.

3
Q

Duration Data

A

Time it takes for the event of interest to happen

4
Q

Hazard rate

A

The probability that the event occurs in time period t, conditional on it not having happened before t.

5
Q

When is an observation censored?

A

An observation is censored if the event did not take place within the observed time period.

6
Q

What do we know about censored observations:

A
  • DO KNOW: the event did not happen within the observation period
  • DON’T KNOW: if and when the event will happen
7
Q

Why use censored data?

A

Models should use as much information as possible. Censored observations are not missing at random, and they contain essential information.

8
Q

Hazard model

A

A model for the hazard rate, which is the probability that an event happens in a time interval given that it has not happened yet.

9
Q

Building blocks of the hazard model:

A
  • Probability density function
  • Cumulative distribution function
  • Survival function
  • Hazard rate
10
Q

Probability density function

A

Probability that the event happens in time interval t (in continuous time, the density of the event time at t).

11
Q

Cumulative distribution function

A

Probability that the event takes place AT or BEFORE time t.

12
Q

Survival function

A

Probability that the event has not happened by time t: S(t) = 1 - F(t), where F is the cumulative distribution function.

13
Q

Hazard rate

A

Conditional probability that the event occurs at t, given that it has not occurred before t.
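
The four building blocks can be derived from one another. The sketch below assumes discrete time periods and a made-up probability list `pmf` (not from the cards); `building_blocks` is a hypothetical helper name. It uses the convention that the cumulative distribution counts events at or before t, and that the hazard conditions on the event not having happened before t.

```python
# Hypothetical example: probability that the event happens in period 1, 2, 3, 4.
pmf = [0.1, 0.2, 0.3, 0.4]


def building_blocks(pmf):
    """Return (cdf, survival, hazard) lists for a discrete duration distribution."""
    cdf, survival, hazard = [], [], []
    cumulative = 0.0
    for p in pmf:
        # Probability that the event has not happened before this period.
        at_risk = 1.0 - cumulative
        # Hazard: probability of the event in this period, given it has not happened yet.
        hazard.append(p / at_risk if at_risk > 0 else float("nan"))
        cumulative += p
        cdf.append(cumulative)              # F(t): event at or before t
        survival.append(1.0 - cumulative)   # S(t) = 1 - F(t)
    return cdf, survival, hazard


cdf, survival, hazard = building_blocks(pmf)
for t, (f, big_f, s, h) in enumerate(zip(pmf, cdf, survival, hazard), start=1):
    print(f"t={t}  f={f:.3f}  F={big_f:.3f}  S={s:.3f}  h={h:.3f}")
```

Note how the hazard rises faster than the density: fewer cases remain at risk in later periods, so the same density implies a larger conditional probability.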

14
Q

Kaplan Meier Survival Function

A

An estimate of the survival function S(t): the probability that the event has not happened up to time period t.

15
Q

Kaplan-Meier estimator

A

Non-parametric estimator computed directly from the observed proportions of surviving cases over the time periods.
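
As an illustration, here is a minimal pure-Python sketch of the estimator. The function name `kaplan_meier` and the small data set are made up for this example. At each distinct event time, the running survival estimate is multiplied by the proportion of at-risk cases that survive; censored cases count as at risk until their censoring time and then drop out.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for each distinct time at which an event was observed.

    durations: time until the event or until censoring.
    observed:  True if the event happened, False if the case was censored.
    """
    data = sorted(zip(durations, observed))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Collect all cases that share this time point.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / n_at_risk
            curve.append((t, survival))
        # Both events and censored cases leave the risk set.
        n_at_risk -= leaving
    return curve


# Hypothetical data: five customers, two of them censored.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
print([(t, round(s, 3)) for t, s in curve])  # [(2, 0.8), (3, 0.6), (5, 0.3)]
```

The censored case at time 8 never triggers a drop in the curve, but it still enlarges the risk set at the earlier event times, which is how censored observations contribute information.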

16
Q

Cox Proportional Hazard model

A

Model for the hazard rate that allows for including:

  • fixed covariates
  • time-varying covariates
17
Q

Cox Proportional Hazard model, formula:

A

h(t) = h0(t) * exp(B1*X1 + ... + Bk*Xk)
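
To make the formula concrete, the sketch below evaluates it for a single time point. The function name `cox_hazard`, the baseline hazard value, and the coefficients are all assumptions chosen for illustration; in practice the coefficients would be estimated from data.

```python
import math


def cox_hazard(baseline_hazard, coefficients, covariates):
    """Hazard at one time point: h0(t) * exp(B1*X1 + ... + Bk*Xk)."""
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_hazard * math.exp(linear_predictor)


# Hypothetical values: baseline hazard at time t and two covariates.
h0_t = 0.05
coefficients = [0.5, -0.2]

reference = cox_hazard(h0_t, coefficients, [0, 0])  # all covariates zero
customer = cox_hazard(h0_t, coefficients, [1, 2])   # X1 = 1, X2 = 2

print(reference)  # equals the baseline hazard, since exp(0) = 1
print(customer)
```

The covariates only scale the baseline hazard up or down; they do not change its shape over time.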

18
Q

Why is Cox Proportional Hazard model popular:

A

1) Partial likelihood approach: no need to specify the baseline hazard function
2) No need to specify the probability distribution
3) Very easy and fast to estimate

19
Q

Parameter interpretation via…

A

The hazard ratio, which is the ratio between two hazard rates.

20
Q

Example of a hazard ratio, in words and as a formula:

A

In a model for churn, X is a dummy variable for gender (1 = female, 0 = male):

Hazard ratio = Hfemale(t) / Hmale(t) = e^(Bgender)

21
Q

If a covariate has coefficient B, a one-unit increase in that covariate changes the hazard rate by:

A

100*(exp(B)-1) %
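
A short worked example of this interpretation follows. The coefficient values are made up, and `hazard_ratio` and `percent_change` are hypothetical helper names. A positive B raises the hazard, a negative B lowers it, and B = 0 leaves it unchanged.

```python
import math


def hazard_ratio(b):
    """Ratio of hazards for a one-unit difference in the covariate: exp(B)."""
    return math.exp(b)


def percent_change(b):
    """Percentage change in the hazard rate: 100 * (exp(B) - 1)."""
    return 100.0 * (math.exp(b) - 1.0)


# Hypothetical coefficients for illustration.
for b in (0.25, 0.0, -0.25):
    print(f"B={b:+.2f}  hazard ratio={hazard_ratio(b):.3f}  change={percent_change(b):+.1f}%")
```

Because the baseline hazard cancels in the ratio, the percentage change applies at every point in time.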

22
Q

Key assumption of the proportional hazard model

A

Proportionality assumption:

  • the ratio of the hazards for different levels of a covariate is constant over time.
  • in other words, the hazard ratio of males vs. females is constant over time.
23
Q

Model fit and selection, via:

A
  • Likelihood ratio test
  • Wald-test
  • Score test
  • Pseudo R2
24
Q

Index of Concordance

A

The fraction of pairs in your data in which the observation with the longer survival time also has the higher survival probability predicted by your model.
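
The definition can be sketched in a few lines of Python. This version makes two assumptions that the card does not state: a pair is only counted when the shorter duration ended in an observed event (otherwise the ordering is unknown because of censoring), and tied predictions count as half concordant. The function name and the data are hypothetical.

```python
def concordance_index(durations, observed, predicted_survival):
    """Fraction of comparable pairs that the model orders correctly."""
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        for j in range(n):
            # Case i has the shorter time and its event was actually observed.
            if durations[i] < durations[j] and observed[i]:
                comparable += 1
                if predicted_survival[i] < predicted_survival[j]:
                    concordant += 1.0   # longer survivor got the higher prediction
                elif predicted_survival[i] == predicted_survival[j]:
                    concordant += 0.5   # tie in predictions (assumed convention)
    return concordant / comparable if comparable else float("nan")


# Hypothetical data: four cases, the third one censored.
durations = [2, 4, 6, 8]
observed = [True, True, False, True]
predicted_survival = [0.2, 0.5, 0.4, 0.9]

c_index = concordance_index(durations, observed, predicted_survival)
print(c_index)  # 4 of the 5 comparable pairs are ordered correctly
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ordering of the comparable pairs.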

25
Q

Pseudo R2

A

Percentage of improvement in log-likelihood of the full model, relative to the null model.
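
As a worked example, the sketch below assumes McFadden's formulation of the pseudo R2 (one of several variants; the card does not say which is meant) and also computes the likelihood ratio test statistic from the same two log-likelihoods. The log-likelihood values are made up.

```python
def pseudo_r2(ll_full, ll_null):
    """McFadden-style pseudo R2: relative improvement in log-likelihood."""
    return 1.0 - ll_full / ll_null


def likelihood_ratio_statistic(ll_full, ll_null):
    """Likelihood ratio test statistic: 2 * (LL_full - LL_null)."""
    return 2.0 * (ll_full - ll_null)


# Hypothetical log-likelihoods of the null model and the full model.
ll_null = -200.0
ll_full = -150.0

r2 = pseudo_r2(ll_full, ll_null)
lr = likelihood_ratio_statistic(ll_full, ll_null)
print(r2)  # 0.25
print(lr)  # 100.0
```

Here the full model improves the log-likelihood by 25% relative to the null model.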