Chapter 2 Flashcards
What are two key problems with studying mortality by observing a set of lives over their entire lifetimes?
(1) It takes too long: the experiment would take about 100 years to complete.
(2) In practice we would not be able to observe the deaths of all the lives in the sample – censoring. All we know about censored lives is that they died after a certain age.
In what type of data is censoring quite low?
Medical statistics. Non-parametric estimation is very important here.
What are the consequences of shortening the observation period and allowing for censoring?
We no longer observe the same cohort throughout their joint lifetimes, so we need to make some assumptions to ensure that we are sampling from the same distribution.
Explain censoring with examples
Censoring is present when we do not observe the exact length of a lifetime but observe only that its length falls within some interval. Examples: a life emigrates and leaves the study, withdraws consent, or enters the study late.
Name some forms of censoring
Right censoring
Left censoring
Interval censoring
Random censoring
Non-informative censoring
Type 1
Type 2
Explain right censoring
Data are right censored if the censoring mechanism cuts short observations in progress. E.g.: the ending of a mortality investigation before all the lives being observed have died.
Explain left censoring
Data are left censored if the censoring mechanism prevents us from knowing when entry into the state we are observing took place. Example: discovery of a medical condition, where the patient fell ill at an unknown time t between appointments and we know only the diagnosis date.
Explain interval censoring
Data are interval censored if the observational plan only allows us to say that an event of interest fell within some interval. Example: we know only the calendar year of death. Right and left censoring are special cases of interval censoring.
Explain random censoring
Random censoring means that the time at which observation of the ith lifetime is censored is a random variable Ci. The observation is censored if Ci < Ti, where Ti is the random lifetime of the ith life. It is a special case of right censoring.
Explain non-informative censoring
Censoring is non-informative if it gives no information about the future lifetimes {Ti}. If {Ti} and {Ci} are independent, then censoring is non-informative, meaning that censoring happens for a reason unrelated to the lifetimes being studied. We usually make this an assumption.
Explain Type 1 censoring
If the censoring times are known in advance, i.e. the Ci are constants rather than random variables, the mechanism is called Type 1 censoring. Example: the investigation is known to end on a fixed date.
Explain Type 2 censoring
If observation continues until a predetermined number of deaths has occurred, this is Type 2 censoring: the number of deaths is fixed in advance, not random. It is uncommon in mortality studies but common in medical studies.
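The censoring mechanisms described above can be compared on a toy sample. This is a minimal sketch, not a standard library routine: the helper names (`observe`, `type1`, `type2`, `random_censoring`) and the exponential distribution used for the random censoring times are assumptions made for illustration only.

```python
import random

def observe(lifetimes, censor_times):
    """Return (observed_time, died) pairs.

    We see min(T_i, C_i); the observation is censored when C_i < T_i.
    """
    return [(min(t, c), t <= c) for t, c in zip(lifetimes, censor_times)]

def type1(lifetimes, end):
    # Type 1: every C_i is the same constant, known in advance.
    return observe(lifetimes, [end] * len(lifetimes))

def type2(lifetimes, n_deaths):
    # Type 2: stop at the time of the n_deaths-th death
    # (assumes no tied lifetimes at the stopping time).
    stop = sorted(lifetimes)[n_deaths - 1]
    return observe(lifetimes, [stop] * len(lifetimes))

def random_censoring(lifetimes, rng, mean_c):
    # Random censoring: each C_i is itself random.
    # The exponential choice here is purely illustrative.
    cs = [rng.expovariate(1.0 / mean_c) for _ in lifetimes]
    return observe(lifetimes, cs)

lifetimes = [2.0, 5.0, 1.0, 8.0]
print(type1(lifetimes, 4.0))
print(type2(lifetimes, 2))
print(random_censoring(lifetimes, random.Random(1), 4.0))
```

In all three cases the data reduce to a time and a death indicator per life; only the way the censoring times arise differs.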
Explain what the empirical distribution is and the Kaplan-Meier Estimator.
With censored data, the empirical estimate of the survival function is known as the Kaplan-Meier estimator. It summarises the information in the data without assuming a functional form, and it is the non-parametric maximum likelihood estimator of the survival function.
What is another name for the Kaplan-Meier estimator?
Product limit estimator
What are the assumptions in the calculation of the Kaplan-Meier estimator?
- Censoring is non-informative
- The hazard of experiencing the event is 0 at all durations other than those at which an event actually happens
- The hazard of experiencing the event at a particular duration tj at which an event takes place is Dj / Nj, where Dj is the number of events at tj and Nj is the number at risk just before tj
- Lives that are censored are removed from the study at the duration at which censoring takes place or, if this coincides with an event time, directly after the event
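The assumptions above translate directly into the product-limit calculation: at each event time tj the survival estimate is multiplied by (1 - Dj / Nj). The following is a minimal sketch in plain Python; the function name `kaplan_meier` and the (time, died) input layout are assumptions chosen for illustration, and the data are made up.

```python
from collections import Counter

def kaplan_meier(data):
    """data: list of (time, died) pairs, died=False meaning censored.

    Returns a list of (t_j, N_j, D_j, S(t_j)) at each distinct event time.
    """
    deaths = Counter(t for t, died in data if died)
    rows, s = [], 1.0
    for tj in sorted(deaths):
        # At risk just before t_j: everyone still under observation at t_j.
        # A life censored exactly at t_j counts as at risk, i.e. it is
        # removed directly after the event.
        nj = sum(1 for t, _ in data if t >= tj)
        dj = deaths[tj]
        s *= 1 - dj / nj
        rows.append((tj, nj, dj, s))
    return rows

# Hypothetical sample: deaths at 1, 3, 3, 5; censored at 2 and 4.
sample = [(1, True), (2, False), (3, True), (3, True), (4, False), (5, True)]
for row in kaplan_meier(sample):
    print(row)
```

Between event times the estimate stays constant, which is why the Kaplan-Meier curve is a step function.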
How can we compare distributions graphically?
Plot the estimated survival functions with their confidence intervals and see how much they overlap.
Which formula lets us estimate the variance of the Kaplan-Meier estimator, and how well does it work?
Greenwood's formula. It gives a reasonable estimate over most values of t but tends to underestimate the variance in the tails of the distribution.
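Greenwood's formula approximates the variance as S(t)^2 multiplied by the sum, over event times tj <= t, of Dj / (Nj (Nj - Dj)). A minimal sketch follows; the function name, the (Nj, Dj) input layout, the normal-approximation interval with z = 1.96, and the clipping of the interval to [0, 1] are all illustrative assumptions.

```python
import math

def km_with_greenwood(rows, z=1.96):
    """rows: (N_j, D_j) pairs at the ordered event times.

    Returns (S, variance, lower, upper) at each event time.
    """
    s, acc, out = 1.0, 0.0, []
    for nj, dj in rows:
        if dj >= nj:
            # Everyone at risk dies: S drops to 0 and the Greenwood
            # term D_j / (N_j (N_j - D_j)) is undefined, so stop here.
            break
        s *= 1 - dj / nj
        acc += dj / (nj * (nj - dj))
        var = s * s * acc
        half = z * math.sqrt(var)
        # Clip the interval so that it stays a valid probability range.
        out.append((s, var, max(0.0, s - half), min(1.0, s + half)))
    return out

# Hypothetical numbers at risk and deaths at three event times.
for s, var, lo, hi in km_with_greenwood([(6, 1), (4, 2), (1, 1)]):
    print(round(s, 4), round(var, 4), round(lo, 4), round(hi, 4))
```

With so few lives the intervals are very wide, which illustrates why the formula is least reliable in the tails where few lives remain at risk.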
Explain the Nelson-Aalen estimator and what it estimates
Another non-parametric estimator, also based on non-informative censoring. It estimates the integrated (cumulative) hazard rather than the survival function directly.
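The Nelson-Aalen estimate of the integrated hazard at time t is the sum of Dj / Nj over event times tj <= t. A minimal sketch, with the function name and the (time, died) input layout assumed for illustration and made-up data:

```python
from collections import Counter

def nelson_aalen(data):
    """data: list of (time, died) pairs, died=False meaning censored.

    Returns (t_j, cumulative hazard at t_j) at each distinct event time.
    """
    deaths = Counter(t for t, died in data if died)
    cum, out = 0.0, []
    for tj in sorted(deaths):
        # Number at risk just before t_j.
        nj = sum(1 for t, _ in data if t >= tj)
        cum += deaths[tj] / nj
        out.append((tj, cum))
    return out

# Hypothetical sample: deaths at 1, 3, 3, 5; censored at 2 and 4.
sample = [(1, True), (2, False), (3, True), (3, True), (4, False), (5, True)]
print(nelson_aalen(sample))
```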
What are three key assumptions when using maximum likelihood estimation?
- We know the mathematical form of the survival function
- We assume the censoring is non-informative
- We assume deaths are independent of one another
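As an illustration of these assumptions, suppose the survival function has the exponential form S(t) = exp(-λt). That form is an assumption picked for this sketch, not something stated in the cards. Each observed death contributes a density term and each censored life a survival term to the likelihood, and the maximum likelihood estimate is the number of deaths divided by the total time observed.

```python
import math

def exp_loglik(lam, data):
    """Log-likelihood for a constant hazard lam.

    data: list of (time, died) pairs. A death contributes
    log(lam) - lam * t; a censored life contributes -lam * t.
    """
    return sum((math.log(lam) if died else 0.0) - lam * t for t, died in data)

def exp_mle(data):
    # Closed-form maximiser: deaths divided by total time under observation.
    deaths = sum(1 for _, died in data if died)
    exposure = sum(t for t, _ in data)
    return deaths / exposure

# Hypothetical sample: 4 deaths, 18 units of exposure in total.
sample = [(1, True), (2, False), (3, True), (3, True), (4, False), (5, True)]
lam_hat = exp_mle(sample)
print(lam_hat, exp_loglik(lam_hat, sample))
```

Independence of the deaths is what allows the individual contributions to be multiplied (summed on the log scale), and non-informative censoring is what allows the censoring times to be left out of the likelihood.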
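What is λj?
The discrete hazard at tj: the probability that a life dies at exact duration tj, given survival up to tj. It is estimated by Dj / Nj, the proportion of the lives at risk who die at tj.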
What is the Fleming-Harrington estimator?
An estimate of the survival function obtained from the Nelson-Aalen estimate Λ(t) of the integrated hazard: S(t) = exp(−Λ(t)).
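To see how this relates to the Kaplan-Meier estimate, both can be computed side by side from the same numbers at risk and deaths. The sketch below uses an assumed function name and made-up (Nj, Dj) inputs. Because exp(-x) >= 1 - x, the Fleming-Harrington estimate is never below the Kaplan-Meier estimate.

```python
import math

def km_and_fh(rows):
    """rows: (N_j, D_j) pairs at the ordered event times.

    Returns (Kaplan-Meier S, Fleming-Harrington S) at each event time.
    """
    km, cum_hazard, out = 1.0, 0.0, []
    for nj, dj in rows:
        km *= 1 - dj / nj          # product-limit step
        cum_hazard += dj / nj      # Nelson-Aalen step
        out.append((km, math.exp(-cum_hazard)))
    return out

# Hypothetical numbers at risk and deaths at three event times.
for km, fh in km_and_fh([(6, 1), (4, 2), (1, 1)]):
    print(round(km, 4), round(fh, 4))
```

The two estimates are close when Dj / Nj is small and diverge when few lives remain at risk; in particular the Fleming-Harrington estimate never reaches exactly zero.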
How can you use the Kaplan-Meier estimator to determine whether survival probabilities are the same as those of another experience?
We can estimate the variance of the Kaplan-Meier estimates, usually with Greenwood's formula, and so draw confidence intervals around the estimated survival function for a group of policyholders over the period. We know the survival function of the typical policyholder, so if that curve falls outside the, say, 95% confidence interval around the group's estimate, we have evidence that the two groups experience different mortality rates.
Alternatively, we could use simulation.
Run random simulations of the number of lives surviving the year, based on the actual exposure times (in months). Estimate the probability of observing a result at least as extreme as the one we actually observed, assuming population mortality, and check whether it is small (under 5%, say). If so, we conclude that the two experiences have different mortality at that significance level.
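The simulation approach can be sketched as follows. Everything specific here is an assumption made for illustration: the function name, exposures expressed as fractions of a year, a constant force of mortality within the year (so the death probability over exposure e is 1 - (1 - q)^e), and a two-sided p-value formed by doubling the smaller tail.

```python
import random

def simulate_p_value(exposures, q_pop, observed_deaths, n_sims=2000, seed=0):
    """Estimate how unusual observed_deaths is under population mortality.

    exposures: time each life was observed, in years (hypothetical layout).
    q_pop: population annual mortality rate.
    """
    rng = random.Random(seed)
    # Assumption: constant force of mortality over the year.
    probs = [1 - (1 - q_pop) ** e for e in exposures]
    low = high = 0
    for _ in range(n_sims):
        deaths = sum(rng.random() < p for p in probs)
        low += deaths <= observed_deaths
        high += deaths >= observed_deaths
    # Two-sided p-value: double the smaller tail, capped at 1.
    return min(1.0, 2 * min(low, high) / n_sims)

# Hypothetical portfolio: 200 lives observed for a full year, q = 5%.
exposures = [1.0] * 200
print(simulate_p_value(exposures, 0.05, observed_deaths=10))
print(simulate_p_value(exposures, 0.05, observed_deaths=30))
```

About 10 deaths are expected here, so observing 10 gives a large p-value, while observing 30 gives a p-value far below 5% and would suggest mortality different from the population's.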