Chapter 7 Flashcards
Define a covariate
A covariate is any quantity recorded in respect of each life,
such as age, sex, smoker status, type of treatment, etc.
When do we use non-parametric estimates?
If the covariates partition the population into a small
number of homogeneous groups, then non-parametric estimates (e.g., Kaplan-Meier) for each group can be compared
What is a more direct and transparent way to construct a model than non-parametric methods?
A regression model
Give examples of how covariates can be measured
- direct measurement
- indicator variable
- quantitative interpretation of a measurement, e.g. a scale
What does zi denote?
Vector of covariates for the ith life
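A minimal sketch of how the three kinds of measurement might be packed into one covariate vector. The function name, the field names and the example values are hypothetical and chosen only for illustration.

```python
def covariate_vector(age_years, is_smoker, severity_score):
    """Return z_i = (z_i1, z_i2, z_i3) for one life (illustrative fields only)."""
    return (
        float(age_years),           # direct measurement
        1.0 if is_smoker else 0.0,  # indicator variable
        float(severity_score),      # quantitative reading of a scale, e.g. 1 to 5
    )


# Hypothetical life: aged 52, smoker, severity 3 on an assumed 1-5 scale.
z_i = covariate_vector(age_years=52, is_smoker=True, severity_score=3)
print(z_i)
```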
What is the most widely used regression model
Proportional hazards (PH) model. This model is also known as the Cox model after its originator. It can
help us to identify the factors that influence the relative
levels of mortality between members of a population.
How do covariates act on the baseline hazard
Multiplicatively
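A minimal sketch of the multiplicative action, assuming the usual PH form in which the hazard is the baseline hazard times exp(β·z). The constant baseline and the parameter values below are made-up assumptions, not taken from these notes.

```python
import math


def hazard(t, z, beta, baseline_hazard):
    """Baseline hazard at time t multiplied by exp(beta . z)."""
    linear_predictor = sum(b * x for b, x in zip(beta, z))
    return baseline_hazard(t) * math.exp(linear_predictor)


def constant_baseline(t):
    """Hypothetical baseline hazard; any non-negative function of t would do."""
    return 0.01


beta = (0.5, -0.2)  # assumed regression parameters
print(hazard(10.0, (0.0, 0.0), beta, constant_baseline))  # all covariates zero: baseline only
print(hazard(10.0, (1.0, 0.0), beta, constant_baseline))  # baseline scaled by exp(0.5)
```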
Assuming the ith covariate takes only positive values, if the ith regression parameter is positive, what is the effect on the hazard?
If the ith regression parameter is positive, the hazard rate
increases with the ith covariate. The greater the magnitude of the parameter, the greater the effect
Assuming the ith covariate takes only positive values, if the ith regression parameter is negative, what is the effect on the hazard?
If the ith regression parameter is negative, the hazard rate
decreases with the ith covariate. The greater the magnitude of the parameter, the greater the effect
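A small illustration of the sign effect, assuming the ith covariate contributes a factor exp(βi × zi) to the hazard. The parameter values ±0.3 and the covariate values are arbitrary.

```python
import math


def hazard_multiplier(beta_i, z_i):
    """Factor exp(beta_i * z_i) contributed by the ith covariate."""
    return math.exp(beta_i * z_i)


# Positive beta: factors rise with z_i. Negative beta: factors fall with z_i.
for beta_i in (0.3, -0.3):
    factors = [round(hazard_multiplier(beta_i, z_i), 3) for z_i in (1, 2, 3)]
    print(beta_i, factors)
```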
Under the Cox model, how does one compare the hazards of different lives?
Under the Cox model the hazards of different lives with
covariate vectors z1 and z2 are in the same proportion at all
times, so the general shape of the hazard is determined by the baseline hazard while the exponential terms account for the differences between lives
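As a worked step, assuming the usual PH form λ(t; z) = λ0(t) exp(β z^T) (the notation λ(t; z) is introduced here for illustration), the ratio of the two hazards does not depend on t because the baseline hazard cancels:

```latex
\frac{\lambda(t; z_1)}{\lambda(t; z_2)}
  = \frac{\lambda_0(t)\,\exp(\beta z_1^{T})}{\lambda_0(t)\,\exp(\beta z_2^{T})}
  = \exp\{\beta\,(z_1 - z_2)^{T}\}
```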
Why is the Cox model termed a semi-parametric approach?
We are not primarily concerned with the precise form of the baseline hazard but with the effects of the covariates,
so we can ignore λ0(t) and estimate β from the data irrespective of the shape of the baseline hazard. This is
termed a “semi-parametric” approach
What is the standard error (s.e.) of β?
Often a fitted Cox model will report the estimate of each β from the data and also its standard error. The standard error is the (estimated) standard deviation of the sampling distribution of that parameter estimate
What is significant about the estimates of β?
As the sample size on which the estimate of β is based tends to infinity, the sampling distribution of the estimate is asymptotically
Normal (a central-limit-type result for maximum likelihood estimators)
How do we check whether β is significant?
Examine the confidence interval: if 0 is included in the interval, β is not significantly different from zero, meaning there is no evidence that the covariate affects the hazard
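A minimal sketch of the check, assuming an approximate 95% interval of the form estimate ± 1.96 × standard error (which relies on the asymptotic Normality of the estimate). The numerical inputs are made up.

```python
def confidence_interval(beta_hat, std_error, z_value=1.96):
    """Approximate 95% confidence interval for beta (Normal approximation)."""
    half_width = z_value * std_error
    return beta_hat - half_width, beta_hat + half_width


def is_significant(beta_hat, std_error, z_value=1.96):
    """True if 0 lies outside the confidence interval."""
    lower, upper = confidence_interval(beta_hat, std_error, z_value)
    return not (lower <= 0.0 <= upper)


# Hypothetical fitted values: (estimate, standard error).
for beta_hat, std_error in [(0.5, 0.1), (0.1, 0.1)]:
    print(confidence_interval(beta_hat, std_error), is_significant(beta_hat, std_error))
```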
How does one estimate β?
To estimate β it is usual to maximise the partial
likelihood. Note that the baseline hazard cancels out (hence partial) and the partial likelihood depends only on the order in which deaths are
observed.
Describe the partial likelihood in words
The force of mortality of the jth life (the one that dies) divided by the total force of mortality of all lives at risk at that time. In other words, the partial likelihood gives the comparative risk of
a particular individual dying, given that a death occurs
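In symbols, a sketch of the partial likelihood under the assumption of k distinct death times t1, …, tk with exactly one death at each (no ties). Here zj is the covariate vector of the life dying at tj and R(tj) is the risk set at tj; the symbol k is introduced for illustration.

```latex
L(\beta) = \prod_{j=1}^{k}
  \frac{\exp(\beta z_j^{T})}{\sum_{i \in R(t_j)} \exp(\beta z_i^{T})}
```

The baseline hazard λ0(tj) appears in both numerator and denominator of each factor, which is why it cancels.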
What's the difference between the Kaplan-Meier method and the Cox method?
Unlike the Kaplan-Meier method the partial likelihood
considers observed deaths only, not the times at which the
deaths occurred nor any censoring observed between deaths
What two kinds of ties can be present in the data, preventing straightforward maximisation of the partial likelihood?
(a) some dj > 1, i.e. more than one death is observed at the same time, or
(b) some observations are censored at an observed
death time
How do we deal with the problem of observations censored at an observed death time?
By including the lives on whom
observation was censored at time tj in the risk set R(tj),
effectively assuming that censoring occurs just after the
deaths were observed.
How do we deal with the problem where dj > 1 and we can’t maximise the partial likelihood directly?
Use Breslow’s approximation
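A minimal sketch of the log partial likelihood under Breslow's approximation, for a single scalar covariate. With dj deaths at time tj, each death time contributes β × (sum of the covariates of the lives dying at tj) minus dj × log(sum of exp(β zi) over the risk set). Lives censored at tj are kept in the risk set, as in the earlier card. The function name and the toy data are illustrative assumptions.

```python
import math


def breslow_log_partial_likelihood(beta, times, events, z):
    """Log partial likelihood with Breslow's handling of tied deaths.

    times  : observed time for each life
    events : 1 if the life died at that time, 0 if censored
    z      : scalar covariate for each life
    """
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    log_lik = 0.0
    for tj in death_times:
        # Risk set R(tj): everyone with observed time >= tj, which keeps
        # lives censored exactly at tj in the risk set.
        risk_sum = sum(math.exp(beta * zi) for ti, zi in zip(times, z) if ti >= tj)
        deaths_z = [zi for ti, ei, zi in zip(times, events, z) if ti == tj and ei == 1]
        d_j = len(deaths_z)   # number of deaths at tj
        s_j = sum(deaths_z)   # sum of covariates of the lives dying at tj
        log_lik += beta * s_j - d_j * math.log(risk_sum)
    return log_lik


# Toy data (made up): two deaths tied at time 2, one censoring at time 3.
times = [1, 2, 2, 3, 4]
events = [1, 1, 1, 0, 1]
z = [1, 0, 1, 0, 1]
print(breslow_log_partial_likelihood(0.0, times, events, z))
print(breslow_log_partial_likelihood(0.5, times, events, z))
```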
How do you maximise the likelihood?
Find the log-likelihood, then differentiate it with respect to β.
Set the derivative equal to zero and solve for the estimate of β
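In practice the equation "derivative = 0" usually has no closed-form solution, so it is solved numerically. Below is a sketch using Newton-Raphson for a single scalar covariate, with ties handled by Breslow's approximation. The choice of Newton-Raphson, the function names and the toy data are all assumptions made for illustration. The standard error is taken as 1/sqrt(observed information), the usual large-sample approximation.

```python
import math


def score_and_information(beta, times, events, z):
    """First derivative of the log partial likelihood and minus its second derivative."""
    score, info = 0.0, 0.0
    for tj in sorted({t for t, e in zip(times, events) if e == 1}):
        risk = [zi for ti, zi in zip(times, z) if ti >= tj]      # covariates in R(tj)
        weights = [math.exp(beta * zi) for zi in risk]
        total = sum(weights)
        mean_z = sum(w * zi for w, zi in zip(weights, risk)) / total
        mean_z2 = sum(w * zi * zi for w, zi in zip(weights, risk)) / total
        deaths = [zi for ti, ei, zi in zip(times, events, z) if ti == tj and ei == 1]
        score += sum(deaths) - len(deaths) * mean_z
        info += len(deaths) * (mean_z2 - mean_z ** 2)
    return score, info


def fit_beta(times, events, z, beta=0.0, tol=1e-10, max_iter=50):
    """Solve score(beta) = 0 by Newton-Raphson; return (beta_hat, standard error)."""
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, z)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, times, events, z)
    return beta, 1.0 / math.sqrt(info)


# Toy data (made up): the life observed to time 5 is censored.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 0, 1]
z = [1, 0, 1, 0, 1, 0]
beta_hat, std_error = fit_beta(times, events, z)
print(beta_hat, std_error)
```

With so few lives the standard error is large; the data are only there to make the sketch runnable.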
What is the likelihood ratio statistic a criterion for?
Assessing the effect of adding further covariates to the model, i.e. whether the extra covariates significantly improve the fit
What are we assuming in the calculation of the likelihood ratio statistic?
In general, suppose we fit a model with p covariates and another model with p+q covariates. Each is fitted by maximising a likelihood. Let lp and lp+q be the maximised log-likelihoods. We assume the null hypothesis that the extra q covariates have no effect in the presence of the original p covariates
With the likelihood ratio statistic as the test statistic, what is our test?
H0: βp+1 = βp+2 = … = βp+q = 0
The test statistic is the likelihood ratio statistic, -2(lp - lp+q). H0 is rejected at the 5% significance level if its value is greater than the upper 5% point of the chi-squared distribution with q degrees of freedom
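A minimal sketch of the test using only the Python standard library. The chi-squared upper tail is computed from a standard recurrence for integer degrees of freedom. The function names and the two log-likelihood values are hypothetical.

```python
import math


def chi2_sf(x, df):
    """P(X > x) for a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half_x = x / 2.0
    # Closed forms for df = 1 and df = 2, then step up two degrees at a time.
    if df % 2 == 1:
        k, tail = 1, math.erfc(math.sqrt(half_x))
    else:
        k, tail = 2, math.exp(-half_x)
    while k < df:
        tail += half_x ** (k / 2.0) * math.exp(-half_x) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


def likelihood_ratio_test(l_p, l_p_plus_q, q, level=0.05):
    """Return (statistic, p-value, reject H0?) for q extra covariates."""
    statistic = -2.0 * (l_p - l_p_plus_q)
    p_value = chi2_sf(statistic, q)
    return statistic, p_value, p_value < level


# Hypothetical maximised log-likelihoods for the smaller and larger model.
print(likelihood_ratio_test(l_p=-120.4, l_p_plus_q=-115.1, q=2))
```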
In a life assurance firm if a life does not meet the premium basis what are the options?
Decline to insure them
Charge something reasonable above/below the standard premium (can use this formula to calculate!)
How is the likelihood ratio statistic the basis for various model-building strategies?
We start with the null model (one with no covariates) and
add possible covariates one at a time or
We start with a full model which includes all possible
covariates and then try to eliminate those of no
significant effect.
What else should we test for in relation to covariates in a model
Interactions - may have an additive effect as well as a multiplicative effect
Why does a life insurance company wish to know how certain covariates affect mortality
So that it can charge
premiums that accurately reflect the risk for an individual
State the proportional hazards model
A survival time T follows a Cox proportional hazards model if the hazard function
(force of mortality) for the ith life with covariate vector zi = (zi1, zi2, …, zip) can be written as:
…
Why is the PH model so widely used?
This model is widely used when comparing the impact of different covariates on
mortality.
Shape of the hazard is determined by the baseline hazard while the exponential
term accounts for the differences between lives. If we know the β’s then we can
compare the lives.
The covariates act multiplicatively on the baseline hazard.
If we are not primarily concerned with the precise form of the baseline hazard
but with the effects of the covariates, we can ignore λ0(t) and estimate β from the data
irrespective of the shape: a “semi-parametric”
approach.
It is straightforward to estimate β by maximising the partial likelihood, as the baseline hazard cancels out and the partial likelihood depends only on the order in which deaths are
observed.