Chapter 3 Flashcards
What is N in the binomial model
The number of individuals subject to the risk of death within a given time frame
What is q in the binomial model
Probability of the event of death occurring for each individual
What is N called and denoted by
N is the initial exposed to risk and is denoted E
What is the ratio q hat called?
Crude occurrence rate or crude mortality rate
What are the qualities of the crude occurrence rate estimator q hat
It is unbiased
It is the MLE of q
It is a consistent estimator of q
It is an efficient estimator of q, meaning its variance attains the Cramér–Rao lower bound
What can be said about the sampling distribution of the number of deaths under the binomial model if N is very large
It is approximately Normal, so we can construct confidence intervals for q hat
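As a quick sketch of this Normal approximation (all figures below are hypothetical, not from the text), a confidence interval for q hat could be computed as:

```python
import math

# Hypothetical figures: E = 10,000 lives initially exposed, d = 85 deaths.
E, d = 10_000, 85
q_hat = d / E  # crude mortality rate, the MLE of q

# Large-N Normal approximation: q_hat ~ N(q, q(1 - q) / E).
se = math.sqrt(q_hat * (1 - q_hat) / E)
z = 1.96  # 95% confidence level
lower, upper = q_hat - z * se, q_hat + z * se
print(f"q_hat = {q_hat:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```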
Explain the central exposed to risk between x and x+1
Measure the length of time in years each individual is exposed to this risk and sum over all individuals to get the total.
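The summation described above can be sketched as follows (entry and exit ages are illustrative, not from the text):

```python
# Each tuple is (entry age, exit age) within the year of age [x, x+1).
# Hypothetical observations for three lives aged between 30 and 31.
observations = [
    (30.0, 31.0),   # observed for the full year of age
    (30.25, 31.0),  # entered the study a quarter of a year in
    (30.0, 30.6),   # left observation (e.g. died) at exact age 30.6
]

# Central exposed to risk: sum each life's time under observation,
# i.e. 1.0 + 0.75 + 0.6 = 2.35 years.
central_exposed = sum(exit_age - entry_age for entry_age, exit_age in observations)
print(round(central_exposed, 2))
```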
What is the max length of time a person can be exposed to risk of death between x and x+1
1 year
Explain person years of exposure?
Same as central exposed to risk; it is just usually called this in medical statistics
Explain the assumptions in the Poisson model
Assuming the rate of occurrence is the same for everyone and acts on each life independently, the total number of deaths at age x follows a Poisson distribution, obtained by summing the independent Poisson variables for each life, each with mean proportional to its length of exposure
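A minimal sketch of the resulting Poisson estimate of the force of mortality (figures hypothetical):

```python
import math

# Poisson model: deaths D ~ Poisson(mu * Ec), where Ec is the central
# exposed to risk in years. Hypothetical figures below.
D, Ec = 42, 4800.0

mu_hat = D / Ec         # MLE of the force of mortality mu
se = math.sqrt(D) / Ec  # estimated standard error, since Var(D) = mu * Ec
print(f"mu_hat = {mu_hat:.5f}, standard error = {se:.5f}")
```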
What are the properties of the estimator mu(x+0.5) hat
Unbiased,
Maximum likelihood estimator
Consistent estimator
Efficient estimator
Explain Consistent estimator
If, in the limit n → ∞, the estimator tends to the true value (or at least becomes arbitrarily close to the target), it is said to be consistent
Explain efficient estimator
The most efficient estimator among a group of unbiased estimators is the one with the smallest variance
Comparing central exposed to risk and initial exposed to risk
Central exposed to risk is easily observable: just record the time each life was observed. It also carries through unchanged to multiple-state models
Initial exposed to risk requires adjustments in respect of lives who die
Which is easier to calculate, central exposed to risk or initial exposed to risk
central exposed to risk
What is mx
The central death rate: the number of deaths between ages x and x+1 divided by the central exposed to risk
Explain what the homogeneity assumption means
In mortality studies we usually assume we observe groups with the same mortality characteristics, i.e. homogeneous groups of identical lives with identical death rates. In reality this is near impossible
What balance needs to be struck when organising groups for mortality studies
A balance between subdividing groups to achieve greater homogeneity and keeping the resulting groups large enough to give statistically significant results; otherwise the sampling error would be too big
Give some common factors used to subdivide data for mortality studies
Sex, age, type of policy, smoker, level of underwriting, duration of policy, occupation, BMI, impairments
What is another factor that limits subdivision abilities
Actually collecting the data to allow for subdivision is difficult
State the principle of correspondence
A life alive at time t should be included in the exposure at age x at time t if and only if, were that life to die immediately, they would be counted in the death data dx at age x
Explain the idea of the principle of correspondence
Deaths data and exposure data should be defined consistently, or the ratio d/n is meaningless
What are the assumptions in the calculation of central exposed to risk with ideal data, if we had it all
Lives are independent and identically distributed with respect to mortality
Non informative censoring
Model specific assumptions that may apply
Why can we not calculate the central exposed to risk exactly
In reality we do not have all the information we would need: the birth and death dates of everyone, and when they entered and left the study
Define dx in context of calculating approximation of the central exposed to risk
Number of deaths aged x last birthday during calendar years K, K+1, …
Define Px,t in context of calculating approximation of the central exposed to risk
Number of lives under observation aged x last birthday at time t, where t = 1 January in calendar years K, K+1, …, K+N, K+N+1
What approximation do we use to find central exposed to risk and why?
The trapezium rule: we only have a value of Px,t at certain census dates, so we cannot find the area under the curve exactly by integration. Instead we assume the population varies linearly between census dates and apply the trapezium rule
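The trapezium-rule approximation can be sketched as below, with hypothetical census counts taken on 1 January of successive years:

```python
# Census counts P(x, t): lives aged x last birthday on 1 January of
# years K, K+1, K+2, K+3 (hypothetical figures).
census = [1250, 1230, 1198, 1175]

# Assume the population varies linearly between census dates, so the
# exposure contributed over each year is the average of the endpoints:
# (1240 + 1214 + 1186.5) = 3640.5 life-years in this example.
ec_approx = sum((a + b) / 2 for a, b in zip(census, census[1:]))
print(ec_approx)
```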
What data determines the correct rate interval to use
The deaths data carry the most information, so they determine which rate interval to use
What is a rate interval
A year of age as defined by the age label used: age last birthday, age nearest birthday, or age next birthday
Define d^2 x
Total number of deaths aged x nearest birthday during calendar years K, K+1, …, K+N
Define d^3 x
Total number of deaths aged x next birthday during calendar years K, K+1, …, K+N
What is the rate interval when we define x as age last birthday
[x,x+1)
What is the rate interval when we define x as age nearest birthday
[x-1/2,x+1/2)
What is the rate interval when we define x as age next birthday
[x-1,x)
What is the census corresponding to the interval [x-1,x)
P^3 x,t is the number of lives under observation aged x next birthday at time t, where t = 1 January in calendar years K, K+1, …, K+N, K+N+1
What is the census corresponding to the interval [x-1/2,x+1/2)
P^2 x,t is the number of lives under observation aged x nearest birthday at time t, where t = 1 January in calendar years K, K+1, …, K+N, K+N+1
Why do we adjust the age definition in the census data to correspond with that of the deaths data
To ensure that they follow the principle of correspondence, which states that a life alive at time t should be included in the exposure at age x at time t if and only if, were that life to die immediately, he or she would be counted in the death data at age x.
The deaths data "carry most information" when mortality rates are small, so we adjust the census data, not the deaths data.
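For example, if the census records age last birthday but the deaths use age nearest birthday, a standard half-and-half adjustment can be sketched as below (assuming birthdays are uniformly spread over the year; figures hypothetical):

```python
# Hypothetical census counts by age last birthday at one census date.
P1 = {39: 980, 40: 1010, 41: 995}

# A life aged 40 nearest birthday is either aged 39 last birthday (in
# the half-year before their 40th birthday) or aged 40 last birthday
# (in the half-year after it). With birthdays uniformly distributed,
# half of each age-last-birthday group qualifies:
P2_40 = 0.5 * (P1[39] + P1[40])
print(P2_40)  # 995.0
```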
What assumptions are made using the trapezium estimate to calculate central exposed to risk
We need to assume that birthdays are uniformly distributed across the calendar year.
To use the trapezium rule we must assume that the population varies linearly between census dates.
We assume that the population enumerated in the census of 1 January 2015 can be taken to be the population at the end of the calendar year 2014
How could you estimate death rates at age x nearest birthday for future years
Some kind of forecasting/modelling will be required to compute the future population at each age.
Extrapolating the linear change at each age x between dates you know is one option.
The census data can then be adjusted to match the death data
Two reasons why those with a life assurance policy tend to have lower mortality than those without such a policy
* Self-selection: proposers for assurance policies are generally wealthier and better educated than the general population.
* Temporary initial selection: policyholders have passed the underwriting criteria of the life assurance company when they originally took out the policy, so are healthier on average than the rest of the population.
What statistical problem can arise when data is subdivided
We can subdivide until the groups are too small for the mortality investigation to have statistical significance. So there is a tension between subdividing to reduce heterogeneity in mortality rates and achieving groups with sufficient data (number of deaths) to produce statistically significant estimates of the mortality rates.