Mack - Chain Ladder Flashcards
Appeal of Confidence Intervals
- Estimated ultimate losses are not an exact forecast of the true ultimate losses
- They allow the inclusion of business policy/management philosophy by using a specific confidence probability
- They allow comparison between the CL and other methods
CL Ultimate
C_i,I = C_i,I+1-i * f_I+1-i * … * f_I-1
where f_k = SUM(C_j,k+1) / SUM(C_j,k), summed over AYs j = 1 to I-k, is the volume-weighted LDF
C_i,k is the cumulative loss amount
There are I AYs and I development years
f_k is the age-to-age factor
Each increase from C_i,k to C_i,k+1 is considered a random disturbance of an expected increase from C_i,k to C_i,k * f_k, where f_k is an unknown true factor of increase which is the same for all AYs
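A minimal Python sketch of the CL projection above, assuming the triangle is stored as a ragged list of cumulative losses (one row per AY, oldest first). The triangle values and the function names volume_weighted_ldfs and cl_ultimates are illustrative assumptions, not part of the source.

```python
def volume_weighted_ldfs(tri):
    """f_k = SUM(C_j,k+1) / SUM(C_j,k) over the AYs where both ages are observed."""
    n = len(tri)
    ldfs = []
    for k in range(n - 1):
        rows = [r for r in tri if len(r) > k + 1]
        ldfs.append(sum(r[k + 1] for r in rows) / sum(r[k] for r in rows))
    return ldfs


def cl_ultimates(tri, ldfs):
    """Multiply each AY's latest cumulative loss by all remaining LDFs."""
    ultimates = []
    for row in tri:
        ult = row[-1]
        for f in ldfs[len(row) - 1:]:
            ult *= f
        ultimates.append(ult)
    return ultimates


# Hypothetical 4 x 4 cumulative triangle (made-up numbers)
triangle = [
    [100, 150, 165, 170],
    [110, 170, 190],
    [120, 175],
    [130],
]
ldfs = volume_weighted_ldfs(triangle)
ultimates = cl_ultimates(triangle, ldfs)
reserves = [u - row[-1] for u, row in zip(ultimates, triangle)]
```

The reserve for each AY is the ultimate minus the latest diagonal, so the oldest AY (assumed fully developed) has a reserve of zero.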
First Implicit Assumption of the Chain-Ladder Method
E(C_i,k+1|C_i,1,…,C_i,k) = C_i,k * f_k
Expected losses in the next development period are proportional to losses to date (only the most recent cumulative amount C_i,k matters; earlier development is ignored)
Consequence: subsequent development factors are uncorrelated
Second Implicit Assumption of the Chain-Ladder Method
Losses are independent between AYs
Consequence: CL method cannot be used for triangles where CY effects affect several AYs in the same way
Third Implicit Assumption of the Chain-Ladder Method
Var(C_j,k+1 / C_j,k) = alpha_k^2 / C_j,k -> Var(C_j,k+1|C_j,1,…,C_j,k) = C_j,k * alpha_k^2
The variance of the losses in the next development period is proportional to losses to date with proportionality constant alpha_k^2 that varies by age
alpha_k^2 describes the spread of the individual age-to-age factors around the overall chain-ladder factor f_k
MSE of Ultimate
MSE(C_i,I) = Var(C_i,I|D) + (E(C_i,I|D) - C_i,I)^2 = Pure random error + Estimation error
Standard Errors for a Single Accident Year
Estimate MSE(C_i,I)
(s.e.(C_i,I))^2 = C_i,I^2 * SUM[(alpha_k^2 / f_k^2) * (1/C_i,k + 1/SUM(C_j,k))], summed over k = I+1-i to I-1 (the inner SUM(C_j,k) runs over AYs j = 1 to I-k)
Calculate an MSE triangle and sum across each development period for the AY
alpha_k^2 = (1/(I-k-1)) * SUM[C_j,k * (C_j,k+1/C_j,k - f_k)^2], summed over AYs j = 1 to I-k
SE for estimated ultimate losses is equal to the SE of the estimated reserves
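The standard error formula can be sketched in Python as follows. This is a sketch under stated assumptions: the triangle is a ragged list of cumulative losses (oldest AY first), the alpha_k^2 values are passed in as given inputs, and both the numbers and the name mack_standard_errors are hypothetical.

```python
from math import sqrt


def mack_standard_errors(tri, alpha2):
    """s.e. of the CL ultimate (equal to the s.e. of the reserve) for each AY.

    tri    : ragged list of cumulative losses, oldest AY first
    alpha2 : alpha_k^2 for each development step (len(tri) - 1 values)
    """
    n = len(tri)
    # Volume-weighted LDFs f_k and their denominators SUM(C_j,k)
    col_sums, ldfs = [], []
    for k in range(n - 1):
        rows = [r for r in tri if len(r) > k + 1]
        col_sums.append(sum(r[k] for r in rows))
        ldfs.append(sum(r[k + 1] for r in rows) / col_sums[k])

    errors = []
    for row in tri:
        c = row[-1]  # latest observed C_i,k; projected forward below
        total = 0.0
        for k in range(len(row) - 1, n - 1):
            total += alpha2[k] / ldfs[k] ** 2 * (1 / c + 1 / col_sums[k])
            c *= ldfs[k]  # estimated C_i,k+1
        errors.append(c * sqrt(total))  # c is now the CL ultimate
    return errors


# Hypothetical triangle and illustrative alpha_k^2 inputs
triangle = [
    [100, 150, 165, 170],
    [110, 170, 190],
    [120, 175],
    [130],
]
alpha2 = [0.2178, 0.0248, 0.0028]
se = mack_standard_errors(triangle, alpha2)
```

Each pass through the inner loop is one cell of the "MSE triangle": the cells for an AY are summed across the remaining development periods and scaled by the squared ultimate.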
Calculating alpha^2
Based on cumulative losses
- Calculate the triangle of age-to-age factors and the volume-weighted LDFs
- Calculate the triangle of cumulative losses * squared difference between the age-to-age factor and the volume-weighted LDF: C_i,k * (A-A_i,k - LDF_k)^2
- Calculate alpha_k^2 for all except the last development period: (1 / (I - k - 1)) * SUM(Differences in column k)
- Calculate the last alpha^2: min((alpha_I-2^2)^2 / alpha_I-3^2, min(alpha_I-2^2, alpha_I-3^2)) = min(previous^2 / 2nd previous, min(previous, 2nd previous))
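The steps above can be sketched in Python as below, assuming a ragged list of cumulative losses with at least four AYs and using the min rule for the last alpha^2. The triangle values and the name alpha_squared are illustrative assumptions.

```python
def alpha_squared(tri):
    """alpha_k^2 for every development step, using the min rule for the last one."""
    n = len(tri)
    alpha2 = []
    for k in range(n - 2):  # every step except the last
        rows = [r for r in tri if len(r) > k + 1]
        f_k = sum(r[k + 1] for r in rows) / sum(r[k] for r in rows)
        # C_j,k * (age-to-age factor - volume-weighted LDF)^2, summed down the column
        weighted_sq = sum(r[k] * (r[k + 1] / r[k] - f_k) ** 2 for r in rows)
        alpha2.append(weighted_sq / (len(rows) - 1))  # len(rows) - 1 equals I - k - 1
    prev, prev2 = alpha2[-1], alpha2[-2]
    # Last step: min(previous^2 / 2nd previous, min(previous, 2nd previous));
    # this assumes the 2nd previous value is not zero
    alpha2.append(min(prev ** 2 / prev2, min(prev, prev2)))
    return alpha2


# Hypothetical 4 x 4 cumulative triangle (made-up numbers)
triangle = [
    [100, 150, 165, 170],
    [110, 170, 190],
    [120, 175],
    [130],
]
alpha2 = alpha_squared(triangle)
```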
Issue with alpha^2
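The formula for alpha_k^2 does not provide an estimator for alpha_I-1^2, because only one age-to-age factor is observed for the last development period (the divisor I - k - 1 is zero when k = I - 1)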
Options:
1. Set alpha_I-1^2 = 0 - only do this if f_I-1 = 1 and development is expected to be finished after I - 1 years
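2. Extrapolate the series alpha_1^2, …, alpha_I-2^2 by one more term using log-linear regression
- Simpler version: keep the ratio of successive alphas constant (alpha_I-3 / alpha_I-2 = alpha_I-2 / alpha_I-1), which gives alpha_I-1^2 = (alpha_I-2^2)^2 / alpha_I-3^2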
3. Set alpha_I-1^2 = min[(alpha_I-2^2)^2/alpha_I-3^2, min(alpha_I-2^2, alpha_I-3^2)] - do this if you cannot nicely extrapolate the series using regression
Confidence Intervals for a Single Accident Year
Normal distribution: R_i +/- z * SE(R_i)
Lognormal distribution: R_i * exp(+/- z * sigma_i - sigma_i^2/2)
where sigma_i^2 = ln[1 + (SE(R_i)/R_i)^2]
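A short Python sketch of both interval formulas, assuming a two-sided interval so that z is the standard normal quantile at 0.5 + level/2. The function names and the example reserve and SE are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def normal_ci(reserve, se, level=0.90):
    """R_i +/- z * SE(R_i)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return reserve - z * se, reserve + z * se


def lognormal_ci(reserve, se, level=0.90):
    """R_i * exp(+/- z * sigma_i - sigma_i^2 / 2), sigma_i^2 = ln[1 + (SE/R)^2]."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    sigma2 = log(1 + (se / reserve) ** 2)
    sigma = sqrt(sigma2)
    return (reserve * exp(-z * sigma - sigma2 / 2),
            reserve * exp(z * sigma - sigma2 / 2))


# Hypothetical reserve of 100 with a large standard error of 80
print(normal_ci(100, 80))     # lower limit is negative
print(lognormal_ci(100, 80))  # lower limit stays positive
```

With a standard error this large relative to the reserve, the normal interval extends below zero while the lognormal interval cannot, which is the usual reason to prefer the lognormal form in that situation.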
Empirical Limits for a Single Accident Year
The lower empirical limit results from applying the minimum age-to-age factors for each development period to the losses-to-date
The upper empirical limit results from applying the maximum age-to-age factors for each development period to the losses-to-date
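As a sketch, the empirical limits can be computed as below, assuming a ragged list of cumulative losses. The result is a pair of lower and upper ultimates per AY; subtract the losses-to-date to express them as reserve limits. The triangle and the name empirical_limits are illustrative assumptions.

```python
def empirical_limits(tri):
    """Apply the min / max observed age-to-age factors to each AY's losses-to-date."""
    n = len(tri)
    min_f, max_f = [], []
    for k in range(n - 1):
        factors = [r[k + 1] / r[k] for r in tri if len(r) > k + 1]
        min_f.append(min(factors))
        max_f.append(max(factors))

    limits = []
    for row in tri:
        lower = upper = row[-1]
        for k in range(len(row) - 1, n - 1):
            lower *= min_f[k]
            upper *= max_f[k]
        limits.append((lower, upper))
    return limits


# Hypothetical 4 x 4 cumulative triangle (made-up numbers)
triangle = [
    [100, 150, 165, 170],
    [110, 170, 190],
    [120, 175],
    [130],
]
limits = empirical_limits(triangle)
```

In the last development period only one factor is observed, so the minimum and maximum coincide and the limits collapse to the CL estimate there.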
Standard Error of the Overall Reserve
R = R_1 + R_2 + … + R_I
Since the estimators R_i are NOT independent due to the influence of the same age-to-age factors f_k, there is positive correlation between AY reserve estimates
The overall MSE is greater than the sum of the individual AY MSEs
Testing Assumption 1 (Linearity)
Plot C_i,k+1 against C_i,k to see if we have an approximately linear relationship around a straight line through the origin with slope f_k
Testing Assumption 3 (Variance)
Plot weighted residuals against C_i,k to see if the residuals appear random - randomly scattered around y = 0
Weighted residual = (C_i,k+1 - C_i,k * f_k) / SQRT(C_i,k) = (Actual - Estimated) / SQRT(Latest Reported/Paid)
Should look at:
1. C_i,k+1 - C_i,k * f_k against C_i,k (f_k = C_i,k^2-weighted average of the age-to-age factors; assumes Var(C_j,k+1) proportional to 1)
2. (C_i,k+1 - C_i,k * f_k) / SQRT(C_i,k) against C_i,k (f_k = C_i,k-weighted average, i.e. the volume-weighted LDF; assumes Var(C_j,k+1) proportional to C_j,k)
3. (C_i,k+1 - C_i,k * f_k) / C_i,k against C_i,k (f_k = simple average; assumes Var(C_j,k+1) proportional to C_j,k^2)
If one plot looks more random than the others, consider replacing the volume-weighted f_k with the corresponding alternative average
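The three residual sets can be computed with a sketch like the one below, assuming each set uses the average that matches its own variance assumption. Plotting is left out; the triangle and the name residual_sets are hypothetical.

```python
from math import sqrt


def residual_sets(tri, k):
    """Residuals for development step k under the three variance assumptions."""
    pairs = [(r[k], r[k + 1]) for r in tri if len(r) > k + 1]
    # C^2-weighted average of the factors: SUM(C^2 * C1/C) / SUM(C^2)
    f_c2 = sum(c * c1 for c, c1 in pairs) / sum(c * c for c, _ in pairs)
    # Volume-weighted (C-weighted) average: the usual CL factor
    f_vol = sum(c1 for _, c1 in pairs) / sum(c for c, _ in pairs)
    # Simple average of the factors
    f_avg = sum(c1 / c for c, c1 in pairs) / len(pairs)
    return {
        "var ~ 1": [c1 - c * f_c2 for c, c1 in pairs],
        "var ~ C": [(c1 - c * f_vol) / sqrt(c) for c, c1 in pairs],
        "var ~ C^2": [(c1 - c * f_avg) / c for c, c1 in pairs],
    }


# Hypothetical 4 x 4 cumulative triangle (made-up numbers)
triangle = [
    [100, 150, 165, 170],
    [110, 170, 190],
    [120, 175],
    [130],
]
res = residual_sets(triangle, 0)
x_values = [r[0] for r in triangle if len(r) > 1]  # C_i,k for the x-axis
```

Each list in res would be plotted against x_values; the set whose scatter shows the least pattern points to the most suitable average.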
Testing for Correlations Between Subsequent Development Factors (Assumption 1)
Based on a table of development factors corresponding to cumulative losses
- For each development period k, rank the development factors in ascending order, and separately rank the factors of the prior development period k-1 for the same AYs
- Calculate S_k, the sum of squared rank differences, for each applicable development period k
- Calculate T_k, Spearman’s rank correlation coefficient, for each applicable development period k: T_k = 1 - 6*[S_k / (n_k^3 - n_k)], where n_k is the number of AYs ranked in development period k
- Calculate the weighted average of the individual T_k to get the global test statistic T: T = SUM[(n_k - 1) * T_k] / SUM(n_k - 1)
- Construct a confidence interval for T around E(T) = 0: +/- z * SQRT[2 / ((I - 2) * (I - 3))]
- Interpret the confidence interval
- If T lies OUTSIDE of the confidence interval, we REJECT the null hypothesis that subsequent development factors are uncorrelated
- If T lies INSIDE of the confidence interval, we FAIL to reject the null hypothesis that subsequent development factors are uncorrelated
Use a 50% CI (z = 0.67) because the test is only approximate and because we want to detect correlations that are already present in a substantial part of the triangle
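The test procedure can be sketched in Python as follows. Assumptions: the triangle is a ragged list of cumulative losses with at least four AYs, there are no tied factors within a development period (ties would need average ranks), and the name spearman_test and the triangle values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ranks(values):
    """Ascending ranks 1..n (assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_test(tri, level=0.50):
    """Return (T, half-width of the CI around 0, reject flag)."""
    n = len(tri)
    num = den = 0.0
    for k in range(1, n - 2):  # steps with a prior step and at least 2 AYs
        rows = [r for r in tri if len(r) > k + 1]
        cur = ranks([r[k + 1] / r[k] for r in rows])
        prior = ranks([r[k] / r[k - 1] for r in rows])
        n_k = len(rows)
        s_k = sum((a - b) ** 2 for a, b in zip(cur, prior))
        t_k = 1 - 6 * s_k / (n_k ** 3 - n_k)
        num += (n_k - 1) * t_k
        den += n_k - 1
    t = num / den
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(2 / ((n - 2) * (n - 3)))
    return t, half_width, abs(t) > half_width


# Hypothetical 4 x 4 cumulative triangle (made-up numbers)
triangle = [
    [100, 150, 165, 170],
    [110, 170, 190],
    [120, 175],
    [130],
]
t_stat, half_width, reject = spearman_test(triangle)
```

A triangle this small leaves only one applicable development period with two AYs, so T can only be +1 or -1; a realistic triangle has many more rank pairs.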
Advantages of Spearman’s rank
- The test is distribution free (doesn’t assume LDFs are normally distributed)
- Differences in variances of LDFs between development periods is less important because it uses ranks
Testing for Calendar Year Effects (Assumption 2)
Based on a table of development factors corresponding to cumulative losses
- Rank the development factors for each development period as “small (S)” or “large (L)” relative to the median. If a development factor is the median, then label it with an asterisk.
- For each diagonal j, starting with the second diagonal from the top left (j = 2):
- Count the number of small and large factors, S_j and L_j (factors marked with an asterisk are not counted)
- Calculate Z_j = min(S_j, L_j)
- Calculate E(Z_j) = n/2 - C(n-1, m) * n / 2^n, where n = S_j + L_j, m = (n-1)/2 (truncate if fraction), and C(n-1, m) is the binomial coefficient "n-1 choose m"
- Calculate Var(Z_j) = n(n-1)/4 - C(n-1, m) * n(n-1) / 2^n + E(Z_j) - E(Z_j)^2
- Calculate the test statistic Z = Z_2 + Z_3 + … + Z_I-1
- Calculate E(Z) and Var(Z) as the sums of the E(Z_j) and Var(Z_j)
- Construct a CI for Z: E(Z) +/- z * SQRT(Var(Z))
- Interpret the CI
- If Z lies OUTSIDE of the confidence interval, we REJECT the null hypothesis that losses are independent between accident years
- If Z lies INSIDE of the confidence interval, we FAIL to reject the null hypothesis that losses are independent between accident years
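A Python sketch of the calendar year test, under these assumptions: the triangle is a ragged list of cumulative losses, factor F_i,k sits on diagonal i + k (zero-based, so the first diagonal is skipped), and the default 95% level is an illustrative choice rather than something the source states. The name calendar_year_test and the triangle values are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist, median


def calendar_year_test(tri, level=0.95):
    """Return (Z, E(Z), Var(Z), reject flag)."""
    n = len(tri)
    # Label each factor S or L relative to its column median; medians are dropped
    labels = {}
    for k in range(n - 1):
        col = {i: r[k + 1] / r[k] for i, r in enumerate(tri) if len(r) > k + 1}
        med = median(list(col.values()))
        for i, f in col.items():
            if f != med:
                labels[(i, k)] = "L" if f > med else "S"

    z_stat, e_z, var_z = 0, 0.0, 0.0
    for d in range(1, n - 1):  # diagonals j = 2 .. I-1
        diag = [labels[(i, d - i)] for i in range(d + 1) if (i, d - i) in labels]
        s_j, l_j = diag.count("S"), diag.count("L")
        cnt = s_j + l_j
        if cnt == 0:
            continue
        m = (cnt - 1) // 2
        e_j = cnt / 2 - comb(cnt - 1, m) * cnt / 2 ** cnt
        v_j = (cnt * (cnt - 1) / 4
               - comb(cnt - 1, m) * cnt * (cnt - 1) / 2 ** cnt
               + e_j - e_j ** 2)
        z_stat += min(s_j, l_j)
        e_z += e_j
        var_z += v_j

    z = NormalDist().inv_cdf(0.5 + level / 2)
    reject = abs(z_stat - e_z) > z * sqrt(var_z)
    return z_stat, e_z, var_z, reject


# Hypothetical 4 x 4 cumulative triangle (made-up numbers)
triangle = [
    [100, 150, 165, 170],
    [110, 170, 190],
    [120, 175],
    [130],
]
z_stat, e_z, var_z, reject = calendar_year_test(triangle)
```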
Examples of CY influences
- Reserve strengthening or weakening
- Changes in payment processes
- Changes in inflation
Disadvantage of the Mack method
Doesn’t tell us about the shape of the loss reserve distribution - only gives us the mean and standard deviation