Shapland & Leong Flashcards
Provide two advantages of bootstrapping.
⇧ They allow us to calculate how likely it is that the ultimate value of the claims will exceed a certain amount
⇧ They are able to reflect the general skewness of insurance losses
Provide one disadvantage of bootstrapping.
They are more complex than other models and more time-consuming to create
Describe how using the over-dispersed Poisson model to model incremental claims relates a GLM to the standard chain-ladder method.
If we start with the latest diagonal and divide backwards successively by each age-to-age factor, we obtain fitted cumulative claims. Using subtraction, the fitted cumulative claims can be used to determine the fitted incremental claims. These fitted incremental claims exactly match those obtained using the over-dispersed Poisson model
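A minimal sketch of this back-out in Python (the 4x4 cumulative triangle is hypothetical illustration data, not from the source):

```python
import numpy as np

# Hypothetical cumulative triangle; np.nan marks unobserved future cells
cum = np.array([
    [100., 150., 165., 170.],
    [110., 168., 185., np.nan],
    [115., 170., np.nan, np.nan],
    [125., np.nan, np.nan, np.nan],
])
n = cum.shape[0]

# Volume-weighted all-year age-to-age factors
ata = [np.nansum(cum[:n - 1 - j, j + 1]) / np.nansum(cum[:n - 1 - j, j])
       for j in range(n - 1)]

# Start at the latest diagonal and divide backwards by each factor
fitted_cum = np.full_like(cum, np.nan)
for i in range(n):
    last = n - 1 - i                    # column of the latest diagonal
    fitted_cum[i, last] = cum[i, last]
    for j in range(last - 1, -1, -1):
        fitted_cum[i, j] = fitted_cum[i, j + 1] / ata[j]

# Fitted incrementals via subtraction; these match the ODP GLM fit
fitted_incr = np.diff(fitted_cum, axis=1, prepend=0.0)
```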
Briefly describe three important outcomes from this relationship. (ODP & CL)
⇧ A simple link ratio algorithm can be used in place of the more complicated GLM algorithm, while still maintaining an underlying GLM framework
⇧ The use of the age-to-age factors serves as a bridge to the deterministic framework. This allows the model to be more easily explained
⇧ In general, the log link function does not work for negative incremental claims. Using link ratios remedies this problem
Fully describe the bootstrapping process. Assume the data does not require any modifications.
⇧ Calculate the fitted incremental claims using the GLM framework or the chain-ladder age-to-age factors
⇧ Calculate the residuals between the fitted incremental claims and the actual incremental claims
⇧ Create a triangle of random residuals by sampling with replacement from the set of non-zero residuals
⇧ Create a sample incremental triangle using the random residual triangle
⇧ Accumulate the sample incremental triangle to create a sample cumulative triangle
⇧ Project the sample cumulative data to ultimate using the chain-ladder method
⇧ Calculate the reserve point estimate for each accident year using the projected data
⇧ Iterate through this process to create a distribution of reserves for each accident year
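A minimal sketch of this loop, using unscaled Pearson residuals and omitting the degrees-of-freedom, hat-matrix, and process-variance refinements discussed elsewhere in these cards; the triangle layout matches the earlier sketch:

```python
import numpy as np

rng = np.random.default_rng(42)

def odp_bootstrap(cum, n_sims=1000):
    """Sketch of the bootstrap loop; `cum` is a cumulative triangle
    with np.nan in the unobserved cells. Returns total reserves only,
    though the same loop can store each accident year separately."""
    n = cum.shape[0]
    mask = ~np.isnan(cum)

    # Fitted incrementals, backed out of the all-year weighted factors
    ata = [np.nansum(cum[:n - 1 - j, j + 1]) / np.nansum(cum[:n - 1 - j, j])
           for j in range(n - 1)]
    fit = np.full_like(cum, np.nan)
    for i in range(n):
        fit[i, n - 1 - i] = cum[i, n - 1 - i]
        for j in range(n - 2 - i, -1, -1):
            fit[i, j] = fit[i, j + 1] / ata[j]
    incr = np.diff(cum, axis=1, prepend=0.0)
    fit_incr = np.diff(fit, axis=1, prepend=0.0)

    # Unscaled Pearson residuals; sample only the non-zero ones
    res = (incr - fit_incr) / np.sqrt(np.abs(fit_incr))
    pool = res[mask]
    pool = pool[pool != 0]

    reserves = np.zeros(n_sims)
    for s in range(n_sims):
        # Random residual triangle -> sample incrementals -> cumulatives
        r = rng.choice(pool, size=cum.shape, replace=True)
        s_incr = fit_incr + r * np.sqrt(np.abs(fit_incr))
        s_cum = np.where(mask,
                         np.cumsum(np.where(mask, s_incr, 0.0), axis=1),
                         np.nan)
        # Chain-ladder projection from the sampled triangle
        f = [np.nansum(s_cum[:n - 1 - j, j + 1]) / np.nansum(s_cum[:n - 1 - j, j])
             for j in range(n - 1)]
        for i in range(n):
            ult = s_cum[i, n - 1 - i]
            for j in range(n - 1 - i, n - 1):
                ult *= f[j]
            reserves[s] += ult - s_cum[i, n - 1 - i]
    return reserves  # distribution of total reserve point estimates
```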
Identify the assumptions underlying the residual sampling process.
The residual sampling process assumes that the residuals are independent and identically distributed. However, it does NOT require the residuals to be normally distributed
Explain why these residual sampling assumptions are advantageous.
This is an advantage since the distributional form of the residuals will flow through the simulation process
Briefly describe two uses of the degrees of freedom adjustment factor.
⇧ The distribution of reserve point estimates from the sample triangles could be multiplied by the degrees of freedom adjustment factor to allow for over-dispersion of the residuals in the sampling process
⇧ The Pearson residuals could be multiplied by the degrees of freedom adjustment factor to correct for a bias in the residuals
Identify a downfall of the degrees of freedom adjustment factor and state how the issue can be remedied.
⇧ The degrees of freedom bias correction does not create standardized residuals. This is important because standardized residuals ensure that each residual has the same variance
⇧ To remedy this, a hat matrix adjustment factor must be applied to the unscaled Pearson residuals to calculate the standardized residuals
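A minimal sketch of both adjustments, assuming N residuals, p fitted parameters, and a hypothetical vector h of hat-matrix diagonal (leverage) values:

```python
import numpy as np

def dof_adjusted(res, N, p):
    # Degrees-of-freedom adjustment factor sqrt(N / (N - p)); corrects
    # the bias in the residuals but does not standardize them
    return res * np.sqrt(N / (N - p))

def hat_standardized(res, h):
    # Hat-matrix adjustment factor sqrt(1 / (1 - h)), where h holds the
    # diagonal elements of the hat matrix for each cell; this gives
    # every residual the same variance
    return res * np.sqrt(1.0 / (1.0 - h))
```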
Discuss the difference between bootstrapping paid data and bootstrapping incurred data.
Bootstrapping paid data provides a distribution of possible outcomes for total unpaid claims. Bootstrapping incurred data provides a distribution of possible outcomes for IBNR
Explain how the results of an incurred data model can be converted to a paid data model.
To convert the results of an incurred data model to a payment stream, we apply payment patterns to the ultimate value of the incurred claims
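A minimal sketch, assuming a hypothetical incremental payment pattern:

```python
import numpy as np

# Hypothetical pattern: % of ultimate paid in each development period
pattern = np.array([0.40, 0.30, 0.20, 0.10])

def payment_stream(ultimate):
    # Spread the simulated ultimate incurred value over the periods
    return ultimate * pattern
```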
Explain the benefit of bootstrapping the incurred data triangle.
Bootstrapping incurred data leverages the case reserves to better predict the ultimate claims. This improves estimates, while still focusing on the payment stream for measuring risk
Identify one deterministic method for reducing the variability in the extrapolation of future incremental values.
Bornhuetter/Ferguson method
Explain how this method can be made stochastic. (Reducing variability in the extrapolation of future incremental values)
In addition to specifying a priori loss ratios for the Bornhuetter/Ferguson method, we can add a vector of standard deviations to go with these means. We can then assume a distribution and simulate a different a priori loss ratio for every iteration of the model
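A minimal sketch, assuming hypothetical a priori loss ratios and standard deviations and a normal distribution (the source leaves the distributional choice open):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical a priori loss ratios and standard deviations by accident year
prior_lr = np.array([0.70, 0.72, 0.75, 0.80])
prior_sd = np.array([0.05, 0.05, 0.07, 0.10])

def simulate_prior_loss_ratios():
    # Draw a different a priori loss ratio for each accident year
    # on every iteration of the bootstrap model
    return rng.normal(prior_lr, prior_sd)
```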
Identify four advantages to generalizing the over-dispersed Poisson model.
⇧ Using fewer parameters helps avoid over-parameterizing the model
⇧ Gives us the ability to add parameters for calendar year trends
⇧ Gives us the ability to model data shapes other than triangles
⇧ Allows us to match the model parameters to the statistical features found in the data, and to extrapolate those features
Identify two disadvantages to generalizing the over-dispersed Poisson model.
⇧ The GLM must be solved for each iteration of the bootstrap model, which may slow down the simulation
⇧ The model is no longer directly explainable to others using age-to-age factors
Provide a disadvantage to including calendar year trends in the over-dispersed Poisson model.
By including calendar year trends, the system of equations underlying the GLM no longer has a unique solution
Explain how this problem can be remedied. (CY trends in the ODP)
To deal with this issue, we start with a model with one alpha parameter, one beta parameter, and one gamma (calendar-year trend) parameter. We then add and remove parameters as needed
Negative incremental values can cause extreme outcomes in early development periods. In particular, they can cause large age-to-age factors. Describe four options for dealing with these extreme outcomes.
⇧ Identify the extreme iterations and remove them
• Only remove unreasonable extreme iterations so that the probability of extreme outcomes is not understated
⇧ Recalibrate the model
• Identify the source of the negative incremental losses and remove it if necessary. For example, if the first row has negative incremental values due to sparse data, remove it and reparameterize the model
⇧ Limit incremental losses to zero
• This involves replacing negative incremental values with zeroes in the original triangles, the sampled triangles OR the projected future incremental losses. We can also replace negative incremental losses with zeroes based on their development column (a minimal sketch of this option follows this list)
⇧ Use more than one model
• For example, if negative values are caused by salvage/subrogation, we can model the gross losses and salvage/subrogation separately. Then, we can combine the iterations assuming 100% correlation
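A minimal sketch of the "limit incremental losses to zero" option, applied here to a hypothetical incremental triangle; the same one-liner could be applied to the sampled triangles or the projected future incrementals instead:

```python
import numpy as np

# Hypothetical incremental triangle with negative cells; np.maximum
# propagates np.nan, so unobserved future cells are left untouched
incr = np.array([[100.0,   40.0,   -5.0],
                 [110.0,   -8.0, np.nan],
                 [120.0, np.nan, np.nan]])
incr_floored = np.maximum(incr, 0.0)
```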
ODP: Explain why the average of the residuals may be less than zero in practice.
If the magnitude of losses is higher for accident years that show higher development than the weighted average, the volume-weighted factors are pulled toward those years, leaving the lower-magnitude years with mostly negative residuals; the average of the residuals will therefore be negative
ODP: Explain why the average of the residuals may be greater than zero in practice.
If the magnitude of losses is lower for accident years that show higher development than the weighted average, the volume-weighted factors are pulled toward the higher-magnitude, slower-developing years, leaving the faster-developing years with mostly positive residuals; the average of the residuals will therefore be positive
Discuss the arguments for and against adjusting the residuals to an overall mean of zero.
⇧ Argument for adjusting the residuals
• If the average of the residuals is positive, then re-sampling from the residuals will add variability to the resampled incremental losses. It may also cause the resampled incremental losses to have an average greater than the fitted loss
⇧ Argument against adjusting the residuals
• The non-zero average of the residuals is a characteristic of the data set
If the decision is made to adjust the residuals to an overall mean of zero, explain the process for doing so.
We can add a single constant to all residuals such that the sum of the shifted residuals is zero
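A minimal sketch of that shift:

```python
import numpy as np

def center_residuals(res):
    # The single additive constant is minus the mean, so the shifted
    # residuals sum to exactly zero
    return res - np.mean(res)
```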
Describe the process for using an N-year weighted average of losses when determining development factors under the following frameworks: a) GLM framework
We use N years of data by excluding the first few diagonals in the triangle (which leaves us with N+1 included diagonals). This changes the shape of the triangle to a trapezoid. The excluded diagonals are given zero weight in the model, and fewer calendar year parameters are required
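A minimal sketch of the zero-weighting, where n is the triangle size and N the number of years in the average:

```python
import numpy as np

def trapezoid_weights(n, N):
    # Zero-weight every cell on the oldest diagonals so only the latest
    # N+1 diagonals (a trapezoid) enter the GLM fit; np.nan marks the
    # unobserved future cells
    w = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(n - i):
            diagonal = i + j            # calendar-year index of cell (i, j)
            w[i, j] = 1.0 if diagonal >= n - 1 - N else 0.0
    return w
```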
Describe the process for using an N-year weighted average of losses when determining development factors under the following frameworks: b) Simplified GLM framework
First, we calculate N-year average factors instead of all-year factors. Then, we exclude the first few diagonals when calculating residuals. However, when running the bootstrap simulations, we must still sample from the entire triangle so that we can calculate cumulative values. We use N-year average factors for projecting the future expected values as well
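A minimal sketch of the N-year volume-weighted factors, assuming a cumulative triangle with np.nan in unobserved cells:

```python
import numpy as np

def n_year_ata(cum, N):
    # Volume-weighted factors using only the latest N accident years
    # available at each age (i.e., the latest N diagonals)
    n = cum.shape[0]
    factors = []
    for j in range(n - 1):
        rows = slice(max(0, n - 1 - j - N), n - 1 - j)
        factors.append(np.nansum(cum[rows, j + 1]) / np.nansum(cum[rows, j]))
    return factors
```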
Briefly describe three approaches to managing missing values in the loss triangle.
⇧ Estimate the missing value using surrounding values
⇧ Exclude the missing value
⇧ If the missing value lies on the last diagonal, we can use the value in the second-to-last diagonal to construct the fitted triangle
Briefly describe three approaches to managing outliers in the loss triangle.
⇧ If these values occur on the first row of the triangle where data may be sparse, we can delete the row and run the model on a smaller triangle
⇧ Exclude the outliers completely
⇧ Exclude the outliers when calculating the age-to-age factors and the residuals, but re-sample the corresponding incremental values when simulating triangles