Shapland Flashcards

1
Q

Property of the over-dispersed Poisson model

A

The fitted incremental claims from the ODP model exactly equal the fitted incremental claims derived using the standard (volume-weighted) chain-ladder factors
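Sketch (illustrative only): the chain-ladder side of this property can be reproduced by back-casting each accident year from its latest diagonal with the all-year volume-weighted factors, then differencing. The triangle values and function names below are made up for the example.

```python
# Hypothetical 3x3 cumulative triangle (rows = accident years).
cumulative = [
    [100.0, 150.0, 165.0],
    [110.0, 168.0],
    [120.0],
]

def volume_weighted_factors(tri):
    """All-year volume-weighted age-to-age factors."""
    factors = []
    for j in range(len(tri) - 1):
        rows = [r for r in tri if len(r) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    return factors

def fitted_incrementals(tri):
    """Back-cast each row from its latest diagonal value, then difference."""
    f = volume_weighted_factors(tri)
    fitted = []
    for row in tri:
        k = len(row) - 1
        cum = [0.0] * (k + 1)
        cum[k] = row[k]                  # latest diagonal is kept as observed
        for j in range(k - 1, -1, -1):
            cum[j] = cum[j + 1] / f[j]   # divide backwards by the factors
        fitted.append([cum[0]] + [cum[j] - cum[j - 1] for j in range(1, k + 1)])
    return fitted

fitted = fitted_incrementals(cumulative)
```

In this example each column of fitted incrementals sums to the same total as the corresponding column of actual incrementals, which is the behaviour expected of the ODP fit.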

2
Q

Advantage of the ODP bootstrap model

A

Although sampling with replacement assumes the residuals are independent and identically distributed, it does not require the residuals to be normally distributed.
This allows the distributional form of the residuals to flow through the simulation process. (This is sometimes referred to as a ‘semi-parametric’ bootstrap model, since we are not parameterizing the residuals.)
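Sketch (illustrative only): one way to picture the resampling step, assuming unscaled Pearson residuals r = (q - m) / sqrt(m) and the inverse relation q* = m + r* sqrt(m). The data are hypothetical, and refinements such as degrees-of-freedom adjustments are left out.

```python
import math
import random

actual = [100.0, 50.0, 15.0, 110.0, 58.0, 120.0]   # hypothetical incrementals
fitted = [99.0, 51.0, 15.5, 111.0, 57.0, 119.5]    # hypothetical fitted values

def pearson_residuals(actual, fitted):
    """Unscaled Pearson residuals: (actual - fitted) / sqrt(fitted)."""
    return [(q - m) / math.sqrt(m) for q, m in zip(actual, fitted)]

def resample_incrementals(fitted, residuals, rng):
    """Draw residuals with replacement and invert the residual formula."""
    sampled = rng.choices(residuals, k=len(fitted))
    return [m + r * math.sqrt(m) for m, r in zip(fitted, sampled)]

rng = random.Random(2024)
residuals = pearson_residuals(actual, fitted)
sample_triangle = resample_incrementals(fitted, residuals, rng)
```

No distribution is fitted to the residuals here; whatever shape they have is carried into the sample triangle.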

3
Q

How to include process variance in the future incremental claims

A

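We assume that each future incremental claim follows a gamma distribution, with mean equal to the projected incremental value and variance equal to the mean times the scale parameter.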
This revised model incorporates process variance and parameter variance in the simulation of the historical and future data
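Sketch (illustrative only): a gamma draw with mean m and variance (scale parameter x m) can be obtained with shape = m / scale parameter and scale = scale parameter. The function name, the numbers, and the handling of non-positive means are assumptions for the example.

```python
import random

def simulate_process_variance(mean, scale_parameter, rng):
    """Gamma draw with mean `mean` and variance `scale_parameter * mean`."""
    if mean <= 0:
        return mean  # assumption: leave non-positive projections unchanged
    shape = mean / scale_parameter
    return rng.gammavariate(shape, scale_parameter)

rng = random.Random(1)
# Hypothetical projected incremental of 200 with a scale parameter of 5.
draws = [simulate_process_variance(200.0, 5.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)
```

With enough draws the sample mean should sit near 200 and the sample variance near 1,000.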

4
Q

Approach 1 for modeling an unpaid loss distribution using incurred data

A

We run a paid data model in conjunction with the incurred data model.
Then we use the random payment pattern from each iteration of the paid data model to convert the ultimate values from each corresponding incurred model iteration to develop paid losses by AY.
Advantage: it allows us to use the case reserves to help predict the ultimate losses, while still focusing on the payment stream for measuring risk
An improvement to this approach would be the inclusion of correlation between the paid and incurred models

5
Q

Approach 2 for modeling an unpaid loss distribution using incurred data

A

Apply the ODP bootstrap to the Munich chain-ladder (MCL) model. The MCL uses the inherent relationship/correlation between the paid and incurred losses to predict ultimate losses.
When paid losses are low relative to incurred losses, then future paid loss development tends to be higher than average. When paid losses are high relative to incurred losses, then future paid loss development tends to be lower than average.
2 advantages:
1. it does not require us to model paid losses twice.
2. it explicitly measures the correlation between paid and incurred losses

6
Q

Issue with using the ODP bootstrap

A

Iterations for the latest few accident years tend to be more variable than what we would expect given the simulations for earlier accident years.
This is due to the fact that MORE age-to-age factors are used to extrapolate the sampled values to develop point estimates for each iteration.

7
Q

How to fix the issue with the ODP bootstrap

A

Future incremental values can be extrapolated using the BF or Cape Cod method.

8
Q

Two drawbacks of GLM bootstrap

A
  1. The GLM must be solved for each iteration of the bootstrap model, which may slow down the simulation
  2. The model is no longer directly explainable to others using age-to-age factors
9
Q

4 benefits of GLM bootstrap

A
  1. Fewer parameters help avoid over-parameterizing the model
  2. Gives us the ability to add parameters for calendar year trends.
  3. Gives us the ability to model data shapes other than triangles
  4. Allows us to match the model parameters to the statistical features found in the data, and to extrapolate those features
10
Q

How do we produce point estimates using the GLM bootstrap model

A

Unlike the ODP bootstrap, which replicates the chain-ladder model, we do not apply age-to-age factors to each sample triangle to produce point estimates.
Instead, we fit the same GLM model underlying the residuals to each sample triangle. Then we use the resulting parameters to produce ultimates and reserve point estimates.
Drawback: the additional time required to fit a GLM to each sample triangle

11
Q

3 options to deal with extreme outcomes

A
  1. Identify the extreme iterations and remove them.
  2. Recalibrate the model (identify the source of the negative incremental losses and remove it if necessary)
  3. Limit incremental losses to zero
12
Q

Should the residuals be adjusted so that their average is zero?

A

If the average of the residuals is positive, then re-sampling from the residuals will add variability to the resampled incremental losses. It may also cause the resampled incremental losses to have an average greater than the fitted losses. In this case, the residuals should be adjusted.
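Sketch (illustrative only): one simple way to make the residuals average zero is to subtract their mean, which is the same as adding a single constant to every residual. The residual values are hypothetical.

```python
residuals = [0.8, -0.3, 1.1, -0.2, 0.4, -0.6]   # hypothetical, average is positive

mean_residual = sum(residuals) / len(residuals)
centered = [r - mean_residual for r in residuals]  # shift so the average is zero
```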

13
Q

Using an L-year weighted average for the GLM bootstrap

A
  • We use L years of data by excluding the first few diagonals in the triangle (which leaves us with L+1 included diagonals)
  • This changes the shape of the triangle to a trapezoid
  • The excluded diagonals are given zero weight in the model and fewer calendar year parameters are required.
  • When running the bootstrap simulations, we only need to sample residuals for the trapezoid that was used to parameterize the original model, because the GLM models incremental claims directly and can be parameterized using a trapezoid. Each parameter set is then used to project the sampled triangles to ultimate.
14
Q

Using an L-year weighted average for the ODP bootstrap

A
  • We calculate L-year average factors instead of all-year factors
  • We exclude the first few diagonals when calculating residuals
  • We still sample residuals for the entire triangle when running the bootstrap, because the ODP bootstrap requires cumulative values in order to calculate link ratios. Once we have cumulative values for each sample triangle, we use L-year average factors to project the sample triangles to ultimate
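Sketch (illustrative only): L-year volume-weighted factors on a cumulative triangle, keeping only link ratios that end on one of the latest L diagonals. The triangle values and the function name are hypothetical.

```python
# Hypothetical 4x4 cumulative triangle (rows = accident years).
triangle = [
    [100.0, 150.0, 165.0, 170.0],
    [110.0, 168.0, 185.0],
    [120.0, 175.0],
    [130.0],
]

def l_year_factors(tri, years):
    """Volume-weighted factors using only the latest `years` link ratios per column."""
    n = len(tri)
    factors = []
    for j in range(n - 1):
        # The ratio from cell (i, j) to (i, j + 1) ends on diagonal i + j + 1;
        # the latest diagonal has index n - 1.
        rows = [i for i in range(n)
                if len(tri[i]) > j + 1 and i + j + 1 >= n - years]
        numerator = sum(tri[i][j + 1] for i in rows)
        denominator = sum(tri[i][j] for i in rows)
        factors.append(numerator / denominator)
    return factors

two_year = l_year_factors(triangle, 2)
all_year = l_year_factors(triangle, 3)
```

With `years=2` the first factor drops the oldest accident year's ratio; with `years=3` this small triangle reverts to all-year factors.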
15
Q

What do missing values affect?

A
  1. Loss development factors
  2. Fitted triangle (if the missing value lies on the last diagonal)
  3. Residuals
  4. Degrees of freedom
16
Q

Dealing with missing values for ODP bootstrap

A
  1. Estimate the missing value using surrounding values
  2. Exclude the missing value when calculating the loss development factors. No corresponding residual will be calculated for the missing value. Similar to the L-year weighted average, sample for the entire triangle. Once the sample triangles are calculated, we should exclude the cells corresponding to the missing values from the projection process
  3. If the missing value lies on the last diagonal, we can either estimate the value OR we can use the value in the second-to-last diagonal to construct the fitted triangle
17
Q

Dealing with missing values for GLM bootstrap

A

The missing data simply reduces the number of observations used in the model.
Similar to the ODP bootstrap, we could use any one of the 3 methods to estimate the missing data

18
Q

Managing outliers for ODP bootstrap

A
  1. Exclude the outliers completely (proceed in the same manner as a missing value)
  2. Exclude the outliers when calculating the age-to-age factors and the residuals (similar to missing values), BUT include the outlier cells during the sample triangle projection process. (remove the extreme impact of the incremental cell by excluding the outlier during the fitting process while still including some non-extreme variability by including the cell in the sample triangle projections)
19
Q

3 options when excluding outliers to calculate age-to-age factors

A
  1. Exclude in the numerator
  2. Exclude in the denominator
  3. Exclude in the numerator and denominator
20
Q

Managing outliers for GLM bootstrap

A

Outliers are treated similarly to missing data.
If the outliers are not considered representative of real variability, they should be excluded and the model should be parameterized without them

21
Q

What do we do if there are a significant number of outliers

A
  1. Might indicate that the model is a poor fit to the data
  2. For GLM, new parameters could be chosen OR the distribution of the error could be changed.
  3. For ODP, an L-year weighted average could be used to provide a better model fit.
22
Q

3 options to adjust for heteroscedasticity

A
  1. Stratified sampling
  2. Calculating variance parameters
  3. Calculating scale parameters
23
Q

Describe stratified sampling

A
  1. Group development periods with homogeneous variances
  2. Sample with replacement from the residuals in each group separately
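Sketch (illustrative only): the two steps above with hypothetical residuals and a hypothetical grouping of development periods into "early" and "late".

```python
import random

# Hypothetical residuals keyed by development period.
residuals_by_dev = {
    1: [1.8, -2.1, 1.2, -0.9],
    2: [1.5, -1.7, 0.8],
    3: [0.3, -0.2],
    4: [0.1],
}
# Step 1: development periods judged to have similar variance share a group.
group_of_dev = {1: "early", 2: "early", 3: "late", 4: "late"}

def build_pools(residuals_by_dev, group_of_dev):
    """Pool the residuals of all development periods in the same group."""
    pools = {}
    for dev, residuals in residuals_by_dev.items():
        pools.setdefault(group_of_dev[dev], []).extend(residuals)
    return pools

def sample_for_cell(dev, pools, group_of_dev, rng):
    """Step 2: sample with replacement from the cell's own group only."""
    return rng.choice(pools[group_of_dev[dev]])

rng = random.Random(3)
pools = build_pools(residuals_by_dev, group_of_dev)
draw_early = sample_for_cell(1, pools, group_of_dev, rng)
draw_late = sample_for_cell(4, pools, group_of_dev, rng)
```

The "late" pool holds only three residuals here, which illustrates how small groups limit the variability of the sampled outcomes.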
24
Q

Advantage of stratified sampling

A

It’s straightforward and easy to implement

25
Q

Disadvantage of stratified sampling

A

Some groups may only have a few residuals in them, which limits the amount of variability in the possible outcomes

26
Q

What is heteroecthesious data

A

Incomplete or uneven exposures at interim evaluation dates

27
Q

Describe partial first development period data

A

Occurs when the first development column has a different exposure period than the rest of the columns.
This is NOT a problem for parameterizing the ODP bootstrap model since the Pearson residuals use the square root of the fitted value to make them all exposure independent

28
Q

How to adjust for the partial first development period data

A

In a deterministic analysis (not bootstrapping), the most recent accident year needs to be adjusted to remove exposures beyond the evaluation date. We can reduce the projected future payments by half to remove the exposures from 6/30 to 12/31.
During ODP bootstrap simulation process, we do the same thing. Once the projected future values have been reduced by half, we simulate the process variance as usual.
Alternatively, we can reduce the future values by half AFTER simulating the process variance

29
Q

Describe the partial last calendar period data

A

Occurs when the latest diagonal only has a six-month development period

30
Q

How to adjust for the partial last calendar period data

A

In a deterministic analysis, we can exclude the latest diagonal when calculating age-to-age factors, interpolate those factors for the exposures in the latest diagonal, and use the interpolated factors to project the future values.
When parameterizing the ODP bootstrap model, we annualize the exposures in the last diagonal to make them consistent with the rest of the triangle. The fitted triangle is calculated from this annualized triangle to obtain residuals
During the ODP bootstrap simulation process, age-to-age factors are calculated from the annualized sample triangles and interpolated. Then, the latest diagonal in the sample triangle is adjusted back to a six month period. The cumulative values are then multiplied by the interpolated age-to-age factors to project future values. We must reduce the future values for the latest accident year by half

31
Q

Exposure adjustments under the ODP bootstrap model

A

We divide the claim data by earned exposure for each AY. This normally improves the fit of the model.
The simulation process is then run on the adjusted data.
After the process variance step is completed, we multiply the results by the earned exposures to restate them in terms of total values

32
Q

Exposure adjustments under the GLM bootstrap model

A

Similar to the ODP, the GLM model is fit to the exposure adjusted losses.
Main difference: exposure adjusted losses with HIGHER exposures are assumed to have LOWER variance when fitting the GLM.
Exposure adjustments could allow fewer AY parameters for the GLM bootstrap model

33
Q

Selecting tail factors for ODP bootstrap model

A

Tail factor can be extrapolated
The tail factor standard deviation is 50% or less of (tail factor - 1)

34
Q

Selecting tail factors for GLM bootstrap model

A

Assume that the final development period will continue to apply incrementally until its effect on the future incremental claims is negligible

35
Q

Diagnostic tool 1 - Residual graphs

A

Testing the assumption that residuals are independent and identically distributed
We can graph the residuals by development period, accident period or calendar period or against the fitted incremental losses

36
Q

Trends in residual graphs

A

We should be able to draw a relatively flat line through the residuals.
Residuals should appear random

37
Q

Adjusting Heteroscedasticity in residual graphs

A

We should group residuals into hetero groups and adjust them to a common standard deviation.
To help visualize how the residuals should be grouped, we can graph relative standard deviations and look for natural groupings
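Sketch (illustrative only): adjusting each hetero group to a common standard deviation, assuming the adjustment factor for a group is the standard deviation of all residuals divided by the standard deviation of that group's residuals. Group names and values are hypothetical.

```python
import statistics

# Hypothetical residuals already split into two hetero groups.
groups = {
    "early": [1.9, -2.2, 1.4, -1.6],
    "late": [0.3, -0.4, 0.2, -0.1],
}

all_residuals = [r for rs in groups.values() for r in rs]
overall_sd = statistics.stdev(all_residuals)

# Assumed factor: overall standard deviation / group standard deviation.
hetero_factor = {name: overall_sd / statistics.stdev(rs) for name, rs in groups.items()}

# Multiplying by the factor puts every group on the common standard deviation.
adjusted = {name: [r * hetero_factor[name] for r in rs] for name, rs in groups.items()}
```

When a sampled adjusted residual is placed back into a cell, it would be divided by that cell's group factor so the cell keeps its own level of variability; that step is not shown.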

38
Q

Diagnostic tool 2 - Normality test

A

Although the ODP model does not require residuals to be normally distributed, it’s still helpful to compare residuals against a normal distribution
This allows us to compare parameter sets and assess the skewness of the residuals.
This test uses both graphs AND calculated test values

39
Q

Describe normality plots

A

If the data points are tightly distributed around the diagonal line, then the residuals are assumed to be normally distributed

40
Q

Describe calculated test values for testing normality

A
  1. P-value: should be large (greater than 5%). Typically based on the Shapiro-Francia test for normality
  2. R^2: should be close to 1
  3. AIC & BIC: these adjust for the number of parameters used in the model. They should be small
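Sketch (illustrative only): an R^2 for a normality plot, computed as the squared correlation between the sorted residuals and approximate normal scores. The plotting positions (k - 0.375) / (n + 0.25) are an assumed choice, and the residuals are hypothetical.

```python
import statistics

def normal_scores(n):
    """Approximate expected normal order statistics for a sample of size n."""
    nd = statistics.NormalDist()
    return [nd.inv_cdf((k - 0.375) / (n + 0.25)) for k in range(1, n + 1)]

def probability_plot_r2(residuals):
    """Squared correlation between sorted residuals and normal scores."""
    x = normal_scores(len(residuals))
    y = sorted(residuals)
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

residuals = [-1.5, -0.7, -0.3, 0.0, 0.2, 0.6, 1.1, 3.9]  # hypothetical, right-skewed
r2 = probability_plot_r2(residuals)
```

Residuals that fall exactly on the diagonal give an R^2 of 1; skewed residuals such as these give a lower value.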
41
Q

How to identify outliers

A

Use a box-whisker plot.
The whiskers extend to the largest values within 3 times the inter-quartile range. Values beyond the whiskers are considered outliers
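Sketch (illustrative only): flagging residuals beyond the whiskers, assuming the 3 x inter-quartile range is measured outward from the 25th and 75th percentiles. The residuals are hypothetical, and the quartile convention is whatever `statistics.quantiles` uses by default.

```python
import statistics

def find_outliers(residuals, multiple=3.0):
    """Return residuals lying more than `multiple` x IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    iqr = q3 - q1
    lower = q1 - multiple * iqr
    upper = q3 + multiple * iqr
    return [r for r in residuals if r < lower or r > upper]

residuals = [-1.2, -0.8, -0.5, -0.1, 0.0, 0.2, 0.4, 0.7, 1.1, 9.5]  # hypothetical
outliers = find_outliers(residuals)
```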

42
Q

Describe the principle of parsimony

A

A model with fewer parameters is preferred as long as the goodness of fit is not markedly different

43
Q

How to find the optimal mix of parameters in the GLM bootstrap model

A
  1. Start with a basic GLM model which includes one parameter each for accident, development, and calendar periods
  2. Check the residual plots. If it doesn’t look right, we add more parameters.
    The implied development patterns for the GLM should look like a smoothed version of the ODP bootstrap chain-ladder development pattern
44
Q

When reviewing the estimated unpaid model results

A

The standard error should increase when moving from the oldest years to the most recent years (because the standard error follows the magnitude of the results).
The total standard error should be larger than any individual year's standard error.
The coefficient of variation should generally decrease when moving from the oldest years to the most recent years (because the older AYs have fewer payments remaining, so all of the variability is reflected in the coefficient).
The total coefficient of variation should be smaller than any individual year’s coefficient of variation.
The standard error or coefficient of variation for all years combined will be LESS than the sum of the standard errors or coefficients of variation for the individual years, because accident years are assumed to be independent.
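Sketch (illustrative only): under the independence assumption stated above, the total variance is the sum of the accident-year variances, so the total standard error is the square root of the sum of squares. The means and standard errors are hypothetical.

```python
import math

mean_unpaid = [50.0, 200.0, 900.0]      # hypothetical, oldest to most recent AY
standard_error = [10.0, 25.0, 60.0]     # hypothetical, oldest to most recent AY

cov_by_year = [se / m for se, m in zip(standard_error, mean_unpaid)]

# Independence: variances add, standard errors do not.
total_se = math.sqrt(sum(se ** 2 for se in standard_error))
total_cov = total_se / sum(mean_unpaid)
```

In this example the total standard error (about 65.8) is above the largest single year (60) but below the simple sum (95), and the total CoV is below every individual year's CoV.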

45
Q

Why the coefficient of variation may rise in the most recent years

A
  1. With an increasing number of parameters in the model, parameter uncertainty increases when moving from the oldest years to the most recent years. This parameter uncertainty may overpower the process uncertainty, causing an increase in variability
  2. The model may simply be overestimating the variability in the most recent years. In this case, the BF or Cape Cod models may need to be used in place of the CL method.
46
Q

Two methods for combining the results from multiple models

A
  1. Run models with the same random variables. Once all the models have been run, the incremental values for each model are weighted together (for each iteration by AY)
  2. Run models with independent random variables. Once all the models have been run, the weights are used to select a model (for each iteration by AY) by randomly sampling the specified percentage of iterations from each model. The result is a weighted mixture of models
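Sketch (illustrative only): the two combination methods for a single accident year, using hypothetical results from two models labelled "cl" and "bf". The random selection below matches the weights on average rather than picking an exact percentage of iterations, which is a simplification.

```python
import random

# Hypothetical unpaid estimates by iteration for one accident year.
results = {
    "cl": [1000.0, 1100.0, 950.0, 1050.0],
    "bf": [900.0, 980.0, 940.0, 1010.0],
}
weights = {"cl": 0.25, "bf": 0.75}

def weighted_blend(results, weights):
    """Method 1: weight the models together within each iteration."""
    n = len(next(iter(results.values())))
    return [sum(weights[m] * results[m][i] for m in results) for i in range(n)]

def weighted_mixture(results, weights, rng):
    """Method 2: pick one model per iteration with probability equal to its weight."""
    names = list(results)
    n = len(results[names[0]])
    picks = rng.choices(names, weights=[weights[m] for m in names], k=n)
    return [results[m][i] for i, m in enumerate(picks)]

rng = random.Random(5)
blended = weighted_blend(results, weights)
mixture = weighted_mixture(results, weights, rng)
```

Method 1 only makes sense when the models share the same random variables, so that iteration i of one model lines up with iteration i of the other.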
47
Q

How can we use smoothed results after fitting a distribution

A
  1. Assess the quality of the fit
  2. Parameterize a DFA (dynamic financial analysis) model
  3. Estimate extreme values
  4. Estimate TVaR
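Sketch (illustrative only): estimating TVaR from a smoothed result, assuming a lognormal has been fitted to the simulated unpaid amounts (the card does not name a distribution). For a lognormal, TVaR at level p is exp(mu + sigma^2 / 2) x Phi(sigma - z_p) / (1 - p). The simulated values are hypothetical.

```python
import math
import statistics

def fit_lognormal(simulated):
    """Fit mu and sigma from the mean and standard deviation of the logs."""
    logs = [math.log(v) for v in simulated]
    return statistics.mean(logs), statistics.stdev(logs)

def lognormal_tvar(mu, sigma, p):
    """Expected value of a lognormal beyond its p-th percentile."""
    standard_normal = statistics.NormalDist()
    z = standard_normal.inv_cdf(p)
    tail_weight = standard_normal.cdf(sigma - z)
    return math.exp(mu + sigma ** 2 / 2) * tail_weight / (1 - p)

simulated_unpaid = [820.0, 905.0, 990.0, 1040.0, 1100.0, 1180.0, 1275.0, 1460.0]
mu, sigma = fit_lognormal(simulated_unpaid)
tvar_99 = lognormal_tvar(mu, sigma, 0.99)
```

Because the tail comes from the fitted curve rather than a handful of extreme iterations, the estimate is less sensitive to simulation noise.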
48
Q

Benefit of using smoothed results

A

Some of the random noise is prevented from distorting the calculations of specific metrics

49
Q

Reviewing estimated cash flow results

A

For AY, standard errors increase and CoVs decrease as we move from older to more recent years.
For CY, standard errors decrease and CoVs increase as we move further into the future.

50
Q

How to simulate correlated variables

A

Using a multivariate distribution whose parameters and correlations have been specified.
However, we don’t know the distribution of each BU

51
Q

2 correlation processes for the bootstrap model

A
  1. Location mapping
  2. Re-sorting
52
Q

Describe location mapping

A
  • Pick a BU
  • For each iteration, sample a residual and then note where it belonged in the original residual triangle
  • Each of the other segments is then sampled using the residuals at the same locations in their respective residual triangles.
    This preserves the correlation of the original residuals in the sampling process
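Sketch (illustrative only): location mapping for two business units with hypothetical residual triangles of the same shape. Locations are sampled once per iteration and then reused for every unit.

```python
import random

# Hypothetical residual triangles for two business units (same shape).
residuals_a = [[0.5, -0.2, 0.1], [-0.4, 0.3], [0.0]]
residuals_b = [[1.0, -0.6, 0.2], [-0.9, 0.7], [0.1]]

# Every (row, column) location in the first unit's triangle.
locations = [(i, j) for i, row in enumerate(residuals_a) for j in range(len(row))]

def sample_locations(locations, rng):
    """Sample locations with replacement, one per cell of the triangle."""
    return rng.choices(locations, k=len(locations))

def residuals_at(triangle, picks):
    """Look up the residual found at each sampled location."""
    return [triangle[i][j] for i, j in picks]

rng = random.Random(7)
picks = sample_locations(locations, rng)
sample_a = residuals_at(residuals_a, picks)
sample_b = residuals_at(residuals_b, picks)   # same locations as unit A
```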
53
Q

Pros and Cons of location mapping

A

Benefit: it can be easily implemented in a spreadsheet and it does not require us to estimate a correlation matrix.
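Cons: it requires all of the business segments to have residual triangles that are the same size with no missing values, and because it relies on the correlation in the original residuals, other correlation assumptions cannot be tested for stress-testing purposes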

54
Q

Describe re-sorting

A

To cause correlation among BU in a bootstrap model, the residuals are re-sorted until the rank correlation between each business matches the desired correlation.
P-values can be calculated for each correlation coefficient to test its significance
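Sketch (illustrative only): one simple way to re-sort two sets of independent draws so their rank correlation moves toward a target. It reorders each set to follow the ranks of correlated normal scores; this is a stand-in technique chosen for the example, not necessarily the algorithm used in practice, and all inputs are hypothetical.

```python
import math
import random

def ranks(values):
    """Rank of each element (0 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    rank = [0] * len(values)
    for position, k in enumerate(order):
        rank[k] = position
    return rank

def resort_to_match(values, reference):
    """Reorder `values` so their ranks line up with the ranks of `reference`."""
    ordered = sorted(values)
    return [ordered[r] for r in ranks(reference)]

def spearman(x, y):
    """Spearman rank correlation (no-ties formula)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

rng = random.Random(11)
n, rho = 2000, 0.5
draws_a = [rng.gauss(0.0, 1.0) for _ in range(n)]   # hypothetical sampled residuals, BU A
draws_b = [rng.gauss(0.0, 2.0) for _ in range(n)]   # hypothetical sampled residuals, BU B

# Correlated normal scores that supply the target ordering.
z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
z2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0) for a in z1]

sorted_a = resort_to_match(draws_a, z1)
sorted_b = resort_to_match(draws_b, z2)
achieved = spearman(sorted_a, sorted_b)
```

Each unit keeps exactly the same set of values; only the order in which they are paired across units changes.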

55
Q

Benefits of re-sorting

A

Residual triangles may have different shapes/sizes, different correlation assumptions may be employed AND different correlation algorithms may have beneficial impacts on the aggregate distribution

56
Q

Cons of re-sorting

A

Need to specify a desired correlation matrix

57
Q

Advantages of the GLM framework

A
  1. Can tailor the model to the statistical features of the data
  2. Can use fewer parameters to avoid over-parameterization
  3. Can model data that’s not in a loss triangle
58
Q

Disadvantages of the GLM Framework

A
  1. Simulation is slower because the GLM must be solved for in each iteration
  2. Can’t directly explain the model using LDFs
59
Q

Advantages of the ODP bootstrap

A
  1. Can use the simpler LDF method and the model will still be based on the GLM framework
  2. Using LDFs makes the model more easily explainable to others
  3. The GLM uses a log-link and may not work with negative incremental values, but the simplified GLM will still get a solution
60
Q

Disadvantages of the ODP bootstrap

A
  1. Unable to adjust for calendar-year effects
  2. Requires many parameters and can over-fit the data