Shapland & Leong Flashcards

1
Q

Provide two advantages of bootstrapping.

A

- They allow us to calculate how likely it is that the ultimate value of the claims will exceed a certain amount
- They are able to reflect the general skewness of insurance losses

2
Q

Provide one disadvantage of bootstrapping.

A

Bootstrap models are more complex than other models and more time-consuming to create

3
Q

Describe how using the over-dispersed Poisson model to model incremental claims relates a GLM to the standard chain-ladder method.

A

If we start with the latest diagonal and divide backwards successively by each age-to-age factor, we obtain fitted cumulative claims. Using subtraction, the fitted cumulative claims can be used to determine the fitted incremental claims. These fitted incremental claims exactly match those obtained using the over-dispersed Poisson model

4
Q

Briefly describe three important outcomes from this relationship. (ODP & CL)

A

- A simple link ratio algorithm can be used in place of the more complicated GLM algorithm, while still maintaining an underlying GLM framework
- The use of the age-to-age factors serves as a bridge to the deterministic framework, which allows the model to be more easily explained
- In general, the log link function does not work for negative incremental claims; using link ratios remedies this problem

5
Q

Fully describe the bootstrapping process. Assume the data does not require any modifications.

A

- Calculate the fitted incremental claims using the GLM framework or the chain-ladder age-to-age factors
- Calculate the residuals between the fitted incremental claims and the actual incremental claims
- Create a triangle of random residuals by sampling with replacement from the set of non-zero residuals
- Create a sample incremental triangle using the random residual triangle
- Accumulate the sample incremental triangle to create a sample cumulative triangle
- Project the sample cumulative data to ultimate using the chain-ladder method
- Calculate the reserve point estimate for each accident year using the projected data
- Iterate through this process to create a distribution of reserves for each accident year
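The steps above can be sketched in code. This is a minimal illustration on a toy 3x3 cumulative paid triangle, assuming volume-weighted all-year age-to-age factors and unscaled Pearson residuals; it omits refinements such as the degrees-of-freedom adjustment and process variance, and all names and values are illustrative.

```python
import random

# Toy cumulative paid triangle (3 accident years x 3 development ages)
tri = [
    [100.0, 180.0, 200.0],
    [110.0, 200.0, None],
    [120.0, None,  None],
]
n = len(tri)

def ata_factors(cum):
    """Volume-weighted all-year age-to-age factors."""
    f = []
    for d in range(n - 1):
        num = sum(row[d + 1] for row in cum if row[d + 1] is not None)
        den = sum(row[d] for row in cum if row[d + 1] is not None)
        f.append(num / den)
    return f

def fitted_incrementals(cum, f):
    """Divide the latest diagonal backwards by the factors, then difference."""
    fit = []
    for w, row in enumerate(cum):
        last = n - 1 - w                      # latest observed age for year w
        c = [0.0] * (last + 1)
        c[last] = row[last]
        for d in range(last - 1, -1, -1):
            c[d] = c[d + 1] / f[d]
        fit.append([c[0]] + [c[d] - c[d - 1] for d in range(1, last + 1)])
    return fit

f = ata_factors(tri)
m = fitted_incrementals(tri, f)

# Unscaled Pearson residuals r = (q - m) / sqrt(m) between actual and fitted
resids = []
for w in range(n):
    for d in range(n - w):
        q = tri[w][d] - (tri[w][d - 1] if d > 0 else 0.0)
        resids.append((q - m[w][d]) / m[w][d] ** 0.5)
resids = [r for r in resids if abs(r) > 1e-12]   # keep non-zero residuals

random.seed(1)
totals = []
for _ in range(1000):
    total_reserve = 0.0
    for w in range(n):
        # Sample triangle: q* = r* sqrt(m) + m, accumulated to the latest age
        cum = 0.0
        for d in range(n - w):
            cum += random.choice(resids) * m[w][d] ** 0.5 + m[w][d]
        latest = cum
        # Project to ultimate with the chain-ladder factors
        for d in range(n - 1 - w, n - 1):
            cum *= f[d]
        total_reserve += cum - latest
    totals.append(total_reserve)

print(sum(totals) / len(totals))   # mean of the simulated total reserve
```

Repeating the loop yields the distribution of total reserves; per-accident-year distributions fall out of the same loop by collecting the year-level reserves instead of their sum.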

6
Q

Identify the assumptions underlying the residual sampling process.

A

The residual sampling process assumes that the residuals are independent and identically distributed. However, it does NOT require the residuals to be normally distributed

7
Q

Explain why these residual sampling assumptions are advantageous.

A

This is an advantage since the distributional form of the residuals will flow through the simulation process

8
Q

Briefly describe two uses of the degrees of freedom adjustment factor.

A

- The distribution of reserve point estimates from the sample triangles could be multiplied by the degrees of freedom adjustment factor to allow for over-dispersion of the residuals in the sampling process
- The Pearson residuals could be multiplied by the degrees of freedom adjustment factor to correct for a bias in the residuals
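As a quick numeric aid: the degrees of freedom adjustment factor is the square root of N over N minus p, where N is the number of data points in the triangle and p the number of fitted parameters.

```python
# Degrees-of-freedom adjustment factor: f_DoF = sqrt(N / (N - p)).
# For an n x n incremental triangle, N = n(n+1)/2 data points and the
# basic ODP GLM has p = 2n - 1 parameters (n alphas plus n - 1 betas).
def dof_factor(n):
    N = n * (n + 1) // 2
    p = 2 * n - 1
    return (N / (N - p)) ** 0.5

print(dof_factor(10))   # 10x10 triangle: sqrt(55 / 36), about 1.236
```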

9
Q

Identify a downfall of the degrees of freedom adjustment factor and state how the issue can be remedied.

A

- Downfall: the degrees of freedom bias correction does not create standardized residuals. This is important because standardized residuals ensure that each residual has the same variance
- Remedy: to calculate the standardized residuals, apply a hat matrix adjustment factor to the unscaled Pearson residuals

10
Q

Discuss the difference between bootstrapping paid data and bootstrapping incurred data.

A

Bootstrapping paid data provides a distribution of possible outcomes for total unpaid claims. Bootstrapping incurred data provides a distribution of possible outcomes for IBNR

11
Q

Explain how the results of an incurred data model can be converted to a paid data model.

A

To convert the results of an incurred data model to a payment stream, we apply payment patterns to the ultimate value of the incurred claims

12
Q

Explain the benefit of bootstrapping the incurred data triangle.

A

Bootstrapping incurred data leverages the case reserves to better predict the ultimate claims. This improves estimates, while still focusing on the payment stream for measuring risk

13
Q

Identify one deterministic method for reducing the variability in the extrapolation of future incremental values.

A

Bornhuetter/Ferguson method

14
Q

Explain how this method can be made stochastic. (Reducing variability in the extrapolation of future incremental values)

A

In addition to specifying a priori loss ratios for the Bornhuetter/Ferguson method, we can add a vector of standard deviations to go with these means. We can then assume a distribution and simulate a different a priori loss ratio for every iteration of the model
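A minimal sketch of this idea, assuming a normal distribution for the a priori loss ratios (the choice of distribution is open; all names and values here are illustrative):

```python
import random

# Hypothetical a priori loss ratio means and standard deviations by
# accident year, plus earned premium, for a stochastic BF setup.
means   = [0.70, 0.72, 0.75]
sds     = [0.05, 0.06, 0.08]
premium = [1000.0, 1050.0, 1100.0]

random.seed(0)
iterations = []
for _ in range(1000):
    # Draw a fresh a priori loss ratio per accident year for each iteration
    elr = [random.gauss(mu, sd) for mu, sd in zip(means, sds)]
    iterations.append([lr * p for lr, p in zip(elr, premium)])
```

Each iteration then feeds its own simulated a priori expected losses into the BF projection, so the a priori uncertainty flows through the reserve distribution.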

15
Q

Identify four advantages to generalizing the over-dispersed Poisson model.

A

- Using fewer parameters helps avoid over-parameterizing the model
- Gives us the ability to add parameters for calendar year trends
- Gives us the ability to model data shapes other than triangles
- Allows us to match the model parameters to the statistical features found in the data, and to extrapolate those features

16
Q

Identify two disadvantages to generalizing the over-dispersed Poisson model.

A

- The GLM must be solved for each iteration of the bootstrap model, which may slow down the simulation
- The model is no longer directly explainable to others using age-to-age factors

17
Q

Provide a disadvantage to including calendar year trends in the over-dispersed Poisson model.

A

By including calendar year trends, the system of equations underlying the GLM no longer has a unique solution

18
Q

Explain how this problem can be remedied. (CY trends in the ODP)

A

To deal with this issue, we start with a model with one alpha parameter, one beta parameter and one gamma (calendar-year trend) parameter. We then add and remove parameters as needed

19
Q

Negative incremental values can cause extreme outcomes in early development periods. In particular, they can cause large age-to-age factors. Describe four options for dealing with these extreme outcomes.

A

- Identify the extreme iterations and remove them
  • Only remove unreasonable extreme iterations so that the probability of extreme outcomes is not understated
- Recalibrate the model
  • Identify the source of the negative incremental losses and remove it if necessary. For example, if the first row has negative incremental values due to sparse data, remove it and reparameterize the model
- Limit incremental losses to zero
  • This involves replacing negative incremental values with zeroes in the original triangles, zeroes in the sampled triangles OR zeroes in the projected future incremental losses. We can also replace negative incremental losses with zeroes based on their development column
- Use more than one model
  • For example, if negative values are caused by salvage/subrogation, we can model the gross losses and salvage/subrogation separately. Then, we can combine the iterations assuming 100% correlation

20
Q

ODP: Explain why the average of the residuals may be less than zero in practice.

A

If the magnitude of losses is higher for an accident year that shows higher development than the weighted average, then the average of the residuals will be negative

21
Q

ODP: Explain why the average of the residuals may be greater than zero in practice.

A

If the magnitude of losses is lower for an accident year that shows higher development than the weighted average, then the average of the residuals will be positive

22
Q

Discuss the arguments for and against adjusting the residuals to an overall mean of zero.

A

- Argument for adjusting the residuals
  • If the average of the residuals is positive, then re-sampling from the residuals will add variability to the resampled incremental losses. It may also cause the resampled incremental losses to have an average greater than the fitted loss
- Argument against adjusting the residuals
  • The non-zero average of the residuals is a characteristic of the data set

23
Q

If the decision is made to adjust the residuals to an overall mean of zero, explain the process for doing so.

A

We can add a single constant to all residuals such that the sum of the shifted residuals is zero
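The shift is simply the negative of the residual mean, applied to every residual:

```python
# Shift residuals to an overall mean of zero: add one constant
# (minus the mean) to every residual.
resids = [0.8, -0.3, 0.1, -0.2]            # illustrative residuals
shift = -sum(resids) / len(resids)
adjusted = [r + shift for r in resids]
print(sum(adjusted))                        # ~0 (up to floating-point error)
```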

24
Q

Describe the process for using an N-year weighted average of losses when determining development factors under the following frameworks: a) GLM framework

A

We use N years of data by excluding the first few diagonals in the triangle (which leaves us with N+1 included diagonals). This changes the shape of the triangle to a trapezoid. The excluded diagonals are given zero weight in the model and fewer calendar year parameters are required

25
Q

Describe the process for using an N-year weighted average of losses when determining development factors under the following frameworks: b) Simplified GLM framework

A

First, we calculate N-year average factors instead of all-year factors. Then, we exclude the first few diagonals when calculating residuals. However, when running the bootstrap simulations, we must still sample from the entire triangle so that we can calculate cumulative values. We use N-year average factors for projecting the future expected values as well

26
Q

Briefly describe three approaches to managing missing values in the loss triangle.

A

- Estimate the missing value using surrounding values
- Exclude the missing value
- If the missing value lies on the last diagonal, we can use the value in the second-to-last diagonal to construct the fitted triangle

27
Q

Briefly describe three approaches to managing outliers in the loss triangle.

A

- If these values occur on the first row of the triangle where data may be sparse, we can delete the row and run the model on a smaller triangle
- Exclude the outliers completely
- Exclude the outliers when calculating the age-to-age factors and the residuals, but re-sample the corresponding incremental when simulating triangles

28
Q

Explain the difference between homoscedastic residuals and heteroscedastic residuals.

A

- Homoscedasticity – residuals are independent and identically distributed
- Heteroscedasticity – residuals are independent, but NOT identically distributed

29
Q

Describe two options when adjusting residuals for heteroscedasticity.

A

- Stratified sampling
  • Group development periods with homogeneous variances
  • Sample with replacement from the residuals in each group separately
- Variance parameters
  • Group development periods with homogeneous variances
  • Calculate the standard deviation of the residuals in each of the “hetero” groups
  • Calculate the hetero-adjustment factor for each group
  • Multiply all residuals in each group by the hetero-adjustment factor for that group
  • All groups now have the same standard deviation, and we can sample with replacement from among ALL residuals
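The variance-parameter option can be sketched as follows. One common normalization scales each group to the standard deviation of all residuals combined (h_i = s_total / s_i); the exact target used in practice may differ, and the group values here are illustrative.

```python
import statistics

# Residuals grouped into "hetero" groups of development periods
groups = {
    "early": [1.2, -0.8, 0.5, -1.1, 0.9],
    "late":  [0.3, -0.2, 0.1],
}

# Hetero-adjustment factor for group i: h_i = s_total / s_i, so every
# group ends up with the same standard deviation before pooled sampling.
all_resids = [r for g in groups.values() for r in g]
s_total = statistics.stdev(all_resids)
adjusted = {
    name: [r * (s_total / statistics.stdev(g)) for r in g]
    for name, g in groups.items()
}
```

After this adjustment, sampling with replacement can draw from the pooled set of all adjusted residuals.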

30
Q

Briefly describe the interaction between heteroscedasticity and credibility.

A

Since there are fewer residuals for older development periods, credibility decreases in the tail of the triangle. It’s important NOT to overreact to “apparent” heteroscedasticity in older development years

31
Q

Define heteroecthesious data.

A

Heteroecthesious data refers to incomplete or uneven exposures at interim evaluation dates

32
Q

Briefly describe two types of heteroecthesious data.

A

- Partial first development period data – occurs when the first development column has a different exposure period than the rest of the columns
- Partial last calendar period data – occurs when the latest diagonal only has a six-month development period

33
Q

In order to assess the quality of a stochastic model, various diagnostic tests should be run. Identify three purposes of using diagnostic tools.

A

- Test various assumptions in the model
- Gauge the quality of the model fit
- Guide the adjustment of model parameters

34
Q

Describe the process for determining if a model is over-parameterized.

A

To test whether or not a model is over-parameterized, use the following steps:
- Start with a basic model that includes one parameter each for accident, development and calendar periods
- Use trial and error to find a good fit to the data (i.e. add and remove parameters until a good fit is found)
- Run all of the standard diagnostics (normality plots, box-whisker plots, p-values, etc.) and compare them to the model with more parameters
- If the diagnostics are comparable, then the model with fewer parameters is preferred (principle of parsimony)

35
Q

Provide two reasons why the coefficient of variation may rise in the most recent accident years.

A

- With an increasing number of parameters in the model, parameter uncertainty increases when moving from the oldest years to the most recent years. This parameter uncertainty may overpower the process uncertainty, causing an increase in variability
- The model may simply be overestimating the variability in the most recent years

36
Q

Describe two methods for combining the results of multiple stochastic models.

A

- Run models with the same random variables
  • Each model is run with the exact same random variables. Once all of the models have been run, the incremental values for each model are weighted together (for each iteration by accident year)
- Run models with independent random variables
  • Each model is run with its own random variables. Once all of the models have been run, weights are used to select a model (for each iteration by accident year). The result is a weighted mixture of models

37
Q

Provide four reasons for fitting a curve to unpaid claim distributions.

A

- Assess the quality of the fit
- Parameterize a DFA (dynamic financial analysis) model
- Estimate extreme values
- Estimate TVaR

38
Q

An actuary is estimating ultimate loss ratios by accident year using a bootstrap model. a) Briefly describe how the actuary can estimate the complete variability in the loss ratio.

A

He can estimate the complete variability in the loss ratio by using all simulated values to estimate the ultimate loss ratio by accident year (rather than just using the values beyond the end of the historical triangle)

39
Q

An actuary is estimating ultimate loss ratios by accident year using a bootstrap model. b) Briefly describe how the actuary can estimate the future variability in the loss ratio.

A

He can estimate the future variability in the loss ratio by using only the future simulated values to estimate the ultimate loss ratio (i.e. add the estimated unpaid losses to the actual cumulative losses to date)

40
Q

Describe the process for creating distribution graphs. Include a discussion on kernel density functions.

A

We can create a total unpaid distribution histogram by dividing the range of all values generated from the simulation into 100 equally sized buckets, and then counting the number of simulations that fall within each bucket. Since simulation results tend to appear jagged, a Kernel density function can be fit to the data to provide a smoothed distribution. Each point of a Kernel density function is estimated by weighting all of the values near that point, with less weight given to points further away
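The smoothing step can be sketched with a simple Gaussian kernel, where each grid point's density is a distance-weighted average over all simulated values (the simulated values, grid and bandwidth below are illustrative):

```python
import math

def kernel_density(xs, grid, bandwidth):
    """Gaussian kernel density estimate: each grid point is a weighted
    average over all simulated values, with less weight given to values
    further away."""
    out = []
    for g in grid:
        w = [math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in xs]
        out.append(sum(w) / (len(xs) * bandwidth * math.sqrt(2 * math.pi)))
    return out

# Illustrative simulated total-unpaid values and a coarse evaluation grid
sims = [95.0, 100.0, 102.0, 104.0, 110.0, 120.0]
grid = [90 + 5 * i for i in range(9)]          # 90, 95, ..., 130
density = kernel_density(sims, grid, bandwidth=5.0)
```

In practice the grid would match the 100 histogram buckets, turning the jagged simulation counts into a smooth curve.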

41
Q

Briefly describe two methods for including correlation between bootstrap distributions for different business segments.

A

- Location mapping
  • Pick a business segment. For each bootstrap iteration, sample a residual and then note where it belonged in the original residual triangle. Then, sample each of the segments using the residuals at the same locations for their respective residual triangles. This preserves the correlation of the original residuals in the sampling process
- Re-sorting
  • To induce correlation among business segments in a bootstrap model, re-sort the residuals for each business segment until the rank correlation between each segment matches the desired correlation
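Location mapping can be sketched in a few lines: sample a cell position once per draw, then use the residual at that same position in every segment's triangle (the flattened residual "triangles" below are illustrative):

```python
import random

# Two segments' residual triangles flattened to parallel lists of cells
seg_a = [0.5, -0.3, 0.8, -0.1, 0.2, -0.6]
seg_b = [0.4, -0.2, 0.9, -0.3, 0.1, -0.5]

random.seed(7)
# Sample cell positions once, then reuse them for every segment,
# preserving the correlation of the original residuals
positions = [random.randrange(len(seg_a)) for _ in range(6)]
sample_a = [seg_a[i] for i in positions]
sample_b = [seg_b[i] for i in positions]
```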

42
Q

For each method, identify two advantages. (Methods for including correlation between bootstrap distributions for different business segments)

A

- Location mapping
  • Can be easily implemented in a spreadsheet
  • Does not require a correlation matrix
- Re-sorting
  • Works for residual triangles with different shapes/sizes
  • Different correlation assumptions can be employed

43
Q
A

- The standard errors for accident years 2005-2008 should be increasing over time
- The estimates of unpaid claims for accident years 2011-2013 should be increasing over time
- The coefficients of variation for accident years 2011-2013 should be decreasing over time
- The total standard error should be larger than any of the individual accident year standard errors

44
Q
A

Since the model iterations were run using different sets of random residuals, the actuary
must use the weights to select a model for each simulated run. For example, suppose 1000
iterations were run for the two models. Since the AY 2013 weights are 25% and 75%, the
actuary would sample 250 iterations from the chain-ladder model and 750 iterations from
the BF model when developing the reserve distribution
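The selection described above can be sketched as follows (the simulated unpaid values are illustrative placeholders):

```python
import random

# With weights 25% chain-ladder / 75% BF and 1000 iterations per model,
# sample 250 iterations from the chain-ladder results and 750 from the
# BF results to build the blended reserve distribution for AY 2013.
random.seed(5)
cl_iters = [random.gauss(100, 10) for _ in range(1000)]   # chain-ladder model
bf_iters = [random.gauss(105, 8) for _ in range(1000)]    # BF model

mixture = random.sample(cl_iters, 250) + random.sample(bf_iters, 750)
```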

45
Q

Mean of Incremental Losses in row 3, age 4

A
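The answer to this card appears to be missing (likely an image in the original deck). Under the log-link ODP GLM parameterization of Shapland & Leong, where alpha is the accident-year level and beta the development-age trend, the fitted mean would be:

```latex
\ln(m_{3,4}) = \alpha_3 + \beta_4
\quad\Longrightarrow\quad
m_{3,4} = e^{\alpha_3 + \beta_4}
```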
46
Q

Variance of Incremental Losses in cell wd

A
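The answer to this card appears to be missing (likely an image in the original deck). In the generalized model, the variance of the incremental loss in cell (w, d) is the dispersion factor times a power of the fitted mean:

```latex
\operatorname{Var}(q_{w,d}) = \phi \, m_{w,d}^{\,z}
```

with z = 1 for the ODP model (z = 0 normal, z = 2 gamma, z = 3 inverse Gaussian).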
47
Q

In the ODP Bootstrap model - Residual Formula

A
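The answer to this card appears to be missing (likely an image in the original deck). The unscaled Pearson residual in the ODP bootstrap model, with q the actual and m the fitted incremental loss, would be:

```latex
r_{w,d} = \frac{q_{w,d} - m_{w,d}}{\sqrt{m_{w,d}^{\,z}}}
```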
48
Q
A
49
Q

What Assumptions do we need so the GLM result is
equal to the Chainladder Result?

A
50
Q

List the 5 bootstrap steps

A
51
Q

When creating the Sampled Triangle, how do we
calculate a Sampled Loss

A
52
Q

Once we have the Sampled Triangle, along with the
mean and variance of each cell
How do we forecast the future losses?

A
53
Q

In the ODP Bootstrap model - Estimate the Dispersion
Factor

A
54
Q

What adjustment did England and Verrall make to the residuals? Why?

A
55
Q

What are Standardized Residuals, and how do we
calculate them?

A
56
Q

Name 3 options to deal with Negative Incremental
Values in the Bootstrap model

A
57
Q

How do we simulate a loss when the mean is
negative?

A
58
Q

What steps could we take if Negative Values are
leading to extreme results

A
  1. Remove Outliers
  2. Recalibrate the model (review data used, parameters selected)
  3. Limit Incremental losses below at zero (in Original Triangle,
    Sampled Triangle, or Simulated loss)
  4. Understand what is driving negative development, this
    will guide you in how to deal with it
59
Q

Name 2 ways to deal with heteroskedastic residuals

A
60
Q

What is Heteroecthesious Data?

A

Data where the exposures are not the same. Two types:
1. Valuation of losses before the year is fully earned
   - Last row has not earned a full year; the model will still forecast a full year
2. Last diagonal is valued in between the normal schedule
   - Last diagonal isn't a full year, so we don't have a model for it

61
Q

What is Parametric Bootstrapping?

A

Once the residuals from the original triangle have been calculated:
- Fit a distribution to these residuals
- Then sample from this distribution, instead of the actual residuals
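A minimal sketch, assuming a normal distribution is fitted to the residuals (other distribution families could be used; the residual values are illustrative):

```python
import random
import statistics

# Parametric bootstrap: fit a distribution to the residuals, then
# sample from the fitted distribution rather than the residuals themselves.
resids = [0.9, -0.4, 0.2, -0.7, 0.5, -0.1, 0.3]
mu = statistics.mean(resids)
sigma = statistics.stdev(resids)

random.seed(3)
sampled = [random.gauss(mu, sigma) for _ in range(10)]   # one draw per cell
```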

62
Q

In Bootstrapping, what is the purpose of running
diagnostics?

A
  • Test Assumptions of Model
  • Gauge the Quality of the Model Fit
  • Adjust Model Parameters
63
Q

When Bootstrapping - what test can we do to our
residuals to determine if there are any outliers?

A

Draw a Box & Whisker Plot
The box spans the 25th and 75th percentiles; the difference between those two is the InterQuartile (IQ) Range
Any residuals that exceed the 75th percentile by 1.5 times the IQ Range should be investigated
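The whisker rule can be sketched in code; this version flags residuals beyond 1.5 times the IQ range on either side, as in a standard box plot (the residual values are illustrative):

```python
import statistics

def iqr_outliers(resids):
    """Flag residuals outside the whiskers of a box plot: more than
    1.5 x IQR beyond the 25th or 75th percentile."""
    q1, _, q3 = statistics.quantiles(resids, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [r for r in resids if r < lo or r > hi]

print(iqr_outliers([-0.4, -0.2, 0.0, 0.1, 0.2, 0.3, 5.0]))   # -> [5.0]
```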

64
Q

When we run Bootstrap, what diagnostics should we
do on the output of the model?

A

For the estimate of unpaid losses, for each accident year, calculate the following:
- Mean
- Standard Error
- Coefficient of Variation
- Percentiles (50%, 75%, 95%, 99%)
- Minimum & Maximum
Checks:
- Standard Error should be highest for the most recent years
- CoV should be highest for older years; the most recent AY will also be high
- Check Min & Max for reasonability

65
Q

When Bootstrapping, how do we incorporate multiple
methods (eg. Chainladder, BF, Paid vs. Incurred)

A

There are two methods; each requires a weight for each method, for each AY:
1. Take the weighted average of the models
   - When running the simulations, each method must use the same underlying random uniform variable in the process step
2. Use the weights to randomly select an unpaid result for each AY, in each simulation
   - No need to correlate the methods here

66
Q

How do we convert our Bootstrap output into a
Smooth Distribution Graph

A

Graph a Histogram of the results
Use a Kernel Density function to smooth the results

67
Q

If we Bootstrap multiple lines of business, how can
we correlate their results

A

- Location Mapping
  • When sampling, take the sampled residual from the same location in the triangle for each line of business
- Re-Sorting
  • Advantages:
    – Triangles may be different sizes by LOB
    – May make different correlation assumptions
    – Correlation algorithms may have additional benefits

68
Q

Possibilities for future research on the Bootstrap
Method

A
- Test the ODP Bootstrap against additional data sets (CAS loss simulation model)
- Expand the model to include: Munich Chain Ladder, Claim Counts & Severity
- Research other risk measures
- Use in Solvency II
- Research the correlation matrix (the parameter that is most difficult to estimate)