Meyers Flashcards
Why do models fail to accurately predict test data?
- The insurance process is too dynamic to be captured in a single model
- There could be other models that better fit the data
- The data used to calibrate the model is missing crucial information needed to make a reliable prediction
Test 1: Histogram
If the percentiles are uniformly distributed, the height of the bars should be equal
A symmetric histogram implies that the expected value is accurate
Test 2: p-p Plot & Kolmogorov-Smirnov (K-S) Test
Tests whether the departure of the predicted percentiles from uniformity is statistically significant
K-S Test
n = number of predicted percentiles
K-S statistic D = max(ABS(p_i - f_i))
where {p_i} is the set of predicted percentiles sorted into increasing order and {f_i} = 100 * {1/n, 2/n, …, n/n}
Reject the hypothesis of uniformity at the 5% level if D > 136 / SQRT(n)
Model is validated if it passes the K-S test
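A minimal sketch of the K-S computation, assuming the predicted percentiles are on a 0-100 scale (the function and variable names are illustrative):

```python
import numpy as np

def ks_test(percentiles):
    """K-S test of uniformity for a set of predicted percentiles (0-100 scale)."""
    p = np.sort(np.asarray(percentiles, dtype=float))  # sorted predicted percentiles
    n = len(p)
    f = 100.0 * np.arange(1, n + 1) / n                # f_i = 100 * i/n
    d = np.max(np.abs(p - f))                          # K-S statistic D
    critical = 136.0 / np.sqrt(n)                      # 5% critical value
    return d, critical, d > critical                   # True -> reject uniformity

# Percentiles drawn from a uniform distribution should usually pass
d, crit, reject = ks_test(np.random.default_rng(0).uniform(0, 100, 50))
```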
p-p Plot
- Sort a sample of n predicted percentiles into increasing order
- Plot the expected percentiles e_i = 100 * {1/(n+1), 2/(n+1),…,n/(n+1)} on the x axis and the sorted predicted percentiles on the y axis
- If these predicted percentiles are uniformly distributed, we expect this plot to lie along a 45-degree line
Reject the hypothesis of uniformity if the p-p plot lies outside the bands y = x ± 136/SQRT(n), which run parallel to the 45-degree line y = x
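A sketch of the p-p plot construction following the steps above, with bands drawn at y = x ± 136/SQRT(n) (names are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

def pp_plot(percentiles):
    """p-p plot of sorted predicted percentiles against expected percentiles."""
    p = np.sort(np.asarray(percentiles, dtype=float))
    n = len(p)
    e = 100.0 * np.arange(1, n + 1) / (n + 1)       # expected percentiles e_i
    band = 136.0 / np.sqrt(n)                       # half-width of the bands
    plt.plot(e, p, "o")                             # predicted vs expected
    plt.plot([0, 100], [0, 100], "k-")              # 45-degree line y = x
    plt.plot([0, 100], [band, 100 + band], "k--")   # upper band
    plt.plot([0, 100], [-band, 100 - band], "k--")  # lower band
    plt.xlabel("Expected percentile")
    plt.ylabel("Predicted percentile")
```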
Light-tailed distribution
Actual outcomes fall into the smaller and larger percentiles of the distributions
Forms an S shape on the p-p plot - actual outcomes are falling into percentiles that are lower than expected in the left tail and higher than expected in the right tail
Underestimates the variability of ultimate loss estimates
Confidence intervals will be too small
Heavy-tailed distribution
Actual outcomes fall into the middle percentiles of the distributions
Forms a backwards S shape on the p-p plot - actual outcomes are falling into percentiles that are higher than expected in the left tail and lower than expected in the right tail
Overestimates the variability of ultimate loss estimates
Confidence intervals will be too wide
Validating the Mack Model - Incurred Losses
Meyers used the Mack model to calculate the mean and SD of the ultimate losses, fit a lognormal distribution with those parameters, and recorded the actual outcome as a percentile of that distribution (sketched below)
Histogram: light-tailed
p-p Plot: light-tailed
K-S Statistic: reject hypothesis of uniformity
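A sketch of the percentile calculation above, assuming the lognormal is fit to Mack's mean and SD by moment matching (the helper name is hypothetical):

```python
import numpy as np
from scipy.stats import lognorm

def outcome_percentile(mack_mean, mack_sd, actual_outcome):
    """Percentile of the actual outcome under a lognormal moment-matched
    to the Mack mean and SD of ultimate losses."""
    cv2 = (mack_sd / mack_mean) ** 2
    sigma = np.sqrt(np.log(1.0 + cv2))          # log SD from moment matching
    mu = np.log(mack_mean) - 0.5 * sigma ** 2   # log mean from moment matching
    return 100.0 * lognorm.cdf(actual_outcome, s=sigma, scale=np.exp(mu))
```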
Validating the Bootstrap ODP Model - Paid Losses
Actual outcomes occur in the lower percentiles of the model distributions more often than expected
Produces expected loss estimates that are biased high - more of the actual outcomes will fall in lower percentiles because the model distributions are shifted too far to the right
Expected loss estimates are too high
K-S Statistic: reject hypothesis of uniformity
Validating the Mack Model - Paid Losses
Actual outcomes occur in the lower percentiles of the model distributions more often than expected
Produces expected loss estimates that are biased high - more of the actual outcomes will fall in lower percentiles because the model distributions are shifted too far to the right
Expected loss estimates are too high
K-S Statistic: reject hypothesis of uniformity
Bayesian Models for Incurred Loss Data
Increases the variability of the predictive distribution and extends the tails
- Treats the level of each AY as a random quantity, which adds risk to the prediction - in contrast to Mack, where the observed losses act as fixed level parameters
- Allows for correlation between AYs - in contrast to Mack, where AYs are independent
Leveled Chain-Ladder (LCL) Model
The level of each AY is defined as mu_w,d = alpha_w + beta_d
The simulated cumulative loss C_w,d has a lognormal distribution with log mean mu_w,d and log SD sigma_d, subject to the constraint that sigma_1 > sigma_2 > … > sigma_10
SD is larger for earlier development periods where there are more claims open and more variability
Each parameter is given a wide prior distribution so that the posterior distributions will be highly influenced by the data during the Bayesian MCMC process
Correlated Chain-Ladder (CCL) Model
Allows for correlation between each subsequent mu parameter
The level of each AY is defined as mu_w,d = alpha_w + beta_d + p * (LN(C_{w-1,d}) - mu_{w-1,d})
The correlation parameter p is given a wide prior distribution and when p = 0, the CCL model reduces to the LCL model
- For each parameter set, start with the given C_1,10 and calculate the log mean mu_2,10
- Simulate C_2,10 from a lognormal distribution with log mean mu_2,10 and log SD sigma_10
- Use the result of this simulation to simulate the ultimate loss for the next AY
- Repeat this process many times to form a predictive distribution for each AY and in total (see the sketch below)
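A sketch of this simulation for a single posterior parameter draw (parameter names are illustrative; repeating it across many posterior draws builds the predictive distribution):

```python
import numpy as np

def simulate_ccl_ultimates(c_1_10, alphas, beta_10, rho, sigma_10, rng):
    """One simulated path of ultimates C_{w,10} for w = 2..10 under CCL,
    given a single posterior parameter draw (alphas[w-1] = alpha_w)."""
    ultimates = [c_1_10]                              # AY 1 ultimate is given
    mu_prev = alphas[0] + beta_10                     # mu_{1,10}, no correlation term
    c_prev = c_1_10
    for w in range(1, 10):                            # AYs 2..10
        mu = alphas[w] + beta_10 + rho * (np.log(c_prev) - mu_prev)
        c = rng.lognormal(mean=mu, sigma=sigma_10)    # simulate C_{w,10}
        ultimates.append(c)
        mu_prev, c_prev = mu, c                       # feed into the next AY
    return np.array(ultimates)
```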
Validating the LCL Model - Incurred Losses
Histogram: light-tailed
p-p Plot: light-tailed
K-S Statistic: reject hypothesis of uniformity
Better results than Mack with higher SD since additional variability was introduced
Validating the CCL Model - Incurred Losses
Histogram: light-tailed
p-p Plot: light-tailed, but all points lie inside the bounds
K-S Statistic: fail to reject the hypothesis of uniformity
Better results than Mack with higher SD since additional variability was introduced (also higher than LCL)
Validating the CCL Model - Paid Losses
Produces expected loss estimates that are biased high - more of the actual outcomes will fall in lower percentiles because the model distributions are shifted too far to the right
CY Trend in Paid Losses
- The model should be based on incremental paid loss amounts since cumulative losses include settled claims which do not change with time
- Incremental paid loss amounts tend to be skewed to the right and can be negative, so we need a loss distribution that allows for both features
Skew Normal Distribution Form 1
Location parameter mu
Scale parameter omega
Shape parameter delta
X = mu + omega * delta * Z + omega * SQRT(1 - delta^2) * e
where Z is a standard normal truncated to positive values and e is a standard normal random variable
Delta = 0 -> normal distribution
As delta approaches 1, the distribution becomes more skewed
Delta = 1 -> truncated normal distribution
This form caps the coefficient of skewness at that of the truncated normal distribution
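A quick simulation sketch of form 1. It uses the fact that a standard normal truncated to positive values can be sampled as the absolute value of a standard normal, and shows the coefficient of skewness growing with delta toward the truncated normal cap:

```python
import numpy as np

def skew_normal_form1(mu, omega, delta, size, rng):
    """X = mu + omega*delta*Z + omega*sqrt(1 - delta^2)*e,
    with Z a standard normal truncated to positive values and e a standard normal."""
    z = np.abs(rng.standard_normal(size))      # truncated normal, Z > 0
    e = rng.standard_normal(size)              # ordinary standard normal
    return mu + omega * delta * z + omega * np.sqrt(1.0 - delta ** 2) * e

rng = np.random.default_rng(0)
for delta in (0.0, 0.5, 1.0):
    x = skew_normal_form1(0.0, 1.0, delta, 200_000, rng)
    skew = ((x - x.mean()) ** 3).mean() / x.std() ** 3
    print(delta, round(float(skew), 3))        # skewness rises with delta, capped near 1
```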
Skew Normal Distribution Form 2
Replaces the truncated normal distribution with the lognormal distribution
Correlated Incremental Trend (CIT) Model
mu_w,d = alpha_w + beta_d + tau * (w + d -1)
Z_w,d ~ lognormal(mu_w,d, sigma_d), subject to the constraint that sigma_1 < sigma_2 < … < sigma_10
I_w,d ~ normal(Z_w,d + p * (I_{w-1,d} - Z_{w-1,d}) * EXP(tau), delta)
For w = 1 -> I_1,d ~ normal(Z_1,d, delta)
Distribution is skewed, allows for negative values, and has payment trend tau
Each parameter is given a wide prior distribution so that the posterior distributions will be highly influenced by the data during the Bayesian MCMC process, EXCEPT for tau and sigma_d, which were given more restrictive prior distributions
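A minimal sketch of one CIT draw for a single cell, given a posterior parameter draw (names are illustrative). It highlights the two-layer structure: a lognormal level Z and a normal layer for I that admits negative values and carries the AY correlation:

```python
import numpy as np

def simulate_cit_cell(i_prev, z_prev, mu, sigma_d, rho, tau, delta, rng):
    """One CIT draw of the incremental loss I_{w,d}, given the prior AY's
    incremental loss i_prev and level z_prev at the same development age."""
    z = rng.lognormal(mean=mu, sigma=sigma_d)         # level Z_{w,d}
    loc = z + rho * (i_prev - z_prev) * np.exp(tau)   # correlation on the normal layer
    i = rng.normal(loc=loc, scale=delta)              # I_{w,d}, can be negative
    return i, z
```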
Comparing the CIT & CCL Models
- Since CCL model is applied to cumulative losses, sigma_d decreases as d increases since a greater proportion of claims are settled (less variability)
- Since CIT model is applied to incremental losses, sigma_d increases as d increase since smaller, less volatile claims tend to be settled earlier
- Since incremental losses can be negative, the CIT model applies the correlation feature outside of the lognormal distribution (on the normal layer); in the CCL model it is applied inside, to the log of the cumulative losses
Leveled Incremental Trend (LIT) Model
CIT model without AY correlation
Validating the CIT Model - Paid Losses
Produces estimates that are biased high
No improvement over Mack or ODP models
Validating the LIT Model - Paid Losses
Produces estimates that are biased high
No improvement over Mack or ODP models
Changing Settlement (CSR) Model
Reflects the speedup in claim settlement due to technology
Uses cumulative paid losses, since the model no longer includes a payment trend
No correlation or trend terms
mu_w,d = alpha_w + beta_d * (1 - y)^(w-1)
C_w,d has a lognormal distribution with log mean mu_w,d and log SD sigma_d, subject to the constraint sigma_1 > sigma_2 > … > sigma_10
y > 0 indicates a speedup in claim settlement
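An illustrative computation of the CSR log mean, assuming for illustration that beta_d < 0 before maturity; with y > 0, the development effect shrinks toward 0 for later AYs, i.e., the pattern reaches its ultimate level faster:

```python
def csr_log_mean(alpha_w, beta_d, y, w):
    """CSR log mean: mu_{w,d} = alpha_w + beta_d * (1 - y)^(w - 1)."""
    return alpha_w + beta_d * (1.0 - y) ** (w - 1)

# Hypothetical values: beta_d = -0.3 at an early age, speedup y = 0.05
for w in (1, 5, 10):
    print(w, round(csr_log_mean(8.0, -0.3, 0.05, w), 4))  # beta effect shrinks with w
```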
Validating the CSR Model - Paid Losses
Histogram, p-p Plot, and K-S Statistic indicate uniformity
Suggests that the incurred data already recognized the speedup in the claims settlement rate, which is why the CCL model validated on incurred losses
Process Risk
Represents the average variance of the outcomes around the expected result
E[Var(X|theta)]
Parameter Risk
Represents the variance due to uncertainty in the parameters, reflecting the spread of the posterior distribution of the parameters
Var[E(X|theta)]
Total Risk
Total Risk = Process Risk + Parameter Risk
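A toy simulation of the decomposition, with hypothetical distributions: outcomes X are drawn conditionally on posterior parameter draws theta, and the total variance splits into the two pieces above:

```python
import numpy as np

rng = np.random.default_rng(1)

theta = rng.normal(100.0, 10.0, size=200_000)  # posterior draws of the parameter
x = rng.normal(theta, 5.0)                     # outcomes given each theta

parameter_risk = theta.var()                   # Var[E(X|theta)], since E(X|theta) = theta
process_risk = 5.0 ** 2                        # E[Var(X|theta)], constant here
total_risk = x.var()                           # approximately the sum of the two
print(round(process_risk + parameter_risk, 1), round(total_risk, 1))
```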
Model Risk
The risk that we did not select the right model
A special case of parameter risk, since the model weights are treated as parameters
- Formulate a model that is a weighted average of the various candidate models, where the weights are the parameters
- If the posterior distribution of the weights assigned to each model has significant variability, then model risk exists
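A toy sketch of this idea with two hypothetical candidate models: each posterior draw supplies a weight on model A, and a widely spread posterior for the weight signals model risk:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

w_posterior = rng.beta(2.0, 2.0, size=n)    # posterior draws of the weight on model A
pick_a = rng.uniform(size=n) < w_posterior  # choose model A with probability w
x = np.where(pick_a,
             rng.normal(100.0, 10.0, n),    # hypothetical model A prediction
             rng.normal(120.0, 10.0, n))    # hypothetical model B prediction
print(round(float(w_posterior.var()), 4))   # large spread -> model risk exists
```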
Incurred Data Models
- Mack model understates variability
- CCL model allows for AY correlation and predicts the distribution of outcomes correctly within a specified confidence level
Paid Data Models
- Bootstrap ODP, Mack, and CCL models give estimates of the expected ultimate loss that are biased high, suggesting there is a change in the loss environment that is not being captured in the models
- CIT and LIT introduce CY trends but fail to improve on the Mack or ODP paid-loss results
- CSR model introduces a parameter to account for speedup in claims settlement rates and predicts the distribution of outcomes correctly within a specified confidence level