Meyers Flashcards

1
Q

Why do models fail to accurately predict test data?

A
  1. The insurance process is too dynamic to be captured in a single model
  2. There could be other models that better fit the data
  3. The data used to calibrate the model is missing crucial information needed to make a reliable prediction
2
Q

Test 1: Histogram

A

If the percentiles are uniformly distributed, the height of the bars should be equal

A symmetric histogram implies that the expected value is accurate
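As a rough illustration of the histogram test, the sketch below bins a set of predicted percentiles into ten equal-width bars. The function name `percentile_histogram` and the sample inputs are hypothetical, chosen only to show what "equal bar heights" looks like compared with percentiles piled up in the tails.

```python
def percentile_histogram(percentiles, bins=10):
    """Count predicted percentiles (0-100 scale) in equal-width bins."""
    counts = [0] * bins
    width = 100.0 / bins
    for p in percentiles:
        # A percentile of exactly 100 is placed in the last bin.
        index = min(int(p // width), bins - 1)
        counts[index] += 1
    return counts

# Hypothetical inputs: one evenly spread set, one concentrated in the tails.
uniform_like = [10.0 * k + 5.0 for k in range(10)] * 5
light_tailed_like = [3.0] * 20 + [50.0] * 10 + [97.0] * 20

uniform_counts = percentile_histogram(uniform_like)
light_counts = percentile_histogram(light_tailed_like)
```

With these made-up inputs, the first histogram is flat and the second has tall bars only at the two ends.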

3
Q

Test 2: 𝑝−𝑝 Plot & Kolmogorov-Smirnov (K-S) Test

A

Tests whether the predicted percentiles depart from a uniform distribution by a statistically significant amount

4
Q

K-S Test

A

n = number of predicted percentiles

K-S statistic D = max(ABS(p_i - f_i))
where {p_i} is the set of predicted percentiles sorted in increasing order and {f_i} = 100 * {1/n, 2/n, …, n/n}

Reject the hypothesis of uniformity at the 5% level if D > 136 / SQRT(n)

Model is validated if it passes the K-S test
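The K-S calculation on this card can be sketched as follows. This is a minimal illustration on a 0-100 percentile scale; the function name `ks_uniformity_test` and the two sample inputs are hypothetical.

```python
import math

def ks_uniformity_test(predicted_percentiles):
    """Return (D, critical_value, reject) for percentiles on a 0-100 scale."""
    p = sorted(predicted_percentiles)
    n = len(p)
    # f_i = 100 * i / n for i = 1..n
    f = [100.0 * i / n for i in range(1, n + 1)]
    d = max(abs(p_i - f_i) for p_i, f_i in zip(p, f))
    critical = 136.0 / math.sqrt(n)  # 5% critical value from the card
    return d, critical, d > critical

# Hypothetical inputs: evenly spread percentiles vs. percentiles piled in the tails.
even = [100.0 * (i - 0.5) / 50 for i in range(1, 51)]
tails = [2.0] * 25 + [98.0] * 25

d_even, crit_even, reject_even = ks_uniformity_test(even)
d_tails, crit_tails, reject_tails = ks_uniformity_test(tails)
```

For n = 50 the critical value is about 19.2, so the evenly spread set passes and the tail-heavy set is rejected.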

5
Q

p-p Plot

A
  1. Sort a sample of n predicted percentiles into increasing order
  2. Plot the expected percentiles e_i = 100 * {1/(n+1), 2/(n+1),…,n/(n+1)} on the x axis and the sorted predicted percentiles on the y axis
  3. If these predicted percentiles are uniformly distributed, we expect this plot to lie along a 45-degree line

Reject the hypothesis of uniformity if the p-p plot falls outside the bands drawn parallel to the 45-degree line y = x (the bands sit at the K-S critical value, 136 / SQRT(n), above and below the line)
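The three plotting steps and the band check can be sketched without any plotting library. The function names are hypothetical, and the band half-width of 136 / SQRT(n) is an assumption carried over from the K-S card.

```python
import math

def pp_plot_points(predicted_percentiles):
    """Pair expected percentiles (x) with sorted predicted percentiles (y)."""
    y = sorted(predicted_percentiles)
    n = len(y)
    x = [100.0 * i / (n + 1) for i in range(1, n + 1)]
    return list(zip(x, y))

def points_outside_band(points):
    """Return the points farther from y = x than the assumed band half-width."""
    half_width = 136.0 / math.sqrt(len(points))
    return [(x, y) for x, y in points if abs(y - x) > half_width]

# Hypothetical inputs with n = 9, so the expected percentiles are 10, 20, ..., 90.
on_line = pp_plot_points([10.0 * i for i in range(1, 10)])
tail_heavy = pp_plot_points([1.0] * 5 + [99.0] * 4)

outside_on_line = points_outside_band(on_line)
outside_tail_heavy = points_outside_band(tail_heavy)
```

In the tail-heavy example the fifth sorted point sits at (50, 1), which is more than 45.3 away from the line and so falls outside the band.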

6
Q

Light-tailed distribution

A

Actual outcomes fall into the smaller and larger percentiles of the distributions

Forms an S shape on the p-p plot - actual outcomes are falling into percentiles that are lower than expected in the left tail and higher than expected in the right tail

Underestimates the variability of ultimate loss estimates

Confidence intervals will be too small
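A small simulation can make the light-tailed pattern concrete. This sketch assumes, purely for illustration, that both the model and the true outcomes are normal with mean zero; the function name, sample size, and seed are hypothetical.

```python
import random
from statistics import NormalDist

def share_in_extreme_deciles(model_sd, true_sd, n=2000, seed=1):
    """Share of outcomes landing below the 10th or above the 90th model percentile."""
    rng = random.Random(seed)
    model = NormalDist(0.0, model_sd)
    # Percentile of each simulated outcome under the model's distribution.
    percentiles = [100.0 * model.cdf(rng.gauss(0.0, true_sd)) for _ in range(n)]
    extreme = sum(1 for p in percentiles if p < 10.0 or p > 90.0)
    return extreme / n

# Model SD is half the true SD: the model is too light-tailed.
light = share_in_extreme_deciles(model_sd=1.0, true_sd=2.0)
# Model SD matches the true SD: about 20% should land in the two outer deciles.
well_calibrated = share_in_extreme_deciles(model_sd=1.0, true_sd=1.0)
```

When the model understates the spread, far more than 20% of the outcomes land in the two outer deciles, which is the tail pile-up this card describes.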

7
Q

Heavy-tailed distribution

A

Actual outcomes fall into the middle percentiles of the distributions

Forms a backwards S shape on the p-p plot - actual outcomes are falling into percentiles that are higher than expected in the left tail and lower than expected in the right tail

Confidence intervals will be too wide

8
Q

Validating the Mack Model - Incurred Losses

A

Meyers used the Mack model to calculate the mean and SD, fit a lognormal distribution with that mean and SD, and then expressed the actual outcome as a percentile of that distribution

Histogram: light-tailed

p-p Plot: light-tailed

K-S Statistic: reject hypothesis of uniformity
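The percentile step on this card can be sketched as below. It assumes the lognormal is fit by matching the mean and SD (method of moments); the function names and the mean of 1000 with SD of 200 are hypothetical.

```python
import math
from statistics import NormalDist

def lognormal_params_from_moments(mean, sd):
    """Log mean and log SD of a lognormal with the given mean and SD."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)

def outcome_percentile(actual, mean, sd):
    """Percentile (0-100) of the actual outcome under the fitted lognormal."""
    mu, sigma = lognormal_params_from_moments(mean, sd)
    z = (math.log(actual) - mu) / sigma
    return 100.0 * NormalDist().cdf(z)

# Hypothetical model output: mean 1000, SD 200.
mu, sigma = lognormal_params_from_moments(1000.0, 200.0)
pct_at_median = outcome_percentile(math.exp(mu), 1000.0, 200.0)
pct_at_mean = outcome_percentile(1000.0, 1000.0, 200.0)
```

Because a lognormal is skewed to the right, an outcome equal to the mean lands slightly above the 50th percentile.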

9
Q

Validating the Bootstrap ODP Model - Paid Losses

A

Actual outcomes occur in the lower percentiles of the model distributions more often

Produces expected loss estimates that are biased high - more of the actual outcomes will fall in lower percentiles because the model distributions are shifted too far to the right

Expected loss estimates are too high

K-S Statistic: reject hypothesis of uniformity

10
Q

Validating the Mack Model - Paid Losses

A

Actual outcomes occur in the lower percentiles of the model distributions more often

Produces expected loss estimates that are biased high - more of the actual outcomes will fall in lower percentiles because the model distributions are shifted too far to the right

Expected loss estimates are too high

K-S Statistic: reject hypothesis of uniformity

11
Q

Bayesian Models for Incurred Loss Data

A

Increases the variability of the predictive distribution and extends the tails

  1. Treats the level of the AY as random to predict more risk - in contrast to Mack, where the observed losses act as fixed level parameters
  2. Allows for correlation between AYs - in contrast to Mack, where AYs are independent
12
Q

Leveled Chain-Ladder (LCL) Model

A

The level of each AY is defined as mu_w,d = alpha_w + beta_d

The simulated cumulative loss C_w,d has a lognormal distribution with log mean mu_w,d and log SD sigma_d, subject to the constraint that sigma_1 > sigma_2 > … > sigma_10

SD is larger for earlier development periods where there are more claims open and more variability

Each parameter is given a wide prior distribution so that the posterior distributions will be highly influenced by the data during the Bayesian MCMC process
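For a single parameter set, the LCL structure can be sketched as follows. The parameter values are hypothetical placeholders rather than posterior draws, and the function names are made up for illustration.

```python
import random

def lcl_log_mean(alpha_w, beta_d):
    """mu_w,d = alpha_w + beta_d"""
    return alpha_w + beta_d

def sigmas_are_decreasing(sigmas):
    """Check the constraint sigma_1 > sigma_2 > ... > sigma_D."""
    return all(a > b for a, b in zip(sigmas, sigmas[1:]))

def simulate_lcl_loss(alpha_w, beta_d, sigma_d, rng):
    """Draw C_w,d from a lognormal with log mean mu_w,d and log SD sigma_d."""
    return rng.lognormvariate(lcl_log_mean(alpha_w, beta_d), sigma_d)

rng = random.Random(0)
sigmas = [0.50, 0.30, 0.20, 0.10]  # hypothetical log SDs by development period
loss = simulate_lcl_loss(alpha_w=8.0, beta_d=-0.4, sigma_d=sigmas[0], rng=rng)
```

In the Bayesian MCMC setting this draw would be repeated once per posterior parameter set; the sketch shows only one.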

13
Q

Correlated Chain-Ladder (CCL) Model

A

Allows for correlation between each subsequent mu parameter

The level of each AY is defined as mu_w,d = alpha_w + beta_d + p * (LN(C_w-1,d) - mu_w-1,d)

The correlation parameter p is given a wide prior distribution and when p = 0, the CCL model reduces to the LCL model

  1. For each parameter set, start with the given C_1,10 and calculate the mean mu_2,10
  2. Simulate C_2,10 from a lognormal distribution with log mean mu_2,10 and log SD sigma_10
  3. Use the result of this simulation to simulate the next ultimate loss
  4. Do this process many times to form a predictive distribution for each AY and in total
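The four simulation steps can be sketched for one parameter set as shown below. It assumes mu_1,10 = alpha_1 + beta_10 for the first AY (no correlation term) and writes the correlation parameter p as `rho`; all parameter values are hypothetical.

```python
import math
import random

def simulate_ccl_ultimates(c_1_10, alpha, beta_10, rho, sigma_10, rng):
    """Simulate C_w,10 for w = 2..W, starting from the given C_1,10.

    alpha holds alpha_w for w = 1..W.
    """
    ultimates = [c_1_10]
    mu_prev = alpha[0] + beta_10  # assumed mu_1,10
    for w in range(1, len(alpha)):
        # Step 1: mean for this AY depends on the prior AY's simulated outcome.
        mu_w = alpha[w] + beta_10 + rho * (math.log(ultimates[-1]) - mu_prev)
        # Steps 2-3: simulate this AY and feed the result into the next one.
        ultimates.append(rng.lognormvariate(mu_w, sigma_10))
        mu_prev = mu_w
    return ultimates

# Step 4: repeat many times. Here one hypothetical parameter set is reused;
# in practice each repetition would use a different posterior parameter set.
rng = random.Random(42)
alpha = [7.0, 7.1, 7.2]
paths = [simulate_ccl_ultimates(math.exp(7.2), alpha, 0.0, 0.5, 0.05, rng)
         for _ in range(1000)]
total_by_path = [sum(path) for path in paths]
```

The list `total_by_path` is a sample from the predictive distribution of the total across AYs under this single parameter set.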
14
Q

Validating the LCL Model - Incurred Losses

A

Histogram: light-tailed

p-p Plot: light-tailed

K-S Statistic: reject hypothesis of uniformity

Better results than Mack with higher SD since additional variability was introduced

15
Q

Validating the CCL Model - Incurred Losses

A

Histogram: light-tailed

p-p Plot: light-tailed, but all points lie inside the bounds

K-S Statistic: do not reject hypothesis of uniformity

Better results than Mack with higher SD since additional variability was introduced (also higher than LCL)

16
Q

Validating the CCL Model - Paid Losses

A

Produces expected loss estimates that are biased high - more of the actual outcomes will fall in lower percentiles because the model distributions are shifted too far to the right

17
Q

CY Trend in Paid Losses

A
  • The model should be based on incremental paid loss amounts since cumulative losses include settled claims which do not change with time
  • Incremental paid loss amounts tend to be skewed to the right and can be negative, so the loss distribution must allow for both features
18
Q

Skew Normal Distribution Form 1

A

Location parameter mu
Scale parameter omega
Shape parameter delta

X ~ mu + omega * delta * Z + omega * SQRT(1-delta^2) * e

where Z follows a standard normal distribution truncated to positive values only and e follows a standard normal distribution

Delta = 0 -> normal distribution
As delta approaches 1, the distribution becomes more skewed
Delta = 1 -> truncated normal distribution

This form caps the coefficient of skewness to that of the truncated normal distribution
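The Form 1 construction can be sampled directly from its definition. This sketch takes Z as the absolute value of a standard normal draw (a standard normal truncated at zero) and e as a standard normal draw; the function name, sample size, and seed are hypothetical.

```python
import math
import random

def sample_skew_normal(mu, omega, delta, rng):
    """Draw X = mu + omega*delta*Z + omega*sqrt(1 - delta^2)*e."""
    z = abs(rng.gauss(0.0, 1.0))  # truncated normal, positive values only
    e = rng.gauss(0.0, 1.0)       # ordinary standard normal
    return mu + omega * delta * z + omega * math.sqrt(1.0 - delta ** 2) * e

rng = random.Random(7)
normal_case = [sample_skew_normal(0.0, 1.0, 0.0, rng) for _ in range(20000)]
truncated_case = [sample_skew_normal(0.0, 1.0, 1.0, rng) for _ in range(20000)]

mean_normal = sum(normal_case) / len(normal_case)
mean_truncated = sum(truncated_case) / len(truncated_case)
```

With delta = 0 the draws are symmetric around mu; with delta = 1 every draw is at or above mu, which is the truncated normal limit noted on the card.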

19
Q

Skew Normal Distribution Form 2

A

Replaces the truncated normal distribution with the lognormal distribution

20
Q

Correlated Incremental Trend (CIT) Model

A

mu_w,d = alpha_w + beta_d + tau * (w + d -1)

Z_w,d ~ lognormal(mu_w,d, sigma_d), subject to the constraint that sigma_1 < sigma_2 < … < sigma_10

I_w,d ~ normal(Z_w,d + p * (I_w-1,d - Z_w-1,d) * EXP(tau), delta)

For w = 1 -> I_1,d ~ normal(Z_1,d, delta)

Distribution is skewed, allows for negative values, and has payment trend tau

Each parameter is given a wide prior distribution so that the posterior distributions will be highly influenced by the data during the Bayesian MCMC process, EXCEPT for tau and sigma which were given more restrictive prior distributions
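For one parameter set and one development period d, the CIT equations can be sketched as below. The correlation parameter p is written as `rho`, and every parameter value is a hypothetical placeholder rather than a posterior draw.

```python
import math
import random

def simulate_cit_column(alpha, beta_d, tau, rho, sigma_d, delta, d, rng):
    """Simulate incremental losses I_w,d for w = 1..W in development period d."""
    incrementals = []
    z_prev = i_prev = None
    for w, alpha_w in enumerate(alpha, start=1):
        mu = alpha_w + beta_d + tau * (w + d - 1)
        z = rng.lognormvariate(mu, sigma_d)  # Z_w,d
        if w == 1:
            mean = z  # first AY has no prior AY to correlate with
        else:
            mean = z + rho * (i_prev - z_prev) * math.exp(tau)
        i = rng.gauss(mean, delta)  # I_w,d, which may be negative
        incrementals.append(i)
        z_prev, i_prev = z, i
    return incrementals

rng = random.Random(3)
column = simulate_cit_column(alpha=[0.0] * 200, beta_d=0.0, tau=0.0, rho=0.3,
                             sigma_d=0.5, delta=5.0, d=1, rng=rng)
```

Because the final draw is normal around a lognormal Z, the simulated incremental losses are skewed yet can still go negative, as in the made-up example above where delta is large relative to Z.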

21
Q

Comparing the CIT & CCL Models

A
  • Since CCL model is applied to cumulative losses, sigma_d decreases as d increases since a greater proportion of claims are settled (less variability)
  • Since the CIT model is applied to incremental losses, sigma_d increases as d increases since smaller, less volatile claims tend to be settled earlier
  • Since incremental losses can be negative, the correlation feature in the CIT model is applied outside of the log (directly to the incremental losses), whereas in the CCL model it is applied to the log of the cumulative losses
22
Q

Leveled Incremental Trend (LIT) Model

A

CIT model without AY correlation

23
Q

Validating the CIT Model - Paid Losses

A

Produces estimates that are biased high

No improvement over Mack or ODP models

24
Q

Validating the LIT Model - Paid Losses

A

Produces estimates that are biased high

No improvement over Mack or ODP models

25
Q

Changing Settlement Rate (CSR) Model

A

Reflects the speedup in claim settlement due to technology

Uses cumulative paid losses, since a payment trend is no longer being modeled

No correlation or trend terms

mu_w,d = alpha_w + beta_d * (1 - y)^(w-1)

C_w,d has a lognormal distribution with log mean mu_w,d and log SD sigma_d, subject to the constraint sigma_1 > sigma_2 > … > sigma_10

y > 0 indicates a speedup in claim settlement
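The effect of the speedup parameter can be seen by evaluating the log mean for different AYs. This sketch writes y as `gamma` and assumes a negative beta_d for an early development period (cumulative paid losses below their ultimate level); the numbers are hypothetical.

```python
def csr_log_mean(alpha_w, beta_d, gamma, w):
    """mu_w,d = alpha_w + beta_d * (1 - gamma)^(w - 1)"""
    return alpha_w + beta_d * (1.0 - gamma) ** (w - 1)

# Same alpha and beta_d for every AY, so only the speedup term differs.
first_ay = csr_log_mean(alpha_w=8.0, beta_d=-1.0, gamma=0.05, w=1)
tenth_ay = csr_log_mean(alpha_w=8.0, beta_d=-1.0, gamma=0.05, w=10)
no_speedup = csr_log_mean(alpha_w=8.0, beta_d=-1.0, gamma=0.0, w=10)
```

With a positive speedup parameter the tenth AY sits closer to its ultimate level at the same development period than the first AY does, while a value of zero leaves every AY with the same pattern.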

26
Q

Validating the CSR Model - Paid Losses

A

Histogram, p-p Plot, and K-S Statistic indicate uniformity

Suggests that, for the CCL model, the incurred data already recognized the speedup in the claim settlement rate

27
Q

Process Risk

A

Represents the average variance of the outcomes from the expected result

E[Var(X|theta)]

28
Q

Parameter Risk

A

Represents the variance arising from the many possible parameter values in the posterior distribution of the parameters

Var[E(X|theta)]

29
Q

Total Risk

A

Total Risk = Process Risk + Parameter Risk
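The decomposition can be checked numerically with a handful of equally weighted parameter sets. The function name and the two (mean, variance) pairs below are hypothetical.

```python
def risk_decomposition(parameter_sets):
    """Split total variance into process and parameter risk.

    parameter_sets is a list of (E[X|theta], Var[X|theta]) pairs,
    one per equally weighted posterior draw of theta.
    """
    n = len(parameter_sets)
    means = [mean for mean, _ in parameter_sets]
    # Process risk: average of the conditional variances, E[Var(X|theta)].
    process = sum(variance for _, variance in parameter_sets) / n
    # Parameter risk: variance of the conditional means, Var[E(X|theta)].
    grand_mean = sum(means) / n
    parameter = sum((mean - grand_mean) ** 2 for mean in means) / n
    return process, parameter, process + parameter

# Two hypothetical parameter sets: (mean 100, variance 100) and (mean 120, variance 400).
process_risk, parameter_risk, total_risk = risk_decomposition([(100.0, 100.0), (120.0, 400.0)])
```

Here process risk is 250, parameter risk is 100, and total risk is 350.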

30
Q

Model Risk

A

The risk that we did not select the right model

Shows up in the parameter risk portion of the total risk, since the model weights are themselves treated as parameters

  1. Formulate a model that is a weighted average of the various candidate models, where the weights are the parameters
  2. If the posterior distribution of the weights assigned to each model has significant variability, then model risk exists
31
Q

Incurred Data Models

A
  • Mack model understates variability
  • CCL model allows for AY correlation and predicts the distribution of outcomes correctly within a specified confidence level
32
Q

Paid Data Models

A
  • Bootstrap ODP, Mack, and CCL models give estimates of the expected ultimate loss that are biased high, suggesting there is a change in the loss environment that is not being captured in the models
  • CIT and LIT introduce CY trends but fail to improve
  • CSR model introduces a parameter to account for speedup in claims settlement rates and predicts the distribution of outcomes correctly within a specified confidence level