Meyers Flashcards
When models do not accurately predict dist of outcomes for test data, 3 explanations
- Insurance process is too dynamic to be captured by single model
- Could be other models that better fit data
- Data used to calibrate model is missing crucial info needed to make a reliable prediction
3 tests to validate models
- histogram
- p-p plot
- K-S statistic
histogram
- if percentiles are uniformly distributed, height of bars should be equal
- for small sample, not perfectly level
- if level, model is appropriate
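A minimal sketch of the histogram check (not Meyers's code; the percentile inputs below are placeholders for the model-predicted percentiles of the actual outcomes):

```python
# Sketch: bin the predicted percentiles of the actual outcomes and compare
# bar heights to the uniform expectation n / (number of bins).
import numpy as np
import matplotlib.pyplot as plt

percentiles = np.random.uniform(0, 100, size=200)  # placeholder predicted percentiles

counts, bins = np.histogram(percentiles, bins=np.linspace(0, 100, 11))  # 10 equal-width bins
expected = len(percentiles) / 10                                        # equal heights if uniform

plt.bar(bins[:-1], counts, width=10, align='edge', edgecolor='black')
plt.axhline(expected, linestyle='--', label='uniform expectation')
plt.xlabel('predicted percentile of actual outcome')
plt.ylabel('count')
plt.legend()
plt.show()
```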
p-p plot
- tests for stat significance of uniformity
- plot expected percentiles on x and sorted predicted percentiles on y -> if predicted percentiles are uniformly dist, plot lies along 45 degree line
ie model is appropriate if p-p plot lies along 45 degree line
expected percentiles e_i = {1/(n+1),…,n/(n+1)}
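A minimal p-p plot sketch (not Meyers's code; assumes percentiles on a 0-100 scale, so e_i is multiplied by 100 to match f_i below):

```python
# Sketch: plot expected percentiles e_i = 100*i/(n+1) on x against the sorted
# predicted percentiles on y; uniformity shows up as points along the 45 degree line.
import numpy as np
import matplotlib.pyplot as plt

percentiles = np.random.uniform(0, 100, size=200)  # placeholder predicted percentiles
n = len(percentiles)

expected = 100 * np.arange(1, n + 1) / (n + 1)     # e_i
sorted_pred = np.sort(percentiles)                 # sorted predicted percentiles

plt.plot(expected, sorted_pred, 'o', markersize=3)
plt.plot([0, 100], [0, 100], 'k-')                 # 45 degree reference line
plt.xlabel('expected percentile')
plt.ylabel('sorted predicted percentile')
plt.show()
```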
K-S statistic
D = max|p_i - f_i|
f_i = 100*{1/n,…,n/n}
- can reject hypothesis that set of percentiles is uniform @ 5% level if D > critical value = 136/sqrt(n)
- critical values appear as 45 degree bands that run parallel to y=x
- Meyers deems model validated if passes K-S test
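A minimal K-S check sketch (assumes percentiles on a 0-100 scale, matching f_i = 100*{1/n,…,n/n}):

```python
# Sketch: D = max|p_i - f_i| with sorted percentiles p_i and f_i = 100*i/n;
# reject uniformity at the 5% level when D > 136/sqrt(n).
import numpy as np

def ks_uniformity(percentiles):
    p = np.sort(np.asarray(percentiles, dtype=float))  # sorted predicted percentiles p_i
    n = len(p)
    f = 100 * np.arange(1, n + 1) / n                  # f_i = 100*i/n
    D = np.max(np.abs(p - f))                          # K-S statistic
    critical = 136 / np.sqrt(n)                        # 5% critical value
    return D, critical, D > critical                   # True -> reject uniformity (not validated)

D, crit, reject = ks_uniformity(np.random.uniform(0, 100, size=200))
print(f"D = {D:.2f}, critical = {crit:.2f}, reject uniformity: {reject}")
```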
Validating Mack: results
- incurred data
- on histogram, percentiles show little uniformity; actual outcomes are falling into the smaller and larger percentiles more often than expected -> Mack produces a dist that is light tailed
- in p-p plot, predicted percentiles form an S shape -> light tailed because actual outcomes are falling into percentiles that are lower than expected in the left tail and higher than expected in the right tail
- D > critical value
Validating ODPB: results
- paid data
- actual outcomes are occurring in the lower percentiles more often -> implies both models (ODPB and Mack) produce expected loss estimates that are biased high when modeling paid losses
- by producing expected loss estimates that are too high, the left tail of the predictive distribution becomes too light
- D > critical value
Possible reasons for the observations on paid and incurred data (ie the Mack and ODPB results)
- insurance loss environment has experienced changes that are not yet observable
- there could be other models that can be validated
Bayesian models for Incurred loss data
- Mack model underestimates the variability of the predictive distribution, which leads to light tails
Leveled Chain Ladder (LCL)
Correlated Chain Ladder (CCL)
Leveled Chain Ladder (LCL)
- treats the level of each AY as random (with independence between AYs) -> model will predict more risk
- sigma is larger for earlier DPs, where more claims are open and there is more variability
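A rough sketch of the LCL structure (notation assumed from the Meyers monograph; priors omitted), where C_{w,d} is cumulative loss for AY w at development period d:

```latex
% LCL sketch: AY level alpha_w random, AYs independent; sigma_d decreasing in d
\[
\begin{aligned}
\mu_{w,d} &= \alpha_w + \beta_d \\
C_{w,d}   &\sim \mathrm{lognormal}(\mu_{w,d},\ \sigma_d), \qquad \sigma_1 > \sigma_2 > \cdots
\end{aligned}
\]
```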
Correlated Chain Ladder (CCL)
- allows for correlation between AYs -> model will predict more risk than LCL
- should result in a larger standard deviation for the predicted distribution (heavier tails), which would make the percentiles of the outcomes more uniform than under LCL
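A rough sketch of the CCL structure (notation assumed; priors omitted), differing from LCL only in the mean for AY w > 1:

```latex
% CCL sketch: the mean for AY w > 1 picks up a correlation term
% in the prior AY's observed log loss
\[
\begin{aligned}
\mu_{1,d} &= \alpha_1 + \beta_d \\
\mu_{w,d} &= \alpha_w + \beta_d + \rho\,(\log C_{w-1,d} - \mu_{w-1,d}), \quad w > 1 \\
C_{w,d}   &\sim \mathrm{lognormal}(\mu_{w,d},\ \sigma_d)
\end{aligned}
\]
```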
LCL results
- produces higher std devs than Mack
- p-p plot still has an S shape and some points lie outside the K-S bounds, but it is an improvement over Mack, and D is closer to the critical value
CCL results
- produces higher std devs than Mack
- CCL produced a higher std dev for each AY than LCL
- CCL still shows an S shape, but all points are within the bounds and D is smaller than the critical value -> model validates against the data and exhibits uniformity
Bayesian models for Paid loss data
- CCL model applied to paid data produced estimates that were biased high
Correlated Incremental Trend (CIT)
Leveled Incremental Trend (LIT)
Correlated Incremental Trend (CIT)
- introduces a payment year trend, which only makes sense with incremental data, so the model should be based on incremental paid losses; incremental paid losses are skewed right and can be negative
- sigma is smaller for earlier DPs
- opposite of LCL because incremental losses are modeled
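A rough sketch of the CIT structure (notation assumed; priors and the between-AY correlation term omitted), where I_{w,d} is incremental paid loss and tau is the payment year trend:

```latex
% CIT sketch: lognormal layer Z gives right skew, normal layer around Z
% permits negative incremental losses; sigma_d increasing in d
\[
\begin{aligned}
\mu_{w,d} &= \alpha_w + \beta_d + \tau\,(w + d - 1) \\
Z_{w,d}   &\sim \mathrm{lognormal}(\mu_{w,d},\ \sigma_d), \qquad \sigma_1 < \sigma_2 < \cdots \\
I_{w,d}   &\sim \mathrm{normal}(Z_{w,d},\ \delta)
\end{aligned}
\]
```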