Shapland Flashcards
Technique to address model risk
weight multiple models
Error distributions (4)
- normal, z = 0
- Poisson, z = 1
- Gamma, z = 2
- inverse Gaussian, z = 3
Variance of incremental claims
Shapland
var(q(w,d)) = phi * m^z
m is fitted incremental loss
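A minimal Python sketch of the variance card above, assuming a plain lookup from distribution name to the power z. The names `Z_BY_DISTRIBUTION` and `incremental_variance` are illustrative and not from Shapland.

```python
# Illustrative only: power z for each error distribution listed on the card.
Z_BY_DISTRIBUTION = {
    "normal": 0,
    "poisson": 1,
    "gamma": 2,
    "inverse_gaussian": 3,
}


def incremental_variance(m, phi, distribution="poisson"):
    """Return phi * m**z, where m is the fitted incremental loss."""
    z = Z_BY_DISTRIBUTION[distribution]
    return phi * m ** z


# Same fitted value and scale parameter under each error distribution.
variances = {name: incremental_variance(100.0, 2.0, name) for name in Z_BY_DISTRIBUTION}
```

With m = 100 and phi = 2, the variance grows from 2 (normal) to 200 (Poisson) and far beyond for z = 2 and z = 3, which is why the choice of z matters most for large cells.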
Linear predictor GLM parameters (4)
- c = constant level parameter
- alpha = AY adjustments to constant level parameter
- beta = development period parameter
- gamma = CY trend parameter
Consequences of ODP fitted incremental claims = CL incremental claims (3)
(Shapland)
- simple link ratio algorithm can be used in place of the GLM algorithm while keeping the GLM framework
- use of age-to-age factors allows the model to be easily explained
- link ratios still work with negative incremental claims (which a log-link GLM cannot model directly)
Unscaled (aka normalized) residual
Shapland
r_p = (q - m) / sqrt(m^z)
q = actual incremental losses
m = fitted incremental losses
z = 1 for ODP
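A small sketch of the residual formula on this card. The function name and the guard against a non-positive `m**z` are assumptions added for illustration.

```python
import math


def unscaled_residual(q, m, z=1):
    """Unscaled Pearson residual (q - m) / sqrt(m**z); z = 1 is the ODP case."""
    variance_base = m ** z
    # Assumption: reject cells where the square root would be undefined.
    if variance_base <= 0:
        raise ValueError("m**z must be positive to take its square root")
    return (q - m) / math.sqrt(variance_base)


# Actual 110 vs. fitted 100 under ODP: (110 - 100) / sqrt(100) = 1.0
example = unscaled_residual(110.0, 100.0)
```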
Scale parameter formula
sum(r^2) / (N - p)
where N = # data cells and p = # parameters
**should always be calculated from unscaled residuals
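The scale parameter formula above, as a short sketch. It takes a flat list of unscaled residuals, and the function name is hypothetical.

```python
def scale_parameter(unscaled_residuals, n_params):
    """phi = sum(r^2) / (N - p), computed from unscaled residuals."""
    n_cells = len(unscaled_residuals)
    if n_cells <= n_params:
        raise ValueError("need more data cells than parameters")
    return sum(r * r for r in unscaled_residuals) / (n_cells - n_params)


# Six residuals, one parameter: sum of squares 10 over 5 degrees of freedom.
phi = scale_parameter([1.0, -1.0, 2.0, -2.0, 0.0, 0.0], n_params=1)
```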
Number of parameters
Shapland
= 2 * # AYs - 1
or with additional development period/adjustment parameters:
= # AYs
+ (# development period parameters - 1)
+ (# hetero adjustment groups - 1)
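Both parameter counts on this card, sketched as two hypothetical helpers. The consistency check in the comment (the two counts agree when there are as many development period parameters as AYs and a single hetero group) is an observation about the formulas as written, not a claim from the source.

```python
def n_params_basic(n_ays):
    """Standard count: 2 * (# AYs) - 1."""
    return 2 * n_ays - 1


def n_params_extended(n_ays, n_dev_params, n_hetero_groups=1):
    """# AYs + (# development period parameters - 1) + (# hetero groups - 1)."""
    return n_ays + (n_dev_params - 1) + (n_hetero_groups - 1)


# With 10 AYs, 10 development period parameters, and one hetero group,
# the extended count matches the basic count.
basic = n_params_basic(10)
extended = n_params_extended(10, 10, 1)
```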
Scaled residuals and what scaling accounts for
scaled residual = unscaled residual * DOF adj. factor
DOF adj. factor = sqrt ( N / (N - p))
> > corrects for bias/over-dispersion in residuals
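A sketch of the degrees-of-freedom adjustment above, assuming the residuals are held in a flat list so that N is just its length. Function names are illustrative.

```python
import math


def dof_adjustment_factor(n_cells, n_params):
    """sqrt(N / (N - p))."""
    if n_cells <= n_params:
        raise ValueError("need more data cells than parameters")
    return math.sqrt(n_cells / (n_cells - n_params))


def scaled_residuals(unscaled, n_params):
    """Multiply every unscaled residual by the DOF adjustment factor."""
    factor = dof_adjustment_factor(len(unscaled), n_params)
    return [r * factor for r in unscaled]


# N = 9, p = 5: factor = sqrt(9 / 4) = 1.5
factor = dof_adjustment_factor(9, 5)
```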
Standardized residuals and what standardization accounts for
standardized residuals = unscaled residual * hat matrix adjustment factor
hat matrix adjustment factor = sqrt ( 1 / (1 - ith position on diagonal of hat matrix) )
> > ensures residuals have constant variance
Scale parameter approximation using standardized residuals
scale parameter = sum (squared standardized residuals) / N
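The standardized-residual and scale-parameter-approximation cards can be sketched together. This assumes the diagonal of the hat matrix is already available as a list `hat_diagonal` (computing the hat matrix itself is not shown), and all names are hypothetical.

```python
import math


def hat_adjustment_factor(h_ii):
    """sqrt(1 / (1 - h_ii)) for one diagonal element of the hat matrix."""
    if not 0.0 <= h_ii < 1.0:
        raise ValueError("hat matrix diagonal must lie in [0, 1)")
    return math.sqrt(1.0 / (1.0 - h_ii))


def standardized_residuals(unscaled, hat_diagonal):
    """Unscaled residual times its own hat matrix adjustment factor."""
    return [r * hat_adjustment_factor(h) for r, h in zip(unscaled, hat_diagonal)]


def approx_scale_parameter(standardized):
    """Approximation: sum of squared standardized residuals divided by N."""
    return sum(r * r for r in standardized) / len(standardized)


std = standardized_residuals([1.0, -2.0], [0.75, 0.0])
phi_approx = approx_scale_parameter(std)
```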
How to incorporate process variance into incremental claims estimates
assume each projected future incremental value is the mean of a gamma distribution with variance = scale parameter * mean, then simulate future incremental losses from that distribution
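A hedged sketch of adding process variance with a gamma distribution, using only the Python standard library. It assumes the ODP variance phi * mean, so shape = mean / phi and scale = phi. The function name, the seed, and the requirement that the mean be positive are illustrative choices.

```python
import random


def simulate_incremental(mean, phi, rng):
    """Draw one future incremental loss from a gamma with the given mean
    and variance phi * mean (assumes mean > 0 and phi > 0)."""
    if mean <= 0 or phi <= 0:
        raise ValueError("this sketch only handles positive mean and phi")
    shape = mean / phi  # shape * scale = mean
    scale = phi         # shape * scale**2 = phi * mean
    return rng.gammavariate(shape, scale)


rng = random.Random(2024)  # arbitrary seed so the run is repeatable
draws = [simulate_incremental(100.0, 10.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / (len(draws) - 1)
```

The sample mean and variance should land near 100 and 1,000 (phi * mean) respectively.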
Outcomes when modeling paid vs. incurred data
paid data - outcomes represent total unpaid
incurred data - outcomes represent IBNR
Common problem with ODP bootstrap model
Shapland
most recent AYs have more variance than expected b/c more age-to-age factors are used to extrapolate the sample values
> > correct for this by using BF/CC method
Limitations of the ODP bootstrap model (2)
- does not account for CY effects
- tends to over-parameterize the model
Drawbacks to the GLM bootstrap model (2)
- GLM must be solved with every iteration, which slows down simulation
- model is no longer directly explainable with age-to-age factors
Benefits of the GLM bootstrap model (4)
flexibility
- fewer parameters help to avoid over-parameterization
- ability to add CY trend parameter
- flexibility to model triangles with incomplete data
- allows for matching model parameters to the statistical features of the data
ODP vs. GLM bootstrap sampling with trapezoidal data OR when using an L-yr weighted average of LDFs
ODP - link ratios need cumulative claims, so residuals from the trapezoid are sampled for the entire triangle
GLM - models incremental claims directly, sample from trapezoid to trapezoid
ODP (3) vs. GLM bootstrap methods for handling missing data
ODP -
- estimate missing value from surrounding values
- exclude missing value from LDF calculations (and no corresponding residual)
- if on latest diagonal, estimate from surrounding or use 2nd latest diagonal value
GLM - simply reduce N
Options for handling outliers in the ODP bootstrap model (2) and GLM bootstrap model
ODP -
- exclude and treat as a missing value
- exclude residual but still sample that cell; this removes extreme residual but keeps variability; use sample value to calculate LDFs and ultimate
GLM - treat like a missing value
Methods for handling heteroscedasticity under ODP (3) and GLM bootstrap models
ODP -
- stratified sampling
- variance parameter adjustment
- scale parameter adjustment
GLM - adding/removing parameters can reduce heteroscedasticity
Advantage and disadvantage of stratified sampling
advantage - simple and straightforward to implement
disadvantage - small number of residuals limits variability
Type of residuals used in variance parameter adjustment and scale parameter adjustments for heteroscedasticity
variance parameter adjustment > > standardized residuals
scale parameter adjustment > > use unscaled residuals to calculate adjustment factors but adjust standardized residuals
Variance parameter adjustment factor
= standard deviation (all residuals) / standard deviation (residuals in group i)
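A sketch of the variance parameter adjustment factor. It assumes the residuals are already grouped in a dict and uses the sample standard deviation (`statistics.stdev`); the card does not say whether the sample or population form is intended, so that choice is an assumption.

```python
import statistics


def variance_adjustment_factors(residuals_by_group):
    """stdev(all residuals) / stdev(residuals in group i), per group."""
    all_residuals = [r for group in residuals_by_group.values() for r in group]
    overall_sd = statistics.stdev(all_residuals)
    return {
        name: overall_sd / statistics.stdev(group)
        for name, group in residuals_by_group.items()
    }


factors = variance_adjustment_factors({
    "low_variance": [1.0, -1.0, 1.0, -1.0],
    "high_variance": [2.0, -2.0, 2.0, -2.0],
})
```

The quieter group gets a factor above 1 and the noisier group a factor below 1, so the adjusted residuals share a common spread before sampling.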
Disadvantage of variance and scale parameter adjustments and how to correct them
both alter the original distribution of residuals
> > undo the adjustment after re-sampling by dividing each sampled residual by the adjustment factor of its destination group
Number of parameters for variance and scale parameter adjustments for heteroscedasticity
group adjustment factors are considered new parameters (impacts p)
Scale parameter adjustment factors
= sqrt ( overall scale parameter ) / sqrt ( scale parameter for group i )
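scale parameter = [ (N / (N - p)) * sum (squared unscaled residuals) ] / n, where for group i the sum runs over the n = n_i residuals in that group, and for the overall parameter n = N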
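A sketch of the scale parameter adjustment factors. It assumes the group-level scale parameter sums over, and divides by, only the residuals in that group while keeping the overall N / (N - p) factor. Names are hypothetical, and `n_params` is whatever parameter count the caller has settled on.

```python
import math


def scale_adjustment_factors(unscaled_by_group, n_params):
    """sqrt(overall scale parameter) / sqrt(scale parameter for group i)."""
    n_cells = sum(len(group) for group in unscaled_by_group.values())
    if n_cells <= n_params:
        raise ValueError("need more data cells than parameters")
    dof = n_cells / (n_cells - n_params)
    total_sq = sum(r * r for group in unscaled_by_group.values() for r in group)
    overall_phi = dof * total_sq / n_cells
    factors = {}
    for name, group in unscaled_by_group.items():
        # Assumption: group sum and group count n_i, overall DOF factor.
        group_phi = dof * sum(r * r for r in group) / len(group)
        factors[name] = math.sqrt(overall_phi) / math.sqrt(group_phi)
    return factors


scale_factors = scale_adjustment_factors(
    {"a": [1.0, -1.0, 1.0, -1.0], "b": [2.0, -2.0, 2.0, -2.0]},
    n_params=2,
)
```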
Handling partial 1st development period data under the ODP bootstrap model
reduce (scale) projected payments
Handling partial last calendar period data under the ODP bootstrap model
annualize exposures when calculating LDFs and reduce (scale) projected payments
Tail factor decay model and standard deviation rule of thumb for the ODP bootstrap model
under decay model: next age-to-ultimate factor = 1 + decay percentage * (prior age-to-ultimate - 1)
standard deviation (tail factor) <= .5 * (tail factor - 1)
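A worked sketch of the two formulas on this card, applied exactly as written. The function names and the sample inputs (a 1.10 starting factor and 50% decay) are made up for illustration.

```python
def next_factor(prior_factor, decay):
    """Decay model as written on the card: 1 + decay * (prior factor - 1)."""
    return 1.0 + decay * (prior_factor - 1.0)


def decay_sequence(first_factor, decay, n_periods):
    """Apply the decay step repeatedly, starting from first_factor."""
    factors = [first_factor]
    for _ in range(n_periods - 1):
        factors.append(next_factor(factors[-1], decay))
    return factors


def max_tail_sd(tail_factor):
    """Rule of thumb: standard deviation <= 0.5 * (tail factor - 1)."""
    return 0.5 * (tail_factor - 1.0)


sequence = decay_sequence(1.10, 0.5, 3)  # roughly 1.10, 1.05, 1.025
sd_cap = max_tail_sd(1.05)               # roughly 0.025
```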
Tail factors in the GLM bootstrap model
implicit assumption that the last development/CY period parameters apply incrementally until the effect is negligible (get a tail factor without needing to specify one)
Purposes of diagnostic tests (3)
- test model assumptions
- gauge quality of fit
- guide adjustment of model parameters
Test for whether residuals are i.i.d. (and alternative visualization)
plot residuals against development period/AY/CY and look for a random pattern
- alternative: can graph relative standard deviations to visualize groupings
Tests for normality (4)
- plot of normal inverse against residuals - want residuals tightly distributed around diagonal
- p-value > 5%
- R^2 value close to 1.00
- small AIC or BIC values (which penalize for additional parameters)
**residuals required to be i.i.d., but not normal
When to add/remove parameters
add: if residuals are not randomly scattered around the 0 line
remove: if not statistically significant
Relationship between standard error, age, and CoV and individual years vs. total for AY and CY
For AY
SE is larger for recent years because there are more unpaid losses; decreases with age (CY is opposite)
CoV smaller for recent years (except most recent) due to large unpaid losses; variability is offset
CoV all years < CoV individual (diversification)
most recent CoV high due to parameter uncertainty
Reasons for CoV to increase in most recent years (2)
- parameter uncertainty increases as age decreases, so with additional parameters, parameter variance may overpower process uncertainty
- model may be over-estimating variability in most recent years (use BF/CC)
Methods for combining multiple models (2)
- run models with the same random variables (sample residuals from the same position) - each model gets a weight for every iteration (correlated results)
- run models with independent random variables - for each iteration only use 1 model where model selected is determined by weights
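The two combination methods above, sketched on lists of simulated results. It assumes `results_by_model[k][i]` holds model k's result for iteration i and that the weights sum to 1; all names are hypothetical.

```python
import random


def combine_weighted(results_by_model, weights):
    """Method 1: every iteration is a weighted average across the models
    (appropriate when the models share the same random variables)."""
    n_iter = len(results_by_model[0])
    return [
        sum(w * model[i] for w, model in zip(weights, results_by_model))
        for i in range(n_iter)
    ]


def combine_by_selection(results_by_model, weights, rng):
    """Method 2: each iteration uses one model, chosen with probability
    equal to its weight (models run on independent random variables)."""
    n_iter = len(results_by_model[0])
    picks = rng.choices(range(len(results_by_model)), weights=weights, k=n_iter)
    return [results_by_model[m][i] for i, m in enumerate(picks)]


model_a = [100.0, 200.0]
model_b = [300.0, 400.0]
averaged = combine_weighted([model_a, model_b], [0.25, 0.75])
selected = combine_by_selection([model_a, model_b], [0.25, 0.75], random.Random(1))
```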
Correlation processes for aggregating LOB results (2)
- location mapping - selects residuals from same location in all triangles and preserves correlation
- re-sorting - specify desired correlation by re-sorting residuals until the rank correlation matches the desired correlation
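A sketch of location mapping only (re-sorting is not shown). It assumes every line of business stores its residuals as a flat list in the same cell order and with the same length, so one set of sampled positions can be reused across all triangles. Names are hypothetical.

```python
import random


def sample_with_location_mapping(residuals_by_lob, n_draws, rng):
    """Draw positions once, then read the residual at each position
    from every line of business."""
    sizes = {len(res) for res in residuals_by_lob.values()}
    if len(sizes) != 1:
        # Assumption of this sketch: triangles share the same shape.
        raise ValueError("all residual lists must have the same length")
    n_cells = sizes.pop()
    positions = [rng.randrange(n_cells) for _ in range(n_draws)]
    return {
        lob: [res[p] for p in positions]
        for lob, res in residuals_by_lob.items()
    }


sampled = sample_with_location_mapping(
    {"auto": [0.1, 0.2, 0.3, 0.4], "home": [1.0, 2.0, 3.0, 4.0]},
    n_draws=6,
    rng=random.Random(3),
)
```

Because both lines read from the same positions, whatever correlation exists between their residuals in the data carries through to the samples.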
How to solve for GLM parameters (Shapland)
Solve for the parameters that minimize the squared difference between ln(actual incremental losses) and ln(expected incremental losses).
Handling negative incremental values for the GLM bootstrap model (2)
- If the sum of incremental losses in the column > 0, use the modified log link function
- If the sum of incremental losses in the column < 0, shift the entire loss triangle by the amount of the largest negative value. After running the GLM, reverse the adjustment on the fitted incremental losses (before calculating residuals)
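A sketch of both options. The card does not spell out the modified log link; the form below (ln(q) for q > 0, 0 for q = 0, -ln|q| for q < 0) is the definition as I understand it from Shapland and should be treated as an assumption. The shift helpers work on a flat list of incremental values, and their names are hypothetical.

```python
import math


def modified_log_link(q):
    """Assumed form: ln(q) if q > 0, 0 if q == 0, -ln(|q|) if q < 0."""
    if q > 0:
        return math.log(q)
    if q < 0:
        return -math.log(-q)
    return 0.0


def shift_incrementals(incrementals):
    """Shift all values by the largest negative value psi so none is negative."""
    psi = min(incrementals)
    if psi >= 0:
        return list(incrementals), 0.0
    return [q - psi for q in incrementals], psi


def unshift_fitted(fitted_shifted, psi):
    """Reverse the shift on fitted values before calculating residuals."""
    return [m + psi for m in fitted_shifted]


shifted, psi = shift_incrementals([10.0, -5.0, 3.0])  # psi = -5.0
restored = unshift_fitted(shifted, psi)
```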