Shapland Flashcards

1
Q

What is the purpose of the GLM model?

A

to model the incremental losses q(w,d)

2
Q

How to handle negative incremental values in the triangle?

A
  1. If the sum of the column is negative -
    • subtract the largest negative value from every cell in the triangle (so the most negative cell becomes 0)
    • solve the GLM using the modified triangle
    • add the largest negative value back to the fitted incremental values
  2. If an individual value is negative but the column sum is positive -
    • use the modified log link (picture not reproduced): ln(q) for q > 0, 0 for q = 0, -ln|q| for q < 0
    • then solve the GLM
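A minimal sketch of both treatments, assuming the missing picture shows the modified log link (ln q for q > 0, 0 for q = 0, -ln|q| for q < 0). The function names and the list-of-rows triangle layout (with None for empty cells) are illustrative only.

```python
import math

def modified_log(q):
    """Modified log link for a cell whose column sum is still positive."""
    if q > 0:
        return math.log(q)
    if q < 0:
        return -math.log(abs(q))
    return 0.0

def shift_triangle(incr):
    """Subtract the largest negative value (psi) from every cell.

    Returns the shifted triangle and psi; psi is 0.0 if nothing is negative.
    """
    psi = min(v for row in incr for v in row if v is not None)
    if psi >= 0:
        return incr, 0.0
    shifted = [[None if v is None else v - psi for v in row] for row in incr]
    return shifted, psi

def unshift(fitted, psi):
    """Add psi back to the fitted incremental values after solving the GLM."""
    return [[None if v is None else v + psi for v in row] for row in fitted]
```

After the shift the most negative cell is exactly 0, so the ordinary log link can be used on the modified triangle.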
3
Q

How to calculate Standardized Residual

A

Residual divided by its standard deviation (a.k.a. standard error)

4
Q

Model Result Reasonability check for Standard Error

A

Standard Error
- S.E. should increase from older to more recent years (because s.e. follows the magnitude of the results)
- Total s.e. should be larger than any individual year s.e.
- Total s.e. should be less than the sum of s.e. across all AYs (since the model assumes independence between AYs)

5
Q

Model Result Reasonability check for CoV

A

CoV
- CoV should GENERALLY decrease when moving from the oldest to the most recent years
- CoV may rise in the most recent years due to:
1. More parameters feed into the most recent years, bringing higher parameter uncertainty
2. The model may be overestimating the variability in the most recent years
- Total CoV should be smaller than any individual year’s CoV
- Total CoV should be less than the sum of CoV across all AYs

6
Q

Why homoscedasticity is needed for bootstrapping

A

The bootstrap assumes the residuals are independent and identically distributed. Heteroscedasticity violates this assumption because the variance of the residuals differs across the triangle

7
Q

What are the solutions for heteroscedasticity?

A
  1. Stratified sampling
  2. Calculating Variance Params
  3. Calculating Scale Params
8
Q

How to handle heteroscedasticity with stratified sampling

A

Group development periods with homogeneous variances
Sample with replacement only from the residuals within each group
Disadvantage - some groups may have only a few residuals, which reduces credibility

9
Q

How to handle heteroscedasticity by calculating variance parameters

A

Group development periods with homogeneous variances
Calculate the standard deviation of the standardized residuals in each of the “hetero” groups
Calculate the hetero-adjustment factor hi = (std dev of all standardized residuals combined)/(std dev of standardized residuals in group i)
Multiply all residuals in group i by hi
Sample with replacement from the ENTIRE triangle. Divide each resampled residual by the hi of the cell it is placed in
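The steps above can be sketched as follows. This is an illustration only: the group labels, function names, and the dict-of-lists layout are assumptions, and the sample standard deviation is used for every group.

```python
import random
import statistics

def hetero_factors(residuals_by_group):
    """h_i = stdev(all standardized residuals) / stdev(residuals in group i)."""
    all_res = [r for rs in residuals_by_group.values() for r in rs]
    sd_all = statistics.stdev(all_res)
    return {g: sd_all / statistics.stdev(rs) for g, rs in residuals_by_group.items()}

def adjusted_pool(residuals_by_group, h):
    """Multiply each residual by its group's factor so all groups share one variance."""
    return [r * h[g] for g, rs in residuals_by_group.items() for r in rs]

def resample_for_cell(pool, h, group, rng):
    """Draw from the whole pool, then divide by the factor of the target cell's group."""
    return rng.choice(pool) / h[group]

groups = {"early": [-2.0, 1.0, 2.5, -1.5], "late": [-0.2, 0.1, 0.3, -0.2]}
h = hetero_factors(groups)
pool = adjusted_pool(groups, h)
draw = resample_for_cell(pool, h, "late", random.Random(0))
```

Dividing by hi at the end restores the original variance level of the group the cell belongs to.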

10
Q

How to handle heteroscedasticity with scale parameters

A

Hetero-adj factors based on different scale params:
use the ratio of SQRT(overall scale param) to SQRT(scale param for hetero group i).

N = total cells in the triangle
p = alpha (AYs) + beta (development periods, usually alpha - 1) + # hetero-adj factors
# hetero-adj factors = # of hetero groups - 1
ni = # cells in group i

r(w,d) here is the Pearson (unscaled) residual, not the standardized residual

The hetero-adj factors are then used the same way as the hetero-adj factors based on the s.e. of residuals
Scale param for hetero group i = [N / (N - p)] × (sum of r(w,d)^2 over the cells in group i) / ni
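A hedged sketch of the calculation, since the card's formula image is not reproduced. It assumes the group scale parameter is [N / (N - p)] × (sum of squared unscaled Pearson residuals in group i) / ni and the overall scale parameter is (sum of all squared residuals) / (N - p). Function and argument names are illustrative.

```python
import math

def scale_hetero_factors(residuals_by_group, n_ay, n_dev):
    """Return (overall scale param, {group: hetero-adjustment factor}).

    residuals_by_group holds unscaled Pearson residuals, not standardized ones.
    n_ay = alpha (AY parameters); n_dev = beta (development parameters).
    """
    n_cells = sum(len(rs) for rs in residuals_by_group.values())      # N
    n_params = n_ay + n_dev + (len(residuals_by_group) - 1)           # p
    dof = n_cells - n_params
    phi = sum(r * r for rs in residuals_by_group.values() for r in rs) / dof
    factors = {}
    for g, rs in residuals_by_group.items():
        phi_g = (n_cells / dof) * sum(r * r for r in rs) / len(rs)
        factors[g] = math.sqrt(phi) / math.sqrt(phi_g)
    return phi, factors
```

For a 4x4 triangle with two hetero groups this gives N = 10 and p = 4 + 3 + 1 = 8.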

11
Q

How to handle Exposure Change in triangle

A

Divide all loss data by exposure for each AY to get pure premiums
Run the model on the pure premiums
Multiply the results back by exposure to get total losses

12
Q

How to handle Heteroecthesious data in triangle (partial first development period data)

A

Partial first development period data -
reduce future incremental losses for the latest AY to correspond to the earned exposure
-> then simulate process variance

13
Q

How to handle heteroecthesious data (partial last calendar period data)

A

Partial last calendar period data -

  1. Annualize the last diagonal so that it is in line with the rest of the triangle
  2. Calculate the fitted triangle and residuals
  3. During ODP bootstrap simulation, calculate and interpolate LDFs from the fully annualized sample triangles
  4. De-annualize the last diagonal
  5. Project future values by multiplying the interpolated LDFs with the new cumulative values
  6. Reduce future incremental values for the latest AY to remove future exposure
14
Q

Formula for unscaled Pearson residual

A

r(w,d) = [q(w,d) - m(w,d)] / sqrt(m(w,d)^z)
15
Q

General ODP Bootstrap

A
  • calculate the age-to-age factors
  • calculate the fitted cumulative losses by starting with the latest diagonal and working backward through the triangle (dividing by the age-to-age factors)
  • calculate the fitted incremental losses
  • calculate the actual incremental losses
  • calculate the unscaled Pearson residuals
  • calculate the hat matrix adjustment factors
  • calculate the standardized residuals
  • randomly sample from the standardized residuals with replacement
  • convert the sampled standardized residuals into sample incremental losses
  • use the sample incremental losses to create a triangle of sample cumulative losses
  • project the sample cumulative losses to ultimate
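The steps above can be sketched for a single iteration as below. This is a simplified illustration, not the full procedure: it ASSUMES a degrees-of-freedom factor sqrt(N / (N - p)) in place of the hat matrix adjustment factors, uses z = 1, skips the near-zero corner residuals, and leaves out process variance. All function names are hypothetical.

```python
import math
import random

def age_to_age(cum):
    """Volume-weighted age-to-age factors from a cumulative triangle."""
    factors = []
    for d in range(len(cum) - 1):
        rows = [r for r in cum if len(r) > d + 1]
        factors.append(sum(r[d + 1] for r in rows) / sum(r[d] for r in rows))
    return factors

def to_incremental(cum):
    return [[r[0]] + [r[d] - r[d - 1] for d in range(1, len(r))] for r in cum]

def to_cumulative(incr):
    out = []
    for r in incr:
        total, row = 0.0, []
        for v in r:
            total += v
            row.append(total)
        out.append(row)
    return out

def fitted_cumulative(cum, factors):
    """Start from the latest diagonal and divide backward through the triangle."""
    fitted = []
    for r in cum:
        row = [r[-1]]
        for d in range(len(r) - 1, 0, -1):
            row.insert(0, row[0] / factors[d - 1])
        fitted.append(row)
    return fitted

def bootstrap_iteration(cum, rng):
    """One sample triangle, projected to ultimate by accident year."""
    n = len(cum)
    actual = to_incremental(cum)
    fitted = to_incremental(fitted_cumulative(cum, age_to_age(cum)))
    n_cells = sum(len(r) for r in cum)
    n_params = 2 * n - 1
    # ASSUMPTION: degrees-of-freedom factor instead of hat matrix factors.
    dof_adj = math.sqrt(n_cells / (n_cells - n_params))
    pool = []
    for q_row, m_row in zip(actual, fitted):
        for q, m in zip(q_row, m_row):
            r = (q - m) / math.sqrt(abs(m))      # unscaled Pearson residual, z = 1
            if abs(r) > 1e-9:                    # skip structurally zero corners
                pool.append(r * dof_adj)
    sample_incr = [[m + rng.choice(pool) * math.sqrt(abs(m)) for m in m_row]
                   for m_row in fitted]
    sample_cum = to_cumulative(sample_incr)
    sample_factors = age_to_age(sample_cum)
    ultimates = []
    for row in sample_cum:
        ult = row[-1]
        for d in range(len(row) - 1, n - 1):
            ult *= sample_factors[d]
        ultimates.append(ult)
    return ultimates

triangle = [[1000, 1800, 2100, 2200], [1100, 2000, 2350], [1200, 2150], [1300]]
ultimates = bootstrap_iteration(triangle, random.Random(0))
```

Repeating `bootstrap_iteration` many times gives a distribution of ultimates per accident year. A triangle that fits the factors perfectly would leave the residual pool empty, so this sketch needs at least some noise in the data.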
16
Q

GLM model setup

A
  • set up the model as shown in the graph below
  • once set up, fit the model to the incremental loss triangle using iterated least squares or maximum likelihood
17
Q

Formula for unscaled Pearson residuals

A

r(w,d) = [q(w,d) - m(w,d)] / sqrt(m(w,d)^z)
note that z=1 for Poisson (most of the time)

18
Q

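Formula for standardized residuals

A

rH(w,d) = r(w,d) × fH(w,d), where fH(w,d) = sqrt(1 / (1 - H(i,i))) is the hat matrix adjustment factor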
19
Q

What is the power z in the estimated variance for each distribution?

A

Poisson z=1
Gamma z=2
Inverse Gaussian z=3
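A small sketch of how z enters the unscaled Pearson residual, assuming the residual is (q - m) / sqrt(m^z) with a positive fitted value m. The dictionary keys and function name are illustrative.

```python
import math

# Power z in the variance function m^z for each error distribution.
Z_BY_DISTRIBUTION = {"poisson": 1, "gamma": 2, "inverse_gaussian": 3}

def unscaled_pearson(q, m, dist="poisson"):
    """(actual - fitted) / sqrt(fitted^z); assumes m > 0."""
    z = Z_BY_DISTRIBUTION[dist]
    return (q - m) / math.sqrt(m ** z)
```

With q = 110 and m = 100 the residual is 1.0 for Poisson, 0.1 for gamma, and 0.01 for inverse Gaussian, so a higher z shrinks residuals on large cells.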

20
Q

Ways to handle Outliers

A
  • if extreme values exist in the original triangle, we can remove their impact from the model
  • if using the ODP bootstrap model, use one of the options below
    • exclude the outliers completely and treat them as missing values
    • exclude the outliers from the ATA factor and residual calculations, but include the outlier cells during the sample triangle projection process
21
Q

Three options to remove outliers when calculating ATA factors

A
  • exclude the row only if the outlier is in the numerator
  • exclude the row only if the outlier is in the denominator
  • exclude the row if the outlier is in either the numerator or the denominator
22
Q

What do we do if there is a significant number of outliers?

A

May indicate a poor fit of the model
For GLM bootstrap, choose new params or change the distribution
For ODP bootstrap, an L-year weighted average can be used to provide a better model fit, but if the skewness is real, the bootstrap will keep it

23
Q

How to handle missing values?

A

GLM Bootstrap Model -
Missing data simply reduces the number of observations in the data

ODP Bootstrap Model -
estimate from surrounding values
or, modify LDFs to exclude missing values

Solution 1: estimate missing values from surrounding values
Solution 2: modify LDFs to exclude the missing value, no residual for missing value -> don’t resample from the missing value
Solution 3: if missing value is on the latest diagonal, estimate value/ use value in the 2nd to last diagonal
24
Q

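How to handle negative values during simulation of process variance (i.e. the mean m(w,d) is negative)?

A

Option 1:
- simulate using the absolute value of the mean, then change the sign of the simulated value
Option 2:
- simulate using the absolute value of the mean, then shift the entire distribution so that it has a mean of m(w,d)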
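Both options can be sketched as below. The gamma distribution with mean |m| and variance phi × |m| is an assumption chosen for illustration, as are the function names and the `option` switch.

```python
import random

def simulate_gamma(mean, phi, rng):
    """Gamma draw with the given positive mean and variance phi * mean."""
    return rng.gammavariate(mean / phi, phi)   # shape = mean / phi, scale = phi

def process_variance(m, phi, rng, option=1):
    """Simulate an incremental value around m, allowing a negative mean."""
    if m > 0:
        return simulate_gamma(m, phi, rng)
    if m == 0:
        return 0.0
    x = simulate_gamma(-m, phi, rng)           # simulate with |m| as the mean
    if option == 1:
        return -x                              # Option 1: flip the sign
    return x + 2 * m                           # Option 2: shift so the mean is m
```

Option 1 mirrors the distribution, so its skew points to the left. Option 2 keeps the right skew and the same variance, at the cost of allowing values on either side of m.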

25
Advantages of bootstrap model
- Generates a distribution of possible outcomes as opposed to a single point estimate: provides more info on potential results and can be used for capital modeling
- Can be modified to the statistical features of the data under analysis
- Preserves the original data distribution (skewness etc.)
26
Reasons for more focus by actuaries on unpaid claims distributions
- SEC is looking for more reserving risk information from publicly traded companies
- Major rating agencies have dynamic risk models for rating and welcome input from company actuaries about reserve distributions
- Companies use dynamic risk models for internal risk management and need unpaid claim distributions
27
ODP Model overview
Incremental claims q(w,d) are modeled directly using a GLM
GLM structure:
- Log link
- Over-dispersed Poisson error distribution
Steps
1) Use the model to estimate parameters
2) Use bootstrapping (sampling residuals with replacement) to estimate the total distribution
28
Simplified GLM model
Fitted (expected) incrementals from the GLM with a Poisson error distribution are the same as the incremental losses implied by volume-weighted average LDFs
Simplified GLM Method
1) Use the cumulative claim triangle to calculate LDFs
2) Develop losses to ultimate
3) Calculate the expected cumulative triangle
4) Calculate the expected incremental triangle from the cumulative triangle
29
Bootstrapping BF and Cape Cod Models
With the ODP bootstrap model, iterations for the latest few accident years can result in more variance than expected
BF Method
- Incorporate the BF model by using a priori loss ratios for each AY, with a standard deviation for each loss ratio and an assumed distribution
- During simulation, simulate a new a priori loss ratio for each iteration
Cape Cod Method
- Apply the Cape Cod algorithm to each iteration of the bootstrap model
30
Generalizing the ODP Model: pros/cons of using fewer parameters
Pros
- Helps avoid over-parameterizing the model
- Allows the ability to add parameters for calendar-year trends
- Can be used to model data shapes other than data in triangle form
Cons
- The GLM must be solved for each iteration of the bootstrap model, slowing simulations
- The model is no longer directly explainable to others using age-to-age factors
31
Options to address negative incremental values during simulation
1) Remove extreme iterations from results
2) Recalibrate the model after identifying the sources of negative incrementals
   E.g. remove a row with sparse data from when the product was first written
3) Limit incremental losses to zero
   Replace negative incrementals in the original data with zero
32
How to treat non-zero sum of residual
Residuals calculated in a bootstrap model are just error terms, so they should be identically distributed with a mean of zero (although this is usually not the case)
A non-zero average does not mean the residuals are incompatible with the true distribution
To set the average of the residuals to zero, add a single constant to all residuals
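A minimal sketch of the adjustment, assuming the constant is minus the average residual and is applied to every residual in the pool. The function name is illustrative.

```python
def center_residuals(residuals):
    """Add one constant to every residual so that the residuals sum to zero."""
    constant = -sum(residuals) / len(residuals)
    return [r + constant for r in residuals]

centered = center_residuals([0.8, -0.3, 1.1, -0.4, 0.3])
```

The spread of the residuals is unchanged; only their location moves.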
33
Using N year wtd average
With GLM:
- Exclude the first few diagonals and use only the latest N+1 diagonals to parameterize the model
- Run bootstrap simulations, sampling residuals only for the trapezoid that is used to parameterize the model
With simplified GLM:
- Calculate N-year average LDFs
- Run bootstrap simulations, sampling residuals for the entire triangle in order to calculate cumulative values
- Use N-year average factors to project future expected values for each iteration
34
Pros and Cons of adjusting heteroscedasticity using hetero-adjustment factors
Pro: can now resample with replacement from the entire triangle
Cons: adds parameters, affecting the degrees of freedom and the scale parameter
35
Exposure adjustments
Issue: Exposures changed significantly over the years (i.e. a rapidly growing line or a line in runoff)
Adjustment
- If earned exposures exist, divide all claims data by exposures for each accident year and run the model with pure premiums
- After the process variance step, multiply back by AY exposures to get total claims
36
Parametric Bootstrapping
Purpose: Parametric bootstrapping is a way to overcome a lack of extreme residuals in an ODP bootstrap model
Steps
- Fit a parameterized distribution to the residuals
- Resample residuals from the fitted distribution instead of from the observed residuals
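A minimal sketch of the two steps, ASSUMING a normal distribution purely for illustration (the card does not say which distribution to fit). The function name is hypothetical.

```python
from statistics import NormalDist

def parametric_residuals(observed, n, seed=None):
    """Fit a normal distribution to the residuals and draw n new residuals from it."""
    dist = NormalDist.from_samples(observed)   # ASSUMPTION: normal fit
    return dist, dist.samples(n, seed=seed)

observed = [-1.5, -0.5, 0.0, 0.4, 1.6]
dist, draws = parametric_residuals(observed, 1000, seed=42)
```

Because the draws come from the fitted curve, they can fall outside the range of the observed residuals, which is the point of the method.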
37
Purposes of Bootstrap Diagnostics
- Test the assumptions in the model
- Gauge the quality of the model's fit to the data
- Help guide adjustments of the model parameters to improve the fit of the model
Purpose: Find a set of models and parameters that results in the most realistic and most consistent simulations based on the statistical features of the data
38
Residual graphs examples that help testing the assumption of IID
- Residuals v Development Period -> look for heteroscedasticity
- Residuals v Accident Period
- Residuals v Payment Period
- Residuals v Predicted
39
AIC and BIC formulas
- Smaller values indicate a better fit
- More params are penalized
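The formulas themselves are not reproduced on this card. As a hedged illustration, the sketch below ASSUMES the common residual-sum-of-squares forms AIC = 2p + n[ln(2π·RSS/n) + 1] and BIC = n·ln(RSS/n) + p·ln(n), where n is the number of data points and p the number of parameters; check them against the source before relying on them.

```python
import math

def aic(rss, n, p):
    """ASSUMED form: 2p + n * (ln(2 * pi * RSS / n) + 1)."""
    return 2 * p + n * (math.log(2 * math.pi * rss / n) + 1)

def bic(rss, n, p):
    """ASSUMED form: n * ln(RSS / n) + p * ln(n)."""
    return n * math.log(rss / n) + p * math.log(n)
```

In both, a lower RSS lowers the score and each extra parameter raises it, which matches the two bullets above.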
40
How to identify Outliers
- Use a box-whisker plot
- Whiskers extend to the largest values within 3 times the inter-quartile range
- Values outside the whiskers are outliers
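A minimal sketch of the rule, assuming the whisker limits are measured from the first and third quartiles (Q1 - 3 × IQR and Q3 + 3 × IQR). The quartile method is the default of `statistics.quantiles`, and other conventions can shift the limits slightly.

```python
import statistics

def find_outliers(residuals, multiple=3.0):
    """Return the residuals that fall outside the whisker limits."""
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    iqr = q3 - q1
    low, high = q1 - multiple * iqr, q3 + multiple * iqr
    return [r for r in residuals if r < low or r > high]

residuals = [-1.2, -0.8, -0.5, -0.1, 0.0, 0.2, 0.4, 0.9, 1.1, 9.0]
outliers = find_outliers(residuals)
```

Here only 9.0 lies beyond the upper limit, so it is the single outlier flagged.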
41
Reviewing estimated unpaid model results
- Standard error should increase from the oldest to the most recent years
- Standard error for all years combined should be larger than for any individual year
- CoV should decrease from the oldest to the most recent years, due to independence in the incremental payment stream
- If not, it may be due to:
  - Increasing parameter uncertainty in the most recent years
  - The model overestimating uncertainty in recent years, in which case we may want to switch to BF/CC
- Min/max simulations should be reasonable
42
Methods for combining results of multiple models
Run models with the same random variables:
1) Simulate r.v. for each iteration
2) Use the same set of r.v. for each model
3) Use model weights to weight incremental values from each model for each iteration by accident year
Run models with independent random variables:
1) Run each model separately with different r.v.
2) Use weights to randomly select a model for each iteration by accident year so that the result is a weighted mixture of models
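The second approach (independent random variables) can be sketched as below. The nested-list layout `results[model][iteration][ay]`, the per-AY weight lists, and the function name are all assumptions made for illustration.

```python
import random

def mix_models(results, weights_by_ay, rng):
    """Pick one model per iteration and accident year, using that AY's weights.

    results[model][iteration][ay] holds each model's simulated value.
    weights_by_ay[ay] lists the weights in the same order as the models.
    """
    models = list(results)
    n_iter = len(results[models[0]])
    mixed = []
    for i in range(n_iter):
        row = []
        for ay, weights in enumerate(weights_by_ay):
            chosen = rng.choices(models, weights=weights)[0]
            row.append(results[chosen][i][ay])
        mixed.append(row)
    return mixed

results = {"paid": [[10, 20], [11, 21]], "incurred": [[100, 200], [101, 201]]}
mixed = mix_models(results, [[1, 0], [0, 1]], random.Random(0))
```

With weights of 100% paid for the first AY and 100% incurred for the second, every iteration takes the paid value for AY 1 and the incurred value for AY 2.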
43
Estimated Cash Flow results
Simulations of unpaid losses by calendar year have the following characteristics:
- S.E. of CY unpaid decreases as the CY moves further into the future
- CoV increases as the CY increases
44
Estimated Ult LR results
- Estimated ult loss ratios by AY are calculated using all simulated values, not just the future unpaid
- Represents the complete variability in LR for each AY
- LR distributions can be used for projecting pricing risk
45
Issues with correlation methods
- Both the location-mapping and re-sorting methods use residuals of incremental future losses to correlate segments
- Both tend to create overall correlations close to zero
- For reserve risk, the desired correlation is between the total unpaid amounts for two segments, so there may be a disconnect
46
Correlation between segments: Location Mapping
For each iteration, sample the residuals from the residual triangle using the same locations for all segments
Advantages:
- Method is easily implemented -> doesn’t require an estimate
Disadvantages:
- All segments need to have data triangles of the same size with no missing data
- Correlation of the original residuals is used, so we can’t test other correlation assumptions