Shapland Flashcards

1
Q

What is the purpose of the GLM model?

A

To model the incremental losses q(w,d)

2
Q

How should negative incremental values in the triangle be handled?

A
  1. If the sum of a column is negative:
    • subtract the largest negative value from every cell in the triangle (this raises every cell by its absolute value)
    • the largest negative cell thereby becomes 0
    • solve the GLM using the modified triangle
    • add the largest negative value back to the fitted incremental values
  2. If an individual value is negative but the column sum is positive: see picture,
    then solve the GLM
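
The shift in case 1 can be sketched as below. This is a minimal illustration, not the monograph's own code: the function names, the NaN convention for unobserved cells, and the sample triangle are all assumptions, and the GLM fit itself is left out.

```python
import numpy as np

def shift_triangle(incremental):
    """Shift every observed cell so the largest negative value becomes zero.

    `incremental` is a square array with NaN in unobserved (future) cells.
    Returns the shifted triangle and psi, the largest negative value.
    """
    psi = np.nanmin(incremental)          # largest negative value, e.g. -30
    if psi >= 0:
        return incremental.copy(), 0.0    # nothing to adjust
    return incremental - psi, psi         # subtracting a negative raises every cell

def unshift_fitted(fitted_shifted, psi):
    """Add the largest negative value back to the fitted incremental values."""
    return fitted_shifted + psi

# Hypothetical incremental triangle whose last column sums to a negative number.
tri = np.array([[100.0, 50.0, -30.0],
                [120.0, 60.0, np.nan],
                [110.0, np.nan, np.nan]])
shifted, psi = shift_triangle(tri)
```

In use, the GLM would be fitted to `shifted` and `unshift_fitted` applied to its fitted incrementals.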
3
Q

How is the standardized residual calculated?

A

Residual divided by its standard deviation (a.k.a. standard error)

4
Q

Model result reasonability checks for standard error

A

Standard Error
- S.E. should increase from older to more recent years (because s.e. follows the magnitude of the results)
- Total s.e. should be larger than any individual year's s.e.
- Total s.e. should be less than the sum of s.e. across all AYs (since the model assumes independence between AYs)
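
A small worked check of these three rules, assuming independence between accident years so that variances add. The standard errors below are made-up numbers for illustration only.

```python
import math

# Hypothetical standard errors of unpaid claims by accident year (oldest first).
se_by_ay = [10.0, 25.0, 60.0, 140.0]

# Under independence between accident years, variances add.
total_se = math.sqrt(sum(se ** 2 for se in se_by_ay))

checks = {
    "increasing_by_ay": all(a < b for a, b in zip(se_by_ay, se_by_ay[1:])),
    "total_above_any_year": total_se > max(se_by_ay),
    "total_below_sum": total_se < sum(se_by_ay),
}
```

The total is the square root of the summed squares, which always lies between the largest single-year s.e. and the plain sum.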

5
Q

Model result reasonability checks for CoV

A

CoV
- CoV should GENERALLY decrease when moving from the oldest to the most recent years
- CoV may rise in the most recent years because
  1. The increasing number of parameters used brings higher parameter uncertainty to the more recent years
  2. The model may be overestimating the variability in the most recent years
- Total CoV should be smaller than any individual year’s CoV
- Total CoV should be less than the sum of CoV across all AYs

6
Q

Why is homoscedasticity needed for bootstrapping?

A

The bootstrap assumes the residuals are independent and identically distributed. Heteroscedasticity violates this assumption because different residuals have different variances.

7
Q

What are the solutions for heteroscedasticity?

A
  1. Stratified sampling
  2. Calculating variance parameters
  3. Calculating scale parameters
8
Q

How to handle heteroscedasticity with stratified sampling

A

Group development periods with homogeneous variances
Sample with replacement only from within each group
Disadvantage - some groups may have few residuals, which reduces credibility

9
Q

How to handle heteroscedasticity by calculating variance parameters

A

Group development periods with homogeneous variances
Calculate the standard deviation of the standardized residuals in each of the “hetero” groups
Calculate the hetero-adjustment factor h_i = (standard deviation of all standardized residuals combined) / (standard deviation of the standardized residuals in group i)
Multiply all residuals in group i by h_i
Sample with replacement from the ENTIRE triangle, then divide each resampled residual by the h_i of the cell it is placed in
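
The steps above can be sketched as follows. This is an illustration under assumptions: the residuals and group labels are invented, the residuals are held as a flat array of observed cells, and the sample standard deviation (`ddof=1`) is an arbitrary choice the card does not specify.

```python
import numpy as np

def hetero_factors(std_resid, groups):
    """Return {group: h_i}, with h_i = stdev(all residuals) / stdev(group i residuals)."""
    overall_sd = np.std(std_resid, ddof=1)
    return {g: overall_sd / np.std(std_resid[groups == g], ddof=1)
            for g in np.unique(groups)}

# Hypothetical standardized residuals and their hetero group labels.
std_resid = np.array([-2.0, 1.5, 0.5, -0.2, 0.1, 0.3, -0.4, 0.2])
groups = np.array([1, 1, 1, 2, 2, 2, 2, 2])

h = hetero_factors(std_resid, groups)
cell_h = np.array([h[g] for g in groups])   # factor belonging to each cell
adjusted = std_resid * cell_h               # residuals on a common variance scale

rng = np.random.default_rng(0)
sampled = rng.choice(adjusted, size=adjusted.size, replace=True)  # entire triangle
resampled = sampled / cell_h                # back to each cell's own variance level
```

After the multiplication every group has the same standard deviation as the combined set, which is what allows sampling across the whole triangle.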

10
Q

How to handle heteroscedasticity with scale parameters

A

Hetero-adjustment factors are based on different scale parameters:
use the ratio of SQRT(overall scale parameter) to SQRT(scale parameter for hetero group i).

N = total number of cells in the triangle
p = number of parameters = alpha (AYs) + beta (development periods, usually alpha - 1) + number of hetero-adjustment factors
number of hetero-adjustment factors = number of hetero groups - 1
n_i = number of cells in group i

r(w,d) here is the unscaled Pearson residual, not the standardized residual

The hetero-adjustment factors are then used in the same way as the factors based on the s.e. of the residuals
see below for the scale parameter for hetero group i
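
The formula image from the original card is not preserved. A reconstruction consistent with the definitions above (N, p, n_i, and the unscaled Pearson residual r), assuming the usual form from the Shapland monograph, would be:

```latex
\phi = \frac{\sum_{w,d} r_{w,d}^{2}}{N - p}, \qquad
\phi_i = \frac{\dfrac{N}{N - p} \sum_{(w,d) \in \text{group } i} r_{w,d}^{2}}{n_i}, \qquad
h_i = \frac{\sqrt{\phi}}{\sqrt{\phi_i}}
```

Here \phi is the overall scale parameter, \phi_i the scale parameter for hetero group i, and h_i the resulting hetero-adjustment factor.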

11
Q

How to handle exposure changes in the triangle

A

Divide all loss data by the exposure for each AY to get pure premiums
Run the model on the pure premiums
Multiply the results back by exposure

12
Q

How to handle heteroecthesious data in the triangle (partial first development period data)

A

Partial first development period data -
reduce the future incremental losses for the latest AY to correspond to the earned exposure
-> then simulate process variance

13
Q

How to handle heteroecthesious data (partial last calendar period data)

A

Partial last calendar period data -

 Annualize the last diagonal so that it is in line with the rest of the triangle
 Calculate the fitted triangle and residuals
 During the ODP bootstrap simulation, calculate and interpolate LDFs from the fully annualized sample triangles
 De-annualize the last diagonal
 Project future values by multiplying the interpolated LDFs with the new cumulative values
 Reduce future incremental values for the latest AY to remove future exposure
14
Q

Formula for the unscaled Pearson residual

A
15
Q

General ODP Bootstrap

A
  • calculate the age-to-age factors
  • calculate the fitted cumulative losses by starting with the latest diagonal and working backwards through the triangle (dividing by the age-to-age factors)
  • calculate the fitted incremental losses
  • calculate the actual incremental losses
  • calculate the Pearson residuals
  • calculate the hat matrix adjustment factors
  • calculate the standardized residuals
  • randomly sample from the standardized residuals with replacement
  • convert the random standardized residuals into sample incremental losses
  • use the sample incremental losses to create a triangle of sample cumulative losses
  • project the sample cumulative losses to ultimate
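
A compact sketch of one iteration of these steps follows. It is a simplification under stated assumptions: the hat matrix adjustment and process variance are skipped (the unscaled Pearson residuals with z = 1 are resampled directly), unobserved cells are NaN, and the function names and the 3x3 triangle are hypothetical.

```python
import numpy as np

def ata_factors(cum):
    """Volume-weighted age-to-age factors from a cumulative triangle (NaN = unobserved)."""
    n = cum.shape[0]
    f = np.ones(n - 1)
    for d in range(n - 1):
        rows = ~np.isnan(cum[:, d + 1])
        f[d] = cum[rows, d + 1].sum() / cum[rows, d].sum()
    return f

def fitted_incremental(cum, f):
    """Back-fit cumulative values from the latest diagonal, then difference them."""
    n = cum.shape[0]
    fit = np.full_like(cum, np.nan)
    for w in range(n):
        last = n - 1 - w                       # latest observed column for row w
        fit[w, last] = cum[w, last]
        for d in range(last - 1, -1, -1):
            fit[w, d] = fit[w, d + 1] / f[d]
    return np.diff(fit, axis=1, prepend=0.0)

def one_iteration(cum, rng):
    """Return simulated unpaid claims by accident year for one bootstrap sample."""
    n = cum.shape[0]
    m = fitted_incremental(cum, ata_factors(cum))   # fitted incrementals
    q = np.diff(cum, axis=1, prepend=0.0)           # actual incrementals
    obs = ~np.isnan(cum)
    root_m = np.sqrt(np.abs(m[obs]))
    resid = (q[obs] - m[obs]) / root_m              # unscaled Pearson residuals, z = 1
    r_star = rng.choice(resid, size=obs.sum(), replace=True)
    q_star = np.full_like(cum, np.nan)
    q_star[obs] = m[obs] + r_star * root_m          # sample incrementals
    cum_star = np.cumsum(q_star, axis=1)            # sample cumulative triangle
    f_star = ata_factors(cum_star)
    latest = np.array([cum_star[w, n - 1 - w] for w in range(n)])
    ult = np.array([latest[w] * np.prod(f_star[n - 1 - w:]) for w in range(n)])
    return ult - latest

# Hypothetical cumulative triangle.
cum = np.array([[100.0, 150.0, 165.0],
                [110.0, 170.0, np.nan],
                [120.0, np.nan, np.nan]])
unpaid = one_iteration(cum, np.random.default_rng(0))
```

Repeating `one_iteration` many times would give a distribution of unpaid claims by accident year; a fuller version would scale the residuals by the hat matrix factors before sampling and add process variance to the projected cells.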
16
Q

GLM model setup

A
  • set up below graph
  • once set up, fit the model to incr loss triangle using iterated least squares or maximum likelihood
17
Q

Formula for unscaled Pearson residuals

A

Note that z = 1 for the Poisson, which is the case most of the time
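
The formula image from the original card is not preserved. Assuming the usual form, with q(w,d) the actual incremental loss, m(w,d) the fitted incremental loss, and z the variance power, it reads:

```latex
r_{w,d} = \frac{q(w,d) - m_{w,d}}{\sqrt{m_{w,d}^{\,z}}}
```

With z = 1 the denominator is simply the square root of the fitted incremental value.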

18
Q

Formula for standardized residuals

A
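
The formula image from the original card is not preserved. Assuming the hat matrix adjustment described in the general ODP bootstrap steps, a likely form is:

```latex
r^{H}_{w,d} = r_{w,d} \times f^{H}_{w,d}, \qquad
f^{H}_{w,d} = \sqrt{\frac{1}{1 - H_{w,d}}}
```

Here r is the unscaled Pearson residual and H_{w,d} is taken to be the diagonal element of the hat matrix for cell (w,d).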
19
Q

The power z in the estimated variance for each distribution

A

Poisson z = 1
Gamma z = 2
Inverse Gaussian z = 3

20
Q

Ways to handle Outliers

A
  • if extreme values exist in the original triangle, we can remove the impact from the model
  • if using the ODP bootstrap model, use below
    • exclude the outliers completely, treat as missing value
    • exclude the outliers from ATA factors and residual calculations, but include the outlier cells during the resample triangle projection process
21
Q

Three options to remove outliers when calculating ATA factors

A
  • exclude the row only if the outlier is in the numerator
  • exclude the row only if the outlier is in the denominator
  • exclude the row if outlier is in either the numerator or the denominator
22
Q

What do we do if there is a significant number of outliers?

A

This may indicate a poor model fit
For the GLM bootstrap, choose new parameters or change the distribution
For the ODP bootstrap, an L-year weighted average can be used to provide a better model fit, but if the skewness is real, the bootstrap will keep it

23
Q

How to handle missing values?

A

GLM Bootstrap Model -
Missing data simply reduces the number of observations in the data

ODP Bootstrap Model -
estimate from surrounding values
or, modify LDFs to exclude missing values

Solution 1: estimate missing values from surrounding values
Solution 2: modify LDFs to exclude the missing value, no residual for missing value -> don’t resample from the missing value
Solution 3: if missing value is on the latest diagonal, estimate value/ use value in the 2nd to last diagonal
24
Q

Negative values during simulation of process variance (i.e. the mean m(w,d) is negative): how to handle?

A

Option 1:
- change the sign of the simulated value
Option 2:
- shift the entire distribution to have a mean of m(w,d)
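
Both options can be sketched with a gamma distribution, which is an assumption here (the card does not name the process distribution). The function name, the scale parameter `phi`, and the variance form phi * |m| are likewise illustrative.

```python
import numpy as np

def simulate_cell(m, phi, rng, option=1):
    """Simulate one incremental value with mean m and variance phi * |m|."""
    if m == 0:
        return 0.0
    # Gamma with mean |m| and variance phi * |m|: shape = |m| / phi, scale = phi.
    draw = rng.gamma(abs(m) / phi, phi)
    if m > 0:
        return draw
    if option == 1:
        return -draw           # Option 1: flip the sign; the skew flips to the left
    return draw + 2 * m        # Option 2: shift to a mean of m; the right skew is kept

rng = np.random.default_rng(1)
example = simulate_cell(-50.0, 2.0, rng, option=2)
```

Option 1 produces only non-positive values, while Option 2 keeps the shape of the positive gamma and can produce values on either side of zero.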

25
Q

Advantages of bootstrap model

A

Generates a distribution of possible outcomes as opposed to a single point estimate
- provides more information on potential results; can be used for capital modeling

Can be tailored to the statistical features of the data under analysis

Preserves the original data distribution (skewness etc.)

26
Q

Reasons for more focus by actuaries on unpaid claims distributions

A

SEC is looking for more reserving risk information from publicly traded companies

Major rating agencies have dynamic risk models for rating and welcome input from company actuaries about reserve distributions

Companies use dynamic risk models for internal risk management and need unpaid claim distributions

27
Q

ODP Model overview

A

Incremental claims q(w,d) are modeled directly using a GLM

GLM structure:
Log link
Over-dispersed Poisson error distribution

Steps
1) Use the model to estimate parameters
2) Use bootstrapping (sampling residuals with replacement) to estimate the total distribution

28
Q

Simplified GLM model

A

Fitted (expected) incrementals using a Poisson error distribution are the same as incremental losses using volume-weighted average LDF

Simplified GLM Method

1) use cumulative claim triangle to calculate LDF
2) Develop losses to ultimate
3) Calculate the expected cumulative triangle
4) Calculate the expected incremental triangle from the cumulative triangle

29
Q

Bootstrapping BF and Cape Cod Models

A

With ODP bootstrap model, iterations for the latest few accident years can result in more variance than expected

BF Method
Incorporate BF model by using a priori loss ratios for each AY with standard deviations for each loss ratio and an assumed distribution

During simulation, for each iteration simulate a new a priori loss ratio

Cape Cod Method
Apply the Cape Cod algorithm to each iteration of the bootstrap model

30
Q

Generalizing the ODP Model: pros/cons of using fewer parameters

A

Pros
Help avoid potential over-parametrizing the model
Allows the ability to add parameters for calendar-year trends
Can be used to model data shapes other than data in triangle form

Cons
GLM must be solved for each iteration of the bootstrap model, slowing simulations
The model is no longer directly explainable to others using age-to-age factors

31
Q

Options to address negative incremental values during simulation

A

1) Remove extreme iterations from results
2) Recalibrate the model after identifying the sources of negative incrementals
Eg remove a row with sparse data when the product was first written
3) Limit incremental losses to zero
Replace negative incrementals in original data with zero

32
Q

How to treat a non-zero sum of residuals

A

Residuals calculated in a bootstrap model are just error terms, so they should be identically distributed with a mean of zero (although this is usually not the case)

A non-zero average does not mean the residuals are incompatible with the true distribution

But to make the residuals average to zero, a single constant can be added to all residuals

33
Q

Using N year wtd average

A

With GLM:
Exclude the first few diagonals and use only the latest N+1 diagonals to parameterize the model
Run bootstrap simulations and only sample residuals for the trapezoid that’s used to parameterize the model

With simplified GLM
Calculate N year avg LDFs
Run bootstrap simulation, sampling residuals for the entire triangle in order to calculate cumulative values
Use N year avg factors to project future expected values for each iteration

34
Q

Pros and Cons of adjusting heteroscedasticity using hetero-adjustment factors

A

Pro: can resample with replacement from entire triangle now

Cons: adds parameters, affecting degrees of freedom and scale parameter

35
Q

Exposure adjustments

A

Issue: Exposures changed significantly over the years (ie rapidly growing line or line in runoff)

Adjustment

If earned exposures exist, divide all claims data by exposures for each accident year to run the model with pure premiums
After the process variance step, multiply back by AY exposures to get total claims
36
Q

Parametric Bootstrapping

A

Purpose:
Parametric bootstrapping is a way to overcome a lack of extreme residuals in an ODP bootstrap model

Steps
Fit a parameterized distribution to the residuals
Resample residuals from the distribution instead of observed residuals
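
A minimal sketch of the two steps, assuming a normal distribution is the fitted family (the card does not say which distribution to use). The residual values and the function name are hypothetical.

```python
import numpy as np

def parametric_residual_sampler(residuals, rng):
    """Fit a normal distribution to the residuals and return a sampler for it."""
    mu = float(np.mean(residuals))
    sigma = float(np.std(residuals, ddof=1))

    def sample(size):
        # Draw from the fitted distribution instead of the observed residuals.
        return rng.normal(mu, sigma, size=size)

    return sample, mu, sigma

# Hypothetical observed residuals.
observed = np.array([-1.2, -0.4, 0.1, 0.3, 0.9, -0.6, 0.5, 0.4])
rng = np.random.default_rng(42)
sample, mu, sigma = parametric_residual_sampler(observed, rng)
draws = sample(10_000)
```

Because the draws come from the fitted distribution, they can fall outside the range of the observed residuals, which is the point of the method.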

37
Q

Purposes of Bootstrap Diagnostics

A

Test the assumptions in the model
Gauge the quality of the model fit to the data
Help guide adjustments of the model parameters to improve the fit of the model

Purpose:

Find a set of models and parameters that results in the most realistic and most consistent simulations based on the statistical features of the data

38
Q

Residual graphs examples that help testing the assumption of IID

A

Residuals v Development Period -> look for heteroscedasticity
Residuals v Accident Period
Residuals v Payment Period
Residuals v Predicted

39
Q

AIC and BIC formulas

A

Smaller values indicate better fit
More params penalized
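
The formulas from the original card are not preserved. The standard general definitions, which may differ in form from the residual-based approximations used on the card, are:

```latex
\mathrm{AIC} = 2p - 2\ln \hat{L}, \qquad
\mathrm{BIC} = p \ln n - 2\ln \hat{L}
```

Here p is the number of parameters, n the number of observations, and \hat{L} the maximized likelihood. Both penalize p, and BIC penalizes it more heavily once ln n exceeds 2.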

40
Q

How to identify Outliers

A

Box-whisker plot
Whiskers extend to the largest values within 3 times the inter-quartile range
Values outside the whiskers are outliers
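
A sketch of this rule, assuming the 3 x IQR distance is measured from the quartile edges of the box (the card does not say where it is measured from). The residual values and the function name are hypothetical.

```python
import numpy as np

def flag_outliers(residuals, multiple=3.0):
    """Return a boolean mask of residuals lying outside the whiskers."""
    q1, q3 = np.percentile(residuals, [25, 75])
    iqr = q3 - q1
    lower = q1 - multiple * iqr     # lowest value a whisker may reach
    upper = q3 + multiple * iqr     # highest value a whisker may reach
    return (residuals < lower) | (residuals > upper)

# Hypothetical residuals with one extreme value.
resid = np.array([-0.8, -0.3, -0.1, 0.0, 0.2, 0.4, 0.7, 6.0])
mask = flag_outliers(resid)
```

Only the value 6.0 falls outside the whiskers in this example.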

41
Q

Reviewing estimated unpaid model results

A

Standard error should increase from the oldest to most recent years
Standard error for all years should be larger than any individual year

CoV should decrease from oldest to most recent years due to independence in incremental payment stream
If not, it may be due to
- Increasing parameter uncertainty in most recent years
- Model may overestimate uncertainty in recent years, we may want to switch to BF/CC

Min/Max simulations should be reasonable

42
Q

Methods for combining results of multiple models

A

Run models with the same random variables:
1) Simulate r.v. for each iteration
2) Use same set of r.v. for each model
3) Use model weights to weight incremental values from each model for each iteration by accident year

Run models with independent random variables
1) Run each model separately w different r.v.
2) Use weights to randomly select a model for each iteration by accident year so that the result is a weighted mixture of models
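
The second approach (independent random variables) can be sketched as below. The array layout (iterations by accident years), the weight layout (accident years by models), and the constant model outputs are assumptions chosen to keep the example short.

```python
import numpy as np

def mix_models(results, weights, rng):
    """Pick one model per iteration and accident year using the given weights.

    results: list of arrays shaped (iterations, accident_years), one per model.
    weights: array shaped (accident_years, models); each row sums to 1.
    """
    stacked = np.stack(results)                 # (models, iterations, AYs)
    n_models, n_iter, n_ay = stacked.shape
    weights = np.asarray(weights, dtype=float)
    mixed = np.empty((n_iter, n_ay))
    for ay in range(n_ay):
        pick = rng.choice(n_models, size=n_iter, p=weights[ay])
        mixed[:, ay] = stacked[pick, np.arange(n_iter), ay]
    return mixed

# Hypothetical outputs of two models run separately.
model_a = np.full((1000, 3), 100.0)
model_b = np.full((1000, 3), 200.0)
weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
mixed = mix_models([model_a, model_b], weights, np.random.default_rng(7))
```

Each column of `mixed` is then a weighted mixture of the two models' simulated values for that accident year.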

43
Q

Estimated Cash Flow results

A

Simulation of unpaid losses by calendar year have the following characteristics:

S.E. of CY unpaid decreases as the CY moves further into the future
CoV increases as the CY increases

44
Q

Estimated Ult LR results

A

Estimated ult loss ratios by AY are calculated using all simulated values, not just the future unpaid
Represents the complete variability in LR for each AY
LR distributions can be used for projecting pricing risk

45
Q

Issues with correlation methods

A

Both location-mapping and re-sorting methods use residuals of incremental future losses to correlate segments
Both tend to create overall correlations of close to zero

For reserve risk, the correlation that is desired is between total unpaid amounts for two segments so there may be a disconnect

46
Q

Correlation between segments: Location Mapping

A

For each iteration, sample the residuals from the residual triangle using the same locations for all segments

Advantages:
The method is easily implemented and does not require an estimated correlation matrix

Disadvantages:
All segments need to have the same size data triangles with no missing data
Correlation of original residuals is used, so we can’t test other correlation assumptions
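
A minimal sketch of location mapping, assuming each segment's residual triangle has been flattened to a one-dimensional array of observed cells in the same cell order. The function name and the residual values are hypothetical.

```python
import numpy as np

def location_mapped_samples(residuals_by_segment, rng):
    """Resample every segment's residuals at the same randomly chosen locations."""
    n = len(residuals_by_segment[0])
    if any(len(r) != n for r in residuals_by_segment):
        raise ValueError("all segments need residual triangles of the same size")
    locations = rng.integers(0, n, size=n)      # one shared set of cell locations
    samples = [np.asarray(r)[locations] for r in residuals_by_segment]
    return samples, locations

# Hypothetical residuals for two segments with identical triangle shapes.
seg_a = np.array([-1.0, 0.5, 0.2, -0.3, 0.8, -0.2])
seg_b = 2.0 * seg_a
(sample_a, sample_b), locations = location_mapped_samples(
    [seg_a, seg_b], np.random.default_rng(3))
```

Because both segments are sampled at the same locations, whatever correlation exists between the original residuals carries into every iteration.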