A5. NCCI's 2007 hazard group mapping Flashcards

1
Q

Why the NCCI moved from 17 to 5 limits

A
  1. ELFs at different pairs of limits were highly correlated across classes
  2. Limits below $100,000 were too heavily represented among the 17 limits
  3. The range of limits commonly used in retrospective rating is well represented by as few as 5 limits
  4. Using only 1 limit would not have been enough to capture the full variability in XS ratios
2
Q

Summarize the process used in the NCCI 2007 HG mapping study

A
  1. Developed vectors of XS ratios at 5 selected limits for each class
  2. Stabilized data using credibility weighting of vectors of XS ratios
  3. Grouped classes with similar vectors of XS ratios using weighted k-means
  4. Enhanced the groupings using PCA to identify outliers and validate cluster separation
  5. Determined optimal number of groups (7) using Calinski-Harabasz criterion
  6. Revised the 7 groupings based on input from an underwriter panel review
3
Q

step 1: What adjustments did Robertson make to the loss data before calculating the XS ratios?

A
  1. Trended/developed
    - because trend/development vary by layer
    - need to adjust for trend/development because XS ratios are used to price future policies
  2. On-level
    - need to adjust for future benefit levels
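The adjustments above amount to multiplying each loss by a chain of factors before excess ratios are computed. A minimal sketch, with hypothetical factor names and values for illustration (not from the study):

```python
def adjust_loss(loss: float,
                development_factor: float,
                trend_factor: float,
                benefit_on_level_factor: float) -> float:
    """Develop the loss to ultimate, trend it to the future policy period,
    and bring it to the future benefit level (illustrative order/names)."""
    return loss * development_factor * trend_factor * benefit_on_level_factor


# e.g. a $1,000 reported loss with hypothetical factors
adjusted = adjust_loss(1000.0, 1.2, 1.1, 1.05)
```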
4
Q

step 2: Credibility formula to weight the XS ratios with the current HG XS ratio

A

Credibility formula retained by NCCI:

  • Z = min(1.5 × n/(n+k), 1), where n is the class claim count and k is the average claim count over classes
  • Advantage: gives more weight to larger classes
  • Disadvantage: not appropriate for a highly skewed distribution

Other options considered:

  1. using the square-root rule, Z = sqrt(n/384), which gives a 95% chance that n is within 10% of its expected value
  2. using the median instead of the average for k
  3. excluding medical-only claims from the analysis
  4. including only serious claims in the analysis
  5. requiring a minimum n for the classes used in the calculation of k
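The retained formula, the square-root alternative, and the weighting step itself can be sketched as follows (a minimal sketch; per the card, `n` is the class claim count and `k` the average claim count over classes):

```python
def credibility(n: float, k: float) -> float:
    """Formula retained by NCCI: Z = min(1.5 * n / (n + k), 1)."""
    return min(1.5 * n / (n + k), 1.0)


def credibility_sqrt(n: float, full_cred_n: float = 384.0) -> float:
    """Square-root alternative considered: Z = min(sqrt(n / 384), 1)."""
    return min((n / full_cred_n) ** 0.5, 1.0)


def weighted_xs_ratio(class_xs: float, hg_xs: float, z: float) -> float:
    """Credibility-weight a class XS ratio against its current HG XS ratio."""
    return z * class_xs + (1.0 - z) * hg_xs
```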
5
Q

Purpose of standardization

WHAT DOES IT PREVENT?
WHEN APPROPRIATE?

A

Purpose:

  • used when vectors of variables have different units and spreads
  • this prevents a variable with large values from exerting undue influence on the results

Appropriate: if the spread of values is due to normal random variation

Not appropriate: if the spread of values is due to the presence of sub-classes

6
Q

Standardization of the XS ratios before applying clustering

A

Standardization retained by NCCI:
  - none

Other options considered:

  1. XS ratio* = (XS ratio - mean)/std dev
  2. XS ratio* = (XS ratio - min)/(max - min)

Reasons to retain no standardization:

  1. the results of the analysis were not sensitive to standardization
  2. loss of interpretability/common unit of measure
  3. loss of common range (XS ratio: between 0 and 1; XS ratio*: between -inf and +inf)
  4. loss of information at lower limits: XS ratios at lower limits are based on actual data and are more volatile, while XS ratios at higher limits are based on fitted distributions and are less volatile
    (XS ratio: more weight to lower limits and less to higher limits; XS ratio*: equal weight to all limits)
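The two rejected options can be sketched as column-wise transforms, where rows would be classes and columns the 5 limits (a pure-Python sketch, not NCCI code):

```python
def z_score(column):
    """Option 1: XS ratio* = (x - mean) / std dev (population std dev)."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((x - mean) ** 2 for x in column) / n) ** 0.5
    return [(x - mean) / sd for x in column]


def min_max(column):
    """Option 2: XS ratio* = (x - min) / (max - min), maps the column to [0, 1]."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]
```

Note how option 1 destroys the common [0, 1] range of excess ratios, one of the reasons listed for rejecting it.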

7
Q

Clustering algorithm

A

Definition:
  - method of grouping similar risks based on a target metric

Advantage (1):

  • minimizes variance within groups
  • maximizes variance between groups

Advantage (2):

  • allows the grouping of homogeneous risks
  • achieves greater credibility for each group
8
Q

Types of clustering algorithms

A
  1. Non-hierarchical clustering:
    - produces optimal clusters that best correlate with expected cost
    - seeks to create the best k clusters regardless of what the best k-1 clusters were
    - not subject to parent-child constraints
    - Disadvantage: the groups might be less intuitive => could result in neighboring groups having significantly different rates
  2. Hierarchical clustering:
    - any additional cluster is a subset/nested version of an existing cluster
9
Q

step 3: Measure to calculate distance between vectors of XS ratios in the clustering algorithms

A

(1) L2/Euclidean distance
  - penalizes large deviations between the class and HG excess ratios
  - gives undue weight to outliers, a concern when the distribution is skewed

(2) L1 distance
  - Advantages:
    - does not penalize one large deviation more than the sum of many small deviations
    - minimizes the relative error in estimating the excess premium between the HG and the class
    - the unit of L1 is the same as expected loss costs ($), while the unit of L2 is $^2
  - Disadvantage: more difficult to solve

Reasons to retain the L2 distance:

  • the results of the analysis were not sensitive to the choice of L2 vs L1
  • therefore they retained the easiest one to solve
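The two distance measures compared above, as a minimal sketch over vectors of XS ratios:

```python
def l1_distance(u, v):
    """L1: sum of absolute differences; one large deviation counts no more
    than the sum of many small ones."""
    return sum(abs(a - b) for a, b in zip(u, v))


def l2_distance(u, v):
    """L2/Euclidean: squaring penalizes large deviations, so outliers
    get extra weight."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
```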
10
Q

Advantages of k-means (optimality properties)

A

Equivalent to maximizing the R² from linear regression: maximizes the % of total variation explained by the hazard groups

  • minimizes the variance within hazard groups => homogeneous HG
  • maximizes the variance between hazard groups => well separated HG
11
Q

step 4: Advantages of PCA

A

PCA identifies the variables most predictive of the outcome:

  • allows eliminating other, correlated variables to simplify the model
  • allows identifying outliers and validating cluster separation visually
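A generic PCA projection (not NCCI's implementation) can be sketched with numpy, e.g. to plot the class XS-ratio vectors in two dimensions and inspect outliers and cluster separation:

```python
import numpy as np


def pca_project(X: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Center X (rows = classes, columns = variables) and project it onto
    its leading principal components via the SVD."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T
```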
12
Q

step 5: Test statistics used to decide on the optimal number k of hazard groups

  • HOW CCC WORKS
  • WHY NOT RETAINED
A

(1) Cubic clustering criterion (CCC)
  - compares the amount of variance explained by a given set of clusters to that expected when clusters are formed at random from a multidimensional uniform distribution
  - higher is better
  - indicated 9 hazard groups

Reasons to retain (2) Calinski-Harabasz:

  1. Calinski-Harabasz outperforms CCC according to the Milligan & Cooper paper
  2. CCC is less reliable when there is correlation between the variables, which is the case for XS ratios
  3. CCC also indicated 7 hazard groups once the small classes were excluded from the analysis
  4. the 9 hazard groups indicated by CCC had some crossover between the XS ratios, which is not appealing in practice
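The Calinski-Harabasz statistic itself is the ratio of between-group to within-group variance, adjusted for degrees of freedom: higher values mean tighter, better-separated groups. A generic pure-Python sketch (not NCCI's implementation):

```python
def calinski_harabasz(points, labels):
    """CH = (B / (k - 1)) / (W / (n - k)), where B is the between-group and
    W the within-group sum of squared distances."""
    n, dim = len(points), len(points[0])
    groups = set(labels)
    k = len(groups)
    grand = [sum(p[d] for p in points) / n for d in range(dim)]
    within = between = 0.0
    for g in groups:
        members = [p for p, lab in zip(points, labels) if lab == g]
        cen = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        within += sum(sum((p[d] - cen[d]) ** 2 for d in range(dim))
                      for p in members)
        between += len(members) * sum((cen[d] - grand[d]) ** 2
                                      for d in range(dim))
    return (between / (k - 1)) / (within / (n - k))
```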
13
Q

step 6: Underwriter considerations

A
  1. similarity between classes that were in different hazard groups => makes sense to assign them to the same hazard group
  2. degree of exposure to accidents in a given class => makes sense to move to a higher hazard group if there is a high degree of exposure, even if not reflected in past experience
  3. extent to which heavy machinery is used in a given class => makes sense to move to a higher hazard group if there is a lot of heavy machinery, even if not reflected in past experience
14
Q

Why NCCI defined hazard groups on a country-wide basis, and not by state

A
  • a HG is a collection of classes that have similar ELFs over a wide range of limits
  • in NCCI's view, classes are based on homogeneous operations, which do not vary by state
  • therefore the mix of injuries within a class should not vary between states

Within a state, ELFs are the same for every class in a given hazard group at a given limit

15
Q

Why the new hazard groups were superior to the prior hazard groups

A
  • The excess ratios for the new hazard groups were well separated compared to the prior hazard groups, meaning there was less overlap in the excess ratios between classes in different hazard groups.
  • A more sophisticated method was used to determine the optimal number of hazard groups: the NCCI used the Calinski-Harabasz and CCC statistics to decide on 7 hazard groups.
  • A more sophisticated method, the k-means algorithm, was used to group classes into hazard groups.
  • The new hazard groups had a more even concentration of classes and premium, while previously classes were primarily concentrated in 2 of the 4 hazard groups.
  • The prior analysis used proxy variables to measure excess loss potential, while the new analysis used excess ratios directly.
  • The new analysis used excess ratios by injury type, while the prior analysis treated all injury types the same.

16
Q

K-means algorithm

A
  1. Assign classes to k arbitrary groups
  2. Calculate the centroid of the XS-ratio vectors of each group
  3. Compute the distance from each class's XS-ratio vector to each centroid
  4. Move each class to the group with the closest centroid
  5. If any classes moved, go back to step 2 and repeat
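The loop above can be sketched in Python (a minimal sketch; the arbitrary initial assignment here is round-robin, purely for illustration, and is not NCCI's seeding):

```python
def k_means(points, k, max_iter=100):
    def sq_dist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    labels = [i % k for i in range(len(points))]       # step 1: arbitrary groups
    centroids = [list(p) for p in points[:k]]
    for _ in range(max_iter):
        # step 2: recompute the centroid of each group
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(x) / len(members) for x in zip(*members)]
        # steps 3-4: move each class to the group with the closest centroid
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:                       # step 5: stop when stable
            break
        labels = new_labels
    return labels, centroids
```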