Robertson - ELFs for HGs Flashcards
NCCI publishes Excess Loss Factors (ELFs) to assist with
pricing excess policies and some retrospectively rated policies for WC
NCCI calculates ELFs at the
hazard group and state level for each possible limit
ELFs are the same for every class in a given hazard group within a state for a given limit
hazard groups are defined on
CW basis since NCCI believes that mix of injuries within a class should not vary much by state
in 2007, NCCI implemented new hazard group definitions and moved from
4 to 7 hazard groups
end goal of new analysis was to produce
new hazard group definitions at CW level which could later be used to produce new ELFs for every limit, state, and hazard group combo
-also required deciding for which limits to calculate ELFs, deciding on # of hazard groups, and deciding which classes go into each hazard group
to group classes with similar vectors into potential hazard groups, can use
L2 or Euclidean distance
-could also use L1 distance (sum of absolute differences), which would have the advantage that it minimizes the relative error in estimating excess premium
relative error in estimating excess premium for class c with limit L is
PLR*|R_HG(L)-R_c(L)|
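A minimal sketch of the quantities on the cards above, assuming excess ratios are stored as plain lists with one entry per limit. The function names and all numbers are hypothetical and chosen only for illustration; the L1 distance (sum of absolute differences) is included for comparison with L2.

```python
import math

def l2_distance(x, y):
    """Euclidean (L2) distance between two excess-ratio vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def l1_distance(x, y):
    """L1 distance: sum of absolute differences across limits."""
    return sum(abs(a - b) for a, b in zip(x, y))

def excess_premium_relative_error(plr, r_hg, r_class):
    """PLR * |R_HG(L) - R_c(L)| for a single limit L."""
    return plr * abs(r_hg - r_class)

# Hypothetical excess ratios at three limits (not NCCI data).
r_class = [0.40, 0.25, 0.10]   # class c
r_hg = [0.43, 0.21, 0.10]      # hazard group containing class c
plr = 0.65                     # assumed permissible loss ratio

print(l2_distance(r_class, r_hg))
print(l1_distance(r_class, r_hg))
print(excess_premium_relative_error(plr, r_hg[0], r_class[0]))
```

L2 squares each difference, so a single large deviation at one limit weighs more heavily than it would under L1.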
NCCI performed a cluster analysis called ___ to group classes into hazard groups using premium weights
k-means algorithm
NCCI used cluster analysis to group classes with similar
vectors of excess ratios as measured by Euclidean distance into hazard groups
used __ clustering
non-hierarchical
which meant new hazard groups did not have to be subsets of existing groups and could instead represent best partition for given number of clusters
weighted k-means algorithm they used works as follows
- decide on # of clusters (potential hazard groups) k to target
- assign classes to k arbitrary groups
- calculate centroid of each group (essentially the premium-weighted average excess ratio vector of its classes)
- for each class, find closest centroid using L2 distance
- move each class to group with closest centroid
- if any classes moved, go back to the centroid calculation step and repeat until no class moves
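The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not NCCI's implementation: the round-robin starting assignment, the `max_iter` cap, the handling of empty groups, and the sample data are all assumptions.

```python
def weighted_centroid(vectors, weights):
    """Premium-weighted average of a list of excess-ratio vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[j] for v, w in zip(vectors, weights)) / total
            for j in range(dim)]

def squared_l2(x, y):
    """Squared Euclidean distance (same ordering as L2 distance)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def weighted_kmeans(vectors, weights, k, max_iter=100):
    n = len(vectors)
    # Arbitrary starting groups: round-robin assignment (an assumption).
    labels = [i % k for i in range(n)]
    centroids = [None] * k
    for _ in range(max_iter):
        # Recalculate the weighted centroid of each non-empty group.
        for g in range(k):
            members = [i for i in range(n) if labels[i] == g]
            centroids[g] = (
                weighted_centroid([vectors[i] for i in members],
                                  [weights[i] for i in members])
                if members else None  # empty groups cannot attract classes
            )
        live = [g for g in range(k) if centroids[g] is not None]
        # Move each class to the group with the closest centroid.
        new_labels = [min(live, key=lambda g: squared_l2(v, centroids[g]))
                      for v in vectors]
        if new_labels == labels:  # no class moved: done
            break
        labels = new_labels
    return labels, centroids

# Hypothetical classes: excess ratios at two limits, with premium weights.
vectors = [[0.10, 0.05], [0.12, 0.06], [0.40, 0.30], [0.42, 0.31]]
weights = [100, 300, 200, 200]
labels, centroids = weighted_kmeans(vectors, weights, k=2)
print(labels)
```

The result depends on the starting assignment, so in practice the algorithm is usually run from several starting points.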
k-means clustering is equivalent to
maximizing the R squared statistic in linear regression
it minimizes the within-cluster variance and maximizes the between-cluster variance
which means hazard groups will be homogeneous and well separated
can think of R squared formula as
Trace(B)/Trace(T)
or
1-Trace(W)/Trace(T)
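A small sketch of why the two forms agree, assuming Trace(T), Trace(W), and Trace(B) are the total, within-cluster, and between-cluster weighted sums of squares, so that Trace(T) = Trace(B) + Trace(W). The function name, the premium weighting, and the data are hypothetical.

```python
def trace_stats(vectors, weights, labels):
    """Return (Trace(B), Trace(W), Trace(T)) as weighted sums of squares."""
    dim = len(vectors[0])
    everyone = list(range(len(vectors)))

    def centroid(idx):
        tw = sum(weights[i] for i in idx)
        return [sum(weights[i] * vectors[i][j] for i in idx) / tw
                for j in range(dim)]

    def sq(x, y):
        return sum((a - b) ** 2 for a, b in zip(x, y))

    overall = centroid(everyone)
    # Total: every class measured against the overall centroid.
    trace_t = sum(weights[i] * sq(vectors[i], overall) for i in everyone)
    trace_w = 0.0
    trace_b = 0.0
    for g in set(labels):
        idx = [i for i in everyone if labels[i] == g]
        c = centroid(idx)
        # Within: each class against its own cluster centroid.
        trace_w += sum(weights[i] * sq(vectors[i], c) for i in idx)
        # Between: each cluster centroid against the overall centroid.
        trace_b += sum(weights[i] for i in idx) * sq(c, overall)
    return trace_b, trace_w, trace_t

# Hypothetical classes, weights, and cluster assignment.
vectors = [[0.10, 0.05], [0.12, 0.06], [0.40, 0.30], [0.42, 0.31]]
weights = [100, 300, 200, 200]
labels = [1, 1, 0, 0]

b, w, t = trace_stats(vectors, weights, labels)
print(b / t)       # Trace(B)/Trace(T)
print(1 - w / t)   # 1-Trace(W)/Trace(T), same value up to rounding
```

Tight, well-separated clusters push Trace(W) toward zero and R squared toward 1.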
2 statistics used to decide on #HGs
- Calinski and Harabasz statistic
- Cubic Clustering Criterion (CCC) statistic
Calinski and Harabasz statistic
-measures ratio of between-cluster variance to within-cluster variance
[Trace(B)/(k-1)]/[Trace(W)/(n-k)]
n=#classes, k=#HGs
- higher values of this statistic indicate a better choice for # of clusters
- test also known as Pseudo-F test since it resembles F-test of regression analysis
*higher statistic indicates better clusters with higher between-cluster variance and lower within-cluster variance
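A minimal sketch of the Calinski and Harabasz formula above, taking Trace(B) and Trace(W) as already computed. The function name and the trace values per k are made up for illustration and are not results from the NCCI analysis.

```python
def calinski_harabasz(trace_b, trace_w, n, k):
    """[Trace(B)/(k-1)] / [Trace(W)/(n-k)] with n classes and k hazard groups."""
    if k < 2 or n <= k:
        raise ValueError("need 2 <= k < n")
    return (trace_b / (k - 1)) / (trace_w / (n - k))

# Hypothetical (Trace(B), Trace(W)) pairs for several candidate k, n = 12 classes.
n = 12
candidates = {
    2: (6.0, 4.0),
    3: (9.0, 1.0),
    4: (9.2, 0.8),
}

scores = {k: calinski_harabasz(b, w, n, k) for k, (b, w) in candidates.items()}
best_k = max(scores, key=scores.get)
print(scores)
print(best_k)
```

Dividing by (k-1) and (n-k) keeps the statistic from rising automatically as k increases, which is what lets it be compared across different numbers of hazard groups.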