Section A - Class Ratemaking Flashcards

1
Q

AAA #1 Question: What basic principles should be present in a risk classification?

A

• Reflects expected cost differences – rates should differ among classes and other rating factors according to expected costs
• Distinguishes among risks on the basis of relevant cost-related factors – factors must relate to losses
• Applied objectively – clear, understandable rules
• Practical and cost-effective – cannot be too costly or too difficult to use
• Acceptable to the public – the public must feel it is fair

2
Q

AAA #2 Question: Compare a Government insurance program to a private insurance program.

A

Similarities:
  • Both pool risks; pools should be large enough to guarantee reasonable predictability of total losses.

Differences:
  • Government programs are provided by law; private programs are provided by contract.
  • Government programs are usually compulsory; private programs are voluntary.
  • Government programs do not need to be self-supporting; private programs must support themselves.

3
Q

AAA *** List the Three Primary Purposes of Risk Classification

A

AAA – Risk Classification Ratemaking

1. PROTECT THE INSURANCE SYSTEM'S FINANCIAL SOUNDNESS
   Risk classification is the primary means to control adverse selection
2. BE FAIR
   Risk classification should produce prices reflective of expected costs
3. Permit ECONOMIC INCENTIVES to operate and thus ENCOURAGE widespread COVERAGE AVAILABILITY
   A proper class system will allow an insurer to write and better serve both higher and lower cost risks

4
Q

AAA *** List the “Program Design Elements” (3) and how they relate to risk classification

A

[PEC]

  1. DEGREE OF BUYER CHOICE:
     Compulsory programs ~ broad classification; voluntary programs ~ refined classification
  2. EXPERIENCE BASED PRICING: To the extent prices are adjusted based on a risk's emerging actual experience, less refined initial risk classification is needed
  3. PREMIUM PAYER: If premium is paid by someone other than the insured, then broad class systems may be appropriate since adverse selection is less likely
5
Q

AAA ** Four Differences Between: Public vs. Private Insurance Programs

A
PUBLIC:
  1. Usually DEFINED BY LAW
  2. COMPULSORY
  3. Little role for COMPETITION
  4. Little RELATION between long-term BENEFITS and COSTS
  Typically insures 'uninsurable' risks (e.g. flood)

PRIVATE:
  1. DEFINED BY CONTRACT
  2. Mostly VOLUNTARY
  3. Rely on COMPETITION
  4. RELATE COSTS to BENEFITS
6
Q

AAA ** Operational Considerations in Classification Ratemaking (7)

A

MAMA ACE

  1. EXPENSE – cost of collecting info and pricing classes should not exceed benefits achieved
  2. CONSTANCY – risk characteristics should be constant over time
  3. MEASURABILITY – susceptible to convenient, reliable measure
  4. MANIPULATION – minimize manipulation
  5. AVOID EXTREME DISCONTINUITIES – especially at end points
  6. ABSENCE OF AMBIGUITY – exhaustive, mutually exclusive classes that are clear and objective
  7. AVAILABILITY OF COVG – properly classifying risks will maximize availability
7
Q

AAA *** Considerations in Designing a Risk Classification System (9)

A
  1. UNDERWRITING: Controls the practical impact of the class plan
  2. MARKETING: Influences the insurer's mix of business
  3. PROGRAM DESIGN: Buyer degree of choice; experience based pricing; premium payer
  4. STATISTICAL: Homogeneity; credibility; predictive stability
  5. OPERATIONAL: Expense; constancy; measurable; maximize availability; absence of ambiguity; minimize manipulation; avoid extreme discontinuity
  6. HAZARD REDUCTION: Incentives to reduce hazard
  7. PUBLIC ACCEPTABILITY: Use relevant data; respect privacy; not unfairly differentiate among risks
  8. CAUSALITY: Demonstrable relation desirable, but not always possible
  9. CONTROLLABILITY: Helps with hazard avoidance & public acceptance
8
Q

AAA *** Five Basic Principles of a Sound Risk Classification System

A
  1. Reflect EXPECTED COST DIFFERENCES
  2. DISTINGUISH among risks on the basis of COST RELATED FACTORS
  3. APPLIED OBJECTIVELY
  4. Practical and COST EFFECTIVE
  5. ACCEPTABLE TO THE PUBLIC
9
Q

AAA * 3 Mechanisms for Coping with Risk

A
  1. HAZARD AVOIDANCE and reduction: not all risks can be avoided
  2. TRANSFER: gov’t assistance, self insurance groups, private insurance
  3. PUBLIC and PRIVATE Insurance Programs
10
Q

AAA * 3 Means of Establishing a Fair Price

A

FAIR PRICING METHODS:

  1. Reliance on WISDOM, INSIGHT, GOOD JUDGMENT – ignores actual experience
  2. OBSERVE THE RISK'S ACTUAL LOSSES over an extended time – the loss event may only occur once (life insurance)
  3. Observe GROUPS OF RISKS WITH SIMILAR CHARACTERISTICS and what their losses are
11
Q

Bailey & Simon: *** 3 Major Conclusions on the Actuarial Credibility of a Single Auto

A
  1. The Experience for 1 year for 1 CAR HAS SIGNIFICANT AND MEASURABLE CREDIBILITY
  2. In a HIGHLY REFINED rating class system which reflects inherent hazard, there will NOT BE MUCH ACCURACY in an individual risk merit rating plan, but where a wide range of hazards are encompassed within a class system, credibility is much larger
  3. CREDIBILITY DID NOT INCREASE LINEARLY with years of experience due to variation in the groups over time and skew

Adding a 2nd year increased credibility by roughly 2/5; adding a 3rd year increased it by another 1/6.

12
Q

Bailey & Simon: ** Four reasons Multi-year Credibility does not Grow Linearly from 1-yr Credibility

A

Multi-Year Credibilities are Not Linear because:

  1. Risks ENTERING AND LEAVING the class
  2. An insured's CHANCE FOR AN ACCIDENT changes over time
  3. RISK DISTRIBUTION of individual insureds may be SKEWED, reflecting various degrees of accident proneness
  4. Credibility, defined as P/(P+K) in experience rating, does not grow linearly as P increases [Hazam review]
13
Q

Bailey & Simon: ** The Credibility of PPA Experience Rating depends on ______ and ______

A

Experience Rating Credibility Depends on:

  1. VOLUME of data in the experience period
  2. Amount of VARIATION of individual hazards within the class

(In contrast, for class rating credibility increases in proportion to the square root of data volume only.)
14
Q

Bailey & Simon: ** Why is EPPR used instead of earned exposures as the FREQUENCY BASE for calculating credibility of a single PPA?
According to Hazam, what conditions must be met?

A

Use EPPR as a base to AVOID THE MALDISTRIBUTION that results when higher claim frequency territories produce more X, Y, and B risks and also produce higher territory premiums; basically, to avoid overlap between territory rating and experience rating.

To use EPPR as a base for eliminating maldistribution, per Hazam:
  1. High frequency territories must also be higher premium territories
  2. Territory differentials must be proper

Alternative: apply the Bailey-Simon method to loss costs instead of loss frequencies.

15
Q

Bailey & Simon: *** Single PPA Credibility:
Credibility =
Modification =
R =
m =

A

(See Card #11.)

Z = 1 − Mod, for a class that is n-years claim free (1+, 2+, 3+ yrs); R = 0 for claim-free risks

Z = (Mod − 1) / (R − 1), for a group of risks WITH claim experience (0, 1, 2 claims)

Mod = Relative Frequency = Z×R + (1 − Z)
    = (# claims in class / EPPR of class) / (# claims total / EPPR total)

R = Actual Freq / E[Freq]
  = 0 for accident-free risks
  = [1 − e^(−m)]⁻¹ if frequency is Poisson distributed
  (E[Freq] is usually the class average frequency)

m = (# claims total) / (EE total)
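A small numeric sketch of these formulas in Python; all counts below are made-up illustrations, not values from the paper:

```python
import math

# Illustrative inputs (hypothetical numbers, not from Bailey & Simon)
claims_class, eppr_class = 120.0, 900.0     # an n-year claim-free class
claims_total, eppr_total = 2000.0, 10000.0  # all risks combined
ee_total = 25000.0                          # total earned exposures

# Relative frequency (the modification) for the claim-free class
mod = (claims_class / eppr_class) / (claims_total / eppr_total)

# Claim-free group: R = 0, so Z = 1 - Mod
z_claim_free = 1 - mod

# Group WITH claims: R under a Poisson frequency assumption
m = claims_total / ee_total                 # overall claim frequency
r_with_claims = 1 / (1 - math.exp(-m))      # relative freq of risks with >= 1 claim
mod_with_claims = 1.10                      # hypothetical observed relative frequency
z_with_claims = (mod_with_claims - 1) / (r_with_claims - 1)

print(round(z_claim_free, 3), round(z_with_claims, 3))
```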

16
Q

Bailey & Simon:

How to determine which class has more stability:

  • Across time?
  • Between risks within the class?
A
**Stability Across Time:**
Examine (n-yr Cred / 1-yr Cred) for each class.
The more linear the multi-year credibility, the MORE STABLE (ratio = n), i.e. the class with the RATIO CLOSEST TO n is the MOST STABLE.
Logic: If an insured's chance for an accident remained constant from year to year and no risks entered or left, then credibility should vary in proportion to the number of years.

**Stability within a Class:**
Examine (n-yr Cred / frequency per total EE) for each class.
The LOWEST RATIO indicates the MOST STABLE individual risks, the lowest variation within the group, or the most narrowly defined/most homogeneous class.
Logic: If the variation of individual insureds' chances for an accident were the same within each class, credibility should vary in proportion to the average claim frequency.

*Recall: there are 5 classes shown; each class is divided into experience groups A, X, Y, B.*
17
Q

Mahler 1 *
Credibility & Shifting Risk Parameters

Background for the paper’s analysis

A

Background:
Past Experience used to predict the future
New Estimate = Data*(z) + (Prior Est)*(1-z)
z = Credibility, Prior Est = Class AVG
Parameters may shift over time posing the question: How should we combine different years of historical data? Suggests:

  • Give Old years substantially less weight
  • May be minimal gain from using additional years of data

May want to vary the weights to prior estimates

  • No weight at all
  • Only look at 1 year back
  • Use multiple past years (weight equally, or vary the weights?)
18
Q

Mahler 1 ***

Criteria for Evaluating Credibility Weighting Schemes (3)

(for weighting past experience against expected future experience)

A

3 Criteria for evaluating credibility weighting:

  1. LEAST SQUARED ERROR (LSE): Minimizes the squared error between observed and predicted results. The smaller the MSE, the better the estimate. Bühlmann/Bayesian credibility methods are LSE.
  2. SMALL CHANCE OF LARGE ERRORS: Minimizes the probability (p) that observed results will be more than k% different from predicted.
     The smaller p is, the better the solution. Classical/limited fluctuation credibility technique:
     Pr( |A − E| ≤ k·E ) = 1 − p
  3. CORRELATION: Minimizes patterns of errors (not concerned about large errors); also called the Meyers/Dorweiler method.
     Statistic = correlation of [(Actual / Predicted) vs. (Predicted / Grand Mean)]
               = correlation of [Modified Loss Ratio & Experience Modifier]

     Want the correlation as close to zero as possible.
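A rough sketch, assuming numpy, of how the three statistics could be computed for a set of risks; the function name and arguments are hypothetical, not from Mahler:

```python
import numpy as np

def evaluate_weighting(actual, predicted, grand_mean, k=0.05):
    """Illustrative versions of the three criteria."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)

    # 1. Least squared error: smaller MSE is better
    mse = np.mean((actual - predicted) ** 2)

    # 2. Small chance of large errors: probability that actual is more
    #    than k% away from predicted (smaller is better)
    p_large_error = np.mean(np.abs(actual - predicted) > k * predicted)

    # 3. Meyers/Dorweiler correlation: corr of (actual/predicted) with
    #    (predicted/grand mean); want this close to zero
    corr = np.corrcoef(actual / predicted, predicted / grand_mean)[0, 1]

    return mse, p_large_error, corr
```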
19
Q

Mahler 1 *

What is the MAXIMUM REDUCTION in MSE that can be attained by using credibility to combine two estimates?

Optimal Credibility (Z) =

A

Lowest MSE possible = 0.75 × min[ MSE(Z=1), MSE(Z=0) ]

Optimal Credibility (Z) = 1 − MSE(Z=1) / MSE(Z=0)
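A short numeric illustration of the two formulas on this card; the MSE values are hypothetical:

```python
# Hypothetical mean squared errors of the two single-source estimates
mse_z1 = 0.8   # MSE using the observed data alone (Z = 1)
mse_z0 = 2.0   # MSE using the prior estimate alone (Z = 0)

z_optimal = 1 - mse_z1 / mse_z0            # = 0.60
lowest_mse = 0.75 * min(mse_z1, mse_z0)    # = 0.60, the best MSE attainable
```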

20
Q

Mahler 1 **

Mahler 1 - Study of Credibility

  • Optimal Credibility Mahler discovered
  • Number Years used vs. Credibility
  • Impact of Delay in data availability
A

Mahler’s Various Findings:

  1. Optimal Credibility:
    Experience should usually be given 60%-80% credibility [per Table 9]
  2. Number of Years vs. Credibility:
    The estimate gets worse as more years of data are used when parameters shift over time
  3. Impact of Delay
    The resulting credibility weighted estimate is less accurate; as delay increases, optimal credibility decreases
21
Q

Mahler 1 ***

3 Tests to see if parameters shift over time
(Three tests Mahler uses to examine the winning percentages of the teams)

What does each test reveal?

A
  1. BINOMIAL TEST - Fox (Q4)
     Tests whether all teams win the same amount.
     Metric: Do team winning percentages fall within a 95% confidence interval around the grand mean?

  2. CHI-SQUARED TEST
     Tests for a shift over time in winning percentage (or underlying parameters).
     Metric: Chi-Squared statistic = Σ[(observed − expected)² / expected]

  3. CORRELATION TEST
     Tests for a shift over time (same as the Chi-Squared test).
     Metric:
     1. Compute the correlation of win% in one year vs. a prior year. Repeat for every lagged pair of years in history, for lags 1 to n years (e.g. lag 1 yr: 2001:2002, 2002:2003, 2003:2004, etc.)
     2. Average these correlations across all pairs with the same lag (1 yr, 2 yr, …)
     3. If the average correlation decreases as the lag grows, conclude that parameters are shifting over time.
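A minimal sketch of the correlation test, assuming numpy; the array shape and the helper name `avg_lagged_correlations` are hypothetical, with each row holding one year of team winning percentages:

```python
import numpy as np

def avg_lagged_correlations(win_pct, max_lag=5):
    """win_pct: array of shape (n_years, n_teams) of winning percentages.
    Returns the average correlation for each lag."""
    n_years = win_pct.shape[0]
    averages = {}
    for lag in range(1, max_lag + 1):
        corrs = [np.corrcoef(win_pct[y], win_pct[y + lag])[0, 1]
                 for y in range(n_years - lag)]
        averages[lag] = np.mean(corrs)
    # If averages[lag] declines as lag grows, parameters are shifting over time
    return averages
```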
22
Q

Mahler 1 (mine)

Exponential Smoothing

A

X_{j+1} = Z × Y_j + (1 − Z) × X_j

X_{j+1} = next year's estimate

Y_j = prior year's actual result

X_j = prior year's estimate

X_0 could be the grand mean

  • Predict X_{j+1} from the prior actual Y_j and the prior estimate X_j
  • The method leads to exponentially decreasing credibilities assigned to older years' experience
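A minimal sketch of this recursion; the starting value and Z below are just illustrative:

```python
def exponential_smoothing(actuals, z, x0):
    """Sketch of the recursion X_{j+1} = z*Y_j + (1-z)*X_j.
    actuals: observed results Y_0, Y_1, ...; x0: starting estimate (e.g. grand mean)."""
    estimates = [x0]
    for y in actuals:
        estimates.append(z * y + (1 - z) * estimates[-1])
    return estimates  # estimates[j] is the estimate entering year j

# Example: z = 0.3, grand mean 0.500, three years of observed results
print(exponential_smoothing([0.550, 0.480, 0.600], z=0.3, x0=0.500))
```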
23
Q

Anderson ***

Failings of one-way analysis (2)

A

Failings of One-way Analysis:

1. Can be DISTORTED BY CORRELATIONS between rating factors
Youth are more concentrated in some territories: territory and age are correlated; one-way analysis overlooks this

2. Does NOT CONSIDER INTERDEPENDENCIES
between factors
(aka: interactions)
Youth+High Performance Car = extra risky driver, but Elderly+High Performance Car = extra careful driver; impact of high performance vehicle changes across, or “interacts” with age

24
Q

Anderson ***

Failings of Minimum Bias (2)

A

Failings of Minimum Bias: lack of a statistical framework:

  1. Cannot test the STATISTICAL SIGNIFICANCE of a variable
  2. No CREDIBILITY RANGES for parameter estimates

Min Bias - Iterative calculations are considered computationally inefficient

25
Q

Anderson ***

Assumptions in Classical Linear Models (4)

A

Assumptions in Classical Linear Models

  1. All observations are INDEPENDENT
  2. Observations are NORMALLY DISTRIBUTED; ε ~ Norm(0, σ²)
  3. Each component, or risk segment, has a COMMON VARIANCE
  4. The MEAN is a LINEAR COMBINATION OF COVARIATES (aka additivity of effects), e.g. μ = β1x1 + β2x2 + β3x3, with Y = μ + ε
26
Q

Anderson ***

Limitations of Linear Models (4)

A

Limitations of Linear Models:

1. Difficult to assert Normality & Constant Variance for response variables
   • Response variables are often restricted to positive values; this violates the normality assumption
   • If Y >= 0, then Var[Y] → 0 as E[Y] → 0, so the variance is a function of the mean

2. The Mean is not always a Linear Combination of Covariates
   • Many insurance risks tend to vary multiplicatively with rating factors
27
Q

Anderson ***

Classic Linear Model Formula

versus

Generalized Linear Model Formula

A
**Classic Linear:**
Y = μ + ε = E[Y] + ε = X*β + ε

Y = Random component: independent and normally distributed. The means (μi) of each component of Y may differ, but they all have a common variance (σ²)

X*β = Systematic component: the covariates (X) combine to give the linear predictor η = X*β

g( ) = Link function: the identity link function is used

ε = Error term: Norm(0, σ²)

**Generalized Linear:**
Y = μ + ε = E[Y] + ε = g⁻¹(X*β) + ε
(ignoring the offset term here)

Y = Random component: independent; from a member of the exponential family

X*β = Systematic component: the covariates (X) combine to give the linear predictor η = X*β

g⁻¹( ) = Link function: differentiable and monotonic

ε = Error term: various distributions possible
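A minimal sketch of fitting a GLM of the second form, assuming statsmodels is available; the Poisson family and simulated 0/1 rating factors are illustrative choices, not Anderson's example:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical design matrix (three 0/1 rating factors) and claim-count response
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 3)).astype(float)
X = sm.add_constant(X)                                 # intercept = base level
y = rng.poisson(lam=np.exp(X @ np.array([-1.0, 0.3, 0.5, -0.2])))

# Poisson family uses a log link by default, so E[Y] = g^{-1}(X*beta) = exp(X*beta)
model = sm.GLM(y, X, family=sm.families.Poisson())
results = model.fit()
print(results.params)      # estimated betas
print(results.deviance)    # used in the deviance tests described on a later card
```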

28
Q

Anderson ***

Generalized Linear Model Assumptions (7)

and how they relate to regular Linear Model assumptions

A
  1. No NORMALITY assumption
  2. No CONSTANT VARIANCE assumption
  3. No ADDITIVITY OF EFFECTS assumption
     (the mean does not have to be a linear combination of covariates)

Assumptions 1-3 are Linear Model assumptions that no longer apply

  4. The observation distribution is a member of the EXPONENTIAL FAMILY
  5. Observations are INDEPENDENT
  6. The p covariates are combined to give the linear predictor η = X*β
  7. The relationship between the random and systematic components is specified via a LINK FUNCTION, g⁻¹( ), that is DIFFERENTIABLE and MONOTONIC.
     The link function can transform η into something non-linear
29
Q

Anderson ***

What Model Diagnostics can be conducted to Evaluate the Factor Selection in a GLM? (2)

A

β Parameter Tests

  • “Standard errors” (SE) for each parameter estimate can be defined using the multivariate version of the Cramer-Rao Lower Bound
  • Generally the β’s are assumed to have an asymptotically Normal distribution, so simple statistical tests can be applied to each level of a factor to test whether its β is significantly different from the base level (usually a χ² test)

Deviance Tests

  • Deviance (D) is a measure of how much the fitted values differ from the observations; basically a generalized form of the SSE
  • Can test between two “nested” models to see if the inclusion of the additional factor improves the model enough (i.e. decrease the deviance) given the extra parameter it adds to the model (adding any factor will improve the fit, but is the improvement significant)

Usually a χ² test is used, but an F test can be applied if the scale parameter is unknown
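A short sketch of the nested-model deviance test, assuming scipy; the deviance values and parameter counts below are hypothetical outputs from two fitted GLMs:

```python
from scipy import stats

# Two nested GLMs: the larger model adds one factor with two extra parameters
deviance_small, params_small = 612.4, 4
deviance_big, params_big = 598.1, 6      # larger model always has lower deviance

dev_drop = deviance_small - deviance_big
df = params_big - params_small

# p-value for the chi-squared test on the deviance improvement
p_value = stats.chi2.sf(dev_drop, df)
print(p_value)   # small p-value => the extra factor significantly improves the fit
```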

30
Q

Anderson ***

What is a Holdout sample

What is the purpose of a Holdout sample (2)

What are important selection criteria for the Holdout sample (3)

A

A training dataset is used to fit a model. A holdout sample is a separate sample of data not used in training. It is used to validate the results of the model.

Predictions from the training model are compared to predictions from the holdout model to:

  • Test how well the results of a predictive model generalize to other data; test how accurately the predictive model will perform in practice
  • Prevent over-fitting: this is particularly likely when (1) the size of the training data set is small or (2) when the number of parameters in the model is large.

Holdout Sample Selection Criteria:

  • Be an unbiased sample from the same population as the training dataset
  • Be large enough to fit a model to
  • Can select randomly or split from training data by time (out of time sample)
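A minimal sketch of a random holdout split, assuming pandas and scikit-learn; the tiny dataset and column names are made up:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical policy-level dataset with two predictors and a claim-count response
df = pd.DataFrame({
    "age_group": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1],
    "territory": [1, 1, 2, 2, 3, 3, 1, 2, 3, 1],
    "claims":    [0, 1, 0, 0, 2, 1, 0, 0, 1, 0],
})

# Random 70/30 split; an "out of time" holdout would instead slice by policy year
train_df, holdout_df = train_test_split(df, test_size=0.3, random_state=42)

# Fit the model on train_df only, then compare its predictions on holdout_df to
# the actual outcomes to check generalization and guard against over-fitting.
```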
31
Q

Anderson **

What is the Goal of the following Model Forms and what is each Suited for:

  • Generalized Linear Modeling (GLM)
  • Principal Component Analysis (PCA)
  • Clustering
A

GLM: Fits a single response variable based on multiple linear predictors (parameterized method), a given error distribution, and a link function. Assumes a constant change in the predictors yields a constant change in the response. Prevents double counting of correlated signal among the predictors. Can run into difficulty if the predictors are too highly correlated (aliased). Intuitive method.

PCA: Primarily a dimension reduction or consolidating/grouping technique. Used when multiple predictors have common signal. The goal is to go from many (aliased) predictors, to a few new predictors that now house the bulk of the signal (variation). Suited for developing individual, aggregate variables that summarize signal.

Clustering: Grouping technique that clusters so that objects in the same group are more similar to each other than to those in other groups. Not a linear technique. Usually applied when there are multiple targets (criteria) to consider and grouping is the goal. Does not return individual point estimates.

32
Q

Anderson ***

What is Aliasing

List and describe the Three Types of Aliasing

A

Aliasing (Between Factors)

Aliasing occurs when there is a LINEAR DEPENDENCY among the factors (covariates) in the model; one factor or factor level is identical to another factor or factor level, or to a combination of other factors/levels.

INTRINSIC: Occurs because of dependencies INHERENT IN THE DEFINITION of the factors (like age & birth year)

EXTRINSIC: Occurs from a dependency among factors when the dependency is from the NATURE OF THE ACTUAL DATA rather than inherent properties of the covariates (e.g. in your data all red cars just happen to be 2-door sedans as well)

NEAR-ALIASING: Occurs when two or more factors are almost but NOT QUITE PERFECTLY CORRELATED; convergence problems can occur when present

33
Q

Anderson **

Which Variable Types are more prone to GLM Aliasing problems?

What Model Specifications make Aliasing occur?

How can a Model be Specified to avoid Aliasing?

A

Linear dependencies among factor levels commonly arise when Categorical Factors (vs. continuous variables) are included in the model.

If the model has a parameter for each factor level of a variable there WILL BE a linear dependency between the factor levels [aliasing], and the model will not be uniquely defined.

To uniquely define the model the aliasing needs to be removed: one level of each factor should have no parameter associated with it – the 'base' level; when none of the other levels are indicated, the base effect applies. This creates a model with an intercept term.

Example:

Factor X with 4 levels, each an indicator that is either 0 or 1: x1, x2, x3, x4 (so x1 + x2 + x3 + x4 = 1)

Linearly dependent: η = β1·x1 + β2·x2 + β3·x3 + β4·x4

Aliased predictor: η = β1·x1 + β2·x2 + β3·x3 + β4·(1 − x1 − x2 − x3)
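A small illustration of removing the aliasing by dropping a base level, assuming pandas; the `vehicle_class` factor and its levels are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"vehicle_class": ["sedan", "coupe", "suv", "sedan", "suv"]})

# Full set of indicators: the columns sum to 1 in every row -> linear dependency
full = pd.get_dummies(df["vehicle_class"])

# Drop one level (the "base" level); its effect is absorbed by the intercept
encoded = pd.get_dummies(df["vehicle_class"], drop_first=True)
print(encoded)
```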

34
Q

Anderson (mine)

Explain what Model Underfit and Overfit mean.

A

Underfit <=> too few parameters <=> Does not use enough of the useful info <=> Does not capture enough of the signal

Overfit <=> Too many parameters <=> Reflects too much of the noise

35
Q

Robertson **

Describe the 1993 NCCI Hazard Group study

A

1. Identified Seven Variables – ratios of class average to state average – indicative of excess loss potential:
   1. Serious to Total claim frequency ratio
   2. Serious Medical severity
   3. Serious to Total indemnity PP ratio
   4. Serious Indemnity severity
   5. Serious severity, including medical
   6. Serious Medical to Total Medical PP
   7. Serious PP to Total PP ratio

2. Grouped the ratios into three subsets based on examination of the correlation matrix

3. Used Principal Components analysis to find a single representative variable from each subset (variables 1, 2, 7); developed a predictive linear combination of them

4. Determining the optimal number of Hazard Groups was out of scope

5. Class credibility based on z = min(1.5·n/(n+k), 1)
   Gave classes with average claim volume 75% credibility, and classes with twice the average 100%

6. Over 96% of premium was concentrated in two Hazard Groups: II (45.6% prem) and III (51.1% prem)
36
Q

Robertson ***

Summarize the Process used in the

2007 NCCI Hazard Group Mapping Study

A
  1. Developed Excess Loss Ratios for each class at five selected limits
     • Excess ratios were credibility weighted with the current HG excess ratio
     • Based on five years of data (2000-2005)
  2. Grouped classes with similar unstandardized, credibility-weighted excess ratio vectors using weighted k-means clustering analysis
  3. Enhanced the clusters/groupings using Principal Components analysis
     Allowed projection of the five-dimensional plot onto two dimensions; showed that clusters were well separated and outliers were easily identified
  4. Determined the optimal number of groups (7) using weighted k-means cluster analysis
  5. Had an underwriter panel review the initial groups; revised groupings based on their input
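A rough sketch of the weighted k-means step, assuming scikit-learn; the excess ratio vectors and premium weights are simulated stand-ins, not NCCI data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n_classes = 40

# Hypothetical credibility-weighted excess ratios at the 5 limits (100K ... 5000K),
# arranged to decrease as the limit increases
excess_ratios = np.sort(rng.uniform(0.01, 0.6, size=(n_classes, 5)), axis=1)[:, ::-1]
class_premium = rng.uniform(1e5, 5e7, size=n_classes)   # used as cluster weights

# Weighted k-means into 7 groups (the number NCCI ultimately selected)
km = KMeans(n_clusters=7, n_init=10, random_state=0)
labels = km.fit_predict(excess_ratios, sample_weight=class_premium)
print(labels)   # candidate hazard group assignment for each class
```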
37
Q

Robertson **

How many Excess Loss Factors (Limits) were Used to Sort the Classes into Hazard Groups?

Why was This Number selected? (5)

A

NCCI selected five limits: $100K, $250K, $500K, $1,000K, $5,000K

Background: NCCI publishes excess loss factors (ELFs) for 17 different limits; the five limits were selected based primarily on the following considerations:

  1. ELFs at any pair of limits are highly correlated
  2. Limits below $100,000 are heavily represented in the list of 17 limits
  3. The 12 limits not used had correlation coefficients of at least 0.9882
  4. Including all 17 limits only affected the HG assignment of 5.5% of the premium
  5. Including only one limit gave markedly different results depending on the limit chosen (e.g. $100K vs. $1,000K), indicating too much information was lost