C.3 Multivariate Classification Flashcards

1
Q

Advantages and disadvantage of univariate analysis

A

Advantages: Simple to calculate and intuitive
Disadvantage: It doesn’t properly account for the impact
of correlated variables. This is of key importance as many
variables in insurance are correlated.
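
The distortion can be sketched with a small, entirely hypothetical dataset (the cell names, exposures, and pure premiums below are assumptions made for illustration). Territory has no true effect here, but young drivers are concentrated in the urban territory, so a univariate analysis still shows an urban surcharge.

```python
# Hypothetical cells: (age, territory) -> (exposure, true pure premium).
# By construction, pure premium depends only on age, not on territory.
cells = {
    ("young", "urban"): (80, 200.0),
    ("young", "rural"): (20, 200.0),
    ("old", "urban"): (20, 100.0),
    ("old", "rural"): (80, 100.0),
}


def univariate_pure_premium(cells, position, level):
    """Pure premium for one level of one variable, ignoring all other variables."""
    exposure = sum(e for key, (e, _) in cells.items() if key[position] == level)
    losses = sum(e * pp for key, (e, pp) in cells.items() if key[position] == level)
    return losses / exposure


urban = univariate_pure_premium(cells, 1, "urban")
rural = univariate_pure_premium(cells, 1, "rural")

# The univariate view picks up the age effect through the exposure mix.
territory_relativity = urban / rural
print(territory_relativity)
```

The indicated urban relativity differs from 1.00 only because of the exposure correlation between age and territory.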

2
Q

Three reasons GLMs have grown in popularity

A
  1. Increased computing power
  2. Better data availability
  3. Competitive pressure
3
Q

Benefits of multivariate methods (particularly GLMs)

A
  1. They properly adjust for exposure correlations between rating variables.
  2. They attempt to focus on the “signal” in the data (systematic effects) and ignore the “noise” (unsystematic effects).
  3. They provide statistical diagnostics (e.g., confidence intervals).
  4. They allow for the consideration of interactions between rating variables.
4
Q

Advantage and disadvantages of Minimum Bias procedures

A

Advantage: They properly adjust for exposure correlation.
Disadvantages: They do not provide ways to test for
whether variables are statistically significant and they are
computationally inefficient.
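
As a rough illustration of the iteration involved, here is a minimal sketch of one common variant, the multiplicative model with the balance principle. The card does not say which variant is meant, so that choice, the data, and all names below are assumptions. The observed pure premiums are built to be exactly multiplicative (young 2.0, urban 1.2), so the iteration should recover those relativities despite the skewed exposure mix.

```python
# Hypothetical cells: (age, territory) -> (exposure, observed pure premium).
cells = {
    ("young", "urban"): (80, 240.0),
    ("young", "rural"): (20, 200.0),
    ("old", "urban"): (20, 120.0),
    ("old", "rural"): (80, 100.0),
}
ages = ["young", "old"]
territories = ["urban", "rural"]


def minimum_bias(cells, ages, territories, n_iter=100):
    """Multiplicative balance-principle iteration (assumed variant)."""
    x = {a: 1.0 for a in ages}          # age factors
    y = {t: 1.0 for t in territories}   # territory factors
    for _ in range(n_iter):
        # Balance each age level: weighted observed = weighted fitted.
        for a in ages:
            num = sum(cells[(a, t)][0] * cells[(a, t)][1] for t in territories)
            den = sum(cells[(a, t)][0] * y[t] for t in territories)
            x[a] = num / den
        # Balance each territory level using the updated age factors.
        for t in territories:
            num = sum(cells[(a, t)][0] * cells[(a, t)][1] for a in ages)
            den = sum(cells[(a, t)][0] * x[a] for a in ages)
            y[t] = num / den
    return x, y


x, y = minimum_bias(cells, ages, territories)
age_relativity = x["young"] / x["old"]
territory_relativity = y["urban"] / y["rural"]
print(age_relativity, territory_relativity)
```

Note that the procedure returns point estimates only; nothing in the loop produces a standard error or a significance test.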

5
Q

Describe Sequential Analysis

A

First, perform a standard univariate analysis to obtain indicated relativities for a single variable. Next, apply the Adjusted Pure Premium Approach to obtain indicated relativities for a second variable, adjusting exposures for the prior variable’s selected relativities. Repeat the Adjusted Pure Premium Approach for each remaining variable, adjusting for all prior variables at each step. The method makes only one pass through the variables; it is not iterative.
This method does deal with exposure correlation, but its main criticism is that it has no closed-form solution: the results change depending on the order in which the variables are analyzed.
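
The order dependence can be sketched as follows. This is a minimal illustration under assumptions: the data are hypothetical, the indicated relativities are used as the selected relativities, and the function name and arguments are invented for this example.

```python
# Hypothetical cells: (age, territory) -> (exposure, losses).
cells = {
    ("young", "urban"): (80, 19200.0),
    ("young", "rural"): (20, 4000.0),
    ("old", "urban"): (20, 2400.0),
    ("old", "rural"): (80, 8000.0),
}
base_levels = {0: "old", 1: "rural"}  # key position -> base level


def sequential_relativities(cells, order, base_levels):
    """One pass through the variables in the given order."""
    rel = {}  # key position -> {level: relativity}
    for position in order:
        totals = {}
        for key, (exposure, losses) in cells.items():
            # Adjust exposure for every variable already analyzed.
            adjusted = exposure
            for prior, prior_rel in rel.items():
                adjusted *= prior_rel[key[prior]]
            exp_sum, loss_sum = totals.get(key[position], (0.0, 0.0))
            totals[key[position]] = (exp_sum + adjusted, loss_sum + losses)
        pure_premium = {lvl: l / e for lvl, (e, l) in totals.items()}
        base = pure_premium[base_levels[position]]
        rel[position] = {lvl: pp / base for lvl, pp in pure_premium.items()}
    return rel


age_first = sequential_relativities(cells, [0, 1], base_levels)
territory_first = sequential_relativities(cells, [1, 0], base_levels)

# Same data, different order, different indicated relativities.
print(age_first[0]["young"], territory_first[0]["young"])
print(age_first[1]["urban"], territory_first[1]["urban"])
```

The first variable in each ordering receives a plain univariate relativity, so whichever variable goes first absorbs part of the other variable's effect.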

6
Q

Some important steps in solving GLMs

A

Compiling a dataset with enough data for modeling, selecting a link function, specifying the distribution of the underlying random process, and using maximum likelihood to calculate the parameters of the model.
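
These steps can be sketched for a small frequency model. The choices below are assumptions made for illustration, not part of the card: a Poisson distribution, a log link, exposure as an offset, hypothetical data, and Newton-Raphson as the way to maximize the likelihood. In practice a statistics package would do the fitting.

```python
import math


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_poisson_glm(rows, n_iter=25):
    """Maximum likelihood for a Poisson GLM with log link and exposure offset.

    rows: list of (design vector, exposure, claim count); the first design
    column is assumed to be the intercept.
    """
    p = len(rows[0][0])
    total_claims = sum(y for _, _, y in rows)
    total_exposure = sum(e for _, e, _ in rows)
    beta = [math.log(total_claims / total_exposure)] + [0.0] * (p - 1)
    for _ in range(n_iter):
        score = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for x, e, y in rows:
            mu = e * math.exp(sum(b * xi for b, xi in zip(beta, x)))
            for i in range(p):
                score[i] += x[i] * (y - mu)
                for j in range(p):
                    info[i][j] += x[i] * x[j] * mu
        step = solve(info, score)  # Newton-Raphson update
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Hypothetical data; columns are [intercept, young indicator, urban indicator].
rows = [
    ([1, 1, 1], 800, 96),
    ([1, 1, 0], 200, 20),
    ([1, 0, 1], 200, 12),
    ([1, 0, 0], 800, 40),
]
beta = fit_poisson_glm(rows)
base_frequency, young_relativity, urban_relativity = (math.exp(b) for b in beta)
print(base_frequency, young_relativity, urban_relativity)
```

With a log link, exponentiating each fitted parameter gives a multiplicative relativity, which is why this link is a natural fit for multiplicative rating plans.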

7
Q

Why GLMs are usually run on frequency and severity instead of loss ratios

A

-There is no need to on-level premiums at the granular level.
-Actuaries have a priori expectations of frequency and severity patterns, but not of loss ratio patterns.
-Loss ratio models become obsolete when rates are changed.
-There is no standard distribution for modeling loss ratios.

8
Q

Some common GLM diagnostic tests

A

-Looking at standard errors (confidence intervals) around estimates.
-Using Chi-Square tests, F-tests, and other deviance tests to choose between competing models with different variables.
-Running the model on separate consecutive time periods of data to see if the estimated parameters are consistent over time.
-Building the model on one subset of historical data, and then comparing the model’s predictions with the actual results on a second subset of historical data (known as a holdout sample). This can identify whether the model is over-fitting or under-fitting the original dataset.
-Judgmentally deciding whether the results seem reasonable.
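
The first diagnostic can be sketched as follows, assuming a log-link model and a normal approximation for the parameter estimate. The parameter value, standard error, and function name are hypothetical.

```python
import math
from statistics import NormalDist


def relativity_interval(beta, std_error, level=0.95):
    """Approximate confidence interval for exp(beta) under a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(beta - z * std_error), math.exp(beta + z * std_error)


# Hypothetical fitted parameter and standard error for one rating level.
low, high = relativity_interval(beta=0.18, std_error=0.12)

# If the interval contains 1.00, the level is not clearly different from the base.
includes_one = low < 1.0 < high
print(low, high, includes_one)
```

A wide interval that straddles 1.00 suggests the estimate is mostly noise, which is one reason to consider dropping or grouping that level.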

9
Q

How actuaries can play a key role in using GLMs

A

-Obtaining reliable data for use in modeling (i.e., GIGO: Garbage In, Garbage Out).
-Exploring anomalous results in the GLM with additional analysis.
-Considering model results from both a statistical and business perspective.
-Developing appropriate methods to communicate the model results based on the company’s ratemaking objectives.

10
Q

Common types of external data used in GLMs

A

-Geo-demographic information: such as population density
-Weather data: such as average rainfall or number of days below freezing
-Property characteristics: such as square footage or quality of the local fire department
-Information about insured individuals or businesses: such as credit scores

11
Q

Some data mining techniques

A

-Factor Analysis: A technique to reduce the number of variables needed in a classification ratemaking analysis. An example is the symbol variable in auto insurance.
-Cluster Analysis: A method to combine similar risks into groups. An example is creating territories using zip codes.
-CART: Stands for Classification and Regression Trees. This can build a set of if-then rules for use in classification.
-MARS: Stands for Multivariate Adaptive Regression Splines. This helps turn continuous variables into categorical variables.
-Neural Networks: Methods by which training algorithms are given a set of data and identify any patterns. This can help identify previously unknown interactions between variables.
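
As one illustration of cluster analysis, the sketch below groups hypothetical areas into two territories by pure premium using a one-dimensional k-means. The card does not name a clustering algorithm, so k-means is an assumption here, as are the area labels and values.

```python
def kmeans_1d(values, centers, n_iter=20):
    """Plain k-means on a single numeric feature."""
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    labels = [
        min(range(len(centers)), key=lambda k: abs(v - centers[k])) for v in values
    ]
    return centers, labels


# Hypothetical pure premiums by area (labels stand in for zip codes).
pure_premiums = {"A": 310.0, "B": 295.0, "C": 120.0, "D": 135.0, "E": 305.0, "F": 110.0}
values = list(pure_premiums.values())

# Start the two centers at the lowest and highest observed values.
centers, labels = kmeans_1d(values, [min(values), max(values)])
territories = dict(zip(pure_premiums, labels))
print(territories)
```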
