A.2. GLM Flashcards

Question 1

Q

GLM random component

Question 2

Q

GLM systematic component

Question 3

Q

Advantages of multiplicative rating plans

Answer

A

Simple and practical to implement.
They guarantee positive premiums (not true for additive
terms).
Impact of risk characteristics is more intuitive.
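
As a minimal sketch of the points above, the following prices one risk under a multiplicative plan. The base rate, rating variables, and relativities are hypothetical values chosen for illustration only. Because every factor is positive, the product is always positive, and each relativity reads directly as a percentage change in premium.

```python
from math import prod

# Hypothetical base rate and relativities (illustrative values only).
BASE_RATE = 500.0
RELATIVITIES = {
    "territory": {"urban": 1.25, "rural": 0.80},
    "age_group": {"young": 1.60, "adult": 1.00},
}

def multiplicative_premium(risk):
    """Base rate times the relativity for each rating variable."""
    factors = [RELATIVITIES[var][level] for var, level in risk.items()]
    return BASE_RATE * prod(factors)

# Rural territory, young driver: 500 * 0.80 * 1.60
premium = multiplicative_premium({"territory": "rural", "age_group": "young"})
```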

Question 4

Q

Variance functions for exponential family distributions

Question 5

Q

Choices for frequency distributions

Answer

A

Claim frequency is most often modeled using a Poisson
distribution. The GLM implementation of Poisson allows
the response to take non-integer values instead of only discrete counts.
Technically, the overdispersed Poisson is recommended,
which allows the dispersion parameter to differ from 1, and thus allows the
variance to be greater than the mean (instead of being equal,
as with the typical Poisson).
Another choice for frequency modeling is the Negative
Binomial distribution, which is really just a Poisson
distribution with a parameter that itself has a Gamma
distribution. With the Negative Binomial, φ is restricted to 1,
but instead it contains a dispersion parameter κ in its variance
function that allows for the variance to exceed the mean.
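
The two ways of letting the variance exceed the mean can be compared with a short sketch. It assumes the usual forms: variance equal to the dispersion parameter times the mean for the overdispersed Poisson, and mean times (1 + kappa times mean) for the Negative Binomial. The names `phi` and `kappa` and the numeric inputs are illustrative.

```python
def var_overdispersed_poisson(mu, phi):
    """Variance = phi * mu; phi = 1 recovers the ordinary Poisson."""
    return phi * mu

def var_negative_binomial(mu, kappa):
    """Variance = mu * (1 + kappa * mu); kappa = 0 recovers the Poisson."""
    return mu * (1.0 + kappa * mu)

mu = 0.10  # hypothetical expected claim frequency
odp_var = var_overdispersed_poisson(mu, phi=1.5)
nb_var = var_negative_binomial(mu, kappa=2.0)
```

Under these forms the overdispersed Poisson scales the variance by a constant factor, while the Negative Binomial's excess variance grows with the square of the mean.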

Question 6

Q

Choices for severity distributions

Answer

A

In insurance data, claim severity distributions tend to be
right-skewed and have a lower bound at 0. Both the Gamma
and Inverse Gaussian distributions exhibit these properties,
and as such are common choices for modeling severity. The
Gamma distribution is the most commonly used, but the
Inverse Gaussian has a sharper peak and wider tail, so it is
more appropriate for more skewed severity distributions.

Question 7

Q

Relationship between Poisson, Gamma, and Tweedie parameters

Question 8

Q

Logit and logistic functions

Question 9

Q

Why continuous predictor variables should usually be logged and exceptions

Answer

A

Continuous variables should usually be logged when a log
link function is used to allow GLMs flexibility in fitting
different curve shapes to the data (other than just exponential
growth).
Exceptions to the general rule of logging a continuous
predictor variable exist such as using a year variable to pick
up trend effects. Also, if the variable contains values of 0, an
adjustment such as adding 1 to all observations must first be
made since ln(0) is undefined.
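
A small sketch of why logging matters under a log link: with ln(x) as the predictor the fitted mean is a power curve in x, whose shape depends on the coefficient, whereas with raw x the mean can only grow or decay exponentially. The function names and coefficient values are hypothetical.

```python
import math

def mean_logged(x, b0, b1):
    """Log link, predictor ln(x): mu = exp(b0) * x**b1 (a power curve)."""
    return math.exp(b0 + b1 * math.log(x))

def mean_unlogged(x, b0, b1):
    """Log link, predictor x: mu = exp(b0) * exp(b1)**x (exponential)."""
    return math.exp(b0 + b1 * x)

# With b1 = 0.5 the logged model behaves like a square root of x.
logged_at_4 = mean_logged(4.0, b0=0.0, b1=0.5)
# Doubling x multiplies the logged-model mean by 2**b1, whatever x is.
ratio = mean_logged(8.0, 0.0, 0.5) / mean_logged(4.0, 0.0, 0.5)
```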

Question 10

Q

Impact of choosing a level with fewer observations as the base level of a
categorical variable

Answer

A

This will still result in the same predicted relativities for that
variable (re-based to the chosen base level), but there will be
wider confidence intervals around the estimated coefficients.
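
The re-basing part of this answer can be illustrated with a sketch, assuming a log link and hypothetical coefficients for a three-level variable. Changing the base level shifts every coefficient by the same amount, so the relativity between any two levels is unchanged. The wider confidence intervals are not shown here, since they depend on the data behind each level.

```python
import math

# Hypothetical log-link coefficients with level "A" as the base.
coefs_base_a = {"A": 0.0, "B": 0.20, "C": -0.10}

def rebase(coefs, new_base):
    """Shift all coefficients so that new_base has coefficient 0."""
    shift = coefs[new_base]
    return {level: b - shift for level, b in coefs.items()}

def relativity_between(coefs, level, other):
    """Multiplicative relativity of one level versus another."""
    return math.exp(coefs[level] - coefs[other])

coefs_base_c = rebase(coefs_base_a, "C")
before = relativity_between(coefs_base_a, "B", "C")
after = relativity_between(coefs_base_c, "B", "C")
```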

Question 11

Q

Matrix form of a GLM

Question 12

Q

Degrees of freedom for a model

Answer

A

The degrees of freedom of a model is the number of
parameters that need to be estimated for the model.

Question 13

Q

GLM outputs for each predicted coeffi

Question 14

Q

How number of observations and dispersion parameter impact p-values

Question 15

Q

Model Building Process (10 Steps)

Question 16

Q

What does a p value represent

Answer

A

The estimated probability of obtaining a coefficient estimate at least as far from 0 as the observed β by pure
chance, assuming the true value of β is 0.
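
One common way to compute such a p-value is a two-sided Wald test under a normal approximation. The sketch below assumes that approach; GLM software may use a different test, and the estimate and standard error shown are hypothetical.

```python
import math

def wald_p_value(estimate, std_error):
    """Two-sided p-value for H0: coefficient = 0 (normal approximation)."""
    z = estimate / std_error
    # P(|Z| >= |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical coefficient 0.196 with standard error 0.10, so z = 1.96.
p = wald_p_value(0.196, 0.10)
```

A smaller standard error, for instance from more observations, gives a larger z and a smaller p-value for the same estimate.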

Question 17

Q

Solutions for addressing correlation amongst variables (2)

Answer

A

Remove all but one of the highly correlated variables; this can cause a loss of important signal.
Use dimensionality reduction techniques such as principal component analysis or factor analysis to create a new, smaller set of variables from the correlated ones.

Question 18

Q

What is aliasing and what happens to the model?

Answer

A

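Aliasing occurs when two predictor variables are perfectly correlated.
The model then has no unique solution and may fail to converge; most GLM software detects this and drops one of the aliased variables.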

Question 19

Q

2 GLM Limitations

Answer

A

GLMs give full credibility to the data.
GLMs assume that the randomness of outcomes is uncorrelated across records, which may not hold (e.g., the same driver appearing in multiple years).

Question 20

Q

Explain cross-validation

Answer

A

Split the data into k folds. For each fold, train the model on the other k-1 folds and test it on the held-out fold.
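
The splitting step can be sketched as follows. This is a minimal illustration using only the standard library; the function name `k_fold_indices` and the seed are assumptions, and the model fitting and scoring for each split are left out.

```python
import random

def k_fold_indices(n_records, k, seed=0):
    """Yield (train_idx, test_idx) pairs, one per fold."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # random assignment to folds
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Train on every fold except the held-out one.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(n_records=10, k=5))
```

Each record appears in exactly one test fold, so every observation is used for testing once and for training k-1 times.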

Question 21

Q

Advantages of modelling frequency and severity separately.

Answer

A

Gain more insight and intuition about the impact of each predictor.
Frequency and severity are each more stable when modeled separately.
Pure premium modeling can lead to overfitting if a variable impacts only frequency or only severity and not the other.
The Tweedie model assumes that frequency and severity move in the same direction, which may not be the case.

Question 22

Q

Answer

A

(22 cards)