A.2. GLM Flashcards
GLM random component
GLM systematic component
Advantages of multiplicative rating plans
- Simple and practical to implement.
- They guarantee positive premiums (not true for additive
terms).
- Impact of risk characteristics is more intuitive.
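The positivity point can be illustrated with a small sketch. The base rate, relativities, and additive adjustments below are hypothetical numbers chosen only for illustration; they do not come from any real rating plan.

```python
def multiplicative_premium(base_rate, relativities):
    """Premium = base rate times the product of positive relativities."""
    premium = base_rate
    for r in relativities:
        premium *= r
    return premium


def additive_premium(base_rate, adjustments):
    """Premium = base rate plus a sum of dollar adjustments."""
    return base_rate + sum(adjustments)


# Hypothetical risk receiving three discounts under each structure.
mult = multiplicative_premium(500.0, [0.60, 0.70, 0.50])
add = additive_premium(500.0, [-200.0, -150.0, -250.0])

print(mult)  # stays positive however many discounts are stacked
print(add)   # can fall below zero when discounts accumulate
```

Because each relativity is positive (with a log link it has the form exp(coefficient)), the product can never cross zero, whereas stacked additive discounts can.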
Variance functions for exponential family distributions
Choices for frequency distributions
Claim frequency is most often modeled using a Poisson
distribution. Although the Poisson is technically discrete,
its GLM implementation allows the response to take
non-integer values. The overdispersed Poisson is
recommended in practice: it allows the dispersion parameter
to differ from 1, and thus allows the variance to be greater
than the mean (instead of being equal, as with the typical
Poisson).
Another choice for frequency modeling is the Negative
Binomial distribution, which is a Poisson distribution
whose mean parameter itself follows a Gamma distribution.
With the Negative Binomial, φ is restricted to 1, but its
variance function contains its own dispersion parameter k
that allows the variance to exceed the mean.
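As a rough sketch, the three frequency choices can be compared through their variances as a function of the mean. The function names and the argument name phi (for the dispersion parameter) are illustrative choices, and the Negative Binomial form mu * (1 + k * mu) is the standard parameterization assumed here.

```python
def poisson_variance(mu):
    """Typical Poisson: variance equals the mean (dispersion fixed at 1)."""
    return mu


def overdispersed_poisson_variance(mu, phi):
    """Overdispersed Poisson: variance = phi * mu, with phi free to differ from 1."""
    return phi * mu


def negative_binomial_variance(mu, k):
    """Negative Binomial: dispersion fixed at 1, variance = mu * (1 + k * mu)."""
    return mu * (1.0 + k * mu)


mean_frequency = 2.0  # illustrative mean claim count
print(poisson_variance(mean_frequency))
print(overdispersed_poisson_variance(mean_frequency, phi=1.5))
print(negative_binomial_variance(mean_frequency, k=0.5))
```

With phi > 1 or k > 0, both alternatives give a variance above the mean; setting phi = 1 or k = 0 recovers the typical Poisson.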
Choices for severity distributions
In insurance data, claim severity distributions tend to be
right-skewed and have a lower bound at 0. Both the Gamma
and Inverse Gaussian distributions exhibit these properties,
and as such are common choices for modeling severity. The
Gamma distribution is the most commonly used, but the
Inverse Gaussian has a sharper peak and wider tail, so it is
more appropriate for more skewed severity distributions.
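One way to see the difference between the two severity choices is through their standard variance functions, mu^2 for the Gamma and mu^3 for the Inverse Gaussian. The sketch below is an illustration with made-up inputs; the helper names are not from any library.

```python
def gamma_variance(mu, phi):
    """Gamma: variance = phi * mu^2."""
    return phi * mu ** 2


def inverse_gaussian_variance(mu, phi):
    """Inverse Gaussian: variance = phi * mu^3."""
    return phi * mu ** 3


def variance_growth(variance_fn, mu, factor=2.0, phi=1.0):
    """Ratio of variances when the mean is scaled by `factor` (phi cancels)."""
    return variance_fn(mu * factor, phi) / variance_fn(mu, phi)


# Doubling an illustrative mean severity of 1,000:
print(variance_growth(gamma_variance, 1000.0))             # variance scales by 2^2
print(variance_growth(inverse_gaussian_variance, 1000.0))  # variance scales by 2^3
```

The variance grows faster with the mean under the Inverse Gaussian, which is consistent with its use for more heavily skewed severity data.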
Relationship between Poisson, Gamma, and Tweedie parameters
Logit and logistic functions
Why continuous predictor variables should usually be logged and exceptions
When a log link function is used, continuous predictor
variables should usually be logged. This gives the GLM the
flexibility to fit different curve shapes to the data (other
than just exponential growth).
There are exceptions to this general rule, such as a year
variable used to pick up trend effects. Also, if the variable
contains values of 0, an adjustment such as adding 1 to all
observations must first be made, since ln(0) is undefined.
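The effect of logging can be sketched for a single predictor under a log link. The coefficients b0 and b1 and the helper names are hypothetical; the point is only that exp(b0 + b1 * ln(x)) equals exp(b0) * x^b1, a power curve whose shape depends on b1, while the unlogged form is always exponential in x.

```python
import math


def predict_unlogged(b0, b1, x):
    """Log link with raw x: mu = exp(b0 + b1 * x), exponential in x."""
    return math.exp(b0 + b1 * x)


def predict_logged(b0, b1, x):
    """Log link with ln(x): mu = exp(b0) * x ** b1, a power curve in x."""
    return math.exp(b0 + b1 * math.log(x))


def log_with_zero_adjustment(x):
    """Add 1 before logging so that observations equal to 0 are defined."""
    return math.log(x + 1.0)


# With b1 = 1 the logged model is linear in x; b1 = 0.5 gives a square-root curve.
for x in (1.0, 4.0, 9.0):
    print(x, predict_logged(0.0, 1.0, x), predict_logged(0.0, 0.5, x))

print(log_with_zero_adjustment(0.0))  # defined, unlike math.log(0.0)
```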
Impact of choosing a level with fewer observations as the base level of a categorical variable
The predicted relativities for that variable stay the same
(just re-based to the chosen base level), but the confidence
intervals around the estimated coefficients will be wider.
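The re-basing part can be sketched as follows. The vehicle types, relativities, and base rate are hypothetical, and the sketch shows only that predictions are unchanged; it does not attempt to reproduce the effect on confidence intervals, which would require fitting a model.

```python
def rebase(relativities, new_base):
    """Divide every relativity by the new base level's relativity."""
    base_value = relativities[new_base]
    return {level: r / base_value for level, r in relativities.items()}


# Hypothetical relativities with "sedan" as the original base level.
relativities = {"sedan": 1.00, "suv": 1.20, "sports": 1.50}
base_rate = 400.0

# Switch the base to "sports"; the base rate absorbs the old relativity.
rebased = rebase(relativities, "sports")
rebased_base_rate = base_rate * relativities["sports"]

for level in relativities:
    original = base_rate * relativities[level]
    after = rebased_base_rate * rebased[level]
    print(level, original, after)  # the two predictions agree
```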
Matrix form of a GLM
Degrees of freedom for a model
The degrees of freedom of a model is the number of
parameters that need to be estimated for the model.
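As a minimal sketch of the count, assume an intercept, one coefficient per continuous variable, and base-level coding for categorical variables (one coefficient per non-base level). The function and its arguments are illustrative, and a model with polynomial or interaction terms would need more parameters than this counts.

```python
def model_degrees_of_freedom(n_continuous, categorical_levels, intercept=True):
    """Count the parameters to be estimated under the assumptions above."""
    df = 1 if intercept else 0
    df += n_continuous  # one coefficient per continuous variable
    df += sum(levels - 1 for levels in categorical_levels)  # base level needs none
    return df


# Intercept + 2 continuous variables + categoricals with 3 and 5 levels.
print(model_degrees_of_freedom(2, [3, 5]))
```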
GLM outputs for each predicted coefficient
How number of observations and dispersion parameter impact p-values