18. Generalised Linear Modelling Flashcards

Question 1

Q

Explanatory variable

Answer

A

Inputs into a model expected to influence the response variable
- in a pricing context, these would be the rating factors
- for example, clinical and demographic drivers

May be categorical or non-categorical
- categorical: the values of each level are distinct and often cannot be given a natural ordering (e.g. gender); these variables are called factors
- non-categorical: can take numerical values (e.g. age); continuous numerical variables are often banded and treated as categorical variables

Question 2

Q

Response variable

Answer

A

Outputs likely to be affected by the explanatory variables
- the response is the value the model is trying to predict
- in a pricing context, this would be the premium

Question 3

Q

Interaction terms

Answer

A

The effect of one factor varies depending on the value of another
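A minimal sketch of an interaction term, assuming a log-link model with two hypothetical binary factors (smoker, over 50). All coefficient values are invented for illustration only. With the interaction term included, the smoker relativity differs between the two age groups; without it, the relativity is the same for both.

```python
import math

# Hypothetical log-scale coefficients, invented purely for illustration.
BASE = math.log(100.0)           # base level: non-smoker, under 50
SMOKER = math.log(1.5)           # main effect of smoking
OVER_50 = math.log(2.0)          # main effect of being over 50
SMOKER_OVER_50 = math.log(1.2)   # interaction: extra effect when both apply


def expected_cost(smoker: bool, over_50: bool, with_interaction: bool = True) -> float:
    """Expected claim cost from a log-link linear predictor."""
    eta = BASE
    if smoker:
        eta += SMOKER
    if over_50:
        eta += OVER_50
    if smoker and over_50 and with_interaction:
        eta += SMOKER_OVER_50
    return math.exp(eta)


def smoker_relativity(over_50: bool, with_interaction: bool = True) -> float:
    """Ratio of smoker to non-smoker expected cost within one age group."""
    return (expected_cost(True, over_50, with_interaction)
            / expected_cost(False, over_50, with_interaction))


if __name__ == "__main__":
    # With the interaction, the effect of smoking depends on age.
    print(smoker_relativity(over_50=False), smoker_relativity(over_50=True))
    # Without it, the effect of smoking is the same at every age.
    print(smoker_relativity(False, with_interaction=False),
          smoker_relativity(True, with_interaction=False))
```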

Question 4

Q

One-way analysis

Answer

A

Looks at the effect of each rating factor on frequency and severity separately
Ignores correlations and interaction effects between variables, so may underestimate or double-count the effect of a variable
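The distortion can be illustrated with a small made-up portfolio. In this sketch the exposure and claim counts are hypothetical: claim frequency depends only on area (urban 0.20, rural 0.10), but young policyholders are concentrated in urban areas, so a one-way analysis of age shows an apparent age effect that is really the area effect counted again.

```python
# Hypothetical portfolio: (age_band, area) -> (exposure, claims).
# Frequency is 0.20 in urban cells and 0.10 in rural cells for both age bands.
CELLS = {
    ("young", "urban"): (800.0, 160.0),
    ("young", "rural"): (200.0, 20.0),
    ("old", "urban"): (200.0, 40.0),
    ("old", "rural"): (800.0, 80.0),
}


def one_way_frequency(factor_index: int, level: str) -> float:
    """Claim frequency for one level of one factor, ignoring the other factor."""
    exposure = sum(e for key, (e, _) in CELLS.items() if key[factor_index] == level)
    claims = sum(c for key, (_, c) in CELLS.items() if key[factor_index] == level)
    return claims / exposure


def cell_frequency(age_band: str, area: str) -> float:
    """Claim frequency within a single (age_band, area) cell."""
    exposure, claims = CELLS[(age_band, area)]
    return claims / exposure


if __name__ == "__main__":
    # One-way view: young looks worse than old.
    print(one_way_frequency(0, "young") / one_way_frequency(0, "old"))
    # Within an area, age makes no difference.
    print(cell_frequency("young", "urban") / cell_frequency("old", "urban"))
```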

Question 5

Q

GLM

Answer

A

Generalisation of the normal model for multiple linear regression
Can be used to model the behaviour of a random variable believed to depend on the values of several characteristics, e.g. age, gender, chronic conditions
Produces estimates of the true relativities by taking account of correlations and allowing investigation of any interactions between the variables in the model

Overcomes issues with the normal model for multiple linear regression
- allows the response variable to follow any distribution from the exponential family
- introduces a link function, which removes the assumption that the effects of different explanatory variables must simply be added together

Exponential family of distributions
- properties: the distribution is completely specified by its mean and variance; the variance is a function of the mean
- examples: normal, Poisson, Gamma, Binomial, inverse Gaussian, exponential, Tweedie
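As a rough illustration of the variance being a function of the mean, several of these distributions have a power variance function, V(mu) = mu^p, scaled by a dispersion parameter. The sketch below uses the standard powers (normal 0, Poisson 1, gamma 2, inverse Gaussian 3); the function and dictionary names are hypothetical. The binomial is not of this form (its variance function is mu(1 - mu)), and the Tweedie case used for pure premium takes a power between 1 and 2.

```python
# Power p in the variance function V(mu) = mu ** p for some exponential-family members.
VARIANCE_POWER = {
    "normal": 0.0,            # constant variance
    "poisson": 1.0,           # variance equal to the mean (dispersion 1)
    "gamma": 2.0,             # variance proportional to the mean squared
    "inverse_gaussian": 3.0,  # variance proportional to the mean cubed
}


def variance(mu: float, power: float, dispersion: float = 1.0) -> float:
    """Variance implied by mean mu: dispersion * mu ** power."""
    return dispersion * mu ** power


if __name__ == "__main__":
    for name, power in VARIANCE_POWER.items():
        # Doubling the mean changes the variance by a factor of 2 ** power.
        print(name, variance(2.0, power) / variance(1.0, power))
```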

Tweedie distribution
- has a point mass at zero, which suits the pure premium distribution: a large spike at zero (policies with no claims) and a wide range of amounts for policies that have had claims

Link function
- must be differentiable and monotonic (strictly increasing or decreasing)
- log-link function results in a model where the effects of different rating factors are multiplied together
- logit link function, ln(mu/(1-mu)), is used for binary responses such as mortality
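Why the log link gives a multiplicative model can be written out in a line or two. The notation here is an assumption for illustration: two explanatory variables x_1 and x_2, coefficients beta_0, beta_1, beta_2, linear predictor eta, and link function g.

```latex
\begin{aligned}
g(\mu_i) &= \eta_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} \\
g = \ln \;\Rightarrow\; \mu_i &= \exp(\eta_i)
  = e^{\beta_0} \cdot e^{\beta_1 x_{i1}} \cdot e^{\beta_2 x_{i2}}
\end{aligned}
```

Each factor contributes its own multiplier to the base level e^{beta_0}, which is how rating factor relativities are usually applied.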

Question 6

Q

Normal model for multiple linear regression

Answer

A

Assumes the response variable has a normal distribution
Assumes constant variance, which might not be appropriate
Adds together the effects of different explanatory variables, which is often not what is observed
Becomes long-winded with more than two explanatory variables

Question 7

Q

Type of GLM suited to model mortality

Answer

A

Logistic regression model
Logistic models handle binary outcomes (0 or 1) well, and mortality is a binary outcome (dead or alive)
Link function would be the logit function, ln(mu/(1-mu))
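A minimal sketch of the logit link and its inverse. The intercept and age coefficient below are hypothetical values chosen only to show how a linear predictor is mapped back to a mortality probability between 0 and 1.

```python
import math


def logit(mu: float) -> float:
    """Logit link: log-odds of a probability mu in (0, 1)."""
    return math.log(mu / (1.0 - mu))


def inverse_logit(eta: float) -> float:
    """Inverse link: maps any real linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical fitted coefficients, for illustration only.
INTERCEPT = -5.0
AGE_COEFFICIENT = 0.05  # change in log-odds of death per year of age


def mortality_probability(age: float) -> float:
    """Predicted probability of death for a given age under the toy model."""
    return inverse_logit(INTERCEPT + AGE_COEFFICIENT * age)


if __name__ == "__main__":
    for age in (40, 60, 80):
        print(age, round(mortality_probability(age), 4))
```

Under this link, each extra year of age multiplies the odds of death (not the probability) by exp(0.05).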

Question 8

Q

Analysis of significance of explanatory variables (in explaining response)

Answer

A

Check which variables are statistically significant using the p-values produced by the software
Calculate the AIC, F-tests, chi-squared tests, or some other comparative measure
Plot odds ratios with confidence intervals; those whose intervals exclude 1 are statistically significant
Test whether the change in deviance between nested models is significant
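Two of these checks can be sketched with the standard library alone. The assumptions are: a coefficient estimate and standard error are already available from a fitted logistic model, the 95% interval uses the normal approximation (z = 1.96), and the nested models differ by a single parameter with a known dispersion, so that the drop in deviance is approximately chi-squared with 1 degree of freedom. The function names are hypothetical.

```python
import math


def odds_ratio_interval(beta: float, std_error: float, z: float = 1.96):
    """Odds ratio with an approximate confidence interval from a logistic coefficient."""
    return (math.exp(beta),
            math.exp(beta - z * std_error),
            math.exp(beta + z * std_error))


def interval_excludes_one(lower: float, upper: float) -> bool:
    """An odds ratio is significant at this level if its interval does not contain 1."""
    return not (lower <= 1.0 <= upper)


def deviance_p_value_1df(deviance_reduced: float, deviance_full: float) -> float:
    """P-value for the drop in deviance when one parameter is added.

    Uses the identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    drop = deviance_reduced - deviance_full
    if drop < 0:
        raise ValueError("the full model should not have a larger deviance")
    return math.erfc(math.sqrt(drop / 2.0))


if __name__ == "__main__":
    # Made-up inputs for illustration only.
    ratio, lower, upper = odds_ratio_interval(beta=0.4, std_error=0.1)
    print(ratio, lower, upper, interval_excludes_one(lower, upper))
    print(deviance_p_value_1df(deviance_reduced=1250.0, deviance_full=1240.0))
```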

(8 cards)