univariate and multivariate Flashcards

1
Q

3 main approaches for univariate

A
  1. Pure premium approach (PPA): calc PP for each level of RV, then divide by base level or overall PP to get indicated relativities
  2. Loss ratio approach (LRA): calc LR for each level of RV, then divide by base level or overall LR to get indicated change factors to current relativities
  3. Adjusted pure premium approach (APPA): adjust exposures by the weighted average current relativity from other RVs, then apply PPA using the adjusted exposures
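The PPA arithmetic can be sketched with made-up territory data (all figures below are hypothetical):

```python
# Pure premium approach: indicated relativity for each level equals its
# pure premium divided by the base level's pure premium.
# Territory exposures and losses are invented for illustration.

territories = {
    "T1": {"exposures": 1000, "losses": 200_000},  # base level
    "T2": {"exposures": 500,  "losses": 150_000},
}
base = "T1"

pure_premium = {t: d["losses"] / d["exposures"] for t, d in territories.items()}
indicated_rel = {t: pp / pure_premium[base] for t, pp in pure_premium.items()}

print(indicated_rel)  # {'T1': 1.0, 'T2': 1.5}
```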
2
Q

main distortion in PPA

A

assumes no correlation between exposures across different RVs (i.e. it suffers from distributional bias)

3
Q

distributional bias

A

if the exposures of one RV's levels are correlated with the exposures of another RV's levels, a univariate approach will double count the experience of those levels

4
Q

why is LRA improvement over PPA

A

because it attempts to correct for distributional bias -> it uses premium instead of exposures, and premium reflects the higher rates charged within a class as a result of correlation with higher-rated levels of other RVs
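The LRA mechanics can be sketched numerically (premiums, losses, and relativities below are hypothetical):

```python
# Loss ratio approach: each level's loss ratio divided by the base level's
# gives an indicated change factor to apply to the current relativity.
# All inputs are invented for illustration.

levels = {
    "T1": {"premium": 250_000, "losses": 150_000, "current_rel": 1.00},  # base
    "T2": {"premium": 180_000, "losses": 126_000, "current_rel": 1.20},
}
base = "T1"

loss_ratio = {k: d["losses"] / d["premium"] for k, d in levels.items()}
change_factor = {k: lr / loss_ratio[base] for k, lr in loss_ratio.items()}
indicated_rel = {k: levels[k]["current_rel"] * change_factor[k] for k in levels}

print(round(indicated_rel["T2"], 4))  # 1.4
```

Because it starts from premium at current rates, the T2 result already reflects the higher premium T2 risks generate through correlation with other rated variables.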

5
Q

problem with APPA

A

calculating weighted average relativities can be cumbersome in a rating plan with many variables

6
Q

adjustments before applying univariate approaches

A
  • large events and anomalies: large losses and CAT should be removed and possibly replaced with some sort of longer-term loading
  • one-time changes: class data should be adjusted for all past one-time changes
  • continuous changes: trending of prem and loss is often ignored since common assumption=all classes are trending at same rate
  • development: often ignored since common assumption=all classes are developing at the same rate
  • expenses and profit: assumed to not vary by class, so analysis is often done using reported loss and sometimes ALAE; if FEs are material and a separate expense fee is not used in the rating algorithm, then relativities should be adjusted for FEs
  • credibility: individual classes have less data and are less credible, so credibility weighting becomes more important
7
Q

credibility with PPA

A

credibility is applied to the indicated relativity to total (complement is the normalized current relativity = current relativity / total current relativity)
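A one-line numeric sketch (Z and all relativities below are made-up values):

```python
# Credibility weighting under the PPA: the complement of credibility is the
# normalized current relativity. All inputs are hypothetical.

Z = 0.40                   # credibility assigned to the class's own experience
indicated_to_total = 1.30  # indicated relativity to the total pure premium
current_rel = 1.10
total_current_rel = 1.05   # e.g. exposure-weighted average current relativity

complement = current_rel / total_current_rel   # normalized current relativity
credible_rel = Z * indicated_to_total + (1 - Z) * complement

print(round(credible_rel, 4))  # 1.1486
```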

8
Q

credibility with LRA

A

credibility is applied to the relativity change factor (complement is no change, i.e. a change factor of 1)

9
Q

problem with univariate approaches

A
  • univariate does not properly account for impact of correlated variables
  • many variables in insurance are correlated
10
Q

multivariate analysis incorporates

A

impact of multiple variables simultaneously

11
Q

why have GLMs grown in popularity

A

as a result of increased computing power, better data availability, and competitive pressure to avoid adverse selection

12
Q

benefits of GLMs

A

  • properly adjust for exposure correlations btw RVs
  • attempt to focus on the signal and ignore the noise in the data
  • provide statistical diagnostics like CIs
  • allow for consideration of interactions btw RVs

13
Q

exposure correlation

A

the relationship between the exposures of 2 or more RVs

14
Q

response correlation

A

when effect of 1 variable varies based on levels of another variable

15
Q

minimum bias procedure

A
  • iterative univariate methods that properly adjust for exposure correlation
  • set total reported loss & ALAE for a group equal to the total premium you would obtain from that group with the indicated relativities, e.g. for territory 1:

total loss for territory 1 = current base rate × t₁ × Σᵢ(exposureᵢ × cᵢ)

  • start with seed values for the class relativities (based on a univariate analysis), solve for the territory relativities, then re-solve the class relativities, and repeat until convergence
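The iteration can be sketched for a 2-territory × 2-class multiplicative plan (exposures, losses, base rate, and seed relativities are all invented):

```python
# Two-way minimum bias iteration: alternately solve the balance equations
#   sum_j L[i][j] = B * t[i] * sum_j e[i][j] * c[j]   (territories)
#   sum_i L[i][j] = B * c[j] * sum_i e[i][j] * t[i]   (classes)
# All figures below are hypothetical.

e = [[100.0, 200.0],            # exposures: rows = territories, cols = classes
     [300.0, 400.0]]
L = [[30_000.0,  80_000.0],     # reported loss & ALAE
     [60_000.0, 120_000.0]]
B = 250.0                       # current base rate

c = [1.0, 1.5]                  # seed class relativities (e.g. from univariate)
for _ in range(100):            # iterate until (practically) converged
    t = [sum(L[i]) / (B * sum(e[i][j] * c[j] for j in range(2)))
         for i in range(2)]
    c = [sum(L[i][j] for i in range(2)) /
         (B * sum(e[i][j] * t[i] for i in range(2)))
         for j in range(2)]
```

At convergence, each territory's and each class's total loss is reproduced by the premium implied by the indicated relativities, which is what distinguishes this from a one-way calculation.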

16
Q

problem with minimum bias procedure

A

they do not provide a way to test whether variables are statistically significant, and they are computationally inefficient

17
Q

sequential analysis

A
  • a sequence of APPA analyses performed on different variables
  • the only class ratemaking approach allowed for CA personal auto
  • start with a standard univariate analysis to obtain relativities for a single variable, then perform APPA for a second variable, and so on
18
Q

sequential analysis problems

A
  • does not deal with exposure correlation
  • no closed-form solution, so results change based on the order of variables chosen
19
Q

GLMs seek to express

A

a formulaic relationship between multiple predictor variables and a response variable such as PP, freq, or severity

  • the link fct is often the log link function, which makes the RV effects multiplicative
  • GLMs are generally not run on LRs, which avoids the need to OLP at a granular level
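The multiplicative effect of the log link can be shown with a tiny numeric example (the coefficients are made up):

```python
# With a log link the linear predictor is additive, so exponentiating turns
# the fitted value into a product of relativities. Coefficients are invented.
import math

b_intercept = math.log(100.0)  # base pure premium of 100
b_territory = math.log(1.2)    # territory relativity 1.2
b_class     = math.log(0.9)    # class relativity 0.9

fitted = math.exp(b_intercept + b_territory + b_class)
print(round(fitted, 6))  # 108.0
```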
20
Q

important steps of GLM

A

  • compile a dataset with enough data for modeling
  • select a link fct
  • specify the distribution of the underlying random process
  • use maximum likelihood to estimate the model parameters

21
Q

if noticeable difference between one-way and GLM results

A

suggests RV is correlated with other RVs and one-way analysis is not fully accounting for correlation

22
Q

if GLM gives counterintuitive results like higher deductible has higher relativity/costs more

A

could be a symptom of limited data @ that level

do not implement the result

23
Q

if 2 levels’ CIs both contain 1

A

suggests there is no significant difference between the impact of those levels

24
Q

common diagnostics for GLMS

A
  • running model on separate consecutive time periods of data to see if estimated parameters are consistent over time
  • building model on 1 subset and then comparing predictions with actual results on second subset of data (holdout sample) -> can see if over or under fitting
25
Q

Data mining techniques

A
  • factor analysis: reduce the number of variables needed in a classification ratemaking analysis
  • cluster analysis: combine similar risks into groups
  • CART: build if-then rules for use in classification
  • MARS: helps turn continuous variables into categorical
  • neural networks: training algorithms are given set of data and identify patterns -> help identify previously unknown interactions between variables