C.3. Multivariate Classification Flashcards
3 reasons GLMs have grown in popularity
- Increased computing power
- Better data availability
- Competitive pressure
Benefits of multivariate methods (particularly GLMs)
- Properly adjust for exposure correlations
- Focus on signal and ignore noise
- Provide statistical diagnostics (CIs)
- Allow for consideration of interactions between rating variables
Adv/disadv of minimum bias procedures
A: Properly adjust for exposure correlations
D: Do not provide statistical diagnostics to test whether variables are significant; also computationally inefficient (require iterative calculation)
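A minimal sketch of Bailey's iterative minimum bias procedure for a hypothetical two-way multiplicative plan (the exposure and loss arrays are made up); the alternating balance-equation updates illustrate why the method is iterative and offers no built-in significance tests.

```python
import numpy as np

# Hypothetical 2-way example: rows = age group, columns = territory.
losses = np.array([[120.0,  80.0, 100.0],
                   [200.0, 150.0, 180.0]])
exposures = np.array([[100.0,  90.0,  95.0],
                      [110.0, 100.0, 105.0]])

loss_cost = losses / exposures          # observed pure premiums r_ij
base = loss_cost.mean()                 # overall base rate

# Bailey's multiplicative minimum bias: alternate between the two sets
# of factors until the balance equations converge.
row_factor = np.ones(loss_cost.shape[0])
col_factor = np.ones(loss_cost.shape[1])

for _ in range(100):
    # Solve for row factors holding column factors fixed:
    #   x_i = sum_j(w_ij * r_ij) / (base * sum_j(w_ij * y_j))
    new_row = (exposures * loss_cost).sum(axis=1) / (base * exposures @ col_factor)
    # Solve for column factors holding the updated row factors fixed.
    new_col = (exposures * loss_cost).sum(axis=0) / (base * exposures.T @ new_row)
    if np.allclose(new_row, row_factor) and np.allclose(new_col, col_factor):
        row_factor, col_factor = new_row, new_col
        break
    row_factor, col_factor = new_row, new_col

print("Row (age) factors:      ", np.round(row_factor, 3))
print("Column (territory) factors:", np.round(col_factor, 3))
```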
Sequential analysis
A series of univariate analyses performed in sequence, with each step adjusting for the variables already analyzed; an improvement over purely univariate methods, but the results depend on the order in which variables are analyzed and the approach still does not fully correct for exposure correlations
Important steps in solving GLMs
Compiling a dataset with enough data for modeling, selecting a link function, specifying the distribution of the underlying random process, and using maximum likelihood to estimate the model parameters
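A minimal sketch of those steps using Python's statsmodels on simulated policy data (all names and values are hypothetical): specify a log link and a Poisson error distribution, then fit the coefficients by maximum likelihood.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical policy-level dataset (in practice, compiled from
# policy, claim, and exposure records).
rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "age_group": rng.choice(["youth", "adult", "senior"], n),
    "territory": rng.choice(["A", "B"], n),
    "exposure": rng.uniform(0.5, 1.0, n),
})
df["claim_count"] = rng.poisson(0.1 * df["exposure"])

# Specify the link function (log) and the distribution of the random
# component (Poisson); statsmodels estimates the parameters by
# maximum likelihood.
model = smf.glm(
    "claim_count ~ C(age_group) + C(territory)",
    data=df,
    family=sm.families.Poisson(link=sm.families.links.Log()),
    offset=np.log(df["exposure"]),
)
result = model.fit()
print(result.summary())
```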
Why GLMs are usually run on frequency and severity instead of loss ratios
- No need to on-level premiums at a granular level
- A priori expectations exist for frequency and severity patterns, but not for loss ratio patterns
- No standard distribution for modeling loss ratios
- Loss ratio models become obsolete when rates are changed
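A complementary sketch of the severity side (hypothetical claim-level data): severity is commonly modeled with a Gamma distribution and log link, and the indicated pure premium relativities would be the product of the frequency and severity relativities.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical claim-level dataset for severity modeling
# (one row per closed claim).
rng = np.random.default_rng(1)
m = 2000
claims = pd.DataFrame({
    "age_group": rng.choice(["youth", "adult", "senior"], m),
    "territory": rng.choice(["A", "B"], m),
})
claims["claim_amount"] = rng.gamma(shape=2.0, scale=1500.0, size=m)

# Gamma error distribution with a log link for claim severity.
sev_model = smf.glm(
    "claim_amount ~ C(age_group) + C(territory)",
    data=claims,
    family=sm.families.Gamma(link=sm.families.links.Log()),
).fit()

# With a log link, exponentiated coefficients are multiplicative relativities.
print(np.exp(sev_model.params))
```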
Common GLM diagnostic tests
Examining confidence intervals (CIs) around parameter estimates
Statistical tests (e.g., chi-square tests, F-tests) comparing candidate models
Running model on separate consecutive time periods of data to see if parameters are consistent over time
Building model on a subset of historical data and comparing prediction with actual
Judgment (e.g., whether the parameter estimates and indicated patterns make sense)
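A minimal sketch of some of these diagnostics with statsmodels on a hypothetical frequency dataset: confidence intervals from conf_int(), a likelihood ratio (chi-square) test of a candidate variable, and a holdout comparison of predicted vs. actual.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from scipy import stats

# Hypothetical frequency dataset (same shape as in the earlier sketch).
rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "age_group": rng.choice(["youth", "adult", "senior"], n),
    "territory": rng.choice(["A", "B"], n),
    "exposure": rng.uniform(0.5, 1.0, n),
})
df["claim_count"] = rng.poisson(0.1 * df["exposure"])

# Build the model on a subset of the data and hold out the rest.
train = df.sample(frac=0.7, random_state=0)
holdout = df.drop(train.index)

fam = sm.families.Poisson(link=sm.families.links.Log())
full = smf.glm("claim_count ~ C(age_group) + C(territory)", data=train,
               family=fam, offset=np.log(train["exposure"])).fit()
reduced = smf.glm("claim_count ~ C(age_group)", data=train,
                  family=fam, offset=np.log(train["exposure"])).fit()

# 1. Confidence intervals around the parameter estimates.
print(full.conf_int())

# 2. Likelihood ratio (chi-square) test: is territory statistically significant?
lr_stat = 2 * (full.llf - reduced.llf)
p_value = stats.chi2.sf(lr_stat, df=full.df_model - reduced.df_model)
print(f"LR statistic = {lr_stat:.2f}, p-value = {p_value:.4f}")

# 3. Compare predicted vs. actual claim counts on the holdout data.
pred = full.predict(holdout, offset=np.log(holdout["exposure"]))
print("Holdout actual:", int(holdout["claim_count"].sum()),
      "predicted:", round(pred.sum(), 1))
```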
Actuaries’ role in GLMs
Obtaining reliable data (GIGO)
Exploring anomalous results in GLM with additional analysis
Considering model results from statistical and business perspective
Developing appropriate methods to communicate model results based on company’s ratemaking objectives
Common types of external data used in GLMs
Geo-demographic information
Weather data
Property characteristics
Information about insured individuals or businesses (e.g., credit scores)
Data mining techniques
- Factor analysis: reduces the number of variables needed
- Cluster analysis: combines similar risks into groups
- CART (classification and regression trees): produces if-then rules
- MARS (multivariate adaptive regression splines): turns continuous variables into categorical variables
- Neural networks: training algorithms to identify patterns
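An illustrative sketch of one of these techniques (CART) using scikit-learn on hypothetical policy data; a shallow tree prints directly as human-readable if-then rules.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical policy data: predict whether a policy has a claim.
rng = np.random.default_rng(0)
n = 1000
X = pd.DataFrame({
    "driver_age": rng.integers(18, 80, n),
    "vehicle_age": rng.integers(0, 20, n),
})
# Simulated target: younger drivers and older vehicles claim more often.
p = 1 / (1 + np.exp(0.08 * (X["driver_age"] - 30) - 0.05 * X["vehicle_age"]))
y = rng.binomial(1, p)

# A shallow CART yields a small set of if-then rules.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```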