W&M Ch 10 Flashcards
Briefly describe the shortcoming of univariate approaches
They do not accurately take into account the effect of other rating variables
Identify the circumstances that led to the adoption of multivariate techniques
- Computing power
- Data warehouse initiatives
- Competitive pressure
Identify the benefits of multivariate methods
- Adjust for exposure correlations
- Allow for nature of random process
- Provide diagnostics
- Allow interaction variables
- Considered transparent
Briefly describe four reasons an actuary may prefer to model on Loss Cost Data rather than Loss Ratio
- Modeling loss ratios requires premiums at current rate level (CRL), which can be difficult to obtain at a granular level
- Experienced actuaries have an a priori expectation of frequency and severity patterns
- -in contrast, loss ratio patterns dependent on current rates
- -actuary can better distinguish signal from noise
- Loss ratio models become obsolete when rates and rating structures are changed
- No commonly accepted distribution for modeling loss ratios
You are modeling driver age for personal automobile bodily injury. The results of a univariate analysis and a multivariate analysis are significantly different. Explain.
Disparity suggests age is strongly correlated with another variable in model
- e.g. prior accident experience, use of auto
- univariate results are distorted
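The distortion described above can be sketched numerically. The example below is a minimal illustration, not a GLM fit: the 2x2 book, the exposures, and the "true" relativities are all invented, and the multivariate step uses a simple multiplicative minimum-bias style iteration as a stand-in for a multivariate model. Because young drivers are assumed to be concentrated among insureds with prior accidents, the one-way age relativity absorbs part of the prior-accident effect.

```python
# Hypothetical 2x2 book of business; every figure below is invented.
BASE = 100.0                               # assumed base pure premium
TRUE_AGE = {"young": 1.2, "old": 1.0}      # assumed true age relativities
TRUE_PRIOR = {"yes": 2.0, "no": 1.0}       # assumed true prior-accident relativities

# Exposures are deliberately correlated: young drivers mostly have prior accidents.
EXPOSURE = {
    ("young", "yes"): 800.0, ("young", "no"): 200.0,
    ("old", "yes"): 200.0, ("old", "no"): 800.0,
}

# Noise-free losses built from the assumed multiplicative structure.
LOSSES = {
    (a, p): EXPOSURE[(a, p)] * BASE * TRUE_AGE[a] * TRUE_PRIOR[p]
    for (a, p) in EXPOSURE
}


def univariate_age_relativity():
    """One-way pure premium for young divided by one-way pure premium for old."""
    pure_premium = {}
    for a in TRUE_AGE:
        loss = sum(LOSSES[(a, p)] for p in TRUE_PRIOR)
        expo = sum(EXPOSURE[(a, p)] for p in TRUE_PRIOR)
        pure_premium[a] = loss / expo
    return pure_premium["young"] / pure_premium["old"]


def multivariate_age_relativity(iterations=50):
    """Solve for age and prior-accident factors together, each adjusted for the other."""
    age = {a: 1.0 for a in TRUE_AGE}
    prior = {p: 1.0 for p in TRUE_PRIOR}
    for _ in range(iterations):
        for a in age:
            age[a] = (sum(LOSSES[(a, p)] for p in prior)
                      / sum(EXPOSURE[(a, p)] * prior[p] for p in prior))
        for p in prior:
            prior[p] = (sum(LOSSES[(a, p)] for a in age)
                        / sum(EXPOSURE[(a, p)] * age[a] for a in age))
    return age["young"] / age["old"]


print(round(univariate_age_relativity(), 3))    # about 1.8 (distorted)
print(round(multivariate_age_relativity(), 3))  # about 1.2 (the assumed true value)
```

With uncorrelated exposures (for example 500 in every cell) the two approaches would agree, which is why a large gap points to correlation with another variable.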
Briefly describe the benefits of statistical diagnostics with GLMs
Aid modeler in understanding certainty of results and appropriateness of model. Some can help determine if predictive variable has a systematic effect on insurance losses and others assess modeler’s assumptions around the link function and error term.
Briefly describe four statistical diagnostics used with GLMs
- Standard errors
- -narrow standard errors suggest variable is statistically significant
- -wide standard errors, often straddling a relativity of 1.0, suggest the factor is detecting mostly noise and should be eliminated from the model
- Deviance tests
- -measure how much fitted values differ from observations
- -deviance of models compared to assess whether the additional variables in a broader model are worth keeping
- Consistency over time
- -compare results from individual years
- -gauge consistency of results from one year to the next
- Validation
- -one option is to compare the model's expected outcome with historical results on a hold-out sample of data
- -considerable differences between actual and expected may indicate model is over or under-fitting
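As a rough illustration of the deviance test listed above, the sketch below compares a narrow claim-frequency model (one overall frequency) with a broader one (a frequency per territory) using the Poisson deviance. The claim counts, exposures, and territory labels are invented, the fitted values are simple group averages rather than output from a GLM package, and the comparison against a chi-square critical value assumes nested models with a known scale parameter.

```python
import math


def poisson_deviance(actual, fitted):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)), with y * ln(y / mu) = 0 when y = 0."""
    total = 0.0
    for y, mu in zip(actual, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total


# Hypothetical data: six rating cells in two territories.
exposures = [100, 200, 150, 100, 200, 150]
claims = [5, 12, 9, 12, 21, 18]
territory = ["A", "A", "A", "B", "B", "B"]

# Narrow model: a single overall claim frequency.
overall_freq = sum(claims) / sum(exposures)
fitted_narrow = [overall_freq * e for e in exposures]

# Broader model: one claim frequency per territory (one extra parameter).
freq_by_territory = {}
for t in sorted(set(territory)):
    t_claims = sum(c for c, tt in zip(claims, territory) if tt == t)
    t_expo = sum(e for e, tt in zip(exposures, territory) if tt == t)
    freq_by_territory[t] = t_claims / t_expo
fitted_broad = [freq_by_territory[t] * e for t, e in zip(territory, exposures)]

dev_narrow = poisson_deviance(claims, fitted_narrow)
dev_broad = poisson_deviance(claims, fitted_broad)
deviance_drop = dev_narrow - dev_broad

# 95th percentile of a chi-square distribution with 1 degree of freedom.
CHI2_95_1DF = 3.841
keep_territory = deviance_drop > CHI2_95_1DF

print(round(dev_narrow, 2), round(dev_broad, 2), keep_territory)
```

A drop in deviance larger than the critical value suggests the additional variable is worth keeping. A small drop would suggest the broader model adds parameters without adding much explanatory power.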
Briefly describe over-fitting a model
Over-fitting results when the model includes variables that reflect noise, or is over-specified (e.g. with high-order polynomials)
- Replicates historical data well but doesn’t project future reliably
- -future experience unlikely to have same noise
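The pattern above (good fit to history, poor projection) can be sketched with a toy hold-out comparison. Everything here is hypothetical: the claim counts are invented, and the "over-fit" model is simply one parameter per tiny cell rather than a high-order polynomial. The point is only that the extra parameters chase noise in the training data.

```python
# Hypothetical claim counts per policy, split into training and hold-out data.
train = {"c1": [0, 0, 3], "c2": [1, 0, 0], "c3": [0, 2, 0], "c4": [0, 0, 0]}
holdout = {"c1": [1, 0, 0], "c2": [0, 2, 0], "c3": [0, 0, 0], "c4": [1, 0, 1]}


def mean(values):
    return sum(values) / len(values)


# Over-specified model: a separate parameter (the training mean) for every cell.
cell_pred = {cell: mean(obs) for cell, obs in train.items()}

# Simple model: a single parameter (the overall training mean).
overall = mean([y for obs in train.values() for y in obs])
simple_pred = {cell: overall for cell in train}


def mse(data, pred):
    """Mean squared error of per-cell predictions against observed counts."""
    errors = [(y - pred[cell]) ** 2 for cell, obs in data.items() for y in obs]
    return mean(errors)


train_cell, train_simple = mse(train, cell_pred), mse(train, simple_pred)
hold_cell, hold_simple = mse(holdout, cell_pred), mse(holdout, simple_pred)

print("train:   cell-level", round(train_cell, 3), "simple", round(train_simple, 3))
print("holdout: cell-level", round(hold_cell, 3), "simple", round(hold_simple, 3))
```

On this invented data the cell-level model has the lower error on the training set but the higher error on the hold-out set, because the hold-out experience does not repeat the same noise.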
Briefly describe under-fitting a model
Under-fitting a model is omitting statistically significant variables
- Model doesn’t have enough explanatory power
Briefly describe seven important areas that the actuary needs to consider when using GLMs
- Ensuring data is adequate for level of detail of the classification ratemaking analysis
- -avoiding GIGO principle (Garbage In, Garbage Out)
- Developing appropriate methods to communicate model results
- -considering company’s ratemaking objectives
- Commercial considerations
- -IT constraints
- -marketing objectives
- -regulatory requirements
- Identifying when anomalous results dictate additional exploratory analysis
- Reviewing model results in consideration of both statistical theory and business application
- Retrieval of data requires careful consideration
- -volume of data
- -definition of homogeneous claim types
- -method of organization (e.g. policy vs accident year)
- -treatment of midterm policy changes
- -large losses
- -underwriting changes during experience period
- -effect of inflation and loss development
- Always must balance stability and responsiveness
- -choice of experience period and geographies
Briefly describe four actions the actuary should take to successfully use GLMs in the ratemaking analysis
- Have solid background in company’s data warehouses
- Develop some understanding of statistical methods and diagnostics
- Work collaboratively with other professionals who know portfolio of business
- Communicate effectively with stakeholders of company to ensure the technical results are expressed in relation to company’s business objectives
Briefly describe four ways data mining techniques can be used to enhance a ratemaking analysis
- Shorten long list of potential explanatory variables to use in GLM
- Provide guidance in how to categorize discrete variables
- Reduce dimension of multi-level discrete variables
- Identifying candidates for interaction variables within GLMs by detecting patterns of interdependency between variables
Identify five data mining techniques and briefly describe their use to enhance the underlying classification analysis
- Cluster Analysis
- -seeks to combine small groups of similar risks into larger homogeneous categories
- -targets minimizing differences within a category and maximizing differences between categories
- CART (Classification and Regression Trees)
- -develop tree-building algorithms to determine a set of if-then logical conditions
- -help improve classification and detect interactions between variables
- -helps identify strongest list of initial variables and how to categorize each
- Factor Analysis
- -reduce number of parameter estimates in classification analysis
- -may reduce number of variables or levels within a variable
- MARS (Multivariate Adaptive Regression Splines)
- -multiple piecewise linear regression where each breakpoint defines region for a particular linear regression equation
- -use to select breakpoints for categorizing continuous variables
- Neural Networks
- -user gathers test data and invokes training algorithms designed to automatically learn structure of the data
- -results of neural networks can be fed into GLM
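To make the tree-based idea concrete, the sketch below finds a single if-then split on a continuous variable, which is the basic step a regression tree repeats. It is a simplified illustration under stated assumptions, not the CART algorithm in full: the driver ages and loss costs are invented, `best_split` is a hypothetical helper name, and the criterion is the plain within-group sum of squared errors.

```python
def sse(values):
    """Sum of squared deviations from the group mean."""
    if not values:
        return 0.0
    avg = sum(values) / len(values)
    return sum((v - avg) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the split 'x < threshold' with the lowest SSE."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical driver ages and observed loss costs.
ages = [18, 20, 22, 25, 30, 40, 50, 60]
loss_costs = [300.0, 280.0, 290.0, 270.0, 150.0, 140.0, 160.0, 150.0]

threshold, total_sse = best_split(ages, loss_costs)
print(threshold, total_sse)  # 27.5 700.0
```

On this data the rule reads "if age < 27.5 then high loss cost, else low loss cost", which is the kind of breakpoint that could then guide how a continuous variable is categorized before it goes into a GLM.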
Identify and give an example of each of four types of external data sources used to supplement company data to be used with GLMs
- Geo-demographics
- -e.g. population density of an area, average length of home ownership in an area
- Weather
- -e.g. average rainfall, number of days below freezing in an area
- Property characteristics
- -e.g. square footage of home or business, quality of fire department in area
- Information about insured individuals or business
- -e.g. credit info, occupation