Modelling and GLMs Flashcards

1
Q

What are the shortfalls of the formula approach to pricing?

A

Does not allow for:
* The timing of cashflows
* The separation of premium and claim-related cashflows
* Variation in assumption over time
* The accumulation of reserves
* Capital needs
* The effect of net negative cashflows

2
Q

What are the different types of business a model can model?

A
  • New business model: expected cashflows projected for new business.
  • Existing business model: expected cashflows projected for existing business.
  • Full model office: expected cashflows projected for new and existing business.
  • Single profit test model: expected profit flows projected for a single policy from the date of issue.
3
Q

When is a new business model useful?

A
  • Sales Projections: Estimate the number of new policies expected to be sold in each region over the next five years.
  • Revenue and Cost Projections: Project the aggregate cash inflows (premiums from new policies) and outflows (claims, marketing costs, administration costs).
  • Capital Requirements: Determine the additional capital needed to support the growth in new business, considering regulatory capital requirements.
  • Return on Capital: Calculate the expected return on capital from new sales to ensure it meets the company’s target thresholds.
  • Goodwill Calculation: Estimate the present value of future profits from new business, contributing to the company’s overall appraisal value.
4
Q

What is a model office useful for?

A
  • Comprehensive Financial Projection: Integrate the cash flows and profit projections from both existing and new business portfolios over a long-term horizon.
  • Strategic Decision Analysis: Assess the impact of a strategic decision, such as a merger, on future financial performance, including synergies in cost reduction and revenue enhancement.
  • Embedded Value and Solvency: Evaluate the combined embedded value and solvency position (e.g. post-merger).
  • Scenario Testing: Run different scenarios to understand the implications of various strategic decisions (e.g., changes in underwriting practices, expansion into new markets).
  • Regulatory and Capital Impacts: Ensure the entity (e.g. a merged entity) meets all regulatory requirements and assess the impact on capital adequacy.
5
Q

What is an existing business model useful for?

A
  • In-Force Business Valuation: Calculate the embedded value, which is the present value of future profits from the existing policies.
  • Solvency Testing: Assess the company’s ability to meet its obligations under various stress scenarios (e.g., high claims due to an epidemic).
  • Surplus Analysis: Analyze the surplus generated by the existing business, identifying sources of profit or loss (e.g., lower claims than expected).
  • Reserve Adequacy: Ensure that the reserves held for future claims are adequate based on the latest experience.
6
Q

What is a single profit test model useful for?

A
  • Assumptions: Setting assumptions about future claims, mortality, morbidity, expenses, lapses, and investment returns specific to the target demographic.
  • Cash Flow Projection: Estimating the cash inflows (premiums) and outflows (claims, administrative expenses) over the policy term.
  • Profit Testing: Calculating the net present value (NPV) of future cash flows to ensure the premium covers the expected costs and provides a target profit margin.
  • Sensitivity Analysis: Testing the sensitivity of profits to changes in key assumptions (e.g., higher than expected claims) to ensure robustness.
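The profit test and sensitivity steps above can be sketched as follows. This is a minimal illustration only: the cashflow figures, the 10% risk discount rate, the end-of-year timing and the function names are all assumptions made for the example, and decrements such as lapses and mortality are ignored.

```python
def present_value(cashflows, rate):
    """Discount end-of-year cashflows at a flat annual rate."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows, start=1))


def profit_test(premiums, claims, expenses, rate, claims_scale=1.0):
    """Return (NPV of net cashflows, NPV as a proportion of PV premiums).

    claims_scale multiplies every claim amount, which gives a simple
    sensitivity test on the claims assumption.
    """
    net = [p - claims_scale * c - e for p, c, e in zip(premiums, claims, expenses)]
    npv = present_value(net, rate)
    margin = npv / present_value(premiums, rate)
    return npv, margin


# Hypothetical single-policy cashflows over a three-year term.
premiums = [1000.0, 1000.0, 1000.0]
claims = [600.0, 650.0, 700.0]
expenses = [300.0, 50.0, 50.0]
risk_discount_rate = 0.10  # assumed

base_npv, base_margin = profit_test(premiums, claims, expenses, risk_discount_rate)
stressed_npv, stressed_margin = profit_test(
    premiums, claims, expenses, risk_discount_rate, claims_scale=1.10
)

print(f"Base NPV: {base_npv:.2f}, margin: {base_margin:.2%}")
print(f"NPV with claims +10%: {stressed_npv:.2f}, margin: {stressed_margin:.2%}")
```

If the stressed margin falls below the target, the premium or the margins in the assumptions would need revisiting.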
7
Q

What are the general model requirements?

A

Story: Naomi Campbell was undocumented in the UK. So she got a concession because her nose profile was so perfect, and was well represented in her modelling. She has other features too, like incredibly long armpit hairs, which she put to work by braiding. Her braiders would smoke big joints and they were all verified on Instagram and Twitter. Naomi would make them do super intricate and complex braids, which could still be refined if need be, e.g. on the red carpet.

  • The model being used must be valid, rigorous and adequately documented
  • The model chosen should be capable of reflecting the risk profile of the financial products being modelled
  • The parameters used must allow for all features of the business being modelled
  • The workings of the model should be easy to appreciate and communicate
  • The model should exhibit sensible joint behaviour of model variables
  • The outputs from the model should be capable of independent verification for reasonableness and should be communicable
  • The model must not be overly complex so that either the results become difficult to interpret / communicate or the model becomes too long or expensive to run
  • The model should be capable of subsequent development and refinement
  • The inputs to the parameter values should be appropriate to the business being modelled
  • A range of methods of implementation should be available to facilitate testing, parameterization and focus of results
8
Q

What are the requirements of a health insurance model?

A

Naomi’s daughter needed to count all the cash she made for spitting flows at Victoria’s Secret. This included all cash she was going to give to her daughter who had been a reserve the whole netball season. She wasn’t very good at interacting except with her friend AL, who also understood the economic conditions of Naomi’s new flow business and claim to fame. One day the two tried to blow up dynamite but didn’t put enough margarine on their hands, so they had all this solvent on their hands and couldn’t solve why the dynamite didn’t go off. Then they saw some smoke and got super stoked but it was just a simulation.

  • Allow for all the cashflows that may arise (for LTC, project cashflows in different states separately)
  • Allow for the cashflows arising from any supervisory requirement to hold reserves
  • Ensure that adequate margin of solvency is maintained
  • Cashflows need to allow for any interactions, particularly where the assets and the liabilities are being modelled together
  • Should be dynamic
  • Think about economic conditions correlated with new business volumes, renewal experience and claims experience
  • The ability to use stochastic models and simulation needs to be allowed for, where appropriate, e.g. to simulate the possible distribution of claims outgo
9
Q

What are the different types of sensitivities?

A
  • Sensitivity to the choice of model points or the parameter values.
  • Sensitivity testing when pricing
    • Used to assess what margins need to be incorporated into the parameter values
    • If profitability is overly sensitive to a factor -> redesign / include a greater margin
  • Sensitivity testing when reserving
    • If reserving assumptions need to be prudent, sensitivity analysis can be used to determine these margins
    • Can also be used to assess the need and extent of any additional risk margins, global reserves or capital requirements to cover future potential adverse experience
  • Sensitivity testing when assessing return on capital / profitability
    • Will enable the actuary to quantify the effect of departures from the chosen parameter values when presenting the results of the model to the company.
    • Where a probability distribution can be assigned to a parameter, it may be possible to derive analytically the variance of the profit or return on capital.
    • A sensitivity / scenario analysis can then be carried out, e.g. at chosen confidence levels of the distribution.
10
Q

What are GLMs useful for?

A
  • GLMs are a generalised form of linear regression
  • Useful for determining the relationship between a response variable and set of explanatory variables
  • Useful for pricing, financial projections and overall modelling of the business.
11
Q

What are the different types of variables in a GLM?

A
  • Explanatory variables are expected to influence the response variable (can be categorical like gender or non-categorical like age)
  • The response variable is the output from the model.
  • An interaction term is included when the response variable is better modelled by allowing for interaction between explanatory variables
12
Q

What are the benefits of GLM over one-way analysis?

A
  • Handles risk cells with small volumes well, as it uses the full data
  • More stable transitions between levels of risk
  • Gives control over interactions considered between variables
  • Can easily assess different combinations of explanatory variables
  • Accounts for the effects of other explanatory factors in calculation of effect sizes
13
Q

What are the pitfalls of GLM over one-way analysis?

A
  • If influential points affect coefficients, the impact spreads beyond the single cell that the influential point lies in
  • Potential for model error if not specified correctly
  • Requires some statistical understanding to be able to use
  • Might not capture correct shapes of relationships
  • One-way views may call out areas of concern that might not otherwise be detected if just looking at GLM outputs:
    Example: In a one-way analysis, you might notice an unusually high average claim amount for a specific risk cell (e.g., females aged 40-45 with a specific pre-existing condition). This observation could prompt further investigation into the underlying reasons for the high claim amount. Such insights might be missed if solely relying on the overall GLM outputs, which provide a higher-level view of the relationships between risk factors and the response variable.
14
Q

What are the assumptions of classic linear models?

A
  • The error terms are independent and come from a normal distribution
  • The mean is a linear combination of the explanatory variables
  • The error terms have a constant variance (homoscedasticity)
15
Q

What are some drawbacks of classic linear regression?

A
  • Assumes that the response variable, Y, has a normal distribution, which may not be appropriate for the variable being modelled
    • Claims tend to be positively skewed + normal distribution can take on negative values
  • The normal distribution has a constant variance, which may not be appropriate for the variable being modelled
    • Variance of claim numbers tends to increase as the expected value increases → Poisson distribution has this property
  • Adds together the effects of the different explanatory variables, but this is seldom what is observed in practice
    • e.g. effects of “age” and “family size” may be multiplicative
  • With more than two explanatory variables, a manual solution becomes increasingly long-winded
16
Q

How do GLMs address drawbacks of classic linear regression?

A
  • They generalise the normal model for multiple linear regression in two main ways
    • The response variable can take any distribution from the exponential family - no longer has to take a normal distribution
    • The link function is introduced - this removes the assumption that the effects of different variables must simply be added together. The link function can accommodate multiplicative effects of explanatory variables by transforming the mean onto a scale where the effects are linear.
  • The variance of the response variable is a function of its mean and can increase with the mean (e.g. under a Poisson distribution), which is usually the case when modelling claims.
17
Q

Why is multivariate modelling useful for modelling health insurance claims?

A
  • Multivariate modelling is necessary when modelling multiple factors that are likely to be related or correlated to a certain extent.
  • If these interdependencies are not taken into account, there is a risk of over- or understating claims. Examples of factors used in health insurance claims that are likely to be correlated:
    • Age and family size
    • Age and chronic conditions
    • Maternity and gender
18
Q

How can GLM be used to estimate claims experience after an acquisition?

A
  • Can use an existing GLM built off of claims data if it exists; alternatively, a GLM will need to be built:
  • Collect claims data for policies based on historical claims cost per benefit type and per benefit option.
  • Need to ensure the credibility of data per cell.
  • Claims will need to be determined on a per life per month (PLPM) basis per benefit option.
  • This is done by taking the total claims per cell and dividing by the exposure in that cell.
  • If using historical claims data, may need to adjust claims cost for inflation, run-off and any benefit changes.
  • Identify the risk factors that need to be accounted for (usually the ones that influence claims behaviour) such as age, gender, chronic status.
  • The choice of GLM and link function needs to be reasonable and appropriate.
  • A gamma model may be a good option for claim amounts.
  • Running the GLM will give a set of factors per risk factor (age, gender, chronic status), benefit amount type and benefit option type
  • These factors can then be used to score StarMed members based on their risk factors and the option they have been mapped to
  • This will yield claims estimates for StarMed members if they were on Target Health’s benefit options
  • These costs can be compared to the existing Target Health policies on that benefit option as well as compared to the expected premium income to determine the extent of a surplus / deficit
  • For accuracy, sensitivity testing should be done, as well as a variety of scenarios determined.
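The per life per month (PLPM) step above, total claims per cell divided by exposure with an adjustment for inflation, can be sketched as below. The cell definitions, claim totals, exposures and the 8% inflation uplift are hypothetical figures chosen only to illustrate the arithmetic.

```python
# Hypothetical historical data per risk cell (age band, chronic status).
cells = {
    ("18-39", "non-chronic"): {"claims": 1_200_000.0, "life_months": 6_000},
    ("18-39", "chronic"): {"claims": 900_000.0, "life_months": 1_500},
    ("40-64", "non-chronic"): {"claims": 2_400_000.0, "life_months": 8_000},
    ("40-64", "chronic"): {"claims": 3_000_000.0, "life_months": 3_000},
}

inflation_uplift = 1.08  # assumed adjustment from the data period to the projection period


def plpm(total_claims, life_months, uplift=1.0):
    """Claims cost per life per month for one cell, after an inflation uplift."""
    if life_months <= 0:
        raise ValueError("exposure must be positive")
    return total_claims * uplift / life_months


plpm_by_cell = {
    cell: plpm(data["claims"], data["life_months"], inflation_uplift)
    for cell, data in cells.items()
}

for cell, value in plpm_by_cell.items():
    print(cell, round(value, 2))
```

Cells with little exposure would produce unstable PLPM figures, which is why the credibility of data per cell needs checking before these values feed a GLM.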
19
Q

How can a GLM be tested for appropriateness?

A
  • The hat matrix is one of the outputs of the model-fitting process. The diagonal entries of the hat matrix are called leverages → measure the influence that each observed value has on the fitted value for that observation. Data points with high leverages or residuals may distort the outcome and accuracy of a model.
  • Deviance residuals are a measure of the distance between the actual observation and the fitted value (the deviance corrects for the skewness of the distributions). Any large deviations indicate that the distributional assumptions are being violated.
  • The Standardised Pearson residual is the difference between the observed response and the predicted value, adjusted for the standard deviation of the predicted value and the leverage of the observed response.
  • A residual plot (plotting residuals against fitted values of the response variable) for an appropriate model should produce residuals that are symmetrical about the x-axis, average zero and have a fairly constant spread across the range of fitted values.
  • Cook’s distance may also be used as a measure of the influence of a data point on the model results. Points with a Cook’s distance of 1 or more are considered to merit closer examination in the model analysis.
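Leverage and Cook's distance can be illustrated with a small sketch. For simplicity it assumes a normal model with identity link (ordinary least squares), where the hat matrix is H = X (XᵀX)⁻¹ Xᵀ; a GLM uses a weighted version of the same idea. The data are made up, with the last point deliberately placed far from the others.

```python
import numpy as np

# Made-up data: the last observation sits far from the rest in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 25.0])

X = np.column_stack([np.ones_like(x), x])  # intercept plus one covariate
n, p = X.shape

# Hat matrix maps observed values to fitted values: y_hat = H @ y.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

residuals = y - H @ y
s2 = residuals @ residuals / (n - p)  # residual variance estimate

# Cook's distance combines residual size and leverage for each point.
cooks = residuals**2 / (p * s2) * leverage / (1 - leverage) ** 2

for i in range(n):
    print(f"obs {i}: leverage={leverage[i]:.3f}, Cook's D={cooks[i]:.3f}")
```

The leverages always sum to the number of fitted parameters, so a point whose leverage is a large share of that total deserves a closer look.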
20
Q

How can the significance of explanatory variables in a GLM be tested?

A
  • Check which variables are statistically significant by comparing the p-values reported by the modelling software against a chosen significance threshold
  • The chi-squared test
  • The F statistic
  • The Akaike Information Criterion (AIC)
  • Comparisons over time
    • Analysis of claims frequency by factor by year will indicate whether claims frequencies have been stable over time.
    • Particularly important for pricing work.
    • If not consistent, the model average will be inappropriate for future periods.
  • Consistency checks with other factors
    • E.g. an explanatory variable such as age would be expected to show the same pattern regardless of geographical region. (Fitting a model that includes an interaction between age and geographical region and plotting a graph of results would highlight regions where the effect of age was different from the others.) Important for multi-distribution channels.
    • A random factor could be created in the data, as a means to check the consistency of the effect on the response of a particular explanatory variable. Each of the randomly allocated groups would be expected to behave in a similar way to each of the other groups and to the whole data.
21
Q

How can a GLM be refined after it has been developed?

A
  • Interactions can be included in a model. These may be complete or marginal.
  • An offset term is not estimated by the model but is treated as a fixed value. It allows a known varying quantity (e.g. exposure when modelling claim counts) to be reflected without estimating a coefficient for it. The fitted relativities for the other factors adjust to compensate.
  • Aliasing occurs within GLMs. Intrinsic aliasing occurs because of dependencies inherent within the definition of the covariates. This is dealt with by modelling software. Extrinsic aliasing occurs when two or more factors contain levels that are perfectly correlated. “Near aliasing” occurs when this correlation is almost, but not quite, perfect.
  • A GLM can be improved by smoothing the parameter values. This can be achieved by grouping levels of factors.
  • There may be restriction on the use of GLMs in practice. These may be legal or commercial. An offset term can be used to adjust for restricted factors.
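To make the offset idea concrete, here is a minimal sketch of a Poisson GLM with log link fitted by iteratively reweighted least squares, where log(exposure) enters the linear predictor as an offset with no coefficient. The two-cell dataset and the function name are assumptions for illustration; in practice a statistics package would do the fitting.

```python
import numpy as np


def fit_poisson_log_link(X, y, exposure, n_iter=25):
    """Fit log(E[y]) = log(exposure) + X @ beta by IRLS.

    log(exposure) is the offset: it is added to the linear predictor
    as a fixed term rather than being estimated.
    """
    offset = np.log(exposure)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = offset + X @ beta
        mu = np.exp(eta)
        z = (eta - offset) + (y - mu) / mu  # working response, net of the offset
        XtW = X.T * mu                      # Poisson working weights equal mu
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta


# Hypothetical data: two risk cells with different exposures.
X = np.array([[1.0, 0.0],   # base cell
              [1.0, 1.0]])  # cell with the extra risk factor
claim_counts = np.array([10.0, 40.0])
exposure = np.array([100.0, 200.0])

beta = fit_poisson_log_link(X, claim_counts, exposure)
print("base claim frequency:", np.exp(beta[0]))
print("relativity for the risk factor:", np.exp(beta[1]))
```

Without the offset, the second cell would look four times as risky as the first; with it, the fitted relativity reflects claims per unit of exposure.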
22
Q

How can you use a GLM to determine if a new risk factor is appropriate?

A
  • Collect claims data for policies based on historical claims cost per benefit type and benefit option.
  • May need to incorporate external data if insufficient internal data.
  • To assess the distribution of claim amounts, use Gamma GLM → suitable when response variable is continuous and positively skewed
  • To assess distribution of claim incidence, use Poisson GLM → suitable for when response variable is a count of events over a period of time.
  • The link function might be log-link.
  • The different risk factors may include: age, gender, policy type and new risk factor.
  • Look at the coefficients for the risk factors: if they are large in magnitude and statistically significant, they will have a big effect on claim amount (or claim incidence)
  • Can compare different risk factor combinations, to understand the risk associated with certain policy types or certain customers.
  • The GLM can be used to predict claim incidence rates (Poisson) and claim amounts (gamma)
23
Q

What is the link function used for in a GLM?

A
  • It removes assumptions that effects of different explanatory variables must just be added together.
  • The link function must be differentiable so that we can estimate coefficients of variables using MLE
  • Link function must be monotonic (either always increasing or always decreasing), so that the relationship between the mean of the response variable and the linear predictor is consistent and predictable.
  • Typically: log, logit and identity function.
  • Log function used in pricing because effects of rating factors are just multiplied together.
    • Model can capture interaction effects between factors.
    • Coefficients of risk factors are easy to interpret, as each represents the effect of that risk factor on the response variable.
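A short sketch of why the log link gives multiplicative rating factors: coefficients add on the log scale, so exponentiating turns the sum into a product of relativities. The coefficient values and factor names below are hypothetical.

```python
import math

# Hypothetical fitted coefficients on the log scale.
coefficients = {
    "intercept": math.log(200.0),   # base claim cost of 200
    "age_60_plus": math.log(1.8),   # relativity of 1.8
    "chronic": math.log(2.5),       # relativity of 2.5
}


def predicted_mean(age_60_plus, chronic):
    """Mean response under a log link: exp(linear predictor)."""
    eta = (
        coefficients["intercept"]
        + age_60_plus * coefficients["age_60_plus"]
        + chronic * coefficients["chronic"]
    )
    return math.exp(eta)


base = predicted_mean(0, 0)
older_chronic = predicted_mean(1, 1)

# Same answer as multiplying the base cost by each relativity.
product_of_relativities = 200.0 * 1.8 * 2.5

print(base, older_chronic, product_of_relativities)
```

With an identity link the same coefficients would be added to the base cost instead, which is the additive structure the log link avoids.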
24
Q

What are deterministic models?

A
  • Each parameter is assigned a single value
  • The model produces a single point estimate result
  • Sensitivity analysis done by varying the value of the parameter
  • Examples: chain-ladder method, formula approach to pricing PMI.
  • It cannot model complex products e.g. LTC
  • Results are very sensitive to changes in parameters so high risk of inaccurate results.
  • Parameters cannot account for variation in claims e.g. due to high unexpected claim amounts
25
Q

Why are stochastic models useful for health insurance policies?

A
  • Future incidence experience is difficult to predict
  • It is difficult to predict potential benefit amount
  • Cashflows are very uncertain and volatile, so it is best to project the distribution of possible future outcomes rather than a single point estimate.
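As an illustration of projecting a distribution of outcomes, the sketch below simulates annual claims outgo with Poisson claim counts and gamma claim amounts, then compares the deterministic point estimate with a high percentile. All parameter values, the seed and the choice of the 99.5th percentile are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

n_sims = 10_000
claim_rate = 50.0                     # assumed expected number of claims per year
sev_shape, sev_scale = 2.0, 500.0     # assumed gamma severity, mean 1,000

# Simulate the number of claims, then the total of that many claim amounts.
claim_counts = rng.poisson(claim_rate, size=n_sims)
totals = np.array(
    [rng.gamma(sev_shape, sev_scale, size=n).sum() for n in claim_counts]
)

point_estimate = claim_rate * sev_shape * sev_scale  # deterministic expected outgo
simulated_mean = totals.mean()
percentile_995 = np.percentile(totals, 99.5)

print(f"Deterministic point estimate: {point_estimate:,.0f}")
print(f"Simulated mean:               {simulated_mean:,.0f}")
print(f"99.5th percentile:            {percentile_995:,.0f}")
```

The gap between the point estimate and the upper percentile is the information a deterministic model cannot provide.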
26
Q

When do we use stochastic modelling?

A
  • When assessing the impact of guarantees
  • When the variable of interest does have a reasonably stable and predictable probability distribution (e.g. investment returns in a developed country under stable economic and political conditions)
  • For indicating the effect of year-on-year volatility on risk (random fluctuations)
  • For identifying potentially high-risk future scenarios (e.g. by tracing back the sequence of events that led to your worst simulated outcomes)
27
Q

What are the drawbacks of stochastic models vs. deterministic models?

A
  • Time and computing constraints - may be done on a very simplified version of the model
  • The sensitivity of the results to the assumed values of the parameters involved
  • So may lead to spurious accuracy - the great complexity of the model may give the appearance of highly precise answers, but little confidence in the parameter values means little confidence in the answers
28
Q

What can be done to mitigate the sensitivity of models to parameter values?

A
  • Regularly review and update the assumptions based on the latest available data and experience.
  • Conduct sensitivity analyses to assess the impact of changes in key parameters on the premium rates and financial results.
  • Use appropriate margins or loadings to account for uncertainty in the assumptions and provide a buffer against adverse experience.
  • Supplement deterministic models with other techniques, such as scenario testing or stochastic modeling, to gain a more comprehensive understanding of the risks and potential outcomes.
29
Q

Tweedie distribution

A

Part of the exponential family
Can directly model claims incurred data because it has a point mass at zero
A lot of claims incurred data contain zeros because the policyholder didn’t claim during that period
If the power parameter p is between 1 and 2, the Tweedie distribution is a compound Poisson-Gamma distribution, which suits claims data because the Poisson part models claim frequency and the Gamma part models claim severity.
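For 1 < p < 2, a Tweedie distribution with mean μ, dispersion φ and variance φμ^p can be rewritten as a Poisson number of gamma-distributed claims. The sketch below applies the standard mapping and shows the resulting probability of a zero; the input values and the function name are hypothetical.

```python
import math


def tweedie_to_poisson_gamma(mu, phi, p):
    """Map Tweedie (mu, phi, p) with 1 < p < 2 to compound Poisson-gamma parameters.

    Returns (claim frequency lambda, gamma shape, gamma scale).
    """
    if not 1.0 < p < 2.0:
        raise ValueError("the compound Poisson-gamma form needs 1 < p < 2")
    lam = mu ** (2 - p) / (phi * (2 - p))
    shape = (2 - p) / (p - 1)
    scale = phi * (p - 1) * mu ** (p - 1)
    return lam, shape, scale


# Hypothetical parameters for claims incurred per policy.
mu, phi, p = 300.0, 150.0, 1.5
lam, shape, scale = tweedie_to_poisson_gamma(mu, phi, p)

prob_zero = math.exp(-lam)                     # point mass at zero: no claims at all
mean_check = lam * shape * scale               # equals mu
var_check = lam * shape * (shape + 1) * scale**2  # equals phi * mu**p

print(f"lambda={lam:.4f}, gamma shape={shape:.2f}, gamma scale={scale:.2f}")
print(f"P(no claims)={prob_zero:.3f}")
```

The point mass at zero is simply the Poisson probability of no claims, which is why a single Tweedie model can handle data with many zero-claim policies.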