CS2 - Part 3 Flashcards
General formula for Cox proportional hazard (PH) model

Ratio of hazards of lives with covariate vectors z1 and z2 (Cox PH model)
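The card backs for the Cox model are blank in this export. As a hedged reminder, the usual textbook form is given below; the row-vector notation (beta times the transpose of z) is an assumption about the course's convention, not something the deck states.

```latex
% General Cox PH model: baseline hazard times an exponential covariate effect
\lambda(t; z_i) = \lambda_0(t)\,\exp\!\left(\beta z_i^{T}\right)

% Ratio of hazards for two lives: the baseline hazard cancels,
% so the ratio does not depend on t
\frac{\lambda(t; z_1)}{\lambda(t; z_2)} = \exp\!\left(\beta\,(z_1 - z_2)^{T}\right)
```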

Proportional hazards model: Likelihood estimator for beta vector

Aims of graduation
- Produce a smooth set of rates that is suitable for a particular purpose
- Remove random sampling errors
- Use the information available from adjacent ages
Desirable features of graduation
- Smoothness
- Adherence to data
- Suitability for the purpose in hand
Degrees of freedom for chi-squared test
- Start with the number of groups
- If the groups form a set of mutually exclusive and exhaustive categories (probabilities add up to 1), subtract 1
- Subtract a further 1 for each parameter that has been estimated
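The three-step rule above can be sketched as a small helper. The function name and arguments are illustrative, not part of the course material.

```python
def chi2_degrees_of_freedom(n_groups, exhaustive=False, n_estimated_params=0):
    """Apply the rule: start with the groups, then subtract as required."""
    df = n_groups                 # start with the number of groups
    if exhaustive:                # probabilities add up to 1
        df -= 1
    df -= n_estimated_params      # one more for each estimated parameter
    return df


# 6 mutually exclusive and exhaustive categories, 1 fitted parameter
print(chi2_degrees_of_freedom(6, exhaustive=True, n_estimated_params=1))  # 4
```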
Distributions of D_x and mu~x

Mortality experience: Deviation

Mortality experience: Standardised deviation

Degrees of freedom when comparing an experience with a standard table
Degrees of freedom = number of age groups
Chi-squared failures: Standardised deviations test
To detect a few large deviations that the chi-squared test did not detect
Check whether the standardised deviations follow the standard normal distribution, using a chi-squared test
Chi-squared failures: Signs test
To detect imbalance between negative and positive deviations
Binomial distribution
N number of negative deviations:
Check that 2*P(N <= x) > 5%
P number of positive deviations:
Check that 2*P(P >= x) > 5%
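A minimal sketch of the two-sided check above, assuming the number of positive deviations is Binomial(m, 0.5) under the null hypothesis, where m is the number of ages. The function name is hypothetical.

```python
from math import comb


def signs_test_p_value(n_positive, n_total):
    """Two-sided p-value for the signs test with P ~ Binomial(n_total, 0.5)."""
    def cdf(k):
        # P(P <= k); an empty sum gives 0 when k is negative
        return sum(comb(n_total, i) for i in range(k + 1)) / 2 ** n_total

    lower = cdf(n_positive)             # P(P <= observed)
    upper = 1.0 - cdf(n_positive - 1)   # P(P >= observed)
    return min(1.0, 2 * min(lower, upper))


# 20 ages with only 5 positive deviations
p = signs_test_p_value(5, 20)
print(round(p, 3))  # about 0.041, below 5%, so the imbalance is significant
```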
Chi-squared failures: Cumulative deviations

Chi-squared failures: Grouping of signs test
Detects ‘clumping’ of deviations with the same sign.
Check ‘Grouping of signs test’ in tables.
If the number of groups (runs) of positive (or negative) deviations is less than or equal to the critical value from the tables, we reject the null hypothesis.
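As a sketch of what the tables encode: the code below counts groups of positive deviations and computes P(G <= g) exactly. The probability formula used, P(G = t) = C(n1-1, t-1) C(n2+1, t) / C(n1+n2, n1) with n1 positive and n2 negative signs, is the standard result when all arrangements of signs are equally likely; it is an assumption here, since the card itself only points to the tables. Zero deviations are treated as not positive.

```python
from math import comb


def count_positive_groups(deviations):
    """Count runs of consecutive positive deviations."""
    groups = 0
    previous_positive = False
    for d in deviations:
        positive = d > 0
        if positive and not previous_positive:
            groups += 1               # a new positive run starts here
        previous_positive = positive
    return groups


def prob_groups_at_most(g, n_pos, n_neg):
    """P(G <= g) for n_pos positive and n_neg negative signs (n_pos >= 1)."""
    if n_pos < 1:
        raise ValueError("need at least one positive deviation")
    total = comb(n_pos + n_neg, n_pos)
    favourable = sum(comb(n_pos - 1, t - 1) * comb(n_neg + 1, t)
                     for t in range(1, g + 1))
    return favourable / total


devs = [1.2, 0.4, -0.3, -1.1, 0.8, -0.2]
print(count_positive_groups(devs))       # 2
print(prob_groups_at_most(1, 3, 3))      # 0.2
```

A small value of P(G <= g) for the observed g indicates too few groups, i.e. clumping.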
Testing smoothness of graduation
Third difference (change in curvature) of the graduated quantities should
- Be small in magnitude compared with the quantities themselves
- Progress regularly
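Third differences are obtained by differencing the graduated values three times. A minimal sketch, with an illustrative function name and a cubic sequence used only because its third differences are easy to verify by hand:

```python
def third_differences(values):
    """Difference the sequence three times."""
    diffs = list(values)
    for _ in range(3):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return diffs


# x^3 for x = 0..5 has constant third differences
print(third_differences([0, 1, 8, 27, 64, 125]))  # [6, 6, 6]
```

For graduated rates, the resulting values would then be compared in magnitude with the rates themselves and inspected for a regular progression.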
Methods of graduation
- Graduation by parametric formula
- a1 + a2 exp(a3x + a4x^2+…)
- well-suited to the production of standard tables from large amounts of data
- Graduation by reference to standard table
- (a+bx) mu_x^s
- Can be used to fit relatively small data sets where a suitable standard table exists
- Graduation using spline functions
- Method is suitable for quite small experiences as well as very large experiences.
Mortality projection - Method based on expectation

Autocovariance function

Simplify:


Autocorrelation function

Correlation formula

Autoregressive process of order p
AR(p)

Moving average process of order q
MA(q)

Autoregressive moving average
ARMA(p,q)

Condition for stationarity of AR(p) process

Conditions for invertibility of MA processes
Invertibility: White noise process e can be written explicitly in terms of X process

Moving average model MA(q), in backwards shift notation

ARMA(p,q) process defined in backwards shift notation

Definition of an ARIMA process

Features of MA(q) process

Features of AR(p) process

Features of ARMA(p,q) process

Three possible causes of non-stationarity
- Deterministic trend (e.g. exponential or linear growth)
- Deterministic cycle (e.g. seasonal effect)
- Time series is integrated
Methods for compensating for trend/seasonality (6)
- Least squares trend removal (Tables p.24)
- Differencing
- Differencing d times makes an I(d) series stationary and also removes a linear trend
- Seasonal differencing
- E.g. differencing at lag 12 (X_t - X_{t-12}) for monthly data with annual seasonality
- Method of moving averages
- Create transformation such that transformed time series is moving average of original time series
- Method of seasonal means
- Transformation of the data
- E.g. take log
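Ordinary and seasonal differencing from the list above can both be sketched with one helper that takes the lag as an argument. The function name and the toy series are illustrative only.

```python
def difference(series, lag=1):
    """Return X_t - X_{t-lag} for every t where both terms exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Differencing once removes a linear trend
print(difference([1, 3, 5, 7, 9]))        # [2, 2, 2, 2]

# Differencing at the seasonal lag removes a repeating pattern of period 4
seasonal = [5, 1, 3, 2] * 3
print(difference(seasonal, lag=4))        # [0, 0, 0, 0, 0, 0, 0, 0]
```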
Check if observed time series is stationary
Autocorrelation function should converge to 0 exponentially
Identification of white noise
Option 1:
- Check if values of the SACF or SPACF fall outside the range of
- ±2/sqrt(n), an approximation to ±1.96/sqrt(n)
- Note that even for white noise, each value has about a 1 in 20 chance of falling outside this range (95% confidence interval)
Option 2:
- Portmanteau test (tables p. 42)
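Option 1 can be sketched as follows, assuming the usual sample autocorrelation estimator with divisor n. Function names are hypothetical. The alternating series in the example is deliberately not white noise, so every lag is flagged.

```python
from math import sqrt


def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag (divisor n in both terms)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n / c0
        for k in range(1, max_lag + 1)
    ]


def lags_outside_band(x, max_lag):
    """Lags whose sample autocorrelation lies outside +-2/sqrt(n)."""
    bound = 2 / sqrt(len(x))
    acf = sample_acf(x, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]


x = [1, -1] * 50                  # strongly autocorrelated, n = 100
print(lags_outside_band(x, 2))    # [1, 2]
```

With 20 lags checked, roughly one value outside the band is still consistent with white noise.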
Identification of MA(q)

Identification of AR(p)

Identification of appropriate order of differencing (d) of sample data
- Slowly decaying sample autocorrelation function indicates that the time series needs to be differenced
- Look for smallest sample variance for d=1,2,3,…
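The variance criterion can be sketched by differencing repeatedly and recording the sample variance at each order. The function name is illustrative, and the simulated random walk (an I(1) series) with its seed is an assumption made purely for the example.

```python
import random
from statistics import variance


def variance_by_order(series, max_d=3):
    """Sample variance of the series after differencing d = 0..max_d times."""
    result = {}
    current = list(series)
    for d in range(max_d + 1):
        result[d] = variance(current)
        current = [b - a for a, b in zip(current, current[1:])]
    return result


# Simulated random walk: cumulative sum of standard normal increments
rng = random.Random(0)
walk = [0.0]
for _ in range(500):
    walk.append(walk[-1] + rng.gauss(0, 1))

variances = variance_by_order(walk)
best_d = min(variances, key=variances.get)
print(variances)
print("suggested d:", best_d)
```

Over-differencing tends to increase the variance again, which is why the smallest value is used as a guide.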
Diagnostic checking for fitted ARIMA model

Condition for stationarity of vector autoregressive process

Calculate eigenvalues of matrix A
Values lambda, such that
det (A-lambda*I) = 0
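A sketch for the two-dimensional case, assuming a VAR(1) process X_t = A X_{t-1} + e_t and the usual condition that all eigenvalues of A have modulus strictly less than 1 (the condition card in this deck is blank, so that statement is an assumption here). For a 2x2 matrix, det(A - lambda*I) = 0 is a quadratic in lambda.

```python
import cmath


def eigenvalues_2x2(a):
    """Solve lambda^2 - trace*lambda + det = 0 for a 2x2 matrix."""
    (p, q), (r, s) = a
    trace = p + s
    det = p * s - q * r
    root = cmath.sqrt(trace * trace - 4 * det)   # complex-safe square root
    return (trace + root) / 2, (trace - root) / 2


def is_stationary_var1(a):
    """True if every eigenvalue of A lies strictly inside the unit circle."""
    return all(abs(lam) < 1 for lam in eigenvalues_2x2(a))


A = [[0.5, 0.2], [0.1, 0.4]]          # eigenvalues close to 0.6 and 0.3
print(is_stationary_var1(A))                        # True
print(is_stationary_var1([[1.0, 0.0], [0.0, 0.5]])) # False
```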
Two time series processes X and Y are called cointegrated if:
- X and Y are I(1) random processes
- there exists a non-zero vector (a,b) such that aX+bY is stationary
The vector (a,b) is called cointegration vector.
Moment generating function (formula)

Cumulant generating function

Coefficient of skewness

Kurtosis
- Fourth standardised moment
- kurtosis = 3: mesokurtic (normal distribution)
- kurtosis >3 leptokurtic
- more peaked, fatter tail
- kurtosis <3 platykurtic
- broader peak, more slender tails
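A minimal sketch of kurtosis as the fourth standardised moment, using population (divisor n) moments of a sample. The function name and the toy data sets are illustrative.

```python
def kurtosis(data):
    """Fourth central moment divided by the squared second central moment."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2


print(kurtosis([-1, 1]))                      # 1.0, platykurtic (< 3)
print(kurtosis([-2, 0, 0, 0, 0, 0, 0, 2]))    # 4.0, leptokurtic (> 3)
```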
Standardised moment

Varying volatility over time
Heteroscedasticity
Central limit theorem

Generalized extreme value distribution

GEV distributions: Different values of shape parameter gamma

Rough criteria to choose family of GEV distributions

Distribution of excess above u

kth moment of a continuous positive-valued distribution with density function f(x)

Measures of tail weight

Coefficient of upper tail dependence

Coefficient of lower tail dependence in terms of the copula function

Coefficient of upper tail dependence in terms of the copula function

Fundamental copulas

Graphical representation of independence copula

Graphical representation of comonotonic copula

Graphical representation of counter-monotonic copula

Gumbel copula
- Upper tail dependence determined by parameter alpha
- No lower tail dependence

Clayton copula
- Lower tail dependence determined by alpha
- No upper tail dependence

Frank copula
- Interdependence structure in which there is no upper or lower tail dependence

Gaussian copula

Archimedean copula

Student’s t copula

Tail dependence of all copulas

PDF of the reinsurer’s claim amount under XOL with retention M

Variance, mean and skewness of compound Poisson process with parameter lambda

Coefficient of skewness of compound Poisson distribution

Sum of independent compound Poisson random variables

n choose k

Machine Learning: Confusion matrices

Machine Learning: Hyperparameters
Variables external to the model whose values are set in advance by the user. They are chosen based on the user’s knowledge and experience in order to produce a model that works well.
Machine Learning: Parameters
Variables internal to the model whose values are estimated from the data and are used to calculate predictions using the model.
Machine Learning: Regularisation or penalisation

Branches of Machine Learning

Machine Learning: Stages of analysis

Machine Learning: Data Types

Machine Learning: Train-Validate-Test approach
Split data into
- data for training (60%)
- data for validation (20%)
- data for testing (20%)
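The 60/20/20 split can be sketched with a seeded shuffle so that the partition is reproducible. The function name and default seed are illustrative.

```python
import random


def train_validate_test_split(rows, seed=0, train=0.6, validate=0.2):
    """Shuffle with a fixed seed, then cut into three consecutive parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)     # seeded for reproducibility
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validate)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


train_set, val_set, test_set = train_validate_test_split(range(100))
print(len(train_set), len(val_set), len(test_set))  # 60 20 20
```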
Machine Learning: Requirements for analysis to be reproducible
- Data used should be fully described and available to other researchers
- Any modification to the data should be clearly described
- Selection of the algorithm and the development of the model should be described (including parameters and why they are chosen)
- Ideally would provide computer code used
- Specify seed value
Machine Learning: Penalised general linear models
Maximize penalized likelihood

Machine Learning: Naive Bayes Classification

Machine Learning: Gini index of a final node in a decision tree

Machine Learning: Gini index of a decision tree
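The backs of the two Gini cards are blank in this export. A hedged sketch of the usual definitions: the Gini index of a final node is 1 minus the sum of squared class proportions, and the index of the tree is the average over final nodes weighted by node size. Function names and counts are illustrative.

```python
def gini_node(class_counts):
    """1 - sum of squared class proportions in one final node."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)


def gini_tree(final_nodes):
    """Size-weighted average of the final-node Gini indices."""
    n = sum(sum(node) for node in final_nodes)
    return sum(sum(node) / n * gini_node(node) for node in final_nodes)


nodes = [[8, 2], [5, 5]]            # class counts in two final nodes
print(round(gini_node(nodes[0]), 4))    # 0.32
print(round(gini_tree(nodes), 4))       # 0.41
```

A pure node (all observations in one class) has a Gini index of 0.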

Machine Learning: K-means clustering advantages and disadvantages
