Module 16: Statistical Distributions Flashcards
Binomial distribution description
Bin(n, p)
Models the number of successes in n independent trials where p is the probability of success.
Negative binomial distribution description
NBin(r, p)
- Models the number of trials needed until there have been r successes.
- if r=1, the distribution is known as the geometric distribution
Poisson distribution description
Poi(λ)
- Models the number of independent events occurring in a specified time period
- used as an approximation to the binomial distribution when n is large and p is small (with λ = np)
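The Poisson approximation to the binomial can be checked numerically. The sketch below is illustrative only: the values n = 1000 and p = 0.002 are assumptions chosen for the example, and the two helper functions are hypothetical names, not part of any library.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poi(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Illustrative values (assumptions): many trials, small success probability.
n, p = 1000, 0.002
lam = n * p  # match the means of the two distributions

# Largest difference between the two probability functions for k = 0..10.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
print(f"largest pmf gap for k = 0..10: {max_gap:.6f}")
```

With these values the two sets of probabilities agree to about three decimal places; the agreement worsens as p grows for a fixed n.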
Normal distribution
- mathematically tractable distribution (easy to parameterise and use), useful when little is known about the data
- used as an approximation to the binomial and Poisson distributions when the sample size is large
- used to model the error terms in a random walk
- symmetrical and mesokurtic
Central Limit Theorem
By the Central Limit Theorem, the distribution of the average, X_bar, of a large sample of n iid random variables with finite mean, μ, and finite variance, σ², is approximately normal:
X_bar ~ N(μ, σ²/n) approximately
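A small simulation can illustrate the theorem. This is a sketch under stated assumptions: the underlying variables are taken to be Exponential(1) (mean 1, variance 1), and the sample size n = 50 and the number of repetitions are arbitrary choices for the example.

```python
import random
import statistics

def sample_means(n, n_samples, seed=0):
    """Means of n_samples samples, each made of n iid Exponential(1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(n_samples)
    ]

# Exponential(1) has mean 1 and variance 1, so the theorem suggests
# that X_bar is roughly N(1, 1/n) when n is large.
n = 50
means = sample_means(n, n_samples=5000)
print("mean of X_bar:    ", statistics.fmean(means))     # should be close to 1
print("variance of X_bar:", statistics.variance(means))  # should be close to 1/n = 0.02
```

The exponential distribution is strongly skewed, yet the sample means cluster around μ with variance close to σ²/n.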
2 Tests for normality
- QQ plots
- Jarque-Bera test
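As an illustration of the Jarque-Bera test, the statistic can be computed directly from the sample skewness S and kurtosis K as JB = n/6 × (S² + (K − 3)²/4). The sketch below uses the 1/n convention for the sample moments; the function name and the simulated samples are assumptions made for the example.

```python
import random

def jarque_bera(data):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(data)
    mean = sum(data) / n
    # Central moments using the 1/n (biased) convention.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)

rng = random.Random(1)
normal_sample = [rng.gauss(0, 1) for _ in range(2000)]
skewed_sample = [rng.expovariate(1.0) for _ in range(2000)]

print("JB, normal data:     ", jarque_bera(normal_sample))
print("JB, exponential data:", jarque_bera(skewed_sample))
```

Under normality the statistic is asymptotically chi-squared with 2 degrees of freedom, so a large value (as for the exponential sample) is evidence against normality.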
Generalised Student’s t-distribution
- Used to model symmetric data sets where the tails are fatter than implied by a normal distribution (leptokurtic) - important distribution for modelling risks
- Can be derived as a normal mean-variance mixture distribution
Lognormal distribution
- frequently used to model financial data that takes positive values only, eg asset prices, or insurance claim amounts
- positively skewed
Wald (or inverse Gaussian) distribution
- models the time taken for a random walk with drift to reach a particular level
- positively skewed with useful properties in terms of aggregation
Chi-square distribution
- Used for goodness of fit
- represents the sum of the squares of ν independent standard normal random variables, where ν is the number of degrees of freedom
- positively skewed
Exponential Distribution
- Models the waiting time between events under a Poisson process
- monotonically decreasing, positively skewed, tail decreases exponentially
- inflexible due to single parameter and unlikely to provide a good fit to data
Gamma Distribution
- extension of exponential distribution
- flexible and has useful properties in terms of aggregation
- if X has a gamma distribution then Y = 1/X has an inverse gamma distribution
Generalised Inverse Gamma distribution
- can produce a wide range of shapes; flexible as it has three parameters
Pareto distribution
- used for modelling variables where the probability of an event falls in proportion to the magnitude of the event raised to a power, eg the distribution of wealth or the population of cities
- the tail of the distribution follows the power law
Generalised Pareto Distribution
Flexible distribution used in extreme value theory
Triangular Distribution
Useful when the following limited data is available:
- the minimum value
- the maximum value
- the mode
Multivariate Distribution
A way of modelling several random variables at once
Multivariate Normal Distribution
A column vector random variable, X, has a multivariate normal distribution if X = α + CZ where
- α is a column vector of location parameters (ie means),
- Z is a k-dimensional vector of iid standard normal random variables and
- Σ is the covariance matrix and C is a matrix of constants such that CC’ = Σ
2 Useful tests for testing whether observations are from a multivariate normal distribution
- Mahalanobis distance
- Mardia’s test, based on the Mahalanobis angle
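For reference, the Mahalanobis distance of a point x from a mean μ with covariance matrix Σ is the square root of (x − μ)’ Σ⁻¹ (x − μ). The sketch below handles the two-dimensional case only, with a hand-written 2×2 inverse; the function name and the example values are assumptions for illustration.

```python
def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance of a 2-d point x from mean mu with covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of the 2x2 covariance matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    quad = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    return quad ** 0.5

mu = [0.0, 0.0]
cov = [[4.0, 0.0], [0.0, 1.0]]  # illustrative: standard deviations 2 and 1
print(mahalanobis_2d([2.0, 0.0], mu, cov))  # 1.0 (one standard deviation along axis 1)
print(mahalanobis_2d([0.0, 2.0], mu, cov))  # 2.0 (two standard deviations along axis 2)
```

If the observations really are multivariate normal with this mean and covariance, the squared distances follow a chi-squared distribution with degrees of freedom equal to the dimension, which is what a test based on this distance exploits.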
2 Common approaches for generating correlated multivariate normal random variables
- Cholesky decomposition
- Principal components
Cholesky decomposition
A way of “square-rooting” a matrix.
It is used to derive the matrix C, such that CC’ = Σ.
If a vector, Z, of iid standard normal random variables is generated, then a vector, X, of correlated normal random variables can be generated as X = μ + CZ, where μ is the vector of means.
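The procedure can be sketched in plain Python. This is a minimal illustration, assuming Σ is symmetric and positive definite; the covariance matrix, the means and the function names are made up for the example.

```python
import random

def cholesky(sigma):
    """Lower-triangular C with C C' = sigma (sigma symmetric positive definite)."""
    k = len(sigma)
    c = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(c[i][m] * c[j][m] for m in range(j))
            if i == j:
                c[i][j] = (sigma[i][i] - s) ** 0.5
            else:
                c[i][j] = (sigma[i][j] - s) / c[j][j]
    return c

def correlated_normals(mu, c, rng):
    """One draw of X = mu + C Z, with Z a vector of iid standard normals."""
    k = len(mu)
    z = [rng.gauss(0, 1) for _ in range(k)]
    return [mu[i] + sum(c[i][j] * z[j] for j in range(k)) for i in range(k)]

sigma = [[4.0, 1.2], [1.2, 1.0]]  # illustrative covariance matrix
mu = [1.0, -2.0]                  # illustrative means
c = cholesky(sigma)
print(c)  # approximately [[2, 0], [0.6, 0.8]]

rng = random.Random(42)
draws = [correlated_normals(mu, c, rng) for _ in range(20000)]
```

The sample covariance of the two columns of `draws` should be close to the off-diagonal entry of Σ, here 1.2.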
Principal Component Analysis
a.k.a. eigenvalue decomposition
Provides a way of decomposing the covariance matrix, Σ, as Σ = VΛV’ where Λ is the diagonal matrix of eigenvalues and V is the matrix of corresponding eigenvectors.
Each pair consisting of an eigenvalue and its corresponding eigenvector is called a principal component. These can be derived iteratively.
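One way to derive the components iteratively is power iteration with deflation: find the largest eigenvalue and its eigenvector, subtract that component from the matrix, and repeat. This is a sketch of that particular technique, not necessarily the method intended by the source; it assumes a symmetric covariance matrix with distinct positive eigenvalues and a starting vector that is not orthogonal to the eigenvector being sought. The function names and the example matrix are illustrative.

```python
def power_iteration(matrix, n_iter=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix."""
    k = len(matrix)
    # Assumed starting vector; it must not be orthogonal to the top eigenvector.
    v = [1.0 / (i + 1) for i in range(k)]
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v' M v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(k) for j in range(k))
    return eigenvalue, v

def principal_components(sigma):
    """Eigenpairs in decreasing eigenvalue order, found one at a time."""
    k = len(sigma)
    work = [row[:] for row in sigma]
    components = []
    for _ in range(k):
        lam, v = power_iteration(work)
        components.append((lam, v))
        # Deflate: remove the component just found from the working matrix.
        work = [[work[i][j] - lam * v[i] * v[j] for j in range(k)] for i in range(k)]
    return components

sigma = [[4.0, 1.2], [1.2, 1.0]]  # illustrative covariance matrix
components = principal_components(sigma)
for lam, v in components:
    print(lam, v)
```

Summing λ v v’ over the components recovers Σ, which is the decomposition Σ = VΛV’ written one component at a time.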
List 2 univariate discrete distributions
- binomial and negative binomial distributions
- the Poisson distribution
List 2 univariate continuous distributions taking values from -∞ to +∞, and a variation of each
- the normal distribution
- normal mixture distribution
- Student’s t-distribution
- the skewed t-distribution
List 9 univariate continuous distributions taking only non-negative values
- lognormal distribution
- Wald distribution
- Chi-squared distribution
- gamma and inverse gamma distributions
- generalised inverse gamma distribution
- exponential distribution
- Frechet distribution
- Pareto distribution
- generalised Pareto distribution
Outline what the binomial distribution aims to model
A Bin(n, p) distribution is the sum of n independent and identical Bernoulli(p) trials.
Random variable X ~ Bin(n, p) is the number of successes that occur in the n trials.
The limiting distribution of the binomial distribution as n -> ∞
Outline what the negative binomial distributions (Type 1 and Type 2) aim to model
Type 1: Random variable X is the number of the trial on which the rth success occurs, where r is a positive integer.
Type 2: Let Y be the number of failures before the rth success. Y = X - r, where X is defined as above.
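The relationship Y = X − r can be checked with the two probability functions. The sketch below assumes r is a positive integer in both cases, and the values r = 3 and p = 0.4 are arbitrary; the function names are illustrative.

```python
from math import comb

def nbin_type1_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x (x = r, r+1, ...)."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

def nbin_type2_pmf(y, r, p):
    """P(Y = y): y failures occur before the r-th success (y = 0, 1, ...)."""
    return comb(y + r - 1, y) * p**r * (1 - p) ** y

r, p = 3, 0.4  # illustrative values
for y in range(4):
    x = y + r
    # The two probabilities on each row should agree, since Y = X - r.
    print(x, nbin_type1_pmf(x, r, p), y, nbin_type2_pmf(y, r, p))
```

Setting r = 1 in the Type 1 function gives p(1 − p)^(x − 1), the geometric distribution.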
Outline what the Poisson distribution aims to model
The Poisson distribution models the number of events (eg claims) that occur in a specified interval of time, when the events occur one after another in time in a well-defined manner.
This manner presumes that the events occur singly at a constant rate, and that the numbers of events that occur in separate (ie non-overlapping) time intervals are independent of one another.
These conditions can be described by saying that the events occur “randomly, at a rate of λ per period”.
Such events are said to occur according to a Poisson process.
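A Poisson process can be simulated by drawing exponential waiting times between successive events and counting how many events fall in one period. This is a sketch only; the rate λ = 3 per period and the function name are assumptions for the example.

```python
import random

def count_events(rate, horizon, rng):
    """Number of events in [0, horizon] when gaps between events are Exponential(rate)."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate)  # waiting time until the next event
        if t > horizon:
            return count
        count += 1

rng = random.Random(7)
lam = 3.0  # illustrative rate per period
counts = [count_events(lam, 1.0, rng) for _ in range(20000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(mean, var)  # both should be close to lam for a Poisson count
```

The sample mean and variance of the counts are both close to λ, as expected for a Poi(λ) variable.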
State the location and scaling parameters of the standard normal distribution
The standard normal distribution has a location parameter (and mean) of 0, and a scaling parameter (and standard deviation) of 1.
State why the t-distribution is an important distribution for risk modelling
The kurtosis of the standard t-distribution is greater than that of the normal distribution.
The fact that the t-distribution is leptokurtic (relatively fatter tails) makes this an important distribution for risk modelling.
State what is meant by X having a lognormal distribution
If Y = ln X (the natural log) has a normal distribution, then X is said to have a lognormal distribution.
Outline 2 specific applications of the lognormal distribution, in the context of modelling financial risks
- Since it takes only positive values, and is skewed, it is applicable to many insurance situations, eg claim size.
- It can be used to model financial variables, eg asset prices, under the assumptions that the natural logarithm of the variable follows a random walk with drift, ln Xₜ = μ + ln Xₜ₋₁ + eₜ, and that the returns are iid.
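The random walk with drift for ln Xₜ can be sketched as a short simulation. The error terms are assumed here to be iid N(0, σ²), which is what makes each Xₜ lognormal; the starting value, drift and volatility are arbitrary illustrative numbers and the function name is hypothetical.

```python
import math
import random

def simulate_path(x0, mu, sigma, n_steps, rng):
    """Path of X_t where ln X_t = mu + ln X_{t-1} + e_t, with e_t iid N(0, sigma^2)."""
    log_x = math.log(x0)
    path = [x0]
    for _ in range(n_steps):
        log_x += mu + rng.gauss(0, sigma)  # one step of the random walk with drift
        path.append(math.exp(log_x))       # exponentiate back to the price scale
    return path

rng = random.Random(3)
path = simulate_path(x0=100.0, mu=0.005, sigma=0.04, n_steps=12, rng=rng)
print(path)
```

Every value on the path is positive by construction, and ln(Xₜ/X₀) after t steps is normal with mean tμ and variance tσ² under these assumptions.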
State what the Wald distribution describes in terms of a probability
The Wald distribution describes the time taken for a Brownian motion process with drift to reach a given value.
State what the chi-squared distribution describes in terms of a probability
The chi-squared distribution with γ degrees of freedom is the distribution of the sum of γ squared independent variables taken from a standard normal distribution, and so can be simulated as such.
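The simulation described above can be sketched as follows. The choice of 4 degrees of freedom and the number of draws are arbitrary, and the function name is illustrative.

```python
import random

def chi_squared_draw(dof, rng):
    """One chi-squared draw: the sum of `dof` squared independent N(0, 1) values."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(dof))

rng = random.Random(2)
dof = 4  # illustrative degrees of freedom
draws = [chi_squared_draw(dof, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(mean, var)  # should be close to dof and 2 * dof respectively
```

The sample mean and variance should be close to the theoretical values of γ and 2γ for γ degrees of freedom.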
State what the exponential distribution models
The exponential distribution models the waiting times between the events of a Poisson process.
List the characteristics of the exponential distribution that limit its application to ERM
The exponential distribution’s application is limited by:
- its monotonically-decreasing nature
- its single parameter
- the low probabilities associated with extreme events
State what is meant by X having an inverse-gamma distribution
If Y ~ Gamma, then X = 1/Y ~ InverseGamma
State how the gamma (and inverse-gamma) can be fitted to a sample
Both gamma and inverse-gamma can be fitted by equating sample and population moments and solving for the distribution’s parameters.
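For the gamma case, the method of moments has a closed form. The sketch below assumes the shape/scale parameterisation, in which the mean is shape × scale and the variance is shape × scale²; other texts use a rate parameter instead, so the formulas would need adjusting. The function name and the simulated sample are illustrative, and the inverse-gamma case is not covered.

```python
import random
import statistics

def fit_gamma_moments(data):
    """Method-of-moments estimates (shape, scale) for a gamma distribution."""
    mean = statistics.fmean(data)
    var = statistics.variance(data)
    # Solve mean = shape * scale and var = shape * scale^2.
    shape = mean ** 2 / var
    scale = var / mean
    return shape, scale

rng = random.Random(11)
# Simulated data with true shape 2.5 and scale 4.0 (illustrative values).
sample = [rng.gammavariate(2.5, 4.0) for _ in range(20000)]
print(fit_gamma_moments(sample))
```

The fitted values should be close to the parameters used to generate the sample.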
2 Key features of the Pareto distribution
The Pareto distribution is monotonically decreasing and, like the tails of the t-distribution, follows a power law with the shape parameter (γ) determining the power.
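The power law can be seen in the tail probabilities. The sketch below assumes the simplest (Type I) form of the Pareto distribution, where P(X > x) = (scale/x)^γ for x at or above a minimum value; other parameterisations behave the same way only in the far tail. The function name and the value γ = 2 are illustrative.

```python
def pareto_survival(x, shape, scale=1.0):
    """P(X > x) for a Pareto (Type I) variable with minimum value `scale`."""
    if x < scale:
        return 1.0
    return (scale / x) ** shape

shape_gamma = 2.0  # illustrative shape parameter
for x in (1, 10, 100, 1000):
    print(x, pareto_survival(x, shape_gamma))
```

Each tenfold increase in x divides the tail probability by 10^γ, here by 100, whereas an exponential tail would fall away far faster.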
Uniform distribution
Assigns an equal probability to all outcomes in a range
State the key features of the triangular distribution
The triangular distribution can be used in cases where, in addition to the upper and lower values, the most likely value is known. The distribution has lower limit β₁, mode α, and upper limit β₂
The mean is the average of the parameter values:
μ = ⅓( β₁ + α + β₂)
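The formula for the mean can be checked against simulated values. This is a sketch using the standard library's `random.triangular(low, high, mode)`; the limits 10 and 20 and the mode 12 are arbitrary illustrative numbers, and `triangular_mean` is a hypothetical helper.

```python
import random

def triangular_mean(lower, mode, upper):
    """Mean of a triangular distribution: the average of its three parameters."""
    return (lower + mode + upper) / 3

rng = random.Random(5)
lower, mode, upper = 10.0, 12.0, 20.0  # illustrative values for β₁, α and β₂
# Note the argument order expected by random.triangular: low, high, mode.
draws = [rng.triangular(lower, upper, mode) for _ in range(50000)]
print(triangular_mean(lower, mode, upper))  # 14.0
print(sum(draws) / len(draws))              # should be close to 14
```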
Outline 3 key limitations that mean the multivariate normal distribution is not a good description of reality in many risk management applications
- the tails of the univariate marginal distributions are too thin
- the joint tails do not assign enough weight to joint extreme outcomes
- the distribution has a strong form of symmetry, known as elliptical symmetry.
Define what is meant by a multivariate spherical distribution, and name a specific example
A multivariate spherical distribution is one where the marginal distributions are:
- identical
- symmetric
- uncorrelated with each other (note, however, that lack of correlation does not necessarily imply independence)
Define what is meant by a multivariate elliptical distribution and name a specific example
If the set of points sharing any chosen (fixed) probability density can be described by an elliptical relationship between the variables, then the distribution is said to be elliptical.