Module 16: Statistical Distributions Flashcards

Question 1

Q

Binomial distribution description

Answer

A

Bin(n, p)

Models the number of successes in n independent trials where p is the probability of success.

Question 2

Q

Negative binomial distribution description

Answer

A

NBin(r, p)

Models the number of trials needed until there have been r successes.
if r=1, the distribution is known as the geometric distribution

Question 3

Q

Poisson distribution description

Answer

A

Poi(λ)

Models the number of independent events occurring in a specified time period
used as an approximation to the binomial distribution when n is large and p is small

Question 4

Q

Normal distribution

Answer

A

mathematically tractable distribution (easy to parameterise and use), useful when little is known about the data
used as an approximation to the binomial and Poisson distributions when the sample size is large
used to model the error terms in a random walk
symmetrical and mesokurtic

Question 5

Q

Central Limit Theorem

Answer

A

By the Central Limit Theorem, the average, X_bar, of a large sample of n iid random variables with finite mean, μ, and finite variance, σ², is approximately Normally distributed:

X_bar ~ N(μ, σ²/n) approximately
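The result can be checked by simulation. The sketch below is only illustrative: the choice of Exponential(1) draws (mean 1, variance 1), the sample size n = 50 and the helper name `sample_means` are assumptions, not part of the flashcard.

```python
import random
import statistics

def sample_means(n, num_samples, seed=0):
    """Return num_samples averages, each taken over n iid Exponential(1) draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

# Exponential(1) has mu = 1 and sigma^2 = 1 (illustrative choice).
n = 50
means = sample_means(n, num_samples=5000)
mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)
print("mean of sample means:", mean_of_means)      # should be near mu = 1
print("variance of sample means:", var_of_means)   # should be near sigma^2 / n = 0.02
```

Although the underlying draws are strongly skewed, the averages cluster around μ with a variance close to σ²/n.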

Question 6

Q

2 Tests for normality

Answer

A

QQ plots
Jarque-Bera test
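The flashcard does not state the Jarque-Bera statistic. The sketch below assumes the usual textbook form, JB = n/6 × (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis. Under normality JB is approximately chi-squared with 2 degrees of freedom, so large values count against normality. The function name and sample data are illustrative.

```python
import random

def jarque_bera(data):
    """Jarque-Bera statistic from the (biased) sample skewness and kurtosis."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

rng = random.Random(1)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
skewed_sample = [rng.expovariate(1.0) for _ in range(2000)]
jb_normal = jarque_bera(normal_sample)
jb_skewed = jarque_bera(skewed_sample)
print("JB, normal data:", jb_normal)   # typically small
print("JB, skewed data:", jb_skewed)   # very large for exponential data
```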

Question 7

Q

Generalised Student’s t-distribution

Answer

A

Used to model symmetric data sets where the tails are fatter than implied by a normal distribution (leptokurtic) - important distribution for modelling risks
Can be derived as a normal mean-variance mixture distribution

Question 8

Q

Lognormal distribution

Answer

A

frequently used to model financial data that takes positive values only, eg asset prices, or insurance claim amounts
positively skewed

Question 9

Q

Wald (or inverse Gaussian) distribution

Answer

A

models the time taken for a random walk with drift to reach a particular level
positively skewed with useful properties in terms of aggregation

Question 10

Q

Chi-square distribution

Answer

A

Used for goodness of fit
represents the sum of the squares of v independent standard normal random variables, where v is the number of degrees of freedom
positively skewed

Question 11

Q

Exponential Distribution

Answer

A

Models the waiting time between successive events of a Poisson process
monotonically decreasing, positively skewed, tail decreases exponentially
inflexible due to single parameter and unlikely to provide a good fit to data

Question 12

Q

Gamma Distribution

Answer

A

extension of exponential distribution
flexible and has useful properties in terms of aggregation
if X has a gamma distribution then Y = 1/X has an inverse gamma distribution

Question 13

Q

Generalised Inverse Gamma distribution

Answer

A

can produce a wide range of shapes; flexible because it has three parameters

Question 14

Q

Pareto distribution

Answer

A

used for modelling variables where the probability of an event falls in proportion to the magnitude of the event raised to a power, eg the distribution of wealth or the population of cities
the tail of the distribution follows the power law

Question 15

Q

Generalised Pareto Distribution

Answer

A

Flexible distribution used in extreme value theory

Question 16

Q

Triangular Distribution

Answer

A

Useful when the following limited data is available:

the minimum value
the maximum value
the mode

Question 17

Q

Multivariate Distribution

Answer

A

A way of modelling several random variables at once

Question 18

Q

Multivariate Normal Distribution

Answer

A

A column vector random variable, X, has a multivariate normal distribution if X = α + CZ where

α is a column vector of location parameters (ie means),
Z is a k-dimensional vector of iid standard normal random variables and
Σ is the covariance matrix and C is a matrix of constants such that CC’ = Σ

Question 19

Q

2 Useful tests for testing whether observations are from a multivariate normal distribution

Answer

A

Mahalanobis distance
Mardia’s test, based on the Mahalanobis angle

Question 20

Q

2 Common approaches for generating correlated multivariate normal random variables

Answer

A

Cholesky decomposition
Principal components

Question 21

Q

Cholesky decomposition

Answer

A

A way of “square-rooting” a matrix.

It is used to derive the matrix C, such that CC’ = Σ.

If a vector, Z of iid standard normal random variables is generated, then a vector, X, of correlated normal random variables can be generated as X = μ + CZ where μ is the vector of means.
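A minimal sketch of these steps in plain Python. The covariance matrix, the means and the helper names are illustrative assumptions; the decomposition shown is the standard lower-triangular Cholesky algorithm for a symmetric positive definite matrix.

```python
import math
import random

def cholesky(sigma):
    """Return lower-triangular C with C C' = sigma (sigma symmetric positive definite)."""
    k = len(sigma)
    c = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(c[i][m] * c[j][m] for m in range(j))
            if i == j:
                c[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                c[i][j] = (sigma[i][j] - s) / c[j][j]
    return c

def correlated_normals(mu, c, rng):
    """Draw one vector X = mu + C Z, with Z a vector of iid standard normals."""
    k = len(mu)
    z = [rng.gauss(0.0, 1.0) for _ in range(k)]
    return [mu[i] + sum(c[i][j] * z[j] for j in range(k)) for i in range(k)]

sigma = [[4.0, 1.2], [1.2, 1.0]]   # illustrative covariance matrix
mu = [1.0, -2.0]                   # illustrative vector of means
c = cholesky(sigma)                # [[2.0, 0.0], [0.6, 0.8]] up to rounding
rng = random.Random(42)
draws = [correlated_normals(mu, c, rng) for _ in range(20000)]

# Sample covariance of the two components; should be near sigma[0][1] = 1.2.
n = len(draws)
m0 = sum(d[0] for d in draws) / n
m1 = sum(d[1] for d in draws) / n
cov01 = sum((d[0] - m0) * (d[1] - m1) for d in draws) / (n - 1)
print("sample covariance:", cov01)
```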

Question 22

Q

Principal Component Analysis

Answer

A

a.k.a. eigenvalue decomposition

Provides a way of decomposing the covariance matrix, Σ, as Σ = VΛV’ where Λ is the diagonal matrix of eigenvalues and V is the matrix of corresponding eigenvectors.

Each pair consisting of an eigenvalue and its corresponding eigenvector is called a principal component. These can be derived iteratively.
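One way to derive the components iteratively is power iteration with deflation, sketched below. This is an assumed technique for illustration and is not named on the flashcard. The 2 × 2 covariance matrix and the function names are hypothetical, and the sketch assumes the fixed starting vector is not orthogonal to the eigenvector being sought.

```python
import math

def mat_vec(a, v):
    """Multiply square matrix a (a list of rows) by vector v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def dominant_eigenpair(a, iterations=500):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    # Arbitrary start vector (1, 2, ..., k); it must not be orthogonal to the
    # dominant eigenvector, which this sketch does not check.
    v = [float(i + 1) for i in range(len(a))]
    for _ in range(iterations):
        w = mat_vec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' a v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(a, v)))
    return eigenvalue, v

def principal_components(sigma):
    """Extract (eigenvalue, eigenvector) pairs one at a time, largest first."""
    k = len(sigma)
    a = [row[:] for row in sigma]
    components = []
    for _ in range(k):
        eigenvalue, v = dominant_eigenpair(a)
        components.append((eigenvalue, v))
        # Deflate: subtract the component just found before the next pass.
        a = [[a[i][j] - eigenvalue * v[i] * v[j] for j in range(k)] for i in range(k)]
    return components

sigma = [[2.0, 1.0], [1.0, 2.0]]   # illustrative covariance matrix
components = principal_components(sigma)
# Rebuild sigma as the sum of eigenvalue * v v' over all components (V Λ V').
rebuilt = [[sum(lam * v[i] * v[j] for lam, v in components) for j in range(2)]
           for i in range(2)]
print([lam for lam, _ in components])   # eigenvalues, largest first
```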

Question 23

Q

List 2 univariate discrete distributions

Answer

A

the binomial and negative binomial distributions
the Poisson distribution

Question 24

Q

List 2 univariate continuous distributions taking values from -∞ to + ∞, and a variation of each

Answer

A

the normal distribution
normal mixture distribution
Student’s t-distribution
the skewed t-distribution

Question 25

Q

List 9 univariate continuous distributions taking only non-negative values

Answer

A

lognormal distribution
Wald distribution
Chi-squared distribution
gamma and inverse gamma distributions
generalised inverse gamma distribution
exponential distribution
Frechet distribution
Pareto distribution
generalised Pareto distribution

Question 26

Q

Outline what the binomial distribution aims to model

Answer

A

A Bin(n, p) distribution is the sum of n independent and identical Bernoulli(p) trials.

Random variable X ~ Bin(n, p) is the number of successes that occur in the n trials.

As n -> ∞ (with p fixed), the limiting distribution of the binomial distribution, once standardised, is the normal distribution.
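A Bin(n, p) variable can be simulated directly as a count of Bernoulli successes. The values n = 20 and p = 0.3 and the function name are illustrative assumptions.

```python
import random

def binomial_draw(n, p, rng):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(7)
n, p = 20, 0.3
draws = [binomial_draw(n, p, rng) for _ in range(10000)]
mean = sum(draws) / len(draws)
print("sample mean:", mean)   # should be near n * p = 6
```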

Question 27

Q

Outline what the negative binomial distributions (Type 1 and Type 2) aim to model

Answer

A

Type 1: Random variable X is the number of the trial on which the rth success occurs, where r is a positive integer.

Type 2: Let Y be the number of failures before the rth success. Y = X - r, where X is defined as above.
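The two definitions can be simulated side by side. In this sketch r = 3 and p = 0.4 are illustrative, and the function name is hypothetical. The means quoted in the comments are the standard results E[X] = r/p and E[Y] = r(1 − p)/p.

```python
import random

def trial_of_rth_success(r, p, rng):
    """Type 1: number of the trial on which the r-th success occurs."""
    trials = 0
    successes = 0
    while successes < r:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

rng = random.Random(11)
r, p = 3, 0.4
x_draws = [trial_of_rth_success(r, p, rng) for _ in range(10000)]
y_draws = [x - r for x in x_draws]   # Type 2: failures before the r-th success

mean_x = sum(x_draws) / len(x_draws)
mean_y = sum(y_draws) / len(y_draws)
print("mean of X:", mean_x)   # should be near r / p = 7.5
print("mean of Y:", mean_y)   # should be near r * (1 - p) / p = 4.5
```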

Question 28

Q

Outline what the Poisson distribution aims to model

Answer

A

The Poisson distribution models the number of events (eg claims) that occur in a specified interval of time, when the events occur one after another in time in a well-defined manner.

This manner presumes that the events occur singly at a constant rate, and that the numbers of events that occur in separate (ie non-overlapping) time intervals are independent of one another.

These conditions can be described by saying that the events occur “randomly, at a rate of λ per period”.

Such events are said to occur according to a Poisson process.
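A Poisson process can be simulated by drawing exponential waiting times between events and counting how many events fall in one period. The rate λ = 4 and the function name are illustrative assumptions. For a Poisson count, both the mean and the variance should be close to λ.

```python
import random

def events_in_period(rate, rng, period=1.0):
    """Count events in [0, period) when gaps between events are Exponential(rate)."""
    t = rng.expovariate(rate)   # time of the first event
    count = 0
    while t < period:
        count += 1
        t += rng.expovariate(rate)   # add the waiting time to the next event
    return count

rng = random.Random(3)
lam = 4.0
counts = [events_in_period(lam, rng) for _ in range(10000)]
mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print("mean:", mean)           # should be near lam = 4
print("variance:", variance)   # should also be near lam = 4
```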

Question 29

Q

State the location and scaling parameters of the standard normal distribution

Answer

A

The standard normal distribution has a location parameter (and mean) of 0, and a scaling parameter (and standard deviation) of 1.

Question 30

Q

State why the t-distribution is an important distribution for risk modelling

Answer

A

The kurtosis of the standard t-distribution is greater than that of the normal distribution.

The fact that the t-distribution is leptokurtic (relatively fatter tails) makes this an important distribution for risk modelling.

Question 31

Q

State what is meant by X having a lognormal distribution

Answer

A

If Y = lnX (the natural log) has a normal distribution, then X is said to have a lognormal distribution.
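Equivalently, a lognormal variable can be generated as X = exp(Y) with Y normal. In this sketch the parameters mu = 0 and sigma = 0.5 are illustrative. The mean quoted in the comment is the standard result exp(mu + sigma²/2).

```python
import math
import random
import statistics

rng = random.Random(5)
mu, sigma = 0.0, 0.5                        # parameters of Y = ln X (illustrative)
y = [rng.gauss(mu, sigma) for _ in range(20000)]
x = [math.exp(v) for v in y]                # X is lognormal by construction

mean_x = statistics.fmean(x)
median_x = statistics.median(x)
print("mean:", mean_x)       # should be near exp(mu + sigma**2 / 2), about 1.133
print("median:", median_x)   # should be near exp(mu) = 1, below the mean (positive skew)
```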

Question 32

Q

Outline 2 specific applications of the lognormal distribution, in the context of modelling financial risks

Answer

A

Since it takes only positive values, and is skewed, it is applicable to many insurance situations, eg claim size.
It can be used to model financial variables, eg asset returns, under the assumptions that the natural logarithm of the variable follows a random walk with drift, ln Xₜ = μ + ln Xₜ₋₁ + eₜ, and that the returns are iid.

Question 33

Q

State what the Wald distribution describes in terms of a probability

Answer

A

The Wald distribution describes the time taken for a Brownian motion process with positive drift to first reach a given value.

Question 34

Q

State what the chi-squared distribution describes in terms of a probability

Answer

A

The chi-squared distribution with γ degrees of freedom is the distribution of the sum of the squares of γ independent variables taken from a standard normal distribution, and so can be simulated as such.
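A sketch of that simulation, assuming 5 degrees of freedom for illustration. The function name is hypothetical. The comments use the standard results that the mean equals the degrees of freedom and the variance is twice that.

```python
import random

def chi_squared_draw(df, rng):
    """Sum of the squares of df independent standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(9)
df = 5
draws = [chi_squared_draw(df, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print("mean:", mean)           # should be near df = 5
print("variance:", variance)   # should be near 2 * df = 10
```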

Question 35

Q

State what the exponential distribution models

Answer

A

The exponential distribution models the waiting times between the events of a Poisson process.

Question 36

Q

List the characteristics of the exponential distribution that limit its application to ERM

Answer

A

The exponential distribution’s application is limited by:

its monotonically-decreasing nature
its single parameter
the low probabilities associated with extreme events

Question 37

Q

State what is meant by X having an inverse-gamma distribution

Answer

A

If Y ~ Gamma, then X = 1/Y ~ InverseGamma

Question 38

Q

State how the gamma (and inverse-gamma) can be fitted to a sample

Answer

A

Both gamma and inverse-gamma can be fitted by equating sample and population moments and solving for the distribution’s parameters.
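For the gamma case, the method of moments can be sketched as follows. The sketch assumes the shape/scale parameterisation, in which the mean is shape × scale and the variance is shape × scale². The function name and the true parameters (2 and 3) are illustrative. The inverse-gamma case uses different moment formulas and is not shown.

```python
import random
import statistics

def fit_gamma_by_moments(data):
    """Solve mean = shape * scale and variance = shape * scale**2 for (shape, scale)."""
    mean = statistics.fmean(data)
    variance = statistics.variance(data)
    shape = mean ** 2 / variance
    scale = variance / mean
    return shape, scale

rng = random.Random(13)
# random.gammavariate(alpha, beta) uses alpha as the shape and beta as the scale.
sample = [rng.gammavariate(2.0, 3.0) for _ in range(50000)]
shape, scale = fit_gamma_by_moments(sample)
print("fitted shape:", shape)   # should be near 2
print("fitted scale:", scale)   # should be near 3
```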

Question 39

Q

2 Key features of the Pareto distribution

Answer

A

The Pareto distribution is monotonically decreasing and, like the tails of the t-distribution, follows a power law with the shape parameter (γ) determining the power.
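The power law can be seen in simulated data. This sketch assumes the classical Pareto form with minimum value 1, as produced by Python's `random.paretovariate`, for which P(X > x) = x^(−shape). The source's parameterisation may differ, and the shape value 2 is illustrative.

```python
import random

def empirical_survival(data, x):
    """Proportion of observations strictly greater than x."""
    return sum(1 for d in data if d > x) / len(data)

rng = random.Random(17)
shape = 2.0   # plays the role of the shape parameter (illustrative value)
draws = [rng.paretovariate(shape) for _ in range(100000)]

# Under the power law, P(X > x) = x ** -shape for x >= 1.
for x in (1.5, 3.0, 6.0):
    print(x, empirical_survival(draws, x), x ** -shape)
```

Doubling x divides the tail probability by the same factor, 2^shape, wherever x sits in the tail.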

Question 40

Q

Uniform distribution

Answer

A

Assigns an equal probability to all outcomes in a range

Question 41

Q

State the key features of the triangular distribution

Answer

A

The triangular distribution can be used in cases where, in addition to the upper and lower values, the most likely value is known. The distribution has lower limit β₁, mode α, and upper limit β₂

The mean is the average of the parameter values:
μ = ⅓( β₁ + α + β₂)
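The mean formula can be checked against simulated draws. The limits 10 and 20, the mode 12 and the function name are illustrative assumptions. Note that Python's `random.triangular` takes its arguments in the order (low, high, mode).

```python
import random

def triangular_mean(lower, mode, upper):
    """Mean of a triangular distribution: the average of its three parameters."""
    return (lower + mode + upper) / 3.0

rng = random.Random(21)
lower, mode, upper = 10.0, 12.0, 20.0   # illustrative beta_1, alpha, beta_2
draws = [rng.triangular(lower, upper, mode) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
print("formula mean:", triangular_mean(lower, mode, upper))   # 14.0
print("sample mean:", sample_mean)                             # should be near 14
```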

Question 42

Q

Outline 3 key limitations that mean the multivariate normal distribution is not a good description of reality in many risk management applications

Answer

A

the tails of the univariate marginal distributions are too thin
the joint tails do not assign enough weight to joint extreme outcomes
the distribution has a strong form of symmetry, known as elliptical symmetry.

Question 43

Q

Define what is meant by a multivariate spherical distribution, and name a specific example

Answer

A

A multivariate spherical distribution is one where the marginal distributions are:

identical
symmetric
uncorrelated with each other (note, however, that lack of correlation does not necessarily imply independence)

Question 44

Q

Define what is meant by a multivariate elliptical distribution and name a specific example

Answer

A

If the set of points sharing any chosen (fixed) probability density can be described by an elliptical relationship between the variables, then the distribution is said to be elliptical.