Quantitative Analysis Flashcards
Describe and distinguish between continuous and discrete random variables.
- A discrete random variable X can only take values from a finite (or countably infinite) set of values: x_i ∈ {x_1, …, x_max}. Each value of X has a specific probability of occurring: P[X=x_i] = p_i. The sum of all these discrete probabilities must always equal 1. An example of a discrete random variable could be a bond’s rating (AAA, AA, A, BBB, etc.).
- A continuous random variable X can take values from an infinite set of values, i.e. it can take any value in a specified interval: x ∈ [x_min, x_max]. While there is no meaningful probability of X taking one specific value x_i, the probability that it takes a value in a specific interval can be defined as P[r_1 ≤ X ≤ r_2].
Define and distinguish between the probability density function, the cumulative distribution function, and the inverse cumulative distribution function.
- The probability density function (PDF) f(x) defines the likelihood of a variable taking a value within a specified interval: P[r_1 ≤ X ≤ r_2] = ∫_{r_1}^{r_2} f(x) dx = p. Where r_1 and r_2 are the minimum and maximum values of x, this probability must always equal 1.
- The cumulative distribution function (CDF) defines the likelihood of a variable being smaller than or equal to a specified value: F(a) = P[X ≤ a] = ∫_{-∞}^{a} f(x) dx. The CDF is thus obtained by integrating the PDF from minus infinity up to the specified value; likewise, the PDF is obtained by taking the first derivative of the CDF. Consequently, the probability of a random variable taking a value within an interval can be calculated by evaluating the CDF at the upper and lower bounds and subtracting.
- The inverse cumulative distribution function returns, for a given probability p, the value the random variable will not exceed with that probability: F^{-1}(p) = x, i.e. P[X ≤ x] = p. To calculate the value, write down the CDF, insert the given probability p and solve for the corresponding value x.
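A minimal sketch of the PDF/CDF/inverse-CDF relationships, assuming Python with SciPy and using the standard normal distribution as a stand-in example (SciPy calls the inverse CDF the percent-point function, `ppf`):

```python
from scipy.stats import norm
from scipy.integrate import quad

r1, r2 = -1.0, 1.0

# Probability of X falling in [r1, r2]: integrate the PDF over the interval ...
p_interval, _ = quad(norm.pdf, r1, r2)

# ... or, equivalently, subtract the CDF at the lower bound from the CDF at the upper bound.
p_via_cdf = norm.cdf(r2) - norm.cdf(r1)

# The inverse CDF maps a probability back to a value: F^-1(F(a)) = a.
a = norm.ppf(norm.cdf(1.5))

print(p_interval, p_via_cdf, a)  # ~0.6827, ~0.6827, 1.5
```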
Calculate the probability of an event given a discrete probability function.
- Event A or Event B taking place: P[A∪B]=P(A)+P(B) for mutually exclusive events
- Event A and Event B taking place: P[A∩B]=P(A)*P(B) for independent events
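A short sketch of both rules, assuming Python; the die-roll probability function and the chosen events are purely illustrative:

```python
# Discrete probability function for a fair six-sided die (hypothetical example).
pmf = {1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}
assert abs(sum(pmf.values()) - 1.0) < 1e-12  # probabilities must sum to 1

# P[A ∪ B] for mutually exclusive events, e.g. rolling a 1 or a 6.
p_union = pmf[1] + pmf[6]    # 1/3

# P[A ∩ B] for independent events, e.g. rolling a 6 on each of two separate rolls.
p_joint = pmf[6] * pmf[6]    # 1/36

print(p_union, p_joint)
```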
Distinguish between independent and mutually exclusive events.
- Mutually exclusive events are events that cannot happen simultaneously, such as two different returns on a specific stock on a specific day (only one value can materialise).
- Independent events are events that can happen simultaneously, and where the materialisation of one event does not provide any information about the other event.
Define joint probability, describe a probability matrix, and calculate joint probabilities using probability matrices.
- The joint probability of two events describes the likelihood of both events materialising simultaneously. For independent events, this is simply the product of the individual events’ probabilities. For dependent (correlated) events, this does not hold; instead, the joint probability follows from the conditional probability, P[A∩B] = P(A|B)·P(B), which connects to Bayes’ theorem.
- A probability matrix shows all possible realisations of two random variables as well as the joint probabilities of all these realisations. The probabilities of all realisations (i.e. the sum over all rows and columns) must add up to 100%, since exactly one combination must occur.
- For the calculation of joint probabilities, use either P[A∩B] = P(A)·P(B) for independent events or P[A∩B] = P(A|B)·P(B) for dependent events, as illustrated in the sketch below.
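A minimal sketch of a probability matrix in Python with NumPy; the 2x2 layout (bond default vs. state of the economy) and all numbers are hypothetical:

```python
import numpy as np

# Joint probabilities: rows = default / no default, columns = recession / growth.
P = np.array([[0.03, 0.01],
              [0.27, 0.69]])
assert abs(P.sum() - 1.0) < 1e-12  # all joint probabilities must sum to 1

# Marginal (unconditional) probabilities are the row and column sums.
p_default   = P.sum(axis=1)[0]     # P[default]   = 0.04
p_recession = P.sum(axis=0)[0]     # P[recession] = 0.30

# A joint probability is read directly from the matrix.
p_default_and_recession = P[0, 0]  # P[default ∩ recession] = 0.03

print(p_default, p_recession, p_default_and_recession)
```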
Define and calculate a conditional probability and distinguish between conditional and unconditional probabilities.
- While the unconditional probability of the realisation of a random variable denotes the likelihood without regard to any other random variables that it might be related to (P[A]), the conditional probability denotes the likelihood conditional on another random variable’s realisation (P[A|B]).
- For realisations A and B of two dependent random variables, the conditional probability P[A|B] is defined by Bayes’ rule: P(A|B) = P(B|A)·P(A) / P(B). For the calculation of the conditional probability, it is often necessary to first determine P(A) or P(B), which can often be calculated as a weighted average (total probability), e.g. P(B) = P(A)·P(B|A) + P(A̅)·P(B|A̅).
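A worked sketch of Bayes’ rule in Python; the events A (“bond defaults”) and B (“economy is in recession”) and all probabilities are hypothetical:

```python
p_A = 0.04              # unconditional P(A)
p_B_given_A = 0.75      # P(B | A)
p_B_given_notA = 0.28   # P(B | not A)

# Total probability as a weighted average: P(B) = P(A) P(B|A) + P(not A) P(B|not A)
p_B = p_A * p_B_given_A + (1 - p_A) * p_B_given_notA

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B

print(p_B, p_A_given_B)  # ~0.2988, ~0.1004
```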
Interpret and apply the mean, standard deviation, and variance of a random variable.
- The mean refers to the average value of a random variable.
- The variance describes the average squared deviation of realisations of the random variable from its mean. The variance carries information about how much the random variable scatters around the mean, and thus how wide the distribution of the random variable is.
- The standard deviation is the square root of the variance. The concept is very closely related to the concept of volatility but is narrower and more focused on descriptions of historical data.
Calculate the mean, standard deviation, and variance of a discrete random variable.
- Mean (unweighted): μ̂ = (1/n) Σ x_i = (1/n)(x_1 + x_2 + ⋯ + x_(n-1) + x_n)
- Mean (weighted): μ̂ = Σ w_i x_i / Σ w_i; for probabilities: μ̂ = Σ p_i x_i
- Variance: σ^2 = E[(X-μ)^2] = 1/(n-1) Σ (x_i - μ̂)^2 = E[X^2] - E[X]^2
- Standard deviation: σ=√(σ^2 )
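A minimal sketch of these estimators in Python with NumPy, using a hypothetical return sample and a hypothetical discrete distribution:

```python
import numpy as np

x = np.array([0.02, -0.01, 0.03, 0.00, 0.01])  # hypothetical observations

mean_unweighted = x.mean()            # (1/n) * sum(x_i)
var_sample = x.var(ddof=1)            # 1/(n-1) * sum((x_i - mean)^2)
std_sample = np.sqrt(var_sample)      # square root of the variance

# Probability-weighted mean of a discrete random variable.
outcomes = np.array([-0.10, 0.00, 0.10])
probs    = np.array([0.25, 0.50, 0.25])
mean_weighted = np.sum(probs * outcomes)   # sum(p_i * x_i)

print(mean_unweighted, var_sample, std_sample, mean_weighted)
```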
Interpret and calculate the expected value of a discrete random variable.
The expected value E[X] is very closely related to the mean value of a variable as well as the fair price of a payoff distribution. While the mean is more focused on backward-looking descriptions, the expected value is more forward-looking with stronger assumptions around the data generation process (data was and will be generated by same process). The expected value operator is linear, thus: E[X+Y]=E[X]+E[Y] and E[cX]=cE[X]. However, it is not multiplicative: E[X^2 ]≠E[X]^2.
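A quick numerical check of these properties, assuming Python with NumPy and a hypothetical discrete distribution:

```python
import numpy as np

outcomes = np.array([-1.0, 0.0, 2.0])
probs    = np.array([0.2, 0.5, 0.3])

E_X  = np.sum(probs * outcomes)         # E[X]
E_3X = np.sum(probs * 3 * outcomes)     # E[3X]
E_X2 = np.sum(probs * outcomes**2)      # E[X^2]

print(E_3X, 3 * E_X)   # equal: E[cX] = c E[X]
print(E_X2, E_X**2)    # not equal in general: E[X^2] != E[X]^2
```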
Calculate and interpret the covariance and correlation between two random variables.
- Covariance considers the relationship between the deviations of two random variables from their respective means. Variance emerges from covariance as a special case where we would measure the covariance of a variable with itself.
- Covariance for two variables will be positive when the deviations share common signs and negative when the deviations have opposite signs: σ_XY = E[XY] - E[X]E[Y] = 1/(n-1) Σ (x_i - μ_x)(y_i - μ_y). If the means are known, the fraction can be changed to 1/n.
- Correlation is closely related to covariance and normalises the relationship to a value between -1 and 1. The further away the correlation is from 0, the stronger the relationship between the two variables: ρ_XY = σ_XY / (σ_X σ_Y)
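A minimal sketch computing the sample covariance and correlation in Python with NumPy; the two return series are hypothetical:

```python
import numpy as np

x = np.array([0.01, -0.02, 0.03, 0.00, 0.02])
y = np.array([0.02, -0.01, 0.02, 0.01, 0.03])

cov_xy = np.cov(x, y, ddof=1)[0, 1]   # 1/(n-1) * sum((x_i - mean_x)(y_i - mean_y))
rho_xy = np.corrcoef(x, y)[0, 1]      # normalised to a value between -1 and 1

# The same correlation from its definition: rho = cov / (std_x * std_y).
rho_manual = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

print(cov_xy, rho_xy, rho_manual)
```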
Calculate the mean and variance of sums of variables.
- Variance: σ_(X+Y)^2 = σ_X^2 + σ_Y^2 + 2 ρ_XY σ_X σ_Y for two variables. For n variables (Y = Σ X_i): σ_Y^2 = Σ_i Σ_j ρ_ij σ_i σ_j
- Mean: due to the expected value operator being linear: E[X+Y]=E[X]+E[Y].
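A quick check of the two-variable variance formula, assuming Python with NumPy and the same kind of hypothetical return series as above:

```python
import numpy as np

x = np.array([0.01, -0.02, 0.03, 0.00, 0.02])
y = np.array([0.02, -0.01, 0.02, 0.01, 0.03])

# Variance of the sum computed directly ...
var_direct = np.var(x + y, ddof=1)

# ... and via Var(X + Y) = Var(X) + Var(Y) + 2 * Cov(X, Y).
cov_xy = np.cov(x, y, ddof=1)[0, 1]
var_formula = np.var(x, ddof=1) + np.var(y, ddof=1) + 2 * cov_xy

print(var_direct, var_formula)  # the two values agree
```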
Describe the four central moments of a statistical variable or distribution: mean, variance, skewness, and kurtosis.
The concept of central moments can be generalised as follows: μ_k = E[(X-μ)^k]. The central moment can be standardised by dividing it by σ^k.
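A minimal sketch of a hypothetical helper that estimates the k-th standardised central moment from a sample, assuming Python with NumPy; this is the simple plug-in estimator, without the small-sample corrections used in the skewness and kurtosis formulas below:

```python
import numpy as np

def standardised_central_moment(x, k):
    """Plug-in estimate of E[(X - mu)^k] / sigma^k from a sample."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()                      # population-style (ddof=0) standard deviation
    return np.mean((x - mu) ** k) / sigma ** k

r = [0.01, -0.03, 0.02, 0.00, -0.06, 0.02]
print(standardised_central_moment(r, 3))  # skewness (k = 3)
print(standardised_central_moment(r, 4))  # kurtosis (k = 4, not excess)
```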
Interpret the skewness and kurtosis of a statistical distribution, and interpret the concepts of coskewness and cokurtosis.
- Skewness, the standardised third central moment, describes how symmetrical the distribution is around the mean. A random variable that is perfectly symmetrical around the mean would have a skewness of zero, e.g. a normal distribution. Ceteris paribus, an investment with a more negative skew would be considered riskier. Many financial instruments appear to exhibit negative skew. As a rule of thumb, instruments with negative skew have a mean that is less than the median that is less than the mode.
Skewness = E[(X-μ)^3] / σ^3 = n/((n-1)(n-2)) Σ ((x_i - μ̂)/σ̂)^3 = (E[X^3] - 3μσ^2 - μ^3) / σ^3
- Coskewness, the third cross central moment, helps us to identify whether two variables will tend to show extreme deviations from the mean (e.g. large price falls) at the same time. There are always multiple values for the coskewness, increasing with the number of variables.
- Kurtosis, the standardised fourth central moment, also describes how spread out a distribution is, but focuses more on the extreme points (i.e. the tails). Variables with higher kurtosis have fatter tails, and many financial instruments show positive excess kurtosis. Excess kurtosis, measured relative to the normal distribution, is calculated by subtracting 3. Kurtosis = E[(X-μ)^4] / σ^4 = n(n+1)/((n-1)(n-2)(n-3)) Σ ((x_i - μ̂)/σ̂)^4
- Cokurtosis, the fourth cross central moment, helps to identify whether two variables will tend to show extreme values at the same time, and is thus closely related to coskewness. As for coskewness, there are multiple values of cokurtosis, increasing with the number of variables.
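A minimal sketch computing sample skewness and excess kurtosis with SciPy; the return series is hypothetical, and SciPy’s defaults are the simple plug-in estimators (pass bias=False for the small-sample-corrected formulas shown above):

```python
import numpy as np
from scipy.stats import skew, kurtosis

r = np.array([0.01, -0.03, 0.02, 0.00, -0.06, 0.02, 0.01, 0.03])

s = skew(r)        # standardised third central moment
k = kurtosis(r)    # excess kurtosis: standardised fourth central moment minus 3

print(s, k)
```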
Describe and interpret the best linear unbiased estimator.
The best linear unbiased estimator (BLUE) is the estimator of an unknown population metric (e.g. the mean) with the lowest variance among all estimators that are linear functions of the data and unbiased. Unbiased refers to the estimator’s expected value being equal to the true value of the unknown metric.
Uniform distribution
The uniform distribution has a constant probability density function between a defined lower (b_1) and upper (b_2) bound. For bounds of 0 and 1, respectively, the distribution is called the standard uniform distribution.
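A minimal sketch of the standard uniform distribution with SciPy; `loc` corresponds to the lower bound b_1 and `scale` to b_2 - b_1:

```python
from scipy.stats import uniform

u = uniform(loc=0.0, scale=1.0)   # standard uniform distribution on [0, 1]

print(u.pdf(0.3))          # constant density 1 / (b_2 - b_1) = 1.0 inside the bounds
print(u.cdf(0.3))          # P[X <= 0.3] = 0.3
print(u.mean(), u.var())   # (b_1 + b_2) / 2 = 0.5 and (b_2 - b_1)^2 / 12 ≈ 0.0833
```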