Quantitative Analysis Flashcards
Describe and distinguish between continuous and discrete random variables.
- A discrete random variable X can only take values from a finite (or countably infinite) set of values: x_i ∈ {x_1, …, x_max}. Each value of X has a specific probability of occurring: P[X=x_i] = p_i. The sum of all these discrete probabilities must always equal 1. An example of a discrete random variable could be a bond’s rating (AAA, AA, A, BBB, etc.).
- A continuous random variable X can take values from an infinite set of values, i.e. it can take any value in a specified interval: x ∈ [x_min, x_max]. While there is no meaningful probability of X taking one specific value x_i, the probability that it takes a value in a specific interval can be defined as P[r_1 ≤ X ≤ r_2].
Define and distinguish between the probability density function, the cumulative distribution function, and the inverse cumulative distribution function.
- The probability density function (PDF) f(x) defines the likelihood of a variable taking a value within a specified interval: P[r_1 ≤ X ≤ r_2] = ∫_{r_1}^{r_2} f(x) dx = p. Where r_1 and r_2 are the minimum and maximum values of x, this probability must always equal 1.
- The cumulative distribution function (CDF) defines the likelihood of a variable being smaller than or equal to a specified value: F(a) = P[X ≤ a] = ∫_{-∞}^{a} f(x) dx. The CDF is thus obtained by integrating the PDF from minus infinity up to the specified value; likewise, the PDF is obtained by taking the first derivative of the CDF. Consequently, the probability of a random variable taking a value within an interval can be calculated by evaluating the CDF at the upper and lower bounds and subtracting.
- The inverse cumulative distribution function returns, for a given probability p, the value the random variable will not exceed with that probability: F^{-1}(p) = x, i.e. P[X ≤ x] = p. To calculate the value, write down the CDF, insert the given probability p and solve for the corresponding value x.
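A minimal sketch of the PDF/CDF/inverse-CDF relationships, assuming Python with SciPy and using the standard normal distribution as a stand-in example (SciPy calls the inverse CDF the percent-point function, `ppf`):

```python
from scipy.stats import norm
from scipy.integrate import quad

r1, r2 = -1.0, 1.0

# Probability of X falling in [r1, r2]: integrate the PDF over the interval ...
p_interval, _ = quad(norm.pdf, r1, r2)

# ... or, equivalently, subtract the CDF at the lower bound from the CDF at the upper bound.
p_via_cdf = norm.cdf(r2) - norm.cdf(r1)

# The inverse CDF maps a probability back to a value: F^-1(F(a)) = a.
a = norm.ppf(norm.cdf(1.5))

print(p_interval, p_via_cdf, a)  # ~0.6827, ~0.6827, 1.5
```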
Calculate the probability of an event given a discrete probability function.
- Event A or Event B taking place: P[A∪B]=P(A)+P(B) for mutually exclusive events
- Event A and Event B taking place: P[A∩B]=P(A)*P(B) for independent events
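A short sketch of both rules, assuming Python; the die-roll probability function and the chosen events are purely illustrative:

```python
# Discrete probability function for a fair six-sided die (hypothetical example).
pmf = {1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}
assert abs(sum(pmf.values()) - 1.0) < 1e-12  # probabilities must sum to 1

# P[A ∪ B] for mutually exclusive events, e.g. rolling a 1 or a 6.
p_union = pmf[1] + pmf[6]    # 1/3

# P[A ∩ B] for independent events, e.g. rolling a 6 on each of two separate rolls.
p_joint = pmf[6] * pmf[6]    # 1/36

print(p_union, p_joint)
```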
Distinguish between independent and mutually exclusive events.
- Mutually exclusive events are events that cannot happen simultaneously, such as two different returns on a specific stock on a specific day (only one value can materialise).
- Independent events are events that can happen simultaneously, and where the materialisation of one event does not provide any information about the other event.
Define joint probability, describe a probability matrix, and calculate joint probabilities using probability matrices.
- The joint probability of two events describes the likelihood of both events materialising simultaneously. For independent events, this is simply the product of the individual events’ probabilities. For dependent (correlated) events, this does not hold; instead, the joint probability follows from the conditional probability, P[A∩B] = P(A|B)·P(B), which connects to Bayes’ theorem.
- A probability matrix shows all possible realisations of two random variables as well as the joint probabilities of all these realisations. The probabilities of all realisations (i.e. the sum over all rows and columns) must add up to 100%, since exactly one combination must occur.
- For the calculation of joint probabilities, use either P[A∩B] = P(A)·P(B) for independent events or P[A∩B] = P(A|B)·P(B) for dependent events, as illustrated in the sketch below.
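A minimal sketch of a probability matrix in Python with NumPy; the 2x2 layout (bond default vs. state of the economy) and all numbers are hypothetical:

```python
import numpy as np

# Joint probabilities: rows = default / no default, columns = recession / growth.
P = np.array([[0.03, 0.01],
              [0.27, 0.69]])
assert abs(P.sum() - 1.0) < 1e-12  # all joint probabilities must sum to 1

# Marginal (unconditional) probabilities are the row and column sums.
p_default   = P.sum(axis=1)[0]     # P[default]   = 0.04
p_recession = P.sum(axis=0)[0]     # P[recession] = 0.30

# A joint probability is read directly from the matrix.
p_default_and_recession = P[0, 0]  # P[default ∩ recession] = 0.03

print(p_default, p_recession, p_default_and_recession)
```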
Define and calculate a conditional probability and distinguish between conditional and unconditional probabilities.
- While the unconditional probability of the realisation of a random variable denotes the likelihood without regard to any other random variables that it might be related to (P[A]), the conditional probability denotes the likelihood conditional on another random variable’s realisation (P[A|B]).
- For realisations A and B of two dependent random variables, the conditional probability P[A|B] is defined by Bayes’ rule: P(A|B) = P(B|A)·P(A) / P(B). For the calculation of the conditional probability, it is often necessary to first determine P(A) or P(B), which can often be calculated as a weighted average (total probability), e.g. P(B) = P(A)·P(B|A) + P(A̅)·P(B|A̅).
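A worked sketch of Bayes’ rule in Python; the events A (“bond defaults”) and B (“economy is in recession”) and all probabilities are hypothetical:

```python
p_A = 0.04              # unconditional P(A)
p_B_given_A = 0.75      # P(B | A)
p_B_given_notA = 0.28   # P(B | not A)

# Total probability as a weighted average: P(B) = P(A) P(B|A) + P(not A) P(B|not A)
p_B = p_A * p_B_given_A + (1 - p_A) * p_B_given_notA

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B

print(p_B, p_A_given_B)  # ~0.2988, ~0.1004
```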
Interpret and apply the mean, standard deviation, and variance of a random variable.
- The mean refers to the average value of a random variable.
- The variance describes the average squared deviation of realisations of the random variable from its mean. The variance carries information about how much the random variable scatters around the mean, and thus how wide the distribution of the random variable is.
- The standard deviation is the square root of the variance. The concept is very closely related to the concept of volatility but is narrower and more focused on descriptions of historical data.
Calculate the mean, standard deviation, and variance of a discrete random variable.
- Mean (unweighted): μ̂ = (1/n) Σ x_i = (1/n)(x_1 + x_2 + ⋯ + x_(n-1) + x_n)
- Mean (weighted): μ̂ = Σ w_i x_i / Σ w_i; for probabilities: μ̂ = Σ p_i x_i
- Variance: σ^2 = E[(X-μ)^2] = 1/(n-1) Σ (x_i - μ̂)^2 = E[X^2] - E[X]^2
- Standard deviation: σ=√(σ^2 )
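A minimal sketch of these estimators in Python with NumPy, using a hypothetical return sample and a hypothetical discrete distribution:

```python
import numpy as np

x = np.array([0.02, -0.01, 0.03, 0.00, 0.01])  # hypothetical observations

mean_unweighted = x.mean()            # (1/n) * sum(x_i)
var_sample = x.var(ddof=1)            # 1/(n-1) * sum((x_i - mean)^2)
std_sample = np.sqrt(var_sample)      # square root of the variance

# Probability-weighted mean of a discrete random variable.
outcomes = np.array([-0.10, 0.00, 0.10])
probs    = np.array([0.25, 0.50, 0.25])
mean_weighted = np.sum(probs * outcomes)   # sum(p_i * x_i)

print(mean_unweighted, var_sample, std_sample, mean_weighted)
```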
Interpret and calculate the expected value of a discrete random variable.
The expected value E[X] is very closely related to the mean value of a variable as well as the fair price of a payoff distribution. While the mean is more focused on backward-looking descriptions, the expected value is more forward-looking with stronger assumptions around the data generation process (data was and will be generated by same process). The expected value operator is linear, thus: E[X+Y]=E[X]+E[Y] and E[cX]=cE[X]. However, it is not multiplicative: E[X^2 ]≠E[X]^2.
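A quick numerical check of these properties, assuming Python with NumPy and a hypothetical discrete distribution:

```python
import numpy as np

outcomes = np.array([-1.0, 0.0, 2.0])
probs    = np.array([0.2, 0.5, 0.3])

E_X  = np.sum(probs * outcomes)         # E[X]
E_3X = np.sum(probs * 3 * outcomes)     # E[3X]
E_X2 = np.sum(probs * outcomes**2)      # E[X^2]

print(E_3X, 3 * E_X)   # equal: E[cX] = c E[X]
print(E_X2, E_X**2)    # not equal in general: E[X^2] != E[X]^2
```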
Calculate and interpret the covariance and correlation between two random variables.
- Covariance considers the relationship between the deviations of two random variables from their respective means. Variance emerges from covariance as a special case where we would measure the covariance of a variable with itself.
- Covariance for two variables will be positive when the deviations share common signs and negative when the deviations have opposite signs: σ_XY = E[XY] - E[X]E[Y] = 1/(n-1) Σ (x_i - μ_x)(y_i - μ_y). If the means are known, the fraction can be changed to 1/n.
- Correlation is closely related to covariance and normalises the relationship to a value between -1 and 1. The further away the correlation is from 0, the stronger the relationship between the two variables: ρ_XY = σ_XY / (σ_X σ_Y)
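A minimal sketch computing the sample covariance and correlation in Python with NumPy; the two return series are hypothetical:

```python
import numpy as np

x = np.array([0.01, -0.02, 0.03, 0.00, 0.02])
y = np.array([0.02, -0.01, 0.02, 0.01, 0.03])

cov_xy = np.cov(x, y, ddof=1)[0, 1]   # 1/(n-1) * sum((x_i - mean_x)(y_i - mean_y))
rho_xy = np.corrcoef(x, y)[0, 1]      # normalised to a value between -1 and 1

# The same correlation from its definition: rho = cov / (std_x * std_y).
rho_manual = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

print(cov_xy, rho_xy, rho_manual)
```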
Calculate the mean and variance of sums of variables.
- Variance: σ_(X+Y)^2 = σ_X^2 + σ_Y^2 + 2 ρ_XY σ_X σ_Y for two variables. For n variables (Y = Σ X_i): σ_Y^2 = Σ_i Σ_j ρ_ij σ_i σ_j
- Mean: due to the expected value operator being linear: E[X+Y]=E[X]+E[Y].
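A quick check of the two-variable variance formula, assuming Python with NumPy and the same kind of hypothetical return series as above:

```python
import numpy as np

x = np.array([0.01, -0.02, 0.03, 0.00, 0.02])
y = np.array([0.02, -0.01, 0.02, 0.01, 0.03])

# Variance of the sum computed directly ...
var_direct = np.var(x + y, ddof=1)

# ... and via Var(X + Y) = Var(X) + Var(Y) + 2 * Cov(X, Y).
cov_xy = np.cov(x, y, ddof=1)[0, 1]
var_formula = np.var(x, ddof=1) + np.var(y, ddof=1) + 2 * cov_xy

print(var_direct, var_formula)  # the two values agree
```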
Describe the four central moments of a statistical variable or distribution: mean, variance, skewness, and kurtosis.
The concept of central moments can be generalised as follows: μ_k = E[(X-μ)^k]. The central moment can be standardised by dividing it by σ^k.
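A minimal sketch of a hypothetical helper that estimates the k-th standardised central moment from a sample, assuming Python with NumPy; this is the simple plug-in estimator, without the small-sample corrections used in the skewness and kurtosis formulas below:

```python
import numpy as np

def standardised_central_moment(x, k):
    """Plug-in estimate of E[(X - mu)^k] / sigma^k from a sample."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()                      # population-style (ddof=0) standard deviation
    return np.mean((x - mu) ** k) / sigma ** k

r = [0.01, -0.03, 0.02, 0.00, -0.06, 0.02]
print(standardised_central_moment(r, 3))  # skewness (k = 3)
print(standardised_central_moment(r, 4))  # kurtosis (k = 4, not excess)
```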
Interpret the skewness and kurtosis of a statistical distribution, and interpret the concepts of coskewness and cokurtosis.
- Skewness, the standardised third central moment, describes how symmetrical the distribution is around the mean. A random variable that is perfectly symmetrical around the mean would have a skewness of zero, e.g. a normal distribution. Ceteris paribus, an investment with a more negative skew would be considered riskier. Many financial instruments appear to exhibit negative skew. As a rule of thumb, instruments with negative skew have a mean that is less than the median that is less than the mode.
Skewness = E[(X-μ)^3] / σ^3 = n/((n-1)(n-2)) Σ ((x_i - μ̂)/σ̂)^3 = (E[X^3] - 3μσ^2 - μ^3) / σ^3
- Coskewness, the third cross central moment, helps us to identify whether two variables will tend to show extreme deviations from the mean (e.g. large price falls) at the same time. There are always multiple values for the coskewness, increasing with the number of variables.
- Kurtosis, the standardised fourth central moment, also describes how spread out a distribution is, but focuses more on the extreme points (i.e. the tails). Variables with higher kurtosis have fatter tails, and many financial instruments show positive excess kurtosis. Excess kurtosis, measured relative to the normal distribution, is calculated by subtracting 3. Kurtosis = E[(X-μ)^4] / σ^4 = n(n+1)/((n-1)(n-2)(n-3)) Σ ((x_i - μ̂)/σ̂)^4
- Cokurtosis, the fourth cross central moment, helps to identify whether two variables will tend to show extreme values at the same time, and is thus closely related to coskewness. As for coskewness, there are multiple values of cokurtosis, increasing with the number of variables.
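A minimal sketch computing sample skewness and excess kurtosis with SciPy; the return series is hypothetical, and SciPy’s defaults are the simple plug-in estimators (pass bias=False for the small-sample-corrected formulas shown above):

```python
import numpy as np
from scipy.stats import skew, kurtosis

r = np.array([0.01, -0.03, 0.02, 0.00, -0.06, 0.02, 0.01, 0.03])

s = skew(r)        # standardised third central moment
k = kurtosis(r)    # excess kurtosis: standardised fourth central moment minus 3

print(s, k)
```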
Describe and interpret the best linear unbiased estimator.
The best linear unbiased estimator (BLUE) is the estimator of an unknown population metric (e.g. the mean) with the lowest variance among all estimators that are linear functions of the data and unbiased. Unbiased refers to the estimator’s expected value being equal to the true value of the unknown metric.
Uniform distribution
The uniform distribution has a constant probability density function between a defined lower (b_1) and upper (b_2) bound. For bounds of 0 and 1, respectively, the distribution is called the standard uniform distribution.
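A minimal sketch of the standard uniform distribution with SciPy; `loc` corresponds to the lower bound b_1 and `scale` to b_2 - b_1:

```python
from scipy.stats import uniform

u = uniform(loc=0.0, scale=1.0)   # standard uniform distribution on [0, 1]

print(u.pdf(0.3))          # constant density 1 / (b_2 - b_1) = 1.0 inside the bounds
print(u.cdf(0.3))          # P[X <= 0.3] = 0.3
print(u.mean(), u.var())   # (b_1 + b_2) / 2 = 0.5 and (b_2 - b_1)^2 / 12 ≈ 0.0833
```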