Lecture 4: Covariance Matrix Estimation Flashcards
Empirical covariance matrix
- p x n data matrix A = [a1 .. an]
- Each row is one asset's log-return time series; each column ai is the vector of returns at observation i
C_hat = (1/n) * Sum_i [(ai - a_bar)(ai - a_bar)^T]
where a_bar = (1/n) * Sum_i ai is the sample mean of the observations
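A minimal numpy sketch of the formula above, using a synthetic p x n matrix of returns (the sizes and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 100                        # p assets, n observations (illustrative sizes)
A = rng.standard_normal((p, n))      # columns a_1, .., a_n are return vectors

a_bar = A.mean(axis=1, keepdims=True)   # sample mean a_bar
centered = A - a_bar
C = centered @ centered.T / n           # C_hat = (1/n) Sum (ai - a_bar)(ai - a_bar)^T
# This matches numpy's biased estimator np.cov(A, bias=True)
```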
How do we find an estimate Sigma_hat of the true Sigma based on the data points x1, x2, .., xn?
We should maximize the likelihood
L(Sigma) = Product_i [p(xi; Sigma)]
Changing variables to X = Sigma^-1 and taking the log of the likelihood, the problem becomes
max_X [log det X - Trace(C X)]
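Setting the gradient of the objective, X^-1 - C, to zero gives X = C^-1, i.e. the maximum-likelihood estimate recovers the empirical covariance. A quick numerical check on a synthetic positive definite C (the matrix is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3))
C = B @ B.T + 3 * np.eye(3)          # a synthetic positive definite "sample covariance"

def objective(X, C):
    """log det X - Trace(C X): the log-likelihood in the variable X = Sigma^-1."""
    sign, logdet = np.linalg.slogdet(X)
    return logdet - np.trace(C @ X)

X_star = np.linalg.inv(C)            # stationary point: gradient X^-1 - C = 0
val_star = objective(X_star, C)
val_near = objective(X_star + 0.01 * np.eye(3), C)   # any perturbation scores lower
```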
What is wrong with the empirical estimate?
- The estimate fails to be positive definite (hence not invertible) when p > n, i.e. when there are more assets than observations.
- It does not handle missing data.
- It is highly sensitive to outliers.
Hence we look for a better estimate.
How do we measure the estimation quality?
- Apply the cross-validation principle:
- Remove 10% of the data at random
- Re-estimate the covariance on the remaining data and record the new estimate
- Repeat, and measure the average "error" between the estimates
How do we measure errors? Introduce a notion of distance between matrices, captured by the Frobenius norm (the square root of the sum of squared entries of the difference).
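The cross-validation recipe above can be sketched as follows, on made-up data (the 10% drop fraction and 20 repetitions are from the notes and an arbitrary choice, respectively):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 4, 300
data = rng.standard_normal((n, p))   # rows are observations (synthetic data)

def frobenius(A, B):
    """Frobenius distance: sqrt of the sum of squared entrywise differences."""
    return np.linalg.norm(A - B, "fro")

full_cov = np.cov(data, rowvar=False, bias=True)
errors = []
for _ in range(20):
    keep = rng.choice(n, size=int(0.9 * n), replace=False)   # drop 10% of the data
    sub_cov = np.cov(data[keep], rowvar=False, bias=True)    # re-estimate on the rest
    errors.append(frobenius(full_cov, sub_cov))
avg_error = float(np.mean(errors))   # average "error" between the estimates
```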
Sparse Graphical Models
Given the prices of many assets, we would like to draw a graph that describes the links between them.
Conditional independence - the random variables xi and xj are conditionally independent if, for xk fixed (k != i, j), the density factors as
p(x) = pi(xi) * pj(xj)
The variables xi and xj are conditionally independent iff the (i, j) element of the precision matrix is zero: (Sigma^-1)ij = 0
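A small illustration of this fact with a hand-picked 3x3 precision matrix: the (1, 3) entry of Sigma^-1 is zero (x1 and x3 conditionally independent given x2), yet the covariance itself is not zero there, since marginal correlation persists under conditional independence.

```python
import numpy as np

# Illustrative precision matrix with (Sigma^-1)_{13} = 0
precision = np.array([[2.0, 0.5, 0.0],
                      [0.5, 2.0, 0.5],
                      [0.0, 0.5, 2.0]])
Sigma = np.linalg.inv(precision)

cov_13 = Sigma[0, 2]                  # nonzero: x1 and x3 are marginally correlated
prec_13 = np.linalg.inv(Sigma)[0, 2]  # (numerically) zero: conditionally independent
```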
Sparse Precision matrix estimation
Add an l1 penalty to the maximum-likelihood estimation problem:
max_X [log det X - Trace(C_hat X) - lambda * ||X||_1]
The penalized problem yields an invertible result even if C_hat is not positive definite, and it remains convex. The l1 penalty encourages a sparse precision matrix.
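One way to solve this penalized problem (not specified in the notes) is proximal gradient descent: take a gradient step on the smooth part, then soft-threshold the off-diagonal entries. A sketch on synthetic data, with illustrative step size, penalty, and iteration count:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ground truth: a tridiagonal (sparse) precision matrix
p, n = 5, 200
true_prec = np.eye(p) + 0.3 * np.diag(np.ones(p - 1), 1) + 0.3 * np.diag(np.ones(p - 1), -1)
samples = rng.multivariate_normal(np.zeros(p), np.linalg.inv(true_prec), size=n)
C_hat = np.cov(samples, rowvar=False, bias=True)

def soft_threshold(M, tau):
    """Entrywise soft-thresholding: the prox operator of tau * ||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def sparse_precision(C, lam=0.1, step=0.05, iters=500):
    """Proximal-gradient sketch for max_X log det X - Trace(C X) - lam*||X||_1,
    penalizing only the off-diagonal entries of X."""
    X = np.eye(C.shape[0])
    for _ in range(iters):
        grad = C - np.linalg.inv(X)          # gradient of the (negated) smooth part
        Y = X - step * grad
        Z = soft_threshold(Y, step * lam)
        np.fill_diagonal(Z, np.diag(Y))      # leave the diagonal unpenalized
        Z = (Z + Z.T) / 2                    # keep the iterate symmetric
        if np.linalg.eigvalsh(Z).min() > 1e-8:   # accept only positive definite steps
            X = Z
    return X

X_hat = sparse_precision(C_hat)
```

By construction the output is symmetric positive definite, hence invertible, regardless of whether C_hat itself is positive definite. For real use, scikit-learn's `GraphicalLasso` solves this same problem with a more refined algorithm.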