Week 2 Flashcards
What do we call the Gaussian distribution if we generalise it to define a density function over continuous vectors?
Multivariate Gaussian distribution.
How do we define the multivariate Gaussian density for a vector x = [x.1, …, x.D]^T ?
p(x) = (1 / ((2*pi)^(D/2) * |SIG|^(1/2))) * exp{ -0.5 * (x-mu)^T * SIG^-1 * (x-mu) }
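A minimal numpy sketch of this density (the dimension D = 2, the test point x, and the choice SIG = I below are illustrative). With SIG = I the density should factorise into a product of 1-D standard Gaussians, which gives a quick sanity check:

```python
import numpy as np

def mvn_pdf(x, mu, sig):
    """Multivariate Gaussian density, following the formula above."""
    D = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (D / 2) * np.linalg.det(sig) ** 0.5
    return np.exp(-0.5 * diff @ np.linalg.inv(sig) @ diff) / norm

# With SIG = I and mu = 0, the density is a product of 1-D standard Gaussians.
x = np.array([0.5, -1.0])
mu = np.zeros(2)
sig = np.eye(2)
d1 = mvn_pdf(x, mu, sig)
d2 = np.prod(np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi))
print(np.isclose(d1, d2))  # True
```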
What is the mu in the formula for the multivariate Gaussian density?
The mean, a vector of the same size as vector x.
What does the d-th element of mu tell us in the formula for multivariate Gaussian density?
The mean value of x.d
What is the form of the variance, the SIG, in the formula for multivariate Gaussian density?
a DxD covariance matrix
I^-1 ==
(identity matrix)
I
I * matrix ==
(identity matrix I)
matrix
The exp of a sum gives the same result as…
the product of exps
|A| is the … of matrix A
determinant
How do you calculate the determinant of a 2x2 matrix A:
[ a b
c d ]
|A| = ad - bc
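A quick numpy check of the 2x2 rule (the matrix entries are arbitrary):

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [1.0, 4.0]])
a, b, c, d = A.ravel()
# |A| = ad - bc for a 2x2 matrix
print(np.isclose(np.linalg.det(A), a * d - b * c))  # True
```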
What is |I|?
1
(2*pi)^(D/2) can be written as…
PRODUCT(d=1 to D) of (2*pi)^(1/2)
What is Tr(A) for matrix A?
the trace of a square matrix A, the sum of the diagonal elements of A
If A = I.D, so the DxD identity matrix, then Tr(I.D) =
SUM(d=1 to D) 1 = D
Tr(AB) ==
Tr(BA)
Tr(w^T * w) ==
w^T * w,
since w^T * w results in a scalar.
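The trace identities above can be checked numerically; a small sketch (the matrix shapes and random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 3))

# Tr(AB) == Tr(BA), even when A and B are not square.
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # True

# Tr(I.D) == D
print(np.trace(np.eye(5)) == 5.0)  # True

# w^T * w is a scalar (a 1x1 matrix), so its trace is itself.
w = rng.normal(size=(3, 1))
print(np.isclose(np.trace(w.T @ w), (w.T @ w).item()))  # True
```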
What does the binomial distribution describe?
The probability of a certain number of successes in N binary events.
How do you calculate the probability of y successes in N independent binary trials, each with success probability r?
P(Y = y ) = (N over y) * r^y * (1-r)^(N-y)
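The binomial formula above can be written directly with the standard library; a minimal sketch (the values N = 10, r = 0.3 are illustrative). Summing over all possible y should give 1:

```python
from math import comb

def binom_pmf(y, N, r):
    """P(Y = y) = (N over y) * r^y * (1-r)^(N-y)."""
    return comb(N, y) * r**y * (1 - r) ** (N - y)

# The probabilities of all outcomes y = 0..N sum to 1.
N, r = 10, 0.3
total = sum(binom_pmf(y, N, r) for y in range(N + 1))
print(abs(total - 1.0) < 1e-12)  # True
```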
When is a likelihood-prior pair conjugate?
If it results in a posterior of the same form as the prior.
Suppose we have data generated by a model with distribution t ~ N(X.w, sig^2 * I). What does this say about X.w?
X.w is the mean vector: its d-th element gives the expected value of t.d for the d-th input row of X. Each element of t is a univariate Gaussian random variable, and t as a whole is a multivariate Gaussian with mean X.w.
What can you do in the equation
E[w^] = E[ (X^T * X)^-1 * X^T * t ], given that (X^T * X)^-1 * X^T is a linear function of t?
Because expectation is linear, you can swap the order of the expectation and the linear function, so:
E[w^] = (X^T * X)^-1 * X^T * E[t].
What is the expectation of a multivariate normal distribution?
The mean.
Why do (X^T * X)^-1 and (X^T * X) cancel each other out?
They are inverses of each other.
When do we call an estimator x unbiased?
When it has the property that
E[x^] = x.
cov[w^] ==
sig^2 * (X^T * X) ^-1
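A small simulation sketch illustrating both E[w^] = w and cov[w^] = sig^2 * (X^T X)^-1 (the design matrix X, true weights w, noise level sig, seed, and number of repetitions below are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), np.linspace(0, 1, 50)])  # design matrix
w_true = np.array([1.0, -2.0])
sig = 0.5

# Draw many datasets t ~ N(X.w, sig^2 * I) and refit w^ each time.
w_hats = []
for _ in range(5000):
    t = X @ w_true + sig * rng.normal(size=50)
    w_hats.append(np.linalg.solve(X.T @ X, X.T @ t))
w_hats = np.array(w_hats)

# Average of the estimates is close to the true w (unbiasedness).
print(np.allclose(w_hats.mean(axis=0), w_true, atol=0.05))
# Empirical covariance of w^ matches sig^2 * (X^T X)^-1.
cov_theory = sig**2 * np.linalg.inv(X.T @ X)
print(np.allclose(np.cov(w_hats.T), cov_theory, atol=0.05))
```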
The cov[w^] matrix is the inverse of…
the Hessian matrix of second partial derivatives of the negative log-likelihood with respect to w.
What does a large covariance mean in the cov[w^] matrix?
That we’re uncertain.
In linear regression, if there is a negative covariance between w.0^ and w.1^, then
the estimates are dependent: if one goes up, the other tends to go down.
P(Y.N = y.N | R=r) ==
(N over y.N) * r^y.N * ((1-r)^(N-y.N))
What do we need to compute the joint distribution of r and y.N, p(r, y.N)?
P(Y.N = y.N | R=r) and p(r). So the conditional distribution of Y.N given R and the density of r, the prior.
p(r|Y.N = y.N ) ==
p(r,y.N) / P(Y.N = y.N)
p(r, y.N) ==
P(Y.N = y.N | R=r) * p(r)
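The deck does not name the conjugate prior, but for a binomial likelihood in r the standard conjugate choice is a Beta density, giving the posterior Beta(a + y.N, b + N - y.N). A sketch verifying this numerically on a grid (the prior parameters a, b and the data N, y.N below are illustrative):

```python
import numpy as np
from math import comb, gamma

def beta_pdf(r, a, b):
    """Beta density with parameters a, b."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * r ** (a - 1) * (1 - r) ** (b - 1)

a, b = 2.0, 3.0   # Beta prior parameters (illustrative)
N, y = 10, 7      # number of trials and observed successes (illustrative)

# Bayes' rule on a grid: posterior = likelihood * prior, normalised.
r = np.linspace(1e-6, 1 - 1e-6, 20001)
unnorm = comb(N, y) * r**y * (1 - r) ** (N - y) * beta_pdf(r, a, b)
dr = r[1] - r[0]
posterior = unnorm / (unnorm.sum() * dr)

# Conjugacy: the posterior is again a Beta density, with updated parameters.
print(np.allclose(posterior, beta_pdf(r, a + y, b + N - y), atol=1e-3))
```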