Week 2 Flashcards

1
Q

What do we call the Gaussian distribution if we generalise it to define a density function over continuous vectors?

A

Multivariate Gaussian distribution.

2
Q

How do we define the multivariate Gaussian density for a vector x = [x.1, …, x.D]^T ?

A

p(x) = (1 / ((2*pi)^(D/2) * |SIG|^(1/2))) * exp{ -0.5 * (x - mu)^T * SIG^-1 * (x - mu) }
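As a sanity check on this formula, here is a minimal sketch in Python (assuming numpy and scipy are available, with made-up mu, SIG and x) that evaluates the density by hand and compares it with scipy.stats.multivariate_normal:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_pdf(x, mu, Sig):
    # Direct implementation of the density formula above.
    D = len(mu)
    diff = x - mu
    norm_const = (2 * np.pi) ** (D / 2) * np.linalg.det(Sig) ** 0.5
    return np.exp(-0.5 * diff @ np.linalg.inv(Sig) @ diff) / norm_const

mu = np.array([0.0, 1.0])
Sig = np.array([[2.0, 0.3], [0.3, 1.0]])
x = np.array([0.5, 0.5])

print(mvn_pdf(x, mu, Sig))                  # formula by hand
print(multivariate_normal(mu, Sig).pdf(x))  # library value, should match
```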

3
Q

What is the mu in the formula for the multivariate Gaussian density?

A

The mean, a vector of the same size as vector x.

4
Q

What does the d-th element of mu tell us in the formula for the multivariate Gaussian density?

A

The mean value of x.d

5
Q

What is the form of the variance term, SIG, in the formula for the multivariate Gaussian density?

A

a DxD covariance matrix

6
Q

I^-1 ==
(identity matrix)

A

I

7
Q

I * matrix ==
(identity matrix I)

A

matrix
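Both identity-matrix facts (cards 6 and 7) are easy to confirm numerically; a quick sketch, assuming numpy:

```python
import numpy as np

I = np.eye(3)
M = np.arange(9.0).reshape(3, 3)

print(np.allclose(np.linalg.inv(I), I))  # True: I^-1 == I
print(np.allclose(I @ M, M))             # True: I * matrix == matrix
```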

8
Q

The exp of a sum gives the same result as…

A

the product of exps
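A one-line numeric check of this identity, assuming numpy and arbitrary values a and b:

```python
import numpy as np

a, b = 0.3, 1.7
print(np.exp(a + b), np.exp(a) * np.exp(b))  # equal: exp(a + b) == exp(a) * exp(b)
```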

9
Q

|A| is the … of matrix A

A

determinant

10
Q

How do you calculate the determinant of a 2x2 matrix A:
[ a b
c d ]

A

|A| = ad - bc
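A quick worked check with made-up entries, assuming numpy; np.linalg.det should agree with ad - bc up to floating-point error:

```python
import numpy as np

a, b, c, d = 1.0, 2.0, 3.0, 4.0
A = np.array([[a, b], [c, d]])

print(a * d - b * c)      # ad - bc = -2.0
print(np.linalg.det(A))   # ~ -2.0 (floating point)
```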

11
Q

What is |I|?

A

1

12
Q

(2*pi)^(D/2) can be written as…

A

PRODUCT(d=1 to D) of (2*pi)^(1/2)
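This is just the power rule written out as D identical factors; a tiny check with an arbitrary D, assuming numpy:

```python
import numpy as np

D = 4
print((2 * np.pi) ** (D / 2))             # direct form
print(np.prod([(2 * np.pi) ** 0.5] * D))  # product of D factors of (2*pi)^(1/2)
```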

13
Q

What is Tr(A) for matrix A?

A

the trace of a square matrix A, the sum of the diagonal elements of A

14
Q

If A = I.D, the DxD identity matrix, then Tr(I.D) ==

A

SUM(d=1 to D) 1 = D

15
Q

Tr(AB) ==

A

Tr(BA)

16
Q

Tr(w^T * w) ==

A

w^T * w,
since w^T * w results in a scalar.
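The trace facts on cards 13-16 can all be verified in a few lines of numpy; a minimal sketch with arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

print(np.trace(np.eye(4)))                  # Tr(I_D) = D = 4.0

A = rng.normal(size=(3, 5))
B = rng.normal(size=(5, 3))
print(np.trace(A @ B), np.trace(B @ A))     # equal: Tr(AB) == Tr(BA)

w = rng.normal(size=(5, 1))
print(np.trace(w.T @ w), (w.T @ w).item())  # equal: Tr of a 1x1 result is the scalar itself
```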

17
Q

What does the binomial distribution describe?

A

The probability of a certain number of successes in N binary events.

18
Q

How do you calculate the probability of y successes in N binary events (e.g. coin tosses), each with success probability r?

A

P(Y = y) = (N over y) * r^y * (1-r)^(N-y)
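A minimal numeric check of this pmf with made-up N, y and r, assuming scipy is available:

```python
from math import comb
from scipy.stats import binom

N, y, r = 10, 3, 0.4
print(comb(N, y) * r**y * (1 - r)**(N - y))  # the formula by hand
print(binom.pmf(y, N, r))                    # library pmf, should match
```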

19
Q

When is a likelihood-prior pair conjugate?

A

If it results in a posterior of the same form as the prior.
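The standard example in this setting is a Beta prior with a binomial likelihood: a Beta(alpha, beta) prior plus y successes in N trials gives a Beta(alpha + y, beta + N - y) posterior, i.e. the same family as the prior. A minimal sketch with made-up numbers, assuming scipy:

```python
from scipy.stats import beta

alpha0, beta0 = 2.0, 2.0   # Beta prior on the success probability r
N, y = 10, 7               # observed: y successes in N trials

# Conjugacy: the posterior is again a Beta distribution.
posterior = beta(alpha0 + y, beta0 + (N - y))
print(posterior.mean())    # posterior mean of r
```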

20
Q

Suppose we have data generated by a model with distribution t ~ N(X*w, sig^2 * I). What does this say about X*w?

A

X*w is the mean vector: its n-th element is the mean of the t we expect for the n-th input. Each element of t is then a Gaussian random variable, and t as a whole is a multivariate Gaussian.

21
Q

What can you do in the equation
E[w^] = E[(X^T * X)^-1 * X^T * t], given that (X^T * X)^-1 * X^T * t is a linear function of t?

A

You can swap the order in which you take the expectation and apply the linear function, so:
E[w^] = (X^T * X)^-1 * X^T * E[t].

22
Q

What is the expectation of a multivariate normal distribution?

A

The mean.

23
Q

Why do (X^T * X)^-1 and (X^T * X) cancel each other out?

A

They are inverses of each other.

24
Q

When do we call an estimator x^ (of a quantity x) unbiased?

A

When it has the property that
E[x^] = x.

25
Q

cov[w^] ==

A

sig^2 * (X^T * X)^-1
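Cards 20-25 can be tied together with a small simulation: repeatedly draw t ~ N(X*w, sig^2 * I), fit w^ by least squares, and check that the average w^ is w (unbiasedness) and that its covariance matches sig^2 * (X^T * X)^-1. A sketch with a made-up design matrix X, true w and sig, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(1)
N, sig = 200, 0.5
X = np.column_stack([np.ones(N), rng.uniform(-1, 1, N)])  # hypothetical design matrix
w_true = np.array([1.0, -2.0])

w_hats = []
for _ in range(5000):
    t = X @ w_true + sig * rng.normal(size=N)         # t ~ N(Xw, sig^2 I)
    w_hats.append(np.linalg.solve(X.T @ X, X.T @ t))  # w^ = (X^T X)^-1 X^T t
w_hats = np.array(w_hats)

print(w_hats.mean(axis=0))              # ~ [1.0, -2.0]: E[w^] = w (unbiased)
print(np.cov(w_hats.T))                 # empirical cov[w^] ...
print(sig**2 * np.linalg.inv(X.T @ X))  # ... ~ sig^2 (X^T X)^-1
```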

26
Q

The cov[w^] matrix is the inverse of…

A

the Hessian matrix of second partial derivatives of the (negative) log-likelihood with respect to w.

27
Q

What does a large covariance mean in the cov[w^] matrix?

A

That we are uncertain about the corresponding parameter estimates.

28
Q

In linear regression, if there is a negative covariance between w.0^ and w.1^, then

A

there is a dependence such that if one goes up, the other goes down.

29
Q

P(Y.N = y.N | R=r) ==

A

(N over y.N) * r^y.N * (1-r)^(N-y.N)

30
Q

What do we need to compute the joint distribution of r and y.N, p(r, y.N)?

A

P(Y.N = y.N | R=r) and p(r): the conditional distribution of Y.N given R, and the prior density of r.

31
Q

p(r|Y.N = y.N ) ==

A

p(r,y.N) / P(Y.N = y.N)

32
Q

p(r, y.N) ==

A

P(Y.N = y.N | R=r) * p(r)
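Cards 29-32 can be combined into a small numeric sketch: evaluate the joint p(r, y.N) on a grid of r values, then divide by a numerical approximation of P(Y.N = y.N) to get the posterior. Assuming numpy, made-up N and y.N, and a uniform prior p(r):

```python
import numpy as np
from math import comb

N, y = 10, 7
r_grid = np.linspace(0.001, 0.999, 999)
dr = r_grid[1] - r_grid[0]

prior = np.ones_like(r_grid)                                    # uniform p(r) on [0, 1]
joint = comb(N, y) * r_grid**y * (1 - r_grid)**(N - y) * prior  # p(r, y.N) on the grid
marginal = joint.sum() * dr                                     # ~ P(Y.N = y.N)
posterior = joint / marginal                                    # p(r | Y.N = y.N)

print(r_grid[np.argmax(posterior)])  # posterior mode, ~ y/N = 0.7 with a flat prior
```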