Gaussian Processes Flashcards

1
Q

What is a Gaussian Process?

A

A generalization of the multivariate Gaussian distribution to infinitely many variables. Formally, a Gaussian process is a collection of random variables, any finite number of which has a joint Gaussian distribution.
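
To make the definition concrete, here is a minimal numpy sketch (assuming a zero mean function and a squared-exponential kernel, both chosen purely for illustration): evaluating the process at any finite set of inputs gives an ordinary multivariate Gaussian.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential covariance k(x, x') = s^2 * exp(-(x - x')^2 / (2 l^2))."""
    sq_dists = (X1[:, None] - X2[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dists / lengthscale**2)

# Any finite set of inputs...
X = np.linspace(-3, 3, 7)

# ...induces a plain multivariate Gaussian over the function values f(X).
mean = np.zeros(len(X))                          # zero mean function (assumed here)
cov = rbf_kernel(X, X) + 1e-9 * np.eye(len(X))   # Gram matrix k(X, X) + jitter

# Sampling from this finite-dimensional Gaussian draws the GP prior at X.
samples = np.random.default_rng(0).multivariate_normal(mean, cov, size=3)
print(samples.shape)  # (3, 7)
```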

2
Q

What is the formula for the posterior mean in a Gaussian process?

A

m_post(x*) = m(x*) + k(x*, X) (k(X, X) + sigma^2 I)^(-1) (y - m(X))
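
A minimal numpy sketch of this formula, assuming a zero mean function m(x) = 0 and a squared-exponential kernel; the toy data and all names are illustrative only:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, signal_var=1.0):
    sq_dists = (X1[:, None] - X2[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dists / lengthscale**2)

# Toy training data (illustrative only).
X = np.array([-2.0, -1.0, 0.0, 1.5])
y = np.sin(X)
sigma2 = 0.1                        # noise variance
X_star = np.linspace(-3, 3, 50)     # test inputs

# m_post(x*) = m(x*) + k(x*, X) (k(X, X) + sigma^2 I)^(-1) (y - m(X)),
# with m(x) = 0 everywhere in this sketch.
K = rbf_kernel(X, X) + sigma2 * np.eye(len(X))
K_star = rbf_kernel(X_star, X)
m_post = K_star @ np.linalg.solve(K, y)
```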

3
Q

What is the formula for the posterior covariance in a Gaussian process?

A

k_post(x*, x*') = k(x*, x*') - k(x*, X) (k(X, X) + sigma^2 I)^(-1) k(X, x*')
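
The same illustrative setup (zero mean, squared-exponential kernel, toy data), sketched for the posterior covariance; the diagonal of the result is the predictive variance:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, signal_var=1.0):
    sq_dists = (X1[:, None] - X2[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dists / lengthscale**2)

X = np.array([-2.0, -1.0, 0.0, 1.5])   # toy training inputs (illustrative)
sigma2 = 0.1                           # noise variance
X_star = np.linspace(-3, 3, 50)        # test inputs

# k_post(x*, x*') = k(x*, x*') - k(x*, X) (k(X, X) + sigma^2 I)^(-1) k(X, x*')
K = rbf_kernel(X, X) + sigma2 * np.eye(len(X))
K_ss = rbf_kernel(X_star, X_star)
K_s = rbf_kernel(X_star, X)
k_post = K_ss - K_s @ np.linalg.solve(K, K_s.T)

# The diagonal gives the predictive variance at each test input.
var_post = np.diag(k_post)
```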

4
Q

How can we create new covariance functions?

A

If k1 and k2 are covariance functions and u(x) is a mapping of the input space, then

1) k1 + k2
2) k1 * k2
3) k1(u(x), u(x'))

are also covariance functions.
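
A small sketch of these three composition rules, using a squared-exponential kernel, a periodic kernel, and a log input warp as the illustrative building blocks (all chosen here for illustration, not prescribed by the card):

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    return np.exp(-0.5 * (x1 - x2) ** 2 / lengthscale**2)

def periodic(x1, x2, period=1.0, lengthscale=1.0):
    return np.exp(-2.0 * np.sin(np.pi * np.abs(x1 - x2) / period) ** 2 / lengthscale**2)

def u(x):
    """An input-space transform, here a log warp (illustrative)."""
    return np.log(1.0 + np.abs(x))

# 1) Sums of kernels are kernels.
k_sum = lambda x1, x2: rbf(x1, x2) + periodic(x1, x2)

# 2) Products of kernels are kernels.
k_prod = lambda x1, x2: rbf(x1, x2) * periodic(x1, x2)

# 3) A kernel applied to transformed inputs is a kernel.
k_warp = lambda x1, x2: rbf(u(x1), u(x2))

print(k_sum(0.3, 1.2), k_prod(0.3, 1.2), k_warp(0.3, 1.2))
```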

5
Q

Name some parameters of the GP.

A

The parameters of the mean and covariance functions (e.g. the lengthscale and signal variance of an RBF kernel) and the noise variance.

6
Q

How can we choose hyperparameters?

A

Maximize the marginal likelihood, i.e. the likelihood with f integrated out (called type II maximum likelihood, or ML-II).
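
A hedged sketch of ML-II using scipy.optimize.minimize on the negative log marginal likelihood, assuming a zero mean function and a squared-exponential kernel; the toy data and initial values are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X1, X2, lengthscale, signal_var):
    sq_dists = (X1[:, None] - X2[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dists / lengthscale**2)

X = np.linspace(-3, 3, 20)                                        # toy inputs
y = np.sin(X) + 0.1 * np.random.default_rng(0).normal(size=X.shape)

def neg_log_marginal_likelihood(log_params):
    """-log p(y | X, theta) with f integrated out; parameters kept positive via log scale."""
    lengthscale, signal_var, noise_var = np.exp(log_params)
    K = rbf_kernel(X, X, lengthscale, signal_var) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K + 1e-9 * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log p(y) = -1/2 y^T K^{-1} y - 1/2 log|K| - N/2 log(2 pi)
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * len(X) * np.log(2 * np.pi)

# Minimizing the negative log marginal likelihood == maximizing the marginal likelihood.
result = minimize(neg_log_marginal_likelihood, x0=np.log([1.0, 1.0, 0.1]))
print(np.exp(result.x))   # fitted lengthscale, signal variance, noise variance
```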

7
Q

What three local optima do we often get when maximizing the marginal likelihood, especially with little data?

A
1. High noise variance, long length scale (almost linear)
2. Medium noise variance, medium length scale
3. Low noise variance, short length scale (highly non-linear)

8
Q

What do we need to fully specify a Gaussian process?

A

A mean function and a covariance function.

9
Q

Why don't we use maximum likelihood or MAP to find hyperparameters?

A

This would lead to overfitting: it is possible to set f(X) = y and let the noise variance go to 0, giving an arbitrarily high likelihood. The marginal likelihood does not fit the function values but integrates them out, so overfitting cannot happen in the same way.

10
Q

What properties does a covariance (kernel) function have?

A

Symmetric and positive semi-definite.
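
A quick numerical check of both properties on a Gram matrix built from a squared-exponential kernel (an illustrative choice):

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    sq_dists = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * sq_dists / lengthscale**2)

X = np.random.default_rng(0).uniform(-3, 3, size=30)
K = rbf_kernel(X, X)

# Symmetric: k(x, x') = k(x', x), so K == K^T.
print(np.allclose(K, K.T))

# Positive semi-definite: all eigenvalues of the Gram matrix are >= 0
# (up to numerical round-off).
print(np.linalg.eigvalsh(K).min() >= -1e-10)
```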

11
Q

How do Gaussian processes scale in the number of training points N with respect to training, prediction, and memory requirements?

A

Training: O(N^3); prediction: O(N^2); memory: O(ND + N^2), for N training points in D dimensions.
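
A sketch of where these costs arise in a standard Cholesky-based implementation, assuming the same zero-mean, squared-exponential setup as the earlier sketches:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    sq_dists = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * sq_dists / lengthscale**2)

N = 200                                          # N training points, D = 1 here
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=N)                   # O(ND) memory for the inputs
y = np.sin(X) + 0.1 * rng.normal(size=N)

K = rbf_kernel(X, X) + 0.01 * np.eye(N)          # O(N^2) memory for the Gram matrix
L = np.linalg.cholesky(K)                        # training: O(N^3) Cholesky factorization
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # O(N^2) triangular solves

x_star = np.array([0.5])                         # a single test input
k_star = rbf_kernel(x_star, X)                   # O(N)
mean = k_star @ alpha                            # predictive mean: O(N) per test point
v = np.linalg.solve(L, k_star.T)                 # O(N^2)
var = rbf_kernel(x_star, x_star) - v.T @ v       # predictive variance: O(N^2) per test point
```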
