Bayesian Deep Learning Flashcards
Lecture 11
What is the motivation for a Bayesian approach in deep learning?
We want not only a classification and the confidence of a specific class relative to the other classes, but also the uncertainty of that confidence. The uncertainty in the parameters is propagated into the uncertainty of the predictions.
Simply put, it is a statistical approach to modelling the network's certainty, which is useful in any safety-critical neural network application.
How is uncertainty modeled in Bayesian classification?
The uncertainty of a prediction/classification is the variance of the predictive distribution, i.e. of p(y|x, w) averaged over the posterior p(w|D).
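As a worked form of this (assuming D denotes the training data), the predictive distribution averages p(y|x, w) over the posterior, and its variance is the reported uncertainty:

$$p(y \mid x, D) = \int p(y \mid x, w)\, p(w \mid D)\, dw$$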
Mention some other applications where a Bayesian approach can be used.
- Natural interpretation for regularization
- Model selection
- Input data selection (active learning)
Why does Narada go on a mathematical spree in his lecture when applying the Bayesian approach to deep learning?
- The denominator of the Bayesian posterior, ∫ p(Y|X, w) p(w) dw, is intractable, so we want methods for estimating this denominator (see the sketch below).
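A sketch of the Bayes' rule being applied to the weights, assuming the training data is D = (X, Y):

$$p(w \mid D) = \frac{p(Y \mid X, w)\, p(w)}{\int p(Y \mid X, w')\, p(w')\, dw'}$$

The integral in the denominator runs over the full weight space, which is what makes it intractable for deep networks.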
Mention some approaches for estimating the denominator of the Bayesian approach.
- Monte Carlo techniques (MCMC - Markov Chain Monte Carlo)
- Variational inference
- Introducing random elements in training (dropout)
Why do we bother trying to calculate p(w|D) when we are actually interested in the variance of p(y|x,w)?
Because the variance of the prediction depends on the posterior p(w|D) both directly and indirectly through the mean of the prediction, so we need p(w|D) to compute it.
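One way to make the "directly and through the mean" statement concrete is the law of total variance applied to the prediction (a general identity, not a formula quoted from the lecture):

$$\operatorname{Var}(y \mid x, D) = \mathbb{E}_{p(w \mid D)}\big[\operatorname{Var}(y \mid x, w)\big] + \operatorname{Var}_{p(w \mid D)}\big(\mathbb{E}[y \mid x, w]\big)$$

Both terms require the posterior p(w|D): the first averages over it directly, the second measures how the mean prediction varies under it.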
What is the goal of ELBO?
The same as the other approaches (e.g. the Monte Carlo approaches), i.e. to find (an approximation of) p(w|D).
What is the idea of ELBO?
Find a distribution q(w) such that q(w) is approximately p(w|D)
Formulate the ELBO approach (high-level formula).
argmin over q(w) of KL(q(w) || p(w|D)), i.e. find the parameters of q such that q(w) and p(w|D) are as similar as possible.
What is the problem with ELBO?
The high-level formulation uses p(w|D), the very distribution we are trying to approximate, inside the KL term, so it cannot be evaluated directly.
How do we actually calculate ELBO?
Note that the KL divergence and another term, E_{q(w)}[ln(p(w,D)/q(w))] (the ELBO), sum to a constant, ln p(D). Thus maximising the ELBO, which only involves the tractable joint p(w,D) = p(Y|X,w) p(w), is the same as minimising the KL divergence.
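A sketch of that decomposition in symbols:

$$\ln p(D) = \underbrace{\mathbb{E}_{q(w)}\Big[\ln \tfrac{p(w, D)}{q(w)}\Big]}_{\text{ELBO}} + \operatorname{KL}\big(q(w) \,\|\, p(w \mid D)\big)$$

Since ln p(D) does not depend on q, pushing the ELBO up necessarily pushes the KL term down.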
Mention and explain a practical Bayesian approach for modeling uncertainty in your network.
Apply dropout during the testing phase of a neural network: feed the same input through the network several times and calculate the variance of the outputs. This can be viewed as a Bayesian approach, since dropout can be seen as sampling the weight space; in particular, most of the samples will lie close to our local minimum, i.e. approximately sampling from p(w|D).
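A minimal sketch of this Monte Carlo dropout procedure, assuming PyTorch; the architecture, dropout rate, and number of forward passes below are illustrative choices, not values from the lecture:

```python
import torch
import torch.nn as nn

# Toy classifier with a dropout layer (illustrative sizes).
model = nn.Sequential(
    nn.Linear(10, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # kept active at test time for MC dropout
    nn.Linear(64, 3),
)

model.train()            # keep dropout "on" during testing (MC dropout)

x = torch.randn(1, 10)   # a single test input
num_passes = 100         # number of stochastic forward passes

with torch.no_grad():
    # Each pass samples a different dropout mask, i.e. a different weight configuration.
    probs = torch.stack(
        [torch.softmax(model(x), dim=-1) for _ in range(num_passes)]
    )

mean_prediction = probs.mean(dim=0)  # approximate predictive distribution
uncertainty = probs.var(dim=0)       # spread across passes = model uncertainty
```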
What is the idea behind Monte Carlo techniques, and what are their problems?
We cannot sum/integrate over all possible weights, so we sample instead. The problem is that the samples often come from unimportant areas of the posterior distribution.
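A sketch of the Monte Carlo estimate of the predictive distribution, with S weight samples assumed drawn from the posterior:

$$p(y \mid x, D) \approx \frac{1}{S} \sum_{s=1}^{S} p(y \mid x, w_s), \qquad w_s \sim p(w \mid D)$$

The estimate is only as good as the samples: if they miss the high-probability regions of p(w|D), the estimate is poor.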