Formulas Flashcards

Q

What is the formula for mean squared error?

A
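
A commonly used form averages the squared differences between predictions and targets. The sketch below assumes that 1/n scaling (some course notes use a factor of 1/2 on the sum instead), and the function name `mean_squared_error` is illustrative rather than taken from the deck.

```python
def mean_squared_error(predictions, targets):
    """Average of squared differences between predictions and targets.

    Assumes 1/n scaling; other conventions differ only by a constant factor.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    n = len(predictions)
    return sum((z - t) ** 2 for z, t in zip(predictions, targets)) / n
```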

Q

What is the sigmoid function?

A
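
The logistic sigmoid is usually written as 1 / (1 + e^(-x)). A minimal sketch follows; the branch on the sign of x is an added numerical-stability choice, not part of the formula itself.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form exp(x) / (1 + exp(x))
    # so that exp() never receives a large positive argument.
    e = math.exp(x)
    return e / (1.0 + e)
```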

Q

What is the formula for the hyperbolic tangent function?

A
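
The hyperbolic tangent is (e^x - e^(-x)) / (e^x + e^(-x)). The sketch below writes that definition out directly; the name `tanh_from_exponentials` is illustrative, and in practice `math.tanh` is the safer choice for large |x|.

```python
import math


def tanh_from_exponentials(x):
    """tanh(x) = (e^x - e^-x) / (e^x + e^-x), written out literally."""
    ex = math.exp(x)
    enx = math.exp(-x)
    return (ex - enx) / (ex + enx)
```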

Q

What are the two equations used to update the weights using Momentum?

A
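
One common way to write the two momentum equations is: velocity <- momentum * velocity - lr * gradient, then weight <- weight + velocity. The sketch below assumes that form; the parameter names `lr` and `momentum` and their default values are illustrative.

```python
def momentum_step(weights, velocity, grads, lr=0.1, momentum=0.9):
    """One momentum update over lists of weights.

    Equation 1: v <- momentum * v - lr * grad
    Equation 2: w <- w + v
    """
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

Calling it repeatedly with the same gradient shows the velocity building up, which is the point of the method.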

Q

What is the equation for the running average of the gradients used in Adam?

A

Q

What is the equation for the squared gradients used in Adam?

A

Q

How is each parameter updated when using Adam?

A
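
A minimal single-parameter sketch of Adam, covering the running average of the gradients, the running average of the squared gradients, and the parameter update. It assumes the bias-corrected form of the algorithm (some course notes omit the correction), and the default hyperparameters are the commonly quoted ones rather than values from the deck.

```python
import math


def adam_step(w, m, v, grad, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter w at step t (t >= 1)."""
    # Running average of the gradients (first moment).
    m = beta1 * m + (1 - beta1) * grad
    # Running average of the squared gradients (second moment).
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction, which matters most for small t.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update.
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v
```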

Q

What is Bayes' rule?

A
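
Bayes' rule states P(A|B) = P(B|A) P(A) / P(B). The sketch below works it through for a binary hypothesis, expanding P(B) by the law of total probability; the function and argument names are illustrative.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """P(A|B) = P(B|A) P(A) / P(B), with P(B) summed over A and not-A."""
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1 - prior_a))
    return likelihood_b_given_a * prior_a / evidence
```

For example, with a prior of 0.01, P(B|A) = 0.9 and P(B|not A) = 0.1, the posterior is 0.009 / 0.108, i.e. 1/12.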

Q

What is the formula for the entropy of a discrete probability distribution?

A
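
The entropy of a discrete distribution p is H(p) = -sum_i p_i log p_i. The sketch below assumes base-2 logarithms, so the result is in bits (natural logarithms would give nats), and treats 0 log 0 as 0 by convention.

```python
import math


def entropy_bits(p):
    """H(p) = -sum_i p_i * log2(p_i), skipping zero-probability outcomes."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)
```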

Q

What is the formula for the KL-divergence between two probability distributions?

A
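
For discrete distributions p and q, the KL-divergence is D_KL(p || q) = sum_i p_i (log p_i - log q_i). The sketch below assumes natural logarithms and that q_i > 0 wherever p_i > 0; otherwise the divergence is infinite and `math.log` raises an error.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * (log p_i - log q_i), in nats."""
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0)
```

Note that the measure is not symmetric: swapping p and q generally gives a different value.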

Q

What is the formula for the entropy of a continuous probability distribution?

A
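
For a continuous distribution the sum in the discrete entropy becomes an integral (the differential entropy). The usual form, with x as an assumed name for the variable, is:

```latex
H(p) = -\int p(x) \, \log p(x) \, dx
```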

Q

What is the formula for the KL-divergence between two continuous probability distributions?

A
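
The continuous case replaces the sum of the discrete KL-divergence with an integral. The usual form for densities p and q, with x as an assumed name for the variable, is:

```latex
D_{KL}(p \,\|\, q) = \int p(x) \, \bigl( \log p(x) - \log q(x) \bigr) \, dx
```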

Q

What is the entropy of a Gaussian Distribution?

A
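
For a one-dimensional Gaussian with standard deviation sigma, the differential entropy is H = (1/2)(1 + log(2 pi sigma^2)), which does not depend on the mean. A small sketch in nats, with an illustrative function name:

```python
import math


def gaussian_entropy(sigma):
    """Differential entropy of N(mu, sigma^2) in nats: 0.5 * (1 + ln(2*pi*sigma^2))."""
    return 0.5 * (1.0 + math.log(2.0 * math.pi * sigma ** 2))
```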

Q

What is the entropy of a d-dimensional Gaussian distribution?

A
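
The standard result for a d-dimensional Gaussian with mean \mu and covariance \Sigma (symbol names assumed here) is:

```latex
H\bigl(\mathcal{N}(\mu, \Sigma)\bigr)
  = \frac{d}{2}\,\bigl(1 + \log 2\pi\bigr) + \frac{1}{2}\,\log \lvert \Sigma \rvert
```

As in the one-dimensional case, the entropy depends only on the covariance, not on the mean.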

Q

What is the KL-divergence between two d-dimensional multivariate Gaussian Distributions?

A
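
The standard closed form, assuming q = \mathcal{N}(\mu_1, \Sigma_1) and p = \mathcal{N}(\mu_2, \Sigma_2) as symbol names, is:

```latex
D_{KL}(q \,\|\, p) = \frac{1}{2} \left[
    (\mu_2 - \mu_1)^{\top} \Sigma_2^{-1} (\mu_2 - \mu_1)
    + \operatorname{tr}\!\bigl(\Sigma_2^{-1} \Sigma_1\bigr)
    + \log \frac{\lvert \Sigma_2 \rvert}{\lvert \Sigma_1 \rvert}
    - d
\right]
```

When the two distributions are identical, the quadratic term and the log term vanish and the trace equals d, so the divergence is zero.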

Q

What is the Wasserstein distance between two multivariate Gaussian Distributions?

A
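
Assuming the card refers to the 2-Wasserstein distance, the known closed form for q = \mathcal{N}(\mu_1, \Sigma_1) and p = \mathcal{N}(\mu_2, \Sigma_2) (symbol names assumed here) is:

```latex
W_2(q, p)^2 = \lVert \mu_1 - \mu_2 \rVert^2
  + \operatorname{tr}\!\left( \Sigma_1 + \Sigma_2
      - 2 \bigl( \Sigma_1^{1/2} \, \Sigma_2 \, \Sigma_1^{1/2} \bigr)^{1/2} \right)
```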

Q

What is the cross entropy error for a binary classification task?

A
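
For a target t in {0, 1} and a predicted probability z in (0, 1), the binary cross entropy is E = -(t log z + (1 - t) log(1 - z)). A minimal sketch for a single example, with z and t as assumed symbol names:

```python
import math


def binary_cross_entropy(z, t):
    """E = -(t * log(z) + (1 - t) * log(1 - z)) for 0 < z < 1 and t in {0, 1}."""
    return -(t * math.log(z) + (1 - t) * math.log(1 - z))
```

The loss shrinks towards zero as z approaches the target and grows without bound as z approaches the wrong extreme.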

Q

What is the Gaussian Distribution equation?

A
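
The one-dimensional Gaussian density with mean mu and standard deviation sigma is p(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)). A small sketch, with an illustrative function name and defaults chosen for the standard normal:

```python
import math


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
```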

Q

What is the multivariate Gaussian Distribution equation?

A
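
The standard density of a d-dimensional Gaussian with mean \mu and covariance \Sigma (symbol names assumed here) is:

```latex
p(x) = \frac{1}{\sqrt{(2\pi)^{d} \, \lvert \Sigma \rvert}}
       \exp\!\left( -\frac{1}{2} (x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right)
```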

Q

For softmax, what is Prob(i)?

A

Q

For softmax, what is log Prob(i)?

A

Q

What is the equation for the gradient using softmax?

A
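
The three softmax cards can be sketched together. With outputs z_j, the usual definitions are Prob(i) = exp(z_i) / sum_j exp(z_j), log Prob(i) = z_i - log sum_j exp(z_j), and d log Prob(i) / d z_j = [i == j] - Prob(j). The function names below are illustrative, and subtracting max(z) is an added numerical-stability step that leaves the results unchanged.

```python
import math


def softmax(z):
    """Prob(i) = exp(z_i) / sum_j exp(z_j)."""
    m = max(z)  # shift for numerical stability
    exps = [math.exp(zi - m) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(z):
    """log Prob(i) = z_i - log(sum_j exp(z_j))."""
    m = max(z)
    log_total = m + math.log(sum(math.exp(zi - m) for zi in z))
    return [zi - log_total for zi in z]


def grad_log_prob(z, i):
    """Gradient of log Prob(i) with respect to each z_j."""
    probs = softmax(z)
    return [(1.0 if j == i else 0.0) - pj for j, pj in enumerate(probs)]
```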

Q

What is the weight decay equation?

A
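
A hedged sketch of weight decay, assuming it is implemented as an L2 penalty (decay / 2) * sum of w^2 added to the loss. Under that assumption the gradient step becomes w <- (1 - lr * decay) * w - lr * grad, where `grad` is the gradient of the unpenalised loss; the names and default values are illustrative.

```python
def weight_decay_step(w, grad, lr=0.1, decay=0.01):
    """Gradient step with an L2 penalty: w <- (1 - lr * decay) * w - lr * grad."""
    return (1 - lr * decay) * w - lr * grad
```

With a zero gradient the weight simply shrinks by the factor (1 - lr * decay) each step, which is where the name comes from.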

Q

What is the formula for the mean?

A

What is the formula for the variance?
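
The mean and variance can be sketched as follows, assuming the population form of the variance (division by n, as typically used in batch normalisation) rather than the sample form (division by n - 1).

```python
def mean(xs):
    """mu = (1/n) * sum_i x_i"""
    return sum(xs) / len(xs)


def variance(xs):
    """sigma^2 = (1/n) * sum_i (x_i - mu)^2 (population variance)."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)
```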

What is the equation used for batch normalisation?

What is the loss function for Neural Style Transfer?

What is the true value (V*) of the current state?

What is the formula for Q*(s,a)?

What is the formula for the fitness of a policy?

What are we trying to minimise with Double Q-learning?

What is the formula for the Advantage function?

What is the equation for GANs?

What is the threshold activation function (in the context of this course)?

The output is 1 if the input is greater than the bias, and 0 if it is less than the bias.

In a deterministic environment, with a learning rate of 1, what is the Q-learning update rule?
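
With a deterministic environment and a learning rate of 1, the usual Q-learning rule reduces to Q(s, a) <- r + gamma * max_b Q(s', b), where s' is the state reached from s by action a. The sketch below assumes a nested-dictionary Q-table and treats unseen or terminal next states as having value 0; both choices, and all names, are illustrative.

```python
def q_update(Q, state, action, reward, next_state, gamma=0.9):
    """Overwrite Q[state][action] with reward + gamma * max_b Q[next_state][b].

    Q maps each state to a dict of action -> value.
    """
    next_values = Q.get(next_state)
    best_next = max(next_values.values()) if next_values else 0.0
    Q[state][action] = reward + gamma * best_next
    return Q[state][action]
```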

Write the formula for activation Z of the node at location (j, k) in the ith filter of a Convolutional Neural Network which is connected by weights K to all nodes in an M x N window from the L filters (or channels) in the previous layer, assuming the bias weights are included in the activation function g().
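
One way to write this, assuming V denotes the activations of the previous layer and that indices start at zero (both are assumptions, not taken from the card), is:

```latex
Z^{i}_{j,k} = g\!\left(
    \sum_{l=0}^{L-1} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1}
    K^{i}_{l,m,n} \, V^{l}_{j+m,\; k+n}
\right)
```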

What is the formula for the number of free parameters in a CNN layer?

F x (1 + L x M x N), where F is the number of filters, M x N is the filter size, and L is the number of channels.

What are the functions that describe a LSTM?

What is the Variational Auto Encoder trained to maximise?

For GANs, what is the formula for V(G, D)?

What is the formula for number of weights per filter?

(filter width) x (filter height) x (input depth) + 1 (for bias)

Number of neurons in this layer?

(output width) x (output height) x (depth)

Number of connections into the neurons in a layer?

(num neurons) x (connections per neuron / filter weights - bias)

Number of independent parameters?

(num filters) x (num weights)
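
The counting formulas above can be checked with a short worked example. The layer dimensions used here (5 x 5 filters, input depth 3, 6 filters, 28 x 28 output) are illustrative assumptions, as are the function names.

```python
def weights_per_filter(filter_width, filter_height, input_depth):
    """(filter width) x (filter height) x (input depth) + 1 for the bias."""
    return filter_width * filter_height * input_depth + 1


def neurons_in_layer(output_width, output_height, depth):
    """(output width) x (output height) x (depth)."""
    return output_width * output_height * depth


def independent_parameters(num_filters, filter_width, filter_height, input_depth):
    """(num filters) x (weights per filter), i.e. F x (1 + L x M x N)."""
    return num_filters * weights_per_filter(filter_width, filter_height, input_depth)


# Worked example with the assumed dimensions.
per_filter = weights_per_filter(5, 5, 3)          # 5 * 5 * 3 + 1 = 76
neurons = neurons_in_layer(28, 28, 6)             # 28 * 28 * 6 = 4704
parameters = independent_parameters(6, 5, 5, 3)   # 6 * 76 = 456
```

Because the filter weights are shared across all spatial positions, the number of independent parameters (456) is far smaller than the number of neurons (4704).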