Formulas Flashcards

1
Q

What is the formula for mean squared error?

A
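A minimal sketch, assuming the common definition (the average of the squared differences between targets and outputs). Some courses instead use a sum with a factor of 1/2, so treat the normalisation here as an assumption. The function name is illustrative.

```python
def mean_squared_error(targets, outputs):
    """Average of squared differences between targets and outputs."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumed normalisation: divide by the number of items.
    return sum((t - z) ** 2 for t, z in zip(targets, outputs)) / len(targets)
```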
2
Q

What is the sigmoid function?

A
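A minimal sketch of the standard logistic sigmoid, 1 / (1 + e^(-x)). The split on the sign of x is an added detail (an assumption, not part of the card) that keeps the exponential from overflowing for large negative inputs.

```python
import math

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Equivalent form e^x / (1 + e^x), safe when x is very negative.
    e = math.exp(x)
    return e / (1.0 + e)
```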
3
Q

What is the formula for the hyperbolic tangent function?

A
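A minimal sketch of the standard definition, tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)). The function name is illustrative, and this direct form overflows for very large |x|; in practice `math.tanh` is the safer choice.

```python
import math

def tanh_from_exp(x):
    """Hyperbolic tangent written out from its exponential definition."""
    ex, enx = math.exp(x), math.exp(-x)
    return (ex - enx) / (ex + enx)
```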
4
Q

What are the two equations used to update the weights using Momentum?

A
5
Q

What is the equation for the running average of the gradients used in Adam?

A
6
Q

What is the equation for the running average of the squared gradients used in Adam?

A
7
Q

How is each parameter updated when using Adam?

A
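A sketch of one Adam step for a single scalar parameter, assuming the commonly published form with bias correction. The running averages are m = beta1 * m + (1 - beta1) * g and v = beta2 * v + (1 - beta2) * g^2. The default hyperparameters and the function name are assumptions, and a given course may omit the bias-correction terms.

```python
import math

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the updated (w, m, v) after one Adam step at time step t >= 1."""
    m = beta1 * m + (1 - beta1) * g          # running average of gradients
    v = beta2 * v + (1 - beta2) * g * g      # running average of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction (assumed variant)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v
```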
8
Q

What is Bayes' rule?

A
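Bayes' rule states P(A|B) = P(B|A) P(A) / P(B). A small worked sketch, assuming a binary event A so that P(B) can be expanded over A and not-A. The parameter names are illustrative.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) from the prior P(A) and the two likelihoods of B."""
    # P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    return p_b_given_a * prior_a / evidence
```

For example, with a 1% prior, a 90% true-positive rate and a 10% false-positive rate, the posterior is only 1/12, or about 8.3%.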
9
Q

What is the formula for the entropy of a discrete probability distribution?

A
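A minimal sketch of the standard definition, H(p) = -sum_i p_i log p_i. The base-2 logarithm (entropy in bits) is an assumption; use base e for nats. Zero-probability terms are skipped, following the usual convention that 0 log 0 = 0.

```python
import math

def entropy(p, base=2.0):
    """Entropy of a discrete distribution given as a list of probabilities."""
    return -sum(pi * math.log(pi, base) for pi in p if pi > 0)
```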
10
Q

What is the formula for KL-divergence for two probability distributions?

A
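A minimal sketch of the standard definition, D_KL(p || q) = sum_i p_i log(p_i / q_i). The base-2 logarithm and the handling of zeros (a term with p_i = 0 contributes nothing; q_i = 0 with p_i > 0 gives infinity) are the usual conventions, stated here as assumptions.

```python
import math

def kl_divergence(p, q, base=2.0):
    """KL-divergence D_KL(p || q) for two discrete distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue            # 0 * log(0 / q) is taken as 0
        if qi == 0:
            return math.inf     # p puts mass where q has none
        total += pi * math.log(pi / qi, base)
    return total
```

Note that the result is not symmetric: swapping p and q generally gives a different value.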
11
Q

What is the formula for the entropy of a continuous probability distribution?

A
12
Q

What is the formula for the KL-divergence of a continuous probability distribution?

A
13
Q

What is the entropy of a Gaussian Distribution?

A
14
Q

What is the entropy of a d-dimensional Gaussian distribution?

A
15
Q

What is the KL-divergence between two d-dimensional multivariate Gaussian Distributions?

A
16
Q

What is the Wasserstein distance between two multivariate Gaussian Distributions?

A
17
Q

What is the cross entropy error for a binary classification task?

A
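A minimal sketch of the standard form for a single item, E = -(t log z + (1 - t) log(1 - z)), where t is the target (0 or 1) and z is the predicted probability of class 1. The clipping constant `eps` is an added assumption to avoid log(0).

```python
import math

def binary_cross_entropy(target, output, eps=1e-12):
    """Cross entropy error for one binary target and one predicted probability."""
    z = min(max(output, eps), 1.0 - eps)   # keep z strictly inside (0, 1)
    return -(target * math.log(z) + (1 - target) * math.log(1 - z))
```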
18
Q

What is the Gaussian Distribution equation?

A
19
Q

What is the multivariate Gaussian Distribution equation?

A
20
Q

For softmax, what is Prob(i)?

A
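A minimal sketch of the standard softmax, Prob(i) = exp(z_i) / sum_j exp(z_j). Subtracting the maximum before exponentiating is an added numerical-stability step (an assumption, not part of the card) that leaves the result unchanged.

```python
import math

def softmax(z):
    """Softmax probabilities for a list of scores z."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]   # shift by max to avoid overflow
    total = sum(exps)
    return [e / total for e in exps]
```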
21
Q

For softmax, what is log Prob(i)?

A
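Taking the log of the standard softmax gives log Prob(i) = z_i - log sum_j exp(z_j). A minimal sketch, with the max-shift added as an assumption for numerical stability:

```python
import math

def log_softmax(z):
    """Log of the softmax probabilities for a list of scores z."""
    m = max(z)
    log_sum = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - log_sum for v in z]
```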
22
Q

What is the equation for the gradient using softmax?

A
23
Q

What is the weight decay equation?

A
24
Q

What is the formula for the mean?

A
25
Q

What is the formula for the variance?

A
26
Q

What is the equation used for batch normalisation?

A
27
Q

What is the loss function for Neural Style Transfer?

A
28
Q

What is the true value (V*) of the current state?

A
29
Q

What is the formula for Q*(s,a)?

A
30
Q

What is the formula for the fitness of a policy?

A
31
Q

What are we trying to minimise with Double Q-learning?

A
32
Q

What is the formula for the Advantage function?

A
33
Q

What is the equation for GANs?

A
34
Q

What is the threshold activation function (in the context of this course)?

A

Output 1 if the input is greater than the bias, and 0 if it is less than the bias.
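The rule on this card can be sketched as a one-line function. The card does not say what happens when the input exactly equals the bias; returning 0 in that case is an assumption, as are the names used.

```python
def threshold_activation(total_input, bias):
    """Return 1 when the input exceeds the bias, otherwise 0."""
    # Equality is mapped to 0 here; the card leaves this case unspecified.
    return 1 if total_input > bias else 0
```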

35
Q

In a deterministic environment, with a learning rate of 1, what is the Q-learning update rule?

A
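With a learning rate of 1 in a deterministic environment, the standard Q-learning update reduces to Q(s, a) <- r + gamma * max_b Q(s', b). A minimal sketch, assuming a dictionary-based Q table with unseen entries treated as 0; the names and the default discount factor are illustrative.

```python
def q_update(q, state, action, reward, next_state, actions, gamma=0.9):
    """Overwrite Q(state, action) with reward + gamma * max_b Q(next_state, b)."""
    best_next = max(q.get((next_state, b), 0.0) for b in actions)
    q[(state, action)] = reward + gamma * best_next
    return q[(state, action)]
```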
36
Q

Write the formula for the activation Z of the node at location (j, k) in the ith filter of a Convolutional Neural Network which is connected by weights K to all nodes in an M x N window from the L filters (or channels) in the previous layer, assuming the bias weights are included in the activation function g().

A
37
Q

What is the formula for the number of free parameters in a CNN layer?

A

F x (1 + L x M x N), where F is the number of filters, M x N is the filter size, and L is the number of input channels
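The formula on this card can be checked with a short function. The function and argument names are illustrative.

```python
def conv_layer_free_parameters(num_filters, in_channels, filter_height, filter_width):
    """F x (1 + L x M x N): one bias plus L*M*N weights for each of F filters."""
    return num_filters * (1 + in_channels * filter_height * filter_width)
```

For example, 32 filters of size 5 x 5 over 3 input channels give 32 x (1 + 75) = 2432 free parameters.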

38
Q

What are the functions that describe a LSTM?

A
39
Q

What is the Variational Auto Encoder trained to maximise?

A
40
Q

For GANs, what is the formula for V(G, D)?

A
41
Q

What is the formula for number of weights per filter?

A

(filter width) x (filter height) x (input depth) + 1 (for bias)

42
Q

Number of neurons in a layer?

A

(output width) x (output height) x (depth)

43
Q

Number of connections into the neurons in a layer?

A

(num neurons) x (connections per neuron, i.e. filter weights minus the bias)

44
Q

Number of independent parameters?

A

(num filters) x (num weights per filter)
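The four counting rules on the last cards can be combined into one sketch. It assumes the output width and height are already known and that the layer depth equals the number of filters; the function and key names are illustrative.

```python
def conv_layer_counts(filter_w, filter_h, in_depth, out_w, out_h, num_filters):
    """Apply the per-filter, neuron, connection and parameter counting rules."""
    weights_per_filter = filter_w * filter_h * in_depth + 1   # +1 for the bias
    neurons = out_w * out_h * num_filters                     # depth = num filters (assumed)
    connections = neurons * (weights_per_filter - 1)          # the bias is not a connection
    parameters = num_filters * weights_per_filter
    return {
        "weights_per_filter": weights_per_filter,
        "neurons": neurons,
        "connections": connections,
        "parameters": parameters,
    }
```

For 6 filters of size 5 x 5 over a depth-3 input producing a 28 x 28 output, this gives 76 weights per filter, 4704 neurons, 352800 connections and 456 independent parameters.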