C2W3 Hyperparameters Tuning Flashcards
Tuning order
- Learning rate
- Beta (momentum term), mini-batch size, hidden units
- Number of layers, learning rate decay
- Adam parameters (beta1 = 0.9, beta2 = 0.999, epsilon = 10^-8)
How to tune learning rate
Sample on a logarithmic scale (10^-4, … 10^0): draw the exponent uniformly at random rather than the learning rate itself.
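A minimal sketch of log-scale sampling in NumPy (the range 10^-4 to 10^0 follows the card; variable names are illustrative):

```python
import numpy as np

# Sample the exponent uniformly in [-4, 0], then exponentiate:
# this spreads trials evenly across decades, instead of crowding
# 90% of samples into [0.1, 1.0] as uniform sampling would.
r = -4 * np.random.rand()   # r ~ Uniform[-4, 0]
alpha = 10 ** r             # learning rate in [10^-4, 10^0]
```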
Pandas vs caviar approach to training models
Pandas: babysit one model at a time (when compute is scarce); caviar: train many models in parallel and keep the best.
Batch normalisation
Normalise values deep in the hidden layers: for each layer, normalise the mean and variance of z (before the activation), then scale and shift with learnable parameters gamma and beta.
This makes the bias b redundant, since subtracting the batch mean cancels any constant added to z; beta takes over that role.
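A minimal NumPy sketch of the batch-norm forward step for one layer (function and variable names are illustrative, not from the course):

```python
import numpy as np

def batchnorm_forward(z, gamma, beta, eps=1e-8):
    # z has shape (units, batch); statistics are per unit over the mini-batch.
    mu = z.mean(axis=1, keepdims=True)        # batch mean
    var = z.var(axis=1, keepdims=True)        # batch variance
    z_norm = (z - mu) / np.sqrt(var + eps)    # zero mean, unit variance
    z_tilde = gamma * z_norm + beta           # learnable scale and shift
    return z_tilde, mu, var
```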
Batch norm at test time
During training, save the mean mu and variance sigma^2 of z for each mini-batch.
Compute exponentially weighted averages of them as running estimates.
At test time, use these estimates to normalise the hidden-unit values when evaluating a single test example.
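A hedged sketch of the test-time procedure, pairing with the forward pass above; the momentum value 0.9 is an illustrative choice:

```python
import numpy as np

def update_running_stats(running_mu, running_var, mu, var, momentum=0.9):
    # Exponentially weighted averages of the per-mini-batch statistics,
    # updated once per training step.
    running_mu = momentum * running_mu + (1 - momentum) * mu
    running_var = momentum * running_var + (1 - momentum) * var
    return running_mu, running_var

def batchnorm_test(z, gamma, beta, running_mu, running_var, eps=1e-8):
    # A single test example has no meaningful batch statistics,
    # so normalise with the running estimates collected during training.
    z_norm = (z - running_mu) / np.sqrt(running_var + eps)
    return gamma * z_norm + beta
```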
Softmax regression
The softmax layer is the last layer of the NN, with one unit per class.
Apply the softmax activation to the z values of this layer to get class probabilities that sum to 1.
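A numerically stable softmax in NumPy (subtracting the column-wise max is standard practice to avoid overflow, not something stated on the card):

```python
import numpy as np

def softmax(z):
    # z has shape (classes, examples); softmax is taken over the class axis.
    z_shift = z - z.max(axis=0, keepdims=True)  # guard against exp overflow
    t = np.exp(z_shift)
    return t / t.sum(axis=0, keepdims=True)

# Example: 3 classes, one column per example; each column sums to 1.
probs = softmax(np.array([[5.0], [2.0], [-1.0]]))
```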
Deep learning frameworks
Caffe/Caffe2
CNTK
DL4J
Keras
Lasagne
mxnet
PaddlePaddle
TensorFlow
Theano
Torch