Networks Flashcards

1
Q

What are two alternatives to the ReLU function?

A

ELU (exponential linear unit) and leaky ReLU.
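
As a rough illustration, both activations can be sketched in NumPy. The default values `negative_slope=0.01` and `alpha=1.0` are common choices assumed here, not values given on the card.

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    """Identity for positive inputs, a small linear slope for negative ones."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, negative_slope * x)

def elu(x, alpha=1.0):
    """Identity for positive inputs, alpha * (exp(x) - 1) for negative ones."""
    x = np.asarray(x, dtype=float)
    # Clamp the exponent at 0 so large positive inputs cannot overflow exp().
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))
```

Unlike ReLU, both functions keep a non-zero gradient for negative inputs.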

2
Q

What is the difference between SGD with momentum and SGD with Nesterov momentum?

A

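Nesterov momentum adds the velocity to the parameters before computing the gradient, so the gradient is evaluated at the look-ahead position rather than at the current parameters.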
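
The difference can be sketched as two update functions. The names `grad_fn`, `lr`, and `mu` are illustrative assumptions; `grad_fn` is any function that returns the gradient at a given parameter value.

```python
def momentum_step(w, v, grad_fn, lr=0.1, mu=0.9):
    """Classical momentum: gradient is taken at the current parameters."""
    v = mu * v - lr * grad_fn(w)
    return w + v, v

def nesterov_step(w, v, grad_fn, lr=0.1, mu=0.9):
    """Nesterov momentum: gradient is taken at the look-ahead point w + mu * v."""
    lookahead = w + mu * v
    v = mu * v - lr * grad_fn(lookahead)
    return w + v, v
```

The only change is where the gradient is evaluated; the velocity and parameter updates are otherwise identical.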

3
Q

What is the idea behind Adagrad?

A

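Adagrad reduces the effective learning rate of each parameter dimension in proportion to the square root of its accumulated sum of squared gradients, so dimensions with large gradients take smaller steps.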
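
A minimal sketch of one Adagrad step, assuming NumPy arrays; `cache` (the per-dimension sum of squared gradients) and the default `lr` and `eps` values are illustrative names and choices.

```python
import numpy as np

def adagrad_step(w, cache, grad, lr=0.01, eps=1e-8):
    """One Adagrad update; cache accumulates squared gradients per dimension."""
    cache = cache + grad ** 2
    # Dimensions with a large accumulated squared gradient get a smaller step.
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache
```

Because `cache` only grows, the effective learning rate shrinks monotonically over training.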

4
Q

What is the idea behind RMSprop?

A

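RMSprop reduces the effective learning rate of each parameter dimension whose running (exponentially decaying) mean of squared gradients is large. Unlike Adagrad, old gradients are gradually forgotten, so the learning rate does not shrink toward zero.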
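
A minimal sketch of one RMSprop step, assuming NumPy arrays; the `decay`, `lr`, and `eps` defaults are common choices assumed for illustration.

```python
import numpy as np

def rmsprop_step(w, cache, grad, lr=0.001, decay=0.9, eps=1e-8):
    """One RMSprop update; cache is a running mean of squared gradients."""
    cache = decay * cache + (1.0 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache
```

The only change from an accumulated-sum scheme is the exponentially decaying average in the first line.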

5
Q

What is the idea behind the Adam optimizer?

A

Combines SGD with momentum (a running mean of the gradients) and RMSprop (a running mean of the squared gradients).
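
A minimal sketch of one Adam step, assuming NumPy arrays. The bias-correction terms are part of the standard algorithm but are not mentioned on the card; the hyperparameter defaults are the commonly used ones, and `t` is the 1-based step count.

```python
import numpy as np

def adam_step(w, m, v, grad, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update at step t (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad        # momentum-like first moment
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # RMSprop-like second moment
    m_hat = m / (1.0 - beta1 ** t)              # correct the bias toward zero
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

On the first step the update has magnitude close to `lr` in every dimension, regardless of the gradient scale.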

6
Q

Name some regularizers

A

L1, L2, early stopping, dropout, max-norm constraints, data augmentation.

7
Q

Name two commonly used initialization methods for the weights of a neural network.

A

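w ~ Normal(mean 0, std 1/sqrt(n)), or Xavier: w ~ Uniform(-1/sqrt(n), 1/sqrt(n)), where n is the number of inputs to the layer (fan-in). The original Glorot/Xavier uniform formula uses the limit sqrt(6/(n_in + n_out)).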
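
Both schemes can be sketched with NumPy's random generator. The function names and the `(n_in, n_out)` weight shape are assumptions for illustration, with `n_in` taken to be the number of inputs to the layer.

```python
import numpy as np

def init_normal(n_in, n_out, rng):
    """Zero-mean Gaussian weights with standard deviation 1/sqrt(n_in)."""
    return rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))

def init_uniform(n_in, n_out, rng):
    """Uniform weights in [-1/sqrt(n_in), 1/sqrt(n_in)]."""
    limit = 1.0 / np.sqrt(n_in)
    return rng.uniform(-limit, limit, size=(n_in, n_out))
```

A typical call would be `init_uniform(784, 128, np.random.default_rng(0))`. Scaling by the fan-in keeps the variance of each unit's pre-activation roughly independent of layer width.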

8
Q

What is the formula for batch normalization?

A
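z_hat = (z - mu) / sqrt(sigma^2 + eps)
z_new = gamma * z_hat + beta, where mu and sigma^2 are the mini-batch mean and variance, eps is a small constant for numerical stability, and gamma and beta are learned parameters.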
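
A minimal training-time sketch of the formula, assuming `z` has shape `(batch, features)`. The learned scale and shift are named `gamma` and `beta` here, and `eps` is a small stabilizing constant assumed for illustration; the running statistics used at inference time are omitted.

```python
import numpy as np

def batch_norm(z, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then apply a learned scale and shift."""
    mu = z.mean(axis=0)
    var = z.var(axis=0)
    z_hat = (z - mu) / np.sqrt(var + eps)
    return gamma * z_hat + beta
```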
9
Q

What is the output size after a convolutional layer?

A

out = floor((in + 2*pad - filter_size) / stride) + 1
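
The formula applies per spatial dimension and can be checked with a small helper. The function name is hypothetical, and integer floor division is assumed when the stride does not divide evenly.

```python
def conv_output_size(in_size, filter_size, stride=1, pad=0):
    """Output length along one spatial dimension of a convolution."""
    return (in_size + 2 * pad - filter_size) // stride + 1
```

For example, a 32-wide input with a 5-wide filter, stride 1, and padding 2 keeps its width of 32.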

10
Q

Why are deep networks harder to train and how can we solve this?

A
  1. Vanishing gradient:
    - ReLU
    - Good initialization
    - Auxiliary classifiers
  2. Internal covariate shift:
    - Batch normalization
11
Q

What is the idea behind the inception net?

A

Each layer applies several filter sizes in parallel, and the network “learns” which filter sizes are most useful.
