Week 2 (Neural Nets) Flashcards

1
Q

What is a perceptron (and what activation function does it use)?

A

A single artificial neuron: it computes a weighted sum of its inputs plus a bias and passes the result through a step (threshold) activation function, outputting 0 or 1. It can only represent linearly separable functions.
2
Q

What is an MLP (and how is its activation function different)?

A

A multi-layer perceptron: layers of neurons in which each layer's outputs feed into the next layer's inputs. Instead of the perceptron's step function, it uses a differentiable non-linear activation (e.g. sigmoid, tanh or ReLU), which makes gradient-based training possible.
3
Q

What things need to be defined for an MLP?

A

The architecture (number of layers and number of units per layer), the activation function(s), the loss function, the weight initialisation, and the training hyperparameters (e.g. learning rate).
4
Q

How does the NN forward pass step work?

A

Starting from the input, each layer computes a weighted sum of the previous layer's outputs, a_j = sum_i w_ji z_i (plus bias), then applies the activation function, z_j = h(a_j). Repeating this layer by layer produces the network's output.
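The forward pass can be sketched in a few lines of NumPy. This is a minimal sketch: the 2-3-1 layer sizes, the random weights and the choice of sigmoid everywhere are illustrative assumptions, not from the cards.

```python
import numpy as np

def sigmoid(a):
    # Logistic activation h(a) = 1 / (1 + e^(-a)), applied element-wise.
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, weights, biases):
    """Forward pass: for each layer, pre-activation a = W z + b
    (a_j = sum_i w_ji z_i + b_j), then activation z = h(a).
    Returns every layer's output; zs[0] is the input itself, and the
    intermediate zs are kept because backpropagation reuses them."""
    zs = [x]
    for W, b in zip(weights, biases):
        zs.append(sigmoid(W @ zs[-1] + b))
    return zs

# A tiny 2-3-1 network with random weights (sizes are illustrative).
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
biases = [np.zeros(3), np.zeros(1)]
y = forward(np.array([0.5, -0.2]), weights, biases)[-1]
```

Because every activation is a sigmoid, the final output lands in (0, 1).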
5
Q

What is the NN backward pass step?

A

Using the chain rule, the error at the output is propagated backwards through the network to obtain the gradient of the loss with respect to every weight; these gradients are then used to update the weights.
6
Q

How is the error calculated for a neural network, and how is this used to calculate the gradient?

A

A loss function compares the network's output y_k with the target t_k, e.g. squared error E = 1/2 sum_k (y_k - t_k)^2. For an output-layer weight, dE/dw_kj = delta_k z_j, where (for squared error with linear output units) delta_k = y_k - t_k.
7
Q

How is the gradient calculated for hidden layers (i.e. backpropagation)?

A

A hidden unit's error signal is the weighted sum of the error signals of the units it feeds into, scaled by the derivative of its activation: delta_j = h'(a_j) sum_k w_kj delta_k. The weight gradient is then dE/dw_ji = delta_j z_i.
8
Q

Answer this (neural network gradient):

A
9
Q

What do a_j, z_j and h represent in neural networks?

A

a_j is the pre-activation of unit j (the weighted sum of its inputs), h is the activation function, and z_j = h(a_j) is the unit's output (its activation), which is passed on to the next layer.
10
Q

What is the typical delta_k for a neural network (the error signal on the last layer)? Then generalise: what is the formula for delta_j?

A

At the output layer (squared error with linear output units), delta_k = y_k - t_k. For a hidden unit, delta_j = h'(a_j) sum_k w_kj delta_k: the activation derivative times the weighted sum of the next layer's deltas.
11
Q

What is the backpropagation formula for delta_j?

A

delta_j = h'(a_j) sum_k w_kj delta_k, where the sum runs over the units k that unit j sends connections to.
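The delta recursion can be sketched for a small all-sigmoid MLP with squared-error loss. The two-layer setup, the shapes, and the use of sigmoid at the output are illustrative assumptions; for sigmoid, h'(a) = z(1 - z), so the deltas can be formed from the stored activations alone.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, weights, biases):
    # Store each layer's output z; zs[0] is the input itself.
    zs = [x]
    for W, b in zip(weights, biases):
        zs.append(sigmoid(W @ zs[-1] + b))
    return zs

def backward(weights, zs, t):
    """Backpropagation for a sigmoid MLP with E = 1/2 * sum((y - t)^2).
    Output layer: delta_k = (y_k - t_k) * h'(a_k).
    Hidden layers: delta_j = h'(a_j) * sum_k w_kj delta_k."""
    grads = [None] * len(weights)
    y = zs[-1]
    delta = (y - t) * y * (1 - y)                 # output deltas
    for l in range(len(weights) - 1, -1, -1):
        grads[l] = np.outer(delta, zs[l])         # dE/dw_ji = delta_j z_i
        if l > 0:
            z = zs[l]
            delta = z * (1 - z) * (weights[l].T @ delta)  # hidden deltas
    return grads

# Tiny 2-3-1 network with random weights (sizes are illustrative).
rng = np.random.default_rng(1)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
biases = [np.zeros(3), np.zeros(1)]
zs = forward(np.array([0.3, -0.7]), weights, biases)
grads = backward(weights, zs, np.array([1.0]))
```

A finite-difference check on any single weight agrees with the analytic gradient, which is a quick way to verify a backprop implementation.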
12
Q

What is the general backpropagation playbook?

A

1. Forward pass: compute and store a_j and z_j for every unit. 2. Compute the output-layer deltas delta_k from the loss. 3. Backward pass: propagate the deltas layer by layer using delta_j = h'(a_j) sum_k w_kj delta_k. 4. Form the weight gradients dE/dw_ji = delta_j z_i. 5. Update the weights (e.g. by gradient descent).
13
Q

How can back propagation be made more efficient

A

Store the error signals (deltas) computed at layers closer to the output, as they are reused when computing the deltas of earlier layers; this avoids recomputing the same chain-rule products.

14
Q

How does gradient descent work?

A

The weights are repeatedly updated in the direction that decreases the loss: w <- w - eta * dE/dw, where eta is the learning rate. In stochastic (mini-batch) gradient descent the gradient is estimated on a subset of the data at each step.
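The update rule can be sketched on a one-dimensional toy loss. The quadratic E(w) = (w - 3)^2, the learning rate and the step count are all illustrative choices, not from the cards.

```python
# Gradient descent on E(w) = (w - 3)^2, whose gradient is 2*(w - 3).
# Repeated steps w <- w - eta * grad(w) converge towards the
# minimiser w = 3 (learning rate and step count are illustrative).
def gradient_descent(grad, w0, eta=0.1, steps=100):
    w = w0
    for _ in range(steps):
        w = w - eta * grad(w)
    return w

w_final = gradient_descent(lambda w: 2 * (w - 3.0), w0=0.0)
```

Each step multiplies the distance to the minimum by (1 - 2 * eta), so with eta = 0.1 the error shrinks geometrically.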
15
Q

What are some methods to reduce overfitting of neural networks

A

Dropout, early stopping and regularisation (e.g. L2 weight decay).

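Of these, dropout is the easiest to sketch in code. The version below uses the common "inverted dropout" formulation (an assumption, since the card does not specify one): each activation is zeroed with probability p during training and the survivors are rescaled by 1/(1 - p) so the expected activation is unchanged, while at test time the layer is the identity.

```python
import numpy as np

def dropout(z, p, rng, training=True):
    """Inverted dropout (a common formulation, shown as a sketch):
    zero each activation with probability p and scale survivors by
    1/(1 - p); do nothing at test time."""
    if not training:
        return z
    mask = rng.random(z.shape) >= p   # keep with probability 1 - p
    return z * mask / (1.0 - p)

rng = np.random.default_rng(0)
z = np.ones(10000)
out = dropout(z, p=0.5, rng=rng)
```

On a vector of ones with p = 0.5, the output contains only 0s and 2s, and its mean stays close to 1.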
16
Q

What is the formula for the sigmoid function?

A

sigma(x) = 1 / (1 + e^(-x)). Its derivative has the convenient form sigma'(x) = sigma(x) * (1 - sigma(x)).
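Computed naively, exp(-x) overflows in floating point for large negative x. A numerically stable implementation (a standard trick, not something the card states) evaluates the algebraically equivalent form e^x / (1 + e^x) on the negative branch, so exp() is never called on a large positive argument:

```python
import numpy as np

def sigmoid(x):
    """Numerically stable sigmoid 1 / (1 + e^(-x)) for float arrays.
    For x >= 0 use 1 / (1 + e^(-x)); for x < 0 use the equivalent
    e^x / (1 + e^x), avoiding overflow in exp()."""
    out = np.empty_like(x, dtype=float)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out
```

Both branches agree where they overlap; the split only changes which exponential is computed.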
17
Q

What is the vanishing and exploding gradient problem?

A

During backpropagation the gradient is a product of many per-layer factors. If these factors are consistently smaller than 1, the gradient shrinks exponentially towards the earlier layers (vanishing); if they are consistently larger than 1, it grows exponentially (exploding). Both make deep networks hard to train.
18
Q

What is gradient clipping and what is it used for?

A

Capping the gradient (typically by rescaling it when its norm exceeds a threshold) before the weight update. It is used to stop exploding gradients from destabilising training.
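Clipping by global norm is one common scheme (an assumption here; clipping by value per element is another): if the L2 norm of all gradients taken together exceeds the threshold, the whole gradient is scaled down, which preserves its direction.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so their combined L2 norm is
    at most max_norm; gradients under the threshold pass unchanged."""
    norm = np.sqrt(sum(float(np.sum(g * g)) for g in grads))
    if norm > max_norm:
        scale = max_norm / norm
        grads = [g * scale for g in grads]
    return grads

# A gradient of norm 5 clipped to norm 1 keeps its direction.
clipped = clip_by_global_norm([np.array([3.0, 4.0])], max_norm=1.0)
```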
19
Q

What are saturating and non-saturating activation functions?

A

A saturating activation (e.g. sigmoid or tanh) flattens out for large |x|, so its derivative approaches zero there, which contributes to vanishing gradients. A non-saturating activation (e.g. ReLU) keeps a non-zero derivative over an unbounded range of inputs.
20
Q

What are residual connections?

A

Skip connections that add a block's input to its output, y = x + F(x), so the block only has to learn a residual correction. They give gradients a direct path to earlier layers, making very deep networks trainable.
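The y = x + F(x) structure is tiny in code. The block below is a minimal sketch with an assumed two-matrix F and ReLU; real residual blocks typically also include normalisation layers.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, W2):
    # y = x + F(x): the transformation F(x) = W2 relu(W1 x) is added
    # to a skip path that carries x through unchanged.
    return x + W2 @ relu(W1 @ x)

# With zero weights, F(x) = 0 and the block is exactly the identity;
# this easy-to-represent identity is what helps very deep stacks train.
x = np.array([1.0, -2.0])
y = residual_block(x, np.zeros((4, 2)), np.zeros((2, 4)))
```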
21
Q

How does early stopping work?

A

Monitor the loss on a held-out validation set during training and stop (or keep the best checkpoint) once the validation loss stops improving, even if the training loss is still falling. This limits overfitting.
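The training loop can be sketched as below. The `patience` mechanism (stop after a fixed number of epochs without improvement) and the fabricated validation curve are illustrative assumptions, not from the card.

```python
def train_with_early_stopping(train_epoch, val_loss, max_epochs=100, patience=5):
    """Early-stopping skeleton: track the best validation loss seen so
    far and stop once `patience` epochs pass without improvement."""
    best, best_epoch, since_best = float("inf"), 0, 0
    for epoch in range(max_epochs):
        train_epoch()            # one pass over the training data
        loss = val_loss()        # loss on the held-out validation set
        if loss < best:
            best, best_epoch, since_best = loss, epoch, 0
        else:
            since_best += 1
            if since_best >= patience:
                break            # validation loss stopped improving
    return best, best_epoch

# Hypothetical validation curve that improves, then starts overfitting.
curve = iter([1.0, 0.8, 0.7, 0.72, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0, 1.05, 1.1])
best, best_epoch = train_with_early_stopping(lambda: None, lambda: next(curve))
```

On this curve the best validation loss (0.7) occurs at epoch 2, and training stops five epochs later.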
22
Q

Does parameter initialisation matter?

A

Yes. Initialising all weights identically (e.g. to zero) makes the units in a layer symmetric, so they receive the same gradients and never differentiate. Weights are therefore initialised to small random values, with schemes such as Xavier/Glorot or He initialisation chosen to keep activation and gradient magnitudes stable across layers.

23
Q

How does weight decay work (and what is the L2 regularisation term)?

A

An L2 penalty (lambda/2) * sum w^2 is added to the loss. Its gradient contributes lambda * w to each weight's update, so every step shrinks ("decays") the weights towards zero: w <- w - eta * (dE/dw + lambda * w).
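The decay effect is visible directly in the update rule. The sketch below applies the update with a zero data gradient (an illustrative setup), so each step reduces to pure decay, w <- (1 - eta * lambda) * w.

```python
import numpy as np

def sgd_step_with_weight_decay(w, grad, eta=0.1, lam=0.01):
    # Gradient step with L2 weight decay: the penalty (lam/2)*||w||^2
    # adds lam*w to the gradient, shrinking the weights each step.
    return w - eta * (grad + lam * w)

# With a zero data gradient the update is pure decay:
# w <- (1 - eta*lam) * w = 0.999 * w per step.
w = np.array([1.0])
for _ in range(10):
    w = sgd_step_with_weight_decay(w, grad=np.array([0.0]))
```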
24
Q

Where are bias values stored in neural networks?

A

In the weight matrix: the input to each layer is augmented with a constant 1 neuron, and the biases are the weights on that unit.