Backprop Flashcards

1
Q: What is backprop?
A:
  • message passing
  • a schedule for computing weight updates (according to gradient descent) in layered networks of neurons of any depth
  • any layer can be seen as an independent processor passing messages forward and backward (see the sketch below)
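A minimal sketch (added here, not part of the card) of the layer-as-independent-processor view, assuming a fully connected sigmoid layer trained by plain gradient descent; the names SigmoidLayer, forward and backward are illustrative, not a reference implementation:

    import numpy as np

    class SigmoidLayer:
        """One layer as an independent processor: it only exchanges messages
        with its neighbours (activations forward, gradients backward)."""

        def __init__(self, n_in, n_out, rng=None):
            rng = rng or np.random.default_rng(0)
            self.W = rng.normal(scale=0.1, size=(n_in, n_out))
            self.b = np.zeros(n_out)

        def forward(self, x):
            # forward message: this layer's activations
            self.x = x
            self.out = 1.0 / (1.0 + np.exp(-(x @ self.W + self.b)))
            return self.out

        def backward(self, grad_out, lr=0.1):
            # backward message: gradient of the loss w.r.t. this layer's input
            grad_pre = grad_out * self.out * (1.0 - self.out)  # through the sigmoid
            grad_in = grad_pre @ self.W.T
            # local gradient-descent update, using only locally available quantities
            self.W -= lr * self.x.T @ grad_pre
            self.b -= lr * grad_pre.sum(axis=0)
            return grad_in

A whole network is then just a list of such processors: call forward along the list, then backward in reverse order.
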
2
Q: Complexity of Backprop
A:

O(|w|), i.e. linear in the number of weights (see the note below)

Large networks are possible: 10^4-10^6 weights

Works on DAGs of continuous units
Loops are acceptable with a time delay
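A short note on why the cost is linear (added here, not from the card): with weight w_{ij} from unit i to unit j, each weight enters one multiply-add in the forward pass and a constant number of operations in the backward pass,

    \text{forward: } a_j = \varphi\Big(\sum_i w_{ij}\, a_i\Big), \qquad
    \text{backward: } \delta_i = \varphi'(\mathrm{net}_i)\sum_j w_{ij}\, \delta_j, \qquad
    \frac{\partial E}{\partial w_{ij}} = \delta_j\, a_i

so one forward-backward sweep costs O(|w|) operations.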

3
Q: Complexity - loading problem
A:

NP-complete: deciding whether a given architecture can be loaded with (i.e. made to exactly fit) a given training set is NP-complete

4
Q: Expressive power of MLP
A:

A network with a single hidden layer can approximate every (continuous) input-output mapping to arbitrary accuracy, provided there are enough units in the hidden layer (see the statement below)
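One standard formalisation (in the spirit of Cybenko 1989 / Hornik 1991; added here for reference, not quoted from the card): for any continuous f on a compact set K ⊂ R^n, any ε > 0, and a suitable sigmoidal activation σ, there exist M, α_i, b_i and w_i such that

    \sup_{x \in K}\; \Big| f(x) - \sum_{i=1}^{M} \alpha_i\, \sigma(w_i^{\top} x + b_i) \Big| < \varepsilon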

5
Q: VC dimension of SLP
A:

n + 1, where n is the number of inputs (see the note below)
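A brief justification (added, not from the card): a threshold unit h(x) = sign(w^T x + b) in R^n can shatter the n+1 points {0, e_1, ..., e_n}; for any target labels y_0, ..., y_n in {-1, +1} pick

    b = \tfrac{1}{2} y_0, \qquad w_i = y_i - b \;\Rightarrow\; \operatorname{sign}(b) = y_0, \quad \operatorname{sign}(w_i + b) = y_i

while no set of n+2 points can be shattered by a hyperplane, so the VC dimension is exactly n + 1.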

6
Q: VC dimension of MLP
A:

2(n+1) · M · (1 + log M)

7
Q: Backprop in 1-2 sentences
A:

The backprop procedure for computing the gradient of an objective function with respect to the weights of a multilayer stack of modules is nothing more than a practical application of the chain rule for derivatives (LeCun, Bengio & Hinton, 2015).

Key insight: the derivative (or gradient) of the objective with respect to the input of a module can be computed by working backwards from the gradient with respect to the output of that module.

This can be applied repeatedly to propagate gradients through all modules, starting from the output at the top (where the network produces its predictions) all the way to the bottom (where the external data is fed in), as sketched below.
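In symbols (my paraphrase of that chain-rule step, not a quote): if module l maps x_l to x_{l+1} = f_l(x_l, w_l), then

    \frac{\partial E}{\partial x_l} = \Big(\frac{\partial x_{l+1}}{\partial x_l}\Big)^{\!\top} \frac{\partial E}{\partial x_{l+1}},
    \qquad
    \frac{\partial E}{\partial w_l} = \Big(\frac{\partial x_{l+1}}{\partial w_l}\Big)^{\!\top} \frac{\partial E}{\partial x_{l+1}}

applied for l = L-1, ..., 0, starting from the gradient at the output, ∂E/∂x_L.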

8
Q: What are hidden layers?
A:

They distort the input in a non-linear way so that the categories become linearly separable by the last layer (see the XOR example below)
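A minimal illustration (added here, not from the card), using the standard hand-built ReLU construction for XOR: the four XOR points are not linearly separable in input space, but after one hidden layer they are, so a single linear output unit suffices.

    import numpy as np

    # XOR: not linearly separable in the input space
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 1, 1, 0])

    # Hand-built hidden layer (ReLU) that "untangles" the classes
    W = np.array([[1.0, 1.0], [1.0, 1.0]])   # hidden weights
    c = np.array([0.0, -1.0])                # hidden biases
    H = np.maximum(0.0, X @ W + c)           # hidden representation: (0,0),(1,0),(1,0),(2,1)

    # In the hidden space a single linear unit separates the classes
    w_out = np.array([1.0, -2.0])
    pred = (H @ w_out > 0.5).astype(int)
    print(pred)                              # [0 1 1 0] -> matches y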
