Backprop Flashcards
What is backprop?
- message passing
- a schedule for computing weight updates (according to gradient descent) in layered networks of neurons of any depth
- any layer can be seen as an independent processor passing messages forward and backward (see the layer sketch below)
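A minimal sketch of the "independent processor" view, assuming a fully connected layer with a tanh non-linearity and plain gradient-descent updates (class and variable names here are illustrative, not from the flashcards):

```python
import numpy as np

class DenseLayer:
    """One fully connected layer acting as an independent processor:
    it passes activations forward and gradients backward."""
    def __init__(self, n_in, n_out, rng):
        self.W = rng.standard_normal((n_in, n_out)) * 0.1
        self.b = np.zeros(n_out)

    def forward(self, x):
        # Forward message: activation handed to the next layer.
        self.x = x
        self.z = x @ self.W + self.b
        return np.tanh(self.z)

    def backward(self, grad_out, lr=0.1):
        # Backward message: gradient w.r.t. this layer's input,
        # computed from the gradient w.r.t. its output (chain rule).
        grad_z = grad_out * (1.0 - np.tanh(self.z) ** 2)
        grad_W = self.x.T @ grad_z      # local weight-update signal
        grad_b = grad_z.sum(axis=0)
        grad_in = grad_z @ self.W.T     # message sent to the previous layer
        self.W -= lr * grad_W           # gradient-descent weight update
        self.b -= lr * grad_b
        return grad_in
```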
Complexity of Backprop
O(|w|), i.e. linear in the number of weights |w| (see the cost sketch below)
Large networks are feasible: 10^4 to 10^6 weights
Works on DAGs of continuous units
Loops acceptable with a time-delay
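A rough counting sketch of the O(|w|) claim, assuming dense layers in which both the forward and the backward pass spend a constant number of multiply-adds per weight (the layer widths below are made up for illustration):

```python
def backprop_cost(layer_sizes):
    """Count multiply-adds for one forward + backward pass through an MLP
    with the given layer widths, e.g. [784, 256, 10]."""
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    forward_ops = weights           # x @ W
    backward_ops = 2 * weights      # grad @ W.T and x.T @ grad
    return weights, forward_ops + backward_ops

print(backprop_cost([784, 256, 10]))   # (203264, 609792): ops grow linearly with |w|
```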
Complexity of the loading problem (training a fixed network architecture to fit a given training set)
NP-complete
Expressive power of MLP
A network with a single hidden layer can approximate any continuous input-output mapping to arbitrary accuracy, provided there are enough units in the hidden layer (see the sketch below)
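A minimal sketch of the family of functions this claim is about: a single hidden layer of sigmoid units whose outputs are combined linearly. The shapes and random weights below are assumptions for illustration only; no training is shown.

```python
import numpy as np

def one_hidden_layer(x, W1, b1, w2, b2):
    """f(x) = w2 . sigmoid(W1 x + b1) + b2: with enough hidden units this
    family can approximate any continuous mapping on a bounded domain."""
    h = 1.0 / (1.0 + np.exp(-(x @ W1 + b1)))   # hidden-layer activations
    return h @ w2 + b2

rng = np.random.default_rng(0)
W1 = rng.standard_normal((1, 50))   # 1 input -> 50 hidden units
b1 = rng.standard_normal(50)
w2 = rng.standard_normal((50, 1))   # 50 hidden units -> 1 output
b2 = np.zeros(1)

x = np.linspace(-3, 3, 100).reshape(-1, 1)
y = one_hidden_layer(x, W1, b1, w2, b2)   # shape (100, 1)
```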
VC dimension of SLP
n + 1 (for a perceptron with n inputs)
VC dimension of MLP
2(n+1) * M * (1 + log M)
Backprop in 1-2 sentences
The backprop procedure to compute the gradient of an objective function wrt weights of a multilayer stack of modules is nothing more than a practical application of the chain rule for derivatives. (LeCun, Bengio, Hinton, 2015).
Key insight: the derivative (or gradient) of the objective wrt the input of a module can be computed by working backwards from the gradient with respect to the output of that module.
Can be applied repeatedly to propagate the gradients through all modules, starting from the output at the top (where the net produces its predictions) all the way to the bottom (where the external data is fed).
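A compact sketch of that repeated application. Assumptions (mine, not from the source): a squared-error objective at the top, and the DenseLayer modules sketched in the first card.

```python
import numpy as np

def forward_backward(layers, x, target):
    # Forward pass: each module transforms the message and hands it on.
    out = x
    for layer in layers:
        out = layer.forward(out)

    # Gradient of the objective w.r.t. the network output (top of the stack).
    grad = 2.0 * (out - target)

    # Backward pass: each module turns "gradient w.r.t. my output" into
    # "gradient w.r.t. my input" and updates its own weights on the way.
    for layer in reversed(layers):
        grad = layer.backward(grad)
    return grad  # gradient w.r.t. the external input data (bottom of the stack)

rng = np.random.default_rng(0)
net = [DenseLayer(2, 8, rng), DenseLayer(8, 1, rng)]   # two-module stack
x = rng.standard_normal((4, 2))
y = rng.standard_normal((4, 1))
grad_wrt_input = forward_backward(net, x, y)           # shape (4, 2)
```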
What are hidden layers?
Distort the input in a non-linear way so that the categories become linearly separable by the last layer
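A minimal, hand-crafted example of that distortion (the weights are chosen by hand, purely for illustration): XOR is not linearly separable in input space, but a two-unit hidden layer maps it to a representation that a final linear unit can separate.

```python
import numpy as np

def step(z):
    return (z > 0).astype(float)   # hard threshold unit

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
xor_labels = np.array([0.0, 1.0, 1.0, 0.0])   # not linearly separable in X

# Hidden layer: unit 1 fires when at least one input is on (OR),
# unit 2 fires only when both inputs are on (AND).
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])
H = step(X @ W1 + b1)   # hidden representation: [[0,0],[1,0],[1,0],[1,1]]

# In hidden space the classes ARE linearly separable: h1 - h2 > 0.5 iff XOR = 1.
w2 = np.array([1.0, -1.0])
b2 = -0.5
pred = step(H @ w2 + b2)
print(pred)              # [0. 1. 1. 0.], matches xor_labels
```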