Handout #8 - Recurrent Neural Networks Flashcards
Explain what can be used to train an RNN
Back-Propagation Through Time (BPTT)
- Unroll the network to expand it into a standard feedforward network and then apply back-propagation as usual (see the sketch below).
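A minimal NumPy sketch of what "unrolling" means, with a toy loss on the final hidden state; the sizes, names and loss are assumptions for illustration, not from the handout:

```python
import numpy as np

T, n_in, n_h = 4, 3, 5                          # sequence length, input size, hidden size
rng = np.random.default_rng(0)
Wx = rng.normal(scale=0.1, size=(n_h, n_in))    # input-to-hidden weights
Wh = rng.normal(scale=0.1, size=(n_h, n_h))     # hidden-to-hidden (recurrent) weights
xs = rng.normal(size=(T, n_in))                 # one toy input sequence
hs = [np.zeros(n_h)]                            # initial hidden state h_0

# Forward pass: "unroll" the recurrence into T feedforward steps.
for t in range(T):
    hs.append(np.tanh(Wx @ xs[t] + Wh @ hs[-1]))

loss = 0.5 * np.sum(hs[-1] ** 2)                # toy loss on the final hidden state

# Backward pass: ordinary back-propagation applied to the unrolled graph.
dWx, dWh = np.zeros_like(Wx), np.zeros_like(Wh)
dh = hs[-1]                                     # dLoss/dh_T for this toy loss
for t in reversed(range(T)):
    dz = dh * (1.0 - hs[t + 1] ** 2)            # back through tanh
    dWx += np.outer(dz, xs[t])                  # gradient w.r.t. the shared weights
    dWh += np.outer(dz, hs[t])
    dh = Wh.T @ dz                              # gradient flows to the previous time step
```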
What’s the problem with BPTT?
The unrolled network can grow very large and might be hard to fit into GPU memory.
The process is sequential -> it can’t be parallelised.
What’s the problem with the Simple RNN layer?
The unrolled RNN can grow very deep -> gradients can vanish (or explode) very quickly (see the numerical sketch below).
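A rough numerical illustration (weight scales and activations are assumed values, not from the handout): back-propagating through many time steps multiplies the gradient by one Jacobian per step, so its norm tends to shrink or grow geometrically.

```python
import numpy as np

rng = np.random.default_rng(0)
n_h, T = 5, 50
for scale in (0.1, 1.0):                        # small recurrent weights vs large ones
    Wh = rng.normal(scale=scale, size=(n_h, n_h))
    grad = np.ones(n_h)                         # stand-in for dLoss/dh_T
    for _ in range(T):
        h = rng.uniform(-0.5, 0.5, size=n_h)    # pretend (unsaturated) hidden activations
        grad = Wh.T @ (grad * (1.0 - h ** 2))   # one BPTT step through tanh and Wh
    print(f"weight scale {scale}: |grad| after {T} steps = {np.linalg.norm(grad):.1e}")
```

With small weights the gradient all but vanishes after 50 steps; with larger weights it explodes.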
What type of data are RNNs used for?
They’re used on sequential data -> any data with a time-series structure (e.g. audio signals, stock market prices, machine translation).
Is an RNN a feedforward network?
No, it’s cyclic: the hidden state is fed back into the network at the next time step.
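A minimal sketch of the recurrence that makes an RNN cyclic (sizes and names are assumptions): the same cell is applied at every step, and its output feeds back in as the next hidden state.

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One step of a simple (Elman-style) RNN cell."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

rng = np.random.default_rng(0)
n_in, n_h, T = 3, 4, 6
Wx = rng.normal(scale=0.3, size=(n_h, n_in))
Wh = rng.normal(scale=0.3, size=(n_h, n_h))
b = np.zeros(n_h)

h = np.zeros(n_h)                       # initial hidden state
for t in range(T):
    x_t = rng.normal(size=n_in)         # toy input at time t
    h = rnn_step(x_t, h, Wx, Wh, b)     # the output is fed back in: this is the cycle
print("final hidden state:", h)
```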
Explain why LSTM is useful
It deals with the exploding and vanishing gradient problem (which arises when unrolling the network).
LSTM has three gates (sketched below): the forget gate, the input gate and the output gate.
- Forget gate: forgets irrelevant information
- Input gate: adds/updates new information
- Output gate: passes the updated information onward
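A compact sketch of one LSTM step showing the three gates named above (the variable names, sizes and single stacked weight matrix are assumptions made for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev; x_t] to the pre-activations of f, i, o and the candidate g."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    n_h = h_prev.shape[0]
    f = sigmoid(z[0 * n_h:1 * n_h])     # forget gate: what to drop from the cell state
    i = sigmoid(z[1 * n_h:2 * n_h])     # input gate: what new information to add
    o = sigmoid(z[2 * n_h:3 * n_h])     # output gate: what to expose as the hidden state
    g = np.tanh(z[3 * n_h:4 * n_h])     # candidate values to write
    c = f * c_prev + i * g              # update the cell state
    h = o * np.tanh(c)                  # pass the (gated) updated information onward
    return h, c

rng = np.random.default_rng(0)
n_in, n_h = 3, 4
W = rng.normal(scale=0.3, size=(4 * n_h, n_h + n_in))
b = np.zeros(4 * n_h)
h, c = np.zeros(n_h), np.zeros(n_h)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
```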
Explain GRU
A simpler alternative to the LSTM -> faster to train.
Instead of a plain linear combination (w1*u1 + w2*u2), the gating mechanism is based on an element-wise multiplication of the two inputs (see the sketch below).
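A sketch of one GRU step (names, sizes and the omitted biases are assumptions): note how the gates z and r act by element-wise multiplication rather than by a plain weighted sum.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x_t, h_prev, Wz, Wr, Wh):
    """One GRU step; each W maps the concatenated [h_prev; x_t] to a hidden-sized vector."""
    z = sigmoid(Wz @ np.concatenate([h_prev, x_t]))           # update gate
    r = sigmoid(Wr @ np.concatenate([h_prev, x_t]))           # reset gate
    h_cand = np.tanh(Wh @ np.concatenate([r * h_prev, x_t]))  # candidate state (r gates h_prev)
    return (1.0 - z) * h_prev + z * h_cand                    # multiplicative blend of old and new

rng = np.random.default_rng(0)
n_in, n_h = 3, 4
Wz, Wr, Wh = (rng.normal(scale=0.3, size=(n_h, n_h + n_in)) for _ in range(3))
h = gru_step(rng.normal(size=n_in), np.zeros(n_h), Wz, Wr, Wh)
print("new hidden state:", h)
```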
What is the critical issue with RNNs?
They aren’t suitable for transfer learning.
Processing can’t be parallelised (each step depends on the previous one).