ANN Lecture 8 - Recurrent Neural Networks Flashcards

1
Q

What is the motivation for a recurrent neural network?

A
  • A feed-forward neural network only processes and returns single, fixed-size entities or objects
  • The length of data sequences can vary
  • Models need some kind of internal memory to process the whole sequence in context
2
Q

Examples of sequential data

A
  • audio data
  • text data
  • temporal data, e.g. stock market prices
  • video data
3
Q

Recurrent neural network

A

An ANN that allows feed-back connections

4
Q

Feed-back connection

A

A connection that connects a layer to itself or even to earlier layers.

5
Q

Vanilla RNN

A
  • An RNN processes sequential data
  • Input, hidden state and output can be high-dimensional
  • The same weights and biases are reused for each input of the sequence
  • For each input the model produces an output
  • In each iteration the hidden state is fed back into the network (memory)
  • There are biases at the hidden and the output layer
6
Q

Vanilla RNN Algorithm

A
  • Initialize weights
  • Initialize biases with zeros
  • Initialize hidden state H0 with zeros
For each element of the sequence, repeat:

newHiddenState = activationFunction(input * WeightsXH + oldHiddenState * WeightsHH + BiasH)

output = outputFunction(newHiddenState * WeightsHY + BiasY)
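
A minimal sketch of this forward pass in NumPy (an assumption; the card gives only pseudocode), with illustrative dimensions, tanh as the activation function and an identity output function. W_xh, W_hh, W_hy, b_h and b_y correspond to WeightsXH, WeightsHH, WeightsHY, BiasH and BiasY above:

import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim = 3, 4, 2   # illustrative sizes

# Initialize weights (small random values) and biases (zeros)
W_xh = rng.normal(scale=0.1, size=(input_dim, hidden_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
W_hy = rng.normal(scale=0.1, size=(hidden_dim, output_dim))
b_h = np.zeros(hidden_dim)
b_y = np.zeros(output_dim)

def rnn_forward(sequence):
    # sequence: array of shape (seq_len, input_dim); returns one output per step
    h = np.zeros(hidden_dim)                     # hidden state H0 initialized with zeros
    outputs = []
    for x in sequence:                           # the same weights are reused at every step
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)   # new hidden state (memory)
        outputs.append(h @ W_hy + b_y)           # output for this input
    return np.stack(outputs), h

outputs, last_h = rnn_forward(rng.normal(size=(5, input_dim)))
print(outputs.shape)   # (5, 2): one output per element of the sequence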

7
Q

Backpropagation through time

A

Unfold the RNN over the timesteps of the input sequence and backpropagate through the resulting unfolded network. The overall loss is the mean over the losses at each timestep.
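
As a formula (notation assumed, not given on the card): with per-timestep losses L_t for a sequence of length T,

L_{total} = \frac{1}{T} \sum_{t=1}^{T} L_t

and the gradients of this overall loss are obtained by backpropagating through the RNN unfolded over all T timesteps.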

8
Q

Truncated backpropagation through time - Motivation and Explanation

A

Motivation:

  • If the input sequence is very long, unfolding the RNN results in a very deep neural network
  • When backpropagating through such a deep network, gradients in the early layers tend to either vanish or explode

TBPTT:

  • Cut the original sequence of length N into (N-n) sub-sequences of length n, each with its own target sub-sequence
  • The first sub-sequence is fed a hidden state initialized with zeros
  • Each following sub-sequence is fed the first hidden state of the previous sub-sequence (see the sketch after this list)
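
A minimal sketch of this sub-sequence handling in NumPy (an assumption; rnn_step and update_from_window are hypothetical placeholders for the forward step and the truncated backward pass):

import numpy as np

def truncated_bptt(sequence, targets, n, rnn_step, update_from_window, hidden_dim):
    # sequence/targets: length-N arrays; n: length of each sub-sequence
    N = len(sequence)
    h_init = np.zeros(hidden_dim)            # first sub-sequence starts from zeros
    for start in range(N - n):               # (N - n) sub-sequences of length n
        window = sequence[start:start + n]
        window_targets = targets[start:start + n]
        h = h_init
        states = []
        for x in window:                     # unfold the RNN only over this window
            h = rnn_step(x, h)
            states.append(h)
        update_from_window(states, window_targets)   # backpropagate within this window only
        h_init = states[0]                   # first hidden state of this window seeds the next one

# Dummy usage showing only the data flow (step and update functions are stand-ins):
seq = np.random.default_rng(0).normal(size=(10, 3))
tgt = np.zeros((10, 1))
truncated_bptt(seq, tgt, n=4, hidden_dim=5,
               rnn_step=lambda x, h: np.tanh(x.sum() + h),
               update_from_window=lambda states, t: None)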