Unit 3 - Recurrent Neural Networks Flashcards
What is a recurrent neural network?
A variation of the feed-forward network
Output from the previous step is fed as input to the current step
In traditional neural networks, inputs and outputs are independent of each other
Lower complexity, since the same parameters are reused at every step
Uses a hidden state that stores information about the previous steps of a sequence
Example of an RNN task:
To predict the next word in a sentence, you need to know the previous words of the sentence
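A minimal NumPy sketch of this recurrence (the weight names, dimensions, and random data are illustrative assumptions, not from the source); note that the same weight matrices are reused at every step, which is where the parameter sharing mentioned above comes from:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One time step: new hidden state from the current input and previous state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Illustrative sizes: 4-dim inputs, 8-dim hidden state
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(4, 8)) * 0.1  # input-to-hidden weights (shared across steps)
W_hh = rng.normal(size=(8, 8)) * 0.1  # hidden-to-hidden weights (shared across steps)
b_h = np.zeros(8)

h = np.zeros(8)                        # initial hidden state
for x_t in rng.normal(size=(5, 4)):    # a sequence of 5 input vectors
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # h carries info about earlier steps
```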
Types of RNN:
One-to-one
One-to-many
Many-to-one
Many-to-many
What is a one-to-one RNN?
Single input and single output
Fixed input and output sizes, so it works like a traditional neural network
Example: image classification
What is a one-to-many RNN?
Produces multiple outputs for a single input
Fixed input size; produces a sequence of outputs
Example: music generation, image captioning
What is a many-to-one RNN?
A single output is produced from multiple input units, i.e. a sequence of them
Example: sentiment analysis
What is a many-to-many RNN with equal unit size?
Used to generate a sequence of outputs from a sequence of input units
In this case, the input and output sequences have the same number of units
Example: named-entity recognition
What is a many-to-many RNN with unequal unit size?
Generates a sequence of outputs from a sequence of input data
In this case, the input and output sequences have different numbers of units
Example: machine translation
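A short PyTorch sketch showing how the many-to-many and many-to-one patterns fall out of the same recurrent layer (the use of torch.nn.RNN and all sizes here are illustrative assumptions):

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
x = torch.randn(1, 5, 4)   # batch of 1, sequence of 5 inputs, 4 features each

outputs, h_n = rnn(x)      # outputs: hidden state at every step; h_n: final state

many_to_many = outputs     # shape (1, 5, 8): one output per input step
many_to_one = h_n[-1]      # shape (1, 8): a single output for the whole sequence
```

One-to-many generation is typically built on the same layer by feeding the single input first and then feeding each produced output back in as the next step's input.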
What is an LSTM?
A variation of the RNN
Critical components: memory cell and gates
Contents of the memory cell are regulated by the forget and input gates
If both gates are closed, the memory contents stay unchanged from one step to the next
Largely avoids the vanishing gradient problem, since the cell state can carry information across many time steps
Three neural networks used in LSTM:
- Forget network - removes information that is no longer needed
- Remember network - adds information from the input to the state, if needed
- Select network - presents a version of the internal state as the output
- State - not a network itself; stores the accumulated information the three networks act on
LSTM Gates:
- Forget gate - the input is combined with the previous output to generate a fraction between 0 and 1, which is then multiplied with the previous state.
- Input gate - controls which new information enters the LSTM state. The output of the input gate (a fraction between 0 and 1) is multiplied with the output of the tanh block and added to the previous state.
- Output gate - controls how much of the current memory state is exposed as the output.
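A minimal NumPy sketch of one LSTM step using the standard gate equations; the weight shapes, the stacking of the four transforms, and the initialization are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W and b hold the four gate transforms stacked together."""
    z = np.concatenate([x_t, h_prev]) @ W + b
    f, i, o, g = np.split(z, 4)
    f = sigmoid(f)           # forget gate: fraction (0 to 1) of old state to keep
    i = sigmoid(i)           # input gate: fraction (0 to 1) of new info to add
    o = sigmoid(o)           # output gate: fraction of the state to expose
    g = np.tanh(g)           # candidate values (the "tanh block")
    c = f * c_prev + i * g   # new memory state
    h = o * np.tanh(c)       # exposed output
    return h, c

# Illustrative sizes: 4-dim input, 8-dim hidden/memory state
rng = np.random.default_rng(0)
W = rng.normal(size=(12, 32)) * 0.1  # (input + hidden) -> 4 gates of size 8
b = np.zeros(32)
h, c = np.zeros(8), np.zeros(8)
for x_t in rng.normal(size=(5, 4)):
    h, c = lstm_step(x_t, h, c, W, b)
```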
What is the encoder-decoder architecture?
Maps an input domain to an output domain via a two-stage network:
- Encoder - compresses the input into a latent-space representation.
- Decoder - predicts the output by reconstructing from the latent representation.
What is a sequence-to-sequence (Seq2Seq) model?
Maps a fixed-length input sequence to a fixed-length output sequence, where the two lengths may differ.
Example: translating English to Chinese, where input and output sentence lengths differ
How does the Seq2Seq model work?
Uses the encoder-decoder architecture
Consists of 3 parts:
1. Encoder - several units of LSTM or GRU cells, where each accepts one element of the input sequence, collects information from it, and forwards it
2. Intermediate (encoder) vector - the final hidden state from the encoder part of the model; encapsulates the information from all the input elements
3. Decoder - like the encoder, has several recurrent units; at each time step, a unit takes the hidden state from the previous step and produces an output and its own hidden state (see the sketch below)
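A minimal PyTorch sketch of these three parts, assuming teacher forcing (the decoder is fed the target sequence during training); the Seq2Seq class name, vocabulary sizes, and dimensions are illustrative assumptions, not from the source:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encoder-decoder: the encoder's final state is the intermediate
    (encoder) vector, which initializes the decoder."""
    def __init__(self, src_vocab, tgt_vocab, emb=32, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encoder consumes the whole source sequence; keep only its final state
        _, state = self.encoder(self.src_emb(src_ids))
        # Decoder starts from that state and produces one output per target step
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
        return self.out(dec_out)  # scores over the target vocabulary

model = Seq2Seq(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (1, 7))   # source sentence: 7 tokens
tgt = torch.randint(0, 1200, (1, 5))   # target sentence: 5 tokens (lengths differ)
logits = model(src, tgt)               # shape (1, 5, 1200)
```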
Applications of the Seq2Seq model:
- Machine translation
- Question answering
- Video captioning
- Speech recognition