Seq2Seq Flashcards
What is an encoder-decoder?
An encoder-decoder is a sequence-to-sequence model.
It is built from RNNs, LSTMs, or GRUs.
The encoder processes the input sequence and captures the contextual relationships between the words.
The decoder takes the encoder's final state as a context vector and generates the output sequence.
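A minimal sketch of the idea, assuming PyTorch; the class names (Encoder, Decoder) and layer sizes are illustrative, not taken from the flashcards.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                  # src: (batch, src_len) token ids
        _, hidden = self.rnn(self.embed(src))
        return hidden                        # final hidden state = context vector

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tgt, context):         # tgt: (batch, tgt_len) token ids
        output, _ = self.rnn(self.embed(tgt), context)
        return self.out(output)              # logits over the target vocabulary
```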
What was the problem with the Encoder-Decoder?
It could not retain information over long sequences (roughly more than 30 tokens), because the entire input had to be compressed into a single fixed-length context vector.
Timeline of Seq2Seq models
2014 - Encoder-Decoder - Sequence to Sequence Learning with Neural Networks
2015 - Attention - Neural Machine Translation by Jointly Learning to Align and Translate
2017 - Transformer - Attention Is All You Need
Teacher forcing
Teacher forcing is a strategy for training RNNs in which the ground-truth token from the previous time step is fed to the model as input, instead of the model's own output from that step.
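A minimal training-step sketch of teacher forcing, assuming PyTorch and the illustrative Encoder/Decoder classes from the sketch above; src and tgt are assumed to be batches of token indices.

```python
import torch.nn.functional as F

def train_step(encoder, decoder, optimizer, src, tgt):
    optimizer.zero_grad()
    context = encoder(src)
    # Teacher forcing: feed the ground-truth tokens tgt[:, :-1] as decoder
    # input instead of the decoder's own predictions from earlier steps.
    logits = decoder(tgt[:, :-1], context)
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # (batch * steps, vocab)
        tgt[:, 1:].reshape(-1),               # target = the next token
    )
    loss.backward()
    optimizer.step()
    return loss.item()
```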
Improvements in the Encoder-Decoder
- Using more advanced embedding techniques
- Deep (stacked) LSTMs
- Reversing the input sequence
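A tiny sketch of the input-reversal trick, assuming PyTorch; the token ids are made up for illustration.

```python
import torch

# Toy batch of source token ids; reversing the source shortens the distance
# between the first source words and the first target words the decoder emits.
src = torch.tensor([[11, 12, 13, 14]])
reversed_src = torch.flip(src, dims=[1])   # tensor([[14, 13, 12, 11]])
```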
Summarize the Encoder-Decoder research paper (Sequence to Sequence Learning with Neural Networks, 2014).
- Task - machine translation, English to French
- Dataset - 12M sentences; 300M+ words in each language
- Reversing the input sentences improved results
- Embedding - 1,000 dimensional
- Deep LSTM - 4 layers
- Softmax output
- BLEU score - 34.81
Time distributed FCN
The same fully connected layer (the same weights) is applied to the input at each time step t, t+1, t+2, and so on.
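A minimal sketch of a time-distributed fully connected layer, assuming PyTorch: nn.Linear acts on the last dimension, so the same weights are reused at every time step of a (batch, time, features) tensor. The layer sizes are illustrative.

```python
import torch
import torch.nn as nn

fc = nn.Linear(512, 10_000)                # hidden size -> vocab size (illustrative)
rnn_outputs = torch.randn(32, 20, 512)     # (batch=32, time=20, features=512)
logits = fc(rnn_outputs)                   # (32, 20, 10000): same weights at t, t+1, ...
```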
Bahdanau attention
Also known as additive attention.
Uses a small feed-forward alignment model to score how well each encoder hidden state matches the current decoder state.
Luong attention
Also known as multiplicative (dot-product) attention; the score is a dot product between the decoder state and each encoder hidden state.
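A minimal sketch contrasting the two score functions, assuming PyTorch; the tensor names (decoder_state, encoder_outputs) and sizes are illustrative.

```python
import torch
import torch.nn as nn

hid = 512
decoder_state = torch.randn(1, hid)            # s_{t-1}: current decoder state
encoder_outputs = torch.randn(10, hid)         # h_1..h_10: one vector per source token

# Bahdanau / additive: score = v^T tanh(W_s s + W_h h)
W_s, W_h, v = nn.Linear(hid, hid), nn.Linear(hid, hid), nn.Linear(hid, 1)
additive_scores = v(torch.tanh(W_s(decoder_state) + W_h(encoder_outputs))).squeeze(-1)

# Luong / multiplicative: score = s . h (dot product)
dot_scores = encoder_outputs @ decoder_state.squeeze(0)

# Either way, a softmax over the scores gives the attention (alignment) weights.
weights = torch.softmax(additive_scores, dim=0)   # shape (10,)
context = weights @ encoder_outputs               # weighted sum of encoder states
```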
What is self attention?
Self-attention is a mechanism that takes static embeddings as input and generates contextual embeddings, in which each token's representation is informed by every other token in the sequence.
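A minimal sketch of scaled dot-product self-attention, assuming PyTorch; the projection sizes and sequence length are illustrative.

```python
import math
import torch
import torch.nn as nn

d_model = 64
x = torch.randn(1, 5, d_model)                 # static embeddings: (batch, seq_len, d_model)

W_q, W_k, W_v = (nn.Linear(d_model, d_model) for _ in range(3))
Q, K, V = W_q(x), W_k(x), W_v(x)

scores = Q @ K.transpose(-2, -1) / math.sqrt(d_model)   # every token attends to every token
weights = torch.softmax(scores, dim=-1)                  # (1, 5, 5) attention matrix
contextual = weights @ V                                 # contextual embeddings: (1, 5, 64)
```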
Alignment score
The score produced by the alignment model, measuring how well the input around position j matches the output at position i; a softmax over the scores gives the attention weights.
What was the problem with the attention mechanism?
Although attention captured long-term dependencies, the words were still processed sequentially, one time step at a time, so training remained computationally expensive and hard to parallelize.
Why is language modelling preferred as a pre-training task?
- Rich feature learning
- Unsupervised (self-supervised) task - no manual labels are needed, since the next token in raw text serves as the target
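A minimal sketch of the language-modelling objective (next-token prediction), assuming PyTorch; the tiny embedding-plus-linear model and random token ids are only stand-ins for a real LM and a real corpus.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, d_model = 100, 32
model = nn.Sequential(nn.Embedding(vocab, d_model),
                      nn.Linear(d_model, vocab))   # stand-in for a real language model

tokens = torch.randint(0, vocab, (1, 8))           # raw text as token ids (no labels needed)
logits = model(tokens[:, :-1])                     # predict from the prefix
loss = F.cross_entropy(logits.reshape(-1, vocab),
                       tokens[:, 1:].reshape(-1))  # target = the next token
```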
How is ChatGPT trained?
Reinforcement Learning from Human Feedback (RLHF) - human annotators rank the model's responses, a reward model is trained on those rankings, and the policy is then fine-tuned with reinforcement learning against that reward model.