RNN & Attention Flashcards
Neural Nets
What is a Feedforward Neural Net?
n features -> W matrix (dxn) -> non linear func -> d hidden units -> V matrix (Cxd) -> softmax -> C probabilities
Representing text by summing/averaging word embeddings
Neural Nets
Word Embeddings in NNs
Similar input words get similar vectors
Neural Nets
W Matrix in NNs
Similar output words get similar words in softmax matrix
Neural Nets
Hidden States in NNs
Similar contexts get similar hidden states
Neural Nets
What problems ARE handled by NNs? (vs count-based LMs)
- Can share strength among similar words and contexts
- Can condition on context with intervening (interconnected) words
Neural Nets
What problems ARE NOT handled by NNs?
Cannot handle long-distance dependencies
Recurrent Neural Networks
Sequential Data Examples
- Words in sentences
- Characters in words
- Sentences in sample
Recurrent Neural Networks
Long Distance Dependencies Examples
- Agreement in number, gender
- Selectional preference (determine meaning of rain/reign using clouds/queen)
Recurrent Neural Networks
Recurrent Neural Networks
- Retains information from previous inputs due to the use of hidden states
- Designed to process sequential data, where the order of the data matters (FFNN treats inputs independently)
- At each step, RNN takes in current input with the hidden state from the previous step and updates the hidden state (“remembers”)
Recurrent Neural Networks
Unrolling in Time
- RNN updates hidden vector upon each input,
- Unrolling means breaking down “cell” into multiple copies
- Make it easier to see how network processes sequence step-by-step
Recurrent Neural Networks
What can RNNs do?
- Sentence classification
- Conditional generation
- Language modeling
- POS tagging
Recurrent Neural Networks
Sentence Classification
Read whole sentence and represent it
Recurrent Neural Networks
Conditional Generation
Use sentence representation to make prediction
Recurrent Neural Networks
Language Modeling
Read context up until a point and represent the context
Recurrent Neural Networks
POS Tagging
Use sentence and context representation to determine the POS of a word