ANN Lecture 10 - LSTMs Flashcards
1
Q
Motivation of LSTMs
A
- The vanilla RNN can only capture short-term dependencies but is not able to capture longer-term dependencies.
- If you unfold it over time, you get a deep network in which it is hard to backpropagate the errors for those long-term dependencies (vanishing gradients).
- LSTMs are able to capture long-term dependencies
2
Q
LSTM - Cell State
A
- The cell state is the major enhancement of the model, representing the “long-term memory”
- The cell state is only touched by two element-wise operations (a multiplication and an addition), which makes it easy for information to flow through unchanged
- Two gates regulate which information is added or removed from the cell state
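- In standard LSTM notation (a sketch; $f_t$, $i_t$ and $\tilde{c}_t$ are the gate and candidate values defined in the cards below), these two operations are: $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$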
3
Q
LSTM - Gates
A
- Gates consist of a fully-connected feed-forward layer with logistic activation function + a pointwise multiplication operation
- Sigmoid activation lies between 0 and 1:
  - Activation close to 0: block the information
  - Activation close to 1: let the information pass
- LSTM has 3 gates:
  - Forget gate (cell state)
  - Input gate (cell state)
  - Output gate (hidden state)
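- As a generic formula (standard notation, a sketch; $x_t$ = current input, $h_{t-1}$ = previous hidden state, $v$ = the value being regulated): $g = \sigma(W\,[h_{t-1}, x_t] + b)$, applied by pointwise multiplication $g \odot v$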
4
Q
LSTM - Forget Gate
A
- The forget gate takes as input the current input and the previous hidden state
- It regulates which information of the cell state is removed
- Sigmoid activation
- Parameters: Weights and Bias
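- As a formula (standard notation, a sketch): $f_t = \sigma(W_f\,[h_{t-1}, x_t] + b_f)$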
5
Q
LSTM - New Candidate for Cell State
A
- Create a new candidate for the cell state update from the input and the previous hidden state
- Tanh activation
- Parameters: Weights and Bias
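- As a formula (standard notation, a sketch): $\tilde{c}_t = \tanh(W_c\,[h_{t-1}, x_t] + b_c)$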
6
Q
LSTM - Input Gate
A
- The input gate takes as input the current input and the previous hidden state
- The input gate filters the new candidate before the cell state is updated
- Sigmoid activation
- Parameters: Weights and Bias
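- As a formula (standard notation, a sketch): $i_t = \sigma(W_i\,[h_{t-1}, x_t] + b_i)$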
7
Q
LSTM - Update of Cell State
A
- Multiply the cell state with the forget gate to remove information
- Add the filtered new candidate to the cell state to add new information
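- As a formula (standard notation, a sketch): $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$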
8
Q
LSTM - New Candidate for Hidden State
A
- Create a candidate for the hidden state from the cell state
- Tanh activation
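- As a formula (standard notation, a sketch): the candidate is $\tanh(c_t)$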
9
Q
LSTM - Output Gate
A
- The output gate takes as input the current input and the previous hidden state
- The output gate regulates which information of the new candidate for the hidden state can pass
- Sigmoid activation
- Parameters: Weights and Bias
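- As a formula (standard notation, a sketch): $o_t = \sigma(W_o\,[h_{t-1}, x_t] + b_o)$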
10
Q
LSTM - Update of the Hidden State
A
- Update the hidden state by filtering the candidate for the hidden state
- Multiply the output gate with the new candidate for the hidden state
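- As a formula (standard notation, a sketch): $h_t = o_t \odot \tanh(c_t)$

Putting the cards together, a minimal NumPy sketch of one LSTM time step, assuming the standard formulation above; the parameter names (Wf, bf, Wi, bi, Wc, bc, Wo, bo) and the helper lstm_step are illustrative, not from the lecture:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, Wf, bf, Wi, bi, Wc, bc, Wo, bo):
    """One LSTM time step (sketch); weight/bias names are illustrative."""
    z = np.concatenate([h_prev, x_t])   # input and previous hidden state

    f = sigmoid(Wf @ z + bf)            # forget gate
    i = sigmoid(Wi @ z + bi)            # input gate
    c_tilde = np.tanh(Wc @ z + bc)      # new candidate for the cell state

    c = f * c_prev + i * c_tilde        # update of the cell state

    o = sigmoid(Wo @ z + bo)            # output gate
    h = o * np.tanh(c)                  # update of the hidden state
    return h, c
```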