Long Short-Term Memory (LSTM) Flashcards
Long Short-Term Memory (LSTM)
Long Short-Term Memory (LSTM) is a type of Recurrent Neural Network (RNN) and is used to process entire sequences of data.
- Introduction
LSTM is an artificial recurrent neural network (RNN) architecture used in the field of deep learning. It was developed to deal with the exploding and vanishing gradient problems that can be encountered when training traditional RNNs.
- Memory Cells
Key to the LSTM’s ability to effectively learn from sequence data is the memory cell which maintains its state over time, and a series of gates that control when the cell updates its state and when it outputs its value.
- Gates
LSTMs have three types of gates input gate (decides how much information from current input should be stored in the cell state), forget gate (decides how much information should be discarded from the cell state), and output gate (decides what the next hidden state should be).
- Handling Long Sequences
LSTMs are explicitly designed to avoid the long-term dependency problem. They can remember information for long periods of time, which is an advantage when dealing with sequences and lists.
- Backpropagation Through Time (BPTT)
Like other RNNs, LSTMs use BPTT for training, but they also have a unique way of connecting old information to the present task, which helps in learning from sequences of data.
- Use Cases
LSTMs are used on a variety of time series data applications including univariate time series forecasting, multivariate time series forecasting, and even natural language processing tasks.
- Variations
There are several variations of LSTM like Bi-directional LSTM, Peephole LSTM, and Gated Recurrent Units (GRU). Each has its specific structure and use-cases, but all aim to capture the temporal dependencies of different types and lengths in sequence data.
- Challenges
Despite their success, LSTMs can be difficult and time-consuming to train, and they require a significant amount of computational resources. They may also struggle with tasks that require more complex forms of memory, such as reasoning over facts or constructing complex plans.