RNN MLM Flashcards
Recurrent Neural Networks (RNNs)
Recurrent Neural Networks (RNNs) are a flexible class of neural networks well suited to modeling sequential data such as time series or natural language.
- Introduction
RNNs are neural networks designed to work with sequential data by passing information from one step in the sequence to the next. They achieve this through recurrent connections (loops) that carry a hidden state across steps.
- Structure
At each step, an RNN takes an input vector (which can represent a word, a patch of an image, a time series data point, etc.) and a hidden state vector, which summarizes the inputs seen so far. It combines the two to produce a new hidden state, which is used to compute the output at that step and is also passed on to the next step.
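A minimal sketch of this recurrence in NumPy; the sizes, weight names, and initialization below are illustrative assumptions, not something prescribed by the card above:

```python
import numpy as np

# Illustrative sizes; chosen arbitrarily for the sketch.
input_size, hidden_size = 4, 8
rng = np.random.default_rng(0)

# Shared parameters: input-to-hidden weights, hidden-to-hidden weights, bias.
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """One recurrence step: combine the current input with the previous hidden state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Process a sequence of 5 input vectors, carrying the hidden state forward.
h = np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):
    h = rnn_step(x_t, h)
```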
- Sequential Processing
One of the key features of RNNs is their ability to process inputs of varying lengths while sharing the same parameters across every time step. Because the same weights are applied at each position, patterns learned in one part of a sequence transfer to others, and the model size does not grow with sequence length.
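Reusing the `rnn_step` sketch from the previous card, the same parameter set handles sequences of any length; only the number of loop iterations changes:

```python
# One parameter set, three different sequence lengths.
for length in (3, 7, 12):
    h = np.zeros(hidden_size)
    for x_t in rng.normal(size=(length, input_size)):
        h = rnn_step(x_t, h)  # identical weights applied at every step
```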
- Backpropagation Through Time (BPTT)
Training RNNs uses a technique called backpropagation through time: regular backpropagation applied to the unrolled computational graph of the RNN. Because the weights are shared across time steps, each weight's gradient sums contributions from every step, so errors at later steps in the sequence influence the updates that shape earlier steps.
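A short sketch of BPTT using PyTorch's autograd, which differentiates through the unrolled sequence automatically; the sizes and toy loss are assumptions made for illustration:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
x = torch.randn(1, 10, 4)            # (batch, time, features)
out, h_n = rnn(x)                    # out holds the hidden state at every step
loss = out[:, -1].pow(2).sum()       # toy loss on the final time step
loss.backward()                      # gradients flow back through all 10 steps
print(rnn.weight_hh_l0.grad.norm())  # the shared weights accumulate gradient
                                     # contributions from every unrolled step
```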
- Challenges
Despite their flexibility and power, RNNs have some notable challenges. The best known is the vanishing gradient problem: because gradients are multiplied through the recurrence at every step, they can shrink exponentially with sequence length, making it hard for RNNs to learn long-term dependencies. Another issue is that training can be slow, since each step's computation depends on the previous one and cannot be parallelized across time.
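A toy demonstration of the vanishing gradient; the scalar recurrence and the weight value here are arbitrary assumptions, chosen only to show the gradient reaching the earliest input shrinking as the sequence grows:

```python
import torch

w = torch.tensor(0.5)  # a small recurrent weight
for T in (5, 20, 50):
    x = torch.ones(T, requires_grad=True)
    h = torch.tensor(0.0)
    for t in range(T):
        h = torch.tanh(w * h + x[t])  # scalar analogue of the RNN recurrence
    h.backward()
    print(T, x.grad[0].item())        # gradient on the first input: tiny for large T
```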
- Variations
There are several variants of RNNs designed to combat these challenges; Long Short-Term Memory units (LSTMs) and Gated Recurrent Units (GRUs) are the two most popular. They introduce multiplicative gates and, in the LSTM's case, a separate cell-state pathway, which give gradients a more direct route across time and help model the long-range dependencies that vanilla RNNs struggle with.
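A minimal LSTM-cell sketch in NumPy showing the gating; the stacked-weight layout, names, and sizes are assumptions for illustration, not a library API:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """W: (4*hidden, input+hidden), b: (4*hidden,) - all four gates stacked."""
    z = W @ np.concatenate([x_t, h_prev]) + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # forget/input/output gates in (0, 1)
    c = f * c_prev + i * np.tanh(g)               # gated update of the cell state
    h = o * np.tanh(c)                            # gated exposure of the cell state
    return h, c

# Run the cell over a short sequence with illustrative sizes.
hidden, inp = 8, 4
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(4 * hidden, inp + hidden))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(6, inp)):
    h, c = lstm_step(x_t, h, c, W, b)
```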
- Applications
RNNs have a wide range of applications including language modeling (such as generating text), machine translation, speech recognition, image captioning, and even music generation.
- Strengths and Limitations
RNNs are powerful, flexible, and a natural fit for many sequence tasks. However, their training can be slow and difficult, especially for longer sequences, and they may be outperformed by models such as transformers on many of those tasks.