Block 3: Recurrent Neural Networks (RNNs) Flashcards
What is a key feature that distinguishes RNNs from other neural networks?
A key feature of RNNs is their ability to maintain a ‘memory’ of previous inputs in the network’s internal state.
How do RNNs handle the temporal dependencies in data?
RNNs handle temporal dependencies by carrying their internal state (memory) from one time step to the next, so each step's output can depend on inputs seen earlier in the sequence.
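A minimal sketch of this recurrence in NumPy (the sizes, weight names, and initialization are made up for illustration): the hidden state h is updated at every step, so its value at step t depends on every earlier input.

```python
import numpy as np

rng = np.random.default_rng(0)
Wx = rng.normal(scale=0.1, size=(4, 3))  # input-to-hidden weights (sizes assumed)
Wh = rng.normal(scale=0.1, size=(4, 4))  # hidden-to-hidden (recurrent) weights
b = np.zeros(4)

h = np.zeros(4)                          # initial hidden state (the "memory")
for x_t in rng.normal(size=(5, 3)):      # a toy sequence of 5 input vectors
    h = np.tanh(Wx @ x_t + Wh @ h + b)   # h_t depends on x_t and h_{t-1}
print(h)                                 # final state summarizes the whole sequence
```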
What are some common applications of RNNs?
Common applications include speech recognition, language translation, and time series prediction.
Describe a challenge often encountered with RNNs.
A common challenge with RNNs is the vanishing gradient problem, which makes it difficult to learn dependencies that span many time steps.
Explain the vanishing gradient problem in RNNs.
The vanishing gradient problem occurs when gradients shrink as they backpropagate through time, making it hard to learn long-range dependencies.
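A toy illustration of the effect (NumPy; the weight scale, sizes, and stand-in hidden states are assumptions): backpropagation through time multiplies the error signal by the recurrent Jacobian once per step, and with tanh activations and modest recurrent weights that product decays quickly.

```python
import numpy as np

rng = np.random.default_rng(0)
Wh = rng.normal(scale=0.2, size=(4, 4))   # recurrent weights (assumed scale)
grad = np.ones(4)                         # error signal at the final step

for t in range(1, 21):
    h = np.tanh(rng.normal(size=4))       # stand-in hidden state at this step
    grad = Wh.T @ ((1 - h**2) * grad)     # one step of backprop through time
    if t % 5 == 0:
        print(t, np.linalg.norm(grad))    # norm shrinks the further back we go
```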
What are the advances made to overcome the limitations of traditional RNNs?
Advances include the LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) architectures, whose gating mechanisms let the network decide what to retain or discard, capturing long-term dependencies more reliably.
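As a sketch of the idea, here is one common single-step GRU formulation in NumPy (parameter names, sizes, and initialization are assumptions, and biases are omitted for brevity): the update gate lets the cell pass its old state through nearly unchanged, which keeps gradients flowing across long spans.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, p):
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev)             # update gate
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev)             # reset gate
    h_cand = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_cand                      # blend old and new

rng = np.random.default_rng(0)
d_in, d_h = 3, 4
p = {k: rng.normal(scale=0.1, size=(d_h, d_in if k.startswith("W") else d_h))
     for k in ("Wz", "Wr", "Wh", "Uz", "Ur", "Uh")}
print(gru_step(rng.normal(size=d_in), np.zeros(d_h), p))
```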
How do RNNs differ in processing data compared to CNNs?
Unlike CNNs, which are ideal for spatial data, RNNs are designed for sequential data, processing inputs over time and maintaining a memory of past information.
What is the basic building block of an RNN?
The basic building block of an RNN is a recurrent unit or cell that processes one input at a time while maintaining information about previous inputs through its internal state.
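In code, such a cell is just a function from (current input, previous state) to the next state; a minimal sketch (NumPy, names assumed):

```python
import numpy as np

def rnn_cell(x_t, h_prev, Wx, Wh, b):
    """One recurrent step: fold the current input into the running state."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)
```

The full network is simply this cell applied repeatedly, feeding each output state back in as the next h_prev.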
Do RNNs share weights across different time steps?
Yes. In RNNs, the same weights are applied at every time step, so the parameter count does not grow with sequence length and a temporal pattern learned at one position transfers to every other position.
How do RNNs handle variable-length input sequences?
RNNs can handle variable-length inputs by processing one element of the sequence at a time until the entire sequence is consumed. This makes them flexible for different lengths of input data.
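A small sketch of this flexibility (NumPy; sizes assumed): because the same weights are reused at every step (see the previous card), the loop simply runs for however many steps the input provides, and a 3-step and an 8-step sequence both yield a fixed-size state.

```python
import numpy as np

rng = np.random.default_rng(0)
Wx = rng.normal(scale=0.1, size=(4, 3))
Wh = rng.normal(scale=0.1, size=(4, 4))
b = np.zeros(4)

def encode(seq):
    """Run the same cell over any number of steps; the weights never change."""
    h = np.zeros(4)
    for x_t in seq:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
    return h  # fixed-size summary regardless of sequence length

print(encode(rng.normal(size=(3, 3))))  # short sequence
print(encode(rng.normal(size=(8, 3))))  # longer sequence, same weights
```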
What is the significance of the hidden state in an RNN?
The hidden state in an RNN acts as a form of memory. It captures information about previous inputs, allowing the network to make informed predictions based on past data.