RNN Flashcards
What are Recurrent Neural Networks?
RNNs are neural networks designed for processing sequential data.
Application areas for RNNs?
RNNs are widely used in NLP, speech recognition, time series prediction, and other tasks where the input or output data has a temporal relationship.
How does an RNN work?
An RNN can be unfolded over time: at each time step, the recurrent unit takes an input together with the hidden state from the previous time step, processes them, and produces an output and a new hidden state.
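A minimal sketch of this unrolling in Python/NumPy (the layer sizes, weight names, and random data are illustrative assumptions, not taken from any particular library):

```python
import numpy as np

# Minimal vanilla RNN forward pass; all sizes are made-up illustration values.
input_size, hidden_size, output_size = 4, 8, 3
rng = np.random.default_rng(0)

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (recurrence)
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))  # hidden -> output
b_h = np.zeros(hidden_size)
b_y = np.zeros(output_size)

def rnn_forward(inputs):
    """Unfold the RNN over time: each step consumes x_t and the previous hidden state."""
    h = np.zeros(hidden_size)                        # initial hidden state
    outputs = []
    for x_t in inputs:                               # one time step per sequence element
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)     # new hidden state
        outputs.append(W_hy @ h + b_y)               # output at this time step
    return outputs, h

sequence = [rng.normal(size=input_size) for _ in range(5)]
outputs, final_h = rnn_forward(sequence)
print(len(outputs), final_h.shape)                   # 5 outputs, final hidden state of size 8
```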
Advantages of RNN
1) Ability to Process Sequential Data
2) Memory over time
3) Flexibility in Input and Output Length
4) A summary of the sequence so far is encoded in the hidden state and available as context
Disadvantages of RNN
1) Difficulty in Capturing Long-Term Dependencies due to the vanishing gradient problem
2) Computational Complexity for long sequences and large-scale datasets, as each time step must be processed sequentially
3) Vanishing and Exploding Gradients lead to instability in optimization
4) Fixed-Length Representations may restrict the ability to capture long-term dependencies in sequences
What is vanishing gradient problem?
The vanishing gradient problem occurs when the gradients of the loss function with respect to the parameters become very small as they propagate backward through the layers of the network during training.
When gradients become very small, the network parameters are updated slowly or not at all, leading to slow convergence or stagnation in learning.
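A toy illustration of the mechanism (the numbers are assumed): backpropagation through time multiplies the gradient by the recurrent Jacobian once per step, so a factor below 1 shrinks it exponentially.

```python
import numpy as np

# Toy demonstration of the vanishing gradient effect; values are illustrative.
T = 50                               # sequence length
jacobian = 0.5 * np.eye(8)           # recurrent Jacobian with norm < 1

grad = np.ones(8)                    # gradient arriving at the last time step
for _ in range(T):
    grad = jacobian.T @ grad         # one step of backpropagation through time

print(np.linalg.norm(grad))          # ~ 0.5**50 * sqrt(8): effectively zero
```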
What is exploding gradient and what might cause it?
The opposite of the vanishing gradient problem: gradients grow very large as they propagate backward through the network.
This can happen due to exploding activations, unstable weight initialization, or high learning rates.
What happens as a result of exploding gradients?
Drastic updates to model parameters, leading to oscillation and non-convergence.
Numerical overflow (gradients become inf or NaN).
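A toy sketch of the overflow case (the amplification factor of 3.0 is an assumed value):

```python
import numpy as np

# Repeated amplification by a recurrent weight drives values to inf in float32.
h = np.ones(4, dtype=np.float32)
W = np.float32(3.0) * np.eye(4, dtype=np.float32)   # amplifying recurrent weight

for t in range(300):
    h = W @ h                                        # values grow as 3**t
    if not np.all(np.isfinite(h)):
        print(f"overflow to inf at step {t}")        # happens around step 80
        break
```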
Mitigation strategies for Vanishing/Exploding Gradients?
Batch Normalisation, Proper Initialization, Reducing the learning rate, Changing the architecture (e.g. to LSTM), Gradient clipping (for exploding gradients), Reducing model complexity
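A minimal sketch of one of these mitigations, gradient-norm clipping (the threshold of 5.0 is an arbitrary assumed value):

```python
import numpy as np

def clip_gradient(grad, max_norm=5.0):
    """Rescale the gradient so its L2 norm never exceeds max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([30.0, 40.0])            # norm = 50, far above the threshold
print(clip_gradient(g))               # rescaled to norm 5 -> [3. 4.]
```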
What are bidirectional RNNs?
Bi-RNNs consist of two RNNs, one processing the input sequence forward in time and the other backward, so the representation at each position combines past and future context.
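A minimal usage sketch, assuming PyTorch as the framework (the sizes are illustrative): with bidirectional=True the layer runs one RNN per direction and concatenates their hidden states at each time step.

```python
import torch
import torch.nn as nn

# Bidirectional RNN: forward and backward passes over the same sequence.
rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True, bidirectional=True)

x = torch.randn(2, 10, 4)             # (batch, sequence length, input features)
output, h_n = rnn(x)

print(output.shape)                   # (2, 10, 16): forward + backward states concatenated
print(h_n.shape)                      # (2, 2, 8): final hidden state for each direction
```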
What is LSTM?
Long Short-Term Memory.
Why was LSTM introduced?
To address the vanishing gradient problem and to capture long-range dependencies.
What is the idea of LSTM?
It addresses the vanishing gradient problem by introducing a memory cell with several gating mechanisms that control the flow of information.
What are the components of an LSTM unit?
Cell State and Hidden state
Gates:
Input Gate: controls the flow of information into the memory cell.
Forget Gate: controls which information is discarded from the memory cell.
Output Gate: controls the flow of information out of the LSTM and into the output.
How do the gates work?
The input gate decides which information to store in the memory cell. It is trained to open when the input is important and close when it is not.
The forget gate decides which information to discard from the memory cell. It is trained to open when the information is no longer important and close when it is.
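A sketch of a single LSTM step with the gates written out explicitly (the weight names and shapes are assumed for illustration, not taken from a specific library):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b hold parameters for gates i, f, o and candidate g."""
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])   # input gate: what to write
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])   # forget gate: what to keep or discard
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])   # output gate: what to expose
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])   # candidate cell content

    c = f * c_prev + i * g          # cell state: forget part of the old, add part of the new
    h = o * np.tanh(c)              # hidden state: gated view of the cell state
    return h, c

# Tiny usage example with random parameters (sizes are arbitrary).
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W = {k: rng.normal(scale=0.1, size=(n_hid, n_in)) for k in "ifog"}
U = {k: rng.normal(scale=0.1, size=(n_hid, n_hid)) for k in "ifog"}
b = {k: np.zeros(n_hid) for k in "ifog"}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
print(h.shape, c.shape)             # (8,) (8,)
```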