RNN Flashcards

1
Q

What are Recurrent Neural Networks?

A

RNNs are neural networks designed for processing sequential data.

2
Q

Application areas for RNNs?

A

RNNs are widely used in NLP, speech recognition, time series prediction, and other tasks where the input or output data has a temporal relationship.

3
Q

How does an RNN work?

A

An RNN can be unfolded over time: at each time step, the recurrent unit takes an input along with the hidden state from the previous time step, processes them, and produces an output and a new hidden state.
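A minimal sketch of this unfolding in NumPy (the weight names, sizes, and tanh activation are illustrative assumptions, not prescribed by the card):

    import numpy as np

    rng = np.random.default_rng(0)
    input_size, hidden_size, output_size = 4, 8, 3

    # The same weights are reused at every time step
    W_xh = rng.normal(size=(hidden_size, input_size)) * 0.1   # input -> hidden
    W_hh = rng.normal(size=(hidden_size, hidden_size)) * 0.1  # hidden -> hidden
    W_hy = rng.normal(size=(output_size, hidden_size)) * 0.1  # hidden -> output

    def rnn_step(x_t, h_prev):
        # Combine the current input with the previous hidden state...
        h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev)
        # ...and produce an output plus the new hidden state.
        return W_hy @ h_t, h_t

    h = np.zeros(hidden_size)              # initial hidden state
    xs = rng.normal(size=(5, input_size))  # a sequence of 5 inputs
    for x_t in xs:                         # the network unfolded over time
        y_t, h = rnn_step(x_t, h)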

4
Q

Advantages of RNNs

A

1) Ability to Process Sequential Data
2) Memory over time
3) Flexibility in Input and Output Length
4) A summary of the sequence so far is encoded in the hidden state as context

5
Q

Disadvantages of RNNs

A

1) Difficulty in capturing long-term dependencies, due to the vanishing gradient problem
2) Computational complexity for long sequences and large-scale datasets, as each time step must be processed sequentially
3) Vanishing and exploding gradients lead to instability in optimization
4) Fixed-length hidden-state representations might restrict the ability to capture long-term dependencies in sequences

6
Q

What is the vanishing gradient problem?

A

The vanishing gradient problem occurs when the gradients of the loss function with respect to the parameters become very small as they propagate backward through the layers of the network during training.
When gradients become very small, the network parameters are updated slowly or not at all, leading to slow convergence or stagnation in learning.
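A toy illustration with assumed numbers: backpropagation through time multiplies the gradient by the recurrent Jacobian once per step, so a Jacobian with norm below 1 shrinks it geometrically:

    import numpy as np

    grad = np.ones(8)            # gradient arriving at the last time step
    J = 0.5 * np.eye(8)          # recurrent Jacobian with norm 0.5 (assumed)
    for _ in range(50):          # backpropagate through 50 time steps
        grad = J.T @ grad
    print(np.linalg.norm(grad))  # ~2.5e-15: the gradient has all but vanished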

7
Q

What is the exploding gradient problem, and what might cause it?

A

The opposite of the vanishing gradient problem: gradients grow exponentially large as they propagate backward through the network.
This can happen due to exploding activations, unstable weight initialization, or high learning rates.
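A toy illustration with assumed numbers, mirroring the vanishing case: a recurrent Jacobian with norm above 1 grows the gradient geometrically during backpropagation through time:

    import numpy as np

    grad = np.ones(8)
    J = 1.5 * np.eye(8)          # recurrent Jacobian with norm 1.5 (assumed)
    for _ in range(50):          # backpropagate through 50 time steps
        grad = J.T @ grad
    print(np.linalg.norm(grad))  # ~1.8e+09: the gradient has exploded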

8
Q

What happens as a result of exploding gradients?

A

Drastic updates to model parameters, leading to oscillation and non-convergence.
Numerical overflow.

9
Q

Mitigation strategies for Vanishing/Exploding Gradients?

A

Batch normalisation, proper weight initialization, reducing the learning rate, changing the architecture (e.g., to LSTM/GRU), gradient clipping (for exploding gradients), and reducing model complexity.
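A sketch of gradient clipping in PyTorch; the toy model, placeholder loss, and the max_norm threshold of 1.0 are assumptions for illustration:

    import torch
    import torch.nn as nn

    model = nn.RNN(input_size=4, hidden_size=8)              # toy model (assumed)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    x = torch.randn(5, 1, 4)                                 # (seq_len, batch, input_size)
    output, h_n = model(x)
    loss = output.sum()                                      # placeholder loss

    optimizer.zero_grad()
    loss.backward()
    # Rescale all gradients so their global norm is at most 1.0 before the update
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()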

10
Q

What are bidirectional RNNs?

A

Bi-RNNs consist of two RNNs, one processing the input sequence forward in time and the other backward, so each position's output can draw on both past and future context.
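A minimal PyTorch sketch (sizes assumed): bidirectional=True runs one RNN forward and one backward over the sequence and concatenates their states:

    import torch
    import torch.nn as nn

    birnn = nn.RNN(input_size=4, hidden_size=8, bidirectional=True)
    x = torch.randn(5, 1, 4)  # (seq_len, batch, input_size)
    output, h_n = birnn(x)
    print(output.shape)       # torch.Size([5, 1, 16]): forward and backward states concatenated
    print(h_n.shape)          # torch.Size([2, 1, 8]): one final hidden state per direction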

11
Q

What is LSTM?

A

Long Short-Term Memory.

12
Q

Why was LSTM introduced?

A

To address the vanishing gradient problem and to capture long-range dependencies.

13
Q

What is the idea of LSTM?

A

It addresses the vanishing gradient problem by introducing a memory cell with several gating mechanisms that control the flow of information.

14
Q

What are the components of an LSTM unit?

A

Two states: the cell state and the hidden state.
Three gates (see the sketch below):
Input gate: controls the flow of information into the memory cell.
Forget gate: controls the flow of information out of the memory cell.
Output gate: controls the flow of information out of the LSTM and into the output.
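A sketch of a single LSTM step in NumPy to make the components concrete; the weight names and sizes are assumptions, and biases are omitted for brevity:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    hidden_size, input_size = 8, 4
    # One weight matrix per gate plus one for the candidate cell content (assumed shapes)
    W_i, W_f, W_o, W_c = (rng.normal(size=(hidden_size, input_size + hidden_size)) * 0.1
                          for _ in range(4))

    def lstm_step(x_t, h_prev, c_prev):
        z = np.concatenate([x_t, h_prev])
        i = sigmoid(W_i @ z)             # input gate: what enters the cell state
        f = sigmoid(W_f @ z)             # forget gate: how much old cell state to keep (0 = forget)
        o = sigmoid(W_o @ z)             # output gate: what the cell exposes as hidden state
        c_tilde = np.tanh(W_c @ z)       # candidate cell content
        c_t = f * c_prev + i * c_tilde   # new cell state (long-term memory)
        h_t = o * np.tanh(c_t)           # new hidden state (short-term memory)
        return h_t, c_t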

15
Q

How do the gates work?

A

The input gate decides which information to store in the memory cell. It is trained to open when the input is important and close when it is not.

The forget gate decides which information to discard from the memory cell. It is trained to open when the information is no longer important and close when it is.

16
Q

What are GRUs?

A

A simplified version of LSTMs:
Update Gate: Controls the flow of information from the previous hidden state to the current hidden state.
Reset Gate: Determines how much of the previous hidden state to forget.
Hidden State Update: A mathematical operation that combines the previous hidden state with the current input and produces the new hidden state.
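A sketch of a single GRU step in NumPy (weight names and sizes are assumptions; biases omitted):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    hidden_size, input_size = 8, 4
    W_z, W_r, W_h = (rng.normal(size=(hidden_size, input_size + hidden_size)) * 0.1
                     for _ in range(3))

    def gru_step(x_t, h_prev):
        z = sigmoid(W_z @ np.concatenate([x_t, h_prev]))            # update gate
        r = sigmoid(W_r @ np.concatenate([x_t, h_prev]))            # reset gate
        h_tilde = np.tanh(W_h @ np.concatenate([x_t, r * h_prev]))  # candidate state
        return (1 - z) * h_prev + z * h_tilde                       # hidden state update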

17
Q

Difference between LSTM and GRU

A

An LSTM has three gates, whereas a GRU has only two. In an LSTM they are the Input, Forget, and Output gates; in a GRU, the Reset and Update gates.

An LSTM maintains two states: the cell state (long-term memory) and the hidden state (short-term memory). A GRU has only one state, the hidden state (h_t).
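One consequence of the fewer gates and single state, shown with PyTorch's built-in layers (sizes assumed): a GRU of the same dimensions has about three-quarters of an LSTM's parameters:

    import torch.nn as nn

    lstm = nn.LSTM(input_size=4, hidden_size=8)
    gru = nn.GRU(input_size=4, hidden_size=8)

    def n_params(m):
        return sum(p.numel() for p in m.parameters())

    print(n_params(lstm))  # 448: four weight/bias sets (input, forget, output, candidate)
    print(n_params(gru))   # 336: three weight/bias sets (update, reset, candidate)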
