Class Eleven Flashcards
What is Long Short-Term Memory (LSTM)?
Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture that is specifically designed to address the vanishing gradient problem in traditional RNNs. It introduces memory cells and gating mechanisms to effectively capture long-term dependencies in sequential data.
What are the advantages of using LSTM?
Advantages of using LSTM include its ability to capture long-term dependencies and to mitigate the vanishing gradient problem, which in a plain RNN causes information and error signals to decay over long sequences. (Exploding gradients are usually handled separately, for example with gradient clipping.)
How does an LSTM cell work?
An LSTM cell consists of a memory cell, an input gate, a forget gate, and an output gate. The input gate controls the flow of information into the memory cell, the forget gate controls the retention or removal of information from the cell, and the output gate controls the output of information from the cell.
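A minimal NumPy sketch of one LSTM step may make the gating concrete (the weight layout, variable names, and toy dimensions are illustrative assumptions, not any particular library's API):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W has shape (4 * hidden, input_dim + hidden)."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b     # all gate pre-activations at once
    i = sigmoid(z[0 * hidden:1 * hidden])         # input gate
    f = sigmoid(z[1 * hidden:2 * hidden])         # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])         # output gate
    g = np.tanh(z[3 * hidden:4 * hidden])         # candidate cell content
    c_t = f * c_prev + i * g                      # update the memory cell
    h_t = o * np.tanh(c_t)                        # gated output becomes the new hidden state
    return h_t, c_t

# Toy usage with random parameters (dimensions are arbitrary for the example).
rng = np.random.default_rng(0)
input_dim, hidden = 3, 5
W = rng.normal(size=(4 * hidden, input_dim + hidden)) * 0.1
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.normal(size=(7, input_dim)):         # a sequence of 7 time steps
    h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)                            # (5,) (5,)
```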
What is the purpose of the memory cell in an LSTM?
The memory cell in an LSTM is responsible for storing and updating the information over time. It allows the network to selectively retain or forget information based on the input and gating mechanisms.
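In the standard LSTM equations (notation follows the common convention, with ⊙ denoting element-wise multiplication), the cell state is updated additively: the forget gate scales the old content and the input gate scales the new candidate content:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$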
What is the role of the input gate in an LSTM?
The input gate in an LSTM determines how much of the new candidate information should be written into the memory cell. It is computed from the current input and the previous hidden state and selectively updates the cell state.
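In the usual formulation (σ is the logistic sigmoid; the weight and bias names are conventional, not tied to any library), the input gate and the candidate content it scales are both computed from the current input x_t and the previous hidden state h_{t-1}:

$$i_t = \sigma(W_i\,[h_{t-1}, x_t] + b_i), \qquad \tilde{c}_t = \tanh(W_c\,[h_{t-1}, x_t] + b_c)$$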
What is the purpose of the forget gate in an LSTM?
The forget gate in an LSTM controls the amount of information retained in the memory cell. It decides which information from the previous cell state should be discarded based on the input and the previous hidden state.
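Under the same conventions, the forget gate is a sigmoid of the current input and previous hidden state; entries near 0 erase the corresponding cell contents and entries near 1 keep them:

$$f_t = \sigma(W_f\,[h_{t-1}, x_t] + b_f)$$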
What is the function of the output gate in an LSTM?
The output gate in an LSTM determines how much of the memory cell's content is exposed as the new hidden state. The cell state is passed through a tanh non-linearity and multiplied element-wise by the output gate, and the result becomes the hidden state passed to the next time step.
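Under the same conventions, the output gate scales a tanh-squashed copy of the cell state to produce the new hidden state:

$$o_t = \sigma(W_o\,[h_{t-1}, x_t] + b_o), \qquad h_t = o_t \odot \tanh(c_t)$$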
How does Gated Recurrent Unit (GRU) differ from LSTM?
Gated Recurrent Unit (GRU) is a gated RNN closely related to LSTM that also addresses the vanishing gradient problem but has a simpler architecture. GRU combines the input gate and the forget gate into a single update gate and merges the cell state and hidden state into one state vector, reducing the number of parameters compared to LSTM.
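A minimal NumPy sketch of one GRU step, under the same illustrative assumptions as the LSTM sketch above, shows that there is no separate memory cell, only the hidden state h:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, Wz, Wr, Wh, bz, br, bh):
    """One GRU time step; each W* has shape (hidden, input_dim + hidden)."""
    xh = np.concatenate([x_t, h_prev])
    z = sigmoid(Wz @ xh + bz)                     # update gate: how much to overwrite
    r = sigmoid(Wr @ xh + br)                     # reset gate: how much history feeds the candidate
    h_cand = np.tanh(Wh @ np.concatenate([x_t, r * h_prev]) + bh)  # candidate state
    return (1 - z) * h_prev + z * h_cand          # interpolate between old state and candidate

# Toy usage with random parameters (dimensions are arbitrary for the example).
rng = np.random.default_rng(0)
input_dim, hidden = 3, 5
Wz, Wr, Wh = (rng.normal(size=(hidden, input_dim + hidden)) * 0.1 for _ in range(3))
bz = br = bh = np.zeros(hidden)
h = np.zeros(hidden)
for x in rng.normal(size=(7, input_dim)):
    h = gru_step(x, h, Wz, Wr, Wh, bz, br, bh)
print(h.shape)                                    # (5,)
```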
What are the advantages of using GRU?
Advantages of using GRU include its simpler architecture, which results in fewer parameters and faster training compared to LSTM. GRU is particularly useful when dealing with less complex sequential data or when computational resources are limited.
How does the update gate in GRU work?
The update gate in GRU determines how much of the previous hidden state is carried over and how much is replaced by the new candidate state at the current time step. It combines the roles of the input gate and forget gate in LSTM.
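In one common GRU formulation (notation illustrative), the update gate z_t interpolates between the previous hidden state and the new candidate state, playing the combined role of the LSTM's forget and input gates:

$$z_t = \sigma(W_z\,[h_{t-1}, x_t] + b_z), \qquad h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$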
What is the reset gate in GRU?
The reset gate in GRU controls how much of the previous hidden state should be forgotten or retained in the current time step. It allows the model to selectively reset or preserve the memory of past information.
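Under the same conventions, the reset gate r_t controls how much of the previous hidden state feeds into the candidate state:

$$r_t = \sigma(W_r\,[h_{t-1}, x_t] + b_r), \qquad \tilde{h}_t = \tanh(W_h\,[r_t \odot h_{t-1},\, x_t] + b_h)$$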
How are LSTMs and GRUs trained?
LSTMs and GRUs are trained using backpropagation through time (BPTT): the network is unrolled over the sequence, gradients of a loss function are computed with respect to all parameters, and a gradient-based optimizer updates the parameters to reduce the prediction error. Gradient clipping is commonly used alongside BPTT to keep exploding gradients in check.
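As a hedged illustration, here is a tiny PyTorch training loop for an LSTM sequence classifier; the model, data shapes, and hyperparameters are made up for the example, but the backward pass through the unrolled sequence is exactly BPTT, and gradient clipping is a common companion:

```python
import torch
import torch.nn as nn

# Toy sequence classifier; all sizes, names, and hyperparameters are illustrative assumptions.
class SeqClassifier(nn.Module):
    def __init__(self, input_size=8, hidden_size=32, num_classes=2):
        super().__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                  # x: (batch, time, features)
        _, (h_n, _) = self.rnn(x)          # h_n: final hidden state, (num_layers, batch, hidden)
        return self.head(h_n[-1])          # classify from the last layer's final hidden state

model = SeqClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(16, 20, 8)                 # random batch: 16 sequences, 20 steps, 8 features
y = torch.randint(0, 2, (16,))             # random class labels for the toy example

for _ in range(5):                         # a few BPTT updates on the same toy batch
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                        # backpropagation through all 20 time steps (BPTT)
    nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # clip to guard against exploding gradients
    optimizer.step()
```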
In which domains or applications are LSTMs and GRUs commonly used?
LSTMs and GRUs are commonly used in natural language processing (NLP) tasks such as language translation, sentiment analysis, text generation, and speech recognition. They are also utilized in time series analysis, anomaly detection, and other sequence-based applications.
What are some limitations of LSTMs and GRUs?
Limitations of LSTMs and GRUs include the potential for overfitting, the requirement of large amounts of data for effective training, and the computational complexity, which can make them slower to train and deploy compared to simpler models.
How do LSTMs and GRUs help address the vanishing gradient problem?
LSTMs and GRUs address the vanishing gradient problem by using gating mechanisms that selectively retain or discard information over time. In particular, the LSTM's additive cell-state update and the GRU's interpolation between old and candidate states create paths along which gradients can flow with little attenuation, so error signals can be propagated effectively over long sequences.
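One way to see this (a sketch, not a full derivation): along the LSTM's cell-state path the update is additive, so the direct Jacobian between consecutive cell states is simply the forget gate,

$$\frac{\partial c_t}{\partial c_{t-1}} = \mathrm{diag}(f_t) \quad \text{(ignoring the indirect dependence through } h_{t-1}\text{)},$$

and as long as the forget gate stays close to 1, the error signal passes along this path nearly unchanged instead of being repeatedly squashed through saturating non-linearities as in a vanilla RNN.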