P3 - Architecture & Machine Learning Models Flashcards
What Are Neural Networks in NLP?
Neural networks process sequential data in NLP to understand language.
e.g.: translation, text generation, and chatbot responses.
How do NNs work?
Nodes in one layer are connected to the next.
The strength of each connection varies - this is called the weight.
NN learns by adjusting the weights of the connections.
(Some NNs have billions of parameters (weights) and hundreds of hidden layers.)
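A minimal sketch (not from the cards) of what "weighted connections between layers" means in practice, using NumPy with made-up layer sizes and random values:

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=3)            # 3 input nodes
W1 = rng.normal(size=(3, 4))      # weights: input layer -> hidden layer (4 nodes)
W2 = rng.normal(size=(4, 2))      # weights: hidden layer -> output layer (2 nodes)

hidden = np.tanh(x @ W1)          # each hidden node sums its weighted inputs
output = hidden @ W2              # output nodes do the same with the hidden activations

print(output)                     # "learning" means adjusting W1 and W2
```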
What is the input of a NN?
Each word is first turned into a multidimensional vector using a word embedding algorithm.
This places similar words close together in the vector space (sketched in the example below).
e.g.: Word2Vec was created at Google in 2013 to help with search.
Alternative algorithms:
- GloVe (Global Vectors), Stanford 2014
- fastText, Facebook 2015
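A toy sketch of what an embedding gives you: each word maps to a vector, and similar words sit close together. The vectors below are invented for illustration; a real model (Word2Vec, GloVe, fastText) would learn them from text.

```python
import numpy as np

embeddings = {
    "cat": np.array([0.90, 0.80, 0.10]),
    "dog": np.array([0.85, 0.75, 0.15]),
    "car": np.array([0.10, 0.20, 0.95]),
}

def cosine(a, b):
    # cosine similarity: 1.0 means the vectors point the same way
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["cat"], embeddings["dog"]))  # high: similar words
print(cosine(embeddings["cat"], embeddings["car"]))  # lower: unrelated words
```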
How are the relationships between words represented?
Vectors capture the relationships between words.
Similar vector offsets represent equivalent relationships between pairs of words (e.g. king → queen mirrors man → woman).
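A sketch of the classic analogy test, assuming the gensim library and a small pretrained GloVe model fetched via gensim.downloader (needs network access the first time):

```python
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")   # pretrained word vectors

# "king" - "man" + "woman" should land near "queen" if the royalty/gender
# relationships are captured consistently by the vector offsets.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```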
How is Word2Vec trained using written text?
Each word is compared to those typically found close to it in ordinary texts.
- Continuous Bag of Words (CBOW):
Tries to predict the “central” word in a phrase by looking at those nearby.
- Skip-gram:
Does the opposite - starts with the central word and predicts those likely to be before or after.
A Word2Vec model is trained with one of these two objectives, chosen at training time (see the sketch below).
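A minimal training sketch, assuming the gensim library (gensim 4.x; the cards don't name a library) and a toy corpus of tokenised sentences:

```python
from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "lay", "on", "the", "rug"],
]

# sg=0 selects the CBOW objective, sg=1 selects skip-gram
cbow_model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)
skip_model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(cbow_model.wv["cat"][:5])   # first few dimensions of the learned vector
```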
Simplified neural network process
- An input text is split into tokens (words or parts of words)
- A word embedding algorithm converts the tokens into vectors
- The vectors are passed to the neural network.
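A condensed sketch of that pipeline, assuming PyTorch and a made-up three-word vocabulary; real systems use learned tokenisers and much larger embeddings.

```python
import torch
import torch.nn as nn

vocab = {"the": 0, "cat": 1, "sat": 2}                    # token -> id
tokens = torch.tensor([[vocab[w] for w in "the cat sat".split()]])

embed = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)   # tokens -> vectors
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, len(vocab)))

vectors = embed(tokens)           # shape: (1, 3, 8) - one vector per token
logits = net(vectors)             # the network's output for each position
print(logits.shape)               # torch.Size([1, 3, 3])
```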
What is the purpose of comparing the output vector to the correct vector in training a neural network?
It allows calculation of the error value, which is used for backpropagation to adjust the weights of the layers.
How does backpropagation help in training a neural network?
It adjusts the weights of the layers based on the error value, improving accuracy over time.
Why is training a neural network done repeatedly with large amounts of data?
Repeated training refines the weights, improving the model’s accuracy in predicting correct outputs.
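A toy training loop, assuming PyTorch, showing the compare-to-correct-output, backpropagate, adjust-weights cycle described in the cards above. The data is random and purely illustrative.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 4)                        # a single layer standing in for a full NN
optimiser = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

inputs = torch.randn(32, 8)                    # input vectors
targets = torch.randn(32, 4)                   # the "correct" output vectors

for step in range(100):                        # repeated training refines the weights
    predictions = model(inputs)
    error = loss_fn(predictions, targets)      # compare output to the correct vectors
    optimiser.zero_grad()
    error.backward()                           # backpropagation: gradient of the error
    optimiser.step()                           # adjust the weights
```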
What problem does a Recurrent Neural Network (RNN) solve compared with a standard NN?
A standard NN processes all its input data simultaneously, losing the order of the words and outputting an aggregate result of all the input vectors. An RNN preserves word order by processing tokens one at a time and feeding the result of the last inner layer back into the first, creating a form of memory.
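A sketch of that feedback loop, assuming PyTorch: tokens are processed one at a time and the hidden state from the previous step is fed back in as "memory".

```python
import torch
import torch.nn as nn

cell = nn.RNNCell(input_size=8, hidden_size=16)

sequence = torch.randn(5, 8)                   # 5 word vectors, one per time step
hidden = torch.zeros(16)                       # initial memory

for word_vector in sequence:
    # each step sees the current word AND the previous hidden state
    hidden = cell(word_vector.unsqueeze(0), hidden.unsqueeze(0)).squeeze(0)

print(hidden.shape)                            # final state summarises the whole sequence
```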
Why can’t you use standard backpropagation in an RNN?
Because of the feedback process, you must use backpropagation through time (BPTT) to retrace each step where the output was fed back into the hidden layers.
What is the vanishing gradient problem in RNNs?
The influence of earlier words in a sequence diminishes over time, making it difficult for RNNs to learn long-term dependencies.
Why do RNNs struggle with long input sequences?
Due to the vanishing gradient problem, earlier words in a sequence have a much smaller effect on learning compared to later ones.
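A numerical illustration (not from the cards) of why the gradient vanishes: the gradient reaching an early time step is a product of many per-step factors, and if those factors are below 1 the product shrinks exponentially with sequence length.

```python
per_step_factor = 0.9          # an assumed typical |derivative| passed back each step

for steps in (5, 20, 50, 100):
    print(steps, per_step_factor ** steps)
# 5 -> ~0.59, 20 -> ~0.12, 50 -> ~0.005, 100 -> ~0.00003:
# earlier words end up with almost no influence on learning
```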
What is an LSTM network?
A type of RNN designed to handle long-term dependencies using memory cells and gating mechanisms.
What are the three gates in an LSTM?
- Forget gate – Discards irrelevant or outdated information.
- Input gate – Incorporates new information at each time step.
- Output gate – Passes part of the updated cell state to the next layer.
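A minimal sketch of one LSTM step written out by hand (assuming PyTorch) so the three gates are visible; in practice you would use nn.LSTM rather than coding this yourself.

```python
import torch
import torch.nn as nn

hidden_size, input_size = 16, 8
x = torch.randn(1, input_size)                 # current token's vector
h_prev = torch.zeros(1, hidden_size)           # previous hidden state
c_prev = torch.zeros(1, hidden_size)           # previous cell state (the memory cell)

def gate():
    # one linear layer per gate plus one for the candidate cell update
    return nn.Linear(input_size + hidden_size, hidden_size)

forget_g, input_g, output_g, candidate = gate(), gate(), gate(), gate()

combined = torch.cat([x, h_prev], dim=1)
f = torch.sigmoid(forget_g(combined))          # forget gate: what to discard
i = torch.sigmoid(input_g(combined))           # input gate: what new info to keep
o = torch.sigmoid(output_g(combined))          # output gate: what to expose
c_tilde = torch.tanh(candidate(combined))      # candidate new cell content

c_new = f * c_prev + i * c_tilde               # update the memory cell
h_new = o * torch.tanh(c_new)                  # pass part of the cell state onwards
print(h_new.shape)
```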
How do LSTMs improve upon standard RNNs?
Better at retaining important information over longer sequences.
Reduce the impact of vanishing gradients.
Why do LSTMs help prevent the vanishing gradient problem?
The memory cell allows information to bypass repetitive multiplication, preventing the exponential shrinking of gradients.
What are some limitations of LSTMs?
Still process one token at a time, making them inefficient.
More computationally expensive than RNNs.
Typically require padding sequences to a fixed length for batched training, which is inefficient, or splitting long sequences, which is disruptive.
Struggle with handling global context (distant token relationships).
When are LSTMs useful?
When sequence lengths are moderate (e.g., speech recognition).
When computational resources are limited (e.g., edge computing).
For domain-specific models whose pipelines are already optimised around LSTMs.