Recurrent Neural Networks Flashcards

1
Q

What is a recurrent neural network (RNN)?

A

An RNN is a class of neural network in which a ‘loop’ within the network allows data to persist, or be ‘saved within’ the network, so that it can be used when evaluating the next input in the sequence.

At any time t, the cell has an input x(t) and an output y(t). Part of y(t) - the hidden state, denoted h(t) - is fed back into the cell for use in evaluating x(t + 1).
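As a minimal sketch of one recurrent step (plain NumPy; the weight names W_xh, W_hh and bias b_h are illustrative, not from any particular library):

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # The new hidden state depends on the current input AND the previous
    # hidden state -- this feedback is the 'loop' that lets data persist.
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)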

2
Q

What are recurrent neural networks (RNNs) used for?

A

They are extremely good at dealing with temporal data - e.g. sequences of sports results - as they can remember previous evaluations when processing the current one.

3
Q

What is the process of padding sequences?

A

Padding is used to ensure all sequences have the same length, which helps when dealing with NNs that expect fixed-size inputs.

e.g. [1,2], [3,4,5] -> [1,2,0], [3,4,5]
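A minimal sketch in plain Python (the helper name pad_to_length is hypothetical, chosen just for this example):

def pad_to_length(seqs, pad_value=0):
    max_len = max(len(s) for s in seqs)
    # Append pad_value to each shorter sequence ('post' padding).
    return [s + [pad_value] * (max_len - len(s)) for s in seqs]

pad_to_length([[1, 2], [3, 4, 5]])  # -> [[1, 2, 0], [3, 4, 5]]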

4
Q

What is the process of word embedding?

A

Word embedding transforms token indices (the integers assigned to words during tokenization) into dense vectors, where each dimension encodes information about how the word relates to others in the given context.

e.g. tokenization first maps each word to an index:
“Hope to see you soon!” -> [1,2,3,4,5,6]
“Nice to see you again!” -> [7,2,3,4,8,6]
(the shared words “to see you” and “!” receive the same indices; the embedding layer then maps each index to a dense vector)
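Assuming the sentences above have already been tokenized to those indices, an embedding layer is just a lookup into a matrix of vectors. A minimal NumPy sketch (the vocabulary size and embedding dimension are arbitrary, and a real embedding matrix would be learned, not random):

import numpy as np

vocab_size, embed_dim = 10, 4                  # arbitrary illustrative sizes
rng = np.random.default_rng(0)
embedding = rng.normal(size=(vocab_size, embed_dim))  # learned during training in practice

tokens = [1, 2, 3, 4, 5, 6]                    # “Hope to see you soon!”
vectors = embedding[tokens]                    # shape (6, 4): one dense vector per token

Because “to”, “see” and “you” share indices across the two sentences, they map to identical vectors in both.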

5
Q

What is the main problem with N-grams in word prediction?

A

Under the Markov assumption, only the prior local context - the previous n-1 words - affects the prediction of the next word, so anything earlier in the text is ignored.
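For example, a trigram model (n = 3) conditions on only the last two words. A toy Python sketch (the corpus is invented for illustration):

from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat sat on the bed".split()
trigram_counts = defaultdict(Counter)
for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    trigram_counts[(w1, w2)][w3] += 1

# The prediction depends ONLY on the last two words; everything that
# came before them in the sentence is ignored.
print(trigram_counts[("cat", "sat")].most_common(1))  # [('on', 2)]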

6
Q

How can we use an RNN to perform word prediction?

A

Tokenize the text into words or subwords. Then create input-output pairs where the input is a sequence of tokens and the output is the next word that follows it.

Use a stack of RNN layers to capture the sequential dependencies; the more RNN layers used, the wider the local context the model can capture.
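A minimal sketch using Keras (the layer sizes and vocabulary size are arbitrary illustrative choices, not a reference architecture):

import tensorflow as tf

vocab_size = 10_000  # illustrative vocabulary size

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),                # indices -> dense vectors
    tf.keras.layers.SimpleRNN(128, return_sequences=True),    # first of the stacked RNN layers
    tf.keras.layers.SimpleRNN(128),                           # final hidden state only
    tf.keras.layers.Dense(vocab_size, activation="softmax"),  # distribution over the next word
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# Train on (input sequence, next word) pairs built from the tokenized text.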

7
Q

What is the main issue with RNNs?

A

RNNs are susceptible to the vanishing gradient problem: for long sequences, the gradients shrink toward zero as they are backpropagated through many time steps, making it effectively impossible to learn long-range dependencies.
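A quick numeric illustration of why: backpropagation through time multiplies one gradient factor per time step, and if each factor is below 1 the product collapses (the 0.9 here is an arbitrary example value):

factor = 0.9              # per-time-step gradient factor (arbitrary, < 1)
gradient = 1.0
for _ in range(100):      # backpropagate through 100 time steps
    gradient *= factor
print(gradient)           # ~2.7e-05: effectively zero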

8
Q

What is Long Short-Term Memory (LSTM)?

A

LSTM networks are a class of RNN that use special units called ‘memory cells’, which can maintain information in memory for long periods of time by regulating it with ‘gates’.

9
Q

What is the input gate of an LSTM network?

A

Controls the intake of new information.

10
Q

What is the forget gate of an LSTM network?

A

Determines which parts of the existing cell state to keep and which to discard.

11
Q

What is the output gate of an LSTM network?

A

Determines what part of the cell state to output.
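Putting the three gates together, a minimal NumPy sketch of one LSTM step (the weight-matrix names are illustrative and biases are omitted for brevity):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W):
    z = np.concatenate([x_t, h_prev])
    f = sigmoid(W["f"] @ z)   # forget gate: which parts of c_prev to keep
    i = sigmoid(W["i"] @ z)   # input gate: how much new information to take in
    o = sigmoid(W["o"] @ z)   # output gate: what part of the state to output
    c_new = f * c_prev + i * np.tanh(W["c"] @ z)  # updated cell state
    h_new = o * np.tanh(c_new)                    # output / new hidden state
    return h_new, c_new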
