Deep Sequence Modelling Flashcards
What is a Sequence Model?
SM split data into …
Sequence models split data into small chunks (sequences) and process them in order to solve prediction problems such as classification.
What are the 4 problem types SMs solve?
- one-to-one - binary classification
- many-to-one - sentiment classification
- one-to-many - image captioning
- many-to-many - machine translation
Within SMs, what are neurons with recurrence?
Neurons with recurrence compute … current input and previous output …
Neurons with recurrence compute, at each time step, a function of the current input and the previous time step's output (past memory).
How do RNNs work?
Apply a ___ relation at every ___ to process a sequence.
They apply a recurrence relation at every time step to process a sequence:
h_t = f_W(x_t, h_{t-1})
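The recurrence relation above can be sketched as a single step function. This is a minimal illustration with made-up dimensions and randomly initialized weights (`W_xh`, `W_hh` are assumptions for the example); `f_W` is taken to be the common tanh parameterization.

```python
import numpy as np

# Hypothetical sizes for illustration only.
input_dim, hidden_dim = 3, 4
rng = np.random.default_rng(0)
W_xh = rng.standard_normal((hidden_dim, input_dim)) * 0.1   # input-to-hidden weights
W_hh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1  # hidden-to-hidden weights

def rnn_step(x_t, h_prev):
    """One application of the recurrence h_t = f_W(x_t, h_{t-1})."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev)

h = np.zeros(hidden_dim)            # initial hidden state h_0
x = rng.standard_normal(input_dim)  # one input vector x_t
h = rnn_step(x, h)                  # new hidden state h_1
print(h.shape)
```

Because tanh squashes its input, every entry of the hidden state stays in (-1, 1).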
What is the RNN intuition?
Give the idea of the process
Input vector -> update hidden state -> output vector / predicted output
What is the computation of an RNN across time?
- __ weight matrices
- __ across time steps
- During forward propagation, compute the __ at each time step
- Sum the total __ across the sequence
- Reuse the same weight matrices at every time step.
- Update the hidden state across time steps.
- During forward propagation, compute the loss at each time step; gradients are then computed with backpropagation through time.
- Sum the total loss across the whole sequence.
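The four points above can be sketched as an unrolled forward pass. All names and sizes here are illustrative assumptions; the key details are that the same three weight matrices are reused at every step and the per-step losses are summed.

```python
import numpy as np

rng = np.random.default_rng(1)
input_dim, hidden_dim, output_dim, T = 3, 4, 2, 5  # illustrative sizes
W_xh = rng.standard_normal((hidden_dim, input_dim)) * 0.1
W_hh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
W_hy = rng.standard_normal((output_dim, hidden_dim)) * 0.1

xs = rng.standard_normal((T, input_dim))   # toy input sequence
ys = rng.standard_normal((T, output_dim))  # toy target sequence

h = np.zeros(hidden_dim)
total_loss = 0.0
for t in range(T):
    # The SAME W_xh, W_hh, W_hy are reused at every time step.
    h = np.tanh(W_xh @ xs[t] + W_hh @ h)        # update hidden state
    y_hat = W_hy @ h                            # predicted output
    total_loss += np.sum((y_hat - ys[t]) ** 2)  # per-step loss, summed
print(total_loss)
```

In training, backpropagation through time would then push gradients of `total_loss` back through every step of this loop.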
What are the 4 Design Criteria for SM?
- Handle __ length sequence
- Track ____ dependencies
- Maintain info about __
- Share __ across the seq
- Handle variable-length sequences
- Track long-term dependencies
- Maintain info about order
- Share parameters across the sequence
What is the technique called that transforms language into indexes?
Give 1 word
Embedding / Encoding
What are the 4 criteria to model sequences?
RNNs meet these criteria
- Handle __ seq
- Track __ dependencies
- Maintain info about __
- Share __ across the seq
- Handle variable-length sequences
- Track long-term dependencies
- Maintain information about order
- Share parameters across the sequence
The standard RNN gradient flow involves repeated multiplication by the same weight matrix. What are the 2 issues with this?
- Large values cause __ gradients
- Small values cause __ gradients
- Values much larger than 1 cause exploding gradients.
- Values much smaller than 1 cause vanishing gradients.
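Both failure modes fall out of repeated multiplication by the same matrix, which is what backpropagation through an unrolled RNN does: the hidden-to-hidden matrix appears once per time step in the chain rule. A toy demonstration (the scaled identity matrix is an assumption chosen to make the eigenvalues obvious):

```python
import numpy as np

def gradient_norm_after(T, scale):
    """Norm of a gradient pushed back through T steps of W = scale * I."""
    W = np.eye(4) * scale   # toy recurrent weight matrix
    grad = np.ones(4)
    for _ in range(T):
        grad = W.T @ grad   # one chain-rule factor per time step
    return np.linalg.norm(grad)

exploding = gradient_norm_after(50, 1.1)  # eigenvalues > 1: grows fast
vanishing = gradient_norm_after(50, 0.9)  # eigenvalues < 1: shrinks fast
print(exploding, vanishing)
```

After only 50 steps the two gradients already differ by several orders of magnitude, which is why long sequences make plain RNNs hard to train.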
Why are vanishing gradients a big problem?
They cause the model to…
They cause the model to lose the ability to learn long-term dependencies: errors from distant time steps stop influencing the weights, so the model cannot learn anything useful from earlier parts of the sequence.
What is the most robust way to mitigate vanishing gradients?
___ cells: Use __ to add or remove info.
Gated cells: Use gates to selectively add or remove info within each recurrent unit.
What is the key concept of Long Short-Term Memory (LSTM)?
__ & __ information
Forget & Store information
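The forget/store idea can be sketched as one LSTM step. This is a minimal, unoptimized version of the standard gate equations; the bias terms are omitted and all weight shapes are assumptions for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: gates decide what to forget and what to store."""
    Wf, Wi, Wo, Wc = params           # each maps [h_prev; x_t] -> hidden_dim
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(Wf @ z)               # forget gate: what to erase from memory
    i = sigmoid(Wi @ z)               # input gate: what new info to store
    o = sigmoid(Wo @ z)               # output gate: what to expose
    c_tilde = np.tanh(Wc @ z)         # candidate memory content
    c = f * c_prev + i * c_tilde      # forget old info + store new info
    h = o * np.tanh(c)                # hidden state read from memory
    return h, c

rng = np.random.default_rng(2)
hidden_dim, input_dim = 4, 3
params = [rng.standard_normal((hidden_dim, hidden_dim + input_dim)) * 0.1
          for _ in range(4)]
h, c = lstm_step(rng.standard_normal(input_dim),
                 np.zeros(hidden_dim), np.zeros(hidden_dim), params)
print(h.shape, c.shape)
```

Because the cell state `c` is updated additively rather than by repeated matrix multiplication, gradients flow through it more easily, which mitigates vanishing gradients.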
How do you build a more effective Sequence Model?
Use __ to model sequences without recurrence.
Use self-attention to model sequences without recurrence.
What is the Transformer architecture in AI?
Self-attention is the foundational mechanism built into the Transformer's neural network architecture.
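The self-attention mechanism above can be sketched as a single scaled dot-product attention layer. Sizes and weight matrices are illustrative assumptions; the key point is that every position attends to every other position in one matrix operation, with no recurrence.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a whole sequence at once."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv               # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ V  # each position is a weighted mix of all positions

rng = np.random.default_rng(3)
T, d = 5, 4  # illustrative sequence length and model dimension
X = rng.standard_normal((T, d))                    # embedded input sequence
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)
```

Unlike the RNN loop earlier, nothing here depends on processing positions one at a time, which is what lets Transformers parallelize over the sequence.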