DL-07 - Sequence models Flashcards

1
Q

DL-07 - Sequence models

What is a sequence model?

A

A model that handles sequential data, where the order of the data matters.

2
Q

DL-07 - Sequence models

What is another name for sequence models?

A

seq2seq

3
Q

DL-07 - Sequence models

What is another name for seq2seq?

A

Sequence models.

4
Q

DL-07 - Sequence models

What is the definition of a sequence model?

A

An ML model where the input or output is a sequence of data (e.g. text, audio, or time series data).

5
Q

DL-07 - Sequence models

What are the different (abstract) types of sequence models called? (4)

A
  • One-to-one
  • One-to-many
  • Many-to-one
  • Many-to-many
6
Q

DL-07 - Sequence models

Describe what a one-to-one model looks like.

A

(See image)

7
Q

DL-07 - Sequence models

Describe what a one-to-many model looks like.

A

(See image)

8
Q

DL-07 - Sequence models

Describe what a many-to-one model looks like.

A

(See image)

9
Q

DL-07 - Sequence models

Describe what a many-to-many model looks like.

A

(See image)

10
Q

DL-07 - Sequence models

What is a named entity recognition task?

A

Determine which words in a sentence are entities, generally names of things (e.g. people or places).

(See image)

11
Q

DL-07 - Sequence models

What task is this an example of? (See image)

A

Named entity recognition

12
Q

DL-07 - Sequence models

What is sentiment analysis?

A

Predict the sentiment of some input, e.g. positive or negative. (See image)

13
Q

DL-07 - Sequence models

What task is this an example of? (See image)

A

Sentiment analysis.

14
Q

DL-07 - Sequence models

What is activity recognition?

A

A task where you label the activity in e.g. an image or a video. (See image)

15
Q

DL-07 - Sequence models

What task is this? (See image)

A

Activity recognition.

16
Q

DL-07 - Sequence models

What are some popular sequence models? (3)

A
  • RNN
  • LSTM
  • Transformers
17
Q

DL-07 - Sequence models

What is the main idea behind RNNs?

A

RNNs process sequential data by maintaining an internal state and iteratively updating it with each input in the sequence.
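
A minimal sketch of this update loop in numpy (the sizes, weights, and the tanh cell are illustrative assumptions, not from the lecture):

    import numpy as np

    input_size, hidden_size = 4, 3                    # illustrative sizes
    rng = np.random.default_rng(0)
    Wx = rng.normal(size=(hidden_size, input_size))   # input weights
    Wh = rng.normal(size=(hidden_size, hidden_size))  # recurrent weights
    b = np.zeros(hidden_size)                         # bias

    h = np.zeros(hidden_size)                    # internal (hidden) state
    sequence = rng.normal(size=(5, input_size))  # 5 time steps of input

    for x_t in sequence:
        # Iteratively update the internal state from the previous state
        # and the current input
        h = np.tanh(Wx @ x_t + Wh @ h + b)

    print(h)  # the final state summarizes the whole sequence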

18
Q

DL-07 - Sequence models

What model can you think of as a sequence of neural networks that are trained one after another?

A

RNN (and LSTM)

19
Q

DL-07 - Sequence models

Describe how we typically draw RNNs.

A

(See image)

20
Q

DL-07 - Sequence models

What model is depicted?

A

RNN

21
Q

DL-07 - Sequence models

Describe what x, t, h and y are in the image. (See image)

A
  • t is the time step
  • x are the inputs
  • h are the hidden states
  • y are the predicted outputs
22
Q

DL-07 - Sequence models

What parameters does an RNN layer have?

A
  • Weights
  • Biases
  • Recurrent weights, applied to the hidden state (the output from the previous time step)
23
Q

DL-07 - Sequence models

In an RNN, what are T_x and T_y?

A

The number of time steps in the input sequence (T_x) and in the output sequence (T_y).

24
Q

DL-07 - Sequence models

What is BPTT short for?

A

Backpropagation through time

25
Q

DL-07 - Sequence models

How does backpropagation through time work?

A

By unrolling the recurrent neural network through time and applying standard backpropagation to compute gradients for updating weights.
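
A toy illustration with a scalar RNN (the numbers are hypothetical; a real implementation handles matrices and batches):

    import numpy as np

    # Toy RNN: h_t = tanh(w_x*x_t + w_h*h_{t-1}); loss = (h_T - target)^2
    w_x, w_h = 0.5, 0.8                  # shared weights (hypothetical values)
    xs, target = [1.0, -0.5, 0.3], 0.2

    # Forward: unroll through time, storing each state for the backward pass
    hs = [0.0]
    for x in xs:
        hs.append(np.tanh(w_x * x + w_h * hs[-1]))
    loss = (hs[-1] - target) ** 2

    # Backward: standard backprop over the unrolled graph, accumulating
    # gradients for the shared weights at every time step
    dh, gw_x, gw_h = 2 * (hs[-1] - target), 0.0, 0.0
    for t in reversed(range(len(xs))):
        da = dh * (1 - hs[t + 1] ** 2)   # through tanh
        gw_x += da * xs[t]               # same w_x is used at every step
        gw_h += da * hs[t]               # same w_h is used at every step
        dh = da * w_h                    # flows back to the previous step

    print(loss, gw_x, gw_h)              # gradients used to update w_x, w_h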

26
Q

DL-07 - Sequence models

When is loss backpropagated in BPTT?

A

The loss is backpropagated from the last time step to the first, which allows the weights to be updated.

27
Q

DL-07 - Sequence models

What is NLP short for?

A

Natural language processing.

28
Q

DL-07 - Sequence models

What are the two steps of text sequence representation in Natural Language Processing (NLP)?

A

The two steps are:
- creation of vocabulary
- numeric representation of text/words.

29
Q

DL-07 - Sequence models

In NLP, what is a vocabulary?

A

A dictionary of unique words of interest (e.g. Norwegian or English words).

30
Q

DL-07 - Sequence models

In NLP, what do we call a dictionary of unique words of interest (e.g. Norwegian or English words)?

A

A vocabulary.

31
Q

DL-07 - Sequence models

How are sentences tokenized?

A

Generally word by word.
(Advanced: See subword tokenization)
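
For example, in its simplest form:

    sentence = "I love AI"
    tokens = sentence.split()   # word-by-word: ['I', 'love', 'AI']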

32
Q

DL-07 - Sequence models

How do you create a vocabulary? (5)

A

Take your input data.

Perform:
- Remove punctuation
- Remove stop words
- Stem words (transporting -> transport)
- Add start/end-of-sentence tokens
- Add other identifiers as necessary (e.g. <UNKNOWN>, <DIGIT>).

Select unique words.
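
A rough sketch of these steps in Python (the stop-word list and the "stemming" rule are simplified stand-ins for real tools such as NLTK):

    import re

    stop_words = {"i", "the", "a", "is"}   # illustrative stop-word list

    def build_vocabulary(texts):
        words = []
        for text in texts:
            text = re.sub(r"[^\w\s]", "", text.lower())  # remove punctuation
            for word in text.split():
                if word in stop_words:                   # remove stop words
                    continue
                words.append(re.sub(r"ing$", "", word))  # crude stemming
        # Start/end-of-sentence tokens and other identifiers, then unique words
        return ["<SOS>", "<EOS>", "<UNKNOWN>", "<DIGIT>"] + sorted(set(words))

    print(build_vocabulary(["I love transporting AI.", "AI is cool"]))
    # ['<SOS>', '<EOS>', '<UNKNOWN>', '<DIGIT>', 'ai', 'cool', 'love', 'transport']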

33
Q

DL-07 - Sequence models

What are the commonly used techniques for text representation? (3)

A
  • One-hot encoding
  • Bag-of-words
  • Word embeddings
34
Q

DL-07 - Sequence models

What are these examples of in NLP?
- One-hot encoding
- Bag-of-words
- Word embeddings

A

Text representation techniques.

35
Q

DL-07 - Sequence models

How do you use one-hot encoding to represent text in NLP?

A

Convert each unique character or word into a binary vector with a 1 at the position corresponding to that character or word and 0s elsewhere.
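
A small sketch with a hypothetical five-word vocabulary:

    import numpy as np

    vocabulary = ["AI", "cool", "I", "is", "love"]   # hypothetical vocabulary

    def one_hot(word):
        vec = np.zeros(len(vocabulary))
        vec[vocabulary.index(word)] = 1   # 1 at the word's position, 0s elsewhere
        return vec

    print(one_hot("cool"))   # [0. 1. 0. 0. 0.]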

36
Q

DL-07 - Sequence models

What is the issue with one-hot encoding in NLP?

A

The curse of dimensionality: each vector is as long as the vocabulary, so the representation becomes huge and sparse.

37
Q

DL-07 - Sequence models

How can you solve the curse-of-dimensionality problem with one-hot encoded data?

A

One-hot encoded vectors can be transformed to a lower dimensional space using an embedding technique.

38
Q

DL-07 - Sequence models

In NLP, what is bag-of-words?

A

Bag-of-words is a representation technique where a text is described by the frequency of its words, disregarding grammar and word order but maintaining multiplicity.

39
Q

DL-07 - Sequence models

What is BOW short for?

A

Bag-of-words representation.

40
Q

DL-07 - Sequence models

What does the bag-of-words (BOW) representation do with words in a text?

A

The bag-of-words representation puts words in a “bag” and scores them based on their counts or frequencies in the text.

41
Q

DL-07 - Sequence models

What could the BOW representation for this sentence look like?

input text: “I love AI. AI is cool”

A

BoW representation: [2, 1, 1, 1, 1], corresponding to the vocabulary [AI, cool, I, is, love].

The representation is a vector where each index corresponds to a particular vocabulary word, and the number at that position is how many times that word occurred in the text.
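
The same example reproduced with a Counter (tokenization kept deliberately simple):

    from collections import Counter

    text = "I love AI. AI is cool"
    counts = Counter(text.replace(".", "").split())

    vocabulary = sorted(counts, key=str.lower)    # ['AI', 'cool', 'I', 'is', 'love']
    bow = [counts[word] for word in vocabulary]   # [2, 1, 1, 1, 1]
    print(vocabulary, bow)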

42
Q

DL-07 - Sequence models

What are some problems with BOW word frequency?

A

Highly frequent words dominate the representation (they get larger scores) even if they carry little "informational content" (e.g. words like "I", "the", "a").

43
Q

DL-07 - Sequence models

What is TF-IDF short for?

A

Term Frequency-Inverse Document Frequency

44
Q

DL-07 - Sequence models

What does TF-IDF do?

A

It rescales word frequencies by how often the words appear across all documents, penalizing words that are common everywhere.

45
Q

DL-07 - Sequence models

What is the formula for TF-IDF?

A

(See image)
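
For reference, one common formulation (the exact variant in the image may differ):

    \mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log\frac{N}{\mathrm{df}(t)}

where tf(t, d) is the frequency of term t in document d, N is the total number of documents, and df(t) is the number of documents containing t.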

46
Q

DL-07 - Sequence models

What are word embeddings in NLP?

A

Word embedding is a technique that maps words or phrases to numerical vectors of a given size.

47
Q

DL-07 - Sequence models

What does the word embedding technique do? (2)

A
  • Maps words or phrases to numerical vectors of a given size.
  • Reduces the dimensionality of word/sentence representations.
48
Q

DL-07 - Sequence models

What are some popular word embedding techniques? (3)

A
  • GloVe
  • Word2Vec
  • NN embedding layer
49
Q

DL-07 - Sequence models

What are GloVe and Word2Vec examples of?

A

Word embedding techniques (models).

50
Q

DL-07 - Sequence models

How does an NN embedding layer work?

A

(See image)
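
A minimal sketch of the idea: the layer is a trainable matrix, and looking up a word's row is equivalent to multiplying its one-hot vector by that matrix (sizes are illustrative):

    import numpy as np

    vocab_size, embedding_dim = 5, 3                  # illustrative sizes
    rng = np.random.default_rng(0)
    E = rng.normal(size=(vocab_size, embedding_dim))  # learned during training

    word_index = 2                   # e.g. the word's position in the vocabulary
    embedding = E[word_index]        # dense, low-dimensional representation

    one_hot = np.zeros(vocab_size)   # the equivalent one-hot view
    one_hot[word_index] = 1
    assert np.allclose(one_hot @ E, embedding)
    print(embedding)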

51
Q

DL-07 - Sequence models

What can happen to gradients in RNNs?

A

Gradients can vanish or explode, causing the model to stop learning or to take too long to train.

52
Q

DL-07 - Sequence models

What do traditional sequence models struggle with, in terms of relating to past information?

A

They cannot relate to the past beyond the immediate previous input.

53
Q

DL-07 - Sequence models

What is a solution for improving sequence models to better remember distant inputs?

A

Add memory and make efficient use of it, possibly by forgetting less relevant information.

54
Q

DL-07 - Sequence models

What are two improved Seq2Seq models that incorporate memory?

A
  • GRU (Gated Recurrent Unit)
  • LSTM (Long Short-Term Memory).
55
Q

DL-07 - Sequence models

What is GRU short for?

A

Gated Recurrent Unit

56
Q

DL-07 - Sequence models

What is LSTM short for?

A

Long Short-Term Memory

57
Q

DL-07 - Sequence models

Label the parts that are masked out. (See image)

A
  • Forget
  • Update
  • Input
  • Output (Result)
58
Q

DL-07 - Sequence models

What is the main purpose of LSTM networks in deep learning?

A

LSTM networks extend the memory of RNNs to learn from important experiences with long time steps in between.

59
Q

DL-07 - Sequence models

What is one advantage of using LSTM networks over traditional RNNs?

A

LSTM networks enable short-term memory to last for a longer time.

60
Q

DL-07 - Sequence models

What issue with sequence model training do LSTMs help mitigate?

A

LSTM networks help mitigate the problematic issue of vanishing gradients.

61
Q

DL-07 - Sequence models

What are the gates in an LSTM called? (4)

A
  • Input
  • Output
  • Update
  • Forget
62
Q

DL-07 - Sequence models

What is the purpose of the input gate in an LSTM?

A

The input gate determines how much of the new input should be added to the cell state.

63
Q

DL-07 - Sequence models

What is the purpose of the forget gate in an LSTM?

A

The forget gate decides what information to discard from the cell state.

64
Q

DL-07 - Sequence models

What is the purpose of the output gate in an LSTM?

A

The output gate selects which values from the updated cell state will be passed to the next hidden state.

65
Q

DL-07 - Sequence models

What is the purpose of the update gate in an LSTM?

A

The update gate computes candidate values to be added to the cell state, based on the current input and previous hidden state.

66
Q

DL-07 - Sequence models

Describe the LSTM model’s architecture.

A

(See image)
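
A minimal numpy sketch of one cell step using the four gates from the cards above (weights and sizes are illustrative stand-ins):

    import numpy as np

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    n_in, n_h = 4, 3                     # illustrative sizes
    rng = np.random.default_rng(0)
    W = {g: rng.normal(size=(n_h, n_in + n_h)) for g in "figo"}
    b = {g: np.zeros(n_h) for g in "figo"}

    def lstm_step(x, h_prev, c_prev):
        z = np.concatenate([x, h_prev])    # input + previous hidden state
        f = sigmoid(W["f"] @ z + b["f"])   # forget gate: what to discard
        i = sigmoid(W["i"] @ z + b["i"])   # input gate: how much to add
        g = np.tanh(W["g"] @ z + b["g"])   # update: candidate cell values
        o = sigmoid(W["o"] @ z + b["o"])   # output gate: what to expose
        c = f * c_prev + i * g             # new cell state
        h = o * np.tanh(c)                 # new hidden state (also the output)
        return h, c

    h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h))
    print(h, c)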

67
Q

DL-07 - Sequence models

What are the inputs of the LSTM cell called? (3)

A
  • Input
  • Hidden state
  • Cell state
68
Q

DL-07 - Sequence models

What are the outputs of the LSTM cell called? (3)

A
  • Hidden state
  • Cell state
  • Output
69
Q

DL-07 - Sequence models

What optimizers have worked well for LSTMs on text data? (2)

A
  • Adam
  • Adagrad
70
Q

DL-07 - Sequence models

What activation function and loss should you use for LSTM with text data?

A
  • Softmax (predicts a probability for each word; see the sketch below)
  • Cross-entropy loss
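
A small sketch of that combination (a three-word vocabulary with hypothetical logits):

    import numpy as np

    logits = np.array([2.0, 0.5, -1.0])     # scores over a 3-word vocabulary
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                    # softmax: probability per word

    true_word = 0                           # index of the correct next word
    loss = -np.log(probs[true_word])        # cross-entropy loss
    print(probs, loss)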
71
Q

DL-07 - Sequence models

What metrics would you use for LSTM with text data?

A

Accuracy, precision, and recall. Think of the outputs as the probability of predicting the correct word.

72
Q

DL-07 - Sequence models

What is a bidirectional RNN?

A

A bidirectional RNN is a type of recurrent neural network that processes input data in both forward and backward directions, capturing information from both past and future contexts.
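
A minimal sketch of the idea, reusing a simple tanh RNN cell in both directions (sizes and weights are illustrative):

    import numpy as np

    def run_rnn(sequence, Wx, Wh):
        h, states = np.zeros(Wh.shape[0]), []
        for x in sequence:
            h = np.tanh(Wx @ x + Wh @ h)
            states.append(h)
        return states

    rng = np.random.default_rng(0)
    seq = list(rng.normal(size=(5, 4)))                   # 5 steps, 4 features
    Wfx, Wfh = rng.normal(size=(3, 4)), rng.normal(size=(3, 3))
    Wbx, Wbh = rng.normal(size=(3, 4)), rng.normal(size=(3, 3))

    forward = run_rnn(seq, Wfx, Wfh)                # past -> future
    backward = run_rnn(seq[::-1], Wbx, Wbh)[::-1]   # future -> past
    # Each time step now carries both past and future context
    outputs = [np.concatenate([f, b]) for f, b in zip(forward, backward)]
    print(outputs[0].shape)   # (6,)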