Transformers Flashcards

1
Q

Explain how self-attention helps solve the problem of understanding pronoun references in the example sentence ‘The dog ran so fast that it looked like a brown dot as it ran away.’

A

Self-attention lets the model attend to ‘the dog’ when processing each occurrence of ‘it,’ assigning that phrase a high attention weight so the pronoun is correctly linked to its referent.
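A minimal sketch of the mechanism, with toy hand-picked embeddings (the tokens, vector values, and the absence of learned Q/K/V projections are all simplifying assumptions, not part of the original card):

import numpy as np

# Toy embeddings for a shortened version of the sentence; values are illustrative only.
tokens = ["the", "dog", "ran", "it"]
X = np.array([
    [0.1, 0.0, 0.2],   # the
    [0.9, 0.8, 0.1],   # dog
    [0.0, 0.3, 0.9],   # ran
    [0.8, 0.7, 0.2],   # it (deliberately embedded near "dog")
])

def self_attention(X):
    # Scaled dot-product self-attention: single head, no learned projections.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                  # pairwise token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ X, weights

_, weights = self_attention(X)
# The row for "it" puts its largest weight on "dog", resolving the pronoun.
print(dict(zip(tokens, weights[tokens.index("it")].round(3))))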

2
Q

What are the main challenges with using RNNs for processing text?

A

RNNs struggle with long sentences because gradients vanish during backpropagation through time; they process tokens one step at a time, which makes training slow; and their sequential nature prevents them from taking full advantage of parallel hardware.
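A quick numeric sketch of the vanishing-gradient point (the recurrent weight of 0.5 and the 50-step sequence length are assumed purely for illustration):

w = 0.5        # assumed recurrent weight factor with magnitude below 1
steps = 50     # assumed sequence length

# Backpropagation through time multiplies one such factor per step,
# so the gradient signal decays exponentially over long sequences.
gradient = w ** steps
print(f"gradient factor after {steps} steps: {gradient:.1e}")  # ~8.9e-16, effectively zero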

3
Q

What are the benefits of using transformers for processing text?

A

Transformers handle long sentences and long-range dependencies better, train faster because all positions are processed in parallel, and use self-attention to focus on the words that matter most in context.
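A rough contrast of the two dependency patterns (shapes and weights are arbitrary assumptions; this illustrates the sequential bottleneck, not a real transformer layer):

import numpy as np

T, d = 128, 64                    # assumed sequence length and hidden size
X = np.random.randn(T, d)         # token representations
W = np.random.randn(d, d)

# RNN-style: step t cannot start until step t-1 has finished.
h = np.zeros(d)
for t in range(T):
    h = np.tanh(X[t] + h @ W)

# Transformer-style: every position is handled in one batched matrix
# multiply, which parallelizes across the whole sequence on a GPU.
H = np.tanh(X @ W)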

4
Q

Explain the role of encoders and decoders in the sequence-to-sequence architecture. How do they work together to process information?

A

The encoder converts the input sequence into a compact summary representation (the context), and the decoder generates the output sequence from that context, e.g., producing a translated sentence token by token.
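A toy sketch of that division of labor (the mean-pooling encoder, dot-product scoring, and state update below are simplifications invented for brevity, not the real seq2seq mechanics):

import numpy as np

def encode(embeddings):
    # Toy encoder: compress the whole input sequence into one context vector.
    return embeddings.mean(axis=0)

def decode(context, vocab, max_len=5):
    # Toy decoder: emit one output token at a time, conditioned on the context.
    out = []
    used = np.zeros(len(vocab), dtype=bool)
    for _ in range(max_len):
        scores = vocab @ context                      # score each candidate token
        scores[used] = -np.inf                        # crude: don't repeat tokens
        idx = int(scores.argmax())
        used[idx] = True
        out.append(idx)
        context = 0.5 * context + 0.5 * vocab[idx]    # fold the choice back into state
    return out

rng = np.random.default_rng(0)
source = rng.normal(size=(4, 8))   # 4 input tokens, 8-dim embeddings
vocab = rng.normal(size=(10, 8))   # 10 candidate output-token embeddings
print(decode(encode(source), vocab))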

5
Q

What role does scaling (in terms of data, parameters, and compute) play in transformer performance? Is continuous scaling a sustainable path forward?

A

Scaling up data, parameters, and compute has reliably improved transformer performance, but continual scaling may not be sustainable because of rising training costs, energy demands, and environmental impact.

6
Q

The slides suggest that being ‘next word prediction machines’ might not be sufficient for human-like intelligence. What are the implications of this observation?

A

It suggests that predicting the next word, however accurately, may not amount to genuine understanding or reasoning, implying that human-like intelligence could require capabilities beyond next-word prediction.

7
Q

Transformers can do ‘in context learning.’ Explain what in context learning means. Provide an example if necessary.

A

In-context learning is when a transformer performs a task based on examples or instructions provided directly in the prompt, without any retraining or weight updates.
Example:
For a sentiment classification task, the prompt contains a labeled demonstration followed by the new, unlabeled input:
Input:
“Classify the sentiment:
‘That film was a waste of time.’ -> Negative
‘I love this movie!’ ->”
Output:
Positive
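A minimal sketch of how such a prompt is assembled; `complete` below is a hypothetical placeholder for whatever text-completion call is available, not a real library function:

# Hypothetical placeholder: substitute any text-completion model call.
def complete(prompt: str) -> str:
    raise NotImplementedError("plug in a real completion API here")

# All the 'teaching' lives in the prompt: a labeled demonstration followed
# by the new input. The model's weights are never updated.
prompt = (
    "Classify the sentiment:\n"
    "'That film was a waste of time.' -> Negative\n"
    "'I love this movie!' ->"
)
label = complete(prompt)   # a capable model should complete with "Positive"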
