Handout #10 - Attention Mechanism and Transformer Model Flashcards
1
Q
What’s the problem with RNNs?
A
- Sequential in nature -> can’t parallelise across time steps (see the sketch after this list)
- context is computed from the past only
- no explicit distinction between short- and long-range dependencies
- training is tricky (e.g. vanishing/exploding gradients)
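Not from the handout, but a minimal NumPy sketch (toy dimensions and random weights are assumptions) that makes the first point concrete: each hidden state depends on the previous one, so the time loop can’t be parallelised.

```python
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, b_h):
    """Vanilla RNN over a sequence of inputs."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in x_seq:  # sequential: h_t needs h_{t-1}, so steps can't run in parallel
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
out = rnn_forward(rng.normal(size=(5, 3)),  # 5 time steps, input size 3
                  rng.normal(size=(4, 3)),  # W_xh: hidden 4 x input 3
                  rng.normal(size=(4, 4)),  # W_hh: hidden 4 x hidden 4
                  np.zeros(4))              # hidden bias
print(out.shape)  # (5, 4): one hidden state per time step
```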
2
Q
Why can’t CNNs and DNNs be used directly for text processing?
A
The dependencies between words don’t sit at fixed positions relative to each other, unlike the fixed context window a CNN/DNN assumes
-> the verb is not always the word immediately after the subject.
3
Q
What’s cross-attention?
A
CA lets you work with multiple modalities (e.g. audio, video, images, text)
-> it works because it doesn’t depend on the position of the keys/values and can deal with any synchronisation issues (see the sketch below).
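A minimal sketch of the idea (the text/audio pairing, shapes, and function name are assumptions, not from the handout): queries come from one modality and keys/values from another, and nothing in the computation requires the two sequences to have equal length or aligned positions.

```python
import numpy as np

def cross_attention(Q, K, V):
    """Scaled dot-product attention; Q: (n_q, d), K and V: (n_kv, d)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])            # (n_q, n_kv)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                 # softmax over the keys
    return w @ V                                       # (n_q, d)

rng = np.random.default_rng(0)
text_q   = rng.normal(size=(7, 16))    # 7 text tokens act as queries
audio_kv = rng.normal(size=(50, 16))   # 50 audio frames act as keys/values
print(cross_attention(text_q, audio_kv, audio_kv).shape)  # (7, 16)
```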
4
Q
Give the layman’s definition of the Transformer
A
It’s an encoder/decoder network that is based solely on sequences of attention-layer blocks.
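A minimal encoder-side sketch under heavy simplifications (single head, no positional encodings, layer norm, masking, or decoder; all dimensions and weights are assumptions): the network is just a stack of attention + feed-forward blocks.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention on X: (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def encoder_block(X, Wq, Wk, Wv, W1, W2):
    """One block: self-attention then a ReLU feed-forward layer, each with a residual."""
    X = X + self_attention(X, Wq, Wk, Wv)
    return X + np.maximum(X @ W1, 0.0) @ W2

d = 16
rng = np.random.default_rng(0)
X = rng.normal(size=(10, d))              # 10 tokens, model dimension 16
for _ in range(2):                        # stack two blocks
    Wq, Wk, Wv, W1, W2 = (rng.normal(scale=0.1, size=(d, d)) for _ in range(5))
    X = encoder_block(X, Wq, Wk, Wv, W1, W2)
print(X.shape)  # (10, 16)
```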