Transformers: Self-attention Flashcards
1
Q
The self-attention operation takes n inputs and produces how many outputs?
A
n
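A minimal sketch of why the answer is n, assuming NumPy and a toy setup where the inputs themselves serve as queries, keys, and values (learned projection matrices are omitted for brevity): attention weights form an n-by-n matrix, so the weighted sum yields one output vector per input vector.

```python
import numpy as np

def self_attention(X):
    """X: (n, d) matrix of n input vectors; returns an (n, d) matrix."""
    d = X.shape[1]
    # Queries, keys, and values are all taken from the same inputs X
    # (projection matrices W_q, W_k, W_v are omitted in this sketch).
    scores = X @ X.T / np.sqrt(d)                    # (n, n) attention scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ X                               # (n, d): one output per input

X = np.random.randn(5, 8)        # n = 5 inputs of dimension 8
print(self_attention(X).shape)   # (5, 8): as many outputs as inputs
```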
2
Q
If the keys, queries, and values are generated from the same sequence, what type of attention do we have?
A
Self-attention
https://medium.com/@angelina.yang/whats-the-difference-between-attention-and-self-attention-in-transformer-models-2846665880b6
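A minimal sketch of the distinction, with hypothetical projection matrices W_q, W_k, W_v: in self-attention all three are applied to the same sequence X, whereas in cross-attention the queries come from one sequence and the keys and values from another.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention over given queries, keys, and values."""
    d = Q.shape[1]
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V

d = 8
W_q, W_k, W_v = (np.random.randn(d, d) for _ in range(3))

X = np.random.randn(5, d)   # one sequence
Y = np.random.randn(7, d)   # a different sequence

self_attn  = attention(X @ W_q, X @ W_k, X @ W_v)   # Q, K, V all from X: self-attention
cross_attn = attention(X @ W_q, Y @ W_k, Y @ W_v)   # K, V from another sequence: cross-attention
```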