Transformers: The Attention Mechanism From Scratch Flashcards

1
Q

Step 1

A

Define or obtain the word embeddings: the encoder representations of four different words.

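from numpy import array

# encoder representations of four different words (3-dimensional embeddings)
word_1 = array([1, 0, 0])
word_2 = array([0, 1, 0])
word_3 = array([1, 1, 0])
word_4 = array([0, 0, 1])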
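
As a small sketch of how these embeddings might be handled together, the four vectors can be stacked into a single matrix with one row per word. The name `words` is an assumption introduced here for illustration; it does not appear on the card.

```python
from numpy import array

# the four word embeddings from this card
word_1 = array([1, 0, 0])
word_2 = array([0, 1, 0])
word_3 = array([1, 1, 0])
word_4 = array([0, 0, 1])

# stack them row-wise: one row per word, one column per embedding dimension
# ("words" is a hypothetical name, not part of the original card)
words = array([word_1, word_2, word_3, word_4])

print(words.shape)  # (4, 3)
```

Keeping the embeddings in one matrix makes it possible to process all four words with a single matrix operation rather than one word at a time.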

2
Q

Step 2

A


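from numpy import random

# generate the weight matrices for the queries, keys, and values
# (3x3 matrices of random integers in the range 0 to 2)
random.seed(42)  # fix the seed so the same attention values can be reproduced
W_Q = random.randint(3, size=(3, 3))
W_K = random.randint(3, size=(3, 3))
W_V = random.randint(3, size=(3, 3))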
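
The card stops once the weight matrices exist. A plausible next step, assumed here rather than taken from the card, is to project a word embedding through each matrix to get its query, key, and value vectors. The names `query_1`, `key_1`, and `value_1` are illustrative only.

```python
from numpy import array, random

# weight matrices, generated as on this card
random.seed(42)
W_Q = random.randint(3, size=(3, 3))
W_K = random.randint(3, size=(3, 3))
W_V = random.randint(3, size=(3, 3))

# first word embedding from Step 1
word_1 = array([1, 0, 0])

# project the embedding into query, key, and value space
# (hypothetical names, not part of the original card)
query_1 = word_1 @ W_Q
key_1 = word_1 @ W_K
value_1 = word_1 @ W_V

print(query_1.shape, key_1.shape, value_1.shape)  # (3,) (3,) (3,)
```

Because `word_1` is the one-hot vector `[1, 0, 0]`, each product simply selects the first row of the corresponding weight matrix, which makes the projection easy to check by hand.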
