11 - LLMs Flashcards
What is attention?
Transformers learn which words are relevant (associated) to each other word.
Attention Is All You Need
The 2017 Google Brain paper (Vaswani et al.) that introduced the Transformer architecture.
How do neural nets process words? Word embeddings.
Words are transformed into vectors that can be fed into neural networks.
Common Word Embeddings
Word2Vec, GloVe and FastText
Each uses between 50 and 300 values (dimensions) to represent a word.
Similar words, similar vectors
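A minimal sketch of the idea, using hypothetical toy 4-dimensional vectors (real embeddings use 50-300 dimensions): words with similar meanings end up with high cosine similarity.

```python
import numpy as np

# Hypothetical toy vectors, not real Word2Vec/GloVe values.
embeddings = {
    "happy":    np.array([0.9, 0.1, 0.3, 0.0]),
    "cheerful": np.array([0.8, 0.2, 0.4, 0.1]),
    "table":    np.array([0.0, 0.9, 0.1, 0.8]),
}

def cosine_similarity(a, b):
    """How aligned two vectors are, in [-1, 1]."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embeddings["happy"], embeddings["cheerful"]))  # high
print(cosine_similarity(embeddings["happy"], embeddings["table"]))     # low
```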
Why is word embedding useful?
Two words with similar meanings have similar vectors.
A network performing sentiment analysis will find it easier to learn that sentences such as “I’m happy” or “I’m cheerful” have similar sentiments.
Multi-head attention
The input is processed using three different learned weight matrices: W_Query, W_Key and W_Value.
Attention contains only linear operations; a feedforward layer with a non-linear activation is required afterwards.
Multiple heads allow for different attention mappings.
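A minimal NumPy sketch of multi-head attention (scaled dot-product attention per head, with random matrices standing in for the learned W_Q, W_K, W_V); all sizes are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
seq_len, d_model, d_head, n_heads = 5, 16, 4, 4   # illustrative sizes

X = rng.normal(size=(seq_len, d_model))           # one embedding per token

heads = []
for _ in range(n_heads):
    # Each head has its own learned projections (random stand-ins here).
    W_Q = rng.normal(size=(d_model, d_head))
    W_K = rng.normal(size=(d_model, d_head))
    W_V = rng.normal(size=(d_model, d_head))

    Q, K, V = X @ W_Q, X @ W_K, X @ W_V
    # Scaled dot-product attention: how much each token attends to the others.
    weights = softmax(Q @ K.T / np.sqrt(d_head))
    heads.append(weights @ V)

# Concatenate the heads; the whole operation is linear in V, which is why a
# feedforward layer with a non-linear activation must follow.
out = np.concatenate(heads, axis=-1)
print(out.shape)  # (5, 16)
```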
Encoder part
Understands and extracts the relevant information from the input and outputs a continuous representation (embedding).
LLM Evaluation metrics
- ROUGE: used for text summarisation; compares a generated summary to one or more reference summaries.
- BLEU score: used for text translation; compares a generated translation to human-generated translations.
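A minimal sketch of computing both metrics, assuming the third-party packages rouge-score and nltk are installed; the example sentences are made up.

```python
from rouge_score import rouge_scorer                  # pip install rouge-score
from nltk.translate.bleu_score import sentence_bleu   # pip install nltk

reference = "the cat sat on the mat"
candidate = "the cat is on the mat"

# ROUGE: n-gram overlap between the candidate and the reference summary.
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
print(scorer.score(reference, candidate))

# BLEU: n-gram precision against one or more reference translations
# (bigram weights here, since the example sentences are so short).
print(sentence_bleu([reference.split()], candidate.split(), weights=(0.5, 0.5)))
```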
LoRA
Low-Rank Adaptation
LoRA: Main Idea
(Hint: freeze, parameters, weights)
- Freeze the weights of the self-attention module
- Add task-specific knowledge using a small set of tunable parameters
LoRA Steps
- Freeze most of the original LLM weights
- Insert two rank-decomposition matrices
- Train the weights of the smaller matrices
Steps to update model for inference:
1. Matrix-multiply the two low-rank matrices
2. Add the result to the original weights
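A minimal NumPy sketch of those two steps; the hidden size d and rank r are illustrative, not from any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8                        # illustrative: hidden size d, rank r << d

W = rng.normal(size=(d, d))          # frozen pretrained weight matrix
A = rng.normal(size=(r, d)) * 0.01   # trainable rank-decomposition matrices;
B = np.zeros((d, r))                 # B starts at zero, so the update starts at zero

# 1. Matrix-multiply the low-rank matrices; 2. add the result to W.
W_adapted = W + B @ A

# Only A and B are trained: 2*d*r values instead of d*d.
print(2 * d * r, "trainable vs", d * d, "frozen parameters")
```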
Soft Prompts
Like word embeddings, but instead of representing fixed words they are trainable vectors tuned to improve model performance.
Each soft-prompt vector has the same length as a token embedding; a soft prompt is typically 20-100 tokens long.
Soft Prompts with LoRA
We can have different sets of prompts for different tasks.
(Switch out soft prompt at inference time to change task)
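A minimal NumPy sketch of swapping soft prompts per task; the task names and sizes are hypothetical, and random vectors stand in for tuned prompts.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, prompt_len = 16, 20   # soft prompts are typically 20-100 "tokens"

# One tuned soft prompt per task (hypothetical task names).
soft_prompts = {
    "summarise": rng.normal(size=(prompt_len, d_model)),
    "translate": rng.normal(size=(prompt_len, d_model)),
}

def build_input(task, token_embeddings):
    """Prepend the task's tuned soft prompt to the frozen token embeddings."""
    return np.concatenate([soft_prompts[task], token_embeddings], axis=0)

tokens = rng.normal(size=(5, d_model))         # embeddings for 5 input tokens
print(build_input("summarise", tokens).shape)  # (25, 16)
print(build_input("translate", tokens).shape)  # swap the prompt to switch task
```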
Vanilla Fine-Tuning
Select a subset of parameters to tune and leave the rest frozen
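A minimal PyTorch sketch of that idea, assuming torch is installed; the tiny model and the choice of which layer to tune are illustrative.

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 128),   # pretrained layers we want to keep frozen
    nn.ReLU(),
    nn.Linear(128, 2),     # task head: the subset we choose to tune
)

for param in model.parameters():
    param.requires_grad = False     # freeze everything...
for param in model[2].parameters():
    param.requires_grad = True      # ...then unfreeze the chosen subset

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable, "trainable parameters")
```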
Reparametrisation
Add task-specific knowledge with a small set of additional parameters, as in LoRA.