11 - LLMs Flashcards

1
Q

What is attention?

A

Transformers learn how important (associated) each word is to every other word in the input
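
Below is a minimal NumPy sketch of scaled dot-product self-attention, the core operation behind this idea; the shapes and toy data are illustrative assumptions, not from the card.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query row attends over all key rows; the weights sum to 1."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                          # how relevant each word is to each other word
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                                       # weighted mix of value vectors

# toy self-attention: 3 words, each represented by a 4-dimensional vector
x = np.random.randn(3, 4)
print(scaled_dot_product_attention(x, x, x).shape)  # (3, 4)
```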

2
Q

Attention Is All You Need

A

The 2017 Google Brain paper that introduced the Transformer architecture

3
Q

How do neural nets process words? Word Embedding

A

Transform words into vectors for feeding into neural networks
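
A minimal PyTorch sketch of that transformation, with a toy vocabulary and an assumed embedding size of 8:

```python
import torch
import torch.nn as nn

vocab = {"i": 0, "am": 1, "happy": 2, "cheerful": 3}          # toy vocabulary
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

token_ids = torch.tensor([vocab["i"], vocab["am"], vocab["happy"]])
vectors = embedding(token_ids)   # one learned vector per word, ready for a network
print(vectors.shape)             # torch.Size([3, 8])
```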

4
Q

Common Word Embeddings

A

Word2Vec, GloVe, and FastText

Use between 50 and 300 values to represent each word.

Similar words, similar vectors

5
Q

Why is word embedding useful?

A

Two words with similar meanings have similar vectors.

A network performing sentiment analysis will find it easier to learn that sentences such as “I’m happy” and “I’m cheerful” express similar sentiments.
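
A toy illustration of this using cosine similarity; the vectors below are invented for the example, not taken from a real embedding model:

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# invented 4-d "embeddings"; real Word2Vec/GloVe vectors have 50-300 dimensions
happy    = np.array([0.9, 0.1, 0.3, 0.0])
cheerful = np.array([0.8, 0.2, 0.4, 0.1])
table    = np.array([0.0, 0.9, -0.2, 0.7])

print(cosine(happy, cheerful))  # close to 1: similar meaning, similar vectors
print(cosine(happy, table))     # near 0: unrelated words
```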

6
Q

Multi-head attention

A

The input is processed using three different learned matrices: W-Query, W-Key, and W-Value.

Attention itself contains only linear operations; a feed-forward layer with a non-linear activation is required afterwards.

Multiple heads allow for different attention mappings.
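
A minimal sketch of multi-head attention followed by the non-linear feed-forward layer, using PyTorch's built-in modules; the model width, head count, and sequence length are illustrative assumptions:

```python
import torch
import torch.nn as nn

d_model, num_heads = 64, 4   # 4 heads -> 4 different attention mappings

# internally projects the input with learned W-Query, W-Key and W-Value matrices
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=num_heads, batch_first=True)
ffn = nn.Sequential(         # feed-forward supplies the non-linearity attention lacks
    nn.Linear(d_model, 4 * d_model),
    nn.ReLU(),
    nn.Linear(4 * d_model, d_model),
)

x = torch.randn(1, 10, d_model)          # batch of 1, sequence of 10 token vectors
attn_out, attn_weights = attn(x, x, x)   # self-attention: query, key and value are all x
out = ffn(attn_out)
print(out.shape)                         # torch.Size([1, 10, 64])
```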

7
Q

Encoder part

A

Understands and extracts relevant info from the input. Outputs a continuous representation (embedding)
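
A minimal sketch using PyTorch's built-in encoder modules to turn an embedded input into a continuous representation; the sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)

tokens = torch.randn(1, 10, 64)      # already-embedded input sequence
representation = encoder(tokens)     # continuous representation: one vector per token
print(representation.shape)          # torch.Size([1, 10, 64])
```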

8
Q

LLM Evaluation metrics

A
  • ROUGE
    Used for text summarisation; compares a generated summary to one or more reference summaries.
  • BLEU score
    Used for text translation; compares a generated translation to human-generated reference translations.
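
A hand-rolled sketch of the unigram-overlap idea behind both metrics (ROUGE-1 recall and the unigram-precision part of BLEU); real evaluations use the full metric definitions and established libraries, so this is illustrative only:

```python
def rouge1_recall(candidate, reference):
    """Fraction of reference words that also appear in the candidate (ROUGE-1 recall)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum(min(cand.count(w), ref.count(w)) for w in set(ref))
    return overlap / len(ref)

def unigram_precision(candidate, reference):
    """Fraction of candidate words that also appear in the reference (BLEU's unigram term)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum(min(cand.count(w), ref.count(w)) for w in set(cand))
    return overlap / len(cand)

summary   = "the cat sat on the mat"
reference = "the cat was sitting on the mat"
print(rouge1_recall(summary, reference))      # summarisation-style comparison (≈ 0.71)
print(unigram_precision(summary, reference))  # translation-style comparison (≈ 0.83)
```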
9
Q

LoRA

A

Low-Rank Adaptation

10
Q

LoRA: Main Idea
(Hint: freeze, parameters, weights)

A
  • Freeze the weights of the self-attention module
  • Add task-specific knowledge using a small set of tunable parameters
11
Q

LoRA Steps

A
  1. Freeze most of the original LLM weights
  2. Insert two rank-decomposition matrices
  3. Train the weights of the smaller matrices

Steps to update the model for inference:
1. Matrix-multiply the low-rank matrices
2. Add the result to the original weights
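
A minimal sketch of these steps on a single linear layer: the original weight stays frozen, two small rank-decomposition matrices A and B are trained, and for inference their product is merged back into the original weight (layer size and rank are illustrative assumptions):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                              # 1. freeze the original weights
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)    # 2. insert two
        self.B = nn.Parameter(torch.zeros(d_out, rank))          #    rank-decomposition matrices

    def forward(self, x):
        return self.base(x) + x @ (self.B @ self.A).T            # 3. only A and B receive gradients

    def merge_for_inference(self):
        # matrix-multiply the low-rank matrices and add them to the original weights
        self.base.weight.data += self.B @ self.A
        return self.base

layer = LoRALinear(nn.Linear(512, 512), rank=8)
out = layer(torch.randn(1, 512))
merged = layer.merge_for_inference()   # a plain nn.Linear again: no extra inference cost
```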

12
Q

Soft Prompts

A

Like word embeddings, but instead of representing fixed words they are trainable vectors tuned to improve model performance.

Same length as the token embedding vectors; typically 20-100 soft tokens are used.
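
A minimal sketch of a soft prompt: a block of trainable vectors with the same dimensionality as the token embeddings, prepended to the embedded input of a frozen model (prompt length and sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

d_model, prompt_len = 64, 20     # typically 20-100 soft tokens

soft_prompt = nn.Parameter(torch.randn(prompt_len, d_model))  # tuned vectors, not real words
token_embeddings = torch.randn(1, 10, d_model)                # embedded user input

# prepend the soft prompt to every sequence in the batch
batch = token_embeddings.shape[0]
prompt = soft_prompt.unsqueeze(0).expand(batch, -1, -1)
model_input = torch.cat([prompt, token_embeddings], dim=1)
print(model_input.shape)         # torch.Size([1, 30, 64])

# swapping in a different trained soft_prompt switches the task at inference time
```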

13
Q

Soft Prompts with LoRA

A

We can have different sets of prompts for different tasks.

(Switch out soft prompt at inference time to change task)

14
Q

Vanilla Fine-Tuning

A

Select a subset of parameters to tune and leave the rest frozen
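
A minimal PyTorch sketch of this: freeze every parameter, then unfreeze only the subset chosen for tuning (the toy model and the choice of the final layer are illustrative assumptions):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 2),                # task head
)

for p in model.parameters():          # freeze everything...
    p.requires_grad = False
for p in model[-1].parameters():      # ...then unfreeze only the chosen subset
    p.requires_grad = True

print([n for n, p in model.named_parameters() if p.requires_grad])  # ['4.weight', '4.bias']
```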

15
Q

Reparametrisation

A

Add knowledge with additional parameters as in LoRA

16
Q

Additive fine tuning

A

Add information to the prompt with soft prompts that have been tuned for specific tasks

17
Q

Overall name for parameter fine tuning

A

PEFT - Parameter-Efficient Fine-Tuning

18
Q

Prompt Engineering

A

Discovering the best format, keywords, and sentence structure for queries to produce the best output

19
Q

LLM complete development and deployment pipeline

A

Understanding the software frameworks and all the steps required to create and deploy an LLM for a specific application

20
Q

Reinforcement Learning with Human Feedback

A

Understanding how to train models that can assess an LLM's output, and how to use those models to fine-tune LLMs using reinforcement learning.