Generative AI Flashcards

1
Q

RLHF

A

Reinforcement learning from human feedback

2
Q

PEFT

A

Parameter-efficient fine-tuning

3
Q

Self-Attention

A
  • To predict the next word accurately, a model needs to be able to see the whole sentence or even the whole document
  • The transformer architecture unlocked this ability
  • While processing each word, the model can pay attention to the meaning of every other word in the input; this is the idea behind the paper title "Attention Is All You Need" (a minimal sketch follows below)
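A minimal sketch of scaled dot-product self-attention in NumPy; the tiny dimensions and random weight matrices are illustrative assumptions, not anything from an actual model:

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a sequence of token vectors.

    x: (seq_len, d_model) matrix of token embeddings (+ positional encodings).
    Every output vector is a weighted mix of *all* input vectors, so each
    position can "see" the whole sequence.
    """
    d_model = x.shape[-1]
    rng = np.random.default_rng(0)
    # Learned projections in a real model; random here for illustration.
    W_q, W_k, W_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))

    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    scores = Q @ K.T / np.sqrt(d_model)                 # relevance of every token to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax -> attention weights
    return weights @ V

tokens = np.random.default_rng(1).standard_normal((5, 8))   # 5 tokens, d_model = 8
print(self_attention(tokens).shape)                          # (5, 8)
```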
4
Q

Multi-headed Self-Attention

A

- Multiple sets of self-attention weights, or heads, are learned in parallel and independently of each other (sketched below)
- The outputs of the multi-headed attention layer are fed through a feed-forward network to produce the output of the encoder
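A rough NumPy sketch of the parallel heads (head count, sizes, and random weights are illustrative assumptions):

```python
import numpy as np

def multi_head_attention(x, num_heads=2):
    """Split the model dimension into independent heads, attend per head,
    then concatenate the heads' outputs (a real encoder adds a final linear
    projection and a feed-forward layer after this)."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    rng = np.random.default_rng(0)
    outputs = []
    for _ in range(num_heads):
        # Each head has its own Q/K/V projections (random here for illustration).
        W_q, W_k, W_v = (rng.standard_normal((d_model, d_head)) for _ in range(3))
        Q, K, V = x @ W_q, x @ W_k, x @ W_v
        scores = Q @ K.T / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outputs.append(weights @ V)                  # (seq_len, d_head)
    return np.concatenate(outputs, axis=-1)          # (seq_len, d_model)

x = np.random.default_rng(1).standard_normal((5, 8))
print(multi_head_attention(x).shape)                 # (5, 8)
```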

5
Q

How many parameters does a model with general knowledge about the world have?

A

Hundreds of billions

6
Q

How many training examples do you need to fine-tune a model for a single task, like summarizing dialogue or acting as a customer service agent for a single company?

A

Often just 500-1,000 examples can result in good performance

7
Q

Context window

A

Space available for the prompt

8
Q

Inference

A
  • Generating a prediction
  • For LLMs, that would be using the model to generate text
9
Q

Completion

A

Output of the model

10
Q

Entity recognition

A

Word-level (token) classification that identifies named entities such as all the people and places mentioned in a text (example below)
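A hedged example using the Hugging Face transformers NER pipeline; with no model specified it downloads a default NER model, and the example sentence is made up:

```python
# Requires: pip install transformers torch
from transformers import pipeline

# "ner" is token classification: each word/token span gets an entity label.
ner = pipeline("ner", aggregation_strategy="simple")

text = "Satya Nadella visited Seattle last week."
for entity in ner(text):
    print(entity["word"], "->", entity["entity_group"])   # e.g. person and place labels
```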

11
Q

Foundational models by decreasing number of parameters

A

PaLM -> BLOOM -> GPT -> LLaMA -> Flan-T5 -> BERT

12
Q

RNN

A
  • Recurrent neural networks
  • Used by previous generations of language models
13
Q

What’s so important about the transformer architecture?

A
  • The ability to learn the relevance and context of all the words in a sentence
  • It can be scaled efficiently to run on multi-core GPUs
  • It can process input data in parallel, making use of much larger training datasets
  • It dramatically improved performance on natural language tasks over earlier generations of RNNs
14
Q

Instruction Fine Tuning

A

Adapting a pre-trained model to specific tasks by training it further on a dataset of examples that pair an instruction prompt with the desired response (illustrated below)
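In practice the training data is just instruction-style prompt/completion pairs; the record below is a made-up illustration of that format, and the field names are a common convention rather than a fixed standard:

```python
# One illustrative training example for instruction fine-tuning.
example = {
    "prompt": (
        "Summarize the following conversation.\n\n"
        "Customer: My order arrived damaged.\n"
        "Agent: I'm sorry! I'll ship a replacement today.\n\n"
        "Summary:"
    ),
    "completion": "The customer reported a damaged order and the agent arranged a replacement.",
}
print(example["prompt"])
```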

15
Q

RAG

A

Retrieval-Augmented Generation

Knowledge-base data is retrieved at query time and added to the prompt, so the retrieval portion of the solution grounds the model's answer in external data (see the sketch below)
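A minimal, self-contained sketch of the retrieve-then-generate flow; the toy documents, the stand-in embed() function, and the prompt wording are all assumptions, not a real RAG stack:

```python
import numpy as np

# Toy knowledge base: document names with made-up embedding vectors.
documents = {
    "refund policy": np.array([0.9, 0.1, 0.0]),
    "shipping times": np.array([0.1, 0.8, 0.1]),
    "warranty terms": np.array([0.2, 0.1, 0.9]),
}

def embed(text):
    """Stand-in for a real embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(3)

def retrieve(query, k=1):
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embed(query)
    scores = {
        name: float(q @ vec / (np.linalg.norm(q) * np.linalg.norm(vec)))
        for name, vec in documents.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

def build_prompt(query):
    # The retrieved text is prepended to the prompt so the LLM can ground its answer.
    context = ", ".join(retrieve(query))
    return f"Use this context: {context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("How long do refunds take?"))   # this prompt would be sent to the LLM
```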

16
Q

What’s significant about the transformer architecture?

A
  • Can be scaled efficiently to use multi-core GPUs
  • Can process input data in parallel, making use of much larger training datasets
  • Dramatically improved the performance of natural language tasks over earlier generations of RNNs
17
Q

Origin of the Transformer Architecture

A

The 2017 paper "Attention Is All You Need" (Vaswani et al.)

18
Q

What are attention weights?

A

The model learns the relevance of each word to all other words during training

19
Q

What are the two distinct parts of the transformer architecture?

A

Encoder and decoder

20
Q

Tokenize

A
  • Convert words to numbers, with each number representing a position in a dictionary of all possible words
  • There are multiple tokenization methods: token IDs can correspond to complete words or to parts of words (see the example below)
  • The same tokenizer used to train the model must be used when generating text
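A quick illustration with the Hugging Face GPT-2 tokenizer; the choice of tokenizer is an assumption, and any tokenizer shows the same idea of IDs mapping to whole words or sub-word pieces:

```python
# Requires: pip install transformers
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

ids = tokenizer.encode("Tokenization splits text")
print(ids)                                    # a list of integer token IDs
print(tokenizer.convert_ids_to_tokens(ids))   # whole words and sub-word pieces
print(tokenizer.decode(ids))                  # the same tokenizer maps IDs back to text
```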
21
Q

Embedding Layer

A
  • A trainable vector embedding space
  • A high-dimensional space where each token is represented as a vector and occupies a unique location within that space
  • Each token ID in the vocabulary is matched to a multi-dimensional vector (see the sketch below)
  • During model training, the vectors learn to encode the meaning and context of individual tokens in the input sequence
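A minimal PyTorch sketch of an embedding layer as a trainable lookup table; the vocabulary size is an arbitrary choice here, and 512 simply matches the vector size from the original paper:

```python
# Requires: pip install torch
import torch
import torch.nn as nn

vocab_size, d_model = 10_000, 512               # 512 matches the original Transformer
embedding = nn.Embedding(vocab_size, d_model)   # one trainable vector per token ID

token_ids = torch.tensor([[7, 42, 3051]])       # a batch with one 3-token sequence
vectors = embedding(token_ids)
print(vectors.shape)                            # torch.Size([1, 3, 512])
# During training, gradients update these vectors so they encode meaning and context.
```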
22
Q

What was the embedding vector size in "Attention Is All You Need"?

A

512 dimensions

23
Q

Positional encoding

A

Encodes each token's position in the sentence/document so word order is preserved even though tokens are processed in parallel; it is added to the token embedding (see the sketch below)
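A small sketch of the sinusoidal scheme used in "Attention Is All You Need"; some models learn positional embeddings instead, so treat this as one common choice:

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(same angle)."""
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]         # even dimensions
    angles = positions / np.power(10000, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_positional_encoding(seq_len=10, d_model=512)
print(pe.shape)   # (10, 512) -- added element-wise to the token embeddings
```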

24
Q

What is passed into the encoder/decoder?

A
  • The token embedding vectors summed with their positional encodings
  • They are processed in parallel
25
Q

What is passed to the self-attention layer?

A
  • The positional encodings and token vectors, combined into a single vector per token
  • There, the model analyzes the relationships between the tokens in the input sequence
26
Q

GPT

A

Generative Pre-trained Transformers

27
Q

Limitations of RNNs

A
  • Limited by the amount of compute and memory needed to perform well on generative AI tasks
  • Predict the next word from only the small window of preceding words they have seen
  • No self-attention
  • Scaling them up to take more of the input into account is not practical
28
Q

Heads

A

Independent sets of self-attention weights learned in parallel

29
Q

Feed Forward Neural Network

A
  • Information moves in only one direction, from the input layer through any hidden layers and finally to the output layer
  • There are no cycles or loops in the network
  • Connections between the units do not form a cycle, unlike in an RNN (see the sketch below)
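A small PyTorch sketch of such a feed-forward block, sized like the position-wise network used after attention in the original Transformer (512 -> 2048 -> 512); the exact sizes are illustrative:

```python
# Requires: pip install torch
import torch
import torch.nn as nn

# Data flows strictly input -> hidden -> output; there are no recurrent connections.
feed_forward = nn.Sequential(
    nn.Linear(512, 2048),   # expand
    nn.ReLU(),
    nn.Linear(2048, 512),   # project back to the model dimension
)

x = torch.randn(1, 5, 512)      # (batch, seq_len, d_model)
print(feed_forward(x).shape)    # torch.Size([1, 5, 512])
```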
30
Q

ReAct Prompting

A
  • Reasoning + Acting
  • The LLM is used to generate both reasoning traces and task-specific actions in an interleaved manner
  • Reasoning traces help the model induce, track, and update action plans and handle exceptions, while the actions let it interact with external tools to retrieve additional information (example format below)
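An illustrative, hand-written ReAct-style prompt showing the interleaved Thought/Action/Observation format; the search/finish tool names and the question are assumptions:

```python
# In practice the LLM generates the Thought/Action lines and your code executes
# each Action and appends the Observation before asking the model to continue.
react_prompt = """\
Question: What year was the company that makes the iPhone founded?
Thought: I need to find out which company makes the iPhone, then its founding year.
Action: search[company that makes the iPhone]
Observation: The iPhone is made by Apple Inc.
Thought: Now I need Apple's founding year.
Action: search[Apple Inc. founding year]
Observation: Apple Inc. was founded in 1976.
Thought: I have the answer.
Action: finish[1976]
"""
print(react_prompt)
```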
31
Q

LangChain

A
  • Provides AI developers with tools to connect language models with external data sources
  • Its building blocks include prompt templates (see the sketch below)
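A small prompt-template sketch; the import path assumes a recent LangChain release (the class has moved between langchain and langchain_core over time), and the template text is made up:

```python
# Requires: pip install langchain-core
from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "Answer the question using only this context:\n{context}\n\nQuestion: {question}"
)

prompt = template.format(
    context="Refunds are processed within 5 business days.",
    question="How long do refunds take?",
)
print(prompt)   # the filled-in prompt you would send to the LLM
```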