Optimizing Foundation Models Flashcards

1
Q

What are the most common algorithms used to perform similarity searches in a vector store?

A

k-nearest neighbors (k-NN) or cosine similarity
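
A minimal sketch of how the two fit together, assuming a tiny in-memory store: cosine similarity scores each stored vector against the query, and k-NN keeps the top k. The `store` contents and the 3-dimensional vectors are made up for illustration; a real vector database would use an index rather than this brute-force scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn(query, store, k):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        store,
        key=lambda doc_id: cosine_similarity(query, store[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embeddings; real ones have hundreds of dimensions.
store = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
nearest = knn([1.0, 0.05, 0.0], store, k=2)
print(nearest)
```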

2
Q

AWS vector database options

A

1) Amazon OpenSearch Service (provisioned)
2) Amazon OpenSearch Serverless
3) pgvector extension in Amazon Relational Database Service (Amazon RDS) for PostgreSQL
4) pgvector extension in Amazon Aurora PostgreSQL-Compatible Edition
5) Amazon Kendra

3
Q

What is RAG?

A

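Retrieval-Augmented Generation: a technique that retrieves relevant information from an external knowledge source and adds it to the prompt before the model generates a response.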

4
Q

What are agents? What are the main types?

A

Agents are autonomous computer programs that perform tasks. The main types are 1) intermediary operations agents, 2) actions launch agents, and 3) feedback agents.

5
Q

How is a generative AI model evaluated quantitatively?

A

By using benchmark datasets. Example measurements include accuracy, speed and efficiency, and scalability.

6
Q

Instruction tuning

A

A type of fine-tuning that retrains the model on a new dataset of prompts, each followed by the desired output. It is highly effective for interactive applications such as virtual assistants and chatbots.
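
A rough sketch of what such a dataset can look like, assuming a JSON Lines file with one prompt/output pair per line. The field names `prompt` and `completion` and the example records are assumptions for illustration; check the format your training service expects.

```python
import json

# Hypothetical instruction-tuning records: each prompt is paired with the desired output.
examples = [
    {"prompt": "Summarize: The meeting moved from Monday to Friday.",
     "completion": "The meeting is now on Friday."},
    {"prompt": "Translate to French: Good morning.",
     "completion": "Bonjour."},
]

# JSON Lines: one JSON object per line.
jsonl = "\n".join(json.dumps(record) for record in examples)
print(jsonl)
```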

7
Q

Reinforcement learning from human feedback (RLHF)

A
  • A fine-tuning technique.
  • The model is first trained with supervised learning to predict human-like responses.
  • It is then refined through reinforcement learning, in which a reward model built from human feedback guides it toward more preferable outputs.
  • Good for sensitive applications.
8
Q

Adapting models for specific domains

A

This approach involves fine-tuning the model on a corpus of text or data that is specific to a particular industry or sector.

9
Q

Transfer learning

A

A method in which a model developed for one task is reused as the starting point for a model on a second task. It is highly efficient because it applies the features and knowledge learned during general training to a narrower scope, with less additional training required.

10
Q

Continuous pretraining

A

Extending a model's pretraining by continuously feeding it new and emerging data.

11
Q

Key steps in fine-tuning

A

1) Data curation (a more rigorous selection process to ensure every piece of data is highly relevant)
2) Labeling
3) Governance and compliance
4) Representativeness and bias checking
5) Feedback integration

12
Q

ROUGE

A

ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics used to evaluate automatic text summarization, as well as machine translation quality in NLP. The main idea behind ROUGE is to count the number of overlapping units between the model-generated text and a human-written reference text.

13
Q

ROUGE-L

A

This metric uses the longest common subsequence between the generated text and the reference text. It is particularly good at evaluating the coherence and order of the narrative in the output.
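
A minimal sketch of the idea, assuming whitespace tokenization, a single reference, and a plain F1 (published ROUGE-L uses a weighted F-measure, so real toolkits may report slightly different numbers). The function names are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, tok_a in enumerate(a, start=1):
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]

def rouge_l(candidate, reference):
    """LCS-based precision, recall, and F1 for one candidate/reference pair."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_l("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

Here the longest common subsequence is "the cat on the mat" (5 tokens out of 6), so word order is rewarded even though one word differs.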

14
Q

ROUGE-N

A

This metric measures the overlap of n-grams between the generated text and the reference text. For example, ROUGE-1 refers to the overlap of unigrams, ROUGE-2 refers to bigrams, and so on. This metric primarily assesses the fluency of the text and the extent to which it includes key ideas from the reference.
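
A small sketch of the recall form of ROUGE-N, assuming whitespace tokenization and a single reference. The helper names are illustrative, not taken from any particular library.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n):
    """Fraction of reference n-grams that also appear in the candidate."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum((cand & ref).values())  # counts clipped to the smaller side
    return overlap / total

candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"
rouge_1 = rouge_n_recall(candidate, reference, 1)  # unigram overlap
rouge_2 = rouge_n_recall(candidate, reference, 2)  # bigram overlap
print(rouge_1, rouge_2)
```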

15
Q

BLEU

A

A precision metric used to evaluate the quality of text that has been machine-translated from one natural language to another. BLEU measures the precision of n-grams in the machine-generated text that appear in the reference texts and applies a penalty for overly short translations (the brevity penalty).
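
A simplified sketch of the calculation, assuming a single reference, whitespace tokenization, and n-grams only up to bigrams (standard BLEU typically goes up to 4-grams and is computed over a whole corpus). `simple_bleu` is an illustrative name.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=2):
    """Geometric mean of clipped n-gram precisions times a brevity penalty."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        clipped = sum((cand_counts & ref_counts).values())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)

full = simple_bleu("the cat lay on the mat", "the cat lay on the mat")
short = simple_bleu("the cat", "the cat lay on the mat")
print(full, short)
```

The second call has perfect n-gram precision but is much shorter than the reference, so only the brevity penalty lowers its score.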

16
Q

BERTScore

A

BERTScore evaluates semantic similarity rather than relying on exact lexical matches, so it can capture meaning in a more nuanced manner. It uses pretrained contextual embeddings from models like BERT to evaluate the quality of text-generation tasks, computing the cosine similarity between the contextual embeddings of words in the candidate and reference texts.
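
A toy sketch of the matching step, assuming each token is greedily paired with its most similar token on the other side. The 2-dimensional vectors in `toy_embeddings` are invented stand-ins; a real BERTScore computation would take contextual embeddings from a pretrained model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def greedy_match_score(candidate_vecs, reference_vecs):
    """Precision, recall, and F1 from best-match cosine similarities."""
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical embeddings: synonyms point in nearly the same direction.
toy_embeddings = {
    "happy": [1.0, 0.1], "glad": [0.9, 0.2],
    "dog": [0.0, 1.0], "puppy": [0.1, 0.9],
}
candidate = [toy_embeddings[w] for w in ["glad", "puppy"]]
reference = [toy_embeddings[w] for w in ["happy", "dog"]]
precision, recall, f1 = greedy_match_score(candidate, reference)
print(precision, recall, f1)
```

The candidate and reference share no words, so exact-match metrics would score zero, while the embedding-based score stays high.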

17
Q

Hyperparameter tuning

A

A method of adjusting the behavior of an ML algorithm by changing its hyperparameters: settings chosen before training (for example, learning rate, batch size, or number of epochs) rather than learned from the data.

18
Q

Benchmark dataset

A

A curated collection of data designed specifically to evaluate the performance of a language model. It can be a low-cost way to evaluate bias.

19
Q

AWS vector databases for RAG

A

1) Amazon OpenSearch Service - scalable index management and fast k-NN (k-nearest neighbor) search
2) Amazon DocumentDB - NoSQL; supports millions of vectors
3) Amazon Aurora (relational database)
4) Amazon RDS for PostgreSQL (relational database)
5) Amazon Neptune (graph database)

20
Q

Model improvement techniques by cost, from least to most expensive

A

1) Prompt engineering (very cheap because it needs no additional computation or fine-tuning)
2) Retrieval-Augmented Generation (RAG) (cheap because the model is not retrained or fine-tuned, but it does require a vector database)
3) Instruction-based fine-tuning (the model is fine-tuned on specific instructions; requires additional computation)
4) Domain adaptation fine-tuning (the model is trained on domain-specific, unlabeled data; intensive computation required)

21
Q

What is the main driver of cost in Amazon Bedrock?

A

The number of input and output tokens.
Side note: batch inference can provide up to 50% savings.
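
A back-of-the-envelope sketch of token-based pricing. The per-1,000-token prices below are placeholders, not actual Amazon Bedrock rates, and the flat 50% batch factor is taken from the side note above as a best case.

```python
def on_demand_cost(input_tokens, output_tokens, input_price_per_1k, output_price_per_1k):
    """Cost when input and output tokens are billed at separate per-1,000-token rates."""
    return (input_tokens / 1000) * input_price_per_1k + (output_tokens / 1000) * output_price_per_1k

# Hypothetical prices in USD per 1,000 tokens; look up real rates for your model.
cost = on_demand_cost(
    input_tokens=200_000,
    output_tokens=50_000,
    input_price_per_1k=0.003,
    output_price_per_1k=0.015,
)
batch_cost = cost * 0.5  # best case if the full 50% batch saving applies

print(round(cost, 2), round(batch_cost, 3))
```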

23
Q

Shapley value

A

Provides a local explanation by quantifying the contribution of each feature to an individual prediction.
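
A minimal sketch of an exact Shapley calculation, assuming a toy three-feature model and an all-zeros baseline that stands in for "feature absent". It averages each feature's marginal contribution over every ordering, which is only feasible for a handful of features; practical tools approximate this.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Average marginal contribution of each feature over all feature orderings."""
    n = len(instance)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for idx in order:
            current[idx] = instance[idx]  # "switch on" this feature
            value = predict(current)
            contributions[idx] += value - previous
            previous = value
    return [c / len(orderings) for c in contributions]

def toy_model(x):
    # Hypothetical model with an interaction between features 0 and 2.
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]

instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The contributions add up to the difference between the prediction for this instance and the baseline prediction, and the interaction term is split evenly between the two features involved.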

24
Q

Partial Dependence Plots (PDP)

A

Provides a global view of the model's behavior by illustrating how the predicted outcome changes as a single feature is varied.
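
A small sketch of how the curve behind such a plot can be computed, assuming a toy model and dataset: for each grid value, the chosen feature is overwritten in every row and the predictions are averaged. All names and numbers here are illustrative.

```python
def partial_dependence(predict, dataset, feature_index, grid):
    """Average prediction over the dataset with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in dataset:
            modified = list(row)
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(dataset))
    return curve

def toy_model(x):
    # Hypothetical model: linear in feature 0, quadratic in feature 1.
    return 3 * x[0] + x[1] ** 2

dataset = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
grid = [0.0, 1.0, 2.0]
curve = partial_dependence(toy_model, dataset, feature_index=0, grid=grid)
print(curve)
```

Plotting `grid` against `curve` gives the partial dependence plot for feature 0; here the curve rises by 3 per unit, matching the model's linear term.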
