Optimizing Foundation Models Flashcards
The most common algorithms used to perform similarity search in vector stores
1) k-nearest neighbors (k-NN) or cosine similarity
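A minimal NumPy sketch of both ideas: cosine similarity scores how close a query embedding is to each stored vector, and k-NN returns the k closest matches. The toy 4-dimensional vectors are made up for illustration; they are not from any real vector store.
```python
import numpy as np

# Toy document embeddings (assumed 4-dimensional for illustration).
doc_vectors = np.array([
    [0.1, 0.3, 0.5, 0.1],
    [0.9, 0.1, 0.0, 0.2],
    [0.2, 0.4, 0.4, 0.0],
])
query = np.array([0.15, 0.35, 0.45, 0.05])

# Cosine similarity: angle-based closeness, independent of vector length.
cos_sim = doc_vectors @ query / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query)
)

# k-NN: indices of the k most similar stored vectors.
k = 2
top_k = np.argsort(cos_sim)[::-1][:k]
print(top_k, cos_sim[top_k])
```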
AWS vector database options
1) Amazon OpenSearch Service (provisioned)
2) Amazon OpenSearch Serverless
3) pgvector extension in Amazon Relational Database Service (Amazon RDS) for PostgreSQL
4) pgvector extension in Amazon Aurora PostgreSQL-Compatible Edition
5) Amazon Kendra
What is RAG?
Retrieval Augmented Generation: a technique that supplies the model with relevant information retrieved from an external knowledge source (such as a vector database) at inference time, so its responses are grounded in that data without retraining the model.
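A minimal sketch of the RAG flow. The in-memory document list and keyword-overlap retriever stand in for a real vector store, and the LLM call itself is omitted; only the retrieve-then-augment pattern is shown.
```python
# Tiny in-memory "knowledge base"; a real system would use a vector store.
documents = [
    "Amazon OpenSearch Serverless supports vector search collections.",
    "pgvector adds vector similarity search to PostgreSQL.",
    "ROUGE is a recall-oriented summarization metric.",
]

def retrieve(question, docs, k=2):
    # Naive keyword-overlap retrieval; a production system would rank
    # documents by embedding similarity instead.
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_rag_prompt(question, docs):
    # Augment the prompt with retrieved context before calling the LLM.
    context = "\n".join(retrieve(question, docs))
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

print(build_rag_prompt("How do I add vector search to PostgreSQL?", documents))
```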
What are agents? What are the types?
Agents are autonomous programs that perform tasks. The main types are 1) intermediary operation agents, 2) action launch agents, and 3) feedback agents
What is quantitative evaluation of a generative AI model?
Evaluation against benchmarking datasets. Example measurements include accuracy, speed and efficiency, and scalability
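A minimal sketch of one such measurement (accuracy) computed against a toy benchmark; the expected and predicted answers are made up for illustration.
```python
# Toy benchmark: expected answers vs. model outputs.
expected = ["paris", "4", "blue"]
predicted = ["paris", "5", "blue"]

accuracy = sum(e == p for e, p in zip(expected, predicted)) / len(expected)
print(f"accuracy = {accuracy:.2f}")  # 0.67
```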
Instruction tuning
A type of fine-tuning that retrains the model on a new dataset consisting of prompts followed by the desired outputs (highly effective for interactive applications such as virtual assistants and chatbots)
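A minimal sketch of what an instruction-tuning dataset might look like: JSON Lines records pairing a prompt with the desired output. The field names and file name are assumptions for illustration, not any specific framework's schema.
```python
import json

# Each record pairs an instruction-style prompt with the desired output.
records = [
    {"prompt": "Summarize: The meeting covered Q3 sales and hiring plans.",
     "completion": "The meeting discussed Q3 sales results and hiring plans."},
    {"prompt": "Translate to French: Good morning.",
     "completion": "Bonjour."},
]

with open("instruction_tuning.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```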
Reinforcement learning from human feedback (RLHF)
- A fine-tuning technique
- The model is initially trained using supervised learning to predict human-like responses. It is then further refined through a reinforcement learning process, where a reward model built from human feedback guides the model toward generating more preferable outputs. Well suited to sensitive applications.
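A highly simplified sketch of the reward-model idea only, not the full RL training loop: candidate responses are scored by a reward function learned from human preferences, and higher-reward outputs are favored. The toy reward function below is an assumption purely for illustration.
```python
# Toy stand-in for a reward model trained on human preference data:
# here it simply rewards polite, concise responses.
def reward_model(response: str) -> float:
    score = 0.0
    if "please" in response.lower() or "thank" in response.lower():
        score += 1.0
    score -= 0.01 * len(response)  # prefer concise answers
    return score

candidates = [
    "Here is the refund policy. Thank you for asking!",
    "Read the docs yourself.",
]

# In RLHF, reward scores like these guide a reinforcement-learning update
# (e.g., PPO) so the model generates more preferable outputs over time.
best = max(candidates, key=reward_model)
print(best)
```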
Adapting models for specific domains
This approach involves fine-tuning the model on a corpus of text or data that is specific to a particular industry or sector.
Transfer learning
A method in which a model developed for one task is reused as the starting point for a model on a second task (highly efficient because learned features and knowledge from the general training phase are applied to a narrower scope with less additional training required)
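A minimal PyTorch sketch of the idea: reuse (and freeze) a pretrained feature extractor and train only a new task-specific head. The tiny randomly initialized model and data are assumptions for illustration; in practice the base would be a model trained on a large general dataset.
```python
import torch
import torch.nn as nn

# Stand-in for a pretrained feature extractor.
base = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())

# Freeze the learned features so only the new task head is trained.
for p in base.parameters():
    p.requires_grad = False

head = nn.Linear(32, 3)  # new head for a 3-class downstream task
model = nn.Sequential(base, head)

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step on toy data.
x, y = torch.randn(8, 16), torch.randint(0, 3, (8,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```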
Continuous pretraining
Pretraining the model further by continuously feeding it new and emerging data.
Key steps in fine-tuning
1) Data curation (a more rigorous selection process to ensure every piece of data is highly relevant)
2) Labeling
3) Governance and compliance
4) Representativeness and bias checking
5) Feedback integration
ROUGE
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics used to evaluate automatic summarization of texts, in addition to machine translation quality in NLP. The main idea behind ROUGE is to count the number of overlapping units between the model-generated text and the human-written reference text
ROUGE-L
This metric uses the longest common subsequence between the generated text and the reference texts. It is particularly good at evaluating the coherence and order of the narrative in the output
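A minimal from-scratch sketch of the ROUGE-L computation (real evaluations would typically use a library such as rouge-score); the toy sentence pair is made up for illustration.
```python
def lcs_length(a, b):
    # Classic dynamic-programming longest common subsequence over tokens.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[-1][-1]

reference = "the cat sat on the mat".split()
candidate = "the cat lay on the mat".split()

lcs = lcs_length(candidate, reference)
recall = lcs / len(reference)      # ROUGE-L recall
precision = lcs / len(candidate)   # ROUGE-L precision
f1 = 2 * precision * recall / (precision + recall)
print(lcs, round(f1, 2))
```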
ROUGE-N
This metric measures the overlap of n-grams between the generated text and the reference text. For example, ROUGE-1 refers to the overlap of unigrams, ROUGE-2 refers to bigrams, and so on. This metric primarily assesses the fluency of the text and the extent to which it includes key ideas from the reference.
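A minimal sketch of recall-oriented ROUGE-N computed by counting clipped n-gram overlap against the reference; the toy sentences are illustrative only.
```python
from collections import Counter

def rouge_n_recall(candidate, reference, n):
    # Count overlapping n-grams, clipped to the reference counts,
    # and divide by the total n-grams in the reference (recall-oriented).
    ngrams = lambda toks: Counter(tuple(toks[i:i+n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate.split()), ngrams(reference.split())
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    return overlap / max(sum(ref.values()), 1)

ref = "the cat sat on the mat"
cand = "the cat is on the mat"
print(rouge_n_recall(cand, ref, 1))  # ROUGE-1 recall
print(rouge_n_recall(cand, ref, 2))  # ROUGE-2 recall
```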
BLEU
BLEU (Bilingual Evaluation Understudy) is a precision metric used to evaluate the quality of text that has been machine-translated from one natural language to another. BLEU measures the precision of n-grams in the machine-generated text that appear in the reference texts and applies a penalty for overly short translations (brevity penalty).
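A minimal from-scratch sketch of BLEU with clipped n-gram precision and a brevity penalty (real evaluations would typically use a library such as sacrebleu); the toy sentence pair is illustrative and assumes nonzero n-gram overlap.
```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=2):
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        ngrams = lambda t: Counter(tuple(t[i:i+n]) for i in range(len(t) - n + 1))
        c, r = ngrams(cand), ngrams(ref)
        overlap = sum(min(cnt, r[g]) for g, cnt in c.items())  # clipped counts
        precisions.append(overlap / max(sum(c.values()), 1))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(round(bleu("the cat is on the mat", "the cat sat on the mat"), 3))
```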