Optimizing Foundation Models Flashcards
The most common algorithms used to perform similarity searches in vector stores
1) k-nearest neighbors (k-NN) or cosine similarity
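As a rough illustration of how these two ideas combine, the sketch below ranks stored vectors by cosine similarity and returns the k closest ones. The `store` contents and the function names are hypothetical; real vector databases use approximate indexes rather than this brute-force scan.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def k_nearest(query, vectors, k):
    # Score every stored vector against the query, highest similarity first.
    scored = [(cosine_similarity(query, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical two-dimensional embeddings, for illustration only.
store = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.7, 0.7],
    "doc_c": [0.0, 1.0],
}
top = k_nearest([0.9, 0.1], store, k=2)
print(top)
```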
AWS vector database options
1) Amazon OpenSearch Service (provisioned)
2) Amazon OpenSearch Serverless
3) pgvector extension in Amazon Relational Database Service (Amazon RDS) for PostgreSQL
4) pgvector extension in Amazon Aurora PostgreSQL-Compatible Edition
5) Amazon Kendra
What is RAG?
Retrieval Augmented Generation. It lets you customize a model’s responses with new or up-to-date information, but it is not a model customization method, because model customization involves changing the weights of the model.
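A minimal sketch of the RAG flow, assuming a toy word-overlap retriever in place of a real embedding search: the relevant text is retrieved and placed into the prompt, and the model weights are never touched. The `knowledge_base` entries, function names, and prompt wording are all hypothetical.

```python
import re

def tokenize(text):
    # Lowercased set of words, ignoring punctuation.
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, documents, top_k=1):
    # Toy retriever: rank documents by how many words they share with the question.
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, context_docs):
    # The retrieved text is added to the prompt; the model itself is unchanged.
    context = "\n".join(context_docs)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}"

# Hypothetical knowledge base, for illustration only.
knowledge_base = [
    "The return window is 30 days.",
    "Shipping takes 5 business days.",
]
question = "How long is the return window?"
prompt = build_prompt(question, retrieve(question, knowledge_base))
print(prompt)
```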
What are agents? What are the types?
Agents are autonomous computer programs that perform tasks. The main types are 1) intermediary operation agents 2) actions launch agents 3) feedback agents
What is the quantitative evaluation of a generative AI model?
Benchmark datasets. Examples of measurements include accuracy, speed and efficiency, and scalability.
Instruction tuning
A type of fine-tuning that involves retraining the model on a new dataset that consists of prompts followed by the desired output (highly effective for interactive applications like virtual assistants and chatbots).
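To make the "prompts followed by the desired output" idea concrete, here is a small sketch that writes such pairs as JSON Lines. The `prompt` and `completion` field names and the sample records are assumptions for illustration; check the format required by whichever training service you use.

```python
import json

# Hypothetical instruction-tuning records; the field names are assumed.
records = [
    {"prompt": "Summarize: The meeting moved to Friday.",
     "completion": "Meeting rescheduled to Friday."},
    {"prompt": "Translate to French: Good morning",
     "completion": "Bonjour"},
]

def to_jsonl(rows):
    # One JSON object per line, a common layout for fine-tuning datasets.
    return "\n".join(json.dumps(row) for row in rows)

jsonl_text = to_jsonl(records)
print(jsonl_text)
```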
Reinforcement learning from human feedback (RLHF)
- fine-tuning technique
- model is initially trained using supervised learning to predict human-like responses. Then, it is further refined through a reinforcement learning process, where a reward model built from human feedback guides the model toward generating more preferable outputs. Good for sensitive applications.
Adapting models for specific domains
This approach involves fine-tuning the model on a corpus of text or data that is specific to a particular industry or sector.
Transfer learning
This approach is a method where a model developed for one task is reused as the starting point for a model on a second task (highly efficient in using learned features and knowledge from the general training phase and applying them to a narrower scope with less additional training required.)
Continuous pretraining
Extending a model’s pre-training by continuously feeding it new and emerging data.
Key steps in fine-tuning
1) Data curation (this involves a more rigorous selection process to ensure every piece of data is highly relevant)
2) Labeling
3) Governance and compliance
4) Representativeness and bias checking
5) Feedback integration
ROUGE
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics used to evaluate automatic summarization of texts, in addition to machine translation quality in NLP. The main idea behind ROUGE is to count the number of overlapping units between the model-generated text and the human-written reference text.
ROUGE-L
This metric uses the longest common subsequence between the generated text and the reference texts. It is particularly good at evaluating the coherence and order of the narrative in the output
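A minimal sketch of ROUGE-L as an F1 score over the longest common subsequence of whitespace tokens. Tokenization and the unweighted F1 are simplifying assumptions; published implementations may weight recall more heavily.

```python
def lcs_length(a, b):
    # Classic dynamic-programming longest common subsequence.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l(candidate, reference):
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision = lcs / len(c)
    recall = lcs / len(r)
    return 2 * precision * recall / (precision + recall)

score = rouge_l("the cat sat on the mat", "the cat lay on the mat")
print(score)
```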
ROUGE-N
This metric measures the overlap of n-grams between the generated text and the reference text. For example, ROUGE-1 refers to the overlap of unigrams, ROUGE-2 refers to bigrams, and so on. This metric primarily assesses the fluency of the text and the extent to which it includes key ideas from the reference.
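The n-gram overlap can be sketched as a recall score: the share of reference n-grams that also appear in the generated text. Whitespace tokenization and the recall-only form are assumptions made to keep the example short.

```python
from collections import Counter

def ngrams(tokens, n):
    # Multiset of all contiguous n-token windows.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n):
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return overlap / total

generated = "the cat sat on the mat"
reference = "the cat lay on the mat"
rouge_1 = rouge_n_recall(generated, reference, 1)
rouge_2 = rouge_n_recall(generated, reference, 2)
print(rouge_1, rouge_2)
```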
BLEU
A precision metric used to evaluate the quality of text that has been machine-translated from one natural language to another. BLEU measures the precision of n-grams in the machine-generated text that appear in the reference texts and applies a penalty for overly short translations (brevity penalty).
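The sketch below shows a simplified BLEU with a single reference: clipped n-gram precision for n = 1 and 2, combined by geometric mean and scaled by the brevity penalty. Using only up to bigrams and no smoothing are assumptions for brevity; standard BLEU goes up to 4-grams.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    c, r = candidate.split(), reference.split()
    if not c:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(c, n), ngrams(r, n)
        total = sum(cand.values())
        # Clipped counts: a candidate n-gram is credited at most as often as it occurs in the reference.
        clipped = sum((cand & ref).values())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: candidates shorter than the reference are scaled down.
    bp = 1.0 if len(c) > len(r) else math.exp(1 - len(r) / len(c))
    return bp * math.exp(sum(log_precisions) / max_n)

short_score = bleu("the cat sat", "the cat sat on the mat")
print(short_score)
```

In this example every n-gram of the short candidate appears in the reference, so precision is perfect and the score is reduced only by the brevity penalty.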
BERTScore
BERTScore evaluates semantic similarity rather than relying on exact lexical matches, so it can capture meaning in a more nuanced manner. It uses pretrained contextual embeddings from models like BERT to evaluate the quality of text-generation tasks, computing the cosine similarity between the contextual embeddings of words in the candidate and the reference texts.
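A rough sketch of the matching step, assuming the token embeddings have already been produced by some model: each token is greedily matched to its most similar token on the other side, and the averages give precision, recall, and F1. The tiny hand-written vectors stand in for real contextual embeddings.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def bertscore_f1(cand_vecs, ref_vecs):
    # Precision: each candidate token matched to its most similar reference token.
    precision = sum(max(cosine(c, r) for r in ref_vecs) for c in cand_vecs) / len(cand_vecs)
    # Recall: each reference token matched to its most similar candidate token.
    recall = sum(max(cosine(r, c) for c in cand_vecs) for r in ref_vecs) / len(ref_vecs)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical token embeddings, for illustration only.
candidate_embeddings = [[1.0, 0.0], [0.0, 1.0]]
reference_embeddings = [[1.0, 0.0]]
f1 = bertscore_f1(candidate_embeddings, reference_embeddings)
print(f1)
```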
Hyperparameter tuning
A method for adjusting the behavior of an ML algorithm by changing the settings (hyperparameters) that control how it learns, as opposed to the parameters the model learns from data.
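As a small worked example, the sketch below tunes one hyperparameter, the learning rate, by training a one-weight model with each candidate value and keeping the one with the lowest final loss. The toy dataset, the candidate list, and the grid-search strategy are assumptions chosen for illustration.

```python
def train(learning_rate, epochs=20):
    # Fit y = w * x by gradient descent on a toy dataset where the true w is 2.
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * grad
    # Return the final mean squared error for this learning rate.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

# Grid search: try each candidate and keep the one with the lowest loss.
candidates = [0.001, 0.01, 0.1, 0.5]
best = min(candidates, key=train)
print(best, train(best))
```

Here the smallest rates converge too slowly in 20 epochs and the largest one diverges, which is the kind of behavior change that tuning is meant to find.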
Benchmark dataset
A curated collection of data designed specifically to evaluate the performance of a language model. Can be a low-cost way to evaluate bias.
Amazon RAG vector databases
1) Amazon OpenSearch Service - scalable index management and fast k-NN (nearest neighbor) search capabilities
2) Amazon DocumentDB - NoSQL. Millions of vectors
3) Amazon Aurora (relational database)
4) Amazon RDS (relational database)
5) Amazon Neptune (graph database)
Model improvement technique costs - from least to most
1) Prompt engineering (very cheap because it needs no additional computation or fine-tuning)
2) Retrieval Augmented Generation (RAG) (cheap because the model is not retrained or fine-tuned, but it does require a vector database)
3) Instruction-based fine-tuning (the model is fine-tuned with specific instructions; requires additional computation)
4) Domain adaptation fine-tuning (the model is trained on domain-specific, unlabeled data; requires intensive computation)
What’s the main driver of cost in Amazon Bedrock?
Number of input and output tokens
Side note: batch mode can provide up to 50% savings
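The arithmetic can be sketched as follows. The per-1,000-token prices here are hypothetical placeholders, not actual Amazon Bedrock rates, and the batch discount is passed in as an assumed fraction.

```python
def estimate_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k, batch_discount=0.0):
    # Cost scales with token counts; input and output tokens are priced separately.
    on_demand = (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k
    return on_demand * (1 - batch_discount)

# Hypothetical prices in USD per 1,000 tokens, for illustration only.
on_demand_cost = estimate_cost(2_000_000, 500_000, 0.003, 0.015)
batch_cost = estimate_cost(2_000_000, 500_000, 0.003, 0.015, batch_discount=0.5)
print(on_demand_cost, batch_cost)
```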
Shapley value
Provides a local explanation by quantifying the contribution of each feature to the predictions
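A minimal sketch of exact Shapley values for one prediction, assuming a tiny hand-written model and an all-zeros baseline to represent "feature absent". It averages each feature's marginal contribution over every ordering, which is only feasible for a handful of features; practical tools approximate this.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    n = len(instance)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            # Switch feature i from its baseline value to its actual value.
            current[i] = instance[i]
            updated = predict(current)
            contributions[i] += updated - previous
            previous = updated
    # Average the marginal contributions over all orderings.
    return [c / len(orderings) for c in contributions]

def model(x):
    # Hypothetical model with an interaction term, for illustration only.
    return 2 * x[0] + 3 * x[1] + x[0] * x[1]

phi = shapley_values(model, instance=[1.0, 2.0], baseline=[0.0, 0.0])
print(phi)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is what makes the explanation local to that one prediction.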
Partial Dependence Plots (PDP)
Provides a global view of the model’s behavior by illustrating how the predicted outcome changes as a single feature is varied.
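The computation behind a partial dependence curve can be sketched like this: for each grid value, overwrite the chosen feature in every row, predict, and average. The model, dataset, and grid below are hypothetical.

```python
def partial_dependence(predict, dataset, feature_index, grid):
    curve = []
    for value in grid:
        predictions = []
        for row in dataset:
            # Force the chosen feature to the grid value, keep other features as observed.
            modified = list(row)
            modified[feature_index] = value
            predictions.append(predict(modified))
        curve.append(sum(predictions) / len(predictions))
    return curve

def model(x):
    # Hypothetical model, for illustration only.
    return 2 * x[0] + x[1]

dataset = [[0.0, 1.0], [0.0, 3.0]]
curve = partial_dependence(model, dataset, feature_index=0, grid=[0.0, 1.0, 2.0])
print(curve)
```

Plotting the grid values against `curve` gives the PDP for that feature.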
What is SageMaker Clarify?
SageMaker Clarify is specifically designed to help identify and mitigate bias in machine learning models and datasets. It provides tools to analyze both data and model predictions to detect potential bias, generate reports, and help ensure that models are fair and transparent. It can help identify and measure bias within the data preparation stage and throughout the model’s lifecycle.
SageMaker Model Monitor
Tracks performance metrics, data drift, and other factors that affect model accuracy and reliability.
What can you use to establish a private connection between your Amazon Bedrock foundation model and your Amazon Virtual Private Cloud (Amazon VPC)?
AWS PrivateLink, which keeps your traffic from being exposed to the internet.
Continued pre-training
- You provide unlabeled data to a foundation model to familiarize it with certain types of inputs.
- You can provide data from specific topics to expose the model to those areas.
- It tweaks the model parameters to accommodate the input data and improve the model’s domain knowledge.