Optimizing Foundation Models Flashcards
The most common algorithms used to perform similarity search in vector stores
1) k-nearest neighbors (k-NN) or cosine similarity
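A minimal NumPy sketch of both ideas: cosine similarity scores how close a query embedding is to each stored vector, and k-NN returns the k closest matches. The toy 4-dimensional vectors are made up for illustration; they are not from any real vector store.
```python
import numpy as np

# Toy document embeddings (assumed 4-dimensional for illustration).
doc_vectors = np.array([
    [0.1, 0.3, 0.5, 0.1],
    [0.9, 0.1, 0.0, 0.2],
    [0.2, 0.4, 0.4, 0.0],
])
query = np.array([0.15, 0.35, 0.45, 0.05])

# Cosine similarity: angle-based closeness, independent of vector length.
cos_sim = doc_vectors @ query / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query)
)

# k-NN: indices of the k most similar stored vectors.
k = 2
top_k = np.argsort(cos_sim)[::-1][:k]
print(top_k, cos_sim[top_k])
```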
AWS vector database options
1) Amazon OpenSearch Service (provisioned)
2) Amazon OpenSearch Serverless
3) pgvector extension in Amazon Relational Database Service (Amazon RDS) for PostgreSQL
4) pgvector extension in Amazon Aurora PostgreSQL-Compatible Edition
5) Amazon Kendra
What is RAG?
Retrieval Augmented Generation: a technique that supplies the model with relevant information retrieved from an external knowledge source (such as a vector database) at inference time, so its responses are grounded in that data without retraining the model.
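A minimal sketch of the RAG flow. The in-memory document list and keyword-overlap retriever stand in for a real vector store, and the LLM call itself is omitted; only the retrieve-then-augment pattern is shown.
```python
# Tiny in-memory "knowledge base"; a real system would use a vector store.
documents = [
    "Amazon OpenSearch Serverless supports vector search collections.",
    "pgvector adds vector similarity search to PostgreSQL.",
    "ROUGE is a recall-oriented summarization metric.",
]

def retrieve(question, docs, k=2):
    # Naive keyword-overlap retrieval; a production system would rank
    # documents by embedding similarity instead.
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_rag_prompt(question, docs):
    # Augment the prompt with retrieved context before calling the LLM.
    context = "\n".join(retrieve(question, docs))
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

print(build_rag_prompt("How do I add vector search to PostgreSQL?", documents))
```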
What are agents? What are the types?
Agents are autonomous programs that perform tasks. The main types are 1) intermediary operation agents, 2) action launch agents, and 3) feedback agents
What is quantitative evaluation of a generative AI model?
Evaluation against benchmarking datasets. Example measurements include accuracy, speed and efficiency, and scalability
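A minimal sketch of one such measurement (accuracy) computed against a toy benchmark; the expected and predicted answers are made up for illustration.
```python
# Toy benchmark: expected answers vs. model outputs.
expected = ["paris", "4", "blue"]
predicted = ["paris", "5", "blue"]

accuracy = sum(e == p for e, p in zip(expected, predicted)) / len(expected)
print(f"accuracy = {accuracy:.2f}")  # 0.67
```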
Instruction tuning
A type of fine-tuning that retrains the model on a new dataset consisting of prompts followed by the desired outputs (highly effective for interactive applications such as virtual assistants and chatbots)
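A minimal sketch of what an instruction-tuning dataset might look like: JSON Lines records pairing a prompt with the desired output. The field names and file name are assumptions for illustration, not any specific framework's schema.
```python
import json

# Each record pairs an instruction-style prompt with the desired output.
records = [
    {"prompt": "Summarize: The meeting covered Q3 sales and hiring plans.",
     "completion": "The meeting discussed Q3 sales results and hiring plans."},
    {"prompt": "Translate to French: Good morning.",
     "completion": "Bonjour."},
]

with open("instruction_tuning.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
```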
Reinforcement learning from human feedback (RLHF)
- A fine-tuning technique
- The model is initially trained using supervised learning to predict human-like responses. It is then further refined through a reinforcement learning process, where a reward model built from human feedback guides the model toward generating more preferable outputs. Well suited to sensitive applications.
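A highly simplified sketch of the reward-model idea only, not the full RL training loop: candidate responses are scored by a reward function learned from human preferences, and higher-reward outputs are favored. The toy reward function below is an assumption purely for illustration.
```python
# Toy stand-in for a reward model trained on human preference data:
# here it simply rewards polite, concise responses.
def reward_model(response: str) -> float:
    score = 0.0
    if "please" in response.lower() or "thank" in response.lower():
        score += 1.0
    score -= 0.01 * len(response)  # prefer concise answers
    return score

candidates = [
    "Here is the refund policy. Thank you for asking!",
    "Read the docs yourself.",
]

# In RLHF, reward scores like these guide a reinforcement-learning update
# (e.g., PPO) so the model generates more preferable outputs over time.
best = max(candidates, key=reward_model)
print(best)
```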
Adapting models for specific domains
This approach involves fine-tuning the model on a corpus of text or data that is specific to a particular industry or sector.
Transfer learning
A method in which a model developed for one task is reused as the starting point for a model on a second task (highly efficient because learned features and knowledge from the general training phase are applied to a narrower scope with less additional training required)
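A minimal PyTorch sketch of the idea: reuse (and freeze) a pretrained feature extractor and train only a new task-specific head. The tiny randomly initialized model and data are assumptions for illustration; in practice the base would be a model trained on a large general dataset.
```python
import torch
import torch.nn as nn

# Stand-in for a pretrained feature extractor.
base = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())

# Freeze the learned features so only the new task head is trained.
for p in base.parameters():
    p.requires_grad = False

head = nn.Linear(32, 3)  # new head for a 3-class downstream task
model = nn.Sequential(base, head)

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step on toy data.
x, y = torch.randn(8, 16), torch.randint(0, 3, (8,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```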
Continuous pretraining
Pretraining the model further by continuously feeding it new and emerging data.
Key steps in fine-tuning
1) Data curation (a more rigorous selection process to ensure every piece of data is highly relevant)
2) Labeling
3) Governance and compliance
4) Representativeness and bias checking
5) Feedback integration
ROUGE
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics used to evaluate automatic summarization of texts, in addition to machine translation quality in NLP. The main idea behind ROUGE is to count the number of overlapping units between the model-generated text and the human-written reference text
ROUGE-L
This metric uses the longest common subsequence between the generated text and the reference texts. It is particularly good at evaluating the coherence and order of the narrative in the output
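A minimal from-scratch sketch of the ROUGE-L computation (real evaluations would typically use a library such as rouge-score); the toy sentence pair is made up for illustration.
```python
def lcs_length(a, b):
    # Classic dynamic-programming longest common subsequence over tokens.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[-1][-1]

reference = "the cat sat on the mat".split()
candidate = "the cat lay on the mat".split()

lcs = lcs_length(candidate, reference)
recall = lcs / len(reference)      # ROUGE-L recall
precision = lcs / len(candidate)   # ROUGE-L precision
f1 = 2 * precision * recall / (precision + recall)
print(lcs, round(f1, 2))
```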
ROUGE-N
This metric measures the overlap of n-grams between the generated text and the reference text. For example, ROUGE-1 refers to the overlap of unigrams, ROUGE-2 refers to bigrams, and so on. This metric primarily assesses the fluency of the text and the extent to which it includes key ideas from the reference.
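A minimal sketch of recall-oriented ROUGE-N computed by counting clipped n-gram overlap against the reference; the toy sentences are illustrative only.
```python
from collections import Counter

def rouge_n_recall(candidate, reference, n):
    # Count overlapping n-grams, clipped to the reference counts,
    # and divide by the total n-grams in the reference (recall-oriented).
    ngrams = lambda toks: Counter(tuple(toks[i:i+n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate.split()), ngrams(reference.split())
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    return overlap / max(sum(ref.values()), 1)

ref = "the cat sat on the mat"
cand = "the cat is on the mat"
print(rouge_n_recall(cand, ref, 1))  # ROUGE-1 recall
print(rouge_n_recall(cand, ref, 2))  # ROUGE-2 recall
```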
BLEU
BLEU (Bilingual Evaluation Understudy) is a precision metric used to evaluate the quality of text that has been machine-translated from one natural language to another. BLEU measures the precision of n-grams in the machine-generated text that appear in the reference texts and applies a penalty for overly short translations (brevity penalty).
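A minimal from-scratch sketch of BLEU with clipped n-gram precision and a brevity penalty (real evaluations would typically use a library such as sacrebleu); the toy sentence pair is illustrative and assumes nonzero n-gram overlap.
```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=2):
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        ngrams = lambda t: Counter(tuple(t[i:i+n]) for i in range(len(t) - n + 1))
        c, r = ngrams(cand), ngrams(ref)
        overlap = sum(min(cnt, r[g]) for g, cnt in c.items())  # clipped counts
        precisions.append(overlap / max(sum(c.values()), 1))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(round(bleu("the cat is on the mat", "the cat sat on the mat"), 3))
```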