Optimizing Foundation Models Flashcards
Embedding is the process by which
text, images, and audio are given numerical representations in a vector space.
Embedding is usually performed by
a machine learning (ML) model.
Enterprise datasets, such as documents, images, and audio, are passed to ML models as tokens and vectorized. The resulting vectors in an n-dimensional space, along with metadata about them, are stored in purpose-built vector databases for fast retrieval.
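Here is a minimal sketch of that first step, assuming the sentence-transformers library and the all-MiniLM-L6-v2 model (both are illustrative choices; any embedding model works):

```python
# A sketch of embedding documents, assuming the sentence-transformers
# library and the all-MiniLM-L6-v2 model (both illustrative choices).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = ["Quarterly sales report", "Customer support transcript"]
# encode() tokenizes each input and returns one fixed-size vector per input
vectors = model.encode(documents)

print(vectors.shape)  # (2, 384): two documents, 384 dimensions each
```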
Two words that relate to each other will have similar
embeddings.
Here is an example with two words: sea and ocean. Their embeddings are randomly initialized, so early in training they are dissimilar. As the training progresses, their embeddings become more
similar, because the two words often appear close to each other and in similar contexts.
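A quick way to see this, reusing the same illustrative embedding model as above:

```python
# Comparing embeddings of related and unrelated words, reusing the same
# illustrative model as above.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
sea, ocean, car = model.encode(["sea", "ocean", "car"])

def cosine(a, b):
    # 1.0 means the vectors point in the same direction; 0.0 means unrelated
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(sea, ocean))  # high: "sea" and "ocean" share contexts
print(cosine(sea, car))    # noticeably lower
```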
The core function of vector databases is to
compactly store billions of high-dimensional vectors representing words and entities. Vector databases provide ultra-fast similarity searches across these billions of vectors in real time.
The most common algorithms used to perform the similarity search are
k-nearest neighbors (k-NN)
cosine similarity.
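A brute-force sketch of a k-NN similarity search scored with cosine similarity follows; real vector databases use approximate indexes (for example, HNSW) to stay fast across billions of vectors:

```python
# Brute-force k-NN search with cosine similarity over an in-memory array,
# sketching what a vector database does at far larger scale (production
# systems use approximate indexes such as HNSW instead of a full scan).
import numpy as np

rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 384))  # 1,000 stored 384-d embeddings
query = rng.normal(size=384)

# Normalize so a dot product equals cosine similarity
database /= np.linalg.norm(database, axis=1, keepdims=True)
query /= np.linalg.norm(query)

scores = database @ query                # similarity of query to every vector
top_k = np.argsort(scores)[::-1][:5]     # indices of the 5 nearest neighbors
print(top_k, scores[top_k])
```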
Agents - Intermediary operations
Agents can act as intermediaries, facilitating communication between the generative AI model and various backend systems. The generative AI model handles language understanding and response generation, while the backend systems include databases, CRM platforms, and service management tools.
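A minimal sketch of this routing role follows; the tool-call format, backend function, and names are all hypothetical rather than a specific framework's API:

```python
# A sketch of an agent routing the model's structured tool call to a backend
# system and returning the result. The tool-call format, backend function,
# and names are hypothetical, not a specific framework's API.

def lookup_customer(customer_id: str) -> dict:
    # Stand-in for a real CRM query
    return {"id": customer_id, "plan": "premium", "open_tickets": 2}

BACKENDS = {"crm.lookup_customer": lookup_customer}

def route_tool_call(tool_call: dict) -> dict:
    # The model decides *what* is needed; the agent dispatches *how*,
    # then hands the result back to the model for response generation.
    handler = BACKENDS[tool_call["name"]]
    return handler(**tool_call["arguments"])

# A structured request the model might emit after understanding the user
result = route_tool_call(
    {"name": "crm.lookup_customer", "arguments": {"customer_id": "C-42"}}
)
print(result)  # fed back to the model so it can phrase the answer
```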
Agents - Action launch
Agents can be used to run a wide variety of tasks, such as adjusting service settings, processing transactions, and retrieving documents. These actions are based on the user's specific needs as understood by the generative AI model.
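A sketch of launching an action from a parsed user intent; the intent schema and both handlers are hypothetical placeholders:

```python
# A sketch of launching an action from a parsed user intent. The intent
# schema and both handlers are hypothetical placeholders.

def adjust_service_setting(setting: str, value: str) -> str:
    return f"Setting '{setting}' updated to '{value}'"

def process_transaction(amount: float, account: str) -> str:
    return f"Charged {amount:.2f} to account {account}"

ACTIONS = {
    "adjust_setting": adjust_service_setting,
    "process_transaction": process_transaction,
}

# Suppose the model parsed "upgrade my plan to premium" into this intent
intent = {"action": "adjust_setting",
          "parameters": {"setting": "plan", "value": "premium"}}

print(ACTIONS[intent["action"]](**intent["parameters"]))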
Agents - Feedback integration
Agents can also contribute to the AI system’s learning process by collecting data on the outcomes of their actions. This feedback helps refine the AI model, enhancing its accuracy and effectiveness in future interactions.
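A sketch of what collecting that feedback might look like; the record schema and file name are assumptions:

```python
# A sketch of recording action outcomes for later model refinement.
# The record schema and file name are assumptions.
import json
import time

def record_outcome(action: str, arguments: dict, outcome: str, success: bool):
    record = {
        "timestamp": time.time(),
        "action": action,
        "arguments": arguments,
        "outcome": outcome,
        "success": success,  # usable as a label for evaluation or tuning
    }
    with open("agent_feedback.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")

record_outcome("process_transaction", {"amount": 19.99}, "completed", True)
```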
Human evaluation involves real users interacting with the AI model to provide feedback based on their experience. This method is particularly valuable for assessing qualitative aspects of the model, such as the following:
User experience: How intuitive and satisfying is the interaction with the model from the user's perspective?
Contextual appropriateness: Does the model respond in a way that is contextually relevant and sensitive to the nuances of human communication?
Creativity and flexibility: How well does the model handle unexpected queries or complex scenarios that require a nuanced understanding?
Human evaluation is often used for iterative improvements and for tuning the model to better meet user expectations.
Benchmark datasets, on the other hand, provide a quantitative way to evaluate generative AI models. They pair predefined test data with associated metrics, offering a consistent, objective means to measure model performance, such as the following:
Accuracy
Speed and efficiency
Scalability
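A sketch of scoring a model on two of these metrics, accuracy and speed, against a tiny hypothetical benchmark; the generate() stub stands in for the model under evaluation:

```python
# Scoring a model on two of the metrics above, accuracy and speed, against
# a tiny hypothetical benchmark. generate() is a stub standing in for the
# model under evaluation.
import time

benchmark = [
    {"prompt": "Capital of France?", "expected": "Paris"},
    {"prompt": "2 + 2 =", "expected": "4"},
]

def generate(prompt: str) -> str:
    return {"Capital of France?": "Paris", "2 + 2 =": "4"}[prompt]

correct = 0
start = time.perf_counter()
for item in benchmark:
    if generate(item["prompt"]).strip() == item["expected"]:
        correct += 1
elapsed = time.perf_counter() - start

print(f"accuracy: {correct / len(benchmark):.2f}")
print(f"avg latency: {elapsed / len(benchmark):.4f}s")
```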
Creating a benchmark dataset is a
manual process that is necessary to properly evaluate the performance of LLMs in RAG systems.
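For illustration, a hand-authored entry for a RAG benchmark might look like this; the field names are assumptions, and teams define their own schemas:

```python
# An illustrative hand-authored entry for a RAG benchmark. Field names are
# assumptions; teams define their own schemas.
benchmark_entry = {
    "question": "What is the standard warranty period?",
    "reference_answer": "Two years from the date of purchase.",
    # The document a correct retrieval step should surface
    "relevant_sources": ["warranty_policy.pdf"],
}
```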
In practice, a combination of
both human evaluation and benchmark datasets is often used to provide a comprehensive overview of a model’s performance.
LLM as a judge
Evaluation of LLM performance against a benchmark dataset can be automated using this approach: a second LLM scores the evaluated model's responses against the reference answers.
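A sketch of the pattern; the call_llm() stub is a hypothetical stand-in for however the judge model is actually served:

```python
# A sketch of LLM-as-a-judge scoring. call_llm() is a hypothetical stub
# standing in for however the judge model is actually served.

JUDGE_PROMPT = """You are grading a model's answer against a reference.
Question: {question}
Reference answer: {reference}
Model answer: {answer}
Reply with a single integer score from 1 (wrong) to 5 (fully correct)."""

def call_llm(prompt: str) -> str:
    raise NotImplementedError("Replace with a call to the judge model")

def judge(question: str, reference: str, answer: str) -> int:
    reply = call_llm(JUDGE_PROMPT.format(
        question=question, reference=reference, answer=answer))
    return int(reply.strip())
```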