Domain 3: Application of Foundation Models 28% Flashcards by Natasha WrightPope

What does finding the balance between training time, cost, and model performance yield?

An efficient scalable solution that does not reduce model performance.

Cost, latency constraints, and required modalities

considerations for apps that use FMs

Understanding the requirements for your use case in terms of _____ is important when deciding on an AI model.

inference speed

Cost

find the balance between training time, cost, and model performance

Latency

Consider real-time results requirements, inference times

_____ is the duration it takes a model to process data and produce a prediction.

Inference speed

Modalities

specific embedding, multi-modal, multilingual, pre-trained (architecture/complexity)

Accuracy, precision, recall, F1 score, root mean squared error or RMSE, mean average precision or MAP, and mean absolute error, MAE.

standard metrics to evaluate and compare different models
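
As a rough illustration of several of these metrics, the sketch below computes accuracy, precision, recall, and F1 for a binary classifier, plus RMSE and MAE for a regressor. The function names and the sample labels are made up for this example and do not come from any particular library.

```python
import math

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def regression_errors(y_true, y_pred):
    """Root mean squared error and mean absolute error."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"rmse": rmse, "mae": mae}

# Hypothetical labels and predictions, for illustration only.
scores = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
errors = regression_errors([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
```

RMSE penalizes large errors more heavily than MAE because each error is squared before averaging.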

It’s important when you are choosing an appropriate metric or set of metrics to _____ before selecting a model.

assess your model’s performance

Framework, language, environment, license, documentation, whether the model has been updated and maintained regularly, known issues or limitations, customization, and explainability.

Compatibility factors to consider when using a pre-trained model found online

flexible, modular, transparent, provide tools or methods to visualize or interpret their inner workings, interpret and explain model outcomes

What to look for when considering a pre-trained model

T/F: Foundation models are not interpretable by design because they are extremely complex.

True

_____ attempts to explain the black box nature of FMs, by approximating it locally with a simpler model that is interpretable.

Explainability

T/F: If interpretability is a requirement, then pre-trained foundation models might not be the best choice.

True

T/F: Linear regression and decision trees might be better when it comes to explainability.

True

The complexity of a model is important and can help you uncover intricate patterns within the data, but it can add challenges to _____ and _____.

maintenance and interpretability

T/F: Greater complexity might lead to enhanced performance, but can increase costs.

True

T/F: The more complicated the model is, the harder it is to explain the outputs of the model.

True

The _____ is where you process new data through the model to make predictions. It is the process of generating an output from an input that you provided to the model.

inference

_____ gives you the ability to run inference in the foundation model you choose.

Amazon Bedrock

A _____, which is an input, is provided to the model for it to generate a response.

prompt

_____ are a set of values that can be adjusted to limit or influence the model response.

inference parameters

What kind of models can you run inference with?

base, custom, and provisioned models, to test FM responses

Amazon Bedrock foundation models support the inference parameters of _____ to control randomness and diversity in the response.

temperature, Top K, Top P

What parameters are supported by Bedrock to limit the length of responses?

response length, penalties, and stop sequences

These inputs guide LLMs to generate an appropriate response or output for a given task or instruction.

Prompts

You can integrate additional domain-specific data from these data stores or vector data stores that add to your prompts semantically relevant inputs.

retrieval augmented generation, RAG

A _____ is a collection of data that is stored as mathematical representations.

vector database

It requires millions of graphics processing unit (GPU) compute hours, terabytes to petabytes of data, trillions of tokens, trial and error, and time. This is how generative AI models learn their capabilities.

pre-training

_____ add additional capabilities for efficient and fast lookup, and to provide data management, fault tolerance, authentication, access control, and query engine.

Vector databases

_____ enhances language models to retrieve and use external knowledge during the generation process. It is a technique in which the retrieval of information from data sources augments the generation of model responses.

RAG

RAG combines two components, a _____ component, which searches through a knowledge base and a _____ component, which produces outputs based on the retrieved information.

retriever / generator

Why does RAG combine two components?

helps the model access up-to-date and domain-specific knowledge beyond their training data

The prompt is passed into the query encoder, which encodes (embeds) it into the same format as the external data. The embedding is then passed to the vector database, which searches for and returns similar embeddings that were produced by the same model. Those embeddings are attached to the new query and can also be mapped back to their original location. If the vector database finds similar data, the retriever retrieves it, the retrieved data or text is combined with (augments) the original prompt, and the augmented prompt is sent to the LLM to return a completion.

How to use a vector database in the real world
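
The flow on this card can be sketched in a few lines. This is a toy under stated assumptions, not a real vector database: `embed` is a hypothetical stand-in that counts words instead of calling an embedding model, and `ToyVectorStore` and `build_augmented_prompt` are names invented for the example.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: a word-count vector.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ToyVectorStore:
    def __init__(self, documents):
        # Embed every document once, at indexing time.
        self.items = [(doc, embed(doc)) for doc in documents]

    def search(self, query, top_k=1):
        # Embed the query the same way, then rank by similarity.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]),
                        reverse=True)
        return [doc for doc, _ in ranked[:top_k]]

def build_augmented_prompt(query, store, top_k=1):
    # Attach the retrieved text to the original prompt.
    context = "\n".join(store.search(query, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

store = ToyVectorStore([
    "The refund window is 30 days.",
    "Shipping takes 5 business days.",
])
prompt = build_augmented_prompt("How long is the refund window?", store)
```

The augmented prompt is what would be sent to the LLM. A production system would swap in a real embedding model and a vector index for the two toy pieces.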

How does RAG solve hallucinations?

By complementing generative LLMs with an external knowledge base that is typically built using a vector database, hydrated with vector-encoded knowledge articles

Amazon OpenSearch Service, Amazon Aurora, Redis, Amazon Neptune, Amazon DocumentDB with MongoDB compatibility, and Amazon RDS with PostgreSQL

AWS services that help store embeddings within vector databases

The _____ delivers low-latency search and aggregations, along with dashboards and visualization tools. It also has plugins that provide advanced capabilities such as alerting, fine-grained access control, observability, security monitoring, and vector storage and processing. With this service's vector database capabilities, you can implement semantic search, Retrieval Augmented Generation (RAG) with LLMs, recommendation engines, and media search.

OpenSearch search engine

With _____ you can securely connect foundation models, FMs, to your company data. It is stored as embeddings in the vector engine for more relevant, context-specific, and accurate responses without continuously re-training the FM. Amazon RDS for PostgreSQL also supports the pgvector extension to store embeddings and perform efficient searches.

the fully managed RAG capability offered by Knowledge Bases for Amazon Bedrock

1. A fully managed AI capability from AWS to help you build applications with foundation models. 2. Can automatically break down tasks and generate the required orchestration logic or write custom code, and they can securely connect to your databases through APIs. 3. They can ingest and structure the data for machine consumption and augment it with contextual details to produce more accurate responses and fulfill requests. 4. They are an additional piece of software that orchestrates the prompt completion workflows and interactions between the user requests, foundation model, and external data sources or applications. 5. They automatically call APIs to take actions and invoke knowledge bases to supplement information for these actions.

Agents for Amazon Bedrock

_____ are a specific set of inputs provided by you the user. They guide LLMs to generate an appropriate response or output for a given task or instruction.

Prompts

A _____ contains components that you want the LLM to perform such as the task or instruction. You might also need the context of that task or instruction and the input text that you want for the response or output.

prompt

When you provide a few examples to help LLMs perform better and calibrate their output to meet your expectations.

few-shot prompting

A sentiment classification prompt with no examples provided to the prompt.

zero-shot prompting
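
The difference between zero-shot and few-shot prompting comes down to whether labeled examples are placed in the prompt ahead of the real input. The sketch below shows one way to assemble both. The `build_prompt` helper and the `Text:` / `Sentiment:` layout are assumptions made for this example, not a required format.

```python
def build_prompt(task, text, examples=()):
    """Assemble a sentiment prompt; with no examples it is zero-shot."""
    parts = [task]
    for example_text, label in examples:
        # Each worked example shows the model the expected output format.
        parts.append(f"Text: {example_text}\nSentiment: {label}")
    # The real input comes last, with the label left for the model to fill in.
    parts.append(f"Text: {text}\nSentiment:")
    return "\n\n".join(parts)

task = "Classify the sentiment of the text as positive or negative."

zero_shot = build_prompt(task, "The battery died after a week.")

few_shot = build_prompt(
    task,
    "The battery died after a week.",
    examples=[
        ("I love this phone.", "positive"),
        ("The screen cracked on day one.", "negative"),
    ],
)
```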

Where the actual prompt text is replaced with a continuous embedding vector that is optimized during training. This technique helps the prompt to be fine-tuned for a specific task. At the same time, it keeps the rest of the model parameters frozen, which can be more efficient than full fine-tuning.

prompt tuning

The practice of crafting and optimizing input prompts. It selects appropriate words, phrases, sentences, punctuation, and separator characters to effectively use LLMs for a wide variety of applications.

prompt engineering

Classification, question and answer with and without context, summarization, open-ended text generation, code generation, math, and reasoning or logical thinking.

common tasks supported by LLMs on Amazon Bedrock

This is: 1. The encoded knowledge of language in an LLM. 2. The stored patterns of data that capture relationships and, when prompted, reconstruct language from those patterns. 3. An understanding of patterns that the model can use to generate new outputs. 4. A statistical database.

Latent space

When you write a prompt for a language model, that prompt is ingested by the model and _____. It returns a pile of statistics that then get assembled as words.

refers to its latent space against its database of statistics

Designing and refining the input prompts that are fed into the model to guide it towards producing the desired outputs.

prompt engineering

1. Be specific and provide clear instructions or specifications for the task at hand. For example, include the desired format, examples, comparison, style, tone, output length, and detailed context. 2. Include examples of the desired behavior and direction, such as sample texts, data formats, templates, code, graphs, charts, and more. 3. Experiment and use an iterative process to test prompts and understand how the modifications alter the responses. 4. Know the strengths and weaknesses of your model. 5. Balance simplicity and complexity in your prompts to avoid vague, unrelated, or unexpected answers. 6. Specifically for your prompt engineers, use multiple comments to offer more context without cluttering your prompt. 7. Add guardrails.

prompt engineering techniques

Attacks that manipulate the prompt with untrusted input, created by a user to produce malicious, undesired, or illicit responses.

prompt injection

When an attacker tries to bypass the guardrails that you have established, this is called _____.

jailbreaking

_____ is an attempt to change or manipulate the original prompt with new instructions.

Hijacking

_____ is another risk of prompt engineering where harmful instructions are embedded in messages, emails, web pages, and more.

Poisoning

1. Use these services to build applications that generate high-quality text for use cases such as content creation, summarization, question answering, and chatbots. 2. Offer pre-trained language models that can be customized and controlled through prompt engineering. 3. They provide APIs and tools for constructing and refining prompts, along with monitoring and analyzing the resulting outputs.

Amazon Bedrock and Amazon Titan

What are the key elements of training a foundation model?

They include pre-training, fine-tuning, and continuous pre-training.

With _____, you train the LLM by using huge amounts of unstructured data with self-supervised learning.

pre-training

_____ is a process that extends the training of the model to improve the generation of completions for a specific task. It is a supervised learning process in which you use a dataset of labeled examples to update the weights of the LLM. It helps to adapt foundation models to your custom datasets and use cases.

Fine-tuning

_____ happens when the whole fine-tuning process modifies the weights of the original LLM. This can improve the performance of the single task fine-tuning, but it can degrade performance on other tasks.

Catastrophic forgetting

Load the model parameters and add memory for the optimizer, gradients, forward activations, and temporary memory.

How to train and tune a foundation model

_____ is a process and set of techniques that freeze or preserve the parameters and weights of the original LLM and fine-tune or train a small number of task-specific adaptor layers and parameters. It reduces the compute and memory that's needed because it's fine-tuning a small set of model parameters.

Parameter-efficient fine-tuning, PEFT

_____ is a popular PEFT technique that also preserves or freezes the original weights of the foundation model and injects new trainable low-rank matrices into each layer of a transformer architecture.

Low-rank adaptation or LoRA
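
To see why low-rank matrices reduce the compute and memory needed, it helps to count parameters. The sketch below assumes the usual LoRA formulation, in which a frozen weight matrix W of shape d_out x d_in is adjusted by the product of two small trainable matrices B (d_out x r) and A (r x d_in). The function names and the layer size are illustrative assumptions.

```python
def full_params(d_out, d_in):
    # Parameters updated when the whole weight matrix is fine-tuned.
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    # Parameters in the two low-rank factors B (d_out x r) and A (r x d_in).
    return rank * (d_out + d_in)

def matmul(a, b):
    # Plain-Python matrix product, enough for a tiny demonstration.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def effective_weight(w, b, a, scale=1.0):
    # W stays frozen; only the low-rank update B @ A is learned.
    delta = matmul(b, a)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

# Hypothetical layer size chosen for illustration.
d_out = d_in = 4096
ratio = full_params(d_out, d_in) // lora_params(d_out, d_in, rank=8)

# Tiny rank-1 example: a 2 x 2 identity weight plus B @ A.
merged = effective_weight([[1, 0], [0, 1]], [[1], [2]], [[3, 4]])
```

With these assumed sizes, the low-rank factors hold 65,536 trainable values against 16,777,216 in the full matrix, a 256-fold reduction for that layer.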

PEFT and LoRA modify the _____ of your model, but not the representations.

weights

_____ encode semantic information similar to embeddings.

Representations

_____ is a fine-tuning process that freezes the base model and learns task-specific interventions on hidden representations.

Representation fine-tuning, ReFT

The _____ says that concepts are encoded in linear subspaces of representation in a neural network.

linear representation hypothesis

_____ is an extension of fine-tuning a single task. This requires a lot of data. For this process, the training dataset has examples of inputs and outputs for multiple tasks.

Multitask fine-tuning

_____ gives you the ability to use the pre-trained foundation models and adapt them to specific tasks by using limited domain-specific data. You can use this to help your model work with domain-specific language such as industry jargon, technical terms, or other specialized data.

Domain adaptation fine-tuning

_____ provides the capability to fine-tune a large language model, particularly a text generation model, on a domain-specific dataset so you can improve the performance of your model and help it better understand human-like prompts to generate human-like responses.

Amazon SageMaker JumpStart

During _____, you select prompts from your training dataset and pass them to the LLM to generate completions. Then, compare the distribution of completions, and the training label, to calculate a loss between the two token distributions, which you can use to update your model's weights so the model's performance on the task improves.

fine-tuning

You can define separate evaluation steps to measure your LLM's performance by using the _____, which gives you the validation accuracy. After you've completed your fine-tuning, you can perform a final performance evaluation by using the holdout test dataset, and that last result gives you the test accuracy.

holdout validation dataset

In machine learning, _____ is the collecting, pre-processing, and organizing of raw data for your model.

data preparation

If you have low-code data preparation, you can use _____ to create data flows that define your ML data pre-processing.

Amazon SageMaker Canvas

If you have _____ that needs to scale, you can use open source frameworks such as Apache Spark, Apache Hive, or Presto.

data preparation

If you need to use structured query language (SQL) in SageMaker Studio for data preparation, you can use _____.

JupyterLab

If you have data preparation for feature discovery and storage, you can use _____ to search, discover, and retrieve features for model training. You can also use it to provide a centralized repository to store feature data in a standardized format.

Amazon SageMaker Feature Store

You can use _____ to analyze your data and detect potential biases across multiple facets, which can help you detect whether your training data contains imbalanced representations or labeling biases between groups such as gender, race, or age.

Amazon SageMaker Clarify

If you have data that needs to be labeled, you can use _____ to manage the data labeling workflows for your training datasets.

SageMaker Ground Truth

T/F: The output of generative AI models is non-deterministic, which makes validation more difficult.

True

1. Improve application performance by reducing the size of the LLMs. This action can reduce the inference latency because the smaller size model loads more quickly. However, remember that reducing the size of the model might decrease its performance. 2. Make a more concise prompt, reducing the size of the retrieved snippets and their number, and reducing generation through inference parameters and prompt.

optimization techniques

_____ is a set of metrics and a software package. It is used to evaluate automatic summarization tasks and machine translation software in natural language processing. It evaluates how well the input compares to the generated output.

Recall-Oriented Understudy for Gisting Evaluation, or ROUGE
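
ROUGE comes in several variants. As a minimal sketch of the simplest one, ROUGE-1, the function below counts overlapping unigrams between a reference summary and a generated one. It is a simplified illustration with an invented function name, not the official ROUGE software package, and it skips stemming and other refinements.

```python
import re
from collections import Counter

def rouge1(reference, candidate):
    """Unigram-overlap recall, precision, and F1 (simplified ROUGE-1)."""
    ref = Counter(re.findall(r"\w+", reference.lower()))
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((ref & cand).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"recall": recall, "precision": precision, "f1": f1}

# Hypothetical reference and generated summaries.
result = rouge1("the cat sat on the mat", "the cat lay on the mat")
```

Recall here answers the question "how much of the reference appears in the generated text", which is why the metric is described as recall-oriented.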

_____ is an algorithm that is used for translation tasks. It evaluates the quality of text which has been machine translated from one natural language to another.

Bilingual Evaluation Understudy, or BLEU

_____ was created to help the development of models that can generalize across multiple tasks. It is a collection of natural language tasks, such as sentiment analysis and question answering. You can use these tasks to evaluate and compare model performance across a set of language tasks. Then, you can use the benchmark to measure and compare the model performance.

GLUE

_____ was introduced in 2019 and adds additional tasks, such as multi-sentence reasoning and reading comprehension.

SuperGLUE

_____ evaluates the knowledge and problem-solving capabilities of the model.

Massive Multitask Language Understanding, MMLU

_____ focuses on tasks that are beyond the capabilities of the current language models. It contains tasks such as math, biology, physics, bias, linguistics, reasoning, childhood development, software development, and more.

The Beyond the Imitation Game Benchmark, BIG-bench

Another benchmark is the _____ which is a benchmark to help improve model transparency. It offers users guidance on which model performs well for a given task. This is a combination of metrics for tasks such as summarization, question and answer, sentiment analysis, and bias detection.

Holistic Evaluation of Language Models, HELM

You can also use _____ to manually evaluate your model responses. For example, you can use these to compare the responses of SageMaker JumpStart models, and you can also specify responses from models outside AWS.

human workers

You can use _____ to evaluate LLMs and create model evaluation jobs. A model evaluation job helps to evaluate and compare model quality and metrics for text-based foundation models from SageMaker JumpStart.

Amazon SageMaker Clarify

_____ provides an evaluation module that can automatically compare generated responses and calculate a semantic similarity-based score, BERTScore, against a human reference. It is suitable for evaluating faithfulness and hallucinations in text-generation tasks.

Amazon Bedrock

_____ helps with the challenge of internal knowledge of models being outdated. If your model is outdated, then it'll not know newer information. This helps by providing a context, which helps to avoid hallucinations and improve factuality by grounding responses.

RAG

1. Use an orchestration library to configure and manage the passing of user input to the large language model and the return of completions. 2. RAG helps your model access additional external data at inference times. And the additional external data can help improve the relevance and accuracy of completions. 3. RAG helps overcome the outdated knowledge issue if your model uses older information.

Actions to give you more configurations to connect your LLM to external components and integrate deployment within your application

1. define business goals 2. determine metrics then measure, monitor, review them 3. make sure models interact well w/ your other systems (have to interact in real time using APIs and interfaces) 4. choose the large language models to use with your application and the appropriate infrastructure for your inference needs (remember storage) 5. consider additional tools/frameworks for LLMs 6. user interface (like website/rest API) to consume app

Primary AI considerations

The _____ provides the compute, storage, and network to serve and host your LLMs and to host your application components. For this layer, you must ensure that your data is being handled securely across the AI lifecycle for data preparation, training, and inferencing.

infrastructure layer

Cost, modality, latency, multilingual, model size, model complexity, customization, input, and output length

selection criteria to choose pre-trained models

The duration it takes a model to process data and produce a prediction.

inference speed

When does KNN perform most of its computational work?

during the inference stage
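
A bare-bones k-nearest neighbors classifier makes the point concrete: `fit` only stores the data, and all of the distance calculations happen in `predict_one`. The `TinyKNN` class and the sample points are invented for illustration.

```python
import math
from collections import Counter

class TinyKNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, points, labels):
        # "Training" is just memorizing the dataset; no computation yet.
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict_one(self, query):
        # Inference does the real work: a distance to every stored point.
        distances = sorted(
            (math.dist(point, query), label)
            for point, label in zip(self.points, self.labels)
        )
        nearest = [label for _, label in distances[: self.k]]
        # Majority vote among the k closest neighbors.
        return Counter(nearest).most_common(1)[0][0]

model = TinyKNN(k=3).fit(
    [(0, 0), (0, 1), (5, 5), (6, 5)],
    ["a", "a", "b", "b"],
)
```

Because every prediction scans the stored data, inference time grows with the size of the training set, which matters when latency is a selection criterion.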

_____ combine several models to achieve better performance than a single model

ensemble methods
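
One of the simplest ensemble methods is majority voting over the predictions of several classifiers. The sketch below assumes each model has already produced a list of labels for the same samples. The `majority_vote` helper and the sample predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote."""
    # zip(*predictions) groups the votes that belong to the same sample.
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions)]

# Three hypothetical models, each predicting labels for three samples.
combined = majority_vote([
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
])
```

Each model is wrong on a different sample here, so the vote can still recover a consistent answer for all three.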

The complexity of the model can be measured by _____.

the number of parameters, layers, and operations

Measures how well the model can locate and classify multiple objects in an image.

MAP

T/F: Accuracy is not recommended with datasets that are not evenly distributed or imbalanced.

True
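
A small worked example shows why. The data below is made up: 5 positive cases out of 100, and a model that always predicts the negative class.

```python
def accuracy(y_true, y_pred):
    # Share of predictions that match the true label.
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def recall(y_true, y_pred, positive=1):
    # Share of actual positives that the model found.
    found = sum(1 for t, p in zip(y_true, y_pred)
                if t == positive and p == positive)
    actual = sum(1 for t in y_true if t == positive)
    return found / actual if actual else 0.0

# Imbalanced dataset: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A useless model that never predicts the positive class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
```

The model scores 95% accuracy while finding none of the positive cases, so recall, precision, or F1 gives a more honest picture on data like this.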

How does explainability attempt to explain a black box foundation model?

by approximating it locally with a simpler model that is interpretable

What helps control the behavior and output characteristics of the foundation models?

Inference parameters

What gives you the ability to run inference in the foundation model you choose?

Bedrock

A set of values that can be adjusted to limit or influence the model response.

inference parameters

Why do you consider different parameter settings?

diversity, coherence, and resource efficiency

Guide LLMs to generate an appropriate response or output for a given task or instruction.

prompts

Adding contextual data from your internal databases to enrich the prompt, and integrating additional domain-specific data from these data stores or vector data stores that add to your prompts semantically relevant inputs.

RAG

A collection of data stored as mathematical representations. It stores structured and unstructured data, such as text or images, together with their vector embeddings.

Vector database

_____ are a way to convert words and sentences and other data into numbers. They are the numerical representation of that data that represents the meaning and relationships.

Vector embeddings

T/F: A machine learning model is a prerequisite to create a vector database and the indexing technology itself

True

_____ are the factual reference of foundation model based applications, helping the model retrieve trustworthy data. Foundation models use _____ as an external data source to improve their capabilities by enhancing search recommendations and text generation use cases.

Vector databases

Add additional capabilities for efficient and fast lookup, and to provide data management, fault tolerance, authentication, and access control and query engine.

Vector database

T/F: RAG is a technique in which the retrieval of information from data sources augments the generation of model responses.

True

RAG solves _____ by complementing generative LLMs with an external knowledge base that is typically built using a vector database, hydrated with vector-encoded knowledge articles.

hallucinations

_____ can automatically break down tasks and generate the required orchestration logic or write custom code. They can securely connect to your databases through APIs, ingest and structure the data for machine consumption, and augment it with contextual details to produce more accurate responses and fulfill requests.

Agents

An additional piece of software that orchestrates the prompt completion workflows and interactions between the user requests, foundation model, and external data sources or applications.

Agent

Use this to break down the reasoning process into intermediate steps.

Chain-of-thought prompting

Prompt tuning

The encoded knowledge of language in a large language model. It's the stored patterns of data that capture relationships and, when prompted, reconstruct language from those patterns.

Latent space

An understanding of patterns that the model can use to generate new outputs, and it's a statistical database.

latent space

RefinedWeb, Common Crawl, StarCoder data, BookCorpus, Wikipedia, C4

examples of large text databases that train models

T/F: A model that doesn't know the exact specifics of a prompt because the knowledge isn't in its latent space will choose the closest match.

True

What are the key elements of training a foundation model?

pre-training, fine-tuning, and continuous pre-training

A supervised learning process where you use a dataset of labeled examples to update the weights of the LLM. Also helps to adapt foundation models to your custom datasets and use cases.

Fine-tuning

Uses labeled examples for performance improvements on specific tasks.

Instruction based fine-tuning

Every parameter in the model is updated through supervised learning

Full fine-tuning

_____ says that concepts are encoded in linear subspaces of representation in a neural network.

The linear representation hypothesis

An extension of fine-tuning a single task that requires a lot of data. For this process, the training dataset has examples of inputs and outputs for multiple tasks, which produces an instruction tuned model that has learned how to complete many different tasks simultaneously.

Multitask fine-tuning

How do you avoid catastrophic forgetting?

Calculate losses from training dataset examples of inputs and outputs to update the weights of the model.

What fine-tuning process modifies the weights of the model to adapt to domain-specific data?

Domain adaptation fine-tuning

Gives you the ability to use the pre-trained foundation models and adapt them to specific tasks by using limited domain-specific data. You can use this to help your model work with domain-specific language such as industry jargon, technical terms, or other specialized data.

Domain adaptation fine-tuning

What provides the capability to fine-tune a large language model, particularly a text generation model, on a domain-specific dataset?

Amazon SageMaker JumpStart

After your instruction dataset is ready, you can divide the dataset into _____, _____, and _____ splits.

training, validation, and test

fine-tuning

1. After your instruction dataset is ready, you can divide the dataset into training, validation, and test splits. 2. During fine-tuning, you select prompts from your training dataset and pass them to the LLM to generate completions. 3. Then, compare the distribution of completions with the training label to calculate a loss between the two token distributions. 4. You can use the calculated loss to update your model's weights. 5. After many batches of prompt completion pairs, update the weights so the model's performance on the task improves.

How to prepare training data
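
The first step, dividing the dataset, can be sketched as follows. The 80/10/10 proportions, the fixed seed, and the `split_dataset` name are assumptions chosen for this example. The source does not prescribe particular ratios.

```python
import random

def split_dataset(examples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle once, then cut into training, validation, and test splits."""
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    # Whatever remains is held out for the final test evaluation.
    test = shuffled[n_train + n_val:]
    return train, validation, test

# Hypothetical instruction dataset of 100 prompt-completion pairs.
pairs = [(f"prompt {i}", f"completion {i}") for i in range(100)]
train, validation, test = split_dataset(pairs)
```

The validation split is used during fine-tuning to track validation accuracy, and the test split is touched only once at the end to report test accuracy.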

As in standard supervised learning, you can define separate evaluation steps to measure your LLM's performance, by using the _____.

holdout validation dataset

After you've completed your fine-tuning, you can perform a final performance evaluation by using the holdout test dataset. This last result will give you the _____.

test accuracy

The collecting, pre-processing, and organizing of your raw data for your model.

Data preparation

One optimization technique is to improve application performance by _____.

reducing the size of the LLMs.

Making a more concise prompt, reducing the size of the retrieved snippets and their number, and reducing generation through inference parameters and prompt.

Optimization technique

General Language Understanding Evaluation, GLUE, Holistic Evaluation of Language Models, HELM, Massive Multitask Language Understanding, MMLU, and Beyond the Imitation Game Benchmark, BIG-bench.

Use to evaluate and compare LLMs without a task-specific focus

What does SuperGLUE add to GLUE?

multi-sentence reasoning and reading comprehension

Evaluates the knowledge and problem-solving capabilities of the model.

Massive Multitask Language Understanding, MMLU

_____ is a benchmark to help improve model transparency. It offers users guidance on which model performs well for a given task. This is a combination of metrics for tasks such as summarization, question and answer, sentiment analysis, and bias detection.

Holistic Evaluation of Language Models, HELM

You can use _____ to configure and manage the passing of user input to the large language model and the return of completions.

an orchestration library

The _____ layer provides the compute, storage, and network to serve and host your LLMs and to host your application components.

infrastructure

Key components to build end-to-end solutions for your application:

1. Infrastructure layer 2. Choose LLM 3. Additional tools/frameworks 4. User interface

What are the categories of inference parameters?

Randomness and diversity; length

Under randomness and diversity, what is the temperature parameter?

It influences how creative or predictable the model's output will be.

The number of most likely candidates that the model considers for the next token.

Top K inference parameter

The percentage of most likely candidates that the model considers for the next token.

Top P inference parameter
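
The three randomness and diversity parameters can be illustrated on a toy next-token distribution. The sketch below is an assumption-laden illustration, not how Amazon Bedrock implements them: temperature rescales logits before the softmax, Top K keeps a fixed number of candidates, and Top P is interpreted here in the common "nucleus" sense of keeping the smallest set of candidates whose cumulative probability reaches p. All function names are invented.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def renormalize(probs, keep):
    # Rescale the surviving candidates so they sum to 1 again.
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_k_filter(probs, k):
    # Keep only the k most likely token indices.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])

def top_p_filter(probs, p):
    # Keep the most likely tokens until their cumulative mass reaches p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, keep)

# Hypothetical next-token probabilities for a four-token vocabulary.
probs = [0.5, 0.3, 0.15, 0.05]
after_top_k = top_k_filter(probs, k=2)
after_top_p = top_p_filter(probs, p=0.9)
```

With these numbers, Top K of 2 leaves two candidates and Top P of 0.9 leaves three. The model then samples the next token only from the candidates that remain.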

The prompt includes a question with several possible choices for the answer, and the model must respond with the correct choice. An example of a classification use case is sentiment analysis: The input is a text passage, and the model must classify the sentiment of the text, such as whether it's positive or negative, harmless or toxic.

classification for prompt engineering use cases

The model must answer the question with its internal knowledge without any context or document.

question-answer without context for prompt engineering use cases

The user provides an input text with a question, and the model must answer the question based on information provided within the input text.

question-answer with context for prompt engineering use cases

The prompt is a passage of text, and the model must respond with a shorter passage that captures the main points of the input.

summarization for prompt engineering use cases

Given a prompt, the model must respond with a passage of original text that matches the description. This also includes the generation of creative text such as stories, poems, or movie scripts.

open-ended text generation for prompt engineering use cases

The model must generate code based on user specifications. For example, a prompt could request text-to-SQL or text-to-Python code generation.

code generation for prompt engineering use cases

The input describes a problem that requires mathematical reasoning at some level, which might be numerical, logical, geometric, or otherwise.

mathematics for prompt engineering use cases

The model must make a series of logical deductions.

reasoning or logical thinking for prompt engineering use cases

What Amazon SageMaker feature can help you harness the power of human feedback across the ML lifecycle to improve the accuracy and relevancy of models?

SageMaker Ground Truth

It is the collecting, preprocessing, and organizing of raw data for your model.

data preparation

Tasks that FMs can perform include:

language processing, visual comprehension, code generation, and human-centered engagement

Released in 2018, _____ was one of the first foundation models. This is a bidirectional model that analyzes the context of a complete sequence then makes a prediction. It was trained on a plain text corpus and Wikipedia using 3.3 billion tokens (words) and 340 million parameters. It can answer questions, predict sentences, and translate texts.

Bidirectional Encoder Representations from Transformers (BERT)

The _____ model was developed by OpenAI in 2018. It uses a 12-layer transformer decoder with a self-attention mechanism, and it was trained on the BookCorpus dataset, which holds over 11,000 free novels. A notable feature of version 1 is the ability to do zero-shot learning. The second version was released in 2019 with 1.5 billion parameters (compared to the 117 million parameters used previously). The third version has a 96-layer neural network and 175 billion parameters and was trained using the 500-billion-word Common Crawl dataset. The popular chatbot, launched in late 2022, is based on version 3.5. The fourth version launched in March 2023 and passed the Uniform Bar Examination with a score of 297 (76%).

Generative Pre-trained Transformer (GPT)

FMs that are pretrained on large datasets, making them powerful, general-purpose models. They can be used as is or customized privately with company-specific data for a particular task without annotating large volumes of data. Initially, the family will offer two models. The first is a generative LLM for tasks such as summarization, text generation, classification, open-ended Q&A, and information extraction. The second is an embeddings LLM that translates text inputs, including words, phrases, and large units of text, into numerical representations (known as embeddings) that contain the semantic meaning of the text. While this LLM will not generate text, it is useful for applications like personalization and search because comparing embeddings produces more relevant and contextual responses than word matching. To continue supporting best practices in the responsible use of AI, these FMs are built to detect and remove harmful content in the data, reject inappropriate content in the user input, and filter model outputs that contain inappropriate content such as hate speech, profanity, and violence.

Amazon Titan
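To make "comparing embeddings" concrete, here is a minimal sketch that uses cosine similarity, a common way to compare embedding vectors. The three-dimensional vectors below are invented for illustration; real embedding models return much longer vectors, and nothing here calls an actual embeddings model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, hand-written for this example only.
embeddings = {
    "running shoes": [0.9, 0.1, 0.0],
    "jogging sneakers": [0.8, 0.2, 0.1],
    "tax forms": [0.0, 0.1, 0.9],
}

query = embeddings["running shoes"]
for phrase, vector in embeddings.items():
    print(phrase, round(cosine_similarity(query, vector), 3))
```

The two footwear phrases share no words, yet their vectors point in nearly the same direction, which is why embedding comparison can surface matches that word matching would miss.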

A text-to-image model that can generate realistic-looking, high-definition images. It was released in 2022 and has a diffusion model that uses noising and denoising technologies to learn how to create images. The model is smaller than competing diffusion technologies, like DALL-E 2, which means it does not need an extensive computing infrastructure. Stable Diffusion will run on a normal graphics card or even on a smartphone with a Snapdragon Gen2 platform.

Stable Diffusion

_____ is a platform that offers open-source tools for you to build and deploy machine learning models. It acts as a community hub, and developers can share and explore models and datasets. Membership for individuals is free, although paid subscriptions offer higher levels of access. You have public access to nearly 200,000 models and 30,000 datasets.

Hugging Face

Infrastructure requirements, front-end development, lack of comprehension, unreliable answers, bias

Challenges with FMs

When running model inference, you can adjust _____ to influence the model response. They can change the pool of possible outputs that the model considers during generation, or they can limit the final response.

inference parameters

Affects the shape of the probability distribution for the predicted output and influences the likelihood of the model selecting lower-probability outputs.

Temperature
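One common way to apply temperature is to divide the model's raw scores (logits) by the temperature before the softmax. The sketch below assumes that formulation and uses made-up logits; it is an illustration, not the implementation of any specific model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.0]

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Lower temperatures sharpen the distribution toward the most likely token, and higher temperatures flatten it so that lower-probability tokens are selected more often.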

How do you control randomness and diversity in a model?

by limiting or adjusting the distribution

With _____, you can integrate proprietary information into your generative AI applications. When a query is made, a knowledge base searches your data to find relevant information to answer the query. The retrieved information can then be used to improve generated responses. You can build your own RAG-based application by using its capabilities.

Amazon Bedrock Knowledge Bases
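The retrieve-then-augment flow described on this card can be sketched without any AWS service. The example below assumes a tiny in-memory document list and a naive word-overlap score in place of a real knowledge base search; the function names and documents are hypothetical, and the final call to a model is omitted.

```python
def retrieve(query, documents, top_n=1):
    """Rank documents by how many query words they share (a stand-in for real search)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_n]


def build_augmented_prompt(query, passages):
    """Place the retrieved passages in the prompt so the model can ground its answer."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical proprietary documents, invented for illustration.
documents = [
    "Refunds are processed within 5 business days.",
    "Support is available Monday through Friday.",
]

query = "How long do refunds take?"
passages = retrieve(query, documents)
print(build_augmented_prompt(query, passages))
```

A production system would typically replace the word-overlap score with an embedding-based search and send the augmented prompt to a foundation model.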

_____ orchestrate interactions between foundation models (FMs), data sources, software applications, and user conversations. In addition, they automatically call APIs to take actions and invoke knowledge bases to supplement information for these actions. Developers can save weeks of development effort by integrating these to accelerate the delivery of generative artificial intelligence (generative AI) applications.

Agents

1. Extend foundation models to understand user requests and break down the tasks that the agent must perform into smaller steps. 2. Collect additional information from a user through natural conversation. 3. Take actions to fulfill a customer's request by making API calls to your company systems. 4. Augment performance and accuracy by querying data sources.

Agents

1. Create a knowledge base to store private data. 2. Configure the agent for your use case and add an action group or associate a knowledge base. 3. Modify prompt templates to customize behavior. 4. Test in Amazon Bedrock or through API calls. 5. Create an alias to point to a version of the agent. 6. Set up your application to make API calls to the agent. Iterate and create more versions and aliases as necessary.

How to use an agent

Domain 3: Application of Foundation Models 28% Flashcards

(178 cards)