AI Practice Test #1 (OLD) Flashcards
Amazon Bedrock
https://aws.amazon.com/bedrock/agents/
https://aws.amazon.com/bedrock/faqs/
https://docs.aws.amazon.com/bedrock/latest/userguide/general-guidelines-for-bedrock-users.html
Agents for Amazon Bedrock
Agents for Amazon Bedrock are fully managed capabilities that make it easier for developers to create generative AI-based applications that can complete complex tasks for a wide range of use cases and deliver up-to-date answers based on proprietary knowledge sources.
Agents are software components or entities designed to autonomously or semi-autonomously perform specific actions or tasks based on predefined rules or algorithms. With Amazon Bedrock, agents orchestrate and execute multi-step tasks by breaking a user request into a logical sequence of steps, calling the necessary APIs, and consulting knowledge sources. For example, you can create an agent that helps customers process insurance claims or an agent that helps customers make travel reservations. You don’t have to provision capacity, manage infrastructure, or write custom code. Amazon Bedrock manages prompt engineering, memory, monitoring, encryption, user permissions, and API invocation.
https://docs.aws.amazon.com/bedrock/latest/userguide/agents.html
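To make this concrete, below is a minimal sketch of invoking a Bedrock agent from Python with boto3. The agent ID, alias ID, and session ID are placeholders for resources you would create beforehand; the response arrives as an event stream of text chunks.

```python
import boto3

client = boto3.client("bedrock-agent-runtime")

response = client.invoke_agent(
    agentId="AGENT_ID",              # placeholder: your agent's ID
    agentAliasId="AGENT_ALIAS_ID",   # placeholder: a deployed alias of the agent
    sessionId="session-001",         # lets the agent keep conversation state
    inputText="Help me file an insurance claim for a broken windshield.",
)

# The completion is streamed back as chunks; concatenate them into one string.
completion = ""
for event in response["completion"]:
    chunk = event.get("chunk")
    if chunk:
        completion += chunk["bytes"].decode("utf-8")
print(completion)
```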
Knowledge Bases for Amazon Bedrock
With Knowledge Bases for Amazon Bedrock, you can give FMs and agents contextual information from your company’s private data sources for Retrieval Augmented Generation (RAG) to deliver more relevant, accurate, and customized responses. You cannot use Knowledge Bases for Amazon Bedrock for the given use case.
Watermark detection for Amazon Bedrock
The watermark detection mechanism allows you to identify images generated by Amazon Titan Image Generator, a foundation model that allows users to create realistic, studio-quality images in large volumes and at low cost, using natural language prompts. With watermark detection, you can increase transparency around AI-generated content by mitigating harmful content generation and reducing the spread of misinformation. You cannot use watermark detection for the given use case.
Guardrails for Amazon Bedrock
Guardrails for Amazon Bedrock help you implement safeguards for your generative AI applications based on your use cases and responsible AI policies. It helps control the interaction between users and FMs by filtering undesirable and harmful content, redacting personally identifiable information (PII), and enhancing content safety and privacy in generative AI applications. You cannot use Guardrails for Amazon Bedrock for the given use case.
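As an illustration, a guardrail created ahead of time can be attached to a model call at inference time. The sketch below uses the Bedrock Converse API; the guardrail identifier, version, and model ID are placeholders.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",   # example model ID
    messages=[{"role": "user", "content": [{"text": "Tell me about your refund policy."}]}],
    guardrailConfig={
        "guardrailIdentifier": "GUARDRAIL_ID",  # placeholder: your guardrail's ID
        "guardrailVersion": "1",                # placeholder: a published version
    },
)
print(response["output"]["message"]["content"][0]["text"])
```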
Generative AI
Generative AI can automate the creation of new data based on existing patterns, enhancing productivity and innovation
Generative AI in the AWS cloud environment is advantageous because it automates the creation of new data from existing patterns, which can significantly boost productivity and drive innovation. This capability allows businesses to generate new insights, designs, and solutions more efficiently.
via - https://aws.amazon.com/what-is/generative-ai/
Incorrect options:
Generative AI can replace all human roles in software development - Generative AI is not designed to replace all human roles in software development but to assist and enhance human capabilities by automating certain tasks and creating new data based on patterns. So, this option is incorrect.
Generative AI ensures 100% security against all cyber threats - While generative AI can improve security by identifying patterns and anomalies, it does not guarantee 100% security against all cyber threats. Security in the cloud involves a combination of multiple strategies and tools. Therefore, this option is incorrect.
Generative AI can perform all cloud maintenance tasks without any human intervention - Generative AI can assist in cloud maintenance tasks by predicting issues and suggesting solutions, but it cannot perform all maintenance tasks without human oversight and intervention. So, this option is not the right fit.
References:
https://aws.amazon.com/what-is/generative-ai/
https://aws.amazon.com/ai/generative-ai/services/
Prompt Engineering
https://aws.amazon.com/what-is/prompt-engineering/
Negative Prompting
Negative prompting refers to guiding a generative AI model to avoid certain outputs or behaviors when generating content. In the context of AWS generative AI, like those using Amazon Bedrock, negative prompting is used to refine and control the output of models by specifying what should not be included in the generated content.
Few-shot Prompting
In few-shot prompting, you provide a few examples of a task to the model to guide its output.
Chain-of-thought prompting
Chain-of-thought prompting is a technique that breaks down a complex question into smaller, logical parts that mimic a train of thought. Rather than answering the question directly, the model is guided through a step-by-step process and solves the problem in a series of intermediate steps, which enhances its reasoning ability and improves the quality and coherence of the output.
Zero-shot Prompting
Zero-shot prompting is a technique used in generative AI where the model is asked to perform a task or generate content without having seen any examples of that specific task during training. Instead, the model relies on its general understanding and knowledge to respond.
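To contrast these techniques, here are illustrative prompt strings (the wording is made up for the example):

```python
# Zero-shot: state the task with no examples; the model relies on general knowledge.
zero_shot = "Classify the sentiment of this review as positive or negative:\n'The battery died after two days.'"

# Few-shot: a handful of labeled examples show the model the expected pattern.
few_shot = """Review: 'Great sound quality.' -> positive
Review: 'Stopped working in a week.' -> negative
Review: 'The battery died after two days.' ->"""

# Negative prompting: explicitly tell the model what to avoid in its output.
negative = "Describe our new headphones. Do not mention price or competitor brands."

# Chain-of-thought: ask for intermediate reasoning steps before the answer.
chain_of_thought = "A store sold 23 units on Monday and twice as many on Tuesday. How many units in total? Think step by step."
```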
GPT
Generative Pre-trained Transformer
The company should use GPT (Generative Pre-trained Transformer) to interpret natural language inputs and generate coherent outputs, such as SQL queries, by leveraging its understanding of language patterns and structures
This is the correct option because GPT models are specifically designed to process and generate human-like text based on context and input data. GPT can be fine-tuned to understand specific domain language and generate accurate SQL queries from plain text input. It uses advanced natural language processing (NLP) techniques to parse input text, understand user intent, and generate the appropriate SQL statements, making it highly suitable for the task.
https://aws.amazon.com/what-is/gpt/
GAN
Generative Adversarial Network
A generative adversarial network (GAN) is a deep learning architecture. It trains two neural networks to compete against each other to generate more authentic new data from a given training dataset. For instance, you can generate new images from an existing image database or original music from a database of songs. A GAN is called adversarial because it trains two different networks and pits them against each other. One network generates new data by taking an input data sample and modifying it as much as possible. The other network tries to predict whether the generated data output belongs in the original dataset. In other words, the predicting network determines whether the generated data is fake or real. The system generates newer, improved versions of fake data values until the predicting network can no longer distinguish fake from original.
via - https://aws.amazon.com/what-is/gan/
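For intuition, here is a minimal, self-contained sketch of the adversarial loop in PyTorch, using a toy task (learning to generate 1-D samples from a normal distribution); it is an illustration of the idea, not a production GAN.

```python
import torch
import torch.nn as nn

# Toy GAN: the generator learns to produce samples resembling N(4, 1.25).
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator
loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(2000):
    real = 4 + 1.25 * torch.randn(32, 1)   # samples from the "true" dataset
    fake = G(torch.randn(32, 8))           # generator output from random noise

    # Train the discriminator: label real data 1 and generated data 0.
    d_loss = loss_fn(D(real), torch.ones(32, 1)) + loss_fn(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Train the generator: try to make the discriminator believe fakes are real.
    g_loss = loss_fn(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

The two networks improve together: as the discriminator gets better at spotting fakes, the generator is pushed to produce more realistic samples.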
Amazon Comprehend
Amazon Comprehend is built for analyzing and extracting insights from text, such as identifying sentiment, entities, and key phrases. It does not have the capability to generate SQL queries from natural language input. Therefore, it does not meet the company’s need for text-to-SQL conversion.
ResNet
Residual Neural Network
ResNet is a deep neural network architecture used mainly in computer vision tasks, such as image classification and object detection. It is not capable of handling natural language input or generating text-based outputs like SQL queries, making it irrelevant to the company’s needs.
WaveNet
WaveNet is a deep generative model created by DeepMind to synthesize audio data, particularly for generating realistic-sounding speech. It is not built to handle text input or produce SQL queries, making it completely unsuitable for this task.
Amazon SageMaker Data Wrangler - Use Case
Fix bias by balancing the dataset
When the number of samples in the majority class (bigger) is considerably larger than the number of samples in the minority (smaller) class, the dataset is considered imbalanced. This skew is challenging for ML algorithms and classifiers because the training process tends to be biased towards the majority class. Data Wrangler supports several balancing operators as part of the Balance data transform.
Incorrect options:
Monitor the quality of a model - This option is incorrect because monitoring model quality is a feature of SageMaker Model Monitor, not SageMaker Data Wrangler. SageMaker Model Monitor is designed to track model quality as well as performance in production.
Build ML models with no code - SageMaker Data Wrangler is not designed for building machine learning models without coding. SageMaker Canvas, another tool in the SageMaker suite, specifically targets no-code model building, allowing users to create and deploy models using a visual interface.
Store and share the features used for model development - SageMaker Feature Store is specifically designed to store and share machine learning features. It allows data scientists and engineers to create a centralized, consistent, and standardized set of features that can be easily accessed and reused across different teams and projects, making it the ideal choice for sharing features during model development. SageMaker Data Wrangler is not designed for this use case.
Reference:
https://aws.amazon.com/blogs/machine-learning/balance-your-data-for-machine-learning-with-amazon-sagemaker-data-wrangler/
Amazon SageMaker Data Wrangler
You can split a machine learning (ML) dataset into train, test, and validation datasets with Amazon SageMaker Data Wrangler.
Data used for ML is typically split into the following datasets:
Training – Used to train an algorithm or ML model. The model iteratively uses the data and learns to provide the desired result.
Validation – Introduces new data to the trained model. You can use a validation set to periodically measure model performance as it trains and also tune any hyperparameters of the model. However, validation datasets are optional.
Test – Used on the final trained model to assess its performance on unseen data. This helps determine how well the model generalizes.
Data Wrangler is a capability of Amazon SageMaker that helps data scientists and data engineers quickly and easily prepare data for ML applications using a visual interface. It contains over 300 built-in data transformations so you can quickly normalize, transform, and combine features without writing code.
References:
https://aws.amazon.com/blogs/machine-learning/create-train-test-and-validation-splits-on-your-data-for-machine-learning-with-amazon-sagemaker-data-wrangler/
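Data Wrangler performs this split through its visual interface; for intuition, the equivalent three-way split in code (with toy data) might look like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 5)           # toy feature matrix
y = np.random.randint(0, 2, 1000)     # toy binary labels

# 70% train, then split the remaining 30% evenly into validation and test.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.50, random_state=42)
```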
Amazon SageMaker Clarify
SageMaker Clarify is used to evaluate models and explain the model predictions.
Amazon SageMaker Feature Store
Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models.
https://aws.amazon.com/sagemaker/feature-store/
Amazon SageMaker Ground Truth
Amazon SageMaker Ground Truth is a data labeling service provided by AWS that enables users to build highly accurate training datasets for machine learning quickly. The service helps automate the data labeling process through a combination of human labeling and machine learning.
https://aws.amazon.com/sagemaker/groundtruth/
Context window
The context window defines how much text (measured in tokens) the AI model can process at one time to generate a coherent output. It determines the limit of input data that the model can use to understand context, maintain conversation history, or generate relevant responses. The context window is measured in tokens (units of text), not characters, making it the key concept for understanding data processing limits in AI models.
via - https://aws.amazon.com/blogs/security/context-window-overflow-breaking-the-barrier/
Character count
Character count measures the number of characters in a piece of text, but AI models typically do not limit their input based on characters alone. Instead, they rely on tokens, which can represent words, subwords, or punctuation marks. The concept that defines how much text can be processed at one time is the context window, which is measured in tokens, not character count.
Tokens
While tokens are the individual units of text that the model processes, the concept that describes the total amount of text the model can handle at one time is the context window, not tokens themselves. Tokens are components within the context window, and the model’s capacity is defined by how many tokens can fit within this window, rather than just the tokens themselves.
Embeddings
Embeddings are vector representations that encode the semantic meaning of words or phrases, enabling the AI model to understand relationships and context in text data. However, embeddings do not define the amount of text or the number of characters considered at one time; they are a representation technique used within the model once the text is processed.
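As a concrete example, below is a minimal sketch of generating an embedding with Amazon Titan Text Embeddings on Bedrock; the model returns a dense vector that captures the text's semantic meaning.

```python
import boto3
import json

bedrock = boto3.client("bedrock-runtime")

body = json.dumps({"inputText": "The quick brown fox"})
response = bedrock.invoke_model(modelId="amazon.titan-embed-text-v1", body=body)

embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))  # dimensionality of the vector (e.g., 1536 for this model)
```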
Amazon SageMaker Model Dashboard
Amazon SageMaker Model Dashboard is a centralized repository of all models created in your account. The models are generally the outputs of SageMaker training jobs, but you can also import models trained elsewhere and host them on SageMaker. Model Dashboard provides a single interface for IT administrators, model risk managers, and business leaders to track all deployed models and aggregate data from multiple AWS services to provide indicators about how your models are performing.
Model risk managers, ML practitioners, data scientists, and business leaders can get a comprehensive overview of models using the Model Dashboard. The dashboard aggregates and displays data from Amazon SageMaker Model Cards, Endpoints, and Model Monitor services to display valuable information such as model metadata from the model card and model registry, endpoints where the models are deployed, and insights from model monitoring.
https://docs.aws.amazon.com/sagemaker/latest/dg/model-dashboard-faqs.html
Amazon SageMaker JumpStart
Amazon SageMaker JumpStart is a machine learning (ML) hub that can help you accelerate your ML journey. With SageMaker JumpStart, you can evaluate, compare, and select Foundation Models (FMs) quickly based on pre-defined quality and responsibility metrics to perform tasks like article summarization and image generation. Pretrained models are fully customizable for your use case with your data, and you can easily deploy them into production with the user interface or SDK.
Amazon SageMaker Feature Store
Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. For example, in an application that recommends a music playlist, features could include song ratings, listening duration, and listener demographics.
Amazon SageMaker Data Wrangler
Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare tabular and image data for ML from weeks to minutes. With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow (including data selection, cleansing, exploration, visualization, and processing at scale) from a single visual interface.
SageMaker Data Wrangler is a tool designed for data preparation and feature engineering in the machine learning pipeline. It allows users to clean, transform, and process data but does not offer features for creating interactive visualizations or dashboards. Therefore, it is not suitable for the company’s need to visualize sales data for business intelligence purposes.
https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-analyses.html
Amazon QuickSight
Amazon QuickSight, a business intelligence (BI) service that allows users to easily create and share interactive dashboards and visualizations from various data sources, including up-to-date sales data, enabling real-time insights and reporting
This is the correct option because Amazon QuickSight is specifically designed for creating interactive visualizations and dashboards for a wide range of data sources, including sales data. It provides an easy-to-use interface for business intelligence tasks, enabling the company to quickly generate insights and monitor trends. QuickSight also supports real-time data analysis, making it ideal for up-to-date reporting on sales performance over the last 12 months.
https://docs.aws.amazon.com/quicksight/latest/user/working-with-visual-types.html
CloudWatch Dashboard
CloudWatch Dashboards are primarily used for monitoring AWS infrastructure and services, such as server metrics, application logs, and performance monitoring. It is not designed for creating visualizations or dashboards for sales data or other business metrics, and therefore, does not meet the company’s requirement for business intelligence and reporting.
SageMaker Canvas
SageMaker Canvas is focused on enabling users to build and deploy machine learning models without coding. It is not a tool for data visualization or creating business dashboards. While it can help with data analysis through machine learning, it does not provide the capabilities required for creating interactive visualizations or dashboards for sales data.
Decision Trees
Decision Trees are highly interpretable models that provide a clear and straightforward visualization of the decision-making process. Decision Trees work by splitting the data into subsets based on the most significant features, resulting in a tree-like structure where each branch represents a decision rule. This makes it easy to understand how different characteristics of movies contribute to the final classification. Decision Trees therefore offer the high interpretability and transparency that align with the company’s need to document the inner mechanisms of how the model affects the output.
via - https://docs.aws.amazon.com/whitepapers/latest/model-explainability-aws-ai-ml/interpretability-versus-explainability.html
https://docs.aws.amazon.com/whitepapers/latest/model-explainability-aws-ai-ml/interpretability-versus-explainability.html
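To illustrate that interpretability, the sketch below trains a small tree on scikit-learn's built-in Iris dataset (standing in for movie features) and prints the learned decision rules as plain text.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# export_text renders the learned if/else decision rules -- the kind of
# documentation-friendly transparency described above.
print(export_text(tree, feature_names=load_iris().feature_names))
```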
Logistic Regression
Logistic Regression is primarily designed for binary classification problems. While it can be adapted for multiclass classification, it may not perform effectively with a large number of categories or a complex dataset like a massive movie database. Additionally, logistic regression does not provide an easily interpretable structure that illustrates how each feature influences the final output, making it less suitable for the company’s requirements.
Neural Networks
Although neural networks are powerful tools for handling large and complex datasets, they are often considered “black-box” models due to their lack of transparency. Neural networks involve multiple layers of neurons and nonlinear transformations, making it difficult to understand and document the inner workings of the model. Given the company’s need for transparency and an understanding of how the model affects the output, neural networks are not the best choice.
Support Vector Machines (SVMs)
While SVMs are effective for classification tasks, especially in high-dimensional spaces, they do not inherently provide an interpretable way to understand the decision-making process. SVMs create a hyperplane to separate classes, but it is not straightforward to explain how individual features impact the final classification. This lack of interpretability makes SVMs less suitable for a company that wants to document and understand the inner workings of the model.
Amazon OpenSearch Service
Amazon OpenSearch Service, which is designed to provide fast search capabilities and supports full-text search, indexing, and similarity scoring
Amazon OpenSearch Service is the most suitable choice because it is specifically built to handle search and analytics workloads, including fast index lookups and similarity scoring. OpenSearch supports full-text search, vector search, and advanced data indexing, which are essential for the Retrieval-Augmented Generation (RAG) framework. It enables the chatbot or model to quickly find and rank relevant documents based on their similarity to the query, making it highly effective for applications that require rapid data retrieval and relevance ranking.
via - https://aws.amazon.com/blogs/big-data/amazon-opensearch-services-vector-database-capabilities-explained/
Knowledge Bases for Amazon Bedrock
Knowledge Bases for Amazon Bedrock takes care of the entire ingestion workflow of converting your documents into embeddings (vector) and storing the embeddings in a specialized vector database. Knowledge Bases for Amazon Bedrock supports popular databases for vector storage, including vector engine for Amazon OpenSearch Serverless, Pinecone, Redis Enterprise Cloud, Amazon Aurora (coming soon), and MongoDB (coming soon). If you do not have an existing vector database, Amazon Bedrock creates an OpenSearch Serverless vector store for you.
https://aws.amazon.com/bedrock/knowledge-bases/
Amazon DocumentDB
Amazon DocumentDB (with MongoDB compatibility), a managed NoSQL document database service designed for storing semi-structured data to facilitate search capabilities
Amazon DocumentDB is primarily designed for storing and querying semi-structured JSON data. While it provides scalability and managed support for document-based workloads, it is not optimized for full-text search or similarity searches. DocumentDB lacks the native capabilities for efficient indexing and retrieval needed for RAG, making it a less suitable choice.
Amazon DynamoDB
Amazon DynamoDB, a fully managed NoSQL database service that offers low-latency data retrieval to handle fast index lookups as well as search operations
Amazon DynamoDB is a key-value and document database designed for fast and predictable performance with low latency, suitable for high-throughput transactional workloads. However, it does not natively support advanced search capabilities or similarity scoring needed for RAG applications. Its primary focus is on rapid data retrieval based on primary keys, not on the complex search and retrieval functions required for this scenario.
Amazon Aurora
Amazon Aurora, a managed relational database service that is optimized for high-performance transactional workloads that can be useful for search operations
Amazon Aurora is a high-performance relational database service that is excellent for OLTP (Online Transaction Processing) workloads. While it provides advanced indexing features for relational data, it is not optimized for full-text search, fast similarity lookups, or the types of search capabilities required for RAG applications. Aurora’s primary strengths lie in transactional integrity and scalability for relational datasets, not in search and retrieval tasks.
AWS Trainium instances
AWS Trainium instances are designed with energy efficiency in mind, providing optimal performance per watt for machine learning workloads. Trainium, AWS’s custom-designed machine learning chip, is specifically engineered to offer the best performance at the lowest power consumption, reducing the carbon footprint of training large-scale models. This makes Trainium instances the most environmentally friendly choice among the options listed. Trn1 instances powered by Trainium are up to 25% more energy efficient for DL training than comparable accelerated computing EC2 instances.
via - https://aws.amazon.com/machine-learning/trainium/
Accelerated Computing P type instances
Accelerated Computing P type instances, powered by high-end GPUs like NVIDIA Tesla, are optimized for maximum computational throughput, particularly for machine learning and HPC tasks. However, they consume significant amounts of power and are not specifically designed with energy efficiency in mind, making them less suitable for an environmentally conscious choice.
https://aws.amazon.com/ec2/instance-types/
Accelerated Computing G type instances
Accelerated Computing G type instances, such as those powered by NVIDIA GPUs, are designed for graphics-heavy applications like gaming, rendering, or video processing. While they offer high computational power for specific tasks, they are not specifically optimized for energy efficiency or low environmental impact, making them less suitable for a company focused on minimizing its carbon footprint.
https://aws.amazon.com/ec2/instance-types/
Compute Optimized C type instances
Compute Optimized C type instances are designed to maximize compute performance for applications such as web servers, gaming, and scientific modeling. While they provide excellent compute power, they are not optimized for energy efficiency in the same way as AWS Trainium instances, making them less ideal for reducing environmental impact.
https://aws.amazon.com/ec2/instance-types/
Amazon Polly
Amazon Polly is used to deploy high-quality, natural-sounding human voices in dozens of languages
Amazon Polly is a cloud service that converts text into lifelike speech. You can use Amazon Polly to develop applications that increase engagement and accessibility. Amazon Polly supports multiple languages and includes a variety of lifelike voices.
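A minimal text-to-speech call with boto3 might look like the sketch below; the output file name is arbitrary.

```python
import boto3

polly = boto3.client("polly")

response = polly.synthesize_speech(
    Text="Welcome to our store. How can I help you today?",
    OutputFormat="mp3",
    VoiceId="Joanna",   # one of Polly's built-in lifelike voices
)
with open("welcome.mp3", "wb") as f:
    f.write(response["AudioStream"].read())
```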
Amazon Comprehend
Amazon Comprehend service uses machine learning to find insights and relationships in the text
Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text, no machine learning experience is required. Amazon Comprehend uses machine learning to help you uncover the insights and relationships in your unstructured data.
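For example, a few boto3 calls cover the insights mentioned above (sentiment, entities, key phrases):

```python
import boto3

comprehend = boto3.client("comprehend")
text = "The new checkout flow is fantastic, but shipping took too long."

print(comprehend.detect_sentiment(Text=text, LanguageCode="en")["Sentiment"])
print(comprehend.detect_entities(Text=text, LanguageCode="en")["Entities"])
print(comprehend.detect_key_phrases(Text=text, LanguageCode="en")["KeyPhrases"])
```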
Amazon Transcribe
Amazon Transcribe uses machine learning models to convert speech to text.
Amazon Lex
Amazon Lex is the AWS service used to build conversational interfaces for applications using voice and text.
Amazon Rekognition
Amazon Rekognition is a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications.
Batch inference
The company should use batch inference, which allows it to run multiple inference requests in a single batch
You can use batch inference to run multiple inference requests asynchronously, and improve the performance of model inference on large datasets. Amazon Bedrock offers select foundation models (FMs) from leading AI providers like Anthropic, Meta, Mistral AI, and Amazon for batch inference at 50% of on-demand inference pricing.
Batch inference is the most cost-effective choice when reducing inference costs on Amazon Bedrock. By processing large numbers of data points in a single batch, the company can lower the cost per inference as the model handles multiple requests simultaneously. This approach is ideal when there is no need for immediate responses, allowing for more efficient use of resources and minimizing computational expenses.
https://docs.aws.amazon.com/bedrock/latest/userguide/inference.html
https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-bedrock-fms-batch-inference-50-price/
https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-deployment.html
The company should use batch inference, which processes multiple data points at once in large batches, suitable for processing large datasets in a single operation when immediate real-time responses are not required
Batch inference is the most suitable choice for processing a large payload of several gigabytes with Amazon SageMaker when there is no need for immediate responses. This method allows the company to run predictions on large volumes of data in a single batch job, which is more cost-effective and efficient than processing individual requests in real-time. Batch inference can handle large datasets and is ideal for scenarios where waiting for the responses is acceptable, making it the best fit for this use case.
SageMaker Batch Transform will automatically split your input file of several gigabytes (GBs) into whatever payload size is specified if you use "SplitType": "Line" and "BatchStrategy": "MultiRecord".
https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-deployment.html
https://repost.aws/questions/QUlefH1ni4QOaulUT4870D5g/sagemaker-batch-transform
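As an illustration, the SageMaker Python SDK sketch below runs a batch transform job with those settings; the model name, bucket, and paths are placeholders.

```python
from sagemaker.transformer import Transformer

transformer = Transformer(
    model_name="my-trained-model",            # placeholder: a model already in SageMaker
    instance_count=1,
    instance_type="ml.m5.xlarge",
    strategy="MultiRecord",                   # batch several records into each request
    output_path="s3://my-bucket/batch-output/",
)

# split_type="Line" splits the large input file line by line.
transformer.transform(
    data="s3://my-bucket/batch-input/records.jsonl",
    content_type="application/jsonlines",
    split_type="Line",
)
transformer.wait()
```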
Real-time inference
https://docs.aws.amazon.com/bedrock/latest/userguide/inference.html
Real-time inference is optimized for scenarios where low latency is crucial, and responses are needed immediately. It is not suitable for processing large payloads of several gigabytes.
Serverless inference
https://docs.aws.amazon.com/bedrock/latest/userguide/inference.html
Serverless inference provides automatic scaling and is ideal for unpredictable traffic patterns or sporadic workloads, but it is not specifically designed for handling large, continuous payloads efficiently.
On-demand inference
On-demand inference offers flexibility by charging only for the resources used during each inference, making it suitable for unpredictable or variable usage patterns. However, it is generally more costly when used frequently or over long periods because it does not benefit from cost savings associated with bulk processing. For a company looking to reduce costs, on-demand inference may not be the most economical option.
Asynchronous inference
Asynchronous inference is designed to handle longer-running tasks and large payloads by processing them in the background. This option is ideal for requests with large payload sizes (up to 1 GB), long processing times (up to one hour), and near real-time latency requirements. Asynchronous inference enables you to save on costs by autoscaling the instance count to zero when there are no requests to process, so you only pay when your endpoint is processing requests.
https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference.html
Top P
Influences the percentage of most-likely candidates that the model considers for the next token
Top P represents the percentage of most likely candidates that the model considers for the next token. Choose a lower value to decrease the size of the pool and limit the options to more likely outputs. Choose a higher value to increase the size of the pool and allow the model to consider less likely outputs.
via - https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html
In technical terms, the model computes the cumulative probability distribution for the set of responses and considers only the top P% of the distribution.
For example, if you choose a value of 0.8 for Top P, the model selects from the top 80% of the probability distribution of tokens that could be next in the sequence.
Stop sequences
The inference parameter Stop sequences specifies the sequences of characters that stop the model from generating further tokens. If the model generates a stop sequence that you specify, it will stop generating after that sequence.
Top K
The inference parameter Top K represents the number of most likely candidates that the model considers for the next token.
Choose a lower value to decrease the size of the pool and limit the options to more likely outputs.
Choose a higher value to increase the size of the pool and allow the model to consider less likely outputs.
For example, if you choose a value of 50 for Top K, the model selects from 50 of the most probable tokens that could be next in the sequence.
Temperature
The inference parameter Temperature is a value between 0 and 1, and it regulates the creativity of the model’s responses.
Affects the shape of the probability distribution for the predicted output and influences the likelihood of the model selecting lower-probability outputs.
Choose a lower value to influence the model to select higher-probability outputs.
Choose a higher value to influence the model to select lower-probability outputs.
In technical terms, the temperature modulates the probability mass function for the next token. A lower temperature steepens the function and leads to more deterministic responses, and a higher temperature flattens the function and leads to more random responses.
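To see where these parameters plug in, below is a minimal sketch using the Bedrock Converse API; the model ID and values are illustrative, and Top K (being model-specific) is passed through additionalModelRequestFields.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",   # example model ID
    messages=[{"role": "user", "content": [{"text": "Write a tagline for a coffee shop."}]}],
    inferenceConfig={
        "temperature": 0.2,          # lower = more deterministic output
        "topP": 0.9,                 # sample only from the top 90% of probability mass
        "maxTokens": 100,
        "stopSequences": ["\n\n"],   # stop generating at a blank line
    },
    additionalModelRequestFields={"top_k": 50},   # Anthropic's Top K parameter
)
print(response["output"]["message"]["content"][0]["text"])
```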
Linear regression
Linear regression refers to supervised learning models that, based on one or more inputs, predict a value from a continuous scale. An example of linear regression is predicting a house price. You could predict a house’s price based on its location, age, and number of rooms after you train a model on a set of historical sales training data with those variables.
Neural network
A neural network solution is a more complex supervised learning technique. To produce a given outcome, it takes some given inputs and performs one or more layers of mathematical transformation based on adjusting data weightings. An example of a neural network technique is predicting a digit from a handwritten image.
Document classification
Document classification is an example of semi-supervised learning. Semi-supervised learning is when you apply both supervised and unsupervised learning techniques to a common problem. This technique relies on using a small amount of labeled data and a large amount of unlabeled data to train systems. When applying categories to a large document base, there may be too many documents to physically label. For example, these could be countless reports, transcripts, or specifications. Training on the unlabeled data helps identify similar documents for labeling.
Association rule learning
This is an example of unsupervised learning. Association rule learning techniques uncover rule-based relationships between inputs in a dataset. For example, the Apriori algorithm conducts market basket analysis to identify rules like coffee and milk often being purchased together.
Clustering
Clustering is an unsupervised learning technique that groups certain data inputs, so they may be categorized as a whole. There are various types of clustering algorithms depending on the input data. An example of clustering is identifying different types of network traffic to predict potential security incidents.
Benchmark datasets
Benchmark datasets are the most suitable option for evaluating an LLM for bias and discrimination with the least administrative effort. These datasets are specifically designed and curated to include a variety of scenarios that test for potential biases in model outputs. They are pre-existing and standardized, meaning that the company does not need to spend time or resources creating or manually curating data. Using these datasets allows for a quick, cost-effective, and consistent evaluation of model fairness across different contexts.
https://docs.aws.amazon.com/bedrock/latest/userguide/model-evaluation-prompt-datasets-builtin.html
https://docs.aws.amazon.com/bedrock/latest/userguide/model-evaluation-prompt-datasets.html
Human-monitored benchmarking
Human-monitored benchmarking involves a team of human reviewers who manually assess the model’s outputs for bias. While this approach can provide nuanced feedback, it requires substantial administrative effort to coordinate, train, and manage human reviewers. It is labor-intensive and costly, and the potential for human error or subjective judgment may lead to inconsistent evaluations. Therefore, it is not the most efficient option if the goal is to minimize administrative overhead.
User-generated data
Randomly selected user-generated data involves analyzing real user interactions for bias, but this approach is not ideal due to the lack of standardization. It requires considerable effort to manually select, curate, and evaluate the data for bias, and there is a risk that the selected samples may not cover all relevant bias scenarios comprehensively. This method also involves privacy and ethical considerations, adding further complexity and administrative effort.
Internally generated synthetic data
Internally generated synthetic data allows for custom scenarios to be tested, but it requires a significant investment in resources, expertise, and time to create and maintain these datasets. Designing synthetic data that accurately reflects real-world biases and discrimination scenarios is complex, making it an impractical choice when aiming to minimize administrative effort.
SageMaker Feature Store
SageMaker Feature Store is specifically designed to store and manage machine learning features. It allows data scientists and engineers to create a centralized, consistent, and standardized set of features that can be easily accessed and reused across different teams and projects, making it the ideal choice for sharing variables during model development. SageMaker Feature Store also supports feature versioning and governance, which helps maintain the integrity and accuracy of the data used in model development.
https://aws.amazon.com/sagemaker/feature-store/
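A minimal sketch of creating a feature group and ingesting features with the SageMaker Python SDK appears below; the bucket, role ARN, and feature names are placeholders.

```python
import time
import pandas as pd
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()

df = pd.DataFrame({
    "customer_id": pd.Series(["c1", "c2"], dtype="string"),
    "avg_listen_minutes": [34.5, 12.0],
    "event_time": [time.time()] * 2,
})

fg = FeatureGroup(name="customer-features", sagemaker_session=session)
fg.load_feature_definitions(data_frame=df)        # infer feature types from the frame
fg.create(
    s3_uri="s3://my-bucket/feature-store/",       # placeholder: offline store location
    record_identifier_name="customer_id",
    event_time_feature_name="event_time",
    role_arn="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    enable_online_store=True,
)
fg.ingest(data_frame=df, max_workers=1, wait=True)
```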
SageMaker Data Wrangler
SageMaker Data Wrangler is primarily a tool for data preparation and feature engineering, not for storing or sharing features across different teams. While Data Wrangler provides capabilities for cleaning, transforming, and visualizing data, it does not offer the functionality needed to maintain a centralized repository for sharing variables during model development.
https://aws.amazon.com/sagemaker/data-wrangler/
SageMaker Clarify
SageMaker Clarify is focused on detecting bias in data and explaining model predictions to ensure transparency and fairness. It does not provide a mechanism for storing or sharing features and is not designed to support collaborative feature management, making it unsuitable for the company’s goal of sharing variables for model development.
SageMaker Model Monitor
SageMaker Model Monitor is designed to track the performance of machine learning models in production by monitoring data drift, bias, and other deviations. It does not offer any features for storing or sharing variables during the development phase, and its purpose is primarily focused on post-deployment model monitoring rather than feature management.
Good Prompting technique
The following are the constituents of a good prompting technique:
(1) Instructions – a task for the model to do (description, how the model should perform)
(2) Context – external information to guide the model
(3) Input data – the input for which you want a response
(4) Output Indicator – the output type or format
via - https://aws.amazon.com/what-is/prompt-engineering/
Hyperparameters
Hyperparameters are values that can be adjusted for model customization to control the training process and, consequently, the output custom model. In other words, hyperparameters are external configurations set before the training process begins. They control the training process and the structure of the model but are not adjusted by the training algorithm itself. Examples include the learning rate, the number of layers in a neural network, etc.
Model parameters
Model parameters are values that define a model and its behavior in interpreting input and generating responses. Model parameters are controlled and updated by providers. You can also update model parameters to create a new model through the process of model customization. In other words, model parameters are the internal variables of the model that are learned and adjusted during the training process. These parameters directly influence the output of the model for a given input. Examples include the weights and biases in a neural network.
RAG
Retrieval-Augmented Generation
Utilize a Retrieval-Augmented Generation (RAG) system by indexing all product catalog PDFs and configuring the LLM chatbot to reference this system for answering queries
Using a RAG approach is the least costly and most efficient solution for providing up-to-date and relevant responses. In this approach, you convert all product catalog PDFs into a searchable knowledge base. When a customer query comes in, the RAG framework first retrieves the most relevant pieces of information from this knowledge base and then uses an LLM to generate a coherent response based on the retrieved context. This method does not require re-training the model or modifying every incoming query with large datasets, making it significantly more cost-effective. It ensures that the chatbot always has access to the most recent information without needing expensive updates or processing every time.
https://aws.amazon.com/what-is/retrieval-augmented-generation/
https://aws.amazon.com/bedrock/knowledge-bases/
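Knowledge Bases for Amazon Bedrock exposes this pattern through a single API call, sketched below; the knowledge base ID and model ARN are placeholders for resources created beforehand.

```python
import boto3

client = boto3.client("bedrock-agent-runtime")

response = client.retrieve_and_generate(
    input={"text": "What is the warranty period for the X200 blender?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB_ID",   # placeholder: your knowledge base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
print(response["output"]["text"])
```

Behind this call, the service retrieves the most relevant catalog passages and feeds them to the model as context, so answers track the latest indexed documents without any re-training.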
Foundation Models
Foundation Models serve as a broad base for various AI applications by providing generalized capabilities, whereas Large Language Models are specialized for understanding and generating human language
Foundation Models provide a broad base with generalized capabilities that can be applied to various tasks such as natural language processing (NLP), question answering, and image classification. The size and general-purpose nature of FMs make them different from traditional ML models, which typically perform specific tasks, like analyzing text for sentiment, classifying images, and forecasting trends.
Generally, an FM uses learned patterns and relationships to predict the next item in a sequence. For example, with image generation, the model analyzes the image and creates a sharper, more clearly defined version of the image. Similarly, with text, the model predicts the next word in a string of text based on the previous words and their context. It then selects the next word using probability distribution techniques.
In contrast, Large Language Models are specifically designed for tasks involving the understanding and generation of human language, making them more specialized. LLMs are specifically focused on language-based tasks such as summarization, text generation, classification, open-ended conversation, and information extraction.
https://aws.amazon.com/what-is/foundation-models/
Large Language Model
Large language models, also known as LLMs, are very large deep learning models that are pre-trained on vast amounts of data. The underlying transformer is a set of neural networks that consist of an encoder and a decoder with self-attention capabilities. The encoder and decoder extract meanings from a sequence of text and understand the relationships between words and phrases in it.
https://aws.amazon.com/what-is/large-language-model/
Tokens
Tokens are the correct answer because they represent the fundamental units of text that the AI model processes. Tokens can be whole words, parts of words (sub-words), or even single characters, depending on the model’s tokenization strategy. In generative AI, the model breaks down text into these tokens to better understand the structure, meaning, and context, enabling it to generate coherent language outputs.
Embeddings
Embeddings are not the correct answer because they are a way of representing tokens (words, sub-words, or phrases) as numerical vectors to capture their semantic relationships in a high-dimensional space. Embeddings help the model understand the meaning and context of tokens, but they are not the units of text themselves.
Vectors
Vectors are mathematical constructs used to represent the relationships between different words or tokens in a model. While vectors are crucial for understanding how words are related to each other in the embedding space, they do not directly represent the units of text (tokens) processed by the model.
Context window
The context window is the total amount of text (measured in tokens) that a model can process at once. It does not refer to the individual units of text, like words or sub-words, but rather to the overall capacity for text input. Therefore, it is not the correct answer for identifying the basic units of text that the model handles.
Reinforcement learning
Reinforcement learning involves an agent interacting with an environment by taking actions and receiving rewards or penalties, learning a policy to maximize cumulative rewards over time
Reinforcement learning works by having an agent take actions in an environment, receiving rewards or penalties based on the actions, and learning a policy that aims to maximize cumulative rewards over time. This process involves continuously adjusting actions based on the feedback received to improve performance.
https://aws.amazon.com/what-is/reinforcement-learning/
Reinforcement learning does not use supervised learning algorithms to label data. Rather, it focuses on learning from interaction with the environment.
Reinforcement learning is not an unsupervised learning technique and does not cluster data points without feedback.
While data transformation can be part of feature engineering, reinforcement learning specifically involves learning optimal actions based on feedback from the environment rather than transforming data into a new feature space.
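For intuition, here is a self-contained toy example of the reward-driven loop described above: tabular Q-learning on a 1-D track where the agent learns that moving right reaches the goal.

```python
import numpy as np

n_states, n_actions = 5, 2            # cells 0..4; actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # action-value table
alpha, gamma = 0.1, 0.9               # learning rate and discount factor
rng = np.random.default_rng(0)

for episode in range(300):
    state = 0
    while state != 4:                                        # cell 4 is the goal
        action = rng.integers(n_actions)                     # explore with random actions
        next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
        reward = 1.0 if next_state == 4 else 0.0             # reward only at the goal
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q.argmax(axis=1)[:4])  # greedy policy for non-goal states: all 1s (move right)
```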
Amazon SageMaker JumpStart - Key features
(1) You can evaluate, compare, and select Foundation Models quickly based on pre-defined quality and responsibility metrics
(2) Pre-trained models are fully customizable for your use case with your data
Amazon SageMaker JumpStart is a machine learning (ML) hub that can help you accelerate your ML journey. With SageMaker JumpStart, you can evaluate, compare, and select FMs quickly based on pre-defined quality and responsibility metrics to perform tasks like article summarization and image generation. Pretrained models are fully customizable for your use case with your data, and you can easily deploy them into production with the user interface or SDK. You can also share artifacts, including models and notebooks, within your organization to accelerate model building and deployment, and admins can control which models are visible to users within their organization.
Your inference and training data will not be used or shared to update or train the base model that SageMaker JumpStart surfaces to customers.
SageMaker JumpStart provides proprietary and public models.
Amazon SageMaker Canvas provides a no-code interface, in which you can create highly accurate machine learning models without any machine learning experience or writing a single line of code.
https://aws.amazon.com/sagemaker/jumpstart/
https://aws.amazon.com/sagemaker/faqs/
Amazon SageMaker Ground Truth
To train a machine learning model, you need a large, high-quality, labeled dataset. Ground Truth helps you build high-quality training datasets for your machine learning models. With Ground Truth, you can use workers from either Amazon Mechanical Turk, a vendor company that you choose, or an internal, private workforce along with machine learning to enable you to create a labeled dataset. You can use the labeled dataset output from Ground Truth to train your models. You can also use the output as a training dataset for an Amazon SageMaker model.
Depending on your ML application, you can choose from one of the Ground Truth built-in task types to have workers generate specific types of labels for your data. You can also build a custom labeling workflow to provide your own UI and tools to workers labeling your data. You can choose your workforce from:
The Amazon Mechanical Turk workforce of over 500,000 independent contractors worldwide.
A private workforce that you create from your employees or contractors for handling data within your organization.
A vendor company that you can find in the AWS Marketplace that specializes in data labeling services.
https://aws.amazon.com/sagemaker/groundtruth/
Amazon SageMaker Feature Store
Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. For example, in an application that recommends a music playlist, features could include song ratings, listening duration, and listener demographics.
Amazon SageMaker JumpStart
Amazon SageMaker JumpStart is a machine learning (ML) hub that can help you accelerate your ML journey. With SageMaker JumpStart, you can evaluate, compare, and select Foundation Models (FMs) quickly based on pre-defined quality and responsibility metrics to perform tasks like article summarization and image generation. Pretrained models are fully customizable for your use case with your data, and you can easily deploy them into production with the user interface or SDK.
Amazon SageMaker Canvas
SageMaker Canvas offers a no-code interface that can be used to create highly accurate machine learning models without any machine learning experience or writing a single line of code. SageMaker Canvas provides access to ready-to-use models, including foundation models from Amazon Bedrock or Amazon SageMaker JumpStart, or you can build your own custom ML model using AutoML powered by SageMaker Autopilot.
Amazon Bedrock Guardrails
The company should instruct the model to stick to the prompt by adding explicit instructions to ignore any unrelated or potentially malicious content
This is the correct approach because providing explicit instructions within the prompt helps guide the model’s behavior, reducing the likelihood of generating inappropriate or unsafe content. By clarifying what the model should focus on and what it should ignore, the company can enforce boundaries that align with its safety standards. This method is straightforward and leverages prompt engineering to mitigate risks effectively.
https://aws.amazon.com/bedrock/guardrails/
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-bedrock-guardrail.html
Amazon Transcribe Medical
Amazon Transcribe Medical is an automatic speech recognition (ASR) service that makes it easy for you to add medical speech-to-text capabilities to your voice-enabled applications. Conversations between health care providers and patients provide the foundation of a patient’s diagnosis and treatment plan and clinical documentation workflow. It’s critically important that this information is accurate. However, traditional methods of producing accurate medical transcriptions, such as dictation recorders and human scribes, are expensive, time-consuming, and disruptive to the patient experience. Some organizations use existing medical transcription software but find it inefficient and low in quality.
Driven by state-of-the-art machine learning, Amazon Transcribe Medical accurately transcribes medical terminologies such as medicine names, procedures, and even conditions or diseases. Amazon Transcribe Medical can serve a diverse range of use cases such as transcribing physician-patient conversations for clinical documentation, capturing phone calls in pharmacovigilance, or subtitling telehealth consultations.
https://aws.amazon.com/transcribe/medical/
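A minimal sketch of starting a medical transcription job with boto3 follows; the bucket, file, and job names are placeholders.

```python
import boto3

transcribe = boto3.client("transcribe")

transcribe.start_medical_transcription_job(
    MedicalTranscriptionJobName="visit-2024-001",              # placeholder job name
    LanguageCode="en-US",
    Media={"MediaFileUri": "s3://my-bucket/audio/visit.wav"},  # placeholder audio file
    OutputBucketName="my-bucket",                              # placeholder output bucket
    Specialty="PRIMARYCARE",
    Type="CONVERSATION",                                       # physician-patient dialogue
)
```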
Amazon Transcribe
Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or add speech-to-text capabilities to any application. Amazon Transcribe is not specifically trained for medical terminologies or patient conditions and diseases. Hence, Amazon Transcribe Medical is optimal for this use case.
Amazon Rekognition
Amazon Rekognition is a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications. The service is powered by proven deep learning technology and it requires no machine learning expertise to use. Amazon Rekognition includes a simple, easy-to-use API that can quickly analyze any image or video file that’s stored in Amazon S3. Rekognition is not an automatic speech recognition (ASR) service.
Amazon Polly
Amazon Polly uses deep learning technologies to synthesize natural-sounding human speech, so you can convert articles to speech. With dozens of lifelike voices across a broad set of languages, use Amazon Polly to build speech-activated applications. Amazon Polly enables existing applications to speak as a first-class feature and creates the opportunity for entirely new categories of speech-enabled products, from mobile apps and cars to devices and appliances. Polly is not an automatic speech recognition (ASR) service.
Amazon Bedrock
Amazon Bedrock is the easiest way to build and scale generative AI applications with foundation models. Amazon Bedrock is a fully managed service that makes foundation models from Amazon and leading AI startups available through an API, so you can choose from various FMs to find the model that’s best suited for your use case. With Bedrock, you can speed up developing and deploying scalable, reliable, and secure generative AI applications without managing infrastructure.
https://aws.amazon.com/bedrock/
Amazon SageMaker JumpStart
Amazon SageMaker JumpStart is a machine learning hub with foundation models, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks. With SageMaker JumpStart, you can access pre-trained models, including foundation models, to perform tasks like article summarization and image generation. Pretrained models are fully customizable for your use case with your data, and you can easily deploy them into production with the user interface or SDK.
https://aws.amazon.com/sagemaker/jumpstart/
Amazon Q
Amazon Q is a generative AI–powered assistant for accelerating software development and leveraging companies’ internal data. Amazon Q generates code, tests, and debugs. It has multistep planning and reasoning capabilities that can transform and implement new code generated from developer requests.
https://aws.amazon.com/q/
AWS Trainium
AWS Trainium is the machine learning (ML) chip that AWS purpose-built for deep learning (DL) training of 100B+ parameter models. Each Amazon Elastic Compute Cloud (Amazon EC2) Trn1 instance deploys up to 16 Trainium accelerators to deliver a high-performance, low-cost solution for DL training in the cloud.
https://aws.amazon.com/machine-learning/trainium/
AWS Inferentia
AWS Inferentia is an ML chip purpose-built by AWS to deliver high-performance inference at a low cost. AWS Inferentia accelerators are designed by AWS to deliver high performance at the lowest cost in Amazon EC2 for your deep learning (DL) and generative AI inference applications.
Knowledge Bases for Amazon Bedrock
With Knowledge Bases for Amazon Bedrock, you can give FMs and agents contextual information from your company’s private data sources for RAG to deliver more relevant, accurate, and customized responses
Knowledge Bases for Amazon Bedrock takes care of the entire ingestion workflow of converting your documents into embeddings (vector) and storing the embeddings in a specialized vector database. Knowledge Bases for Amazon Bedrock supports popular databases for vector storage, including vector engine for Amazon OpenSearch Serverless, Pinecone, Redis Enterprise Cloud, Amazon Aurora (coming soon), and MongoDB (coming soon). If you do not have an existing vector database, Amazon Bedrock creates an OpenSearch Serverless vector store for you.
via - https://aws.amazon.com/bedrock/knowledge-bases/
https://aws.amazon.com/bedrock/faqs/
Watermark detection for Amazon Bedrock
The watermark detection mechanism allows you to identify images generated by Amazon Titan Image Generator, a foundation model that allows users to create realistic, studio-quality images in large volumes and at low cost, using natural language prompts. With watermark detection, you can increase transparency around AI-generated content by mitigating harmful content generation and reducing the spread of misinformation. You cannot use a watermark detection mechanism to implement RAG workflow in Amazon Bedrock.
https://aws.amazon.com/about-aws/whats-new/2024/04/watermark-detection-amazon-titan-image-generator-bedrock/
Continued pretraining in Amazon Bedrock
In the continued pretraining process, you provide unlabeled data to pre-train a model by familiarizing it with certain types of inputs. You can provide data from specific topics to expose a model to those areas. The continued pretraining process will tweak the model parameters to accommodate the input data and improve its domain knowledge. You can use continued pretraining or fine-tuning for model customization in Amazon Bedrock. You cannot use continued pretraining to implement RAG workflow in Amazon Bedrock.
Guardrails for Amazon Bedrock
Guardrails for Amazon Bedrock help you implement safeguards for your generative AI applications based on your use cases and responsible AI policies. It helps control the interaction between users and FMs by filtering undesirable and harmful content, redacting personally identifiable information (PII), and enhancing content safety and privacy in generative AI applications. You cannot use guardrails to implement RAG workflow in Amazon Bedrock.
Confusion matrix
A confusion matrix is a tool specifically designed to evaluate the performance of classification models by displaying the number of true positives, true negatives, false positives, and false negatives. It provides a detailed breakdown of the model’s performance across all classes, showing how many instances were correctly or incorrectly classified in each category. This makes it the most suitable choice for evaluating a classification model’s accuracy and enables the company to see where the model performs well and where it needs adjustments, such as improving the classification of specific material types.
https://docs.aws.amazon.com/machine-learning/latest/dg/multiclass-model-insights.html
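A toy example with scikit-learn shows the breakdown described above (rows are actual classes, columns are predicted classes):

```python
from sklearn.metrics import confusion_matrix, classification_report

# Hypothetical material-type predictions for seven items.
y_true = ["glass", "metal", "plastic", "metal", "glass", "plastic", "metal"]
y_pred = ["glass", "metal", "metal",   "metal", "glass", "plastic", "plastic"]

labels = ["glass", "metal", "plastic"]
print(confusion_matrix(y_true, y_pred, labels=labels))   # rows = actual, columns = predicted
print(classification_report(y_true, y_pred, labels=labels))
```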
Root Mean Squared Error (RMSE)
Root Mean Squared Error (RMSE) is a metric commonly used to measure the average error in regression models by calculating the square root of the average squared differences between predicted and actual values. However, RMSE is not suitable for classification tasks, as it is designed to measure continuous outcomes, not discrete class predictions.
Mean Absolute Error (MAE)
Mean Absolute Error (MAE) measures the average magnitude of errors in a set of predictions without considering their direction. MAE is typically used in regression tasks to quantify the accuracy of a continuous variable’s predictions, not for classification tasks where the outputs are categorical rather than continuous.
Correlation matrix
A correlation matrix measures the statistical correlation between different variables or features in a dataset, typically used to understand the relationships between continuous variables. A correlation matrix is not designed to evaluate the performance of a classification model, as it does not provide any insight into the accuracy or errors of categorical predictions.
Inference
Inference refers to the stage where a trained machine learning model is deployed to make predictions or generate outputs based on new input data. During inference, the model uses the patterns and relationships it learned during training to provide accurate and meaningful results. In this scenario, the user sends input data to the SageMaker model, which then performs inference to generate the corresponding output or prediction.
https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html
https://aws.amazon.com/blogs/machine-learning/create-train-test-and-validation-splits-on-your-data-for-machine-learning-with-amazon-sagemaker-data-wrangler/
Training (ML Model)
Training is the process of teaching a machine learning model to recognize patterns by adjusting its internal parameters based on a labeled dataset. During training, the model learns from data by minimizing errors and improving accuracy. However, the scenario described does not involve modifying the model’s parameters; it only involves using the trained model to make predictions, making “training” an incorrect choice.
Validation
Validation is a step used to evaluate and fine-tune the model during the training process by checking its performance on a validation dataset, which is separate from the training dataset. The purpose is to optimize the model’s hyperparameters and prevent overfitting. Since the scenario involves using the model to predict outcomes from new input data, rather than evaluating or fine-tuning it, “validation” is not the correct term.
Testing
Testing is the final evaluation phase of a model, where its performance is assessed on an unseen test dataset after the training and validation phases are complete. It provides an unbiased estimate of the model’s generalization ability to new data. However, in the given scenario, the focus is on generating predictions from a model already trained, rather than testing its performance, so “testing” is not the correct answer.
Amazon Kendra
Amazon Kendra is a highly accurate and easy-to-use enterprise search service that’s powered by machine learning (ML). It allows developers to add search capabilities to their applications so their end users can discover information stored within the vast amount of content spread across their company. This includes data from manuals, research reports, FAQs, human resources (HR) documentation, and customer service guides, which may be found across various systems such as Amazon Simple Storage Service (S3), Microsoft SharePoint, Salesforce, ServiceNow, RDS databases, or Microsoft OneDrive.
When you type a question, the service uses ML algorithms to understand the context and return the most relevant results, whether that means a precise answer or an entire document. For example, you can ask a question such as “How much is the cash reward on the corporate credit card?” and Amazon Kendra will map to the relevant documents and return a specific answer (such as “2%”). Kendra provides sample code so you can get started quickly and easily integrate highly accurate searches into your new or existing applications.
https://aws.amazon.com/kendra/faqs/
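A natural-language query against an existing index is a single boto3 call, sketched below; the index ID is a placeholder.

```python
import boto3

kendra = boto3.client("kendra")

response = kendra.query(
    IndexId="INDEX_ID",   # placeholder: an index already populated with documents
    QueryText="How much is the cash reward on the corporate credit card?",
)
for item in response["ResultItems"][:3]:
    print(item["Type"], "->", item["DocumentExcerpt"]["Text"])
```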
Amazon Textract
Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, layout elements, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract specific data from documents. Textract is not a search service.