AI Practice Test #2 Flashcards

Question

AWS Audit Manager

Answer 1

AWS Audit Manager helps automate the collection of evidence to continuously audit your AWS usage. It simplifies the process of assessing risk and compliance with regulations and industry standards, making it an essential tool for governance in AI systems.

Answer 2

AWS Artifact provides on-demand access to AWS’ compliance reports and online agreements. It is useful for obtaining compliance documentation but does not provide continuous auditing or automated evidence collection.

Answer 3

AWS Trusted Advisor offers guidance to help optimize your AWS environment for cost savings, performance, security, and fault tolerance. While it provides recommendations for best practices, it does not focus on auditing or evidence collection for compliance.

Answer 4

AWS CloudTrail records AWS API calls for auditing purposes and delivers log files for compliance and operational troubleshooting. It is crucial for tracking user activity but does not automate compliance assessments or evidence collection.

Answer 5

Amazon Titan foundation models, developed by Amazon Web Services (AWS), are pre-trained on extensive datasets, making them robust and versatile models suitable for a wide range of applications. Amazon Titan foundation models (FMs) provide customers with a breadth of high-performing image, multimodal, and text model choices, via a fully managed API. Amazon Titan models are created by AWS and pretrained on large datasets, making them powerful, general-purpose models built to support a variety of use cases, while also supporting the responsible use of AI.

Answer 6

Llama is a series of large language models trained on publicly available data. They are built on the transformer architecture, enabling them to handle input sequences of any length and produce output sequences of varying lengths. A notable feature of Llama models is their capacity to generate coherent and contextually appropriate text.

Answer 7

Jurassic family of models from AI21 Labs supported use cases such as question answering, summarization, draft generation, advanced information extraction, and ideation for tasks requiring intricate reasoning and logic.

Answer 8

Claude is Anthropic’s frontier, state-of-the-art large language model that offers important features for enterprises like advanced reasoning, vision analysis, code generation, and multilingual processing.

Answer 9

Amazon SageMaker Model Dashboard is a centralized repository of all models created in your account. The models are generally the outputs of SageMaker training jobs, but you can also import models trained elsewhere and host them on SageMaker. Model Dashboard provides a single interface for IT administrators, model risk managers, and business leaders to track all deployed models and aggregate data from multiple AWS services to provide indicators about how your models are performing. You can view details about model endpoints, batch transform jobs, and monitoring jobs for additional insights into model performance. The dashboard’s visual display helps you quickly identify which models have missing or inactive monitors, so you can ensure all models are periodically checked for data drift, model drift, bias drift, and feature attribution drift. Lastly, the dashboard’s ready access to model details helps you dive deep, so you can access logs, infrastructure-related information, and resources to help you debug monitoring failures.

Answer 10

Amazon SageMaker Model Monitor monitors the quality of Amazon SageMaker machine learning models in production. With Model Monitor, you can set up: Continuous monitoring with a real-time endpoint, Continuous monitoring with a batch transform job that runs regularly, and On-schedule monitoring for asynchronous batch transform jobs.

Answer 11

Amazon SageMaker Ground Truth offers the most comprehensive set of human-in-the-loop capabilities, allowing you to harness the power of human feedback across the ML lifecycle to improve the accuracy and relevancy of models. You can complete a variety of human-in-the-loop tasks with SageMaker Ground Truth, from data generation and annotation to model review, customization, and evaluation, either through a self-service or an AWS-managed offering.

Answer 12

SageMaker Clarify helps identify potential bias during data preparation without writing code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features.

Answer 13

A multimodal model can accept a mix of input types such as audio/text and create a mix of output types such as video/image A multimodal model is an artificial intelligence system designed to process and understand multiple types of data, such as text, images, audio, and video. Unlike unimodal models, which handle a single type of data, multimodal models can integrate and make sense of information from various sources, allowing them to perform more complex and versatile tasks. Multimodal models represent a significant advancement in AI, enabling the integration and understanding of multiple types of data. By combining different modalities, these models can perform a wide range of complex tasks, making them highly versatile and powerful tools in various fields.

Answer 14

Automatically extract printed text, handwriting, layout elements, and data from any document. Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, layout elements, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract specific data from documents.

Answer 15

Forecast business outcomes easily and accurately using machine learning. Amazon Forecast uses machine learning (ML) to generate more accurate demand forecasts with just a few clicks, without requiring any prior ML experience. Amazon Forecast uses ML to learn not only the best algorithm for each item but the best ensemble of algorithms for each item, automatically creating the best model for your data.

Answer 16

Easy-to-use enterprise search service that’s powered by machine learning. Amazon Kendra is a highly accurate and easy-to-use enterprise search service that’s powered by machine learning (ML). It allows developers to add search capabilities to their applications so their end users can discover information stored within the vast amount of content spread across their company.

Answer 17

Fine-tuning changes the weights of the FM. Fine-tuning a pre-trained foundation model is an affordable way to take advantage of their broad capabilities while customizing a model on your own small, corpus. Fine-tuning is a customization method that involved further training and does change the weights of your model.

Answer 18

Retrieval-augmented generation (RAG) does not change the weights of the FM. Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. Large Language Models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. RAG extends the already powerful capabilities of LLMs to specific domains or an organization's internal knowledge base, all without the need to retrain the model. Retrieval Augmented Generation (RAG) allows you to customize a model’s responses when you want the model to consider new knowledge or up-to-date information. When your data changes frequently, like inventory or pricing, it’s not practical to fine-tune and update the model while it’s serving user queries.

Answer 19

Prompt engineering does NOT change the weights of the FM. Another recommended way to first customize a foundation model to a specific use case is through prompt engineering. Providing your foundation model with well-engineered, context-rich prompts can help achieve desired results without any fine-tuning or changing of model weights.

Answer 20

Describes how a model should be used in a production environment Use Amazon SageMaker Model Cards to document critical details about your machine learning (ML) models in a single place for streamlined governance and reporting. Catalog details such as the intended use and risk rating of a model, training details and metrics, evaluation results and observations, and additional call-outs such as considerations, recommendations, and custom information. Model cards provide prescriptive guidance on what information to document and include fields for custom information. Specifying the intended uses of a model helps ensure that model developers and users have the information they need to train or deploy the model responsibly. The intended uses of a model go beyond technical details and describe how a model should be used in production, the scenarios in which is appropriate to use a model, and additional considerations such as the type of data to use with the model or any assumptions made during development.

Answer 21

Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, layout elements, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract specific data from documents. Today, many companies manually extract data from scanned documents such as PDFs, images, tables, and forms, or through simple OCR software that requires manual configuration (which often must be updated when the form changes). To overcome these manual and expensive processes, Textract uses ML to read and process any type of document, accurately extracting text, handwriting, tables, and other data with no manual effort. You can use one of AWS's pre-trained or custom features to quickly automate document processing, whether you’re automating loan processing or extracting information from invoices and receipts. Textract provides you the ability to customize the pre-trained features to meet the document processing needs specific to your business. Textract can extract the data in minutes instead of hours or days. Textract use cases: via - https://docs.aws.amazon.com/textract/latest/dg/what-is.html

Answer 22

Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or add speech-to-text capabilities to any application.

Answer 23

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find meaning and insights in text. Natural Language Processing (NLP) is a way for computers to analyze, understand, and derive meaning from textual information in a smart and useful way. By utilizing NLP, you can extract important phrases, sentiments, syntax, key entities such as brand, date, location, person, etc., and the language of the text.

Answer 24

Amazon Rekognition is a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications. The service is powered by proven deep learning technology and it requires no machine learning expertise to use. Amazon Rekognition includes a simple, easy-to-use API that can quickly analyze any image or video file that’s stored in Amazon S3. While Rekognition can be used to extract text from images, Rekognition specializes in identifying text located spatially within an image, for instance, words displayed on street signs, t-shirts, or license plates. It's not the ideal choice for images containing more than 100 words, as this exceeds its limitation.

Answer 25

Temperature is a value between 0 and 1, and it regulates the creativity of the model's responses. Use a lower temperature if you want more deterministic responses. Use a higher temperature if you want creative or different responses for the same prompt on Amazon Bedrock and this is how you might see hallucination responses. A lower value of temperature results in deterministic responses, so there are fewer chances of hallucinations. A higher temperature results in a higher likelihood of hallucinations. via - https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html

Answer 26

Artificial Intelligence > Machine Learning > Deep Learning > Generative AI The correct hierarchy is as follows: Artificial Intelligence (AI): The broadest field encompassing all aspects of creating machines that can perform tasks that typically require human intelligence. Machine Learning (ML): A subset of AI focused on algorithms and statistical models that enable machines to improve their performance on tasks through experience. Deep Learning (DL): A subset of ML that uses neural networks with many layers to learn from large amounts of data, allowing for more complex and abstract representations. Generative AI (GenAI): A subset of Deep Learning focused on models that can generate new content, such as text, images, or music, by learning from existing data. via - https://docs.aws.amazon.com/whitepapers/latest/aws-caf-for-ai/aws-caf-for-ai.html

Answer 27

Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare tabular and image data for ML from weeks to minutes. With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow (including data selection, cleansing, exploration, visualization, and processing at scale) from a single visual interface. You can use SQL to select the data that you want from various data sources and import it quickly. Next, you can use the data quality and insights report to automatically verify data quality and detect anomalies, such as duplicate rows and target leakage. SageMaker Data Wrangler contains over 300 built-in data transformations, so you can quickly transform data without writing code. With the SageMaker Data Wrangler data selection tool, you can quickly access and select your tabular and image data from various popular sources - such as Amazon Simple Storage Service (Amazon S3), Amazon Athena, Amazon Redshift, AWS Lake Formation, Snowflake, and Databricks - and over 50 other third-party sources - such as Salesforce, SAP, Facebook Ads, and Google Analytics. You can also write queries for data sources using SQL and import data directly into SageMaker from various file formats, such as CSV, Parquet, JSON, and database tables. How Data Wrangler works: via - https://aws.amazon.com/sagemaker/data-wrangler/

Answer 28

Amazon SageMaker Model Dashboard is a centralized portal, accessible from the SageMaker console, where you can view, search, and explore all of the models in your account. You can track which models are deployed for inference and if they are used in batch transform jobs or hosted on endpoints.

Answer 29

SageMaker Clarify helps identify potential bias during data preparation without writing code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features.

Answer 30

Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference.

Answer 31

Leverage AWS Trainium for high-performance, cost-effective Deep Learning training. AWS Trainium is the machine learning (ML) chip that AWS purpose-built for deep learning (DL) training of 100B+ parameter models. Each Amazon Elastic Compute Cloud (Amazon EC2) Trn1 instance deploys up to 16 Trainium accelerators to deliver a high-performance, low-cost solution for DL training in the cloud. https://aws.amazon.com/machine-learning/trainium/

Answer 32

Leverage AWS Inferentia for the deep learning (DL) and generative AI inference applications AWS Inferentia accelerators are designed by AWS to deliver high performance at the lowest cost in Amazon EC2 for your deep learning (DL) and generative AI inference applications. The first-generation AWS Inferentia accelerator powers Amazon Elastic Compute Cloud (Amazon EC2) Inf1 instances, which deliver up to 2.3x higher throughput and up to 70% lower cost per inference than comparable Amazon EC2 instances. https://aws.amazon.com/machine-learning/inferentia/

Answer 33

Image processing focuses on enhancing and manipulating images for visual quality Image processing is primarily concerned with the techniques used to enhance and manipulate images, such as filtering, noise reduction, and image transformation.

Answer 34

Computer vision involves interpreting and understanding the content of images to make decisions Computer vision, on the other hand, focuses on interpreting and understanding the content of images to make decisions, such as object detection, facial recognition, and scene understanding. Computer vision often uses machine learning algorithms to achieve these tasks. https://aws.amazon.com/what-is/computer-vision/

Answer 35

Response length Response length represents the minimum or maximum number of tokens to return in the generated response. via - https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html

Answer 36

Stop sequences specify the sequences of characters that stop the model from generating further tokens. If the model generates a stop sequence that you specify, it will stop generating after that sequence.

Answer 37

Top P represents the percentage of most likely candidates that the model considers for the next token.

Answer 38

Top K represents the number of most likely candidates that the model considers for the next token.

Answer 39

(1) Understand and manage your cloud infrastructure on AWS Amazon Q Developer helps you understand and manage your cloud infrastructure on AWS. With this capability, you can list and describe your AWS resources using natural language prompts, minimizing friction in navigating the AWS Management Console and compiling all information from documentation pages. For example, you can ask Amazon Q Developer, “List all of my Lambda functions”. Then, Amazon Q Developer returns the response with a set of my AWS Lambda functions as requested, as well as deep links so you can navigate to each resource easily. (2) Get answers to your AWS account-specific cost-related questions using natural language Amazon Q Developer can get answers to AWS cost-related questions using natural language. This capability works by retrieving and analyzing cost data from AWS Cost Explorer. via - https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/what-is.html via - https://aws.amazon.com/blogs/aws/amazon-q-developer-now-generally-available-includes-new-capabilities-to-reimagine-developer-experience/

Answer 40

A rule-based application is the most suitable choice for this scenario. Probability questions, like calculating the chance of drawing a spade from a deck of cards, are based on well-defined mathematical rules and formulas. A rule-based system can be programmed with these rules to provide precise answers to such questions, making it an efficient and straightforward solution. This approach ensures accuracy, is easy to implement, and requires no training data, making it ideal for helping students understand fundamental mathematical concepts.

Answer 41

Reinforcement Learning (RL) is a machine learning technique used for decision-making tasks where an agent learns by interacting with an environment and receiving feedback in the form of rewards or penalties. RL is better suited for dynamic and complex environments, such as games or robotic control, where exploration and adaptation are necessary. It is not appropriate for solving straightforward mathematical problems with well-defined answers, as it does not leverage existing mathematical rules and requires significant computational resources for training.

Answer 42

Supervised learning involves training a model on a labeled dataset to predict outcomes based on input features. While effective for tasks like image recognition or language translation, it is not suitable for answering mathematical questions that have precise, rule-based answers. Building a dataset of probability questions and answers would be inefficient and unnecessary, as the app can directly use mathematical formulas to provide correct responses without requiring model training.

Answer 43

Unsupervised learning is designed to identify patterns and structures in data without any predefined labels, making it useful for tasks such as clustering or dimensionality reduction. However, it is not applicable for answering specific mathematical questions like those involving probability, which require exact calculations based on established mathematical principles. Therefore, unsupervised learning does not provide a direct or efficient means to achieve the app’s objective.

Answer 44

Amazon Inspector is an automated security assessment service that helps improve the security and compliance of applications deployed on AWS. It automatically assesses applications for exposure, vulnerabilities, and deviations from best practices, making it an essential tool for ensuring the security of AI systems.

Answer 45

AWS Config is a service that enables you to assess, audit, and evaluate the configurations of your AWS resources. While it is important for governance and compliance monitoring, it does not perform automated security assessments of applications.

Answer 46

AWS Audit Manager helps you continuously audit your AWS usage to simplify how you assess risk and compliance with regulations and industry standards. It focuses on audit and compliance reporting rather than automated security assessments.

Answer 47

AWS Artifact provides on-demand access to AWS’ compliance reports and online agreements. It helps with compliance reporting but does not offer automated security assessments of applications.

Answer 48

Data access control involves authentication and authorization of users Data access control is about managing who can access data and what actions they can perform, typically through mechanisms like authentication and authorization.

Answer 49

Data integrity ensures the data is accurate, consistent, and unaltered Data integrity, on the other hand, focuses on maintaining the accuracy, consistency, and trustworthiness of data throughout its lifecycle, ensuring that data remains unaltered and accurate during storage, processing, and transmission.

Answer 50

Model evaluation on Amazon Bedrock involves a comprehensive process of preparing data, training models, selecting appropriate metrics, testing and analyzing results, ensuring fairness and bias detection, tuning performance, and continuous monitoring. Model Evaluation on Amazon Bedrock helps you to incorporate Generative AI into your application by giving you the power to select the foundation model that gives you the best results for your particular use case.

Answer 51

Guardrails for Amazon Bedrock enables you to implement safeguards for your generative AI applications based on your use cases and responsible AI policies. You can create multiple guardrails tailored to different use cases and apply them across multiple foundation models (FM), providing a consistent user experience and standardizing safety and privacy controls across generative AI applications. You can use guardrails with text-based user inputs and model responses.

Answer 52

This tool is used for monitoring machine learning models in production to detect data and prediction quality issues. While it helps maintain model performance, it does not assist in model selection or content moderation.

Answer 53

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. It is not specifically designed for selecting models or moderating content generated by LLMs.

Answer 54

Amazon SageMaker Clarify is used to detect bias in machine learning models and data. While it is crucial for ensuring fairness and transparency, it does not help with model selection or content moderation for generative AI applications.

Answer 55

Toxicity refers to AI model-generated content that can be deemed as offensive, disturbing, or inappropriate. an example of toxicity, where the AI model generates harmful or offensive content about a specific group.

Answer 56

Hallucination refers to AI model-generated assertions or claims that sound true but are incorrect an example of hallucination, where the AI model generates an irrelevant or incorrect response

Answer 57

Foundation models can perform a wide range of tasks across different domains by leveraging their extensive pre-training on large datasets Foundation models are a form of generative artificial intelligence (generative AI). They generate output from one or more inputs (prompts) in the form of human language instructions. In general, an FM uses learned patterns and relationships to predict the next item in a sequence. For example, with image generation, the model analyzes the image and creates a sharper, more clearly defined version of the image. Similarly, with text, the model predicts the next word in a string of text based on the previous words and their context. It then selects the next word using probability distribution techniques. Foundation models use self-supervised learning to create labels from input data. This means no one has instructed or trained the model with labeled training data sets. This feature separates LLMs from previous ML architectures, which use supervised or unsupervised learning. Foundation models, even though are pre-trained, can continue to learn from data inputs or prompts during inference. This means that you can develop comprehensive outputs through carefully curated prompts. Tasks that FMs can perform include language processing, visual comprehension, code generation, and human-centered engagement. via - https://aws.amazon.com/what-is/foundation-models/

Answer 58

Feature Engineering involves selecting, modifying, or creating features from raw data to improve the performance of machine learning models, and it is important because it can significantly enhance model accuracy and efficiency Feature Engineering is the process of selecting, modifying, or creating new features from raw data to enhance the performance of machine learning models. It is crucial because it can lead to significant improvements in model accuracy and efficiency by providing the model with better representations of the data. via - https://docs.aws.amazon.com/wellarchitected/latest/machine-learning-lens/feature-engineering.html

Answer 59

ChatGPT ChatGPT or Chat Generative Pretrained Transformer is an example of a Transformer model. Transformer-based models use a self-attention mechanism. They weigh the importance of different parts of an input sequence when processing each element in the sequence. To understand how transformer-based models work, imagine a sentence as a sequence of words. Self-attention helps the model focus on the relevant words as it processes each word. To capture different types of relationships between words, the transformer-based generative model employs multiple encoder layers called attention heads. Each head learns to attend to different parts of the input sequence. This allows the model to simultaneously consider various aspects of the data.

Answer 60

Diffusion models work by first corrupting data with noise through a forward diffusion process and then learning to reverse this process to denoise the data. They use neural networks to predict and remove the noise step by step, ultimately generating new, structured data from random noise.

Answer 61

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find meaning and insights in text. Natural Language Processing (NLP) is a way for computers to analyze, understand, and derive meaning from textual information in a smart and useful way. By utilizing NLP, you can extract important phrases, sentiments, syntax, key entities such as brand, date, location, person, etc., and the language of the text. You can use Amazon Comprehend to identify the language of the text, extract key phrases, places, people, brands, or events, understand sentiment about products or services, and identify the main topics from a library of documents. The source of this text could be web pages, social media feeds, emails, or articles. You can also feed Amazon Comprehend a set of text documents, and it will identify topics (or groups of words) that best represent the information in the collection. The output from Amazon Comprehend can be used to understand customer feedback, provide a better search experience through search filters, and use topics to categorize documents. How Amazon Comprehend works: via - https://aws.amazon.com/comprehend/

Answer 62

Amazon Transcribe is an automatic speech recognition service that uses machine learning models to convert audio to text. You can use Amazon Transcribe as a standalone transcription service or add speech-to-text capabilities to any application.

Answer 63

Amazon Translate is a text translation service that uses advanced machine learning technologies to provide high-quality translation on demand. You can use Amazon Translate to translate unstructured text documents or to build applications that work in multiple languages.

Answer 64

Amazon Rekognition is a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications. You can add features that detect objects, text, and unsafe content, analyze images/videos, and compare faces to your application using Rekognition's APIs.

Answer 65

Difficulty in collecting and preparing high-quality data for training models One of the main challenges in machine learning implementation is the difficulty in collecting and preparing high-quality data for training models. High-quality data is essential for building effective machine learning models, and ensuring that the data is clean, relevant, and well-prepared can be a complex and time-consuming process. There are many machine learning algorithms available, but the challenge lies in other aspects of implementation. While computational power can be a challenge for very large models, it is not a primary challenge for most machine learning implementations due to the availability of powerful computing resources. Machine learning has a wide range of applications in real-world scenarios, and its use is not particularly limited.

Answer 66

By using techniques such as cross-validation, regularization, and pruning to simplify the model and improve its generalization To prevent overfitting, techniques such as cross-validation, regularization, and pruning are employed. Cross-validation helps ensure the model generalizes well to unseen data by dividing the data into multiple training and validation sets. Regularization techniques, such as L1 and L2 regularization, penalize complex models to reduce overfitting. Pruning simplifies decision trees by removing branches that have little importance. via - https://docs.aws.amazon.com/machine-learning/latest/dg/model-fit-underfitting-vs-overfitting.html Increasing the complexity of the model can lead to overfitting, as it may start capturing noise or random fluctuations in the training data. Training on a small subset of data may lead to underfitting rather than preventing overfitting. Avoiding model validation or testing does not prevent overfitting; it is essential to validate and test models to ensure they generalize well to new data.

Answer 67

Model training in deep learning involves using large datasets to adjust the weights and biases of a neural network through multiple iterations, using techniques such as gradient descent to minimize the error In Deep Learning, model training involves feeding large datasets into the neural network and adjusting the weights and biases through multiple iterations. Techniques such as gradient descent are used to minimize the error by computing the gradient of the loss function and updating the weights to reduce the prediction error. Model training in deep learning involves initializing a neural network, feeding it data, calculating losses, adjusting weights using optimization algorithms, and iterating through this process until the model achieves satisfactory performance. Proper data preparation, validation, and hyperparameter tuning are crucial steps to ensure the model generalizes well to new, unseen data. https://aws.amazon.com/what-is/artificial-intelligence/ Weights and biases in a neural network are not set manually; they are learned during the training process. Data is crucial for training deep learning models; the network learns from input data. Deep learning primarily uses neural networks rather than support vector machines and decision trees, which are more common in traditional machine learning.

Answer 68

Top P represents the percentage of most likely candidates that the model considers for the next token. Choose a lower value to decrease the size of the pool and limit the options to more likely outputs. Choose a higher value to increase the size of the pool and allow the model to consider less likely outputs. via - https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html

Answer 69

Temperature is a value between 0 and 1, and it regulates the creativity of the model's responses. Use a lower temperature if you want more deterministic responses, and use a higher temperature if you want more creative or different responses for the same prompt on Amazon Bedrock.

Answer 70

Top K represents the number of most likely candidates that the model considers for the next token. Choose a lower value to decrease the size of the pool and limit the options to more likely outputs. Choose a higher value to increase the size of the pool and allow the model to consider less likely outputs.

Answer 71

Stop sequences specify the sequences of characters that stop the model from generating further tokens. If the model generates a stop sequence that you specify, it will stop generating after that sequence.

Answer 72

Amazon Connect is the contact center service from AWS. Amazon Q helps customer service agents provide better customer service. Amazon Q in Connect uses real-time conversation with the customer along with relevant company content to automatically recommend what to say or what actions an agent should take to better assist customers.

Answer 73

Amazon Q Developer assists developers and IT professionals with all their tasks—from coding, testing, and upgrading applications, to diagnosing errors, performing security scanning and fixes, and optimizing AWS resources.

Answer 74

Amazon Q Business is a fully managed, generative-AI powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. It allows end users to receive immediate, permissions-aware responses from enterprise data sources with citations, for use cases such as IT, HR, and benefits help desks.

Answer 75

With Amazon Q in QuickSight, customers get a generative BI assistant that allows business analysts to use natural language to build BI dashboards in minutes and easily create visualizations and complex calculations.

Answer 76

Semi-supervised learning is when you apply both supervised and unsupervised learning techniques to a common problem. This technique relies on using a small amount of labeled data and a large amount of unlabeled data to train systems. First, the labeled data is used to partially train the machine learning algorithm. After that, the partially trained algorithm labels the unlabeled data. This process is called pseudo-labeling. The model is then re-trained on the resulting data mix without being explicitly programmed. via - https://aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised/ Fraud identification Within a large set of transactional data, there’s a subset of labeled data where experts have confirmed fraudulent transactions. For a more accurate result, the machine learning solution would train first on the unlabeled data and then with the labeled data. Sentiment analysis When considering the breadth of an organization’s text-based customer interactions, it may not be cost-effective to categorize or label sentiment across all channels. An organization could train a model on the larger unlabeled portion of data first, and then a sample that has been labeled. This would provide the organization with a greater degree of confidence in customer sentiment across the business.

Answer 77

supervised learning A neural network solution is a more complex supervised learning technique. To produce a given outcome, it takes some given inputs and performs one or more layers of mathematical transformation based on adjusting data weightings. An example of a neural network technique is predicting a digit from a handwritten image.

Answer 78

unsupervised learning Clustering is an unsupervised learning technique that groups certain data inputs, so they may be categorized as a whole. There are various types of clustering algorithms depending on the input data. An example of clustering is identifying different types of network traffic to predict potential security incidents.

Answer 79

unsupervised learning Dimensionality reduction is an unsupervised learning technique that reduces the number of features in a dataset. It’s often used to preprocess data for other machine learning functions and reduce complexity and overheads. For example, it may blur out or crop background features in an image recognition application.

Answer 80

With Amazon Bedrock, you will be charged for model inference and customization. You have a choice of two pricing plans for inference: 1. On-Demand and Batch: This mode allows you to use FMs on a pay-as-you-go basis without having to make any time-based term commitments. 2. Provisioned Throughput: This mode allows you to provision sufficient throughput to meet your application's performance requirements in exchange for a time-based term commitment. Smaller models are cheaper to use than larger models The cost of generative AI models can vary. It's important to weigh the trade-offs between model size and speed. Larger models tend to be more accurate but are costly and have limited deployment options. In contrast, smaller models are more affordable and faster, offering more deployment flexibility. You can use a customized model only in the Provisioned Throughput mode With the Provisioned Throughput mode, you can purchase model units for a specific base or custom model. The Provisioned Throughput mode is primarily designed for large consistent inference workloads that need guaranteed throughput. Custom models can only be accessed using Provisioned Throughput. via - https://aws.amazon.com/bedrock/pricing/

Answer 81

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to discover insights from text. Amazon Comprehend provides Custom Entity Recognition, Custom Classification, Key Phrase Extraction, Sentiment Analysis, Entity Recognition, and more APIs so you can easily integrate natural language processing into your applications. You simply call the Amazon Comprehend APIs in your application and provide the location of the source document or text. The APIs will output entities, key phrases, sentiment, and language in a JSON format, which you can use in your application. Amazon Comprehend ML capabilities can be used to detect and redact personally identifiable information (PII) in customer emails, support tickets, product reviews, social media, and more. No ML experience is required. For example, you can analyze support tickets and knowledge articles to detect PII entities and redact the text before you index the documents in the search solution. After that, search solutions are free of PII entities in documents. Redacting PII entities helps you protect privacy and comply with local laws and regulations. How Comprehend works: via - https://aws.amazon.com/comprehend/

Answer 82

Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, layout elements, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract specific data from documents. Text extracted from images can be sent to Amazon Comprehend to recognize PII.

Answer 83

Amazon Lex is a fully managed artificial intelligence (AI) service with advanced natural language models to design, build, test, and deploy conversational interfaces in applications. Amazon Lex leverages the power of Generative AI and Large Language Models (LLMs) to enhance the builder and customer experience.

Answer 84

Amazon Kendra is a highly accurate and easy-to-use enterprise search service that’s powered by machine learning (ML). It allows developers to add search capabilities to their applications so their end users can discover information stored within the vast amount of content spread across their company. Companies use Amazon Comprehend to filter out PII before pushing the documents/data to Kendra.

Answer 85

Amazon Mechanical Turk provides a marketplace for outsourcing various tasks to a distributed workforce Amazon Mechanical Turk provides a marketplace for outsourcing various tasks to a distributed workforce, while Amazon Ground Truth is specifically designed for creating labeled datasets for machine learning, incorporating both automated and human labeling Amazon Mechanical Turk provides an on-demand, scalable, human workforce to complete jobs that humans can do better than computers. Amazon Mechanical Turk software formalizes job offers to the thousands of Workers willing to do piecemeal work at their convenience. The software also retrieves work performed and compiles it for you, the Requester, who pays the Workers for satisfactory work (only). Optional qualification tests enable you to select competent Workers. Amazon Mechanical Turk (MTurk) is a marketplace that allows businesses to outsource tasks to a distributed workforce.

Answer 86

Amazon Ground Truth is specifically designed for creating labeled datasets for machine learning, incorporating both automated and human labeling Amazon Ground Truth helps you build high-quality training datasets for your machine learning models. With Amazon Ground Truth, you can use workers from either Amazon Mechanical Turk, a vendor company that you choose, or an internal, private workforce along with machine learning to enable you to create a labeled dataset. You can use the labeled dataset output from Amazon Ground Truth to train your own models. You can also use the output as a training dataset for an Amazon SageMaker model. Amazon Ground Truth is designed specifically for creating labeled datasets for machine learning, using both automated and human labeling, often leveraging MTurk for the human labeling component.

Answer 87

SageMaker Ground Truth enables the creation of high-quality labeled datasets by incorporating human feedback in the labeling process, which can be used to improve reinforcement learning models Amazon SageMaker Ground Truth offers the most comprehensive set of human-in-the-loop capabilities for incorporating human feedback across the ML lifecycle to improve model accuracy and relevancy. You can complete various human-in-the-loop tasks, from data generation and annotation to reward model generation, model review, and customization through a self-service or AWS managed offering. SageMaker Ground Truth helps in creating high-quality labeled datasets by incorporating human feedback, which is crucial for training and refining reinforcement learning models. This human feedback ensures that the data used for training accurately reflects real-world scenarios, enhancing the effectiveness of RLHF. SageMaker Ground Truth includes a data annotator for RLHF capabilities. You can give direct feedback and guidance on output that a model has generated by ranking, classifying, or doing both for its responses for RL outcomes. The data, referred to as comparison and ranking data, is effectively a reward model or reward function, which is then used to train the model. You can use comparison and ranking data to customize an existing model for your use case or to fine-tune a model that you build from scratch.

Answer 88

The bias versus variance trade-off refers to the challenge of balancing the error due to the model's complexity (variance) and the error due to incorrect assumptions in the model (bias), where high bias can cause underfitting and high variance can cause overfitting The bias versus variance trade-off in machine learning is about finding a balance between bias (error due to overly simplistic assumptions in the model, leading to underfitting) and variance (error due to the model being too sensitive to small fluctuations in the training data, leading to overfitting). The goal is to achieve a model that generalizes well to new data.

Answer 89

high variance

Answer 90

Infrastructure as a Service (IaaS) Cloud Computing can be broadly divided into three types - Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). IaaS contains the basic building blocks for cloud IT. It typically provides access to networking features, computers (virtual or on dedicated hardware), and data storage space. IaaS gives the highest level of flexibility and management control over IT resources. EC2 gives you full control over managing the underlying OS, virtual network configurations, storage, data, and applications. So EC2 is an example of an IaaS service. Overview of the types of Cloud Computing: via - https://aws.amazon.com/types-of-cloud-computing/

Answer 91

Platform as a Service (PaaS) - PaaS removes the need to manage underlying infrastructure (usually hardware and operating systems), and allows you to focus on the deployment and management of your applications. You don’t need to worry about resource procurement, capacity planning, software maintenance, patching, or any of the other undifferentiated heavy lifting involved in running your application. Elastic Beanstalk is an example of a PaaS service. You can simply upload your code and Elastic Beanstalk automatically handles the deployment, from capacity provisioning, load balancing, and auto-scaling to application health monitoring.

Answer 92

Software as a Service (SaaS) - SaaS provides you with a complete product that is run and managed by the service provider. With a SaaS offering, you don’t have to think about how the service is maintained or how the underlying infrastructure is managed. You only need to think about how you will use that particular software. AWS Rekognition is an example of a SaaS service.

Answer 93

Model parameters are values that define a model and its behavior in interpreting input and generating responses. Model parameters are values that define a model and its behavior in interpreting input and generating responses. Model parameters are controlled and updated by providers. You can also update model parameters to create a new model through the process of model customization. In other words, Model parameters are the internal variables of the model that are learned and adjusted during the training process. These parameters directly influence the output of the model for a given input. Examples include the weights and biases in a neural network. via - https://docs.aws.amazon.com/bedrock/latest/userguide/key-definitions.html

Answer 94

Hyperparameters are values that can be adjusted for model customization to control the training process Hyperparameters are values that can be adjusted for model customization to control the training process and, consequently, the output custom model. In other words, hyperparameters are external configurations set before the training process begins. They control the training process and the structure of the model but are not adjusted by the training algorithm itself. Examples include the learning rate, the number of layers in a neural network, etc.

Answer 95

A level of throughput that you purchase for a base or custom model in order to increase the amount and/or rate of tokens processed during model inference. When you purchase Provisioned Throughput for a model, a provisioned model is created that can be used to carry out model inference. For more information, see Increase model invocation capacity with Provisioned Throughput in Amazon Bedrock.

Answer 96

A user-friendly graphical interface in the AWS Management Console in which you can experiment with running model inference to familiarize yourself with Amazon Bedrock. Use the playground to test out the effects of different models, configurations, and inference parameters on the responses generated for different prompts that you enter. For more information, see Generate responses in a visual interface using playgrounds.

Answer 97

The process of coordinating between foundation models and enterprise data and applications in order to carry out a task. For more information, see Automate tasks in your application using conversational agents.

Answer 98

An application that carry out orchestrations through cyclically interpreting inputs and producing outputs by using a foundation model. An agent can be used to carry out customer requests. For more information, see Automate tasks in your application using conversational agents.

Answer 99

Machine learning is a subset of artificial intelligence that involves training algorithms to learn from data Machine learning models perform more specific data analysis tasks - like classifying transactions as genuine or fraudulent, labeling images, or predicting the maintenance schedule of factory equipment.

Answer 100

artificial intelligence encompasses a wider range of technologies aimed at simulating human intelligence Artificial intelligence is an umbrella term for different strategies and techniques used to make machines more human-like. AI includes everything from smart assistants like Alexa, chatbots, and image generators to robotic vacuum cleaners and self-driving cars. In contrast, machine learning is a subset of artificial intelligence, focusing specifically on training algorithms to learn from data and make predictions or decisions.

Answer 101

Use Amazon SageMaker Model Cards to document critical details about your machine learning (ML) models in a single place for streamlined governance and reporting. Catalog details such as the intended use and risk rating of a model, training details and metrics, evaluation results and observations, and additional call-outs such as considerations, recommendations, and custom information. Specifying the intended uses of a model helps ensure that model developers and users have the information they need to train or deploy the model responsibly. The intended uses of a model go beyond technical details and describe how a model should be used in production, the scenarios in which is appropriate to use a model, and additional considerations such as the type of data to use with the model or any assumptions made during development. https://docs.aws.amazon.com/sagemaker/latest/dg/model-cards.html

Answer 102

SageMaker Clarify helps identify potential bias during data preparation without writing code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features.

Answer 103

SageMaker Canvas offers a no-code interface that can be used to create highly accurate machine learning models —without any machine learning experience or writing a single line of code. SageMaker Canvas provides access to ready-to-use models including foundation models from Amazon Bedrock or Amazon SageMaker JumpStart or you can build your custom ML model using AutoML powered by SageMaker AutoPilot.

Answer 104

Amazon SageMaker Model Monitor monitors the quality of Amazon SageMaker machine learning models in production. With Model Monitor, you can set up: Continuous monitoring with a real-time endpoint, Continuous monitoring with a batch transform job that runs regularly, and On-schedule monitoring for asynchronous batch transform jobs.

Answer 105

The Large Language Models (LLMs) are non-deterministic Large Language Models (LLMs) are non-deterministic, which implies that the generated text may be different for every user that uses the same prompt. You can use the inference parameter Temperature (having a value between 0 and 1), which regulates the creativity of LLMs’ responses. Use a lower temperature if you want more deterministic responses, and use a higher temperature if you want creative or different responses for the same prompt from LLMs on Amazon Bedrock. Large language models (LLMs) are one class of FMs.

Answer 106

Discriminative models are used for classification. They focus on distinguishing between different categories based on the features they observe.

Answer 107

Generative models are used for creation. They learn the patterns and features of the data they have seen and can generate new, similar data. LLMs are generative models.

Answer 108

Data residency is concerned with the physical location of data storage Data residency refers to the geographical or physical location where data is stored, which is crucial for compliance with regional laws and regulations.

Answer 109

data retention defines the policies for how long data should be stored and maintained Data retention, on the other hand, involves policies and practices related to how long data should be kept, archived, or deleted, ensuring that data is available when needed and disposed of when no longer required.

Answer 110

Amazon SageMaker Studio offers a broad set of fully managed integrated development environments (IDEs) for ML development, including JupyterLab, Code Editor based on Code-OSS (Visual Studio Code – Open Source), and RStudio.

Answer 111

The real-time inference is ideal for inference workloads where you have real-time, interactive, low latency requirements. You can deploy your model to SageMaker hosting services and get an endpoint that can be used for inference. These endpoints are fully managed and support autoscaling.

Answer 112

Used for workloads that have idle periods between traffic spikes and can tolerate cold starts.

Answer 113

Used for requests with large payload sizes up to 1GB, long processing times, and near real-time latency requirements.

Answer 114

To get predictions for an entire dataset, use SageMaker batch transform.

Answer 115

Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. For example, in an application that recommends a music playlist, features could include song ratings, listening duration, and listener demographics. You can ingest data into SageMaker Feature Store from a variety of sources, such as application and service logs, clickstreams, sensors, and tabular data from Amazon Simple Storage Service (Amazon S3), Amazon Redshift, AWS Lake Formation, Snowflake, and Databricks Delta Lake. How Feature Store works: via - https://aws.amazon.com/sagemaker/feature-store/

Answer 116

SageMaker Clarify helps identify potential bias during data preparation without writing code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features.

Answer 117

Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare tabular and image data for ML from weeks to minutes. With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow (including data selection, cleansing, exploration, visualization, and processing at scale) from a single visual interface.

Answer 118

Amazon SageMaker Ground Truth offers the most comprehensive set of human-in-the-loop capabilities, allowing you to harness the power of human feedback across the ML lifecycle to improve the accuracy and relevancy of models. You can complete a variety of human-in-the-loop tasks with SageMaker Ground Truth, from data generation and annotation to model review, customization, and evaluation, either through a self-service or an AWS-managed offering.

Answer 119

LLMs on Amazon Bedrock come with several inference parameters that you can set to control the response from the models. Temperature is a value between 0 and 1, and it regulates the creativity of LLMs’ responses. Use a lower temperature if you want more deterministic responses, and use a higher temperature if you want creative or different responses for the same prompt from LLMs on Amazon Bedrock. via - https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html

Answer 120

Top P represents the percentage of most likely candidates that the model considers for the next token. Choose a lower value to decrease the size of the pool and limit the options to more likely outputs. Choose a higher value to increase the size of the pool and allow the model to consider less likely outputs.

Answer 121

Top K represents the number of most likely candidates that the model considers for the next token. Choose a lower value to decrease the size of the pool and limit the options to more likely outputs. Choose a higher value to increase the size of the pool and allow the model to consider less likely outputs.

Answer 122

Stop sequences specify the sequences of characters that stop the model from generating further tokens. If the model generates a stop sequence that you specify, it will stop generating after that sequence.

Answer 123

Cloud computing is the on-demand delivery of IT resources over the Internet with pay-as-you-go pricing. Instead of buying, owning, and maintaining physical data centers and servers, you can access technology services, such as computing power, storage, and databases, on an as-needed basis from a cloud provider like Amazon Web Services (AWS).

Answer 124

Agility refers to the ability of the cloud to give you easy access to a broad range of technologies so that you can innovate faster and build nearly anything that you can imagine. You can quickly spin up resources as you need them – from infrastructure services, such as compute, storage, and databases, to the Internet of Things, machine learning, data lakes and analytics, and much more.

Answer 125

With cloud computing elasticity, you don’t have to over-provision resources upfront to handle peak levels of business activity in the future. Instead, you provision the number of resources that you need. You can scale these resources up or down instantly to grow and shrink capacity as your business needs change.

Answer 126

The cloud allows you to trade capital expenses (such as data centers and physical servers) for variable expenses, and only pay for IT as you consume it. Plus, the variable expenses are much lower than what you would pay to do it yourself because of the economies of scale.

Answer 127

With the cloud, you can expand to new geographic regions and deploy globally in minutes. For example, AWS has infrastructure all over the world, so you can deploy your application in multiple physical locations with just a few clicks. Putting applications in closer proximity to end users reduces latency and improves their experience.

Answer 128

Generative AI is important because it can autonomously create novel and complex data, enhancing creativity and efficiency in various domains Generative AI is important because it can autonomously create novel and complex data, which significantly enhances creativity and efficiency across various fields, such as content creation, design, and problem-solving. via - https://aws.amazon.com/what-is/generative-ai/

Answer 129

The company should use a multi-modal embedding model, which is designed to represent and align different types of data (such as text and images) in a shared embedding space, allowing the chatbot to understand and interpret both forms of input simultaneously A multi-modal embedding model is the most suitable choice for this task because it enables the integration of multiple types of data, such as text and images, into a unified representation. This allows the chatbot to effectively process and understand queries containing both text and visual content by aligning them in a shared embedding space, facilitating more accurate and context-aware responses. You can generate embeddings for your content and store them in a vector database. When an end user submits any combination of text and image as a search query, the model generates embeddings for the search query and matches them to the stored embeddings to provide relevant search and recommendations results to end users. For example, a stock photography company with hundreds of millions of images can use the model to power its search functionality, so users can search for images using a phrase, image, or a combination of image and text. You can further customize the model to enhance its understanding of your unique content and provide more meaningful results using image-text pairs for fine-tuning.

Answer 130

While a multi-modal generative model can generate outputs based on multiple types of input data, it is more complex and typically used for generating new content rather than interpreting and responding to queries. In addition, it is costlier to build and maintain a multi-modal generative model compared to a multi-modal embedding model. A multi-modal embedding model is more efficient for understanding and processing combined text and image inputs, whereas a generative model may be excessive if the primary goal is to process and respond to existing multi-modal queries.

Answer 131

A text-only language model cannot handle image data because it is designed to process and generate text exclusively. This model lacks the capability to understand or incorporate visual information, making it unsuitable for a chatbot that needs to interpret both text and images in user queries.

Answer 132

A convolutional neural network (CNN) is designed specifically for image recognition and processing tasks and is highly effective for analyzing visual data. However, it cannot process text-based inputs and therefore cannot fulfill the requirement of handling multi-modal queries that include both text and images. A CNN would need to be combined with other models to process text, which adds complexity without directly addressing the multi-modal nature of the queries.

Answer 133

The company should use a VPC endpoint for Amazon S3 that allows secure, private connectivity between the VPC and Amazon S3, without the need for an internet connection, ensuring data is transferred securely within the AWS network A VPC endpoint for Amazon S3 is the most appropriate choice because it creates a private connection between the VPC and Amazon S3 over the AWS network, without requiring internet access. This VPC endpoint allows the SageMaker model deployed within the VPC to securely access data from Amazon S3 directly, using the internal AWS network paths. It provides enhanced security by keeping data traffic within the AWS infrastructure and not exposing it to the public internet. You can use two types of VPC endpoints to access Amazon S3: gateway endpoints and interface endpoints (by using AWS PrivateLink). A gateway endpoint is a gateway that you specify in your route table to access Amazon S3 from your VPC over the AWS network. Interface endpoints extend the functionality of gateway endpoints by using private IP addresses to route requests to Amazon S3 from within your VPC, on premises, or from a VPC in another AWS Region by using VPC peering or AWS Transit Gateway.

Answer 134

An Internet Gateway enables VPC resources to communicate with the internet, but it is not suitable in this scenario because the VPC does not have internet access, and the objective is to securely access S3 without exposing data traffic to the public internet. Using an Internet Gateway would not only require additional security configurations but also go against the company's requirement to avoid internet access.

Answer 135

A SageMaker Inference endpoint allows clients to invoke deployed models and receive predictions but does not serve the purpose of connecting a SageMaker model to Amazon S3. It is not designed to handle data access between the SageMaker model and S3, so it would not meet the company's requirement for secure data retrieval from S3 within a VPC.

Answer 136

A NAT Gateway allows instances in a private subnet to access the internet, but it still routes traffic through the public internet, which may not align with the company's need for secure and private data transfer. Since the goal is to avoid internet access and maintain secure connectivity between the VPC and Amazon S3, a NAT Gateway is not the appropriate solution.

Answer 137

Amazon Rekognition is a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications. The service is powered by proven deep learning technology and it requires no machine learning expertise to use. Amazon Rekognition includes a simple, easy-to-use API that can quickly analyze any image or video file that’s stored in Amazon S3. You can add features that detect objects, text, and unsafe content, analyze images/videos, and compare faces to your application using Rekognition's APIs. With Amazon Rekognition's face recognition APIs, you can detect, analyze, and compare faces for a wide variety of use cases, including user verification, cataloging, people counting, and public safety. via - https://docs.aws.amazon.com/rekognition/latest/dg/text-detection.html

Answer 138

Amazon Textract is a document analysis service that detects and extracts printed text, handwriting, structured data (such as fields of interest and their values), and tables from images and scans of documents. Amazon Textract's machine learning models have been trained on millions of documents so that virtually any document type you upload is automatically recognized and processed for text extraction. While Amazon Textract can detect text from images and documents from a wide range of file formats, Recognition is trained on locating and identifying even small text from moving videos and images at various angles. Hence, Recognition is optimal here.

Answer 139

The Amazon SageMaker image classification algorithm is a supervised learning algorithm that supports multi-label classification. It takes an image as input and outputs one or more labels assigned to that image. It uses a convolutional neural network that can be trained from scratch or trained using transfer learning when a large number of training images are not available. SageMaker image classification algorithms need certain ML experience to train and tune the model whereas Rekognition is already trained to identify labels.

Answer 140

Amazon SageMaker JumpStart is a machine learning (ML) hub that can help you accelerate your ML journey. With SageMaker JumpStart, you can evaluate, compare, and select Foundation Models (FMs) quickly based on pre-defined quality and responsibility metrics to perform tasks like article summarization and image generation. Pretrained models are fully customizable for your use case with your data, and you can easily deploy them into production with the user interface or SDK.

Answer 141

The company should use object detection, which involves identifying and locating specific objects within an image Object detection is the correct concept for this use case. It is a computer vision technique that identifies instances of objects within images and videos, such as detecting and classifying different types of animals. Object detection models, such as those based on Convolutional Neural Networks (CNNs) like YOLO (You Only Look Once) or Faster R-CNN, can analyze visual data to accurately locate and identify various animal species within an image, making it the most appropriate choice for this task.

Answer 142

Named Entity Recognition (NER) is a text-based natural language processing technique, not a computer vision method. It identifies and classifies named entities in text, such as people, organizations, or locations. Since NER does not handle visual data or the identification of objects in images, it is not suitable for recognizing different types of animals in images.

Answer 143

Face recognition is a computer vision technique that focuses exclusively on detecting and verifying human faces in images or video streams. It uses algorithms to match facial features against a database of known faces but is not designed to recognize or classify non-human objects, such as different animal species. As a result, face recognition cannot be applied effectively for identifying various types of animals in images.

Answer 144

Poisoning refers to the intentional introduction of malicious or biased data into the training dataset of a model which leads to the model producing biased, offensive, or harmful outputs (intentionally or unintentionally). where the response includes a harmful or malicious suggestion (link to a malicious website).

Answer 145

Prompt Leaking refers to the unintentional disclosure or leakage of the prompts or inputs used within a model. It can expose protected data or other data used by the model, such as how the model works. where the AI model refers to information from a previous session, potentially revealing private or sensitive information that the user did not ask for in the current session.

Answer 146

A hiring algorithm consistently prefers candidates from a particular gender, even though the candidates' qualifications are similar across genders Biases are imbalances in data or disparities in the performance of a model across different groups. Bias may also be introduced by the ML algorithm itself—even with a well-balanced training dataset, the outcomes might favor certain subsets of the data as compared to others. This scenario illustrates algorithmic bias, where the hiring algorithm systematically favors candidates of a particular gender, indicating that there may be a bias in the training data or in the algorithm's design that leads to unequal treatment based on gender.

Answer 147

Dynamic prompt engineering involves modifying the input prompts to the Large Language Model (LLM) to customize the chatbot's responses based on the user's age. By altering the prompt dynamically, you can provide specific instructions or context to the LLM to generate age-appropriate responses. For example, if the user is a child, the prompt might include instructions to use simpler language or a friendly tone. This approach does not require changing the model itself and leverages Amazon Bedrock’s ability to interpret context from customized prompts effectively. To provide custom responses via an LLM chatbot built using Amazon Bedrock based on the user's age, you can implement a strategy that dynamically adjusts the chatbot's responses according to the age group of the user. For the given use case, you can leverage Amazon Bedrock to build a custom prompt logic for the LLM that dynamically adjusts the input prompt based on the user's age category, like the following example in Python: def generate_prompt(age_group, user_input): if age_group == 'Children': prompt = f"Respond in a friendly and simple manner suitable for children: {user_input}" elif age_group == 'Teens': prompt = f"Respond in a fun and engaging way suitable for teens, using popular culture references: {user_input}" elif age_group == 'Adults': prompt = f"Provide a concise and informative response suitable for adults: {user_input}" else: # Seniors prompt = f"Provide a clear and respectful response with detailed explanations suitable for seniors: {user_input}" return prompt Then, use the Amazon Bedrock API to send the customized prompts to the foundation model. The Bedrock service will generate responses based on the context provided in each prompt, adapting the output to fit the desired style and tone for the specific age group.

Answer 148

RAG is a technique that combines a retrieval mechanism (which fetches relevant documents or data from a knowledge base) with a generation model to provide more factual and context-rich responses. While RAG can enhance response accuracy by adding external context, it is not specifically designed for customizing responses based on user characteristics like age. RAG focuses on improving the relevance and factual accuracy of outputs, not on adapting the style or complexity of the language to suit different age groups.

Answer 149

Re-training the model involves using a large dataset to update the entire model's parameters, which is time-consuming, costly, and unnecessary for simply tailoring responses based on user age. Amazon Bedrock provides access to pre-trained foundation models that are already capable of generating diverse outputs based on the input prompts. Re-training is overkill for this task and is not the appropriate solution for generating age-specific responses dynamically.

Answer 150

Fine-tuning involves training the LLM on a specialized dataset to improve its performance on specific tasks or domains. However, this method is more suited for developing domain-specific expertise in the model rather than adjusting the style or tone of responses based on user age. Fine-tuning can be resource-intensive and time-consuming, and it is not necessary for generating age-appropriate responses when prompt engineering can dynamically handle the customization without modifying the model itself.

Answer 151

Overfitting occurs when a model performs well on the training data but poorly on new, unseen data Overfitting happens when a model learns the training data too well, including noise and outliers, leading to excellent performance on the training data but poor generalization to new, unseen data.

Answer 152

Underfitting occurs when a model performs poorly on both the training data and new, unseen data Underfitting occurs when a model is too simplistic to capture the underlying patterns in the data, resulting in poor performance on both the training data and new data.

Answer 153

Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare tabular and image data for ML from weeks to minutes. With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow (including data selection, cleansing, exploration, visualization, and processing at scale) from a single visual interface. You can use SQL to select the data that you want from various data sources and import it quickly. Next, you can use the data quality and insights report to automatically verify data quality and detect anomalies, such as duplicate rows and target leakage. SageMaker Data Wrangler contains over 300 built-in data transformations, so you can quickly transform data without writing code.

Answer 154

Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. For example, in an application that recommends a music playlist, features could include song ratings, listening duration, and listener demographics.

Answer 155

SageMaker Clarify helps identify potential bias during data preparation without writing code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features.

Answer 156

Amazon SageMaker Ground Truth offers the most comprehensive set of human-in-the-loop capabilities, allowing you to harness the power of human feedback across the ML lifecycle to improve the accuracy and relevancy of models. You can complete a variety of human-in-the-loop tasks with SageMaker Ground Truth, from data generation and annotation to model review, customization, and evaluation, either through a self-service or an AWS-managed offering.

Answer 157

(1) Validation sets are optional (2) Test set is used to determine how well the model generalizes Data used for ML is typically split into the following datasets: The training set is used to train the model, the validation set is used for tuning hyperparameters and selecting the best model during the training process, and the test set is used for evaluating the final performance of the model on unseen data. (1) Validation sets are optional The validation set introduces new data to the trained model. You can use a validation set to periodically measure model performance as training is happening and also tune any hyperparameters of the model. However, validation datasets are optional. Only validation sets are optional. (2) Test set is used to determine how well the model generalizes The test set is used on the final trained model to assess its performance on unseen data. This helps determine how well the model generalizes. Only the test set is used to determine how well the model generalizes.

Answer 158

The training set is used to train the model

Answer 159

the validation set is used for tuning hyperparameters and selecting the best model during the training process

Answer 160

the test set is used for evaluating the final performance of the model on unseen data.

Answer 161

Interpretability is about understanding the internal mechanisms of a machine learning model Interpretability refers to how easily a human can understand the reasoning behind a model's predictions or decisions. It's about making the inner workings of a machine learning model transparent and comprehensible. https://docs.aws.amazon.com/whitepapers/latest/model-explainability-aws-ai-ml/interpretability-versus-explainability.html

Answer 162

Explainability focuses on providing understandable reasons for the model's predictions and behaviors to stakeholders Explainability goes a step further by providing insights into why a model made a specific prediction, especially when the model itself is complex and not inherently interpretable. It involves using methods and tools to make the predictions of complex models understandable to humans.

Answer 163

Generative models focus on generating new data from learned patterns, whereas discriminative models classify data by distinguishing between different classes Generative models learn the underlying patterns of data to create new, similar data, while discriminative models learn to distinguish between different classes of data. Generative models, such as GPT-3, can generate new content, whereas discriminative models are used for classification tasks. The former focuses on understanding and replicating the data distribution, while the latter focuses on decision boundaries to classify inputs. For example, discriminative models look at images - known data like pixel arrangement, line, color, and shape — and then map them to an outcome — the unknown factor. Mathematically, these models work by identifying equations that could numerically map unknown and known factors as x and y variables. Generative models take this one step further. Instead of predicting a label given some features, they try to predict features given a certain label. Mathematically, generative modeling calculates the probability of x and y occurring together. It learns the distribution of different data features and their relationships. For example, generative models analyze animal images to record variables like different ear shapes, eye shapes, tail features, and skin patterns. They learn features and their relations to understand what different animals look like in general. They can then recreate new animal images that were not in the training set.

Answer 164

Discriminative models are used primarily for classification, not for generating new data. Discriminative models can be used for both text and image classification, while generative models learn the underlying patterns of data to create new data.

Answer 165

The company should iteratively test and adjust the chatbot prompts to ensure that its outputs consistently reflect the company's tone and style This is the correct approach because it directly focuses on fine-tuning the chatbot’s behavior via prompt engineering. Experimenting with and refining the prompt allows the company to guide the chatbot towards generating responses that are aligned with its specific tone and communication style. This process involves providing clear instructions or examples in the prompt and making iterative adjustments based on the chatbot's output until the desired tone is achieved.

Answer 166

The model can recreate new animal images that were not in the training dataset Generative artificial intelligence (generative AI) is a type of AI that can create new content and ideas, including conversations, stories, images, videos, and music. AI technologies attempt to mimic human intelligence in nontraditional computing tasks like image recognition, natural language processing (NLP), and translation. Generative models can analyze animal images to record variables like different ear shapes, eye shapes, tail features, and skin patterns. They learn features and their relations to understand what different animals look like in general. They can then recreate new animal images that were not in the training set. via - https://aws.amazon.com/what-is/generative-ai/

Answer 167

Traditional machine learning models were discriminative or focused on classifying data points. They attempted to determine the relationship between known and unknown factors. For example, they look at images—known data like pixel arrangement, line, color, and shape—and map them to words—the unknown factor. Only discriminative models can act as single-class classifiers or multi-class classifiers. Therefore, both these options are incorrect.

Answer 168

The company should use Accuracy, which measures the proportion of correctly predicted instances (both true positives and true negatives) out of the total number of instances Accuracy is the most appropriate metric when the goal is to understand the proportion of correct outcomes in a binary classification problem. It provides a straightforward measure of how often the model correctly predicts the positive and negative classes. Accuracy is a suitable choice when the dataset is balanced (i.e., the number of positive and negative instances is approximately equal) and when the company wants a simple, overall performance measure of the model's correctness.

Answer 169

RMSE is a metric used to measure the average magnitude of errors in a regression model's predictions. It is not appropriate for binary classification tasks because it is designed to assess continuous numeric predictions rather than categorical outcomes. Therefore, RMSE does not provide meaningful insights into the correct or incorrect outcomes in a classification context.

Answer 170

R-squared is a metric that measures the goodness of fit in regression models. It shows how well the independent variables explain the variance in the dependent variable. Since R-squared is specific to regression tasks and not applicable to classification problems, it is not a suitable metric for evaluating the correct outcomes in a binary classification scenario.

Answer 171

The F1 Score is a useful metric when dealing with imbalanced datasets in binary classification, as it balances precision (the proportion of true positive predictions among all positive predictions) and recall (the proportion of true positives among all actual positive instances). However, if the primary focus is simply to measure the correct outcomes without concern for the balance between precision and recall, then accuracy is a more straightforward metric. F1 Score is most appropriate when both false positives and false negatives need to be minimized, but it may not be necessary if the dataset is balanced and the company only wants to know the overall proportion of correct predictions.

Answer 172

With IAM Identity Center, you can create or connect workforce users and centrally manage their access across all their AWS accounts and applications. You need to configure an IAM Identity Center instance for your Amazon Q Business application environment with users and groups added. Amazon Q Business supports both organization and account-level IAM Identity Center instances.

Answer 173

An AWS account is a container for your AWS resources. You create and manage your AWS resources in an AWS account, and the AWS account provides administrative capabilities for access and billing.

Answer 174

AWS IAM service is a powerful tool for securely managing access to your AWS resources. One of the primary benefits of using IAM is the ability to grant shared access to your AWS account. Additionally, IAM allows you to assign granular permissions, enabling you to control exactly what actions different users can perform on specific resources.

Answer 175

An IAM user is an entity that you create in AWS. The IAM user represents the human user or workload who uses the IAM user to interact with AWS. A user in AWS consists of a name and credentials.

Answer 176

FMs use unlabeled training data sets for self-supervised learning In supervised learning, you train the model with a set of input data and a corresponding set of paired labeled output data. Unsupervised machine learning is when you give the algorithm input data without any labeled output data. Then, on its own, the algorithm identifies patterns and relationships in and between the data. Self-supervised learning is a machine learning approach that applies unsupervised learning methods to tasks usually requiring supervised learning. Instead of using labeled datasets for guidance, self-supervised models create implicit labels from unstructured data. Foundation models use self-supervised learning to create labels from input data. This means no one has instructed or trained the model with labeled training data sets.

Answer 177

Customer service chatbot Medical queries chatbot To equip foundation models (FMs) with up-to-date and proprietary information, organizations use Retrieval Augmented Generation (RAG), a technique that fetches data from company data sources and enriches the prompt to provide more relevant and accurate responses. Knowledge Bases for Amazon Bedrock is a fully managed capability that helps you implement the entire RAG workflow from ingestion to retrieval and prompt augmentation without having to build custom integrations to data sources and manage data flows. Some of the common use cases that can be addressed via RAG in Amazon Bedrock are customer service chatbot, medical queries chatbot, legal research and analysis, etc.

Answer 178

Applying multiple layers of security measures including input validation, access controls, and continuous monitoring to address vulnerabilities Architecting a defense-in-depth security approach involves implementing multiple layers of security to protect generative AI applications. This includes input validation to prevent malicious data inputs, strict access controls to limit who can interact with the AI models, and continuous monitoring to detect and respond to security incidents. These measures can help address common vulnerabilities and meet the best practices for securing generative AI applications on AWS.

Answer 179

Through the no-code interface of SageMaker Canvas, you can create highly accurate machine-learning models — without any machine-learning experience or writing a single line of code. SageMaker Canvas provides access to ready-to-use models including foundation models from Amazon Bedrock or Amazon SageMaker JumpStart or you can build your custom ML model using AutoML powered by SageMaker AutoPilot. With SageMaker Canvas, you can use SageMaker Data Wrangler to easily access and import data from 50+ sources, prepare data using natural language and 300+ built-in transforms, build and train highly accurate models, generate predictions, and deploy models to production. Amazon SageMaker Canvas provides a visual point-and-click interface for business analysts to solve business problems using ML such as customer churn prediction, fraud detection, forecasting financial metrics and sales, inventory optimization, content generation, and more without writing any code. How SageMaker Canvas works: via - https://docs.aws.amazon.com/sagemaker/latest/dg/canvas.html

Answer 180

Amazon SageMaker Model Dashboard is a centralized portal, accessible from the SageMaker console, where you can view, search, and explore all of the models in your account. You can track which models are deployed for inference and if they are used in batch transform jobs or hosted on endpoints.

Answer 181

Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare tabular and image data for ML from weeks to minutes. With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow (including data selection, cleansing, exploration, visualization, and processing at scale) from a single visual interface.

Answer 182

SageMaker Clarify helps identify potential bias during data preparation without writing code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features.

Answer 183

Human-in-the-loop is the process of harnessing human input across the ML lifecycle to improve the accuracy and relevancy of models. Humans can perform a variety of tasks, from data generation and annotation, to model review and customization. Human intervention is especially important for generative AI applications, where humans are typically both the requester and consumer of the content. It is therefore critical that humans train foundation models (FMs) how to respond accurately, safely, and relevantly to users’ prompts. Human feedback can be applied to help you complete multiple tasks. Amazon SageMaker Ground Truth offers the most comprehensive set of human-in-the-loop capabilities. There are two ways to use Amazon SageMaker Ground Truth, a self-service offering and an AWS-managed offering. In the self-service offering, your data annotators, content creators, and prompt engineers (in-house, vendor-managed, or leveraging the public crowd) can use our low-code user interface to accelerate human-in-the-loop tasks, while having the flexibility to build and manage your custom workflows. In the AWS-managed offering (SageMaker Ground Truth Plus), AWS handles the heavy lifting, which includes selecting and managing the right workforce for your use case. SageMaker Ground Truth Plus designs and customizes an end-to-end workflow (including detailed workforce training and quality assurance steps) and provides a skilled AWS-managed team that is trained on the specific tasks and meets your data quality, security, and compliance requirements. AWS-Managed Amazon SageMaker Ground Truth: via - https://aws.amazon.com/sagemaker/groundtruth/

Answer 184

You use Amazon SageMaker Role Manager to build and manage persona-based IAM roles for common machine learning needs directly through the Amazon SageMaker console. Amazon SageMaker Role Manager provides 3 preconfigured role personas and predefined permissions for 12 common ML activities.

Answer 185

Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. For example, in an application that recommends a music playlist, features could include song ratings, listening duration, and listener demographics.

Answer 186

SageMaker Clarify helps identify potential bias during data preparation without writing code. You specify input features, such as gender or age, and SageMaker Clarify runs an analysis job to detect potential bias in those features.

Answer 187

Machine Learning models can be deterministic or probabilistic or a mix of both Machine Learning models can be deterministic or probabilistic or a mix of both, depending on their nature and how they are designed to operate. Deterministic models always produce the same output given the same input. Their behavior is predictable and consistent. Example: Decision Trees: Given the same input data, a decision tree will always follow the same path and produce the same output. Probabilistic models provide a distribution of possible outcomes rather than a single output. They incorporate uncertainty and randomness in their predictions. Example: Bayesian Networks: These models represent probabilistic relationships among variables and provide probabilities for different outcomes. Some models combine both deterministic and probabilistic elements, such as neural networks and random forests. via - https://aws.amazon.com/what-is/machine-learning/

Answer 188

Amazon SageMaker Clarify is a service provided by AWS (Amazon Web Services) that helps developers detect biases and explain the predictions made by machine learning models. It is part of the Amazon SageMaker suite of machine learning tools and focuses on enhancing transparency, fairness, and explainability in machine learning workflows.

Answer 189

Amazon SageMaker Model Monitor is a service within the Amazon SageMaker suite that helps developers continuously monitor machine learning models deployed in production. It ensures that models maintain optimal performance and make accurate predictions over time by detecting data quality issues, concept drift, and other anomalies. Amazon SageMaker Model Monitor automatically detects and alerts you to inaccurate predictions from deployed models.

Answer 190

Amazon SageMaker JumpStart is a machine learning hub with foundation models, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks.

Answer 191

Amazon Inspector is an automated vulnerability management service that continually scans AWS workloads for software vulnerabilities and unintended network exposure.

Answer 192

AWS Audit Manager helps you assess internal risk with prebuilt frameworks that translate evidence from cloud services into security IT audit reports.

Answer 193

Chain-of-thought prompting is a technique that breaks down a complex question into smaller, logical parts that mimic a train of thought. This helps the model solve problems in a series of intermediate steps rather than directly answering the question. This enhances its reasoning ability. It involves guiding the model through a step-by-step process to arrive at a solution or generate content, thereby enhancing the quality and coherence of the output. via - https://aws.amazon.com/what-is/prompt-engineering/

Answer 194

Negative prompting refers to guiding a generative AI model to avoid certain outputs or behaviors when generating content. In the context of AWS generative AI, like those using Amazon Bedrock, negative prompting is used to refine and control the output of models by specifying what should not be included in the generated content. For example, when generating text for a marketing campaign, a company can use negative prompts to exclude competitive brand names or sensitive topics.

Answer 195

In zero shot prompting, you present a task to the model without providing examples or explicit training for that specific task.

Answer 196

In few shot prompting, you provide a few examples of a task to the model to guide its output.

Answer 197

Amazon Augmented AI (A2I) is a service that helps implement human review workflows for machine learning predictions. It integrates human judgment into ML workflows, allowing for reviews and corrections of model predictions, which is critical for applications requiring high accuracy and accountability.

Answer 198

Amazon SageMaker Model Monitor is a service that continuously monitors the quality of machine learning models in production and helps detect data drift, model quality issues, and anomalies. It ensures that models perform as expected and alerts users to issues that might require human intervention.

Answer 199

Amazon SageMaker Data Wrangler is designed to simplify and streamline the process of data preparation for machine learning, not specifically for monitoring or human review.

Answer 200

Amazon SageMaker Feature Store is a fully managed repository for storing, updating, and retrieving machine learning features. While it aids in managing features used by models, it does not directly handle monitoring or human review processes.

Answer 201

Amazon SageMaker Ground Truth is used for building highly accurate training datasets for machine learning quickly. It does involve human annotators for labeling data, but it is not specifically designed for monitoring or human review of model predictions in production.

Answer 202

They facilitate easier debugging and optimization Transparent models allow developers to understand how inputs are transformed into outputs, making it easier to identify and correct errors or inefficiencies in the model. This capability is crucial for optimizing the model’s performance and ensuring it behaves as expected. They foster trust and confidence in model predictions When stakeholders can understand the decision-making process of a model, it builds trust in its predictions. Transparency is key in high-stakes scenarios, such as healthcare or finance, where understanding the rationale behind predictions is critical for acceptance and trust.

Answer 203

Amazon Q Developer can suggest code snippets, providing developers with recommendations for code based on specific tasks or requirements This is the correct option because Amazon Q Developer is designed to assist developers by providing code suggestions and recommendations that align with their coding tasks. It leverages machine learning models trained on vast datasets to suggest code snippets, optimize code efficiency, and help developers follow best practices. This functionality helps speed up development processes and enhances productivity.

AI Practice Test #2 Flashcards

(228 cards)