AI Practice Test #1.5 Flashcards

1
Q

Asynchronous inference

A

Asynchronous inference is the most suitable choice for this scenario. It allows the company to process smaller payloads without requiring real-time responses by queuing the requests and handling them in the background. This method is cost-effective and efficient when some delay is acceptable, as it frees up resources and optimizes compute usage. Asynchronous inference is ideal for scenarios where the payload size is less than 1 GB and immediate results are not critical.
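
For illustration, here is a minimal boto3 sketch of how an Amazon SageMaker asynchronous endpoint might be invoked; the endpoint name and S3 paths are placeholder assumptions, not values from this card.

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

# The payload is read from S3 rather than sent inline; the request is
# queued and the result is written back to S3 when processing finishes.
response = runtime.invoke_endpoint_async(
    EndpointName="my-async-endpoint",  # hypothetical endpoint name
    InputLocation="s3://my-bucket/payloads/request-001.json",
    ContentType="application/json",
)

# Where the eventual result will land in S3.
print(response["OutputLocation"])
```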

2
Q

Batch inference

A

Batch inference is generally used for processing large datasets all at once. While it does not require immediate responses, it is typically more efficient for handling larger payloads (several gigabytes or more). For smaller payloads of less than 1 GB, batch inference might be overkill and less cost-efficient compared to asynchronous inference.

3
Q

Real-time inference

A

Real-time inference is optimized for scenarios where low latency is essential, and responses are needed immediately. It is not suitable for cases where the system can afford to wait for responses, as it might lead to higher costs and resource consumption without providing any additional benefit for this particular use case.

4
Q

Serverless inference

A

Serverless inference is a good choice for workloads with unpredictable traffic or sporadic requests, as it scales automatically based on demand. However, it may not be as cost-effective for scenarios where workloads are predictable, and some waiting time is acceptable. Asynchronous inference provides a more targeted solution for handling delayed responses at a lower cost.

5
Q

What are the key constituents of a good prompting technique in this context?

A

(1) Instructions – the task you want the model to perform, including a description of what to do and how to perform it

(2) Context – external information to guide the model

(3) Input data – the input for which you want a response

(4) Output indicator – the desired output type or format
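
As a minimal sketch, the four constituents might be assembled into a single prompt like this; all strings below are invented examples.

```python
instructions = "Classify the sentiment of the customer review."    # (1) Instructions
context = "Reviews come from an online electronics store."         # (2) Context
input_data = "Review: 'The headphones stopped working in a week.'" # (3) Input data
output_indicator = "Answer with one word: positive, negative, or neutral."  # (4) Output indicator

prompt = f"{instructions}\n\n{context}\n\n{input_data}\n\n{output_indicator}"
print(prompt)
```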

6
Q

Hyperparameters

A

Hyperparameters are values that can be adjusted for model customization to control the training process and, consequently, the resulting custom model. In other words, hyperparameters are external configurations set before the training process begins. They control the training process and the structure of the model but are not adjusted by the training algorithm itself. Examples include the learning rate and the number of layers in a neural network.

7
Q

Model parameters

A

Model parameters are values that define a model and its behavior in interpreting input and generating responses. Model parameters are controlled and updated by providers. You can also update model parameters to create a new model through the process of model customization. In other words, model parameters are the internal variables of the model that are learned and adjusted during the training process. These parameters directly influence the output of the model for a given input. Examples include the weights and biases in a neural network.

8
Q

Few-shot prompting

A

The data should include user input along with the correct user intent, providing examples of user queries and the corresponding intents

This is the correct answer because few-shot prompting involves providing the model with examples that include both the user input and the correct user intent. These examples help the model understand and learn how to map various user queries to their appropriate intents. By repeatedly seeing this pairing, the model can generalize from these examples and improve its ability to recognize user intent in new, unseen queries.
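
A minimal sketch of such a few-shot prompt for intent recognition; the queries and intent labels are invented examples.

```python
few_shot_prompt = """\
Query: "Where is my package?"            Intent: order_tracking
Query: "I want my money back."           Intent: refund_request
Query: "Do you ship to Canada?"          Intent: shipping_info
Query: "My discount code isn't working." Intent:"""
# The model completes the last line with an intent label, generalizing
# from the labeled query/intent pairs shown above it.
```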

9
Q

Retrieval-Augmented Generation (RAG)

A

Utilize a Retrieval-Augmented Generation (RAG) system by indexing all product catalog PDFs and configuring the LLM chatbot to reference this system for answering queries

Using a RAG approach is the least costly and most efficient solution for providing up-to-date and relevant responses. In this approach, you convert all product catalog PDFs into a searchable knowledge base. When a customer query comes in, the RAG framework first retrieves the most relevant pieces of information from this knowledge base and then uses an LLM to generate a coherent response based on the retrieved context. This method does not require re-training the model or modifying every incoming query with large datasets, making it significantly more cost-effective. It ensures that the chatbot always has access to the most recent information without needing expensive updates or processing every time.
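
A toy sketch of the retrieve-then-generate flow; it scores documents by naive word overlap instead of vector embeddings, and the final LLM call is left out, so treat it as an illustration of the pattern only.

```python
catalog = [
    "Model X200 blender: 1200 W motor, 2-year warranty.",
    "Model K55 kettle: 1.7 L capacity, auto shut-off.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Naive relevance score: words shared between the query and a document.
    scores = [(len(set(query.lower().split()) & set(d.lower().split())), d)
              for d in docs]
    return [d for _, d in sorted(scores, reverse=True)[:k]]

query = "What is the warranty on the X200 blender?"
context = "\n".join(retrieve(query, catalog))

# The retrieved context is prepended to the question before the LLM call,
# grounding the generated answer in the knowledge base.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```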

10
Q

Stable Diffusion

A

Stable Diffusion is a generative artificial intelligence (generative AI) model that produces unique photorealistic images from text and image prompts.

11
Q

Llama

A

Llama is a series of large language models trained on publicly available data. They are built on the transformer architecture, enabling them to handle variable-length input sequences and produce variable-length output sequences. A notable feature of Llama models is their capacity to generate coherent and contextually appropriate text.

12
Q

Jurassic

A

The Jurassic family of models from AI21 Labs supported use cases such as question answering, summarization, draft generation, advanced information extraction, and ideation for tasks requiring intricate reasoning and logic.

13
Q

Claude

A

Claude is Anthropic’s frontier, state-of-the-art large language model that offers important features for enterprises like advanced reasoning, vision analysis, code generation, and multilingual processing.

14
Q

Amazon Comprehend

A

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to uncover insights and relationships in text. It is specifically designed for tasks such as sentiment analysis, entity recognition, key phrase extraction, and language detection. For the scenario of analyzing customer reviews, Amazon Comprehend can directly determine the overall sentiment of a text (positive, negative, neutral, or mixed), making it the ideal service for this purpose. By using Amazon Comprehend, e-commerce platforms can effectively analyze customer feedback, understand customer satisfaction levels, and identify common themes or concerns.
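
A minimal boto3 sketch of sentiment detection with Amazon Comprehend; the review text is an invented example.

```python
import boto3

comprehend = boto3.client("comprehend")

review = "Delivery was slow, but the product quality is excellent."
result = comprehend.detect_sentiment(Text=review, LanguageCode="en")

print(result["Sentiment"])       # POSITIVE, NEGATIVE, NEUTRAL, or MIXED
print(result["SentimentScore"])  # confidence score for each class
```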

15
Q

Amazon Bedrock

A

Amazon Bedrock is an AI service that provides access to foundation models (large language models, including those for NLP tasks) via an API. While Amazon Bedrock is not specifically an NLP service like Amazon Comprehend, it can be used to fine-tune pre-trained foundation models for various tasks, including sentiment analysis. With the proper configuration and fine-tuning, Bedrock can analyze text data to determine sentiment, making it a versatile option for advanced users who may need more customizable solutions than Amazon Comprehend.

16
Q

Amazon Rekognition

A

Amazon Rekognition is a service designed for analyzing images and videos, not text. It can identify objects, people, text within images, and even detect inappropriate content in images and videos. However, it does not provide any capabilities for natural language processing or sentiment analysis, making it unsuitable for analyzing written customer reviews.

17
Q

Amazon Textract

A

Amazon Textract is an OCR (Optical Character Recognition) service that extracts printed or handwritten text from scanned documents, PDFs, and images. It is useful for digitizing text but does not offer any features for analyzing or interpreting the sentiment of the extracted text. Since Textract focuses on text extraction rather than understanding or analyzing the content, it is not suitable for sentiment analysis tasks.

18
Q

Amazon Personalize

A

Amazon Personalize is a service that provides personalized recommendations, search, and ranking for websites and applications based on user behavior and preferences. While it can help improve customer experience by suggesting products or content based on historical data, it does not offer natural language processing or sentiment analysis capabilities. Thus, it is not the correct choice for analyzing written customer reviews to determine sentiment.

19
Q

Model invocation logging

A

The company should enable model invocation logging, which allows for detailed logging of all requests and responses during model invocations in Amazon Bedrock

You can use model invocation logging to collect invocation logs, model input data, and model output data for all invocations in your AWS account used in Amazon Bedrock. With invocation logging, you can collect the full request data, response data, and metadata associated with all calls performed in your account. When you configure logging, you specify the destination resources where the log data will be published. Supported destinations include Amazon CloudWatch Logs and Amazon Simple Storage Service (Amazon S3). Only destinations from the same account and Region are supported. Model invocation logging is disabled by default.

This is the correct option because enabling invocation logging on Amazon Bedrock allows the company to capture detailed logs of all model requests and responses, including input data, output predictions, and any errors that occur during model execution. This method provides comprehensive monitoring capabilities, enabling the company to effectively track, audit, and troubleshoot model performance and usage.
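
A minimal boto3 sketch of enabling model invocation logging; the bucket, log group, and role ARN are placeholder assumptions.

```python
import boto3

bedrock = boto3.client("bedrock")

bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        "cloudWatchConfig": {
            "logGroupName": "/bedrock/invocations",                   # assumed
            "roleArn": "arn:aws:iam::123456789012:role/BedrockLogs",  # assumed
        },
        "s3Config": {
            "bucketName": "my-bedrock-logs",  # assumed bucket
            "keyPrefix": "invocations/",
        },
        "textDataDeliveryEnabled": True,  # include prompt/response text
    }
)
```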

20
Q

AWS CloudTrail

A

While AWS CloudTrail is useful for tracking API calls and monitoring who accessed which AWS resources, it does not capture the actual input and output data involved in model invocations. CloudTrail logs are primarily intended for auditing access and managing security rather than monitoring detailed data flow or model performance on Amazon Bedrock.

21
Q

Amazon EventBridge

A

Amazon EventBridge is designed to react to changes and events across AWS resources and trigger workflows or automate responses. Although it can track when a model invocation occurs, it does not provide detailed logging of the input and output data associated with these invocations, limiting its usefulness for comprehensive monitoring purposes.

22
Q

AWS Config

A

AWS Config is specifically designed for monitoring and managing AWS resource configurations and compliance, not for tracking or logging the input and output data of machine learning models on Amazon Bedrock. AWS Config focuses on configuration management and does not provide the level of detail required to monitor data traffic or model performance in machine learning applications.

23
Q

Generative Adversarial Network (GAN)

A

The company should use a Generative Adversarial Network (GAN) for creating realistic synthetic data while preserving the statistical properties of the original data

This is the correct answer because GANs are specifically designed for generating synthetic data that is statistically similar to real data. They consist of two neural networks—a generator and a discriminator—that work against each other to create highly realistic synthetic data. GANs have been successfully used in various domains, including image generation, text synthesis, and more, to produce data that retains the underlying patterns and structures of the original dataset, making them highly suitable for this purpose.
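
A minimal PyTorch sketch of the adversarial setup, assuming a toy task of mimicking a one-dimensional Gaussian; a real synthetic-data GAN would use larger networks and real training data.

```python
import torch
import torch.nn as nn

real_data = lambda n: torch.randn(n, 1) + 3.0  # "real" samples from N(3, 1)

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for _ in range(2000):
    real, noise = real_data(64), torch.randn(64, 8)
    fake = G(noise)

    # Discriminator: label real samples 1 and generated samples 0.
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator output 1 for fakes.
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()

print(G(torch.randn(1000, 8)).mean().item())  # should drift toward 3.0
```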

24
Q

Support Vector Machines (SVMs)

A

SVMs are used for classification and regression, where the algorithm finds the optimal hyperplane that best separates different classes in the data. SVMs do not generate new data or create synthetic datasets, so they are not suitable for a task that requires generating synthetic data based on existing datasets.

25
Q

Convolutional Neural Network (CNN)

A

CNNs are designed for tasks such as image and video recognition, object detection, and similar applications involving grid-like data (such as pixels in an image). While CNNs are excellent at feature extraction and classification in images, they are not suitable for generating synthetic data, especially for non-visual data types.

26
Q

WaveNet

A

WaveNet is tailored for audio data generation, specifically for tasks such as speech synthesis and audio signal processing. While it is powerful within its specific domain, it is not designed for generating synthetic data outside of audio, making it an unsuitable choice for general-purpose data generation.

27
Q

Exploratory Data Analysis (EDA)

A

The company is in the Exploratory Data Analysis (EDA) phase, which involves examining the data through statistical summaries and visualizations to identify patterns, detect anomalies, and form hypotheses. This phase is crucial for understanding the dataset’s structure and characteristics, making it the most appropriate description of the current activities. Tasks like calculating statistics and visualizing data are fundamental to EDA, helping to uncover patterns, detect outliers, and gain insights into the data before any modeling is done. EDA serves as the foundation for building predictive models by providing a deep understanding of the data.
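
A minimal pandas sketch of typical EDA steps; the input file name is a placeholder, and the histogram call assumes matplotlib is installed.

```python
import pandas as pd

df = pd.read_csv("data.csv")       # hypothetical dataset

print(df.describe())               # summary statistics per numeric column
print(df.isna().sum())             # missing values per column
print(df.corr(numeric_only=True))  # pairwise correlations
df.hist(figsize=(10, 8))           # quick look at distributions
```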

28
Q

Data Preparation

A

Data preparation involves cleaning and preprocessing the data to make it suitable for analysis or modeling. This may include handling missing values, removing duplicates, or transforming variables, but it does not typically involve calculating statistics and visualizing data. While data preparation is an important step, it does not encompass the exploratory analysis activities described in the question.

29
Q

Data Augmentation

A

Data augmentation is a technique used primarily in machine learning to artificially increase the size and variability of the training dataset by creating modified versions of the existing data, such as flipping images or adding noise. It is not related to the tasks of calculating statistics or visualizing data, which are part of EDA.

30
Q

Model Evaluation

A

Model evaluation refers to assessing the performance of a machine learning model using specific metrics such as accuracy, precision, recall, or F1 score. Model evaluation does not involve exploratory tasks like calculating statistics or visualizing data; instead, it focuses on validating the effectiveness of a trained model. Therefore, this phase does not align with the company’s current activities.

31
Q

Amazon Bedrock

A

Amazon Bedrock is the easiest way to build and scale generative AI applications with foundation models. Amazon Bedrock is a fully managed service that makes foundation models from Amazon and leading AI startups available through an API, so you can choose from various FMs to find the model that’s best suited for your use case. With Bedrock, you can speed up developing and deploying scalable, reliable, and secure generative AI applications without managing infrastructure.

32
Q

Amazon SageMaker JumpStart

A

Amazon SageMaker JumpStart is a machine learning hub with foundation models, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks. With SageMaker JumpStart, you can access pre-trained models, including foundation models, to perform tasks like article summarization and image generation. Pretrained models are fully customizable for your use case with your data, and you can easily deploy them into production with the user interface or SDK.

32
Q

Amazon Q

A

Amazon Q is a generative AI–powered assistant for accelerating software development and leveraging companies’ internal data. Amazon Q generates code, tests, and debugs. It has multistep planning and reasoning capabilities that can transform and implement new code generated from developer requests.

33
Q

AWS Trainium

A

AWS Trainium is the machine learning (ML) chip that AWS purpose-built for deep learning (DL) training of 100B+ parameter models. Each Amazon Elastic Compute Cloud (Amazon EC2) Trn1 instance deploys up to 16 Trainium accelerators to deliver a high-performance, low-cost solution for DL training in the cloud.

34
Q

AWS Inferentia

A

AWS Inferentia is an ML chip purpose-built by AWS to deliver high-performance inference at a low cost. AWS Inferentia accelerators are designed by AWS to deliver high performance at the lowest cost in Amazon EC2 for your deep learning (DL) and generative AI inference applications.

35
Q

Foundation Models

A

FMs use self-supervised learning to create labels from input data; however, fine-tuning an FM is a supervised learning process

In supervised learning, you train the model with a set of input data and a corresponding set of paired labeled output data. Unsupervised machine learning is when you give the algorithm input data without any labeled output data. Then, on its own, the algorithm identifies patterns and relationships in and between the data. Self-supervised learning is a machine learning approach that applies unsupervised learning methods to tasks usually requiring supervised learning. Instead of using labeled datasets for guidance, self-supervised models create implicit labels from unstructured data.

Foundation models use self-supervised learning to create labels from input data. This means no one has instructed or trained the model with labeled training data sets.

36
Q

Fine-tuning

A

Fine-tuning a pre-trained foundation model is an affordable way to take advantage of its broad capabilities while customizing the model on your own small corpus. Fine-tuning involves further training a pre-trained language model on a specific task or domain-specific dataset, allowing it to address business requirements. Fine-tuning is a customization method that does change the weights of your model.

Fine-tuning an FM is a supervised learning process.

37
Q

Shapley values

A

Shapley values provide a local explanation by quantifying the contribution of each feature to the prediction for a specific instance

Use Shapley values to explain individual predictions

Shapley values are a local interpretability method that explains individual predictions by assigning each feature a contribution score based on its marginal effect on the prediction. This method is useful for understanding the impact of each feature on a specific instance’s prediction.
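
A minimal sketch using the open-source shap library on a toy scikit-learn model; the dataset is synthetic and the explainer choice is left to shap's defaults.

```python
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

explainer = shap.Explainer(model.predict, X)  # model-agnostic explainer
shap_values = explainer(X[:1])                # explain one prediction

# One contribution per feature for this single instance (a local explanation).
print(shap_values.values[0])
```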

38
Q

Partial Dependence Plots (PDP)

A

PDP provides a global explanation by showing the marginal effect of a feature on the model's predictions across the dataset.

Use PDP to understand the model's behavior at a dataset level.

Partial Dependence Plots (PDP), on the other hand, provide a global view of the model’s behavior by illustrating how the predicted outcome changes as a single feature is varied across its range, holding all other features constant. PDPs help understand the overall relationship between a feature and the model output across the entire dataset.
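
A minimal scikit-learn sketch of a PDP on a toy model; the data is synthetic and plotting assumes matplotlib is installed.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_regression(n_samples=500, n_features=4, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Global view: average predicted outcome as features 0 and 1 are varied
# across their ranges, holding the other features constant.
PartialDependenceDisplay.from_estimator(model, X, features=[0, 1])
```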

39
Q

Amazon Q in QuickSight

A

With Amazon Q in QuickSight, customers get a generative BI assistant that allows business analysts to use natural language to build BI dashboards in minutes and easily create visualizations and complex calculations. These dashboard-authoring capabilities empower business analysts to swiftly build, uncover, and share valuable insights using natural language prompts. You can simplify data understanding for business users through a context-aware Q&A experience, executive summaries, and customizable data stories — all designed to use insights to inform and drive decisions.

40
Q

Amazon Q Developer

A

Amazon Q Developer assists developers and IT professionals with all their tasks—from coding, testing, and upgrading applications, to diagnosing errors, performing security scanning and fixes, and optimizing AWS resources.

41
Q

Amazon Q Business

A

Amazon Q Business is a fully managed, generative-AI-powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. It allows end users to receive immediate, permissions-aware responses from enterprise data sources with citations, for use cases such as IT, HR, and benefits help desks.

42
Q

Amazon Q in Connect

A

Amazon Connect is the contact center service from AWS. Amazon Q helps customer service agents provide better customer service. Amazon Q in Connect enriches real-time customer conversations with the relevant company content. It recommends what to say or what actions an agent should take to assist customers in a better way.

43
Q

Confusion matrix

A

Confusion matrix is a tool specifically designed to evaluate the performance of classification models by displaying the number of true positives, true negatives, false positives, and false negatives. This matrix provides a detailed breakdown of the model’s performance across all classes, making it the most suitable choice for evaluating a classification model’s accuracy and identifying potential areas for improvement. It provides a comprehensive overview of the model’s performance by detailing how many instances were correctly or incorrectly classified in each category. This enables the company to understand where the model is performing well and where it may need adjustments, such as improving the classification of specific material types.
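
A minimal scikit-learn sketch; the material labels are invented to mirror the classification scenario described above.

```python
from sklearn.metrics import classification_report, confusion_matrix

y_true = ["glass", "metal", "metal", "plastic", "glass", "plastic"]
y_pred = ["glass", "metal", "plastic", "plastic", "glass", "metal"]

labels = ["glass", "metal", "plastic"]
# Rows are actual classes, columns are predicted classes.
print(confusion_matrix(y_true, y_pred, labels=labels))
print(classification_report(y_true, y_pred, labels=labels))
```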

44
Q

Root Mean Squared Error (RMSE)

A

Root Mean Squared Error (RMSE) is a metric commonly used to measure the average error in regression models by calculating the square root of the average squared differences between predicted and actual values. However, RMSE is not suitable for classification tasks, as it is designed to measure continuous outcomes, not discrete class predictions.
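
A minimal numpy sketch of the computation, with MAE (covered in the next card) included for comparison; the values are invented.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # penalizes large errors more
mae = np.mean(np.abs(y_true - y_pred))           # average absolute error
print(rmse, mae)
```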

45
Q

Mean Absolute Error (MAE)

A

Mean Absolute Error (MAE) measures the average magnitude of errors in a set of predictions without considering their direction. MAE is typically used in regression tasks to quantify the accuracy of a continuous variable’s predictions, not for classification tasks where the outputs are categorical rather than continuous.

46
Q

Correlation matrix

A

Correlation matrix measures the statistical correlation between different variables or features in a dataset, typically used to understand the relationships between continuous variables. A correlation matrix is not designed to evaluate the performance of a classification model, as it does not provide any insight into the accuracy or errors of categorical predictions.

47
Q

Transformer models

A

Transformer models use a self-attention mechanism and implement contextual embeddings

Transformer models are a type of neural network architecture designed to handle sequential data, such as language, in an efficient and scalable way. They rely on a mechanism called self-attention to process input data, allowing them to understand and generate language effectively. Self-attention allows the model to weigh the importance of different words in a sentence when encoding a particular word. This helps the model capture relationships and dependencies between words, regardless of their position in the sequence.

Transformer models use self-attention to weigh the importance of different words in a sentence, allowing them to capture complex dependencies. Positional encodings provide information about word order, and the encoder-decoder architecture enables effective processing and generation of sequences. This makes transformers highly effective for tasks like language translation, text generation, and more.
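
A minimal numpy sketch of single-head self-attention over a toy sequence; the dimensions and weights are arbitrary, and real transformers add multiple heads, positional encodings, and learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))  # 4 tokens, embedding dimension 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(8)       # similarity of every token pair
weights = softmax(scores, axis=-1)  # how much each token attends to others
output = weights @ V                # context-aware token representations
print(weights.round(2), output.shape)
```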

48
Q

Generative Adversarial Networks (GANs)

A

Generative Adversarial Networks (GANs) work by training two neural networks in a competitive manner. The first network, known as the generator, generates fake data samples by adding random noise. The second network, called the discriminator, tries to distinguish between real data and the fake data produced by the generator.

49
Q

Variational autoencoders (VAEs)

A

Variational autoencoders (VAEs) learn a compact representation of data called latent space. You can think of it as a unique code representing the data based on all its attributes. VAEs use two neural networks—the encoder and the decoder. The encoder neural network maps the input data to a mean and variance for each dimension of the latent space. The decoder neural network takes this sampled point from the latent space and reconstructs it back into data that resembles the original input.

50
Q

Diffusion models

A

Diffusion models work by first corrupting data with noise through a forward diffusion process and then learning to reverse this process to denoise the data. They use neural networks to predict and remove the noise step by step, ultimately generating new, structured data from random noise.

51
Q

reinforcement learning (RL)

A

Reinforcement learning is the most suitable approach for self-improvement in this context. By leveraging RL, the chatbot can learn from customer interactions in real-time. Positive customer feedback serves as a reward signal that guides the chatbot to improve its responses over time. The chatbot adapts its behavior based on rewards or penalties, refining its conversational skills through continuous feedback loops. This dynamic learning process is effective for environments where responses need to be optimized based on direct user interaction and satisfaction.

52
Q

supervised learning

A

supervised learning can be effective for training chatbots with labeled data (such as examples of positive and negative customer interactions)

Supervised learning requires extensive datasets and retraining the model whenever new data is available, making it less adaptive in real-time environments.

53
Q

Incremental training

A

Incremental training allows a model to update itself with new data while retaining knowledge from old data. However, it may not be sufficient for optimizing chatbot performance in real-time, especially without incorporating direct feedback signals like those in reinforcement learning. Incremental learning is less dynamic than reinforcement learning and may struggle to keep up with fast-changing customer preferences or conversation styles.

54
Q

Transfer learning

A

Transfer learning is used when a model trained in one domain or task can benefit from applying its knowledge to a different but related domain. While transfer learning can improve chatbot performance by leveraging pre-trained models, it does not provide the framework for continuous, self-improvement based on ongoing customer interactions. Therefore, it is not the most effective approach for a chatbot seeking to improve through real-time conversations.

55
Q

Epochs

A

One epoch is one full cycle through the entire training dataset. The dataset is processed in batches, and running through every batch once completes an epoch.

Increasing the number of epochs allows the model to learn from the training data for a longer period, potentially capturing more complex patterns and relationships, which can improve accuracy. Multiple epochs are run until the accuracy of the model reaches an acceptable level, or until the error rate drops below an acceptable level.

56
Q

Learning rate

A

The learning rate is the amount by which the model's internal weights are adjusted at each training update. As the model is refined, its weights are nudged and error rates are checked to see if the model improves. A typical learning rate is 0.1 or 0.01, where 0.01 is a much smaller adjustment that could cause the training to take a long time to converge, whereas 0.1 is much larger and can cause the training to overshoot. It is one of the primary hyperparameters that you might adjust when training your model. Note that for text models, a much smaller learning rate (5e-5 for BERT) can result in a more accurate model.

57
Q

Batch size

A

The number of records from the dataset selected for each training iteration and sent to the GPUs.

58
Q

Regularization

A

Regularization helps prevent linear models from overfitting training data examples (that is, memorizing patterns instead of generalizing them) by penalizing extreme weight values.

Increasing regularization is beneficial when the model is overfitting, as it adds constraints that penalize complexity, encouraging the model to generalize better. However, if the model is already underfitting (not capturing the patterns in the data well), increasing regularization could further decrease its performance, and it might not improve accuracy.

59
Q

L1 regularization

A

L1 regularization has the effect of reducing the number of features used in the model by pushing to zero the weights of features that would otherwise have small weights. As a result, L1 regularization results in sparse models and reduces the amount of noise in the model.

60
Q

L2 regularization

A

L2 regularization results in smaller overall weight values, and stabilizes the weights when there is high correlation between the input features.
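
A minimal scikit-learn sketch contrasting L1 (Lasso) and L2 (Ridge) on synthetic data where only a few features matter; it shows L1 zeroing out weights while L2 only shrinks them.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Only 3 of the 10 features carry signal.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

l1 = Lasso(alpha=1.0).fit(X, y)  # L1: pushes unhelpful weights to zero
l2 = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks all weights, rarely to zero

print("L1 zero weights:", np.sum(l1.coef_ == 0))
print("L2 zero weights:", np.sum(l2.coef_ == 0))
```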

61
Q

MLflow with Amazon SageMaker

A

Manage machine learning experiments

Machine learning is an iterative process that requires experimenting with various combinations of data, algorithms, and parameters while observing their impact on model accuracy. The iterative nature of ML experimentation results in numerous model training runs and versions, making it challenging to track the best-performing models and their configurations.

Use MLflow with Amazon SageMaker to track, organize, view, analyze, and compare iterative ML experimentation to gain comparative insights and register and deploy your best-performing models.
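
A minimal MLflow tracking sketch; the tracking URI, experiment name, and logged values are placeholders (with SageMaker, the URI would point at your SageMaker-managed MLflow tracking server).

```python
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")  # assumed tracking server
mlflow.set_experiment("price-model-experiments")  # assumed experiment name

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)  # configuration for this run
    mlflow.log_param("epochs", 20)
    mlflow.log_metric("rmse", 4.2)           # resulting model performance
```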

62
Q

Large Language Model (LLM)

A

Large language models (LLMs) are a class of Foundation Models (FMs). For example, OpenAI’s generative pre-trained transformer (GPT) models are LLMs. LLMs are specifically focused on language-based tasks such as summarization, text generation, classification, open-ended conversation, and information extraction.

63
Q

Retrieval-Augmented Generation (RAG)

A

Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. Large Language Models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. RAG extends the already powerful capabilities of LLMs to specific domains or an organization’s internal knowledge base, all without the need to retrain the model. It is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts.

Depending on the configuration, Amazon Q Business web application workflow can use LLM/RAG or both.

64
Q

Diffusion Model

A

Diffusion models create new data by iteratively making controlled random changes to an initial data sample. They start with the original data and add subtle changes (noise), progressively making it less similar to the original. This noise is carefully controlled to ensure the generated data remains coherent and realistic. After adding noise over several iterations, the diffusion model reverses the process. Reverse denoising gradually removes the noise to produce a new data sample that resembles the original.

65
Q

Generative adversarial network (GAN)

A

GANs work by training two neural networks in a competitive manner. The first network, known as the generator, generates fake data samples by adding random noise. The second network, called the discriminator, tries to distinguish between real data and the fake data produced by the generator. During training, the generator continually improves its ability to create realistic data while the discriminator becomes better at telling real from fake. This adversarial process continues until the generator produces data that is so convincing that the discriminator can’t differentiate it from real data.

66
Q

Variational autoencoders (VAE)

A

VAEs use two neural networks—the encoder and the decoder. The encoder neural network maps the input data to a mean and variance for each dimension of the latent space. It generates a random sample from a Gaussian (normal) distribution. This sample is a point in the latent space and represents a compressed, simplified version of the input data. The decoder neural network takes this sampled point from the latent space and reconstructs it back into data that resembles the original input.

67
Q

Amazon Bedrock Guardrails

A

The company should instruct the model to stick to the prompt by adding explicit instructions to ignore any unrelated or potentially malicious content

This is the correct approach because providing explicit instructions within the prompt helps guide the model’s behavior, reducing the likelihood of generating inappropriate or unsafe content. By clarifying what the model should focus on and what it should ignore, the company can enforce boundaries that align with its safety standards. This method is straightforward and leverages prompt engineering to mitigate risks effectively.

68
Q

Domain Adaptation Fine-Tuning

A

Domain Adaptation Fine-Tuning is an effective approach because it takes a pre-trained Foundation Model and further adjusts its parameters using domain-specific data. This process helps the model learn the nuances, terminology, and context specific to the domain, enhancing its ability to generate accurate and relevant outputs in that field. Fine-tuning allows the model to specialize while retaining the general knowledge acquired during initial training.

69
Q

Continued Pre-Training

A

Continued Pre-Training is another appropriate strategy for making a Foundation Model an expert in a specific domain. By pre-training the model on a large dataset specifically from the target domain, the model can learn the distinct characteristics, language patterns, and specialized knowledge relevant to that domain. This approach effectively builds upon the model’s existing knowledge, enhancing its domain expertise without starting training from scratch.

70
Q

Amazon OpenSearch Service

A

Amazon OpenSearch Service, which is designed to provide fast search capabilities and supports full-text search, indexing, and similarity scoring

Amazon OpenSearch Service is the most suitable choice because it is specifically built to handle search and analytics workloads, including fast index lookups and similarity scoring. OpenSearch supports full-text search, vector search, and advanced data indexing, which are essential for the Retrieval-Augmented Generation (RAG) framework. It enables the chatbot or model to quickly find and rank relevant documents based on their similarity to the query, making it highly effective for applications that require rapid data retrieval and relevance ranking.
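
A sketch of a vector similarity query using the opensearch-py client, assuming an index whose "embedding" field is a knn_vector (OpenSearch k-NN plugin); the host, index name, and vector values are placeholders.

```python
from opensearchpy import OpenSearch

client = OpenSearch(hosts=["https://my-domain.example.com:443"])  # assumed host

query_embedding = [0.12, -0.40, 0.33]  # would come from an embedding model

results = client.search(
    index="product-docs",  # assumed index name
    body={
        "size": 3,
        "query": {"knn": {"embedding": {"vector": query_embedding, "k": 3}}},
    },
)
for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["text"])
```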

71
Q

Amazon Aurora

A

Amazon Aurora is a high-performance relational database service that is excellent for OLTP (Online Transaction Processing) workloads. While it provides advanced indexing features for relational data, it is not optimized for full-text search, fast similarity lookups, or the types of search capabilities required for RAG applications. Aurora’s primary strengths lie in transactional integrity and scalability for relational datasets, not in search and retrieval tasks.

72
Q

Amazon DocumentDB (with MongoDB compatibility)

A

Amazon DocumentDB is primarily designed for storing and querying semi-structured JSON data. While it provides scalability and managed support for document-based workloads, it is not optimized for full-text search or similarity searches. DocumentDB lacks the native capabilities for efficient indexing and retrieval needed for RAG, making it a less suitable choice.

73
Q

Amazon DynamoDB

A

Amazon DynamoDB is a key-value and document database designed for fast and predictable performance with low latency, suitable for high-throughput transactional workloads. However, it does not natively support advanced search capabilities or similarity scoring needed for RAG applications. Its primary focus is on rapid data retrieval based on primary keys, not on the complex search and retrieval functions required for this scenario.

74
Q

Amazon Bedrock

A

Amazon Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available for your use through a unified API. Using Amazon Bedrock, you can easily experiment with and evaluate top foundation models for your use cases, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources.

With Amazon Bedrock, you can privately customize FMs, retaining control over how your data is used and encrypted. Amazon Bedrock makes a separate copy of the base FM and trains this private copy of the model. Your data includes prompts, information used to supplement a prompt, and FM responses. Customized FMs remain in the Region where the API call is processed.

With Amazon Bedrock, your data, including prompts and customized foundation models, stays within the AWS Region where the API call is processed and encrypted in transit as well as at rest. You can use AWS PrivateLink to ensure private connectivity between your models and on-premises networks without exposing traffic to the internet.
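
A minimal boto3 sketch of calling a Bedrock foundation model through the Converse API; the model ID is an example, and any text model you have access to would work.

```python
import boto3

runtime = boto3.client("bedrock-runtime")

response = runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    messages=[{"role": "user",
               "content": [{"text": "Summarize RAG in one sentence."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```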

75
Q

AWS Trainium

A

AWS Trainium instances are designed with energy efficiency in mind, providing optimal performance per watt for machine learning workloads. Trainium, AWS’s custom-designed machine learning chip, is specifically engineered to offer the best performance at the lowest power consumption, reducing the carbon footprint of training large-scale models. This makes Trainium instances the most environmentally friendly choice among the options listed. Trn1 instances powered by Trainium are up to 25% more energy efficient for DL training than comparable accelerated computing EC2 instances.

76
Q

Accelerated Computing P type instances (EC2)

A

Accelerated Computing P type instances, powered by high-end GPUs like NVIDIA Tesla, are optimized for maximum computational throughput, particularly for machine learning and HPC tasks. However, they consume significant amounts of power and are not specifically designed with energy efficiency in mind, making them less suitable for an environmentally conscious choice.

77
Q

Accelerated Computing G type instances (EC2)

A

Accelerated Computing G type instances, such as those powered by NVIDIA GPUs, are designed for graphics-heavy applications like gaming, rendering, or video processing. While they offer high computational power for specific tasks, they are not specifically optimized for energy efficiency or low environmental impact, making them less suitable for a company focused on minimizing its carbon footprint.

78
Q

Compute Optimized C type instances (EC2)

A

Compute Optimized C type instances are designed to maximize compute performance for applications such as web servers, gaming, and scientific modeling. While they provide excellent compute power, they are not optimized for energy efficiency in the same way as AWS Trainium instances, making them less ideal for reducing environmental impact.

79
Q

On-demand pricing

A

The company should opt for on-demand pricing, which allows it to pay only for the actual usage of resources without any long-term commitments

On-demand pricing is the most appropriate option for a company that is uncertain about the time commitment or extent of its usage. This pricing model allows the company to pay for Amazon Bedrock services based on actual usage without requiring any upfront payment or long-term contract. It provides flexibility and scalability, making it suitable for organizations that need to adapt their usage according to evolving needs or have unpredictable workloads.

80
Q

Provisioned throughput

A

Provisioned throughput is less suitable in this scenario because it is designed for situations where the usage is consistent and predictable. This model involves committing to a certain level of capacity, which may lead to unnecessary costs if the actual usage is lower than anticipated. Since the company lacks clarity on its time commitment and usage patterns, provisioned throughput does not offer the flexibility needed.

81
Q

Spot Instances

A

Spot Instances are a pricing model offered by AWS for EC2 compute instances, which allows you to bid for spare EC2 capacity at reduced rates. Spot instances can be interrupted by AWS with little notice. This is not applicable as a pricing model for Amazon Bedrock. This option just acts as a distractor.

82
Q

Reserved Instances

A

Reserved Instances offer a lower rate for EC2 compute resources in exchange for a one- or three-year commitment. This is not applicable as a pricing model for Amazon Bedrock. This option just acts as a distractor.

83
Q

Amazon SageMaker Ground Truth

A

To train a machine learning model, you need a large, high-quality, labeled dataset. Ground Truth helps you build high-quality training datasets for your machine learning models. With Ground Truth, you can use workers from Amazon Mechanical Turk, a vendor company that you choose, or an internal, private workforce, along with machine learning, to create a labeled dataset. You can use the labeled dataset output from Ground Truth to train your own models or as a training dataset for an Amazon SageMaker model.

Depending on your ML application, you can choose from one of the Ground Truth built-in task types to have workers generate specific types of labels for your data. You can also build a custom labeling workflow to provide your own UI and tools to workers labeling your data. You can choose your workforce from:

  1. The Amazon Mechanical Turk workforce of over 500,000 independent contractors worldwide.
  2. A private workforce that you create from your employees or contractors for handling data within your organization.
  3. A vendor company that you can find in the AWS Marketplace that specializes in data labeling services.
84
Q

Amazon SageMaker Feature Store

A

Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference. For example, in an application that recommends a music playlist, features could include song ratings, listening duration, and listener demographics.

85
Q

Amazon SageMaker JumpStart

A

Amazon SageMaker JumpStart is a machine learning (ML) hub that can help you accelerate your ML journey. With SageMaker JumpStart, you can evaluate, compare, and select Foundation Models (FMs) quickly based on pre-defined quality and responsibility metrics to perform tasks like article summarization and image generation. Pretrained models are fully customizable for your use case with your data, and you can easily deploy them into production with the user interface or SDK.

86
Q

Amazon SageMaker Canvas

A

SageMaker Canvas offers a no-code interface that can be used to create highly accurate machine learning models without any machine learning experience or writing a single line of code. SageMaker Canvas provides access to ready-to-use models, including foundation models from Amazon Bedrock or Amazon SageMaker JumpStart, or you can build your own custom ML model using AutoML powered by SageMaker Autopilot.

87
Q

Amazon Rekognition

A

Amazon Rekognition is a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications. The service is powered by proven deep learning technology and it requires no machine learning expertise to use. Amazon Rekognition includes a simple, easy-to-use API that can quickly analyze any image or video file that’s stored in Amazon S3.

You can add features that detect objects, text, and unsafe content, analyze images/videos, and compare faces to your application using Rekognition’s APIs. With Amazon Rekognition’s face recognition APIs, you can detect, analyze, and compare faces for a wide variety of use cases, including user verification, cataloging, people counting, and public safety.

Amazon Rekognition offers pre-trained and customizable computer vision (CV) capabilities to extract information and insights from your images and videos.

88
Q

Amazon SageMaker

A

Amazon SageMaker is a fully managed machine learning (ML) service. With SageMaker, data scientists and developers can quickly and confidently build, train, and deploy ML models into a production-ready hosted environment. It provides a UI experience for running ML workflows that makes SageMaker ML tools available across multiple integrated development environments (IDEs).

89
Q

AWS DeepRacer

A

AWS DeepRacer is an autonomous 1/18th scale race car designed to test RL models by racing on a physical track. Using cameras to view the track and a reinforcement model to control throttle and steering, the car shows how a model trained in a simulated environment can be transferred to the real world.

90
Q

Amazon Textract

A

Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, layout elements, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract specific data from documents.

91
Q

Provisioned Throughput mode

A

The company should use Provisioned Throughput mode, which allows the company to reserve a specific amount of capacity in advance

With fine-tuning, you can increase model accuracy by providing your own task-specific labeled training dataset and further specialize your FMs. With continued pre-training, you can train models using your own unlabeled data in a secure and managed environment with customer managed keys. Continued pre-training helps models become more domain-specific by accumulating more robust knowledge and adaptability—beyond their original training.

Once the fine-tuning job is complete, you receive a unique model ID for your custom model. Your fine-tuned model is stored securely by Amazon Bedrock. To test and deploy your model, you need to purchase Provisioned Throughput. This mode is designed for situations where there is a predictable, continuous workload, such as the intensive compute required during the fine-tuning phase.

Exam Alert:

For testing and deploying customized models in Amazon Bedrock (created via fine-tuning or continued pre-training), it is mandatory to use Provisioned Throughput.

92
Q

batch inference

A

With batch inference, you can run multiple inference requests asynchronously to process a large number of requests efficiently by running inference on data that is stored in an S3 bucket. You can use batch inference to improve the performance of model inference on large datasets.

You cannot use batch inference to facilitate fine-tuning of the model. This option acts as a distractor.
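
A minimal boto3 sketch of submitting a Bedrock batch inference job over S3 data; the job name, model ID, role ARN, and S3 URIs are placeholders.

```python
import boto3

bedrock = boto3.client("bedrock")

bedrock.create_model_invocation_job(
    jobName="nightly-batch-scoring",                            # assumed
    modelId="anthropic.claude-3-haiku-20240307-v1:0",           # example
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",  # assumed
    inputDataConfig={"s3InputDataConfig": {"s3Uri": "s3://my-bucket/batch-in/"}},
    outputDataConfig={"s3OutputDataConfig": {"s3Uri": "s3://my-bucket/batch-out/"}},
)
```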

93
Q

Knowledge Bases for Amazon Bedrock

A

With Knowledge Bases for Amazon Bedrock, you can give FMs and agents contextual information from your company’s private data sources for RAG to deliver more relevant, accurate, and customized responses

Knowledge Bases for Amazon Bedrock takes care of the entire ingestion workflow of converting your documents into embeddings (vector) and storing the embeddings in a specialized vector database. Knowledge Bases for Amazon Bedrock supports popular databases for vector storage, including vector engine for Amazon OpenSearch Serverless, Pinecone, Redis Enterprise Cloud, Amazon Aurora (coming soon), and MongoDB (coming soon). If you do not have an existing vector database, Amazon Bedrock creates an OpenSearch Serverless vector store for you.
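
A minimal boto3 sketch of querying a knowledge base with a single RetrieveAndGenerate call; the knowledge base ID, model ARN, and question are placeholders.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

response = agent_runtime.retrieve_and_generate(
    input={"text": "What warranty comes with the X200 blender?"},  # assumed
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB123EXAMPLE",  # assumed knowledge base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
print(response["output"]["text"])  # answer grounded in the retrieved documents
```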

94
Q

Watermark detection for Amazon Bedrock

A

The watermark detection mechanism allows you to identify images generated by Amazon Titan Image Generator, a foundation model that allows users to create realistic, studio-quality images in large volumes and at low cost, using natural language prompts. With watermark detection, you can increase transparency around AI-generated content by mitigating harmful content generation and reducing the spread of misinformation. You cannot use a watermark detection mechanism to implement RAG workflow in Amazon Bedrock.

95
Q

Guardrails for Amazon Bedrock

A

Guardrails for Amazon Bedrock help you implement safeguards for your generative AI applications based on your use cases and responsible AI policies. It helps control the interaction between users and FMs by filtering undesirable and harmful content, redacts personally identifiable information (PII), and enhances content safety and privacy in generative AI applications. You cannot use guardrails to implement RAG workflow in Amazon Bedrock.
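
A minimal boto3 sketch of attaching a pre-created guardrail to a model call via the Converse API; the guardrail ID and version are placeholders.

```python
import boto3

runtime = boto3.client("bedrock-runtime")

response = runtime.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    messages=[{"role": "user",
               "content": [{"text": "Tell me about your products."}]}],
    # Filters are applied to both the prompt and the model's response.
    guardrailConfig={"guardrailIdentifier": "gr-abc123example",  # assumed
                     "guardrailVersion": "1"},
)
print(response["output"]["message"]["content"][0]["text"])
```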