AI Practice Test #1.5 Flashcards
Asynchronous inference
Asynchronous inference is the most suitable choice for this scenario. It allows the company to process smaller payloads without requiring real-time responses by queuing the requests and handling them in the background. This method is cost-effective and efficient when some delay is acceptable, as it frees up resources and optimizes compute usage. Asynchronous inference is ideal for scenarios where the payload size is less than 1 GB and immediate results are not critical.
Batch inference
Batch inference is generally used for processing large datasets all at once. While it does not require immediate responses, it is typically more efficient for handling larger payloads (several gigabytes or more). For smaller payloads of less than 1 GB, batch inference might be overkill and less cost-efficient compared to asynchronous inference.
Real-time inference
Real-time inference is optimized for scenarios where low latency is essential, and responses are needed immediately. It is not suitable for cases where the system can afford to wait for responses, as it might lead to higher costs and resource consumption without providing any additional benefit for this particular use case.
Serverless inference
Serverless inference is a good choice for workloads with unpredictable traffic or sporadic requests, as it scales automatically based on demand. However, it may not be as cost-effective for scenarios where workloads are predictable, and some waiting time is acceptable. Asynchronous inference provides a more targeted solution for handling delayed responses at a lower cost.
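The trade-offs among the four options can be sketched as a small decision helper. This is only an illustrative summary of the guidance above; the thresholds mirror the flashcard text (asynchronous inference for payloads up to ~1 GB) and are not official service limits.

```python
def pick_inference_option(payload_mb, needs_immediate_response, traffic_is_sporadic):
    """Illustrative decision helper for SageMaker inference options."""
    if needs_immediate_response:
        # Sporadic traffic with low-latency needs fits serverless endpoints.
        return "serverless" if traffic_is_sporadic else "real-time"
    # Delay is acceptable: queue requests if the payload fits async limits.
    if payload_mb <= 1024:
        return "asynchronous"
    # Multi-gigabyte offline datasets are a better fit for batch transform.
    return "batch"
```

For the scenario in this card (payload under 1 GB, delay acceptable), the helper returns "asynchronous".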
What are the key constituents of a good prompting technique in this context?
(1) Instructions – a task for the model to do (description, how the model should perform)
(2) Context – external information to guide the model
(3) Input data – the input for which you want a response
(4) Output Indicator – the output type or format
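The four constituents can be assembled into a single prompt string. The section labels below are one illustrative layout, not a required format:

```python
def build_prompt(instructions, context, input_data, output_indicator):
    """Assemble the four prompt constituents into one prompt string."""
    return (
        f"Instructions: {instructions}\n"
        f"Context: {context}\n"
        f"Input: {input_data}\n"
        f"Output format: {output_indicator}"
    )

prompt = build_prompt(
    "Classify the sentiment of the review.",
    "Reviews come from an e-commerce site.",
    "The headphones broke after two days.",
    "One word: positive, negative, or neutral.",
)
```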
Hyperparameters
Hyperparameters are values that can be adjusted for model customization to control the training process and, consequently, the output custom model. In other words, hyperparameters are external configurations set before the training process begins. They control the training process and the structure of the model but are not adjusted by the training algorithm itself. Examples include the learning rate, the number of layers in a neural network, etc.
Model parameters
Model parameters are values that define a model and its behavior in interpreting input and generating responses. Model parameters are controlled and updated by providers. You can also update model parameters to create a new model through the process of model customization. In other words, model parameters are the internal variables of the model that are learned and adjusted during the training process. These parameters directly influence the output of the model for a given input. Examples include the weights and biases in a neural network.
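The split between the two can be seen in a toy one-weight model fit by gradient descent (a hypothetical example, not tied to any AWS service): the learning rate and epoch count are hyperparameters fixed before training, while the weight is a model parameter that training itself adjusts.

```python
learning_rate = 0.1   # hyperparameter: chosen before training starts
epochs = 50           # hyperparameter: chosen before training starts

w = 0.0               # model parameter: learned during training
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x

for _ in range(epochs):
    for x, y in data:
        grad = 2 * (w * x - y) * x   # d/dw of the squared error
        w -= learning_rate * grad    # training updates the parameter
```

After training, the learned parameter w converges to roughly 2.0; changing the hyperparameters changes how (and whether) that happens, but they are never updated by the training loop itself.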
Few-shot prompting
The data should include user input along with the correct user intent, providing examples of user queries and the corresponding intent.
This is the correct answer because few-shot prompting involves providing the model with examples that pair the user input with the correct user intent. These examples help the model learn how to map various user queries to their appropriate intents. By repeatedly seeing this pairing, the model can generalize from the examples and improve its ability to recognize user intent in new, unseen queries.
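Such example pairs might be formatted into a prompt as follows; the queries and intent labels are invented for illustration:

```python
# Each example pairs a user query with its labeled intent so the model
# can infer the mapping for a new, unseen query.
examples = [
    ("Where is my package?", "track_order"),
    ("I want my money back", "request_refund"),
    ("Do you ship to Canada?", "shipping_info"),
]

def few_shot_prompt(examples, new_query):
    lines = [f"Query: {q}\nIntent: {intent}" for q, intent in examples]
    lines.append(f"Query: {new_query}\nIntent:")
    return "\n\n".join(lines)

prompt = few_shot_prompt(examples, "Can I return these shoes?")
```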
Retrieval-Augmented Generation (RAG)
Utilize a Retrieval-Augmented Generation (RAG) system by indexing all product catalog PDFs and configuring the LLM chatbot to reference this system for answering queries
Using a RAG approach is the least costly and most efficient solution for providing up-to-date and relevant responses. In this approach, you convert all product catalog PDFs into a searchable knowledge base. When a customer query comes in, the RAG framework first retrieves the most relevant pieces of information from this knowledge base and then uses an LLM to generate a coherent response based on the retrieved context. This method does not require re-training the model or modifying every incoming query with large datasets, making it significantly more cost-effective. It ensures that the chatbot always has access to the most recent information without needing expensive updates or processing every time.
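The retrieve-then-generate flow can be sketched in miniature. A production RAG system would use embeddings and a vector store over the indexed PDFs; here, retrieval is simple word-overlap scoring over invented catalog snippets, and the final string is what would be sent to the LLM.

```python
catalog = [
    "Model X100 blender: 1200 W motor, 2 L jar, dishwasher safe.",
    "Model T20 toaster: 4 slots, defrost mode, 2-year warranty.",
    "Model K5 kettle: 1.7 L capacity, auto shut-off.",
]

def retrieve(query, documents, k=1):
    """Rank documents by shared words with the query; keep the top k."""
    q_words = set(query.lower().split())
    scored = [(len(q_words & set(d.lower().split())), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]

def build_rag_prompt(query, documents):
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_rag_prompt("What is the warranty on the T20 toaster?", catalog)
```

Only the retrieved snippet reaches the model, which is why the approach stays cheap: no re-training, and no need to stuff the entire catalog into every request.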
Stable Diffusion
Stable Diffusion is a generative artificial intelligence (generative AI) model that produces unique photorealistic images from text and image prompts.
Llama
Llama is a series of large language models trained on publicly available data. They are built on the transformer architecture, enabling them to handle variable-length input sequences and produce variable-length output sequences. A notable feature of Llama models is their capacity to generate coherent and contextually appropriate text.
Jurassic
The Jurassic family of models from AI21 Labs supports use cases such as question answering, summarization, draft generation, advanced information extraction, and ideation for tasks requiring intricate reasoning and logic.
Claude
Claude is Anthropic’s frontier, state-of-the-art large language model that offers important features for enterprises like advanced reasoning, vision analysis, code generation, and multilingual processing.
Amazon Comprehend
Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to uncover insights and relationships in text. It is specifically designed for tasks such as sentiment analysis, entity recognition, key phrase extraction, and language detection. For the scenario of analyzing customer reviews, Amazon Comprehend can directly determine the overall sentiment of a text (positive, negative, neutral, or mixed), making it the ideal service for this purpose. By using Amazon Comprehend, e-commerce platforms can effectively analyze customer feedback, understand customer satisfaction levels, and identify common themes or concerns.
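A sentiment call against Comprehend's DetectSentiment API might look like the sketch below. The helper accepts any client exposing detect_sentiment(); with boto3 you would pass boto3.client("comprehend"), but a stub stands in here so the example runs without AWS credentials, and its canned response only imitates the real response shape.

```python
def review_sentiment(client, text):
    resp = client.detect_sentiment(Text=text, LanguageCode="en")
    return resp["Sentiment"]  # POSITIVE, NEGATIVE, NEUTRAL, or MIXED

class StubComprehend:
    """Stands in for boto3.client('comprehend') in this offline example."""
    def detect_sentiment(self, Text, LanguageCode):
        return {
            "Sentiment": "POSITIVE",
            "SentimentScore": {"Positive": 0.98, "Negative": 0.01,
                               "Neutral": 0.01, "Mixed": 0.0},
        }

sentiment = review_sentiment(StubComprehend(), "Great product, fast shipping!")
```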
Amazon Bedrock
Amazon Bedrock is an AI service that provides access to foundation models (large language models, including those for NLP tasks) via an API. While Amazon Bedrock is not specifically an NLP service like Amazon Comprehend, it can be used to fine-tune pre-trained foundation models for various tasks, including sentiment analysis. With the proper configuration and fine-tuning, Bedrock can analyze text data to determine sentiment, making it a versatile option for advanced users who may need more customizable solutions than Amazon Comprehend.
Amazon Rekognition
Amazon Rekognition is a service designed for analyzing images and videos, not text. It can identify objects, people, text within images, and even detect inappropriate content in images and videos. However, it does not provide any capabilities for natural language processing or sentiment analysis, making it unsuitable for analyzing written customer reviews.
Amazon Textract
Amazon Textract is an OCR (Optical Character Recognition) service that extracts printed or handwritten text from scanned documents, PDFs, and images. It is useful for digitizing text but does not offer any features for analyzing or interpreting the sentiment of the extracted text. Since Textract focuses on text extraction rather than understanding or analyzing the content, it is not suitable for sentiment analysis tasks.
Amazon Personalize
Amazon Personalize is a service that provides personalized recommendations, search, and ranking for websites and applications based on user behavior and preferences. While it can help improve customer experience by suggesting products or content based on historical data, it does not offer natural language processing or sentiment analysis capabilities. Thus, it is not the correct choice for analyzing written customer reviews to determine sentiment.
Model invocation logging
The company should enable model invocation logging, which allows for detailed logging of all requests and responses during model invocations in Amazon Bedrock
You can use model invocation logging to collect invocation logs, model input data, and model output data for all invocations in your AWS account used in Amazon Bedrock. With invocation logging, you can collect the full request data, response data, and metadata associated with all calls performed in your account. Logging can be configured with the destination resources where the log data will be published. Supported destinations include Amazon CloudWatch Logs and Amazon Simple Storage Service (Amazon S3). Only destinations from the same account and region are supported. Model invocation logging is disabled by default.
This is the correct option because enabling invocation logging on Amazon Bedrock allows the company to capture detailed logs of all model requests and responses, including input data, output predictions, and any errors that occur during model execution. This method provides comprehensive monitoring capabilities, enabling the company to effectively track, audit, and troubleshoot model performance and usage.
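The logging configuration Amazon Bedrock expects might be built as below. The bucket name, log group, and role ARN are placeholders; the actual call (commented out) requires AWS credentials and destination resources that already exist in the same account and region.

```python
logging_config = {
    "cloudWatchConfig": {
        "logGroupName": "/bedrock/invocations",                   # placeholder
        "roleArn": "arn:aws:iam::123456789012:role/BedrockLogs",  # placeholder
    },
    "s3Config": {
        "bucketName": "my-bedrock-logs",                          # placeholder
        "keyPrefix": "invocations/",
    },
    "textDataDeliveryEnabled": True,  # capture prompts and completions
}

# import boto3
# bedrock = boto3.client("bedrock")
# bedrock.put_model_invocation_logging_configuration(loggingConfig=logging_config)
```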
AWS CloudTrail
While AWS CloudTrail is useful for tracking API calls and monitoring who accessed which AWS resources, it does not capture the actual input and output data involved in model invocations. CloudTrail logs are primarily intended for auditing access and managing security rather than monitoring detailed data flow or model performance on Amazon Bedrock.
Amazon EventBridge
Amazon EventBridge is designed to react to changes and events across AWS resources and trigger workflows or automate responses. Although it can track when a model invocation occurs, it does not provide detailed logging of the input and output data associated with these invocations, limiting its usefulness for comprehensive monitoring purposes.
AWS Config
AWS Config is specifically designed for monitoring and managing AWS resource configurations and compliance, not for tracking or logging the input and output data of machine learning models on Amazon Bedrock. AWS Config focuses on configuration management and does not provide the level of detail required to monitor data traffic or model performance in machine learning applications.
Generative Adversarial Network (GAN)
The company should use a Generative Adversarial Network (GAN) for creating realistic synthetic data while preserving the statistical properties of the original data
This is the correct answer because GANs are specifically designed for generating synthetic data that is statistically similar to real data. They consist of two neural networks—a generator and a discriminator—that work against each other to create highly realistic synthetic data. GANs have been successfully used in various domains, including image generation, text synthesis, and more, to produce data that retains the underlying patterns and structures of the original dataset, making them highly suitable for this purpose.
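The adversarial loop can be sketched with a toy 1-D GAN (all names and constants below are illustrative). The generator g(z) = a*z + b maps noise to samples, the logistic discriminator d(x) = sigmoid(w*x + c) scores how "real" a sample looks, and "real" data sits near 4.0. Gradients are hand-derived; a real GAN would use a deep-learning framework, richer networks, and far more training.

```python
import math
import random

random.seed(0)
a, b = 1.0, 0.0   # generator parameters
w, c = 0.0, 0.0   # discriminator parameters
lr = 0.05

def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))

for _ in range(500):
    real = 4.0 + random.gauss(0, 0.1)   # sample from the "true" distribution
    z = random.gauss(0, 1)
    fake = a * z + b                    # generator output

    # Discriminator step: push d(real) toward 1 and d(fake) toward 0.
    d_real = sigmoid(w * real + c)
    d_fake = sigmoid(w * fake + c)
    w -= lr * ((d_real - 1) * real + d_fake * fake)
    c -= lr * ((d_real - 1) + d_fake)

    # Generator step (non-saturating loss): make the discriminator
    # score the fake sample as real.
    d_fake = sigmoid(w * fake + c)
    a -= lr * (-(1 - d_fake) * w * z)
    b -= lr * (-(1 - d_fake) * w)
```

The two updates pull in opposite directions: the discriminator sharpens its boundary between real and fake, and the generator shifts its output distribution toward whatever the discriminator currently accepts as real.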
Support Vector Machines (SVMs)
SVMs are used for classification and regression, where the algorithm finds the optimal hyperplane that best separates different classes in the data. SVMs do not generate new data or create synthetic datasets, so they are not suitable for a task that requires generating synthetic data based on existing datasets.
Convolutional Neural Network (CNN)
CNNs are designed for tasks such as image and video recognition, object detection, and similar applications involving grid-like data (such as pixels in an image). While CNNs are excellent at feature extraction and classification in images, they are not suitable for generating synthetic data, especially for non-visual data types.
WaveNet
WaveNet is tailored for audio data generation, specifically for tasks such as speech synthesis and audio signal processing. While it is powerful within its specific domain, it is not designed for generating synthetic data outside of audio, making it an unsuitable choice for general-purpose data generation.
Exploratory Data Analysis (EDA)
The company is in the Exploratory Data Analysis (EDA) phase, which involves examining the data through statistical summaries and visualizations to identify patterns, detect anomalies, and form hypotheses. This phase is crucial for understanding the dataset’s structure and characteristics, making it the most appropriate description of the current activities. Tasks like calculating statistics and visualizing data are fundamental to EDA, helping to uncover patterns, detect outliers, and gain insights into the data before any modeling is done. EDA serves as the foundation for building predictive models by providing a deep understanding of the data.
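A minimal EDA pass of the kind described, using only the standard library on an invented sample of order values: summary statistics plus a simple z-score outlier check.

```python
import statistics

orders = [12.5, 13.1, 12.8, 13.4, 12.9, 45.0, 13.0, 12.7]

mean = statistics.mean(orders)
median = statistics.median(orders)
stdev = statistics.stdev(orders)

# Flag values more than 2 standard deviations from the mean as outliers.
outliers = [x for x in orders if abs(x - mean) > 2 * stdev]
```

Here the 45.0 order is flagged as an outlier, which is exactly the kind of anomaly EDA is meant to surface before any modeling begins.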
Data Preparation
Data preparation involves cleaning and preprocessing the data to make it suitable for analysis or modeling. This may include handling missing values, removing duplicates, or transforming variables, but it does not typically involve calculating statistics and visualizing data. While data preparation is an important step, it does not encompass the exploratory analysis activities described in the question.
Data Augmentation
Data augmentation is a technique used primarily in machine learning to artificially increase the size and variability of the training dataset by creating modified versions of the existing data, such as flipping images or adding noise. It is not related to the tasks of calculating statistics or visualizing data, which are part of EDA.
Model Evaluation
Model evaluation refers to assessing the performance of a machine learning model using specific metrics such as accuracy, precision, recall, or F1 score. Model evaluation does not involve exploratory tasks like calculating statistics or visualizing data; instead, it focuses on validating the effectiveness of a trained model. Therefore, this phase does not align with the company’s current activities.
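The metrics named above can be computed from scratch for a binary classifier's predictions (the labels below are invented; 1 marks the positive class):

```python
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision = tp / (tp + fp)              # of predicted positives, how many are right
recall = tp / (tp + fn)                 # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
```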
Amazon Bedrock
Amazon Bedrock is the easiest way to build and scale generative AI applications with foundation models. Amazon Bedrock is a fully managed service that makes foundation models from Amazon and leading AI startups available through an API, so you can choose from various FMs to find the model that’s best suited for your use case. With Bedrock, you can speed up developing and deploying scalable, reliable, and secure generative AI applications without managing infrastructure.
Amazon SageMaker JumpStart
Amazon SageMaker JumpStart is a machine learning hub with foundation models, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks. With SageMaker JumpStart, you can access pre-trained models, including foundation models, to perform tasks like article summarization and image generation. Pretrained models are fully customizable for your use case with your data, and you can easily deploy them into production with the user interface or SDK.
Amazon Q
Amazon Q is a generative AI–powered assistant for accelerating software development and leveraging companies’ internal data. Amazon Q generates code, tests, and debugs. It has multistep planning and reasoning capabilities that can transform and implement new code generated from developer requests.
AWS Trainium
AWS Trainium is the machine learning (ML) chip that AWS purpose-built for deep learning (DL) training of 100B+ parameter models. Each Amazon Elastic Compute Cloud (Amazon EC2) Trn1 instance deploys up to 16 Trainium accelerators to deliver a high-performance, low-cost solution for DL training in the cloud.
AWS Inferentia
AWS Inferentia is an ML chip purpose-built by AWS to deliver high-performance inference at a low cost. AWS Inferentia accelerators are designed by AWS to deliver high performance at the lowest cost in Amazon EC2 for your deep learning (DL) and generative AI inference applications.
Foundation Models
FMs use self-supervised learning to create labels from input data; however, fine-tuning an FM is a supervised learning process
In supervised learning, you train the model with a set of input data and a corresponding set of paired labeled output data. Unsupervised machine learning is when you give the algorithm input data without any labeled output data. Then, on its own, the algorithm identifies patterns and relationships in and between the data. Self-supervised learning is a machine learning approach that applies unsupervised learning methods to tasks usually requiring supervised learning. Instead of using labeled datasets for guidance, self-supervised models create implicit labels from unstructured data.
Foundation models use self-supervised learning to create labels from input data. This means the model is not trained on manually labeled datasets; instead, it derives implicit labels from the structure of the input itself.
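One concrete way self-supervision creates labels from raw input is next-word prediction: every prefix of a text becomes an input and the word that follows becomes its label, with no human annotation involved. A tiny illustration:

```python
text = "foundation models learn from unlabeled data"
words = text.split()

# (input prefix, next-word label) pairs derived purely from the raw text.
pairs = [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]
```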
Fine-tuning
Fine-tuning a pre-trained foundation model is an affordable way to take advantage of its broad capabilities while customizing the model on your own small corpus of data. Fine-tuning involves further training a pre-trained language model on a specific task or domain-specific dataset, allowing it to address business requirements. Fine-tuning is a customization method that does change the weights of your model.
Fine-tuning an FM is a supervised learning process.
Shapley values
Shapley values provide a local explanation by quantifying the contribution of each feature to the prediction for a specific instance
Use Shapley values to explain individual predictions
Shapley values are a local interpretability method that explains individual predictions by assigning each feature a contribution score based on its marginal effect on the prediction. This method is useful for understanding the impact of each feature on a specific instance’s prediction.
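For small feature sets, exact Shapley values can be computed by averaging each feature's marginal contribution over every ordering. The model and feature values below are invented for illustration (a linear score with missing features treated as a zero baseline), chosen so the result is easy to verify by hand.

```python
from itertools import permutations

def model(features):
    # Score = 2*income + 3*age, treating missing features as 0 (the baseline).
    return 2 * features.get("income", 0) + 3 * features.get("age", 0)

def shapley_values(instance, predict):
    """Exact Shapley values by enumerating all feature orderings."""
    names = list(instance)
    values = {n: 0.0 for n in names}
    orderings = list(permutations(names))
    for order in orderings:
        present = {}
        for name in order:
            before = predict(present)
            present[name] = instance[name]
            values[name] += predict(present) - before  # marginal contribution
    return {n: v / len(orderings) for n, v in values.items()}

phi = shapley_values({"income": 5, "age": 4}, model)
```

For this linear model the contributions are phi["income"] = 10 and phi["age"] = 12, and they sum to the prediction minus the baseline, which is the efficiency property that makes Shapley values a faithful local explanation.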