Exam Questions 1 Flashcards
What is the difference between model evaluation and inferencing?
Model evaluation is the process of evaluating and comparing model outputs to determine the model that is best suited for a use case, whereas model inference is the process of a model generating an output (response) from a given input (prompt).
What is the concept called that defines the maximum amount of text the AI model can process at one time?
This concept is the context window, which determines how much text or information the model can consider at once while generating a response. It is typically measured in tokens rather than characters.
Which is the default vector database supported by Knowledge Bases for Amazon Bedrock?
OpenSearch Serverless vector store
A company needs large, high-quality, and labeled datasets for training its machine learning models. Which Amazon SageMaker service helps build high-quality training datasets?
Amazon SageMaker Ground Truth. It offers built-in task types to have workers generate specific types of labels for your data. You can also build a custom labeling workflow to provide your own UI and tools to workers labeling your data.
What is the effect of increasing the number of epochs?
Increasing the number of epochs allows the model to learn from the training data for a longer period, potentially capturing more complex patterns and relationships, which can improve accuracy. Training runs for multiple epochs until the model's accuracy reaches an acceptable level or the error rate drops below an acceptable threshold. Too many epochs can cause the model to overfit the training data.
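The stopping rule in this card can be sketched as a loop that runs one more epoch until the error falls below a threshold. This is a minimal illustration on a toy problem (fitting y = w * x), not a production training loop; `train_until`, `target_error`, `max_epochs`, and `lr` are hypothetical names chosen for the example.

```python
def train_until(data, target_error=1e-4, max_epochs=1000, lr=0.01):
    """Fit y = w * x by gradient descent, one full pass over data per epoch."""
    w = 0.0
    for epoch in range(1, max_epochs + 1):
        for x, y in data:
            # Gradient of the squared error (w*x - y)^2 with respect to w.
            w -= lr * 2 * (w * x - y) * x
        # Mean squared error measured after this epoch.
        error = sum((w * x - y) ** 2 for x, y in data) / len(data)
        if error < target_error:
            break  # error is acceptable, so stop running epochs
    return w, epoch, error

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy samples of y = 2x
w, epochs_run, error = train_until(data)
```

Raising `max_epochs` only helps while the error is still falling; on real data a held-out validation set is normally used to decide when to stop.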
What are the differences between Amazon Bedrock and Amazon Q?
Amazon Q is a generative AI-powered assistant that allows you to create pre-packaged generative AI applications, whereas Amazon Bedrock provides an environment to build and scale generative AI applications using a Foundation Model (FM).
With Amazon Bedrock, you can choose the underlying Foundation Model. Amazon Q does not allow you to choose the underlying Foundation Model.
What is transfer learning?
Transfer learning is a method where a model pre-trained on one task is adapted to improve performance on a different but related task by leveraging knowledge from the original task.
What is a key difference between Foundation Models (FMs) and Large Language Models (LLMs) in the context of generative AI?
Foundation Models serve as a broad base for various AI applications by providing generalized capabilities, whereas Large Language Models are specialized for understanding and generating human language
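Which AWS services perform sentiment analysis?
Amazon Comprehend, which detects sentiment in text, and Amazon Transcribe, which converts speech to text and reports sentiment through its Call Analytics feature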
Which setting in Amazon Bedrock controls the creativity of a response?
Temperature. The higher the value, the more varied and creative the response; lower values make the output more predictable.
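One common way temperature is applied is to divide the model's raw scores (logits) by the temperature before converting them to probabilities. The sketch below assumes that formulation; how a specific Bedrock model applies the setting internally is not stated in this deck, and the logits are made-up values.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities, scaled by temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, temperature=0.2)
hot = softmax_with_temperature(logits, temperature=2.0)
```

With the low temperature almost all probability sits on the top token, while the high temperature spreads it across the candidates, which is why sampled output becomes more varied.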
How do transformer models work?
Transformer models use a self-attention mechanism and implement contextual embeddings
Transformer models are a type of neural network architecture designed to handle sequential data, such as language, in an efficient and scalable way. They rely on a mechanism called self-attention to process input data, allowing them to understand and generate language effectively. Self-attention allows the model to weigh the importance of different words in a sentence when encoding a particular word. This helps the model capture relationships and dependencies between words, regardless of their position in the sequence.
Positional encodings provide information about word order, and the encoder-decoder architecture enables effective processing and generation of sequences. This makes transformers highly effective for tasks like language translation and text generation.
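The self-attention step can be sketched in a few lines. This is a simplified illustration that uses the token vectors directly as queries, keys, and values; real transformers first apply learned projection matrices and use multiple attention heads, which are omitted here. The token vectors are toy values.

```python
import math

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with queries = keys = values = x."""
    d = len(x[0])
    output, weights = [], []
    for query in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        attn = softmax(scores)
        weights.append(attn)
        # Each output vector is a weighted average of all token vectors.
        output.append([sum(a * v[j] for a, v in zip(attn, x)) for j in range(d)])
    return output, weights

tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy 2-d token embeddings
contextual, weights = self_attention(tokens)
```

Each row of `weights` shows how much one token attends to every other token, regardless of position; similar tokens receive larger weights.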
What is the most effective approach to implement this access control and maintain data security in Amazon Bedrock?
The company should create a service role for Amazon Bedrock for each team, granting access only to that team's client data in Amazon S3.
Is a decision tree supervised or unsupervised?
Supervised
In order of increasing complexity, what are the main ways to improve a model's answers?
Prompt engineering, Retrieval Augmented Generation (RAG), Fine-tuning
What are the differences between Retrieval Augmented Generation (RAG) and Agents in the context of Amazon Bedrock?
RAG refers to querying and retrieving information from a data source to augment a generated response to a prompt, whereas an Agent is an application that carries out orchestrations by cyclically interpreting inputs and producing outputs using a foundation model.
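The RAG half of this card can be sketched as "retrieve, then augment the prompt". The example below is a toy: it ranks documents by shared words instead of using embeddings or a vector store, and it stops before calling any model. `retrieve`, `build_prompt`, and the sample documents are hypothetical.

```python
def retrieve(query, documents, top_n=1):
    """Rank documents by how many query words they share (toy retriever)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_n]

def build_prompt(query, documents):
    """Augment the user's question with the retrieved context."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

documents = [
    "Refunds are processed within 5 business days.",
    "Support is available on weekdays from 9 to 5.",
]
prompt = build_prompt("How long do refunds take", documents)
```

The resulting `prompt` is what would be sent to the foundation model, so the answer is grounded in the retrieved text rather than in model knowledge alone.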
What performance metrics would you recommend to the team for evaluating the effectiveness of its classification system?
Precision, Recall and F1-Score
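These three metrics follow directly from the counts of true positives, false positives, and false negatives. A minimal sketch with made-up labels (1 = positive class); `classification_metrics` is a hypothetical helper name:

```python
def classification_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels where 1 is positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Precision: of everything predicted positive, how much was right.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
precision, recall, f1 = classification_metrics(y_true, y_pred)
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all come out to 0.75.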
What type of datasets should be used to detect bias?
Use benchmark datasets, which are pre-compiled, standardized datasets specifically designed to test for biases and discrimination in model outputs.
In Amazon Q Business, what are the sources of model responses?
Amazon Q Business chat responses can be generated using model knowledge and enterprise data, or enterprise data only.
What controls are available in Amazon Q Business?
Amazon Q Business guardrails support topic-specific controls to determine the web application environment’s behavior when it encounters a mention of a blocked topic by an end-user
What's the difference between feature extraction and feature selection?
Feature extraction reduces the number of features by transforming data into a new space, while feature selection reduces the number of features by selecting the most relevant ones from the existing features
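The contrast can be shown on a tiny table of numbers. In this sketch, selection keeps the original columns with the highest variance, and extraction builds one new column as a weighted mix of all columns. Both criteria are assumptions for illustration: the fixed weights stand in for what a method such as PCA would learn from the data, and variance is only one of many possible selection criteria.

```python
def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def select_features(rows, keep):
    """Feature selection: keep the original columns with the highest variance."""
    columns = list(zip(*rows))
    ranked = sorted(range(len(columns)), key=lambda j: variance(columns[j]), reverse=True)
    chosen = sorted(ranked[:keep])
    return [[row[j] for j in chosen] for row in rows], chosen

def extract_feature(rows, weights):
    """Feature extraction: build one new feature as a weighted mix of all columns."""
    return [sum(w * v for w, v in zip(weights, row)) for row in rows]

rows = [[1.0, 10.0, 0.5], [2.0, 20.0, 0.5], [3.0, 30.0, 0.6]]
selected, kept_columns = select_features(rows, keep=2)
combined = extract_feature(rows, weights=[0.5, 0.05, 0.0])  # hypothetical weights
```

`selected` still contains original columns (here columns 0 and 1), while `combined` is a new feature that did not exist in the input.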
What is Top K?
Top K sets the number of most-likely candidates that the model considers for the next token. Lowering the value restricts the model to the most likely, more conservative choices.
What is Top P?
Top P limits the candidate pool to the most likely tokens whose cumulative probability reaches P. A lower value shrinks the pool. Remember that it is a percentage (a probability threshold) rather than a fixed number of tokens.
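Both settings can be pictured as filters over the next-token probability distribution. The sketch below assumes the common formulation: Top K keeps a fixed number of candidates, and Top P keeps the smallest set of most likely candidates whose cumulative probability reaches P. The token probabilities are toy values, and the function names are hypothetical.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def top_p_filter(probs, p):
    """Keep the most likely tokens until their cumulative probability reaches p."""
    kept, cumulative = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}

probs = {"horses": 0.7, "zebras": 0.2, "unicorns": 0.1}  # toy distribution
by_k = top_k_filter(probs, k=2)
by_p_low = top_p_filter(probs, p=0.65)
by_p_high = top_p_filter(probs, p=0.8)
```

With `k=2` the pool is always two tokens. With Top P the pool size depends on the distribution: a threshold of 0.65 keeps only "horses", while 0.8 also admits "zebras".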
When does overfitting occur?
Overfitting occurs when the model is overly complex and captures noise or random fluctuations in the training data rather than the underlying patterns
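What's the difference between CNNs and RNNs?
CNNs are suited to spatial data and are used for single-image analysis, whereas RNNs are suited to sequential data and are used for video analysis, where the order of frames matters.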
What is inferencing?
Inferencing is the stage where the model uses its trained parameters to generate a prediction or output based on new input data provided by the user.
What is binary classification?
Binary classification problems predict a binary outcome (one of two possible classes).
What is a multiclass model?
Multiclass models generate predictions for multiple classes (predicting one of more than two outcomes).
What is a regression model?
Regression problems predict a numeric value.
What are embeddings?
Embeddings convert real-world objects into complex mathematical representations that capture inherent properties and relationships between real-world data.
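Because embeddings are vectors, relationships between objects can be measured numerically, for example with cosine similarity. The vectors below are invented 3-dimensional values for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: related concepts point in similar directions.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}
cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
```

In this toy data "cat" and "dog" score much closer to 1.0 than "cat" and "car", which is the property that similarity search in a vector database relies on.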
What are embedding models?
Embedding models are algorithms trained to encapsulate information into dense representations in a multi-dimensional space. Data scientists use embedding models to enable ML models to comprehend and reason with high-dimensional data.
Describe BERT.
BERT is a transformer-based language model trained with massive datasets to understand languages like humans do. Like Word2Vec, BERT can create word embeddings from input data it was trained with. Additionally, BERT can differentiate contextual meanings of words when applied to different phrases. For example, BERT creates different embeddings for ‘play’ as in “I went to a play” and “I like to play.”
In binary classification, when is precision the best metric?
When false positives are expensive
In binary classification, when is recall the best metric?
When false negatives are expensive
When would I use the F1 metric?
When you need a balance between precision and recall
Which metric is best for binary classification on balanced datasets?
Accuracy
Which file formats does Amazon Q Business accept?
PDF, XSLT, XML, HTML, MD, CSV, Excel, Word, RTF, JSON, Google Slides
What is the role of Amazon Bedrock Agents?
With Amazon Bedrock Agents, you can build and configure autonomous agents in your application. An agent helps your end users complete actions based on organization data and user input. Agents orchestrate interactions between FMs, data sources, software applications, and user conversations.
What's the difference between continued pre-training and fine-tuning?
Continued pre-training uses unlabeled data to further pre-train a model, whereas fine-tuning uses labeled data to train a model for a specific task.
Name some ways to reduce overfitting.
Pruning, data augmentation, early stopping, regularisation, ensembling
What is the difference between Shapley values and PDP?
Shapley values provide a local explanation by quantifying the contribution of each feature to the prediction for a specific instance, while PDP provides a global explanation by showing the marginal effect of a feature on the model’s predictions across the dataset. Use Shapley values to explain individual predictions and PDP to understand the model’s behavior at a dataset level
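The local versus global distinction can be made concrete with a toy model. The sketch below computes exact Shapley values for one instance by averaging marginal contributions over every feature ordering (feasible only for a handful of features), and a partial dependence curve by averaging predictions over a dataset. The model, baseline, and data are hypothetical; libraries used in practice rely on approximations.

```python
from itertools import permutations

def model(x):
    """Toy model (hypothetical): a linear term plus an interaction."""
    return 2.0 * x[0] + x[1] * x[2]

def shapley_values(predict, instance, baseline):
    """Average each feature's marginal contribution over all orders in which
    features are switched from the baseline value to the instance value."""
    n = len(instance)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for j in order:
            current[j] = instance[j]
            value = predict(current)
            totals[j] += value - previous
            previous = value
    return [t / len(orders) for t in totals]

def partial_dependence(predict, dataset, feature, grid):
    """For each grid value, set the feature to it in every row and average."""
    curve = []
    for g in grid:
        preds = [predict(row[:feature] + [g] + row[feature + 1:]) for row in dataset]
        curve.append(sum(preds) / len(preds))
    return curve

dataset = [[0.0, 1.0, 1.0], [1.0, 2.0, 0.0], [2.0, 0.0, 3.0]]
baseline = [0.0, 0.0, 0.0]
local = shapley_values(model, [1.0, 2.0, 3.0], baseline)
global_curve = partial_dependence(model, dataset, feature=0, grid=[0.0, 1.0, 2.0])
```

`local` explains a single prediction, and its entries add up to the difference between the prediction for the instance and the prediction for the baseline. `global_curve` shows how the average prediction moves as feature 0 changes across the whole dataset.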