Flashcards
AI
_________ is a field of computer science dedicated to solving
problems that we commonly associate with human intelligence
Artificial Intelligence
Used to generate new data that is similar to the data it was trained on
* Text
* Image
* Audio
* Code
* Video…
Generative AI
To generate data, we must rely on a __________
* ___________ are trained on a wide variety of input data
* The models may cost tens of millions of dollars to train
Foundation Model
Type of AI designed to generate coherent human-like text
* One notable example: GPT-4 (ChatGPT / Open AI)
* Trained on large corpus of text data
* Usually very big models
* Billions of parameters
* Trained on books, articles, websites, other textual data
* Can perform language-related tasks
* Translation, Summarization
* Question answering * Content creatio
Large Language Models (LLM)
We usually interact with the LLM by giving a ____
prompt
What is the term for below: the generated text may be different for every user that uses
the same prompt
Non-deterministic:
What’s Amazon Titan?
- High-performing Foundation Models from AWS
- Image, text, multimodal model choices via a fully-managed APIs
- Can be customized with your own data
What term goes with this:
-Adapt a copy of a foundation model with your own data
Fine Tuning
*Improves the performance of
a pre-trained FM on domain-specific tasks
* = further trained on a
particular field or area of
knowledge
Instruction based fine tuning
-make a model expert in a specific domain
* For example: feeding the entire AWS
documentation to a model to make it an expert on AWS
* Good to feed industry-specific terminology
into a model (acronyms, etc…)
* Can continue to train the model as more
data becomes available
domain-adaptation fine-tuning
- Part of instruction-based
fine-tuning - system (optional) : context
for the conversation. - messages : An array of
message objects, each
containing: - role :
Either user or assistant - content : The text content
of the message
single turn messaging
- To provide instructionbased fine tuning for a
conversation (vs SingleTurn Messaging) - Chatbots = multi-turn
environment - You must alternate
between “user” and
“assistant” roles
multi turn messaging
True or false: Instruction-based fine-tuning is usually cheaper than re training an FM as computations are
less intense and the amount of data required usually less
true
_________ the broader concept of re-using a pre-trained model to adapt it to a new related task
* Widely used for image classification
* And for NLP (models like BERT and GPT)
transfer learning
This is a good use case of _____
- A chatbot designed with a particular persona or tone, or geared
towards a specific purpose (e.g., assisting customers, crafting
advertisements) - Training using more up-to-date information than what the language
model previously accessed - Training with exclusive data (e.g., your historical emails or messages,
records from customer service interactions) - Targeted use cases (categorization, assessing accuracy)
fine tuning
What does it mean to automatically evaluate a model?
Evaluate a model for quality control.
Scores are calculated automatically
What does it mean to have human evaluation of a model?
- Choose a work team to evaluate
- Employees of your company
- Subject-Matter Experts (SMEs)
- Define metrics and how to evaluate
- Thumbs up/down, ranking
- Curated collections of data designed specifically
at evaluating the performance of language
models - Wide range of topics, complexities, linguistic
phenomena - Helpful to measure: accuracy, speed and
efficiency, scalability
benchmark datasets
_________
* Semantic similarity between generated text
* Uses pre-trained ___ models (Bidirectional Encoder Representations from Transformers) to compare the
contextualized embeddings of both texts and computes the cosine similarity between them.
* Capable of capturing more nuance between the texts
- BERTScore
- Evaluate the quality of generated text, especially for translations
- Considers both precision and penalizes too much brevity
- Looks at a combination of n-grams (1, 2, 3, 4)
BLEU: Bilingual Evaluation Understudy
Evaluating automatic summarization and machine translation systems
* ____-N – measure the number of matching n-grams between reference and generated text
* _____–L – longest common subsequence between reference and generated text
- ROUGE: Recall-Oriented Understudy for Gisting Evaluation
- Allows a Foundation Model to reference a data source outside of its training data
- Bedrock takes care of creating Vector Embeddings in the database of your choice based on your data
- Use where real-time data is needed to be fed into the Foundation Model
- RAG = Retrieval-Augmented Generation
search & analytics database
real time similarity queries, store millions of vector embeddings
scalable index management, and fast nearest-neighbor (kNN) search capability
Amazon OpenSearch Service
[with MongoDB compatibility] – NoSQL database
real time similarity queries, store millions of vector embeddings
Amazon DocumentDB
These are examples of what use cases?
- Customer Service Chatbot
- Knowledge Base – products, features, specifications, troubleshooting guides, and FAQs
- ___application – chatbot that can answer customer queries
- Legal Research and Analysis
- Knowledge Base – laws, regulations, case precedents, legal opinions, and expert analysis
- ____Application – chatbot that can provide relevant information for specific legal queries
- Healthcare Question-Answering
- Knowledge base – diseases, treatments, clinical guidelines, research papers, patients…
- ___application – chatbot that can answer complex medical queries
RAG Use case
__________: converting raw text into a sequence of tokens
Tokenization
- The number of tokens an LLM can consider when generating text
Context Window
What is the first factor to look at when considering a model?
the context window. The larger the context window,
the more information and
coherence
- Control the interaction between users and Foundation Models (FMs)
- Filter undesirable and harmful content
- Remove Personally Identifiable Information (PII)
- Enhanced privacy
- Reduce hallucinations
guardrails
- Create vectors (array of numerical values) out of text, images or audio
- Vectors have a high dimensionality to capture many features for one input
token, such as semantic meaning, syntactic role, sentiment - _____models can power search applications
Embedding
Manage and carry out various multi-step tasks related to infrastructure
provisioning, application deployment, and operational activities
* Task coordination: perform tasks in the correct order and ensure
information is passed correctly between tasks
* _____are configured to perform specific pre-defined action groups
agents
Send logs of all invocations to Amazon
CloudWatch and S3
* Can include text, images and embeddings
* Analyze further and build alerting thanks to
CloudWatch Logs Insights
- Model Invocation Logging
- Published metrics from Bedrock to _________
- CloudWatch Metrics
_________
– give access to
Amazon Bedrock to your team so they can easily create AI
-powered
applications
- Bedrock Studio`
- _________
– check if
an image was generated by
Amazon Titan Generator
Watermark detection
What is the bedrock pricing model for image models?
charged for every image generated
What is the bedrock pricing model for embedding models?
charged for every input
token processed
What is the bedrock pricing model for text models?
charged for every input/output
token processed
Model Improvement Techniques Cost Order:
Put the cheapest at the top and the expensive at the bottom.
Prompt Engineering, Domain Adaptation Fine-tuning, Instruction-based Fine-tuning, Retrieval Augmented Generation (RAG)
- Prompt Engineering
* No model training needed (no additional computation or fine-tuning) - Retrieval Augmented Generation (RAG)
* Uses external knowledge (FM doesn’t need to ”know everything”, less complex)
* No FM changes (no additional computation or fine-tuning) - Instruction-based Fine-tuning
* FM is fine-tuned with specific instructions (requires additional computation) - Domain Adaptation Fine-tuning
* Model is trained on a domain-specific dataset (requires intensive computation)
usually a smaller model will be cheaper (T/F)
True
developing, designing, and optimizing prompts to
enhance the output of FMs for your needs
Prompt engineering
- Prompt gives a lot of guidance and leaves little into the model’s interpretation
True or false
- false, Prompt gives little guidance and leaves a lot to the model’s interpretation
what are four improved prompting techniques?
- Instructions – a task for the model to do (description, how the model should perform)
- Context – external information to guide the model
- Input data – the input for which you want a response
- Output Indicator – the output type or format
A technique where you explicitly instruct the model on what not to include
or do in its response
negative prompting
True or false. Negative prompting aims to avoid Unwanted Content – explicitly states what not to include, reducing the chances
of irrelevant or inappropriate content
true
What is temperature in prompt engineering?
creativity of the model’s output
* Low (ex: 0.2) – outputs are more conservative, repetitive, focused on most likely response
* High (ex: 1.0) – outputs are more diverse, creative, and unpredictable, maybe less coherent
_______ is how fast the model responds
prompt latency
What type of prompt engineering technique is this:
Present a task to the model
without providing examples or
explicit training for that specific task
zero shot prompting
What type of prompt engineering technique is this:
What type of prompt engineering technique is this:
few shots prompting
What type of prompt engineering technique is this:
- Divide the task into a sequence of
reasoning steps, leading to more structure and coherence - Using a sentence like “Think step by step” helps
- Helpful when solving a problem as a human usually requires several steps
Chain of Thought Prompting
What type of prompt engineering technique is this:
- Combine the model’s capability
with external data sources to
generate a more informed and
contextually rich response
Retrieval-Augmented Generation (RAG)
- Simplify and standardize the process of generating Prompts
- Helps with:
- Processes user input text and output prompts from
foundation models (FMs) - Orchestrates between the FM, action groups, and knowledge bases
- Formats and returns responses to the user
- You can also provide examples with few-shots
prompting to improve the model performance
prompt templates
- Amazon QuickSight is used to visualize your
data and create dashboards about them - Amazon Q understands natural language that
you use to ask questions about your data - Create executive summaries of your data * Ask and answer questions of data * Generate and edit visuals for your dashboards
Amazon Q for quicksight
- EC2 instances are the virtual servers you
can start in AWS - Amazon Q for EC2 provides guidance and
suggestions for EC2 instance types that are
best suited to your new workload - Can provide requirements using natural
language to get even more suggestions or
ask for advice by providing other workload
requirements
Amazon Q for EC2
- AWS Chatbot is a way for you to deploy an AWS Chatbot in a Slack
or Microsoft Teams channel that
knows about your AWS account - Troubleshoot issues, receive
notifications for alarms, security
findings, billing alerts, create support
request - You can access Amazon Q directly in
AWS Chatbot to accelerate
understanding of the AWS services,
troubleshoot issues, and identify
remediation paths
Amazon Q for AWS Chatbot
- Fully managed Gen-AI assistant for your employees
- Based on your company’s knowledge and data
- Answer questions, provide summaries, generate content, automate tasks
- Perform routine actions (e.g., submit time-off requests, send meeting invites)
- Built on Amazon Bedrock (but you can’t choose the underlying FM)
Amazon Q for business
- Create Gen AI-powered apps without coding by using natural language
- Leverages your company’s internal data
- Possibility to leverage plugins (Jira, etc…)
Amazon Q apps
- Answer questions about the AWS
documentation and AWS service selection - Answer questions about resources in your AWS
account - Suggest CLI (Command Line Interface) to run
to make changes to your account - Helps you do bill analysis, resolve errors,
troubleshooting
amazon q developer
is a broad field for the development
of intelligent systems capable of
performing tasks that typically require
human intelligence:
Artificial Intelligence
What AI Component is this:
collect vast amount of data
data layer
What AI Component is this:
data scientists and engineer work together to understand use cases, requirements, and
frameworks that can solve them
- ML Framework and Algorithm Layer
What AI Component is this:
implement a model and train it, we have the structure, the parameters and functions, optimizer function
model layer
What AI Component is this:
how to serve the model,
and its capabilities for your users
application layer
- _______ is a type of AI for building methods that allow machines to learn
- Data is leveraged to improve computer performance on a set of task
- Make predictions based on data used to train the model
- No explicit programming of rules
Machine Learning
- Uses neurons and synapses (like our brain) to
train a model - Process more complex patterns in the data
than traditional ML
Deep Learning
True or False: Natural Language Processing is NOT an example of deep learning.
False, it is
Is generative AI a subset of deep learning?
Yes
- Powerful models that can understand and generate
human-like text - Trained on vast amounts of text data from the internet,
books, and other sources, and learn patterns and
relationships between words and phrases - Example: Google BERT, OpenAI ChatGPT
Transformer based LLMs
- Able to process a sentence as a whole instead of word by word
- Faster and more efficient text processing (less
training time) - It gives relative importance to specific words in a
sentence (more coherent sentences
Transformer based LLMs
- Does NOT rely on a single type of input (text, or images, or audio only)
- Does NOT create a single type of output
- Example: a ______ can take a mix of audio, image and text and output a mix of video, text for example
Multi-modal Models
– generate human text or computer code based on input prompt
GPT (Generative Pre-trained Transformer)
– similar intent to GPT,
but reads the text in two directions
BERT (Bidirectional Encoder Representations from Transformers)
– meant for sequential data such as time-series or text,
useful in speech recognition, time-series prediction
RNN (Recurrent Neural Network)
used for image recognition tasks, object detection, facial recognition
ResNet (Residual Network) – Deep Convolutional Neural Network (CNN)
– ML algorithm for classification and regression
SVM (Support Vector Machine)
– model to generate raw audio waveform, used in Speech Synthesis
WaveNet
– models used to generate synthetic data such as images, videos or sounds that resemble the training data. Helpful for data augmentatio
GAN (Generative Adversarial Network)
– an implementation of gradient boosting
XGBoost (Extreme Gradient Boosting)
Data includes both input features and corresponding output labels
labeled data
Data includes only input features without
any output labels
unlabeled data
- Data is organized in a structured format, often in rows and columns (like Excel)
structured data
Data is arranged in a table with rows representing records and columns representing features
tabular data
Data points collected or recorded at successive
points in time
time series data
- Data that doesn’t follow a specific structure and is often text-heavy or multimedia content
unstructured data
- Learn a mapping function that can predict the output for new unseen input data
- Needs labeled data: very powerful, but difficult to perform on millions of datapoints
supervised learning
- Used to predict a numeric value based on input data
- The output variable is continuous, meaning it can
take any value within a range
Supervised Learning – Regression
What type of supervised learning do these scenarios represent?
- Predicting House Prices – based on features like size,
location, and number of bedrooms - Stock Price Prediction – predicting the future price of a
stock based on historical data and other features - Weather Forecasting – predicting temperatures based on
historical weather data
regression
Used to predict the categorical label of input data
* The output variable is discrete, which means it falls into
a specific category or class
* Use cases: scenarios where decisions or predictions
need to be made between distinct categories (fraud,
image classification, customer retention, diagnostics)
Supervised Learning – Classification
- Used to train the model
- Percentage: typically, 60-80% of the dataset
- Example: 800 labeled images from a dataset of 1000 images
training set
- Used to tune model parameters and validate performance
- Percentage: typically, 10-20% of the dataset
- Example: 100 labeled images for hyperparameter tuning
(tune the settings of the algorithm to make it more efficient)
validation set
- Used to evaluate the final model performance
- Percentage: typically, 10-20% of the dataset
- Example: 100 labeled images to test the model’s accuracy
test set
- The process of using domain knowledge to
select and transform raw data into meaningful
features - Helps enhancing the performance of machine
learning models
feature engineering
– extracting useful information from raw data, such as deriving age from date of birth
feature extraction
– selecting a subset of relevant features, like choosing important predictors in a regression model
Feature Selection
– transforming data for better model performance, such as normalizing numerical data
Feature Transformation
- _______ – deriving new features like “price per square foot”
- _________ – identifying and retaining important features such as location
or number of bedrooms - _________ – normalizing features to ensure they are on a similar scale, which helps algorithms like gradient descent converge faster
Feature Creation
Feature Selection
Feature Transformation
- The goal is to discover inherent patterns, structures,
or relationships within the input data - The machine must uncover and create the groups
itself, but humans still put labels on the output groups
unsupervised learning
- Used to group similar data points together into clusters
based on their features
unsupervised learning - clustering
- Use a small amount of labeled data and a
large amount of unlabeled data to train
systems - After that, the partially trained algorithm
itself labels the unlabeled data - This is called pseudo-labeling
- The model is then re-trained on the
resulting data mix without being explicitly
programmed
Semi-supervised Learning
- A type of Machine Learning where an agent
learns to make decisions by performing actions in
an environment to maximize cumulative rewards
reinforcement learning
What are the associated reinforcement learning concepts:
- Key Concepts
- __– the learner or decision-maker
- _____– the external system the agent
interacts with - ____– the choices made by the agent
- ___– the feedback from the environment based
on the agent’s actions - __– the current situation of the environment
- __– the strategy the agent uses to determine
actions based on the state
- Agent – the learner or decision-maker
- Environment – the external system the agent
interacts with - Action – the choices made by the agent
- Reward – the feedback from the environment based
on the agent’s actions - State – the current situation of the environment
- Policy – the strategy the agent uses to determine
actions based on the state
The goal of __________ is to maximize cumulative reward over time.
reinforcement learning
What does RLHF stand for?
- RLHF = Reinforcement Learning from Human Feedback
- Use human feedback to help ML models to self-learn more efficiently
- In Reinforcement Learning there’s a reward function
- RLHF incorporates human feedback in the reward function, to be more
aligned with human goals, wants and needs - First, the model’s responses are compared to human’s responses
- Then, a human assess the quality of the model’s responses
- RLHF = Reinforcement Learning from Human Feedback
In case your model has poor
performance, you need to look at its ___
model fit
What kind of model fit is this:
- Performs well on the training data
- Doesn’t perform well on evaluation data
Overfitting
What kind of model fit is this:
- Model performs poorly on training data
- Could be a problem of having a model too
simple or poor data features
overfitting
What kind of model fit is this:
Model performs poorly on training data
* Could be a problem of having a model too
simple or poor data features
Underfitting
What kind of model fit is this:
-Neither overfitting or underfitting
Balanced fit
- Difference or error between predicted and actual value
- Occurs due to the wrong choice in the ML process
Bias
- The model doesn’t closely match the training data
- Example: linear regression function on a non-linear dataset * Considered as underfitting
High bias
How much the performance of a model changes if
trained on a different dataset which has a similar
distribution
Variance
- Model is very sensitive to changes in the training data
- This is the case when overfitting: performs well on
training data, but poorly on unseen test data
high variance
how can you reduce variance?
Feature selection (less, more important features)
* Split into training and test data sets multiple times
Precision or Recall?
True Positives / (True Positives + False Positives)
Precision
Precision or Recall?
True Positives / (True Positives + False Negatives)
Recall
- _____– Best when false positives are costly
- ____– Best when false negatives are costly
- ______ – Best when you want a balance between precision and recall, especially in imbalanced datasets
- ______ – Best for balanced datasets
Precision
Recall
F1 Score
Accuracy
- AUC-ROC shows what the curve for true positive compared to false positive looks like at various
thresholds, with multiple confusion matrixes - You compare them to one another to find out the threshold you need for your business use case.
AUC-ROC
Area under the curve-receiver operator curve
- ________is when a model is making prediction on new data
Inferencing
- Settings that define the model structure and learning algorithm and process
- Set before training begins
- Examples: learning rate, batch size, number of epochs, and regularization
Hyperparameter
- Finding the best ______ values to optimize the model performance
hyperparameters
What hyperparameter is this:
How large or small the steps are when updating the model’s weights during training
* High ________ can lead to faster convergence but risks overshooting the optimal
solution, while a low learning rate may result in more precise but slower convergence.
learning rate
What hyperparamater is this:
- Number of training examples used to update the model weights in one iteration
- Smaller batches can lead to more stable learning but require more time to compute,
while larger batches are faster but may lead to less stable updates.
batch size
what hyperparameter is this:
- Refers to how many times the model will iterate over the entire training dataset.
- Too few epochs can lead to underfitting, while too many may cause overfitting
Number of epochs
- _______ is when the model gives good predictions for training data
but not for the new data
Overfitting
_______ are pre-trained ML services for your use case
AWS AI Services
- For Natural Language Processing – NLP
- Fully managed and serverless service
- Uses machine learning to find insights and relationships in text
- Language of the text
- Extracts key phrases, places, people, brands, or events
- Understands how positive or negative the text is
- Analyzes text using tokenization and parts of speech
- Automatically organizes a collection of text files by topic
- Sample use cases:
- analyze customer interactions (emails) to find what leads to a positive or negative experience
- Create and groups articles by topics that Comprehend will uncover
Amazon Comprehend
Extracts predefined, general-purpose entities like people, places, organizations, dates, and other standard categories, from text
Named Entity Recognition (NER)
- Natural and accurate language translation
- ________ allows you to localize content - such as websites and
applications - for international users, and to easily translate large volumes of text efficiently.
Amazon Translate
- Automatically convert speech to text
- Uses a deep learning process called automatic speech recognition (ASR) to convert speech to text quickly and accurately
- Automatically remove Personally Identifiable Information (PII) using Redaction
- Supports Automatic Language Identification for multi-lingual audio
- Use cases:
- transcribe customer service calls
- automate closed captioning and subtitling
- generate metadata for media assets to create a fully searchable archive
amazon transcribe
What managed service:
Turn text into lifelike speech using deep learning
* Allowing you to create applications that talk
amazon polly
What AWS Managed service:
- Find objects, people, text, scenes in images and videos using ML
- Facial analysis and facial search to do user verification, people counting
- Create a database of “familiar faces” or compare against celebrities
- Use cases:
- Labeling
- Content Moderation
- Text Detection
- Face Detection and Analysis (gender, age range, emotions…)
- Face Search and Verification
- Celebrity Recognition
- Pathing (ex: for sports game analysis)
Amazon Rekognition
- Fully managed service that uses ML to deliver highly accurate forecasts
- Example: predict the future sales of a raincoat
- 50% more accurate than looking at the data itself
- Reduce forecasting time from months to hours
- Use cases: Product Demand Planning, Financial Planning, Resource Planning, …
amazon foorecast
What managed service:
- Build chatbots quickly for your
applications using voice and text - Example: a chatbot that allows your customers to order pizzas or book a hotel
- Supports multiple languages * Integration with AWS Lambda,
Connect, Comprehend, Kendra - The bot automatically understands the
user intent to invoke the correct
Lambda function to “fulfill the intent” - The bot will ask for ”Slots” (input parameters) if necessary
amazon Lex
- Fully managed ML-service to build apps with real-time personalized recommendations
- Example: personalized product recommendations/re-ranking, customized direct marketing
- Example: User bought gardening tools, provide recommendations on the next one to buy
- Same technology used by Amazon.com
- Integrates into existing websites, applications, SMS, email marketing systems, …
- Implement in days, not months (you don’t need to build, train, and deploy ML solutions)
- Use cases: retail stores, media and entertainment…
Amazon Personalize
- Automatically extracts text, handwriting, and data from any scanned
documents using AI and ML
amazon textract
- Fully managed document search service powered by Machine Learning
- Extract answers from within a document (text, pdf, HTML, PowerPoint, MS Word, FAQs…)
- Natural language search capabilities
- Learn from user interactions/feedback to promote preferred results (Incremental Learning)
- Ability to manually fine-tune search results (importance of data, freshness, custom, …)
Amazon Kendra
- Crowdsourcing marketplace to perform simple human
tasks - Distributed virtual workforce * Example: * You have a dataset of 10,000,000 images and you want to
labels these images - You distribute the task on Mechanical Turk and humans
will tag those images - You set the reward per image (for example $0.10 per
image) - Use cases: image classification, data collection, business
processing
Amazon Mechanical Turk
- Human oversight of Machine Learning predictions in production
- Can be your own employees, over 500,000 contractors from AWS, or AWS Mechanical Turk
- Some vendors are pre-screened for confidentiality requirements
- The ML model can be built on AWS or elsewhere (SageMaker, Rekognition…)
Amazon Augmented AI (A2I)
- Fully autonomous 1/18th scale car race driven by
Reinforcement Learning (RL)
AWS DeepRacer
- Fully managed service for developers / data scientists to build ML models
- Typically, difficult to do all the processes in one place + provision servers
- Example: predicting your AWS exam score
SageMaker
These are examples of a SageMaker service:
- Supervised Algorithms
- Unsupervised Algorithms
- Textual Algorithms
- Image Processing
sagemaker built in algorithms
- Define the Objective Metric
- _____ automatically chooses
hyperparameter ranges, search
strategy, maximum runtime of a tuning job, and early stop
condition - Saves you time and money
- Helps you not wasting money
on suboptimal configurations
SageMaker – Automatic Model Tuning (AMT)
This is an example of batch or asynchronous sagemaker model deployment:
- For large payload sizes up to 1GB
- Long processing times
- Near-real time latency requirements
- Request and responses are in Amazon S3
Asynchronous
This is an example of batch or asynchronous sagemaker model deployment:
- Prediction for an entire dataset (multiple
predictions) - Request and responses are in Amazon S3
batch
- End-to-end ML development from a unified interface
- Team collaboration
- Tune and debug ML models
- Deploy ML models
- Automated workflow
sagemaker studio
- Prepare tabular and image data for machine learning
- Data preparation, transformation and
feature engineering - Single interface for data selection, cleansing, exploration, visualization,
and processing - SQL support * Data Quality tool
SageMaker
– Data Wrangler
_____are inputs to ML
models used during training and
used for inference
* Example - music dataset: song
ratings, listening duration, and
listener demographics
Features
- Ingests features from a variety of sources
- Ability to define the transformation of data into feature from within Feature Store
- Can publish directly from SageMaker Data Wrangler into SageMaker Feature Store
- Features are discoverable within SageMaker Studio
SageMaker – Feature Store
- Evaluate Foundation Models
- Evaluating human-factors such as friendliness or humor
- Leverage an AWS
-managed team
or bring your own employees - Use built
-in datasets or bring your
own dataset - Built-in metrics and algorithms
SageMaker Clarify
- A set of tools to help explain how machine
learning (ML) models make predictions - Understand model characteristics as a whole
prior to deployment - Debug predictions provided by the model
after it’s deployed - Helps increase the trust and understanding of
the model - Example:
- “Why did the model predict a negative outcome
such as a loan rejection for a given applicant?” - “Why did the model make an incorrect
prediction?”
SageMaker Clarify - Model Explainability
- Ability to detect and
explain biases in your
datasets and models - Measure bias using
statistical metrics - Specify input features and
bias will be automatically
detected
SageMaker Clarify – Detect Bias (human)
_______ occurs when the training data does not represent the full population fairly, leading to a model that over-represents or disproportionately affects certain group
Sampling bias
____ occurs when the tools or measurements used
in data collection are flawed or skewed
: Measurement bias
_________ happens when the person collecting or interpreting the data has personal biases that affect the result
Observer bias
_________is when individuals interpret or favor information that confirms their preconceptions. This is more applicable to human
decision-making rather than automated model outputs.
Confirmation bias
- Model review, customization and evaluation
- Align model to human preferences
- Reinforcement learning where human feedback is
included in the “reward” function
- RLHF – Reinforcement Learning from
Human Feedback
- Define roles for personas
- Example: data scientists, MLOps engineers
- SageMaker Role Manager
- Centralized portal where you can view, search, and explore all of your models
- Information and insights for all models
- SageMaker Model Dashboard
- Monitor the quality of your model in production: continuous or on-schedule
- Alerts for deviations in the model quality: fix data & retrain model
- Example: loan model starts giving loans to people who don’t have the correct credit score (drift)
SageMaker – Model Monitor
- Centralized repository allows you to track, manage, and version ML models
- Catalog models, manage model versions, associate metadata with a model
- Manage approval status of a model, automate model deployment, share models…
SageMaker – Model Registry
- a workflow that
automates the process of building,
training, and deploying a ML mode - Continuous Integration and
Continuous Delivery (CI/CD) service
for Machine Learning - Helps you easily build, train, test, and
deploy 100s of models automatically - Iterate faster, reduce errors (no manual
steps), repeatable mechanisms
SageMaker Pipeline –
- ML Hub to find pre-trained Foundation
Model (FM), computer vision models, or
natural language processing models - Large collection of models from Hugging
Face, Databricks, Meta, Stability AI… - Models can be fully customized for your data
and use
-case - Models are deployed on SageMaker directly
(full control of deployment options) - Pre
-built ML solutions for demand
forecasting, credit rate prediction, fraud
detection and computer vision
SageMaker JumpStart * ML Hub to find pre-trained Foundation
- Build ML models using a visual interface
(no coding required) - Access to ready-to-use models from Bedrock or JumpStart
- Build your own custom model using AutoML powered by SageMaker Autopilot
- Part of SageMaker Studio
- Leverage Data Wrangler for data preparation
SageMaker Canvas * Build ML models using a visual interface
an open-source tool which
helps ML teams manage the entire ML
lifecycle
MLFlow
What is sagemaker ground truth used for?
RLHF, humans for model grading and data labeling
Sagemaker role manager is used for ______
access control
- Making sure AI systems are transparent and trustworthy
- Mitigating potential risk and negative outcomes
- Throughout the AI lifecycle: design, development, deployment,
monitoring, evaluation
- Responsible AI
- Ensure to add value and manage risk in the operation of business
- Clear policies, guidelines, and oversight mechanisms to ensure AI
systems align with legal and regulatory requirements - Improve trust
- Governance
- Ensure adherence to regulations and guidelines
- Sensitive domains such as healthcare, finance, and legal applications
Compliance
- Form of responsible AI
documentation - Help understand the service
and its features - Find intended use cases and
limitations - Responsible AI design choices
- Deployment and
performance optimization
best practices
AWS AI Service Cards
- The degree to which a human can understand the cause of a decision
- Access into the system so that a human can interpret the model’s output
- Answer “why and how”
- Interpretability
- Understand the nature and behavior of the
model - Being able to look at inputs and outputs
and explain without understanding exactly
how the model came to the conclusion
explainability
- Show how a single feature can
influence the predicted outcome, while holding other features constant - Particularly helpful when the model
is “black box” (i.e., Neural Networks) - Helps with interpretability and
explainability
Partial Dependence Plots (PDP) * Show how a single feature can
- Approach to design AI systems with priorities for humans’ needs
Human-Centered Design (HCD) for
Explainable AI
Generating content that is offensive, disturbing, or inappropriate
toxicity
- Assertions or claims that sound true, but are incorrect
- This is due to the next
-word probability
sampling employed by LLM
Hallucinations
- Intentional introduction of malicious or biased data
into the training dataset of a model - Leads to the model producing biased, offensive, or
harmful outputs (intentionally or unintentionally)
poisoning
- Influencing the outputs by embedding specific
instructions within the prompts themselves - Hijack the model’s behavior and make it produce
outputs that align with the attacker’s intentions
(e.g., generating misinformation or running
malicious code)
Hijaking and Prompt Injection
- The risk of exposing sensitive or confidential
information to a model during training or
inference - The model can then reveal this sensitive data
from their training corpus, leading to potential
data leaks or privacy violations
exposure
- The unintentional disclosure or leakage of the
prompts or inputs used within a model - It can expose protected data or other data used
by the model, such as how the model works
prompt leaking
- AI models are typically trained with certain ethical and safety constraints in place to prevent misuse or harmful outputs (e.g., filtering out offensive content, restricting access
to sensitive information…) - Circumvent the constraints and safety measures implemented in a generative model to gain
unauthorized access or functionality
jailbreaking
– principles, guidelines, and responsible AI considerations
* Data management, model training, output validation, safety, and human oversight
* Intellectual property, bias mitigation, and privacy protection
policies
– combination of technical, legal, and responsible AI review
* Clear timeline: monthly, quarterly, annually…
* Include Subject Matter Experts (SMEs), legal and compliance teams and end-users
review cadence
- Technical reviews on model performance, data quality, algorithm robustness
- Non-technical reviews on policies, responsible AI principles, regulatory requirements
- Testing and validation procedure for outputs before deploying a new model
- Clear decision-making frameworks to make decisions based on review results
review strategies
- Publishing information about the AI models, training data, key decisions made
- Documentation on limitations, capabilities and use cases of AI solutions
- Channels for end-users and stakeholders to provide feedback and raise concerns
transparency standards
- Train on relevant policies, guidelines, and best practices
- Training on bias mitigation and responsible AI practices
- Encourage cross-functional collaboration and knowledge-sharing
- Implement a training and certification program
team training requirements
- Responsible framework and guidelines (bias, fairness, transparency, accountability)
- Monitor AI and Generative AI for potential bias, fairness issue, and unintended consequences
- Educate and train teams on responsible AI practices
responsible AI
- Attributing and acknowledging the sources of the data * Datasets, databases, other sources * Relevant licenses, terms of use, or permissions
source citation
- Example: generating fake content, manipulated data, automated attacks * Deploy AI-based threat detection systems * Analyze network traffic, user behavior, and other relevant data sources
threat detection
- Identify vulnerabilities in AI systems: software bugs, model weaknesses… * Conduct security assessment, penetration testing and code reviews * Patch management and update processes
vulnerability management
- Secure the cloud computing platform, edge devices, data stores * Access control, network segmentation, encryption * Ensure you can withstand systems failures
infrastructure protection
- Manipulated input prompts to generate malicious or undesirable content
- Implement guardrails: prompt filtering, sanitization, validation
prompt injection
______ – ratio of true positive predictions (correct vs. incorrect positive prediction)
* ______– ratio of true positive predictions compare to actual positive
precision
Recall
True or false:
- AWS responsibility - Security of the Cloud
true
True or false.
- Customer responsibility - not Security in the Cloud
false
- For Bedrock, customer is responsible for data management, access controls,
setting up guardrails, etc… - Encrypting application data
- Make sure models aren’t just developed but also deployed, monitored,
retrained systematically and repeatedly - Extension of DevOps to deploy code regularly
MLOps
- Users or Groups can be
assigned JSON documents
called _____ - These _____define the
______of the users
policies
permissions
- ____are people within your organization, and can be grouped
Users
- EC2 =
Elastic Compute Cloud
_____ is a fully managed data security and data privacy service
that uses machine learning and pattern matching to discover and protect your sensitive data in AWS.
* ___helps identify and alert you to sensitive data, such as personal info.
Amazon Macie
- Helps with auditing and recording compliance of your AWS resources
- Helps record configurations and changes over time
AWS Config
- Automated Security Assessments
Amazon inspector
- Provides governance, compliance and audit for your AWS Account
AWS CloudTrail
- Portal that provides customers with on-demand access to AWS
compliance documentation and AWS agreements
AWS Artifact
- On-demand access to security compliance
reports of Independent Software Vendors
(ISVs)
AWS Artifact - third party reports
- Assess risk and compliance of your AWS workloads
- Continuously audit AWS services usage and prepare audits
AWS Audit Manager
- No need to install anything
– high level AWS account assessment - Analyze your AWS accounts and provides
recommendation on 6 categories: * Cost optimization * Performance * Security * Fault tolerance * Service limits * Operational Excellence
AWS Trusted advisor
private
network to deploy your resources
(regional resource)
- VPC - Virtual Private Cloud
_______allow you to partition your
network inside your VPC
(Availability Zone resource)
Subnets
- A __________ is a subnet that is
accessible from the internet
public subnet
- A _________ is a subnet that is
not accessible from the internet
private subnet
- _____ helps our VPC
instances connect with the internet
Internet Gateway
- ______ (AWS-managed) allow
your instances in your Private Subnets
to access the internet while remaining
private
NAT Gateways
We want to use ________
* Access an AWS service privately without
going over the public internet
* Usually powered by AWS PrivateLink
* Keep your network traffic internal to AWS
* Example: your application deployed in a VPC
can access a Bedrock model privately
VPC endpoints