Domain 4: Responsible AI Guidelines 14% Flashcards
_____ is a set of guidelines and principles to ensure that AI systems operate in a safe, trustworthy, and ethical manner.
Responsible AI
_____ aims to ensure that models treat everyone equitably and impartially, regardless of their age, where they live, their gender, or their ethnicity.
Fairness
Being able to explain in human terms why a model made a particular decision is known as _____.
Explainability
_____ makes sure that AI systems are tolerant of failures and minimize errors.
Robustness
_____ are primarily about protecting user privacy and not exposing personally identifiable information, or PII.
Privacy and security
_____ is about meeting and auditing compliance with industry standards and best practices, including properly estimating and mitigating risk.
Governance
_____ is about providing clear information about model capabilities, limitations, and potential risks to stakeholders; it includes making sure that users know when they are interacting with AI.
Transparency
Fairness avoids perpetuating or amplifying _____ and _____ through AI systems.
societal biases and discrimination
Fairness of a model is measured by _____.
the bias and variance of outcomes across different groups
_____ becomes a problem when the training dataset is not representative of the real world. As a result, the model only performs well on inputs that resemble the training data.
Overfitting
_____ can occur for some groups when there wasn’t enough training data that matched their characteristics, so the model doesn’t perform well for them.
Underfitting
One of the main reasons for model bias is _____, which occurs when a feature value has fewer training samples when compared with another value in the dataset.
class imbalance
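As a rough illustration of the card above, class imbalance can be expressed as a normalized difference in sample counts between two values of a feature. The sketch below assumes the formula CI = (n_a - n_d) / (n_a + n_d), which is how SageMaker Clarify documents its class imbalance metric; the function name and the sample data are hypothetical.

```python
from collections import Counter

def class_imbalance(values, advantaged, disadvantaged):
    """Normalized difference in sample counts between two feature values.

    Returns a number in [-1, 1]; 0 means both values are equally represented.
    """
    counts = Counter(values)
    n_a = counts[advantaged]
    n_d = counts[disadvantaged]
    if n_a + n_d == 0:
        raise ValueError("neither value appears in the data")
    return (n_a - n_d) / (n_a + n_d)

# Hypothetical feature column: 8 samples from group "A", 2 from group "B".
groups = ["A"] * 8 + ["B"] * 2
ci = class_imbalance(groups, "A", "B")
print(f"class imbalance: {ci:.2f}")
```

A value near +1 or -1 suggests one value dominates the training data, which is a signal to collect more samples or rebalance before training.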
What is the foundation of responsible AI?
Ethical datasets
_____ is about representing diverse populations, perspectives, and experiences in our training data.
Inclusivity
_____ means incorporating a wide range of attributes, features, and variables to avoid bias.
Diversity
_____ involve obtaining informed consent from data subjects and providing clear information about data usage.
Consent and transparency
What ethical practices/factors must be considered when choosing a model?
- Environmental impact
- Sustainability
- Transparency
- Accountability
- Stakeholder engagement
_____ means establishing clear lines of responsibility for AI model outcomes and decision making.
Accountability
_____ are imbalances in data, or disparities in the performance of a model across different groups.
Biases
_____ helps you mitigate bias by examining specific attributes to detect potential bias during data preparation, after model training, and in your deployed model. It also improves explainability by looking at the inputs and outputs for your model, treating the model itself as a black box.
SageMaker Clarify
T/F: SageMaker Clarify can understand the basis for how deep learning models are making predictions without understanding the inner workings.
True
SageMaker Clarify examines your dataset and model by using processing jobs, which use the SageMaker Clarify processing container to interact with an Amazon S3 bucket that contains your input datasets and with a model that is deployed to _____.
a SageMaker inference endpoint
The SageMaker Clarify processing container obtains _____ and _____ from an S3 bucket.
the input dataset and the configuration for analysis
For _____, the SageMaker Clarify processing container sends requests to the model container, and retrieves model predictions from the response from the model container.
feature analysis
What analysis results does the SageMaker Clarify processing container save to the S3 bucket?
- JSON file w/ bias metrics
- visual report
- additional files for local feature attributions
The _____ metric indicates whether a particular class has a larger proportion of the rejected outcomes in the dataset than the accepted outcomes.
demographic disparity
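The demographic disparity card can be made concrete with a small calculation. This is a minimal sketch, assuming the metric is the group's share of rejected outcomes minus its share of accepted outcomes; the function name, label encoding (0 = rejected, 1 = accepted), and data are hypothetical.

```python
def demographic_disparity(groups, labels, group, rejected=0, accepted=1):
    """Share of rejected outcomes held by `group` minus its share of accepted ones.

    A positive result means the group is over-represented among rejections.
    """
    n_rejected = sum(1 for y in labels if y == rejected)
    n_accepted = sum(1 for y in labels if y == accepted)
    if n_rejected == 0 or n_accepted == 0:
        raise ValueError("need at least one rejected and one accepted outcome")
    group_rejected = sum(
        1 for g, y in zip(groups, labels) if g == group and y == rejected
    )
    group_accepted = sum(
        1 for g, y in zip(groups, labels) if g == group and y == accepted
    )
    return group_rejected / n_rejected - group_accepted / n_accepted

# Hypothetical loan data: group "A" has 6 applicants, group "B" has 4.
groups = ["A"] * 6 + ["B"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]
dd_b = demographic_disparity(groups, labels, "B")
print(f"demographic disparity for B: {dd_b:.2f}")
```

Here group "B" holds 3 of the 5 rejections but only 1 of the 5 acceptances, so the metric is positive.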
The _____ metric indicates whether the model predicts positive outcomes differently for each class. This metric can be compared with the label imbalance in the training data. The goal is to see whether the bias in positive proportions changes after the training, or whether the bias is also present in the data.
difference in positive proportions in predictions
_____ measures how often the model correctly predicts a negative outcome.
specificity
The _____ metric is the difference in recall of the model between two classes. Any difference in these recalls is a potential form of bias.
recall difference
The _____ metric is the difference between the prediction accuracies for different classes. This result can occur when the data contains class imbalance.
accuracy difference
The _____ metric is the difference between two classes in the ratio of false negatives to false positives.
treatment equality
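The post-training metrics in the preceding cards (difference in positive proportions in predictions, specificity, recall difference, accuracy difference, and treatment equality) can all be derived from per-group confusion matrices. The sketch below is illustrative only: the class and function names and the counts are hypothetical, and the sign conventions (group a minus group d, except treatment equality, taken as d minus a) are an assumption.

```python
from dataclasses import dataclass

@dataclass
class Confusion:
    """Confusion-matrix counts for one class (group) of a sensitive attribute."""
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    def positive_rate(self) -> float:
        # Share of samples the model predicted as positive.
        return (self.tp + self.fp) / self.total

    def recall(self) -> float:
        # How often actual positives are predicted as positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # How often actual negatives are predicted as negative.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / self.total

    def fn_to_fp(self) -> float:
        return self.fn / self.fp

def bias_report(a: Confusion, d: Confusion) -> dict:
    """Differences between group a and group d for each metric."""
    return {
        "dppl": a.positive_rate() - d.positive_rate(),
        "recall_difference": a.recall() - d.recall(),
        "accuracy_difference": a.accuracy() - d.accuracy(),
        "treatment_equality": d.fn_to_fp() - a.fn_to_fp(),
    }

# Hypothetical counts for two groups of 100 samples each.
group_a = Confusion(tp=40, fp=10, tn=40, fn=10)
group_d = Confusion(tp=20, fp=5, tn=55, fn=20)
report = bias_report(group_a, group_d)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

In this made-up data the model predicts positives more often for group a and has much lower recall for group d, both of which would be worth investigating.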
_____ is the result of the AI model attempting to fill in the gaps when something is missing in its training data.
Hallucination
T/F: Data privacy is a risk because sensitive data that makes its way into a large language model can leak and be incorporated into its output.
True
T/F: Fortunately, for foundation models in Amazon Bedrock, you can configure guardrails to filter and block inappropriate content using plaintext to describe topics that should be denied.
True
What are the 4 types of tasks that can be run in SageMaker Clarify? In which other service can these tasks also be run?
- text generation
- text classification
- question/answer
- text summarization
Amazon Bedrock
_____ measures the probability of your model including biases in its response. It includes biases for race, gender, sexual orientation, religion, age, nationality, disability, physical appearance, and socioeconomic status.
Prompt stereotyping
_____ checks your model for sexual references; rude, unreasonable, hateful, or aggressive comments; profanity; insults; flirtations; attacks on identities; and threats.
Toxicity
_____ checks the veracity of the model responses.
Factual knowledge
_____ checks whether your model output changes because of keyword typos, random changes to uppercase, and random additions or deletions of white spaces.
Semantic robustness
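A semantic robustness check can be pictured as perturbing a prompt and counting how often the model's output stays the same. The sketch below is a simplified stand-in, not the SageMaker Clarify implementation: it only applies random case changes and whitespace additions or deletions (typos are left out), and `toy_model` is a hypothetical keyword rule used in place of a real model.

```python
import random

def perturb(text, rng, rate=0.2):
    """Randomly flip case, add a space, or drop an existing space."""
    out = []
    for ch in text:
        if rng.random() >= rate:
            out.append(ch)
            continue
        op = rng.choice(["case", "add_space", "drop_space"])
        if op == "case":
            out.append(ch.swapcase())
        elif op == "add_space":
            out.append(ch + " ")
        elif ch != " ":
            # drop_space only removes existing spaces; other characters stay.
            out.append(ch)
    return "".join(out)

def normalize(text):
    """Remove whitespace and case so perturbed text can be compared."""
    return "".join(text.split()).lower()

def toy_model(text):
    # Hypothetical stand-in for a real model: a keyword-based sentiment rule.
    return "positive" if "good" in text.lower() else "negative"

def robustness_score(model, text, trials=50, seed=0):
    """Fraction of perturbed prompts for which the model output is unchanged."""
    rng = random.Random(seed)
    expected = model(text)
    same = sum(model(perturb(text, rng)) == expected for _ in range(trials))
    return same / trials

prompt = "The service was good and fast"
score = robustness_score(toy_model, prompt)
print(f"outputs unchanged in {score:.0%} of perturbed prompts")
```

A score well below 100% would suggest the model is sensitive to surface-level noise that should not change the meaning of the input.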
_____ compares the model output to the expected responses, such as classifying and summarizing the data correctly.
Accuracy
A model’s _____ measures the degree to which ML owners and stakeholders can understand how a model works and why it produces its outputs. How much is required often depends on regulatory requirements that protect consumers against bias and unfairness.
transparency
What are the two measures of transparency?
interpretability and explainability
T/F: Linear regression would be on the simple end and neural network on the complex end.
True
_____ is being able to describe what a model is doing without knowing exactly how. It treats the model as a black box, so every model can be observed and explained.
Explainability
T/F: When starting a new AI or ML project, we need to know whether interpretability is a hard business requirement.
True
T/F: With interpretability, you can document how the inner mechanisms of the model impact the output, but explainability does not consider the inner mechanisms.
True
What are two tradeoffs you should consider when choosing a model w/ high transparency?
performance and security
T/F: Transparent AI models are more susceptible to attacks because hackers have more information about the inner mechanisms and can find vulnerabilities in the model.
True
Transparency might require sharing details about the data that is used to train the model, which raises concerns about _____.
data privacy
Where are many open source AI projects hosted and developed?
GitHub
T/F: When using a fully trained model that’s hosted by AWS, you only interact with APIs and have no direct access to the model.
True
_____ are a form of responsible AI documentation. They provide customers with a single place to learn about the intended use cases, limitations, responsible AI design choices, and deployment and performance optimization best practices.
AI service cards
Match faces with _____, analyze IDs with _____, and detect PII with _____.
Amazon Rekognition
Amazon Textract
Amazon Comprehend
For models that you create, you can use _____ to help document the lifecycle of a model from designing, building, training, and evaluation.
SageMaker Model Cards
T/F: When you create a model card, SageMaker autopopulates details about your SageMaker trained model in the card.
True
You can use _____ to determine the contribution that each feature made to the model predictions.
Shapley values
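To make the idea of feature contributions concrete, the sketch below computes exact Shapley values for a tiny two-feature model by averaging each feature's marginal contribution over every ordering of the features. This brute-force enumeration is only practical for a handful of features and is not how SageMaker Clarify computes them; the model, baseline, and function names are hypothetical.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`, relative to `baseline`."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            # Switch feature i from its baseline value to its actual value
            # and credit the change in the prediction to that feature.
            current[i] = x[i]
            value = model(current)
            phi[i] += value - previous
            previous = value
    return [total / len(orderings) for total in phi]

def toy_model(features):
    # Hypothetical model with an interaction term between the two features.
    x1, x2 = features
    return 2 * x1 + 3 * x2 + x1 * x2

x = [1.0, 2.0]
baseline = [0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features.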
This plot shows you how a model’s predictions change for different values of a feature.
partial dependence plot
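The values behind such a plot can be sketched in a few lines: for each candidate value of one feature, overwrite that feature in every row of the dataset and average the model's predictions. The model, data, and function names below are hypothetical, and plotting is left out so the example stays dependency-free.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average prediction when one feature is forced to each value in `grid`."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            total += model(modified)
        curve.append(total / len(rows))
    return curve

def toy_model(row):
    # Hypothetical model: prediction rises by 2 per unit of feature 0.
    return 2 * row[0] + row[1]

rows = [[0.0, 1.0], [0.0, 3.0]]
grid = [0.0, 1.0, 2.0]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=grid)
print(list(zip(grid, curve)))
```

Plotting `grid` on the x-axis against `curve` on the y-axis gives the partial dependence plot for that feature.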
In _____, designers and developers engage in interdisciplinary collaboration and often involve psychologists, ethicists, and domain experts to collect diverse perspectives and expertise.
human-centered AI
_____ incorporates human review for samples of the inferences made by an AWS AI service or a custom model. You can configure it to send inferences with low-confidence scores to human reviewers before sending them to the client. Their feedback can then be added to training data to re-train the model.
Amazon Augmented AI or Amazon A2I
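The routing idea in the card above, where low-confidence inferences go to human reviewers, can be sketched without any AWS calls. The example below is a plain-Python illustration of the thresholding logic only; it does not use the Amazon A2I API, and the function name, record fields, and the 0.8 threshold are assumptions.

```python
def split_by_confidence(inferences, threshold=0.8):
    """Separate inferences that can be returned directly from those needing review."""
    auto_approved, needs_review = [], []
    for item in inferences:
        target = auto_approved if item["confidence"] >= threshold else needs_review
        target.append(item)
    return auto_approved, needs_review

# Hypothetical inference results from a document classifier.
inferences = [
    {"id": 1, "label": "invoice", "confidence": 0.97},
    {"id": 2, "label": "receipt", "confidence": 0.55},
    {"id": 3, "label": "invoice", "confidence": 0.80},
]
auto_approved, needs_review = split_by_confidence(inferences)
print("sent to client:", [item["id"] for item in auto_approved])
print("sent to human review:", [item["id"] for item in needs_review])
```

Reviewed items, once corrected by a person, are the ones that can be fed back into the training data.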
_____ is an industry standard technique for ensuring that large language models produce content that is truthful, harmless, and helpful.
Reinforcement learning from human feedback, or RLHF
To use this technique, you train a separate model that serves as a reward model. The reward model is trained by humans who review multiple responses from the large language model for the same prompt and indicate their preferred response. Their preferences become the training data for the reward model, which, once trained, can predict how highly a human would score a response to a prompt. The large language model then uses the reward model to refine its responses for maximum reward.
RLHF
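One common way to turn human preferences into a training signal for the reward model is a pairwise loss: the loss is small when the reward model scores the preferred response above the rejected one, and large otherwise. The sketch below assumes that pairwise formulation, -log(sigmoid(r_chosen - r_rejected)); the card does not specify a loss, and the function names and scores are hypothetical.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(reward_chosen - reward_rejected))."""
    margin = reward_chosen - reward_rejected
    # log(1 + exp(-margin)) is the same quantity written in a stable form.
    return math.log1p(math.exp(-margin))

def mean_preference_loss(pairs):
    """Average loss over (chosen_score, rejected_score) pairs."""
    return sum(preference_loss(c, r) for c, r in pairs) / len(pairs)

# Hypothetical reward-model scores for three human-labeled comparisons.
pairs = [(2.0, 0.5), (1.0, 1.0), (0.2, 1.4)]
loss = mean_preference_loss(pairs)
print(f"mean preference loss: {loss:.3f}")
```

Training the reward model means adjusting it to lower this loss, so that its scores agree with the reviewers' choices more often.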
Collecting the preferences from humans for RLHF can be accomplished most readily with _____.
SageMaker Ground Truth