Domain 4 Flashcards
Responsible AI
is a set of guidelines and principles to ensure that AI systems operate in a safe, trustworthy, and ethical manner
Fairness
aims to ensure that models treat everyone equitably and impartially, regardless of their age, where they live, their gender, or their ethnicity
It’s important to be able to explain in human terms why a model made a particular decision.
Explainability
Transparency
is about providing clear information about model capabilities, limitations, and potential risks to stakeholders. Transparency includes making sure that users know when they are interacting with AI.
Fairness of a model is measured by
the bias and variance of outcomes across different groups.
Overfitting becomes a problem when
the training dataset is not representative of the real world. As a result, the model only performs well on inputs that resemble the training data.
Underfitting can occur for some groups when
there wasn’t enough training data that matched their characteristics, so the model doesn’t perform well for them.
Class imbalance occurs when
a feature value has fewer training samples when compared with another value in the dataset. For example, a sex feature in which women constitute 32.4% of the training data and men constitute 67.6% is imbalanced.
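The class imbalance example above can be put into numbers. The formula below is the standard normalized-difference definition used by bias-analysis tools such as SageMaker Clarify; the counts are chosen to match the flashcard's proportions.

```python
# Class imbalance (CI) for a facet (feature value):
#   CI = (n_a - n_d) / (n_a + n_d)
# where n_a and n_d are the sample counts for the advantaged and
# disadvantaged groups. CI = 0 means perfect balance; values near
# +1 or -1 mean severe imbalance.

def class_imbalance(n_a: int, n_d: int) -> float:
    return (n_a - n_d) / (n_a + n_d)

# Counts matching the example proportions (67.6% men, 32.4% women):
print(round(class_imbalance(676, 324), 3))  # 0.352
```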
These are crucial for conducting periodic reviews of datasets to identify and address potential issues or biases
Regular Audits
Consider using these as a starting point to reduce the amount of training that your model needs, reducing your environmental impact and sustainability
already-trained model. Reuse of existing work is the key principle of sustainability
Transparency is
about providing clear information about model capabilities, limitations, and potential risks. It also means making sure that users know when they are using AI.
Accountability
means establishing clear lines of responsibility for AI model outcomes and decision making.
Biases
are imbalances in data, or disparities in the performance of a model across different groups.
SageMaker Clarify helps you mitigate bias by
detecting potential bias during the data preparation, after model training, and in your deployed model, by examining specific attributes.
SageMaker Clarify can improve explainability by
looking at the inputs and outputs for your model, treating the model itself as a black box. By making these observations, it determines the relative importance of each feature.
How does SageMaker Clarify evaluate bias, etc?
SageMaker Clarify examines your dataset and model by using processing jobs. A SageMaker Clarify processing job uses the SageMaker Clarify processing container, which obtains the input dataset and the configuration for analysis from an Amazon S3 bucket. For feature analysis, the processing container sends requests to the model container (your model deployed to a SageMaker inference endpoint) and retrieves the model predictions from the responses. The processing container then computes the analysis results and saves them to the S3 bucket. These results include a JSON file with bias metrics and global feature attributions, a visual report, and additional files for local feature attributions. You can download the results from the output location and view them.
SageMaker Clarify: The difference in positive proportions in predictions metric indicates whether
the model predicts positive outcomes differently for each class.
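As a minimal sketch of this metric (assuming binary predicted labels and two groups), the difference in positive proportions is just the gap between each group's rate of positive predictions:

```python
def positive_proportion_difference(preds_group_a, preds_group_d):
    """Difference in positive proportions in predicted labels (DPPL):
    the share of positive (1) predictions for group a minus that for
    group d. A value far from 0 suggests the model favors one group."""
    q_a = sum(preds_group_a) / len(preds_group_a)
    q_d = sum(preds_group_d) / len(preds_group_d)
    return q_a - q_d

# Group a receives positive predictions 75% of the time, group d only 25%:
print(positive_proportion_difference([1, 1, 1, 0], [1, 0, 0, 0]))  # 0.5
```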
specificity measures
how often the model correctly predicts a negative outcome.
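Concretely, specificity is true negatives divided by all actual negatives. A small sketch, with invented labels:

```python
def specificity(y_true, y_pred):
    """Specificity = TN / (TN + FP): the fraction of actual negatives (0)
    that the model also predicts as negative."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)

# Four actual negatives, three of them predicted correctly:
print(specificity([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0]))  # 0.75
```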
The recall difference metric is
the difference in recall of the model between two classes. Any difference in these recalls is a potential form of bias.
The accuracy difference metric
is the difference between the prediction accuracies for different classes.
The treatment equality is
the difference in the ratio of false negatives to false positives. Even if the accuracy of the model is the same for two classes, this ratio could differ. A difference in the type of errors that occur for different classes can constitute bias.
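The three difference metrics above (recall difference, accuracy difference, and treatment equality) can be illustrated with hypothetical per-group confusion-matrix counts; the numbers below are invented for illustration:

```python
# Hypothetical confusion-matrix counts for two groups, "a" and "d":
counts = {
    "a": {"tp": 40, "fp": 10, "tn": 40, "fn": 10},
    "d": {"tp": 20, "fp": 5,  "tn": 55, "fn": 20},
}

def recall(c):
    # TP / (TP + FN): fraction of actual positives found
    return c["tp"] / (c["tp"] + c["fn"])

def accuracy(c):
    # (TP + TN) / all samples
    return (c["tp"] + c["tn"]) / sum(c.values())

def fn_fp_ratio(c):
    # ratio of false negatives to false positives (treatment equality input)
    return c["fn"] / c["fp"]

recall_difference = recall(counts["a"]) - recall(counts["d"])
accuracy_difference = accuracy(counts["a"]) - accuracy(counts["d"])
treatment_equality = fn_fp_ratio(counts["d"]) - fn_fp_ratio(counts["a"])

print(round(recall_difference, 2))    # 0.3
print(round(accuracy_difference, 2))  # 0.05 -- accuracies are close...
print(round(treatment_equality, 2))   # 3.0  -- ...but the error types differ a lot
```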
guardrails in Amazon Bedrock
You can use guardrails to define thresholds for content filters for hate, insults, sexual content, or violence. You can also block topics altogether by describing, in plain text, the topics that should be denied.
Guardrails can be set on both
the prompt and the model response, so even if a prompt passes the guardrail, the response can still be blocked.
Another feature in SageMaker Clarify is the ability to run evaluation jobs of large language models so that
you can compare models.
SageMaker Clarify evaluation jobs can run four different types of tasks, including
text generation, text classification, question answering, and text summarization.
The five dimensions across which SageMaker Clarify evaluation jobs can evaluate a model
Prompt stereotyping measures the probability of your model including biases in its response. It includes biases for race, gender, sexual orientation, religion, age, nationality, disability, physical appearance, and socioeconomic status.
Toxicity checks your model for sexual references; rude, unreasonable, hateful, or aggressive comments; profanity; insults; flirtations; attacks on identities; and threats.
Factual knowledge checks the veracity of the model responses.
Semantic robustness checks whether your model output changes because of keyword typos, random changes to uppercase, and random additions or deletions of white space.
Accuracy compares the model output to the expected responses, such as classifying and summarizing the data correctly.
A model’s transparency measures
the degree to which ML owners and stakeholders can understand how a model works and why it produces its outputs. A model that is highly transparent uses an algorithm that is straightforward to interpret, such as linear regression.
Transparency has two measures
interpretability and explainability.
Explainability is
being able to describe what a model is doing without knowing exactly how. Explainability treats the model as a black box, so every model can be observed and explained.
With interpretability
you can document how the inner mechanisms of the model impact the output.
Tradeoffs when choosing a model with high transparency
These tradeoffs are performance and security.
Transparent AI models are more susceptible to attacks because
hackers have more information about the inner mechanisms and can find vulnerabilities in the model.
AI service cards are
a form of responsible AI documentation. They provide customers with a single place to learn about the intended use cases, limitations, responsible AI design choices, and deployment and performance optimization best practices.
AI service cards currently exist for several AWS AI service APIs.
These APIs include matching faces with Amazon Rekognition, analyzing IDs with Amazon Textract, detecting PII with Amazon Comprehend, and more. There is also an AI service card for Amazon’s foundation model in Amazon Bedrock, Amazon Titan Text.
Is there a tool that helps you create your own AI service cards for models built on AWS?
for models that you create, you can use SageMaker Model Cards to document the lifecycle of a model through designing, building, training, and evaluating it.
Discuss two capabilities of SageMaker Clarify around responsible AI
SageMaker Clarify model processing jobs can also report on explainability. SageMaker Clarify provides feature attributions based on the concept of Shapley values. You can use Shapley values to determine the contribution that each feature made to the model predictions. Another type of analysis available in SageMaker Clarify is a partial dependence plot. This plot shows you how a model’s predictions change across different values of a feature, such as age.
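The idea behind a partial dependence plot can be sketched in a few lines: fix the feature of interest at each grid value and average the model's predictions over the rest of the data. The toy model and numbers below are invented purely for illustration:

```python
# Toy scoring model over two features (purely illustrative, not a real model):
def model(age, income):
    return 0.3 * (age / 100) + 0.7 * (income / 100_000)

dataset = [(25, 40_000), (40, 60_000), (55, 80_000), (70, 50_000)]

def partial_dependence_age(age_value):
    """Average prediction with age fixed at age_value and income
    taken from the observed data points."""
    return sum(model(age_value, income) for _, income in dataset) / len(dataset)

# The sequence of averages traces the partial dependence curve for age:
for age in (20, 40, 60, 80):
    print(age, round(partial_dependence_age(age), 3))
```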
Human-centered AI refers to
designing AI systems that prioritize the needs and values of humans. In human-centered AI, designers and developers engage in interdisciplinary collaboration and often involve psychologists, ethicists, and domain experts to collect diverse perspectives and expertise. Users are involved in the development process to make sure that the AI will be genuinely beneficial and user-friendly.
Amazon Augmented AI, or Amazon A2I, incorporates human review for samples of the inferences made by an AWS AI service or a custom model.
You can configure Amazon A2I to send inferences with low-confidence scores to human reviewers before sending them to the client. The reviewers’ feedback can then be added to the training data to retrain the model. Besides reviewing low-confidence inferences, you can have human reviewers review random predictions as a way to audit the model. With Amazon A2I, you can use a pool of reviewers in your own organization or use Amazon Mechanical Turk. You can configure how many reviewers need to review each prediction.
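The routing logic described above amounts to a simple threshold check. A minimal sketch, where the threshold value and function name are hypothetical (this is not the Amazon A2I API):

```python
CONFIDENCE_THRESHOLD = 0.80  # hypothetical value; tune per use case

def route_inference(confidence: float, audit_sample: bool = False) -> str:
    """Send low-confidence inferences (or randomly selected audit samples)
    to human review; everything else goes straight to the client."""
    if confidence < CONFIDENCE_THRESHOLD or audit_sample:
        return "human_review"
    return "client"

print(route_inference(0.65))                     # human_review
print(route_inference(0.95))                     # client
print(route_inference(0.95, audit_sample=True))  # human_review
```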
Reinforcement learning from human feedback, or RLHF,
is an industry standard technique for ensuring that large language models produce content that is truthful, harmless, and helpful.
To use RLHF, you train a separate model which serves as a reward model.
The reward model is trained by humans who review multiple responses from the large language model for the same prompt and indicate their preferred response. Their preferences become the training data for the reward model, which, when trained, can predict how highly a human would score a response to a prompt. The large language model then uses the reward model to refine its responses for maximum reward.
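A minimal sketch of how those preferences become a training signal: the common approach scores each response with the reward model and uses a Bradley-Terry (logistic) comparison for the probability that the human-preferred response wins. The reward values below are invented:

```python
import math

def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry probability that the chosen response beats the
    rejected one, given the reward model's scalar scores. Reward-model
    training pushes this toward 1 for human-preferred responses."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))

# A larger reward gap means higher confidence in the preference:
print(round(preference_probability(2.0, 0.5), 3))
print(round(preference_probability(1.0, 1.0), 3))  # 0.5: no preference
```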
Collecting the preferences from humans for RLHF can be accomplished most readily with
SageMaker Ground Truth