Module 5: Implementing AI Projects and Systems: Testing and Validating the AI System during Deployment Flashcards
List the different types of AI testing.
- Accuracy
- Robustness
- Reliability
- Privacy
- Interpretability
- Safety
- Bias
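Two of these dimensions, accuracy and robustness, can be checked with a simple held-out-data test. A minimal sketch follows; the model, data, and noise level are hypothetical placeholders, not from the source material.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained model: classify by a simple threshold.
def model_predict(x):
    return (x[:, 0] > 0.5).astype(int)

# Placeholder held-out evaluation data (features in [0, 1], binary labels).
X_test = rng.random((1000, 4))
y_test = (X_test[:, 0] > 0.5).astype(int)

# Accuracy: agreement with ground truth on held-out data.
accuracy = (model_predict(X_test) == y_test).mean()

# Robustness: does accuracy hold up under small input perturbations?
X_noisy = X_test + rng.normal(0, 0.05, X_test.shape)
robust_accuracy = (model_predict(X_noisy) == y_test).mean()

print(f"accuracy={accuracy:.3f}, robust accuracy={robust_accuracy:.3f}")
```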
List different types of potential AI bias.
- Computational bias
- Cognitive bias
- Societal bias
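Of the three, computational bias is the most directly measurable in code. Below is a minimal demographic-parity check, assuming hypothetical model predictions and a protected group attribute; the data and group split are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical model outputs and a protected attribute (groups A and B).
predictions = rng.integers(0, 2, 1000)            # 0 = deny, 1 = approve
group = rng.choice(["A", "B"], 1000, p=[0.7, 0.3])

# Demographic parity: compare positive-outcome rates across groups.
rate_a = predictions[group == "A"].mean()
rate_b = predictions[group == "B"].mean()
print(f"approval rate A={rate_a:.3f}, B={rate_b:.3f}, "
      f"gap={abs(rate_a - rate_b):.3f}")
```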
Name some types of Privacy Enhancing Technologies (PETs) which can be applied to AI training and testing data.
- Homomorphic encryption
- Differential privacy
- Deidentification/obfuscation techniques
- Federated learning
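As a concrete illustration of one PET, here is a minimal differential-privacy sketch using the Laplace mechanism to release a noisy mean of a training-data statistic. The data, clipping bounds, and epsilon value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def dp_mean(values, lower, upper, epsilon):
    """Release the mean of bounded values with epsilon-differential
    privacy via the Laplace mechanism."""
    values = np.clip(values, lower, upper)
    sensitivity = (upper - lower) / len(values)  # max change from one record
    noise = rng.laplace(0, sensitivity / epsilon)
    return values.mean() + noise

# Illustrative training-data statistic: ages clipped to [18, 90].
ages = rng.integers(18, 90, 500)
print(dp_mean(ages, 18, 90, epsilon=1.0))
```

A smaller epsilon means more noise and stronger privacy; the right budget is a policy decision, not a purely technical one.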
What are some considerations for testing and validating AI systems?
- Use cases: Align testing data and processes for the specific use case
- Resources: Understand what resources you have (e.g., the OECD's Catalogue of Tools & Metrics for Trustworthy AI) and where best to apply them to address risks and mitigations.
- Conduct adversarial testing and threat modeling to identify security threats.
- Establish multiple layers of mitigation to stop system errors or failures at different levels or modules of the AI system (see the sketch after this list).
- Evaluate AI systems for attributes unique to them, such as brittleness, hallucinations, embedded bias, uncertainty and false positives.
- Understand trade-offs among mitigation strategies.
- PETs: Apply PETs to training and testing data.
- Documentation: Document all decisions the stakeholders group makes.
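To make the "multiple layers of mitigation" item concrete, here is a minimal sketch of layered defenses wrapped around a hypothetical model call: input validation, a confidence floor with human fallback, and an output filter. All names and thresholds are illustrative assumptions.

```python
# Illustrative layered mitigations around a hypothetical model call.
BLOCKLIST = {"<script>"}  # placeholder output-filter rule

def predict_with_confidence(text):
    # Stand-in for a real model; returns (label, confidence).
    return ("ok", 0.92)

def guarded_predict(text):
    # Layer 1: input validation.
    if not text or len(text) > 10_000:
        return "rejected: invalid input"
    # Layer 2: model call with a confidence floor; defer to a human otherwise.
    label, confidence = predict_with_confidence(text)
    if confidence < 0.8:
        return "deferred: low confidence, route to human review"
    # Layer 3: output filtering before release.
    if any(bad in label for bad in BLOCKLIST):
        return "blocked: output failed safety filter"
    return label

print(guarded_predict("example input"))
```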
What are some questions to ask when implementing tests during system monitoring?
1) Were the goals achieved?
- Guard against automation bias: do not rely solely on system output to determine this; include human interpretation and oversight in the evaluation.
2) As the system is in use, are there secondary or unintended outputs?
- Do these result in additional risks or harms that need to be addressed?
- Can these or others be predicted by using a challenger model?
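One simple way to use a challenger model in monitoring is to track how often it disagrees with the production (champion) model on live inputs. The sketch below assumes hypothetical stand-in models and an illustrative alert threshold.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical stand-ins for the champion and challenger models.
def champion_predict(x):
    return (x > 0.5).astype(int)

def challenger_predict(x):
    return (x > 0.45).astype(int)

# Monitor disagreement on recent production inputs; a rising rate can
# flag drift or unintended outputs worth investigating.
recent_inputs = rng.random(1000)
disagreement = (champion_predict(recent_inputs)
                != challenger_predict(recent_inputs)).mean()
ALERT_THRESHOLD = 0.05  # illustrative
if disagreement > ALERT_THRESHOLD:
    print(f"investigate: disagreement rate {disagreement:.3f}")
```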
What are some elements of an AI response plan?
1) Document the model version and the dataset used for the model (see the sketch after this list)
- Allows for challenger models to be accurately created
- Allows for transparency with regulatory agencies and consumers
2) Respond to internal and external risks
- Determine each risk's level and the appropriate response; prioritize by assigning a "risk score"
- Conduct internal or external red teaming exercises for generative AI systems (may also be done pre-deployment)
- Consider bug bashing/bug bounties to generate user engagement and extensive feedback
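For item 1, a deployment record only supports accurate challenger models if it pins the exact data. Below is a minimal sketch that fingerprints a dataset file and records it alongside the model version; the file path and field names are hypothetical placeholders.

```python
import hashlib
import json
from datetime import datetime, timezone

def dataset_fingerprint(path):
    """SHA-256 digest of the dataset file, so a challenger model can be
    rebuilt against exactly the same data."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

record = {
    "model_version": "fraud-scorer-1.4.2",  # hypothetical
    "dataset": "train_2024q1.csv",          # hypothetical local file
    "dataset_sha256": dataset_fingerprint("train_2024q1.csv"),
    "deployed_at": datetime.now(timezone.utc).isoformat(),
}
print(json.dumps(record, indent=2))
```

Stored alongside the model artifact, a record like this also supports the transparency goal: regulators and consumers can be shown exactly which model and data produced a given decision.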