Lecture 13 - Explainable ML Flashcards
Introduction and interpretable models
What is interpretability?
Ability to explain or to present in understandable terms to a human:
- The degree to which a human can understand the cause of a decision
- The degree to which a human can consistently predict the result of a model
What is an explanation?
Answer to a WHY question.
- An explanation usually relates the feature values of an instance to its model prediction in a
humanly understandable way.
Why do we need XAI?
- Scientific understanding
- Bias / fairness issues
- Model debugging and auditing
- Human-AI cooperation / acceptance
- Regulatory compliance
- High-risk applications & regulated industries
What is Intrinsic Interpretability?
A model is inherently easy to understand due to its simplicity, without requiring additional tools.
What is Post-hoc Interpretability?
Another method or model is used after training to interpret the predictions of a complex or opaque model.
What is Model-Agnostic?
Interpretation methods that can be applied to any model, treating it as a black-box.
What is Model-Specific?
Interpretation methods that require access to and make use of a model’s internal structure or components.
What does Local (Instance-Level) interpretability explain?
Explains individual predictions.
What does Global (Model-Level) interpretability explain?
Explains the overall model behavior.
What does Intermediate (Group-Level) interpretability explain?
Explains predictions for groups or subsets of data.
What does “result of the interpretation method” refer to?
It refers to the form of explanation produced by XAI techniques.
Name the five results of the interpretation method
- Feature summary statistic
- Feature summary visualization
- Model internals
- Data points
- Global or local surrogates via intrinsically interpretable models
What is the result of interpretation methods that provide a feature summary statistic?
- Feature importance.
- Feature interaction strengths.
These summarize the influence or relationships between features across the model.
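A minimal sketch of one such summary statistic, permutation feature importance, using scikit-learn; the random forest model and the diabetes toy dataset are assumptions for illustration:

```python
# Permutation feature importance: shuffle each feature column and
# measure the drop in test score; a large drop means the model
# relied on that feature.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
for name, mean_imp in sorted(zip(X.columns, result.importances_mean),
                             key=lambda t: -t[1]):
    print(f"{name}: {mean_imp:.4f}")
```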
What does a feature summary visualization show?
- Partial dependence plot (PDP): Shows the marginal effect of a feature on the predicted outcome.
- Feature importance plot: Ranks features based on their influence on the model’s predictions.
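A minimal sketch of a partial dependence plot with scikit-learn; the gradient boosting model, the diabetes toy dataset, and the chosen features are illustrative assumptions:

```python
# Partial dependence: each curve shows the marginal effect of one
# feature on the prediction, with all other features averaged out.
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

PartialDependenceDisplay.from_estimator(model, X, features=["bmi", "bp"])
plt.show()
```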
What is provided by interpretation methods that show model internals?
- Linear model weights: Reflect the importance of features in linear models.
- Decision tree structure: Provides “if-then” rules and decision paths.
- Filters: In convolutional neural networks, learned filters capture feature patterns in the input.
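A minimal sketch of reading model internals directly in scikit-learn: linear coefficients, and a shallow decision tree printed as if-then rules (dataset and model choices are illustrative assumptions):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = load_diabetes(return_X_y=True, as_frame=True)

# Linear model: each coefficient is the change in prediction per unit
# change of its feature, holding the other features fixed.
linear = Ridge().fit(X, y)
for name, coef in zip(X.columns, linear.coef_):
    print(f"{name}: {coef:+.2f}")

# Shallow decision tree: the structure itself is the explanation,
# printable as if-then rules.
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```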
What is the result of interpretation methods focused on data points?
- Exemplars: Representative examples of data points that explain model behavior.
- Counterfactual explanations: Show how changing certain features would change the model’s prediction.
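An illustrative toy counterfactual search, not a library method: nudge one feature of a single instance until the predicted class flips. The dataset, model, chosen feature, and step size are all assumptions for this sketch, and the search may fail:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(),
                      LogisticRegression(max_iter=1000)).fit(X, y)

def find_counterfactual(x0, feature, step, max_steps=1000):
    """Greedy 1-D search: return a copy of x0 whose predicted class
    differs from x0's, or None if no flip is found along this feature."""
    original = model.predict(x0.reshape(1, -1))[0]
    for direction in (+1.0, -1.0):       # try increasing, then decreasing
        x = x0.copy()
        for _ in range(max_steps):
            x[feature] += direction * step
            if model.predict(x.reshape(1, -1))[0] != original:
                return x
    return None

feature = 0                              # hypothetical choice of feature
cf = find_counterfactual(X[0], feature, step=0.01 * X[:, feature].std())
if cf is not None:
    print(f"Feature {feature}: {X[0, feature]:.2f} -> {cf[feature]:.2f} "
          "flips the prediction.")
```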
What is the purpose of global or local surrogates?
Surrogates are interpretable models (like decision trees or linear models) that approximate the behavior of a complex “black-box” model, providing explanations at a global (whole model) or local (individual instance) level.
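A minimal sketch of a global surrogate with scikit-learn: a shallow decision tree is trained on the black-box model's predictions rather than the true labels, so the tree approximates the model, not the data (the model choices are assumptions for illustration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Fit the surrogate on the black box's *predictions*, not on y.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how closely the surrogate reproduces the black box.
fidelity = accuracy_score(black_box.predict(X), surrogate.predict(X))
print(f"Surrogate fidelity: {fidelity:.2%}")
print(export_text(surrogate))
```

High fidelity means the printed tree rules are a trustworthy summary of the black box; low fidelity means the explanation should not be relied on.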
What does Expressive Power refer to in explanation methods?
Expressive Power refers to the “language” or structure of the explanations provided.
What does Translucency describe in the context of explanation methods?
Translucency describes how much the explanation method relies on looking into the machine learning model itself.
What does Portability describe in explanation methods?
Portability refers to the range of machine learning models that an explanation method can be used with.
What does Algorithmic Complexity measure in explanation methods?
Algorithmic Complexity refers to the computational complexity of the explanation method, indicating how much time and resources it takes to compute the explanations.
What does Accuracy refer to in individual explanations?
Accuracy refers to how well an explanation can predict unseen data.
It measures whether the explanation can generalize beyond the examples used to generate it.