Week 3 Flashcards
What is XAI (Explainable AI)?
XAI focuses on making AI models transparent and understandable, which aids debugging, model improvement, and trust building.
What are the two main approaches to achieving model understanding?
1. Build inherently explainable models (e.g., decision trees, linear regression).
2. Explain pre-built models in a post-hoc manner (e.g., LIME, SHAP).
What are intrinsic methods in XAI?
Intrinsic methods are explanations built into the model itself, such as decision trees or linear regression models.
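A minimal sketch of the intrinsic idea (the dataset and tree depth below are illustrative choices, not from the cards): a shallow scikit-learn decision tree is its own explanation, since its full decision logic prints as rules.

```python
# Intrinsic interpretability: the fitted model itself is readable.
# Dataset and max_depth are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Print the learned decision rules; no extra explainer is needed.
print(export_text(tree, feature_names=iris.feature_names))
```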
What are post-hoc methods in XAI?
Post-hoc methods are explanations applied after the model is built, like LIME or SHAP, and can be model-agnostic.
What is the difference between model-specific and model-agnostic methods?
Model-specific methods are tailored to particular model types, while model-agnostic methods can be applied to any model.
What are some examples of model-specific techniques?
Examples include ANOVA for statistical models, impurity-based variable importance in random forests, and partial dependence plots (though the latter are often classed as model-agnostic as well).
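A hedged sketch of the latter two techniques (the model and dataset are illustrative assumptions), using scikit-learn's built-in utilities:

```python
# Variable importance in a random forest, plus a partial dependence curve.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

X, y = load_diabetes(return_X_y=True)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, specific to tree ensembles.
print(rf.feature_importances_)

# Average prediction as feature 0 is varied over a grid.
pd_result = partial_dependence(rf, X, features=[0])
print(pd_result["average"])
```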
How does LIME explain model predictions?
LIME creates a simple, interpretable model around the specific instance being explained by perturbing feature values and fitting a local surrogate model.
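A hedged sketch with the `lime` package (dataset, model, and parameter choices are illustrative assumptions, not from the cards):

```python
# LIME: perturb around one instance, fit a weighted local surrogate,
# and report that surrogate's top features as the explanation.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
model = RandomForestClassifier(random_state=0).fit(iris.data, iris.target)

explainer = LimeTabularExplainer(
    iris.data,
    feature_names=iris.feature_names,
    class_names=list(iris.target_names),
    mode="classification",
)
exp = explainer.explain_instance(iris.data[0], model.predict_proba, num_features=4)
print(exp.as_list())  # (feature condition, local weight) pairs
```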
What is the main goal of LIME’s objective function?
To create a surrogate model that is both faithful to the original complex model and simple enough to be interpretable in the local neighborhood of the instance.
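In symbols, the objective from the original LIME paper (Ribeiro et al., 2016):

```latex
% f: black-box model; g: simple surrogate from interpretable class G;
% \pi_x: proximity kernel weighting samples near instance x;
% \mathcal{L}: local infidelity of g to f; \Omega: complexity penalty on g.
\xi(x) = \operatorname*{arg\,min}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)
```

The loss term enforces faithfulness in the neighborhood defined by the proximity kernel, while the complexity penalty keeps the surrogate simple enough to interpret.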
What is SHAP and how does it work?
SHAP (SHapley Additive exPlanations) attributes a model's prediction to its features using Shapley values from cooperative game theory, giving each feature both an importance magnitude and a direction of influence.
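A hedged sketch with the `shap` package (a regressor is used so the SHAP matrix has a simple rows-by-features shape; all setup choices are illustrative):

```python
# SHAP: game-theoretic attribution of a prediction to its features.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:100])  # shape: (100, n_features)

# Additivity: expected_value + a row of SHAP values ~= that prediction,
# so each feature gets a signed contribution (magnitude + direction).
print(explainer.expected_value, shap_values[0])
```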
What are local vs. global explanations?
Local explanations clarify why a specific instance was predicted, while global explanations elucidate the overall model behavior.
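Continuing the `shap` sketch above (still illustrative): a single row of the SHAP matrix is a local explanation, and aggregating magnitudes over many rows yields a global view.

```python
import numpy as np

# Local: why did the model make THIS prediction?
local_explanation = shap_values[0]

# Global: which features matter overall? Mean |SHAP| per feature.
global_importance = np.abs(shap_values).mean(axis=0)
print(local_explanation)
print(global_importance)
```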
Why is it important to evaluate explanations in XAI?
To ensure explanations accurately reflect model reasoning, meet stakeholder needs, and identify potential biases.
What are key evaluation criteria for XAI explanations?
Fidelity, comprehensibility, sufficiency, and trustworthiness are critical criteria for evaluating explanations.
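Of these, fidelity is the most mechanical to check. A hedged sketch (model and data choices are illustrative): train a simple surrogate on the black box's outputs, then measure how often the two agree.

```python
# Fidelity: agreement rate between surrogate and black-box predictions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
blackbox = RandomForestClassifier(random_state=0).fit(X, y)

# Fit the surrogate to the black box's OUTPUTS, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, blackbox.predict(X))

fidelity = (surrogate.predict(X) == blackbox.predict(X)).mean()
print(f"Surrogate fidelity: {fidelity:.2%}")
```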
What are the levels of evaluating explanations?
Evaluation can occur at the application level (real tasks with domain experts), the human level (simplified tasks with laypersons), or the function level (proxy metrics without humans).
What properties make explanations effective?
Expressive power, translucency, portability, and algorithmic complexity are properties that affect explanation quality.
What characteristics define good human-friendly explanations?
Good explanations are contrastive, causal, counterfactual, and tailored to the audience’s context.