Lecture 8 Flashcards
Transparency
Refers to the willingness and ability of an organization developing or using an ML system to be transparent and share information with all its stakeholders about the whole ML pipeline (the data and its provenance, the annotation process, the algorithms used, the purpose, and the risks).
Explainability
Refers to the process of “providing information” about how the ML model reached a given output (or outputs), often based on input data and/or features.
Examples of harms
* Asymmetrical concentration of power (think of Google, Facebook, Amazon, and Apple)
* Social sorting and discrimination
* Profiling and mass manipulation
* Minority erasure
Reliability
When a good thing happens: the “system” does what it is supposed to do, as “intended”.
Safety
When no bad things happen: the “system” and its usages do not cause “unintended” harm (including mental-health harm, discrimination, leading to bad decision-making, etc.) to users (and to non-users, society, the environment, etc.).
Accountability
Those “responsible” for a failure should be held accountable and bear the burdens, e.g., repairing the consequences and damages.
Liability
The harmed parties are owed a “duty of care”: indemnities, medical care, mending reputation damage, etc. This often depends on the nature of the harm but also on its causes.
What is the difference between ideal and actual ML transparency?
Ideal transparency:
* implies accountability
* ensures fairness and justice
* increases trust
Actual ML transparency:
* transparency is seriously lacking
* lots of papers and research projects address this topic
* it often makes the news headlines
What is the difference between ideal explainability and actual explainability?
Ideal explainability:
* explains the outputs, including trust and uncertainty
* explains biases and helps alleviate them
* collaborates with and augments human knowledge and intelligence (helps humans improve and converge on knowledge and intelligence)
Actual explainability:
* involves a lot of debugging
* improves the model accuracy-wise
What is the difference between interpretability and explainability?
There is no clear answer, but:
Interpretability
* is dependent on the model architecture (model-specific)
* is computationally fast
* can be inferred (deduced, concluded) from the model alone
whereas explainability
* focuses on model-agnostic methods (where the structure of the model is not taken into account; a black-box approach that relies solely on observable elements)
* is computationally intensive
* is based on feature values and predictions
A code sketch contrasting the two is shown below.
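A minimal sketch of the contrast, assuming scikit-learn and a synthetic dataset (the dataset, models, and feature count are illustrative, not from the lecture):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# Interpretability (model-specific, fast): the fitted coefficients can be
# read directly from the model alone.
linear = LogisticRegression(max_iter=1000).fit(X, y)
print("coefficients:", linear.coef_)

# Explainability (model-agnostic, computationally heavier): permutation
# importance only needs feature values and predictions, so it can be applied
# to any black-box model.
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)
result = permutation_importance(black_box, X, y, n_repeats=10, random_state=0)
print("permutation importances:", result.importances_mean)
```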
Stakeholders
Who needs explainability, and who are we explaining to?
Three groups are most often proposed:
* Practitioners: data scientists and ML engineers
* Observers: business stakeholders and regulators
* Users: domain experts
Additional groups:
* Decision subjects: persons directly affected by the ML usage
* Non-users: persons or groups directly or indirectly affected by the ML usage
* Civil-society watchdogs
What are desiderata?
The desirable properties or characteristics that a model or algorithm should possess.
What will explainable AI (XAI) bring?
Explainability desiderata
Today, the machine learning process produces a learned function that performs a task or makes a decision or recommendation, but it is unclear to users what happened and why particular choices were made.
In the future (with XAI), a new machine learning process will produce an explainable model together with an explanation interface, ensuring that the user can understand why (or why not) a task was performed and what happened inside the model.
Why is XAI needed?
Model will improve
* the quality of the model will increase
Verify performance
* confirmation that the model behaves as expected
Build trust
* increase confidence in the reliability of the model
Remediation
* understanding what actions to take to alter a prediction
Understand model behaviour
* can be used to construct a simplified model in the user’s mind, which can serve as a surrogate for understanding the model’s performance
Monitoring (and accountability)
* ongoing assessment that the model’s performance remains acceptable and compliant (with the eventual standards and regulations)
Stakeholders desiderata. Who to involve and why?
Users affected by the model decisions
* to understand their situation and verify fair decisions
Domain experts and users of the model
* to trust the model itself and gain scientific knowledge
Regulatory entities/agencies
* to certify the model’s compliance with the legislation in force (audits)
Managers and executive board members
* to assess regulatory compliance and understand corporate AI applications
Data scientists, developers, product owners
* to ensure and improve product efficiency, research and new functionalities
How to evaluate explainability, from the regulatory perspective?
The explanation should be …
- … meaningful: the system provides explanations that are understandable to the intended consumer(s)
- … accurate: the explanation correctly reflects the reason for generating the output and/or accurately reflects the system’s process
- … limited to the knowledge it has: the system only operates under conditions for which it was designed and when it reaches sufficient confidence in its output
How to evaluate explainability from the business perspective?
Explainability evaluated on the basis of:
* comprehensibility: how much effort is needed for a human to interpret it?
* succinctness: how concise is the explanation?
* actionability: how actionable is the explanation? What can we do with it?
* reusability: could it be interpreted/reused by another AI system?
* accuracy: how accurate is the explanation?
* completeness: does the explanation explain the decision completely, or only partially?
What is the taxonomy of explainability?
Explainability is divided into:
* intrinsic explanation vs post-hoc explanation
* model-specific vs model-agnostic
* local vs cohort vs global
* different data types
Explainability: the difference between intrinsic explanation and post-hoc explanation
Intrinsic explanation
Some ML models are inherently interpretable, meaning they are simple enough in structure that we can understand how the model makes predictions just by looking at the model.
Post-hoc explanation
Some ML models are inherently more complex and less intuitive. Post-hoc methods then use the trained model and data to understand why certain predictions are made.
In some cases, post-hoc methods can also be applied to models that have intrinsic explainability.
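A minimal sketch of the distinction, assuming scikit-learn; the dataset and models are illustrative, and partial dependence is used here only as one example of a post-hoc method:

```python
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor, export_text
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

X, y = make_regression(n_samples=300, n_features=4, random_state=0)

# Intrinsic: a shallow decision tree is simple enough to understand by
# looking at the model itself (its if/else splits).
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree))

# Post-hoc: partial dependence probes the trained black-box model by varying
# one feature and averaging the predictions over the data; it could equally
# be applied to the interpretable tree above.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pdp = partial_dependence(forest, X, features=[0])
print(pdp["average"])
```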
Explainability: the difference between model-agnostic and model-specific
A model-agnostic explainer or explainability method can be applied to any model.
A model-specific explainer or explainability method can only be used with certain model types.
Explainability, difference between:
Local vs Cohort vs global
A local explanation focuses on a single prediction only.
A global explanation attempts to make claims about trends and behaviours of model predictions across an entire dataset.
Many local explainers can be turned into global explainers by aggregating the local results, so many techniques are useful for providing both local and global explanations.
A cohort explanation is a global explanation performed on a slice of the full dataset. This slice can, for example, be a subset defined by a single feature value.
This helps understand issues related to a particular subset or cohort
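A minimal sketch of the local/global/cohort relationship, assuming the shap package and a tree-based regressor (both illustrative choices); any local attribution method could be aggregated the same way:

```python
import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=400, n_features=6, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

# Per-instance (local) feature attributions.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

local = shap_values[0]                            # local: a single prediction
global_imp = np.abs(shap_values).mean(axis=0)     # global: averaged over the whole dataset
cohort_mask = X[:, 0] > 0                         # cohort: a slice defined by one feature
cohort_imp = np.abs(shap_values[cohort_mask]).mean(axis=0)
print(local, global_imp, cohort_imp, sep="\n")
```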
Explainability: the difference between data types
- Tabular data
- Textual data
- Images
- Graphs
What is the idea behind LIME (Local Interpretable Model-Agnostic Explanations)?
Function of LIME:
LIME tries to identify the features that influence the prediction in the area around a single instance of interest.
Or, in other words: a technique that approximates any black-box machine learning model with a local, interpretable model to explain each individual prediction.
The idea behind LIME:
A complex ML model’s decision boundary can be very complex globally; however, if you zoom in on a small enough area/neighbourhood, the decision boundary in that local neighbourhood can be much simpler, and even linear (no matter how complex the model is at the global level).
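A hedged usage sketch with the lime package, assuming a tabular classifier on the Iris data (both illustrative choices):

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# LIME perturbs the neighbourhood of one instance, queries the black-box
# model on those perturbations, and fits a simple local surrogate whose
# weights serve as the explanation for that single prediction.
explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=4)
print(exp.as_list())  # feature contributions for this one prediction
```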