Interpretability Flashcards

1
Q

What is the difference between interpretability and explainability in machine learning models?

A

Interpretability refers to the ability to understand how a model makes its predictions by examining its internal structure, while explainability involves providing reasons for model predictions in human-understandable terms, often through post-hoc methods.

2
Q

Explain the concept of fairness in machine learning and why it is important.

A

Fairness in machine learning involves ensuring that decisions do not disproportionately favor or harm certain groups, especially when sensitive attributes like race, gender, or age are involved. It is important because biased models can lead to unethical or illegal outcomes.

3
Q

What is the accuracy-interpretability trade-off, and how does it manifest in different models?

A

The accuracy-interpretability trade-off reflects the observation that more complex models, like neural networks or ensembles, tend to have higher predictive accuracy but are harder to interpret, while simpler models like linear regression are more interpretable but may have lower accuracy.

4
Q

How does a Partial Dependence Plot (PDP) provide insight into the relationship between input features and predictions?

A

A PDP shows the marginal effect of one or two features on the predicted outcome by averaging out the effects of all other features, providing insight into the relationship between the feature and the prediction across different values.
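
A minimal sketch of how a one-feature PDP can be computed by hand, assuming a fitted regression model with a scikit-learn-style `predict` method and a NumPy feature matrix `X`; the function name and grid size are illustrative, not a specific library API.

```python
import numpy as np

def partial_dependence_1d(model, X, feat_idx, grid_points=20):
    """Average the model's predictions over the data while one feature is swept across a grid."""
    grid = np.linspace(X[:, feat_idx].min(), X[:, feat_idx].max(), grid_points)
    averaged = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feat_idx] = value                    # fix the feature of interest at this grid value
        averaged.append(model.predict(X_mod).mean())  # average out the effect of all other features
    return grid, np.array(averaged)
```

Plotting `averaged` against `grid` gives the PDP curve; scikit-learn provides a built-in equivalent in `sklearn.inspection.partial_dependence`.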

5
Q

Explain the process of calculating Permutation Feature Importance (PFI) and its advantages over traditional feature importance methods.

A

PFI measures the importance of a feature by permuting (randomly shuffling) its values and observing the increase in the model’s error. It is advantageous because it can be applied to any model and does not rely on model-specific internal parameters.
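
A hedged sketch of the PFI loop for a regression model, assuming NumPy arrays `X` and `y`, a fitted model with a `predict` method, and mean squared error as the score; the helper name and defaults are illustrative.

```python
import numpy as np

def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Importance of each column = average increase in MSE after shuffling that column."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((y - model.predict(X)) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the feature/target relationship
            increases.append(np.mean((y - model.predict(X_perm)) ** 2) - baseline)
        importances[j] = np.mean(increases)
    return importances
```

scikit-learn ships a more complete version of this idea as `sklearn.inspection.permutation_importance`.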

6
Q

What are Local Interpretable Model-agnostic Explanations (LIME) and how do they work to explain black-box models?

A

LIME explains individual predictions by fitting a simple, interpretable model (like linear regression) locally around the data point of interest, approximating the behavior of the more complex black-box model in that specific region.
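
A simplified sketch of the LIME idea for tabular regression, assuming a black-box `model.predict` and a single instance `x` as a 1-D NumPy array; the Gaussian perturbation scheme and kernel width are illustrative simplifications, not the lime package's exact procedure (for classifiers one would explain a class probability instead).

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(model, x, n_samples=1000, kernel_width=1.0, seed=0):
    """Fit a weighted linear surrogate around a single instance x."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.size))  # sample the neighbourhood of x
    preds = model.predict(Z)                                  # query the black-box model
    dists = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dists ** 2) / kernel_width ** 2)      # nearby samples get more weight
    surrogate = Ridge(alpha=1.0).fit(Z, preds, sample_weight=weights)
    return surrogate.coef_                                    # local feature effects = the explanation
```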

7
Q

Explain SHapley Additive exPlanations (SHAP) and how they relate to cooperative game theory.

A

SHAP values come from cooperative game theory: the prediction is treated as a payout that is attributed to individual features according to their average marginal contribution across all possible combinations (coalitions) of features. SHAP provides a unified framework with guarantees such as consistency and local accuracy.
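
A toy illustration of the underlying Shapley formula, assuming a value function `v(S)` that returns the payoff of a coalition `S` of feature indices; this is the exact game-theoretic computation, not the shap library's optimized estimators.

```python
from itertools import combinations
from math import factorial

def shapley_value(v, n_features, i):
    """Exact Shapley value of feature i, where v(S) is the payoff of coalition S (a set of indices)."""
    others = [j for j in range(n_features) if j != i]
    phi = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            S = set(subset)
            # Weight = |S|! * (n - |S| - 1)! / n!
            weight = factorial(len(S)) * factorial(n_features - len(S) - 1) / factorial(n_features)
            phi += weight * (v(S | {i}) - v(S))  # marginal contribution of i to coalition S
    return phi
```

In SHAP, v(S) is typically the model's expected prediction when the features in S are fixed to the instance's values and the remaining features are marginalised out.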

8
Q

Why are Shapley values computationally expensive to calculate, and how is this issue typically addressed in practice?

A

Calculating the exact Shapley value of a feature requires evaluating the model's payoff over all possible subsets of the remaining features, and the number of subsets grows exponentially with the number of features. In practice this is addressed with approximation methods such as Kernel SHAP, which estimates Shapley values from a sample of feature coalitions rather than an exhaustive enumeration.
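
A hedged usage sketch with the shap package on a small toy model; the background-set helper and argument names (`shap.sample`, `nsamples`) reflect common usage but may vary across shap versions.

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

# Toy data and model: 200 rows, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + 2 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Kernel SHAP approximates Shapley values by sampling feature coalitions and
# solving a weighted linear regression, instead of enumerating all subsets.
explainer = shap.KernelExplainer(model.predict, shap.sample(X, 50))  # small background set
shap_values = explainer.shap_values(X[:5], nsamples=200)             # cap the sampled coalitions
```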

9
Q

Discuss the differences between global and local interpretability methods and provide examples of each.

A

Global interpretability methods, like PDP and PFI, aim to explain the overall behavior of the model across the dataset, while local methods, like LIME and SHAP, explain the prediction for a specific instance. Both provide valuable but different insights.

10
Q

How does the complexity of a model affect its interpretability, and why is this a concern in practical machine learning applications?

A

More complex models, like deep learning or ensemble methods, are less interpretable because their inner workings are harder to explain in simple terms. This can be problematic when decisions need to be justified or audited, especially in regulated industries.

11
Q

What is the significance of the kernel function in LIME, and how does it affect the weighting of points in the local neighborhood?

A

The kernel function in LIME assigns weights to points in the local neighborhood around the instance being explained. It determines how much influence nearby points have on the surrogate model and impacts the accuracy of the local explanation.
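
A small sketch of the exponential kernel commonly used for this weighting, assuming Euclidean distance; the width parameter and distance choice are illustrative (the lime package's defaults differ by data type).

```python
import numpy as np

def exponential_kernel(x, Z, kernel_width):
    """Weight each perturbed sample z by exp(-d(x, z)^2 / width^2)."""
    dists = np.linalg.norm(Z - x, axis=1)
    return np.exp(-(dists ** 2) / kernel_width ** 2)

# A small kernel_width concentrates weight on points very close to x, giving a tightly local
# surrogate; a large width lets distant points pull on the fit and blur the local explanation.
```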

12
Q

Why might Partial Dependence Plots (PDPs) become computationally intensive, and when are they most useful?

A

PDPs can become computationally intensive because, for every grid value of the chosen feature (or every combination of grid values when feature interactions are examined), the model must be re-evaluated over the whole dataset and the predictions averaged. They are most useful for visualising the effect of one or two features at a time.

13
Q

In the context of Permutation Feature Importance, what does it mean to ‘permute’ a feature, and how does this help assess the importance of that feature?

A

Permuting a feature involves shuffling its values across the dataset, breaking its relationship with the target variable. This helps assess the feature’s importance by observing how much the model’s performance degrades when the feature’s information is lost.

14
Q

How do LIME and SHAP differ in their approach to explaining model predictions, and what are the advantages of each method?

A

LIME explains individual predictions by fitting local surrogate models, while SHAP attributes the prediction to features based on Shapley values from cooperative game theory. LIME is faster and more flexible, but SHAP provides a more theoretically sound explanation.

15
Q

What is the role of the General Data Protection Regulation (GDPR) in influencing the need for explainability in machine learning models?

A

GDPR gives individuals rights around automated decision-making, including meaningful information about the logic involved, which is often summarised as a 'right to explanation'. This has increased the demand for explainability in machine learning models, particularly in sensitive applications like credit scoring or hiring.

16
Q

What are some limitations of using model-agnostic explainability methods like LIME and SHAP?

A

Model-agnostic methods like LIME and SHAP are computationally expensive, especially for large datasets. They also provide only approximate explanations, which may not fully capture the behavior of the model in all cases.

17
Q

Why is interpretability particularly important in applications like credit scoring or hiring algorithms, and how can lack of interpretability lead to harmful outcomes?

A

In applications like credit scoring or hiring, decisions made by models can have significant impacts on individuals’ lives. Lack of interpretability can lead to unintentional bias or discrimination, making it difficult to identify and correct unfair decisions.

18
Q

How do the concepts of interpretability and fairness interact, and why might ensuring one not always guarantee the other?

A

Ensuring interpretability does not guarantee fairness, as an interpretable model can still produce biased outcomes if sensitive features are used. Conversely, a fair model may not always be interpretable, depending on its complexity.

19
Q

Explain the notion of ‘black-box’ models in machine learning. Why are these models often criticized for their lack of transparency?

A

Black-box models are those whose internal workings are not easily understood, such as neural networks or ensemble methods. They are criticized for their lack of transparency, especially when used in critical decision-making processes.

20
Q

What are some key challenges in balancing the accuracy and interpretability of machine learning models?

A

Balancing accuracy and interpretability is challenging because more complex models tend to be more accurate but less transparent. Simplifying a model to make it interpretable can reduce its performance, while improving accuracy often requires increasing complexity.