Interpreting ML Book Cards Flashcards
What does each coefficient represent in a linear model?
Each coefficient quantifies the change in the outcome variable for a one-unit change in that feature, assuming other variables are held constant.
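A minimal sketch of reading coefficients from a fitted scikit-learn model; the data and feature names ("sqft", "age") are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: two hypothetical features predicting a price-like outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

model = LinearRegression().fit(X, y)
# Each coefficient: expected change in y for a one-unit increase
# in that feature, holding the other feature constant.
print(dict(zip(["sqft", "age"], model.coef_)))
```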
Why is standardization important in linear models?
Standardization (scaling features to have zero mean and unit variance) puts all coefficients on a common scale, making them directly comparable and thereby helping assess the relative importance of features.
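A sketch of standardizing inside a scikit-learn pipeline, assuming made-up features on very different scales:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical features with very different scales.
rng = np.random.default_rng(0)
sqft = rng.normal(1500, 400, size=200)
age = rng.normal(20, 8, size=200)
X = np.column_stack([sqft, age])
y = 0.3 * sqft - 5.0 * age + rng.normal(scale=50, size=200)

pipe = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)
# After standardization the coefficients share a scale and can be compared.
print(pipe.named_steps["linearregression"].coef_)
```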
What are the key assumptions for interpreting linear models?
Linearity, independence of errors, homoscedasticity (constant variance of errors), normality of residuals, and no strong multicollinearity among features.
How does high multicollinearity affect linear models?
High multicollinearity inflates the variance of coefficient estimates, making them unstable (small changes in the data can flip signs or magnitudes); it can be mitigated with techniques such as ridge or lasso regression.
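One common way to detect multicollinearity is the variance inflation factor (VIF); a sketch using statsmodels on deliberately collinear toy data:

```python
import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly collinear with x1
X = np.column_stack([np.ones(200), x1, x2])  # intercept column included

# Rule of thumb: VIF above roughly 5-10 signals problematic collinearity.
for i in (1, 2):
    print(f"VIF x{i}: {variance_inflation_factor(X, i):.1f}")
```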
What is the structure of a decision tree?
Decision trees split data based on feature values to create nodes and branches, leading to leaf nodes that provide the final prediction.
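A small sketch that prints a tree's node-and-branch structure as text, using scikit-learn's built-in iris data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)
# Each internal node is a feature threshold; each leaf holds a prediction.
print(export_text(tree, feature_names=list(iris.feature_names)))
```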
How do decision trees handle interpretability?
Decision trees use if-then-else conditions that correspond to paths from the root to leaf nodes, making them easy to visualize and understand.
What is a common limitation of decision trees?
Decision trees can overfit the data, capturing noise rather than underlying patterns, which can be mitigated by pruning.
What is LIME and what does it do?
LIME (Local Interpretable Model-agnostic Explanations) explains predictions of any black-box model by creating an interpretable local approximation of the model around a specific instance.
How does LIME create explanations?
LIME perturbs the instance by making small random changes to its feature values and fits a simple model (e.g., linear) to approximate the complex model’s behavior around that instance.
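A minimal sketch using the `lime` package (assuming it is installed) to explain one prediction of a black-box classifier:

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
model = RandomForestClassifier(random_state=0).fit(iris.data, iris.target)

explainer = LimeTabularExplainer(
    iris.data,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    mode="classification",
)
# Perturbs the instance and fits a local linear surrogate around it.
exp = explainer.explain_instance(iris.data[0], model.predict_proba, num_features=2)
print(exp.as_list())  # (feature condition, local weight) pairs
```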
What are some limitations of LIME?
LIME’s explanations are local, not global, and its effectiveness depends on how well the perturbed data samples the local space of the instance.
What are Shapley values?
Shapley values are derived from game theory to fairly attribute the output of a model to its input features, showing each feature’s contribution to a prediction.
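A toy brute-force computation straight from the game-theoretic definition (the value function and player "worths" here are made up); note the cost is exponential in the number of players:

```python
from itertools import combinations
from math import factorial

def shapley(v, n):
    """Exact Shapley values for value function v over players 0..n-1."""
    phi = []
    for i in range(n):
        others = [p for p in range(n) if p != i]
        total = 0.0
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # Weight = |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (v(set(S) | {i}) - v(set(S)))
        phi.append(total)
    return phi

# Additive toy game: a coalition's value is the sum of its members' worth,
# so each player's Shapley value equals its own worth.
worth = {0: 3.0, 1: 1.0, 2: 2.0}
print(shapley(lambda S: sum(worth[p] for p in S), 3))  # [3.0, 1.0, 2.0]
```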
What are the key properties of Shapley values?
Efficiency, symmetry, dummy, and additivity—these properties ensure fair attribution of the model’s output to features.
What are the limitations of Shapley values?
They are computationally intensive, especially with many features, and assume feature independence, which may not always hold.
What is SHAP and how does it relate to Shapley values?
SHAP (SHapley Additive exPlanations) is an implementation of Shapley values tailored for machine learning models, offering efficient computation and a unified interpretation framework.
How does SHAP provide interpretations?
SHAP values represent additive feature attribution, decomposing a prediction into contributions from each feature, with specific methods like Kernel SHAP, Tree SHAP, and Deep SHAP.
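A sketch of Tree SHAP with the `shap` package (assuming it is installed), on a regression forest so the output is a single attribution matrix:

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True)
model = RandomForestRegressor(random_state=0).fit(X, y)

# Tree SHAP computes exact Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)
# Base value plus per-feature contributions reconstructs each prediction.
print(shap_values.shape)
```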
What are some visualization tools used with SHAP?
Force plots, summary plots, and dependence plots are used to visualize how features contribute to individual predictions and overall model behavior.
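Continuing the Tree SHAP sketch above, two of these plots in code:

```python
shap.summary_plot(shap_values, X)        # global overview of feature impact
shap.dependence_plot(0, shap_values, X)  # one feature's effect vs. its value
```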
What are the benefits of using SHAP?
SHAP provides consistent and unbiased explanations, handles feature interactions, and has fast computation methods, especially for tree-based models.
What are the limitations of SHAP?
SHAP values can be overwhelming with many features, require a good understanding of the data and model, and are sensitive to data distributions and feature-independence assumptions.
How do you interpret the coefficient of a feature in a standardized linear model?
The coefficient represents the number of standard deviations the outcome variable will change for a one standard deviation increase in the feature, assuming all other variables are constant.
What is homoscedasticity and why is it important in linear models?
Homoscedasticity means that the variance of the errors is constant across all levels of the independent variables; it’s important because when it is violated, coefficient estimates remain unbiased but their standard errors, and therefore confidence intervals and hypothesis tests, become unreliable.
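A quick visual check is a residuals-vs-fitted plot; homoscedastic errors form a roughly constant band around zero. A sketch on simulated data:

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 2 * X[:, 0] + rng.normal(scale=0.5, size=200)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# No funnel or curve in this scatter suggests constant error variance.
plt.scatter(model.predict(X), residuals, s=10)
plt.axhline(0, color="gray")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```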
What methods can be used to handle multicollinearity in linear models?
Ridge regression (adds L2 regularization) and Lasso regression (adds L1 regularization) are common methods to stabilize coefficients and reduce multicollinearity effects.
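A sketch comparing the two on deliberately collinear toy data; the alpha values are illustrative, not tuned:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly collinear features
X = np.column_stack([x1, x2])
y = 2 * x1 + rng.normal(scale=0.5, size=200)

print(Ridge(alpha=1.0).fit(X, y).coef_)  # L2: shrinks correlated coefs together
print(Lasso(alpha=0.1).fit(X, y).coef_)  # L1: may zero one of them out
```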
What is the Gini impurity in decision trees?
Gini impurity measures the likelihood of an incorrect classification of a randomly chosen element if it were randomly classified according to the distribution of labels in the node. A lower Gini impurity indicates a purer node.
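The formula is 1 minus the sum of squared class proportions; a tiny sketch:

```python
def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions in the node."""
    n = len(labels)
    props = [labels.count(c) / n for c in set(labels)]
    return 1 - sum(p * p for p in props)

print(gini_impurity(["a", "a", "a", "a"]))  # 0.0: pure node
print(gini_impurity(["a", "a", "b", "b"]))  # 0.5: maximally mixed, two classes
```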
How do decision trees determine feature importance?
Decision trees determine feature importance based on how effectively each feature splits the data, using metrics like Gini impurity or information gain. Features that result in more homogeneous splits are deemed more important.
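In scikit-learn these impurity-based importances are exposed directly on a fitted tree:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
tree = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)
# Total impurity decrease attributable to each feature, normalized to sum to 1.
for name, imp in zip(iris.feature_names, tree.feature_importances_):
    print(f"{name}: {imp:.3f}")
```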
What is pruning in decision trees and why is it used?
Pruning involves cutting off parts of the tree that have little predictive power to reduce overfitting and improve the tree’s ability to generalize to new data.
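A sketch of scikit-learn's cost-complexity (post-)pruning; the choice of alpha here is illustrative, and in practice it would be selected by cross-validation:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
# Larger ccp_alpha prunes more aggressively, trading fit for simplicity.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    iris.data, iris.target
)
alpha = path.ccp_alphas[len(path.ccp_alphas) // 2]  # an illustrative midpoint
pruned = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(
    iris.data, iris.target
)
print(pruned.get_n_leaves())  # fewer leaves than the unpruned tree
```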