Interpreting ML Book Flashcards
What does each coefficient represent in a linear model?
Each coefficient quantifies the expected change in the outcome for a one-unit increase in that feature, holding all other features constant.
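A minimal sketch of reading coefficients off a fitted linear model; the feature names, synthetic data, and scikit-learn usage are illustrative assumptions, not from the book.

```python
# Hypothetical example: coefficients of a fitted linear model.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))  # illustrative columns: "rooms", "age"
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
# Each coefficient: expected change in y for a one-unit increase in that
# feature, holding the other feature fixed.
for name, coef in zip(["rooms", "age"], model.coef_):
    print(f"{name}: {coef:.2f}")
```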
Why is standardization important in linear models?
Standardization (scaling features to zero mean and unit variance) puts coefficients on a common scale, so their magnitudes can be compared as a rough measure of relative feature importance.
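A sketch of scaling before fitting so coefficient magnitudes become comparable; the pipeline, synthetic data, and feature scales are assumptions for illustration.

```python
# Hypothetical example: standardizing features to compare coefficients.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two features on very different scales.
X = np.column_stack([rng.normal(0, 1, 200), rng.normal(0, 100, 200)])
y = 2.0 * X[:, 0] + 0.02 * X[:, 1] + rng.normal(size=200)

pipe = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)
# After scaling, each coefficient reflects the effect of a one-standard-deviation
# change, so magnitudes can be compared directly.
print(pipe[-1].coef_)
```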
What are the key assumptions for interpreting linear models?
Linearity, no multicollinearity, homoscedasticity (constant variance of errors), and normality of residuals.
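One way to check the error assumptions in practice is a Breusch-Pagan test on the residuals; this sketch assumes statsmodels and synthetic data.

```python
# Hypothetical check of the constant-variance assumption with statsmodels.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(size=200)

exog = sm.add_constant(X)
fit = sm.OLS(y, exog).fit()
# Breusch-Pagan: a large p-value is consistent with homoscedastic errors.
_, bp_pvalue, _, _ = het_breuschpagan(fit.resid, exog)
print("Breusch-Pagan p-value:", bp_pvalue)
```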
How does high multicollinearity affect linear models?
High multicollinearity inflates the variance of coefficient estimates, making them unstable and unreliable to interpret; regularized methods such as ridge or lasso regression can mitigate it.
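A toy sketch of how ridge regression stabilizes coefficients when two features are nearly collinear; the synthetic data and alpha value are assumptions.

```python
# Hypothetical example: ridge shrinkage under near-perfect collinearity.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(2)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.01, size=300)  # almost a duplicate of x1
X = np.column_stack([x1, x2])
y = x1 + rng.normal(scale=0.1, size=300)

print(LinearRegression().fit(X, y).coef_)  # large, offsetting, unstable coefficients
print(Ridge(alpha=1.0).fit(X, y).coef_)    # shrunk toward similar, stable values
```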
What is the structure of a decision tree?
Decision trees split data based on feature values to create nodes and branches, leading to leaf nodes that provide the final prediction.
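A small sketch of inspecting a fitted tree's structure with scikit-learn; the iris dataset and depth limit are illustrative choices.

```python
# Hypothetical example: node, leaf, and depth counts of a fitted tree.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

print("nodes :", tree.tree_.node_count)
print("leaves:", tree.get_n_leaves())
print("depth :", tree.get_depth())
```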
How do decision trees handle interpretability?
Decision trees use if-then-else conditions that correspond to paths from the root to leaf nodes, making them easy to visualize and understand.
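Those root-to-leaf rules can be printed directly; this sketch uses scikit-learn's export_text on an illustrative iris tree.

```python
# Hypothetical example: printing a tree's if-then rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)
# Each printed branch is one root-to-leaf path, i.e. one interpretable rule.
print(export_text(tree, feature_names=list(data.feature_names)))
```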
What is a common limitation of decision trees?
Decision trees can overfit the data, capturing noise rather than underlying patterns, which can be mitigated by pruning.
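A sketch of cost-complexity pruning as one mitigation; the dataset, train/test split, and ccp_alpha value are assumptions for illustration.

```python
# Hypothetical example: pruning a tree to reduce overfitting.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X_tr, y_tr)
# The pruned tree typically trades a little training fit for better test accuracy.
print("full  :", full.score(X_te, y_te))
print("pruned:", pruned.score(X_te, y_te))
```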
What is LIME and what does it do?
LIME (Local Interpretable Model-agnostic Explanations) explains predictions of any black-box model by creating an interpretable local approximation of the model around a specific instance.
How does LIME create explanations?
LIME generates perturbed samples around the instance by making small random changes to its feature values, queries the black-box model on them, weights the samples by their proximity to the original instance, and fits a simple model (e.g., a sparse linear model) to approximate the complex model's behavior locally.
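A sketch of that workflow with the lime package's LimeTabularExplainer; the dataset, random forest model, and num_features choice are illustrative assumptions.

```python
# Hypothetical example: a local LIME explanation for one instance.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)
# Perturb the instance, query the black box, and fit a local linear surrogate.
exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=5)
print(exp.as_list())  # (feature condition, weight) pairs of the local approximation
```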
What are some limitations of LIME?
LIME’s explanations are local, not global, and its effectiveness depends on how well the perturbed data samples the local space of the instance.
What are Shapley values?
Shapley values are derived from game theory to fairly attribute the output of a model to its input features, showing each feature’s contribution to a prediction.
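A toy sketch of the exact Shapley computation; the value function v(S) is made up, and the enumeration follows the standard coalition-weighting formula.

```python
# Hypothetical example: exact Shapley values for a 3-feature toy "game".
from itertools import combinations
from math import factorial

features = ["f1", "f2", "f3"]
v = {frozenset(): 0.0,
     frozenset({"f1"}): 4.0, frozenset({"f2"}): 3.0, frozenset({"f3"}): 1.0,
     frozenset({"f1", "f2"}): 9.0, frozenset({"f1", "f3"}): 5.0,
     frozenset({"f2", "f3"}): 4.0, frozenset({"f1", "f2", "f3"}): 10.0}

def shapley(i):
    n = len(features)
    others = [f for f in features if f != i]
    total = 0.0
    for k in range(n):
        for subset in combinations(others, k):
            S = frozenset(subset)
            # Weight |S|! (n - |S| - 1)! / n! times feature i's marginal contribution.
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (v[S | {i}] - v[S])
    return total

for f in features:
    print(f, round(shapley(f), 3))  # contributions sum to v(all features) - v(empty set)
```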
What are the key properties of Shapley values?
Efficiency, symmetry, dummy, and additivity—these properties ensure fair attribution of the model’s output to features.
What are the limitations of Shapley values?
They are computationally intensive because the number of feature coalitions grows exponentially with the number of features, and common approximation methods assume feature independence, which may not hold.
What is SHAP and how does it relate to Shapley values?
SHAP (SHapley Additive exPlanations) is an implementation of Shapley values tailored for machine learning models, offering efficient computation and a unified interpretation framework.
How does SHAP provide interpretations?
SHAP values represent additive feature attribution, decomposing a prediction into contributions from each feature, with specific methods like Kernel SHAP, Tree SHAP, and Deep SHAP.
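A minimal Tree SHAP sketch with the shap package; the diabetes dataset and gradient-boosted regressor are illustrative choices, not from the book.

```python
# Hypothetical example: Tree SHAP attributions for a tree-based regressor.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])
# Additivity: the base value plus a row's SHAP values reconstructs that prediction.
print(shap_values.shape)         # (100, 10): one attribution per feature per sample
print(explainer.expected_value)  # base value: the model's average prediction
```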