Interpretability and Feature Importance Flashcards

1
Q

Define intrinsic vs post-hoc interpretability

A

Intrinsic means the model itself is interpretable (e.g. a linear model, whose coefficients can be read directly); post-hoc means a separate method must be applied after training to understand feature importance in a black-box model.
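
A minimal scikit-learn sketch of the contrast, on synthetic data: the linear model's coefficients are readable directly (intrinsic), while permutation importance has to probe the fitted model from the outside (post-hoc).

from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = LinearRegression().fit(X, y)

print(model.coef_)  # intrinsic: importances read straight off the model
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print(result.importances_mean)  # post-hoc: probe the fitted model from outside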

2
Q

What is the regularisation term in Total Variation? and in Sparse Total Variation?

A

Total Variation: J(w) = ‖∇w‖₁, the ℓ1 norm of the discrete gradient of w, i.e. the sum of absolute finite differences along each of the three spatial axes of the 3D weight map.

Sparse Total Variation: J(w) = ‖∇w‖₁ + ‖w‖₁, i.e. the TV term plus an ℓ1 sparsity penalty on w itself.
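
A minimal numpy sketch of the (anisotropic) TV penalty on a 3D weight map, plus its sparse variant; the helper names are illustrative only.

import numpy as np

def tv_penalty(w):
    # l1 norm of the discrete gradient: sum of absolute finite
    # differences along each spatial axis
    return sum(np.abs(np.diff(w, axis=ax)).sum() for ax in range(w.ndim))

def sparse_tv_penalty(w):
    return tv_penalty(w) + np.abs(w).sum()  # add the l1 sparsity term

w = np.random.default_rng(0).normal(size=(4, 4, 4))
print(tv_penalty(w), sparse_tv_penalty(w))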

3
Q

Write down the regularisation term of a Laplacian model and Sparse Total Laplacian

A

Laplacian: J(w) = ‖∇w‖₂², the squared ℓ2 norm of the discrete gradient (equivalently wᵀLw with L the graph Laplacian), which penalises rough, non-smooth weight maps.

Sparse Laplacian (GraphNet): J(w) = ‖∇w‖₂² + ‖w‖₁.
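
A minimal numpy sketch, assuming the GraphNet-style form above: the squared ℓ2 norm of the discrete gradient, with an added ℓ1 term for the sparse variant.

import numpy as np

def laplacian_penalty(w):
    # squared l2 norm of the discrete gradient along each axis
    return sum((np.diff(w, axis=ax) ** 2).sum() for ax in range(w.ndim))

def sparse_laplacian_penalty(w):
    return laplacian_penalty(w) + np.abs(w).sum()

w = np.random.default_rng(0).normal(size=(4, 4, 4))
print(laplacian_penalty(w), sparse_laplacian_penalty(w))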

4
Q

Define Gini importance

A

The total reduction in the impurity criterion (e.g. Gini impurity) brought by all splits on that feature, summed over a tree and averaged across trees in an ensemble; also called mean decrease in impurity (MDI).
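
In scikit-learn tree ensembles this is what feature_importances_ exposes; a minimal sketch:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(data.data, data.target)
# Gini importances, normalised to sum to 1 across features
for name, imp in zip(data.feature_names, model.feature_importances_):
    print(f"{name}: {imp:.3f}")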

5
Q

Define permutation feature importance

A

Randomly shuffle the values of a given feature (breaking its association with the target while keeping its marginal distribution) and measure the resulting drop in model performance; the larger the drop, the more important the feature.
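
A minimal from-scratch sketch of the procedure (scikit-learn ships a ready-made sklearn.inspection.permutation_importance; in practice the drop is measured on a held-out set rather than the training data used here):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def permutation_importance(model, X_val, y_val, n_repeats=10, seed=0):
    rng = np.random.default_rng(seed)
    baseline = accuracy_score(y_val, model.predict(X_val))
    importances = np.zeros(X_val.shape[1])
    for j in range(X_val.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X_val.copy()
            rng.shuffle(X_perm[:, j])  # break the feature-target link
            drops.append(baseline - accuracy_score(y_val, model.predict(X_perm)))
        importances[j] = np.mean(drops)  # mean performance drop
    return importances

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
print(permutation_importance(model, X, y))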

6
Q

Define LIME

A

Local Interpretable Model-agnostic Explanations (LIME) perturbs samples in the neighbourhood of the instance being explained, queries the original model for predictions on those perturbations, and fits a simple interpretable (e.g. sparse linear) surrogate model, weighted by proximity to the instance, to see what is driving the local decision.
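
A from-scratch sketch of the core idea for tabular data (the real lime package additionally discretises features and fits a sparse surrogate):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge

def lime_explain(predict_fn, x, n_samples=500, scale=0.5, seed=0):
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))  # perturb near x
    preds = predict_fn(Z)  # the original model's predictions
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / (2 * scale ** 2))  # proximity kernel
    surrogate = Ridge(alpha=1.0).fit(Z, preds, sample_weight=weights)
    return surrogate.coef_  # local, interpretable feature effects

X, y = make_regression(n_samples=300, n_features=5, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)
print(lime_explain(model.predict, X[0]))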

7
Q

What is SHAP?

A

A game-theoretic approach to feature importance. SHAP (SHapley Additive exPlanations) attributes a prediction to features via Shapley values: each feature's contribution is its marginal effect on the prediction, averaged over all possible subsets/orderings of the other features, typically estimated by sampling.
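
A minimal Monte Carlo sketch of estimating one Shapley value (production implementations such as the shap package use far more efficient estimators):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

def shapley_value(predict_fn, x, X_bg, j, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_iter):
        perm = rng.permutation(x.size)  # random feature ordering
        bg = X_bg[rng.integers(len(X_bg))]  # random background sample
        pos = int(np.where(perm == j)[0][0])
        with_j, without_j = bg.copy(), bg.copy()
        with_j[perm[:pos + 1]] = x[perm[:pos + 1]]  # coalition incl. j from x
        without_j[perm[:pos]] = x[perm[:pos]]  # same coalition without j
        total += float(predict_fn(with_j[None, :])[0]
                       - predict_fn(without_j[None, :])[0])
    return total / n_iter  # average marginal contribution of feature j

X, y = make_regression(n_samples=300, n_features=5, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)
print(shapley_value(model.predict, X[0], X, j=0))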

8
Q

Explain the difference between filter, wrapper, and embedded methods for feature selection

A

Filter methods: use a proxy measure to rank the features with respect to their relationship with the labels/targets/outcomes (e.g. correlation, statistical test, mutual information).

Wrapper methods: use a predictive model to score/rank the features according to their predictive power (e.g. Recursive Feature Elimination, aka RFE).

Embedded methods: perform feature selection as part of the model construction process (e.g. LASSO and Elastic-net regularization).
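
A minimal scikit-learn sketch contrasting the three families on synthetic data:

from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Filter: rank features by mutual information with the target.
filt = SelectKBest(mutual_info_classif, k=5).fit(X, y)

# Wrapper: Recursive Feature Elimination around a predictive model.
wrap = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5).fit(X, y)

# Embedded: L1 (LASSO-style) regularisation zeroes out coefficients
# as part of fitting the model itself.
emb = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)

print(filt.get_support().sum(), wrap.support_.sum(), (emb.coef_ != 0).sum())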

9
Q

How does stability selection work?

A

By fitting many sparse models (e.g. LASSO) on randomly subsampled or otherwise perturbed versions of the data and recording how often each feature is selected; features kept in a large fraction of fits are considered stable and reliable.
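
A minimal sketch, assuming plain LASSO as the sparse model and random subsampling as the perturbation:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

def stability_selection(X, y, alpha=0.1, n_rounds=100, frac=0.5, seed=0):
    rng = np.random.default_rng(seed)
    counts = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        coef = Lasso(alpha=alpha, max_iter=10000).fit(X[idx], y[idx]).coef_
        counts += coef != 0  # was the feature kept in this fit?
    return counts / n_rounds  # per-feature selection frequency

X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)
print(stability_selection(X, y))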
