Chapters 16-20 Cost-Sensitive Modeling Flashcards
How can we choose the class_weight values for cost-sensitive models?
P 210
The class weighting can be defined in multiple ways; for example:
Domain expertise, determined by talking to subject matter experts.
Tuning, determined by a hyperparameter search such as a grid search.
Heuristic, specified using a general best practice.
The scikit-learn library provides an implementation of the best-practice heuristic for class weighting. It is implemented via the ____ function.
P 212
compute_class_weight()
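A minimal sketch of this heuristic in use, assuming a synthetic 1:100 dataset created with make_classification (the dataset parameters are illustrative):

from numpy import unique
from sklearn.datasets import make_classification
from sklearn.utils.class_weight import compute_class_weight

# illustrative imbalanced dataset, roughly 1:100 minority:majority
X, y = make_classification(n_samples=10000, weights=[0.99], flip_y=0, random_state=1)

# 'balanced' heuristic: n_samples / (n_classes * count_of_each_class)
weights = compute_class_weight(class_weight='balanced', classes=unique(y), y=y)
print(weights)  # small weight for the majority class, large for the minority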
The split points of the tree are chosen to best separate examples into two groups with minimum mixing. When both groups are dominated by examples from one class, the criterion used to select a split point will see good separation when, in fact, the examples from the minority class are being ignored. How can we overcome this problem?
P 218
This problem can be overcome by modifying the criterion used to evaluate split points to take the importance of each class into account, referred to generally as the weighted split-point or weighted decision tree.
How do we make a decision tree sensitive to the misclassification cost?
P 222
Our intuition for cost-sensitive tree induction is to modify the weight of an instance proportional to the cost of misclassifying the class to which the instance belonged. Higher weights [are] assigned to instances coming from the class with a higher value of misclassification cost.
As such, this modification of the decision tree algorithm is referred to as a weighted decision tree, a class-weighted decision tree, or a cost-sensitive decision tree.
The scikit-learn Python machine learning library provides an implementation of the decision tree algorithm that supports class weighting. The DecisionTreeClassifier class provides the ____ argument that can be specified as a model hyperparameter.
P 223
class_weight
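For example, a class-weighted decision tree can be defined as follows; the 1:100 weighting is an assumed example matching a 1:100 class distribution:

from sklearn.tree import DecisionTreeClassifier

# errors on the minority class (1) are weighted 100x more than the majority (0)
model = DecisionTreeClassifier(class_weight={0: 1, 1: 100})
# or apply the best-practice heuristic automatically:
# model = DecisionTreeClassifier(class_weight='balanced')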
What is the hyperparameter "C" in the SVM classifier?
P 232
[C] determines the number and severity of the violations to the margin (and to the hyperplane) that we will tolerate. We can think of C as a budget for the amount that the margin can be violated by the n observations. A value of C = 0 indicates a hard margin and no tolerance for violations of the margin. Small positive values allow some violation, whereas large values, such as 1, 10, and 100, allow for a much softer margin.
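As a sketch of how this maps onto code: note that in scikit-learn's SVC, the C argument is the inverse of the budget described above, so a larger C penalizes margin violations more strongly (a harder margin); the value below is illustrative:

from sklearn.svm import SVC

# in scikit-learn's formulation, smaller C -> softer margin (more violations
# tolerated), the opposite of the budget framing quoted above
model = SVC(kernel='rbf', gamma='scale', C=1.0)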
Although SVMs often produce effective solutions for balanced datasets, they are sensitive to the imbalance in the datasets and produce suboptimal models. True/False
P 232
True
“In SVM, a larger weighting can be used for the minority class, allowing the margin to be softer, whereas a smaller weighting can be used for the majority class, forcing the margin to be harder and preventing misclassified examples.” Explain what this statement means.
P 233
This has the effect of encouraging the margin to contain the majority class with less flexibility, while allowing the minority class to be flexible, with misclassification of majority class examples onto the minority class side if needed. That is, the modified SVM algorithm would not tend to skew the separating hyperplane toward the minority class examples to reduce the total misclassifications, as the minority class examples are now assigned a higher misclassification cost.
The C parameter is used as a penalty during the fit of the model, specifically the finding of the decision boundary.
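A minimal sketch of a cost-sensitive SVM in scikit-learn, where the class_weight argument scales the C penalty per class so that errors on the heavily weighted minority class cost more during fitting (the 1:100 weighting is illustrative):

from sklearn.svm import SVC

# errors on class 1 (minority) are penalized 100x more (class_weight scales C)
model = SVC(kernel='rbf', gamma='scale', class_weight={0: 1, 1: 100})
# or derive the weights from the training data:
# model = SVC(class_weight='balanced')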
How can we make NN models pay more attention to examples from the minority class than the majority class in datasets with a severely skewed class distribution?
P 240
The backpropagation algorithm can be updated to weigh misclassification errors in proportion to the importance of the class, referred to as weighted neural networks or cost-sensitive neural networks.
The Keras Python deep learning library provides support for class weighting. The ____ function that is used to train Keras neural network models takes an argument called ____.
P 245
fit(), class_weight
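A minimal sketch in Keras, assuming a tiny binary classifier with two input features and an illustrative 1:100 weighting (X and y stand in for training data):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# small illustrative network for binary classification
model = Sequential()
model.add(Dense(10, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

# weigh errors on the minority class (1) 100x during backpropagation
# model.fit(X, y, class_weight={0: 1, 1: 100}, epochs=10)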
Although XGBoost performs well in general, even on imbalanced classification datasets, it offers a way to tune the training algorithm to pay more attention to misclassification of the minority class for datasets with a skewed class distribution. True/False
P 250
True
XGBoost provides a hyperparameter designed to tune the behavior of the algorithm for imbalanced classification problems; this is the ____ hyperparameter.
P 254
scale_pos_weight
What is the effect of the scale_pos_weight hyperparameter in XGBoost?
P 255
This has the effect of scaling errors made by the model during training on the positive class, encouraging the model to over-correct them.
The scale_pos_weight hyperparameter of XGBoost can help the model achieve better performance when making predictions on the positive class. What will happen if it's pushed too far?
P 255
It may result in the model overfitting the positive class at the cost of worse performance on the negative class or both classes.
What’s a sensible default value to set for the scale_pos_weight hyperparameter?
P 255
The inverse of the class distribution.
For example, for a dataset with a 1 to 100 ratio of examples in the minority to majority classes, scale_pos_weight can be set to 100. This will give classification errors made by the model on the minority class (positive class) 100 times more impact and, in turn, 100 times more correction than errors made on the majority class.
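A sketch of setting this heuristic from the training labels, assuming y holds the labels with 0 as the majority (negative) class and 1 as the minority (positive) class:

from collections import Counter
from xgboost import XGBClassifier

# counter = Counter(y)
# estimate = counter[0] / counter[1]  # negative/positive ratio, ~100 here
model = XGBClassifier(scale_pos_weight=100)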