Chapter 15 Cost-Sensitive Learning Flashcards

1
Q

What’s cost-sensitive learning?

P 195

A

Cost-sensitive learning is a subfield of machine learning that takes the costs of prediction errors (and potentially other costs) into account when training a machine learning model.

There is a subfield of machine learning that is focused on learning and using models on data that have uneven penalties or costs when making predictions and more (uneven cost for making errors, FP & FN). This field is generally referred to as Cost-Sensitive Machine Learning, or more simply Cost-Sensitive Learning.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Most machine learning algorithms designed for classification assume that there is an equal number of examples for each observed class. True/False

P 196

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

In cost-sensitive learning instead of each instance being either correctly or incorrectly classified, each class (or instance) is given a ____.

P 196

A

misclassification cost

Instead of trying to optimize the accuracy, the problem is then to minimize the total misclassification cost.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Most classifiers assume that the misclassification costs (false negative and false positive cost) are the same. In most real-world applications, this assumption is not true. True/False

P 196

A

True

Real-world imbalanced binary classification problems typically have a different interpretation for each of the classification errors that can be made. For example, classifying a negative case as a positive (FP) case is typically far less of a problem than classifying a positive case as a negative (FN) case.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Define: Cost and Cost minimization.

P 198

A

Cost: The penalty associated with an incorrect prediction.
Cost Minimization: The goal of cost-sensitive learning is to minimize the cost of a model on a training dataset

Cost-Sensitive Learning is a type of learning that takes the misclassification costs(and possibly other types of cost) into consideration. The goal of this type of learning is to minimize the total cost.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Although some methods from cost-sensitive learning can be helpful on imbalanced classification, not all cost-sensitive learning techniques are imbalanced-learning techniques, and conversely, not all methods used to address imbalanced learning are appropriate for cost-sensitive learning. True/False

P 198

A

True

To make this concrete, we can consider a wide range of other ways we might wish to consider or measure cost when training a model on a dataset:
ˆ Cost of misclassification errors (or prediction errors more generally).
ˆ Cost of tests or evaluation.
ˆ Cost of labeling.
ˆ Cost of intervention or changing the system from which observations are drawn.
etc.
The above list highlights that the cost we are interested in for imbalanced-classification is just one type of the range of costs that the broader field of cost-sensitive learning might consider.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What’s a cost matrix?

P 200

A

Cost Matrix: A matrix that assigns a cost to each cell in the confusion matrix.

we use the notation C() to indicate the cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How is the cost-weighted sum of the False Negatives and False Positives calculated?

P 200

A

TotalCost = C(0, 1) × FalseNegatives + C(1, 0) × FalsePositives
This is the value that we seek to minimize in cost-sensitive learning, at least conceptually.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

The effectiveness of cost-sensitive learning relies strongly on the supplied____.

P 201

A

cost matrix (C(0, 1), C(1, 0), C(1, 1), C(0, 0)), Parameters provided there will be of crucial importance to both training and predictions steps.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

In some problem domains, defining the cost matrix might be obvious. In other domains, defining the cost matrix might be challenging. Give an example for each.

P 201

A

In an insurance claim example, the costs for a false positive might be the monetary cost of follow-up with the customer to the company and the cost of a false negative might be the cost of the insurance claim. But for example, in a cancer diagnostic test example, the cost of a false positive might be the monetary cost of performing subsequent tests, whereas what is the equivalent dollar cost for letting a sick patient go home and get sicker?

Further, the cost might be a complex multi-dimensional function, including monetary costs, reputation costs, and more.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

A good starting point for imbalanced classification tasks is to assign costs based on the ____. Give an example of this type of cost assignment.

P 201

A

Inverse class distribution.

For example, we may have a dataset with a 1 to 100 (1:100) ratio of examples in the minority class to examples in the majority class. This ratio can be inverted and used as the cost of misclassification errors, where the cost of a False Negative is 100 and the cost of a False Positive is 1.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

There are perhaps three main groups of cost-sensitive methods that are most relevant for imbalanced learning; what are they?

P 202

A
  1. Cost-Sensitive Resampling( cost-proportionate sampling) (Me: over/under sampling in a misclassification cost sensitive manner)
  2. Cost-Sensitive Algorithms
  3. Cost-Sensitive Ensembles

For imbalanced classification where the cost matrix is defined using the class distribution, there is no difference in the data sampling technique for Cost-Sensitive Resampling.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Machine learning algorithms are rarely developed specifically for cost-sensitive learning. What can we do to use them for cost-sensitive problems?

P 202

A

The wealth of existing machine learning algorithms can be modified to make use of the cost matrix.

Many such algorithm-specific augmentations have
been proposed for popular algorithms, like decision trees and support vector machines.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

The scikit-learn Python machine learning library provides examples of these cost-sensitive extensions via the ____ argument on the following classifiers: ˆ SVC ˆ DecisionTreeClassifier

P 202

A

class_weight

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Using the costs as a penalty for misclassification can be used for iteratively trained algorithms, such as ____ and ____.

P 203

A

artificial neural networks
logistic regression

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

The Scikit-learn library provides examples of cost-sensitive extensions to models via the ____ argument on the following classifiers:
ˆ ____
ˆ ____.

P 203

A

class_weight
LogisticRegression
RidgeClassifier

17
Q

What are cost-sensitive ensembles?

P 203

A

Techniques designed to filter or combine the predictions from traditional machine learning models in order to take misclassification costs into account. (Me:e.g. BalancedRandomForest, BalancedBaggingClassifier, EasyEnsemble)

Making an arbitrary classifier cost-sensitive
by wrapping a cost-minimizing procedure around it.

18
Q

Why are cost-sensitive ensembles called wrappers and meta-learners as well?

P 203

A

These methods are referred to as wrapper methods as they wrap, a standard machine learning classifier. They are also referred to as meta-learners or ensembles as they learn how to use or combine predictions from other models.

Cost-sensitive meta-learning converts existing cost-insensitive classifiers into costsensitive ones without modifying them. Thus, it can be regarded as a middleware component that pre-processes the training data, or post-processes the output, from the cost-insensitive learning algorithms.

19
Q

What’s the difference between loss function and cost function?

External

A

There is no major difference. In other words, the loss function is to capture the difference between the actual and predicted values for a single record whereas cost functions aggregate the difference for the entire training dataset.

Ref