10: Cost-sensitive Evaluation, Decision-making and Learning Flashcards
Question 1
Level: medium
Which of the following statements is TRUE?
a) If cb_10 > cb_00 and cb_01 < cb_11, then the optimal policy is to label all examples as negative (predict Y = 0).
b) If cb_10 < cb_00 and cb_01 > cb_11, then the optimal policy is to label all examples as positive (predict Y = 1).
c) If cb_00 = cb_01 = 0 and cb_11 < cb_10 < 0, then the optimal policy is to label all examples as positive (predict Y = 1).
d) The reasonableness conditions stipulate that cb_10 < cb_00 and cb_01 < cb_11.
Answer: d) The reasonableness conditions stipulate that cb_10 < cb_00 and cb_01 < cb_11.
Cost-benefit matrix (rows: predicted class, columns: actual class):
cb_00  cb_01
cb_10  cb_11
Reasonableness conditions: cb_10 < cb_00 and cb_01 < cb_11. They imply that neither row in the matrix dominates the other.
Which statement(s) are NOT true?
a) The average profit depends on the threshold that is used to turn conditional probability estimates into class predictions.
b) The empirical, profit-maximizing threshold can be different from the theoretical, cost-sensitive classification threshold.
c) A two-threshold point strategy aims at assisting model-based decision-making in the “middle” score range of the model, where the accuracy of the model is low.
d) The higher the accuracy of a predictive model, the higher the average profit.
Answer: d) The higher the accuracy of a predictive model, the higher the average profit.
Explanation:
Accuracy is a measure of correct predictions over the total predictions, which may not necessarily align with maximizing profit.
Profit is influenced by various factors, including the costs associated with different types of errors (false positives and false negatives).
A model with high accuracy might still yield suboptimal profit if it misclassifies instances with high associated costs.
Question 1
What is the cost-benefit matrix? How is it different from a cost matrix?
- The cost-benefit (CB) matrix specifies the monetary consequences of correctly or incorrectly classifying an instance
- Costs and benefits are typically different for positive and negative class instances
- I.e., we typically have imbalanced (mis)classification costs:
  - CB_FN ≠ CB_FP
  - CB_TP ≠ CB_TN
- A cost matrix records costs only, whereas a CB matrix also records benefits, with costs entered as negative values
Note: in the literature, a cost matrix is often used instead of a cost-benefit matrix (cf. Section 4)
- 𝑪𝒂: fixed administrative cost for investigating a transaction or contacting a customer (a cost, so negative value in the CB matrix, −𝑪𝒂)
- 𝑨: the amount of the transaction (a cost, so negative value in the CB matrix, −𝑨)
Note: make sure not to count a cost twice by considering ‘opportunity costs’, i.e., what the outcome would have been if classified differently.
Question 2
What are the reasonableness conditions? What if these are violated?
Conceptually, labeling an example incorrectly should involve a cost compared to labeling it correctly:
- cb_10 < cb_00
- cb_01 < cb_11
These are called the ‘reasonableness conditions’. They imply that neither row in the matrix dominates the other.
If the first reasonableness condition is violated while the second is not:
* Then for both negative and positive instances, the benefit of labeling the example as positive is larger than for labeling it as negative: cb_10 > cb_00 and cb_11 > cb_01
* So, the optimal policy then is to label all examples positive
If the second reasonableness condition is violated while the first is not, the reverse holds: cb_00 > cb_10 and cb_01 > cb_11, so the optimal policy is to label all examples negative
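The case analysis above can be sketched as a small check. This is only an illustration: the function name `trivial_policy` is hypothetical, the matrix is indexed as `cb[predicted][actual]`, and the case where both conditions fail is not covered by these notes, so the sketch simply raises an error for it.

```python
def trivial_policy(cb):
    """Return the label that is always optimal when one condition fails.

    cb[i][j] is the cost-benefit of predicting class i for an instance
    whose actual class is j (rows = predicted, columns = actual).
    Returns None when both reasonableness conditions hold.
    """
    first_ok = cb[1][0] < cb[0][0]   # cb_10 < cb_00
    second_ok = cb[0][1] < cb[1][1]  # cb_01 < cb_11
    if first_ok and second_ok:
        return None                  # neither row dominates the other
    if second_ok:
        return 1                     # only the first fails: label all positive
    if first_ok:
        return 0                     # only the second fails: label all negative
    raise ValueError("both conditions violated; not covered in these notes")


# Option c) of the multiple-choice question: cb_00 = cb_01 = 0, cb_11 < cb_10 < 0
print(trivial_policy([[0, 0], [-1, -2]]))
```

For option c) the second condition fails (cb_01 = 0 is not smaller than cb_11), so the sketch returns the negative label, which is why that option is false.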
Question 3
How is the cost-benefit matrix defined for fraud detection?
Rows: predicted class (0 = legitimate, 1 = fraud); columns: actual class (0 = legitimate, 1 = fraud).
 0      -A
-C_a    -C_a
* C_a: fixed administrative cost for investigating a transaction or contacting a customer (a cost, so negative value in the CB matrix, −C_a)
* A: the amount of the transaction (a cost, so negative value in the CB matrix, −A)
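As a minimal sketch, the fraud matrix can be built per transaction and checked against the reasonableness conditions. The function names and the example amounts are made up for illustration; the layout assumes rows = predicted class and columns = actual class.

```python
def fraud_cb_matrix(amount, admin_cost):
    """CB matrix for one transaction; rows = predicted, columns = actual."""
    return [
        [0.0, -amount],              # predict legitimate: 0, or lose A if fraud
        [-admin_cost, -admin_cost],  # predict fraud: pay C_a to investigate
    ]


def is_reasonable(cb):
    """True when cb_10 < cb_00 and cb_01 < cb_11."""
    return cb[1][0] < cb[0][0] and cb[0][1] < cb[1][1]


large = fraud_cb_matrix(amount=250.0, admin_cost=5.0)
small = fraud_cb_matrix(amount=3.0, admin_cost=5.0)
print(is_reasonable(large), is_reasonable(small))
```

With this matrix the second condition reduces to A > C_a: for a transaction smaller than the administrative cost, investigating can never pay off.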
Question 4
What is meant with a baseline model, and why is this important in cost-sensitive model evaluation?
A baseline model provides a reference point against which the relative performance of a model is assessed.
- Classification decisions that are optimal remain unchanged:
- If each entry in the matrix is multiplied by a positive constant
- Corresponds with changing the scale of measuring, e.g., currency
- If a constant is added to each entry
- Corresponds to changing the baseline to which costs are measured (shifting up or down the net resulting cost or benefit)
- By scaling and shifting entries, any CB matrix can be transformed into a ‘simpler’ matrix, CB’:
- Leads to the same classification decisions (i.e., predictions) (cf. Section 2)
- Interpretability?
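The invariance claims above can be checked numerically. The sketch below assumes that decisions are made by picking the label with the larger expected profit, uses exact fractions to avoid floating-point ties, and uses made-up matrix values; `transform` and `same_decisions` are hypothetical helper names.

```python
from fractions import Fraction


def best_label(p_pos, cb):
    """Label with the larger expected profit, given P(y = 1 | x) = p_pos."""
    ep0 = (1 - p_pos) * cb[0][0] + p_pos * cb[0][1]
    ep1 = (1 - p_pos) * cb[1][0] + p_pos * cb[1][1]
    return 1 if ep1 > ep0 else 0


def transform(cb, scale, shift):
    """Multiply every entry by `scale`, then add `shift` to every entry."""
    return [[scale * v + shift for v in row] for row in cb]


def same_decisions(cb_a, cb_b, steps=100):
    """Compare decisions on an evenly spaced grid of probabilities."""
    probs = [Fraction(k, steps) for k in range(steps + 1)]
    return all(best_label(p, cb_a) == best_label(p, cb_b) for p in probs)


cb = [[0, -100], [-5, -5]]
rescaled = transform(cb, scale=Fraction(9, 10), shift=50)  # e.g. other currency
flipped = transform(cb, scale=-1, shift=0)                 # negative constant

print(same_decisions(cb, rescaled))  # positive scale plus shift
print(same_decisions(cb, flipped))   # negative scale
```

The first comparison agrees on every grid point, while the second does not, which illustrates why the multiplying constant has to be positive.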
Question 5
What is the difference between instance-dependent and class-dependent cost-sensitive decision-making?
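Instance-dependent: costs are specific to each individual instance in the dataset.
Class-dependent: costs depend only on the actual and predicted class, so they are the same for every instance with that class combination.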
- In practice, costs are often instance-dependent rather than class-dependent
- Instance-dependent = example-, record-, or observation-dependent
- Can be defined using an instance-dependent CB matrix, or rather, tensor
- But: instance-dependent cost-sensitive evaluation and learning: too complex?
- Simple solution: averaging instance-dependent costs yields class-dependent costs
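A minimal sketch of the averaging step, using the fraud setting as an example. It assumes that only the missed-fraud entry varies per instance (the transaction amount A) and that it is averaged over the actual fraud cases; the data and the function name are made up.

```python
from statistics import mean


def class_dependent_cb(amounts, labels, admin_cost):
    """Average the instance-dependent entry -A_i into one class-level entry."""
    fraud_amounts = [a for a, y in zip(amounts, labels) if y == 1]
    avg_fraud_amount = mean(fraud_amounts)
    return [
        [0.0, -avg_fraud_amount],    # missed fraud now costs the average amount
        [-admin_cost, -admin_cost],  # C_a is the same for every instance
    ]


amounts = [20.0, 500.0, 35.0, 1500.0]  # hypothetical transaction amounts
labels = [0, 1, 0, 1]                  # 1 = fraud
cb = class_dependent_cb(amounts, labels, admin_cost=5.0)
print(cb)
```

The price of this simplification is that a 500 and a 1500 fraud case are treated alike, which is exactly the information an instance-dependent approach would keep.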
Question 6
What is threshold tuning?
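Threshold tuning is the process of adjusting the classification threshold that turns a model's conditional probability estimates into class predictions, in order to optimize a chosen evaluation measure. In cost-sensitive evaluation the threshold is tuned to maximize the average profit (AP); other possible targets include precision, recall, or the F1 score.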
Question 7
What is the difference between the expected average profit and the expected profit?
Expected Average Profit: Average profit per instance.
Expected Profit: Total profit across all instances.
Question 8
Explain the maximum profit measure for customer churn prediction.
Maximum Profit
* Threshold tuning yields the optimal threshold, T*, that maximizes the AP
* The Maximum AP is simply called the Maximum Profit (MP)
* The MP can be used as a measure for model evaluation and selection
* The MP can be used for learning a model
The Maximum Profit (MP) measure is obtained as $MP = \max_{\alpha} AP(\alpha) = AP(\alpha^*)$
* With 𝛼∗ the optimal proportion of customers to target,
* Which is equivalent with the optimal, profit-maximizing threshold T*
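The sweep over the targeted proportion can be sketched as follows. This is an illustration only: it uses a generic CB matrix rather than the churn-specific profit formula, assumes all scores are distinct (so that targeting the top fraction is equivalent to a threshold), and the scores, labels, and matrix values are made up.

```python
def average_profit(preds, labels, cb):
    """Mean of cb[predicted][actual] over all instances."""
    return sum(cb[p][y] for p, y in zip(preds, labels)) / len(labels)


def maximum_profit(scores, labels, cb):
    """Sweep the targeted proportion; return (MP, alpha*, T*)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    best = None
    for k in range(n + 1):            # target the k highest-scored instances
        preds = [0] * n
        for i in order[:k]:
            preds[i] = 1
        ap = average_profit(preds, labels, cb)
        if best is None or ap > best[0]:
            # Equivalent threshold: target every score >= T*
            threshold = scores[order[k - 1]] if k > 0 else float("inf")
            best = (ap, k / n, threshold)
    return best


scores = [0.9, 0.8, 0.4, 0.3, 0.1]    # hypothetical churn scores
labels = [1, 0, 1, 0, 0]              # 1 = churner
cb = [[0, 0], [-1, 10]]               # hypothetical: contact costs 1, retained churner yields 10
mp, alpha_star, t_star = maximum_profit(scores, labels, cb)
print(mp, alpha_star, t_star)
```

Because the optimum is found on observed data, the resulting T* is an empirical threshold and may differ from the theoretical cost-sensitive threshold.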
Question 9
How to make cost-sensitive classification decisions based on expected profit?
The optimal prediction for an example 𝒙 is the class 𝒊 that yields the largest Predicted Expected Profit
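Written out for the binary case, the rule reads as follows. The notation is an assumption based on the matrix convention used in these notes, where $cb_{ij}$ is the outcome of predicting class $i$ when the actual class is $j$, and $P(j \mid x)$ is the model's estimated probability of class $j$.

```latex
EP(i \mid x) = \sum_{j \in \{0,1\}} P(j \mid x)\, cb_{ij},
\qquad
\hat{y}(x) = \arg\max_{i \in \{0,1\}} EP(i \mid x)
```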
Question 10
What is the optimal, cost-sensitive classification threshold?
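The optimal, cost-sensitive classification threshold is the threshold that maximizes the overall profit, or equivalently minimizes the overall cost, when making predictions. In a cost-sensitive classification framework, the different misclassification errors (false positives and false negatives) incur different costs, and correct predictions can carry different benefits. The threshold is therefore the probability cutoff above which predicting the positive class has a higher expected profit than predicting the negative class.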
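A sketch of how the threshold follows from comparing expected profits, with $p = P(1 \mid x)$ and $cb_{ij}$ the outcome of predicting $i$ when the actual class is $j$. Whether ties are resolved with $\ge$ or $>$ is a convention assumed here, and the last step relies on the reasonableness conditions, which make both differences positive.

```latex
\begin{align*}
\text{predict } 1
  &\iff p\, cb_{11} + (1-p)\, cb_{10} \;\ge\; p\, cb_{01} + (1-p)\, cb_{00} \\
  &\iff p \left[ (cb_{11} - cb_{01}) + (cb_{00} - cb_{10}) \right] \;\ge\; cb_{00} - cb_{10} \\
  &\iff p \;\ge\; T = \frac{cb_{00} - cb_{10}}{(cb_{00} - cb_{10}) + (cb_{11} - cb_{01})}
\end{align*}
```

For the fraud matrix with entries $0, -A, -C_a, -C_a$ this gives $T = C_a / A$: the larger the transaction amount relative to the investigation cost, the lower the probability at which investigating pays off.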
Question 11
How is cost-sensitive logistic different from regular logistic regression? Explain the rationale.
Cost-sensitive logistic regression differs from regular logistic regression in that it incorporates class-dependent misclassification costs into the model training process. Regular logistic regression minimizes the log-loss (negative log-likelihood), which weights an error on any instance of either class equally. Cost-sensitive logistic regression instead minimizes a cost function that accounts for the different costs associated with the different types of misclassification, so that the fitted model is pushed to avoid the expensive errors first.
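One common way to write such an objective is a class-weighted log-loss. This is a sketch of one possible formulation, not necessarily the one used in the course material; the weights $w_0, w_1$ and the suggested choice for them are assumptions.

```latex
\mathcal{L}(\theta) = -\frac{1}{N} \sum_{n=1}^{N}
  \left[ w_1\, y_n \log p_n + w_0\, (1 - y_n) \log (1 - p_n) \right],
\qquad p_n = \sigma(\theta^{\top} x_n)
```

Setting $w_0 = w_1 = 1$ recovers the regular log-loss. A cost-based choice would be $w_1 = cb_{11} - cb_{01}$ (what is lost by missing a positive) and $w_0 = cb_{00} - cb_{10}$ (what is lost by a false alarm), so that errors on the more expensive class contribute more to the loss.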