Chapter 8 - Visualizing Model Performance Flashcards
Why would you use of using rankings instead of classifications
- The model gives a score that ranks cases by their likelihood of beloning to a class of interest, but which is not a true probability.
- Not able to obtain accurate probability estimates from classifier.
- Costs and benefits cannot be specified precisely, but we still want to take actions.
How do w choose a proper threshold?
- Threshold is determined where the EV is above desired level (usually 0).
Assumption: We have accurate probability estimates and well-speicfied cost-benefit matrix.
2.
What effect does a threshold have on a Ranking Confusion Matrix
Whenever the threshold changes, the confusion matrix may change as well due to the number of True Positives and False Negatives change.
What is a Profit Curve?
A visualization of all the percentage of the list predicted as positive and the corresponding EV. This curve takes the ranking threshold into account, which shows more positives as the threshold lowers.
What is a Profit Curve?
A visualization of all the percentage of the list predicted as positive and the corresponding EV. This curve takes the ranking threshold into account, which shows more positives as the threshold lowers. Multiple Classifiers can be compared within this graph.
How does a budgetary constraint affect your ranking strategy?
It can change the operating point and the choice of classifier.
Steps:
1. Calculate the number of budget per individual/instance.
2. Calculate the % of individuals you can target of the total customers.
(P. 213)
When do you use Profit Curves?
When you know the conditions under which a classifier will be used and the profit calculation conditions are expected to be stable.
What are the two critical conditions of using profit calculations?
- Class proirs (aka Base Rate): % of positive/negative instances in the target population.
- Costs and benefits: expected profit is sensitive to the relative c/b-levels.
When do you use the ROC graph?
When there is uncertainty in the profit calculations. They are used in for: Classifications, class probability estimations, and scoring.
What is a ROC graph?
This is a two-dimensional plot of a classifier with False Positive rate on the x-axis and True Positives on the y-axis. It shows the trade-off between benefits (True Positives) and costs (False Positives).
What is a discrete classifier?
A classifier that outputs only one class label instead of a ranking. These classifier produce confusion matrices.
What do the points of classifiers on the ROC graph tell you?
Northwest = superior to the others
Lefthand side: Conservative: often low True Positives and False Positives
Right upperhand = Permissive: often high False Positive rates
What is an advantage of ROC graphs?
They decouple classifiers performance from the conditions under which the classifiers will be used.
What is the AUC?
Stand for Area Under the ROC Curve and values from zero to one. This can be used to summarize performance of a classifier into one number. A value of 0.5 corresponds to randomness
What can you use to summarize the predictiveness of a classifier?
The AUC (aka the Wilxocon measure).