Chapter 26 Probability Scoring Metrics Flashcards
Log loss, also called ____, ____, or ____, can be used as a measure for evaluating predicted probabilities. Each predicted probability is compared to the actual class output value (0 or 1), and a score is calculated that penalizes the probability based on the distance from the expected value. The penalty is logarithmic.
P 261
logistic loss, logarithmic loss, cross-entropy
The log loss can be implemented in Python using the ____ function in scikit-learn.
P 262
log_loss()
In the binary classification case, the function takes a list of true outcome values and a list of
probabilities as arguments and calculates the average log loss for the predictions.
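For illustration, a minimal sketch; the toy labels and probabilities are invented:

# minimal log_loss sketch with invented example values
from sklearn.metrics import log_loss

y_true = [0, 0, 1, 1]            # actual class labels
y_prob = [0.1, 0.3, 0.8, 0.9]    # predicted probabilities for class 1

# average log loss across the predictions; smaller is better
print(log_loss(y_true, y_prob))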
Why is log loss not suitable for imbalanced data?
P 263
As an average, we can expect that the score will be suitable with a balanced dataset and misleading when there is a large imbalance between the two classes in the test set.
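A small invented example makes the caveat concrete: on a heavily imbalanced test set, a constant low probability for the positive class earns a deceptively small average log loss despite having no skill.

# hedged illustration with invented data: a 99:1 test set
from sklearn.metrics import log_loss

y_true = [0] * 99 + [1]          # severe class imbalance
y_prob = [0.01] * 100            # naive constant prediction, no skill

print(log_loss(y_true, y_prob))  # small score (about 0.056) despite no skill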
In Brier scoring, predictions that are farther away from the expected probability are penalized, but more severely than in the case of log loss. True/False
P 265
False. Predictions that are farther away from the expected probability are penalized, but less severely than in the case of log loss.
The Brier score can be calculated in Python using the ____ function in scikit-learn.
P 265
brier_score_loss()
The skill of a model can be summarized as the average Brier score across all probabilities predicted for a test dataset. This function takes the true class values (0, 1) and the predicted probabilities for all examples in a test dataset as arguments and returns the average Brier score.
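A minimal sketch with invented example values:

# minimal brier_score_loss sketch with invented data
from sklearn.metrics import brier_score_loss

y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.3, 0.8, 0.9]

# mean squared error between probabilities and outcomes; smaller is better
print(brier_score_loss(y_true, y_prob))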
Why Brier score can be misleading when there’s a large imbalance between the classes?
P 266
Model skill is reported as the average Brier score across the predictions in a test dataset. As with log loss, we can expect that the score will be suitable with a balanced dataset and misleading when there is a large imbalance between the two classes in the test set.
The Brier error score is always between ____ and ____, where a model with perfect skill has a score of ____.
P 265
0.0, 1.0, 0.0
What’s Brier Skill Score (BSS)?
P 268
The Brier Skill Score reports the relative skill of the probability prediction over the naive forecast.
BSS = 1 - (BS / BSref)
where BS is the Brier score of the model, and BSref is the Brier score of the naive prediction.
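A sketch of the calculation with invented data, taking the naive forecast to be a constant base-rate prediction (an assumption; other baselines are possible):

# Brier Skill Score sketch with invented data
from sklearn.metrics import brier_score_loss

y_true = [0, 0, 0, 1, 1]
y_prob = [0.1, 0.2, 0.1, 0.8, 0.7]       # model's predicted probabilities

# reference: predict the base rate of the positive class for every example
base_rate = sum(y_true) / len(y_true)
bs_ref = brier_score_loss(y_true, [base_rate] * len(y_true))
bs = brier_score_loss(y_true, y_prob)

bss = 1.0 - bs / bs_ref                  # > 0 means skill over the baseline
print(bss)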
When does tuning the threshold become important?
P 269
Tuning the threshold by the operator is particularly important on problems where one type of error is more or less important than another or when a model makes disproportionately more or less of a specific type of error.
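For example, a hedged sketch of applying a custom decision threshold to predicted probabilities; the 0.3 cutoff and the probabilities are invented:

# map probabilities to class labels with a non-default threshold
import numpy as np

y_prob = np.array([0.1, 0.25, 0.4, 0.8])

threshold = 0.3                          # lowered to catch more positives
y_pred = (y_prob >= threshold).astype(int)
print(y_pred)                            # [0 0 1 1]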
The Receiver Operating Characteristic, or ROC, curve is a plot of ____ versus ____ for the predictions of a model for multiple thresholds between 0.0 and 1.0.
P 269
the true positive rate, the false positive rate
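A minimal plotting sketch with invented data, using scikit-learn's roc_curve to obtain the (FPR, TPR) pairs at each candidate threshold:

# ROC curve sketch with invented data
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_prob)
plt.plot(fpr, tpr, marker='.')
plt.plot([0, 1], [0, 1], linestyle='--')  # no-skill diagonal
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.show()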
What does ROC-AUC show?
P 271
The integrated area under the ROC curve, called AUC or ROC AUC, provides a measure of the skill of the model across all evaluated thresholds.
The ROC-AUC score can be calculated in Python using the ____ function in scikit-learn.
P 271
roc_auc_score()
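A minimal sketch with invented example values:

# minimal roc_auc_score sketch with invented data
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print(roc_auc_score(y_true, y_prob))     # 1.0 is perfect, 0.5 is no skill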
An ROC-AUC score is a measure of the likelihood that the model that produced the predictions will rank a randomly chosen positive example above a randomly chosen negative example. Specifically, that the probability will be higher for a real event (class = 1) than a real non-event (class = 0). This is an instructive definition that offers two important intuitions:
Naive Prediction.(under ROC AUC)
Insensitivity to Class Imbalance.
Explain what each of these intuitions mean.
P 271
- A naive prediction under ROC AUC is any constant probability. If the same probability is predicted for every example, there is no discrimination between positive and negative cases, and therefore the model has no skill (AUC = 0.5); see the sketch below.
- ROC AUC is a summary of the model's ability to correctly discriminate a single example across different thresholds. As such, it is unconcerned with the base likelihood of each class.
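A quick demonstration of the naive-prediction intuition, with invented data:

# a constant probability gives AUC = 0.5, i.e. no discrimination
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_prob = [0.5, 0.5, 0.5, 0.5]            # same probability everywhere

print(roc_auc_score(y_true, y_prob))     # 0.5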
Why is ROC AUC a better tool for model selection rather than for quantifying the practical skill of a model’s predicted probabilities?
P 272
An important consideration in choosing the ROC AUC is that it does not summarize the specific discriminative power of the model, but rather the general discriminative power across all thresholds. It might be a better tool for model selection rather than for quantifying the practical skill of a model’s predicted probabilities.