4 Postprocessing Flashcards

1
Q

Question 1
Level: difficult
Which of the following statements is FALSE? Give answer D if all statements are true.
a) The evaluation measures that are used in the single-sample method for evaluating model performance, e.g., the AIC and BIC measures, evaluate the trade-off between the fit of the model and its complexity. They are only meaningful to compare the performance of two models that are built on the same data set.
b) An intuitive interpretation of the AUC performance measure is that it provides an estimate of the probability that a randomly chosen instance of the positive class is correctly ranked higher than a randomly chosen instance of the negative class.
c) For evaluating the performance of a classification model, distance measures such as the divergence measure or Euclidean distance can be used for measuring the separation of the distributions of positives and negatives, taking into account both the difference between the average value of the probability estimates and the variability around the mean of the estimates.
d) All the above statements are true.

A

c) is the false statement. The Euclidean distance between two points in Euclidean space is simply the length of the line segment between them; it does not take the variability around the mean of the probability estimates into account.

a: true
b: true
c: false (the separation measure described corresponds to the KS distance, not the Euclidean distance)
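A minimal sketch, assuming NumPy and SciPy are available and using made-up scores, of how such a separation measure (here the KS distance) can be computed between the score distributions of the positives and the negatives:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical model scores: positives tend to score higher than negatives.
scores_pos = rng.beta(5, 2, size=1000)   # scores of the positive class
scores_neg = rng.beta(2, 5, size=1000)   # scores of the negative class

# KS distance = maximum vertical distance between the two empirical CDFs;
# the larger it is, the better the model separates the two classes.
ks_stat, p_value = ks_2samp(scores_pos, scores_neg)
print(f"KS distance: {ks_stat:.3f}")
```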

2
Q

Question 2
Level: medium
Which statement about the ROC curve is TRUE?
a) The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings, and the ROC space is defined by FPR as y axis and TPR as x axis.
b) The best possible prediction method would yield a point with (x,y)-coordinate equal to (1,0) in the ROC space, representing 100% sensitivity (no false negatives) and 100% specificity (no false positives). The (1,0) point is also called a perfect classification.
c) The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The true-positive rate is also known as specificity or recall. The false-positive rate is equal to (1 − sensitivity).
d) The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The true-positive rate is also known as sensitivity or recall. The false-positive rate is equal to (1 − specificity).

A

d) The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The true-positive rate is also known as sensitivity or recall. The false-positive rate is equal to (1 − specificity).

a: false; TPR belongs on the y-axis and FPR on the x-axis, not the other way around
b: false; perfect classification is the point (0,1), not (1,0)
c: false; TPR is sensitivity, not specificity
d: correct
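A minimal sketch, assuming scikit-learn is available and using made-up labels and scores, of how the TPR/FPR pairs behind the ROC curve are obtained by sweeping the threshold:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical labels and model scores.
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9])

# roc_curve sweeps the decision threshold and returns FPR (x-axis)
# and TPR (y-axis) at each threshold setting.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("FPR:", fpr)
print("TPR:", tpr)
print("AUC:", roc_auc_score(y_true, y_score))
```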

3
Q

Question 1
Explain the differences and similarities between the ROC, Gini and lift curve for evaluating a classification model.

A

Receiver Operating Characteristic (ROC)
= graph showing the trade-off between the true positive rate (sensitivity) and the false positive rate (1 − specificity) at various threshold settings for a binary classification model.
x-axis = FPR (1 − specificity), y-axis = TPR (sensitivity)
The higher the ROC curve (the larger the area under it), the better the discrimination.

Gini
= inequality measure, equal to twice the area between the ROC curve and the diagonal (Gini = 2·AUC − 1).
1 indicates a perfect model; 0 corresponds to random model performance (AUC = 0.5).

Lift curve
= comparison between the scorecard and the random baseline.
x-axis = % of the population examined (ranked by score), y-axis = lift = TP / expected TP under random selection.
A steeper lift curve indicates better model performance.

Differences:
Metrics Represented: ROC focuses on true positive rate and false positive rate, Gini on the area between the ROC curve and the diagonal, and lift on the ratio of true positive rate to expected true positive rate.
Each curve provides a different perspective on model performance. ROC emphasizes the trade-off between sensitivity and specificity, Gini emphasizes the ranking of positive instances, and lift emphasizes the improvement over random guessing.

Similarities:
graph: All three are graphical tools used to evaluate the performance of binary classification models.
Performance Comparison: Higher values or steeper curves in each metric generally indicate better model performance.
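A minimal sketch, with made-up data and assuming scikit-learn, showing how the three views relate: AUC from the ROC curve, Gini as 2·AUC − 1, and lift as the ratio of the TP rate in the examined fraction to the overall TP rate:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Hypothetical labels and scores (positives tend to score higher).
y_true  = rng.integers(0, 2, size=1000)
y_score = np.where(y_true == 1,
                   rng.beta(4, 2, size=1000),
                   rng.beta(2, 4, size=1000))

auc  = roc_auc_score(y_true, y_score)
gini = 2 * auc - 1   # twice the area between the ROC curve and the diagonal

# Lift when examining the top 10% of the population, ranked by score:
depth = 0.10
n_top = int(depth * len(y_score))
top   = np.argsort(y_score)[::-1][:n_top]    # highest-scoring 10%
lift  = y_true[top].mean() / y_true.mean()   # TP rate in top 10% / overall TP rate
print(f"AUC={auc:.3f}  Gini={gini:.3f}  lift@10%={lift:.2f}")
```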

4
Q

Question 2
How do you evaluate a regression model?

A
  • R-squared (and adjusted R-squared) = a higher R² means a better fit of the model on the data, i.e., the independent variables explain a larger proportion of the variance.
  • Mean squared error (MSE), or its root (RMSE) = average squared difference between predicted and actual values; a lower MSE indicates a better model.
  • Mean absolute deviation (MAD) = average absolute difference between predicted and actual values; a lower MAD indicates better model accuracy.
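A minimal sketch, assuming scikit-learn and using made-up values, computing the three metrics above:

```python
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

# Hypothetical actual vs. predicted values.
y_true = np.array([3.0, 5.0, 2.5, 7.0, 4.5])
y_pred = np.array([2.8, 5.4, 2.9, 6.5, 4.6])

r2   = r2_score(y_true, y_pred)
mse  = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                          # same units as the target
mad  = mean_absolute_error(y_true, y_pred)   # mean absolute deviation
print(f"R²={r2:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  MAD={mad:.3f}")
```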
5
Q

Question 3
Why would you adopt an out-of-time evaluation methodology for evaluating a classification model? Explain using the example of a customer churn prediction model.

A

Out-of-time evaluation = training the model on data from one time period and evaluating it on data from a later period. It is used when the training data may not be representative of the data at deployment time, because some variables lose predictive power over time (reduction of generalization power).

In churn prediction, customer behavior may change over time due to e.g. seasonality, marketing campaigns, or economic conditions, so the model should be tested on a later time window than the one it was trained on.
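A minimal sketch, using a made-up pandas DataFrame, of an out-of-time split: train on an earlier period and test on a later one, instead of a random split that mixes both:

```python
import pandas as pd

# Hypothetical churn dataset with an observation date per customer.
df = pd.DataFrame({
    "obs_date": pd.to_datetime(["2022-01-15", "2022-03-02", "2022-06-20",
                                "2023-01-10", "2023-02-28", "2023-04-05"]),
    "tenure":   [12, 5, 30, 8, 22, 3],
    "churned":  [0, 1, 0, 1, 0, 1],
})

# Out-of-time split: fit on 2022, evaluate on 2023.
cutoff = pd.Timestamp("2023-01-01")
train = df[df["obs_date"] <  cutoff]
test  = df[df["obs_date"] >= cutoff]
print(len(train), "training rows,", len(test), "out-of-time test rows")
```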

6
Q

Question 4
What is the relation between single-sample evaluation approaches, complexity and overfitting?

A

The single-sample approaches are:
(1) Akaike Information Criterion (AIC)
(2) (Schwarz) Bayesian Information Criterion, (S)BIC

Complexity: these criteria punish the model for complexity (i.e., more parameters), because a more complex model fits the training data better but also fits the noise, which is overfitting.
The aim is to have a small error and to reduce complexity as much as possible at the same time.
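A minimal sketch of the standard formulas, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of parameters, n the sample size, and L the maximized likelihood (the numbers below are made up):

```python
import numpy as np

def aic(log_likelihood: float, k: int) -> float:
    """AIC = 2k - 2 ln(L): each extra parameter costs 2."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood: float, k: int, n: int) -> float:
    """BIC = k ln(n) - 2 ln(L): the penalty grows with sample size."""
    return k * np.log(n) - 2 * log_likelihood

# Hypothetical fits on the SAME data set (the only setting in which
# these criteria are comparable, per Question 1 above):
print(aic(log_likelihood=-120.0, k=3))           # simpler model
print(aic(log_likelihood=-115.0, k=8))           # more complex model
print(bic(log_likelihood=-115.0, k=8, n=500))    # lower is better
```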

7
Q

Question 5
What is data leakage and why is it important in evaluating a predictive model?

A

Data leakage occurs when the model has information about the test sample in its training set. It leads to overly optimistic performance estimates during evaluation but poor performance on new, unseen data.

Proper separation of the training and test sets is needed to prevent data leakage and to ensure the validity of the performance estimates.
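A minimal sketch, assuming scikit-learn, of a classic leakage pattern and its fix: fitting a scaler on all rows before splitting leaks test-set statistics into training, whereas fitting it on the training rows only does not:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))  # made-up feature matrix

# LEAKY: the scaler is fit on ALL rows, so the training features are
# standardized using statistics that include the test rows.
X_all_scaled = StandardScaler().fit_transform(X)
X_train_leaky, X_test_leaky = train_test_split(X_all_scaled, random_state=0)

# CORRECT: split first, then fit the preprocessing on the training set only.
X_train, X_test = train_test_split(X, random_state=0)
scaler = StandardScaler().fit(X_train)   # learns mean/std from train only
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
print(X_train.shape, X_test.shape)
```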
