Quiz 3 Flashcards
three main types of outcomes of interest
predicted numerical value
predicted class membership
propensity (probability of class membership when the outcome is categorical)
predicting class membership
classification
detecting which new records are most likely to belong to the class of interest
ranking
predictive accuracy measures
mean absolute error/deviation, average error, mean absolute percentage error, root mean squared error, total sum of squared errors
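A minimal sketch of the measures listed above, assuming paired lists of actual and predicted values. The function name `accuracy_measures` and the sample numbers are made up for illustration.

```python
import math

def accuracy_measures(actual, predicted):
    """Return common prediction-error summaries for paired numeric lists."""
    # error = actual - prediction
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    sse = sum(e ** 2 for e in errors)  # total sum of squared errors
    return {
        "average_error": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE assumes no actual value is zero
        "mape": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "rmse": math.sqrt(sse / n),
        "sse": sse,
    }

# Hypothetical data: three records
m = accuracy_measures([100, 200, 400], [110, 190, 380])
print(m)
```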
compares the model’s predictive performance to a baseline model that has no predictors
lift chart
what is a lift chart looking for?
subset of records that has the highest cumulative predicted values
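One way to sketch the numbers behind a lift chart: sort records by predicted value, accumulate the actual values, and compare against a no-predictor baseline that gives every record the overall average. The function names and data here are hypothetical.

```python
def cumulative_gains(actual, predicted):
    """Cumulative actual values after sorting records by predicted value, descending."""
    order = sorted(range(len(actual)), key=lambda i: predicted[i], reverse=True)
    total, gains = 0, []
    for i in order:
        total += actual[i]
        gains.append(total)
    return gains

def baseline_gains(actual):
    """Baseline with no predictors: each record contributes the overall average."""
    mean = sum(actual) / len(actual)
    return [mean * (k + 1) for k in range(len(actual))]

# Hypothetical validation data: 1 = responder, scores = predicted propensities
actual = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.3]
model = cumulative_gains(actual, scores)
base = baseline_gains(actual)
print(model, base)
```

The model curve sitting above the baseline for the first few records is what the lift chart is looking for.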
observation that belongs to one class but model put it in another
misclassification error
summarizes the correct and incorrect classifications that a classifier produced for a certain dataset (validation)
classification/confusion matrix
overall error rate
incorrect/total
error
actual - prediction
ability to correctly detect all important class members
sensitivity of classifier
ability to rule out negative class members correctly
specificity of classifier
plots sensitivity against 1 - specificity
ROC curve
ability to detect only the important class members
precision of classifier
average cost of misclassification per classified observation
average misclassification cost
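A small sketch of the calculation, assuming each false positive and each false negative carries its own cost. The cost values and counts are made up for illustration.

```python
def average_misclassification_cost(fp, fn, total, cost_fp, cost_fn):
    """Total cost of the two kinds of error, divided by all classified observations."""
    return (fp * cost_fp + fn * cost_fn) / total

# Hypothetical: 8 records, 1 false positive (cost 1), 1 false negative (cost 5)
avg_cost = average_misclassification_cost(fp=1, fn=1, total=8, cost_fp=1, cost_fn=5)
print(avg_cost)  # 0.75
```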
score the model on a validation set drawn as a simple random sample (no oversampling), or score the model on an oversampled validation set and reweight the results to remove the effects of oversampling
how to adjust for oversampling
how many responders in the whole data each responder in the sample is worth
oversampling weights
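A sketch of the reweighting idea, assuming the weight for each class is its rate in the whole data divided by its rate in the oversampled set. The rates and confusion-matrix counts are hypothetical.

```python
def oversampling_weights(pop_rate, sample_rate):
    """Weights for responders and non-responders in an oversampled set."""
    w_responder = pop_rate / sample_rate
    w_nonresponder = (1 - pop_rate) / (1 - sample_rate)
    return w_responder, w_nonresponder

# Hypothetical: 2% responders in the whole data, 50% in the oversampled set
w_resp, w_non = oversampling_weights(0.02, 0.50)

# Hypothetical counts from the oversampled validation set
tp, fn, fp, tn = 80, 20, 30, 70

# Rows with actual = yes get the responder weight, actual = no the other weight
w_tp, w_fn = tp * w_resp, fn * w_resp
w_fp, w_tn = fp * w_non, tn * w_non
adjusted_error = (w_fn + w_fp) / (w_tp + w_fn + w_fp + w_tn)
print(w_resp, w_non, adjusted_error)
```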
accurately classify the most interesting/important cases
goal of ranking
actual is no, predicted is no (0,0)
true negative
actual is yes, predicted is yes (1,1)
true positive
actual is yes, predicted is no (1,0)
false negative
actual is no, predicted is yes (0,1)
false positive
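The four cells above can be counted directly from actual and predicted labels. This is a minimal sketch with made-up labels, where 1 = yes and 0 = no.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels coded 1 (yes) and 0 (no)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1   # actual yes, predicted yes
        elif a == 0 and p == 0:
            counts["TN"] += 1   # actual no, predicted no
        elif a == 1 and p == 0:
            counts["FN"] += 1   # actual yes, predicted no
        else:
            counts["FP"] += 1   # actual no, predicted yes
    return counts

# Hypothetical validation labels
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
c = confusion_counts(actual, predicted)

# overall error rate = incorrect / total
error_rate = (c["FN"] + c["FP"]) / len(actual)
print(c, error_rate)
```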
accuracy
1 - error
error with confusion matrix
(false negative + false positive) / total
if it is important to predict positive values correctly what should you do?
lower the cut-off
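A quick illustration of why lowering the cut-off helps: more records get classified as positive, so sensitivity rises (usually at the price of more false positives). The propensities and cut-off values are hypothetical.

```python
def classify(propensities, cutoff):
    """Label a record 1 when its propensity is at or above the cut-off."""
    return [1 if p >= cutoff else 0 for p in propensities]

def sensitivity(actual, predicted):
    """true positive / (true positive + false negative)"""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp / (tp + fn)

# Hypothetical data: three actual positives, three actual negatives
actual = [1, 1, 1, 0, 0, 0]
props = [0.9, 0.6, 0.35, 0.4, 0.2, 0.1]

sens_default = sensitivity(actual, classify(props, 0.5))  # misses the 0.35 record
sens_lowered = sensitivity(actual, classify(props, 0.3))  # catches all positives
print(sens_default, sens_lowered)
```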
sensitivity
true positive / (true positive + false negative)
F1
(2 * precision * recall) / (precision + recall)
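A sketch that ties the formulas together from confusion-matrix counts, with recall taken to be the same thing as sensitivity. The counts are invented for illustration.

```python
def classifier_metrics(tp, fn, fp, tn):
    """Sensitivity (recall), specificity, precision, and F1 from the four cells."""
    sensitivity = tp / (tp + fn)   # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = (2 * precision * sensitivity) / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts
metrics = classifier_metrics(tp=40, fn=10, fp=20, tn=30)
print(metrics)
```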
what is the ideal part of a ROC curve?
top left, high sensitivity and high specificity
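A rough sketch of how ROC points can be computed: sweep the cut-off from high to low and record (1 - specificity, sensitivity) at each step. Points near (0, 1), the top-left corner, are the ideal. The function name and data are hypothetical.

```python
def roc_points(actual, propensities):
    """Return (1 - specificity, sensitivity) pairs, starting from (0, 0)."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # cut-off above every propensity: nothing predicted positive
    for cutoff in sorted(set(propensities), reverse=True):
        tp = sum(1 for a, p in zip(actual, propensities) if a == 1 and p >= cutoff)
        fp = sum(1 for a, p in zip(actual, propensities) if a == 0 and p >= cutoff)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical validation data
actual = [1, 1, 0, 1, 0, 0]
props = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
points = roc_points(actual, props)
print(points)
```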