Data Mining - Chapter 5 (Performance Measures) Flashcards
Why do we need to evaluate our models?
- Allows you to convince others that your work is meaningful
- Without strong evaluation, your idea is likely to be rejected, or your code would not be deployed
- Empirical evaluation helps guide meaningful research and development directions
What is a benefit of having a large training data set?
The larger the training data set, the better the classifier tends to be.
What is a benefit of having a large test data set?
The larger the test data set, the more accurate the error estimate.
What do errors based on the training set tell us?
They give us information about the fit of the model
What do errors based on the validation/testing set tell us?
They measure the model’s ability to predict new data
What three types of outcomes exist in prediction through supervised learning?
- Predicted numerical value
- Predicted class membership
- Propensity - probability of class membership
What do we focus on when we are evaluating predictive performance (with numerical variables)?
We measure accuracy by using the prediction errors on the validation/test set.
All the measures are based on the prediction error. For a single record this is computed by subtracting the predicted outcome value from the actual outcome value:
e_i = y_i - ŷ_i
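For example, if the actual value is y_i = 10 and the model predicts ŷ_i = 12, the error is e_i = 10 - 12 = -2 (an over-prediction).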
Which five accuracy measures are there for models that predict numerical values?
- Mean absolute error (MAE)
- Mean error
- Mean percentage error (MPE)
- Mean absolute percentage error (MAPE)
- Root mean squared error (RMSE)
–> Check the slides for the formulas.
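A minimal sketch (not from the slides) of how these five measures could be computed on a validation set, assuming NumPy and hypothetical actual/predicted arrays:
```python
import numpy as np

# Hypothetical actual outcomes and model predictions on the validation set
actual = np.array([120.0, 95.0, 150.0, 80.0])
predicted = np.array([110.0, 100.0, 140.0, 90.0])

errors = actual - predicted                      # e_i = y_i - ŷ_i

mean_error = errors.mean()                       # ME: positive and negative errors can cancel
mae = np.abs(errors).mean()                      # MAE
mpe = 100 * (errors / actual).mean()             # MPE (in %)
mape = 100 * np.abs(errors / actual).mean()      # MAPE (in %)
rmse = np.sqrt((errors ** 2).mean())             # RMSE

print(mean_error, mae, mpe, mape, rmse)
```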
What is the benefit of the mean percentage error(MPE) ?
It takes the direction (sign) of the errors into account, so it shows whether the model systematically over- or under-predicts.
What do you need to take into account when using any of these mean-based measures?
The measures are affected by outliers.
What is a Lift chart?
A graphical way to assess predictive performance. You use this when your goal is to search for a subset of records that gives the highest cumulative predicted values (ranking).
-> The predictive performance is compared against a baseline model without predictors (average).
What is called the ‘lift’?
The ratio of model gains to naive benchmark gains.
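As a rough illustration (not part of the slides), the cumulative gains and lift could be computed like this, assuming hypothetical actual outcomes and predicted scores:
```python
import numpy as np

# Hypothetical actual outcomes and predicted values on the validation set
actual = np.array([200.0, 50.0, 120.0, 10.0, 90.0, 30.0])
predicted = np.array([180.0, 60.0, 130.0, 20.0, 70.0, 40.0])

# Rank records from highest to lowest predicted value
order = np.argsort(predicted)[::-1]
cumulative_gains = np.cumsum(actual[order])

# Naive benchmark without predictors: every record contributes the average
naive_gains = np.arange(1, len(actual) + 1) * actual.mean()

# Lift at each cut-off = model gains / naive benchmark gains
lift = cumulative_gains / naive_gains
print(lift)   # values above 1 mean the ranking beats the naive benchmark
```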
What do we do when we are evaluating the performance of predicted class membership (classifiers)?
We look at how well our model is doing, or compare multiple models, based on their accuracy in classifying records into classes.
We calculate the accuracy by subtracting the misclassification error from 1.
This is mainly done by using a confusion/classification matrix.
What is the confusion/classification matrix?
It is a matrix in which the predicted classes are compared to the actual classes. The actual classes are shown in the rows (y-axis) and the predicted classes in the columns (x-axis).
The matrix will contain numbers for:
- True positive
- True negative
- False positive
- False negative
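A minimal sketch, assuming hypothetical binary labels (1 = positive class), of how the four matrix cells lead to the accuracy (1 minus the misclassification error):
```python
# Hypothetical actual and predicted class labels
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))  # true positives
tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))  # true negatives
fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))  # false positives (Type I errors)
fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))  # false negatives

misclassification_error = (fp + fn) / len(actual)
accuracy = 1 - misclassification_error          # equals (tp + tn) / len(actual)

print([[tp, fn],   # row: actual positive
       [fp, tn]])  # row: actual negative
print(accuracy)
```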
What is a Type I error?
A false positive.