7 - Model Evaluation Flashcards
What is the primary focus of model evaluation in data science?
To evaluate the usefulness of models in making predictions.
What is the difference between evaluation and validation?
Validation ensures consistency between training and test data sets, while evaluation measures accuracy and error rates.
What type of models are discussed in the context of evaluation measures in this chapter?
Classification models, specifically decision trees.
What is a contingency table?
A table that summarizes the performance of a classification model by comparing predicted and actual outcomes.
Define the terms TN, FP, FN, and TP in the context of a contingency table.
- TN: True Negatives
- FP: False Positives
- FN: False Negatives
- TP: True Positives
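The four cells can be tallied directly from paired actual and predicted labels. The following is a minimal sketch, assuming 0/1 coding with 1 as the positive class; the helper name `contingency_counts` and the label lists are hypothetical and used only for illustration.

```python
from collections import Counter

def contingency_counts(actual, predicted):
    """Count TN, FP, FN, TP for 0/1 labels (1 = positive)."""
    # Each pair is (actual label, predicted label).
    pairs = Counter(zip(actual, predicted))
    return {
        "TN": pairs[(0, 0)],  # actual negative, predicted negative
        "FP": pairs[(0, 1)],  # actual negative, predicted positive
        "FN": pairs[(1, 0)],  # actual positive, predicted negative
        "TP": pairs[(1, 1)],  # actual positive, predicted positive
    }

# Hypothetical labels, for illustration only.
actual    = [0, 0, 0, 0, 1, 1, 1, 0]
predicted = [0, 1, 0, 0, 1, 0, 1, 0]
counts = contingency_counts(actual, predicted)
```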
What does accuracy measure in model evaluation?
The proportion of correct classifications made by the model.
What is the formula for calculating the error rate?
Error Rate = 1 - Accuracy.
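As a rough sketch of how these two measures relate, the functions below compute accuracy and error rate from the four contingency-table cells. The function names and the counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def accuracy(tn, fp, fn, tp):
    """Proportion of all records classified correctly."""
    return (tn + tp) / (tn + fp + fn + tp)

def error_rate(tn, fp, fn, tp):
    """Proportion of all records classified incorrectly."""
    return 1 - accuracy(tn, fp, fn, tp)

# Hypothetical counts, for illustration only.
acc = accuracy(tn=4, fp=1, fn=1, tp=2)    # (4 + 2) / 8
err = error_rate(tn=4, fp=1, fn=1, tp=2)  # 1 - acc
```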
What do sensitivity and specificity measure in classification models?
- Sensitivity: Ability to classify positive records correctly
- Specificity: Ability to classify negative records correctly
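Both measures are ratios taken within one actual class. A minimal sketch, assuming the usual definitions sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the function names and counts are hypothetical.

```python
def sensitivity(tp, fn):
    """Share of actual positives that the model classified as positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of actual negatives that the model classified as negative."""
    return tn / (tn + fp)

# Hypothetical counts, for illustration only.
sens = sensitivity(tp=2, fn=1)  # 2 of 3 actual positives caught
spec = specificity(tn=4, fp=1)  # 4 of 5 actual negatives caught
```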
How is precision defined?
Precision = TP / TPP, where TPP is the total number of records predicted positive (TP + FP).
What does recall measure?
Recall is another term for sensitivity, measuring the proportion of actual positives captured by the model.
What are Fβ scores used for?
To combine precision and recall into a single measure.
What does F1 score represent?
The harmonic mean of precision and recall, with equal weighting.
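The sketch below shows one common way to write the Fβ score, assuming the standard form Fβ = (1 + β²) · precision · recall / (β² · precision + recall). With β = 1 it reduces to the harmonic mean (F1); β greater than 1 weights recall more heavily. The function names and counts are hypothetical.

```python
def precision(tp, fp):
    """Share of predicted positives that are actually positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of actual positives that were predicted positive."""
    return tp / (tp + fn)

def f_beta(p, r, beta=1.0):
    """Combine precision p and recall r; beta > 1 favors recall."""
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)

# Hypothetical counts, for illustration only.
p = precision(tp=2, fp=2)    # 2 / 4
r = recall(tp=2, fn=1)       # 2 / 3
f1 = f_beta(p, r)            # equal weighting
f2 = f_beta(p, r, beta=2.0)  # recall weighted more heavily
```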
What is the method for model evaluation?
- Develop the model using the training data set
- Evaluate the model using the test data set
What is the target variable in the clothing data example?
Response, coded as 1 for positive and 0 for negative.
What are the three continuous predictors in the clothing data example?
- Days since Purchase
- # of Purchase Visits
- Sales per Visit
What does the accuracy of Model 1 indicate?
Model 1 correctly classified 84.10% of the records (accuracy = 0.8410).
What is the baseline performance for the All Negative Model?
Accuracy = TAN / GT, where TAN is the total number of actually negative records and GT is the grand total of records.
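A model that predicts negative for every record gets exactly the actual negatives right, which is why its accuracy is the share of actual negatives. The sketch below compares a model against that baseline; the function name and counts are hypothetical and are not the clothing data results.

```python
def all_negative_accuracy(tn, fp, fn, tp):
    """Accuracy of a model that predicts negative for every record."""
    total_actually_negative = tn + fp  # TAN
    grand_total = tn + fp + fn + tp    # GT
    return total_actually_negative / grand_total

# Hypothetical counts, for illustration only.
baseline = all_negative_accuracy(tn=4, fp=1, fn=1, tp=2)  # 5 / 8
model_accuracy = (4 + 2) / 8                              # (TN + TP) / GT
beats_baseline = model_accuracy > baseline
```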
What does a specificity of 0.9541 indicate about Model 1?
The model correctly classified 95.41% of actual negative records.
What is indicated by Model 1’s sensitivity of 0.2804?
Only 28.04% of actual positive records were classified as positive.
How is precision calculated for Model 1?
Precision = TP / TPP, the true positives divided by the total records Model 1 predicted positive.
What does the F1 score of 0.372 signify?
It is the harmonic mean of Model 1's precision and recall; the harmonic mean is pulled toward the smaller of the two values, here the recall (sensitivity) of 0.2804.
What are the steps to perform model evaluation using R?
- Develop Model 1 using training data
- Run test data through Model 1
What command is used to create a contingency table in R?
table() command.