Machine Learning - Model Evaluation Flashcards
Accuracy
Percent of all predictions that were correct.
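In terms of the confusion-matrix counts (TP, TN, FP, FN):
Accuracy = (TP + TN) / (TP + TN + FP + FN)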
Confusion Matrix
A matrix showing the predicted and actual classifications. A confusion matrix is of size LxL, where L is the number of different label values. Each row corresponds to an actual label value and each column to a predicted label value, so the cell at row i, column j counts the examples of actual class i that were predicted as class j.
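A minimal sketch of building such a matrix from two label lists, assuming integer labels 0..L-1 (the example data are made up for illustration):

import numpy as np

def confusion_matrix(actual, predicted, num_labels):
    # cell (i, j) counts examples whose actual label is i and predicted label is j
    cm = np.zeros((num_labels, num_labels), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

# hypothetical three-class example
actual    = [0, 1, 2, 2, 1, 0]
predicted = [0, 2, 2, 2, 1, 1]
print(confusion_matrix(actual, predicted, 3))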
Cross-validation: Overview
A method for estimating the accuracy of an inducer by dividing the data into k mutually exclusive subsets, or folds, of approximately equal size. The inducer is trained and tested k times. Each time it is trained on the data set minus a fold and tested on that fold. The accuracy estimate is the average accuracy for the k folds.
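A minimal sketch of k-fold cross-validation using scikit-learn's cross_val_score; the iris data and logistic-regression model are placeholders for any dataset and inducer:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5)  # accuracy on each of the k=5 folds
print(scores.mean())                       # the cross-validated accuracy estimate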
Cross-validation: How
Leave-one-out cross validation
K-fold cross validation
Training and validation data sets have to be drawn from the same population
The step of choosing the kernel parameters of an SVM should be cross-validated as well, as sketched below
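A minimal sketch of cross-validating the choice of SVM kernel parameters with a grid search; the parameter grid and dataset are placeholders:

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)  # 5-fold CV per parameter setting
search.fit(X, y)
print(search.best_params_, search.best_score_)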
Model Comparison
…
Model Evaluation: Adjusted R^2 (R-Square)
The method preferred by statisticians for determining which variables to include in a model. It is a modified version of R^2 that penalizes each new variable on the basis of how many have already been admitted. By construction, R^2 always increases as you add new variables, which results in models that over-fit the data and have poor predictive ability. Adjusted R^2 results in more parsimonious models that admit new variables only if the improvement in fit is larger than the penalty, which serves the ultimate goal of out-of-sample prediction. (Submitted by Santiago Perez)
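The standard formula, with n the number of observations and p the number of predictors:
Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)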
Model Evaluation: Decision tables
The simplest way of expressing output from machine learning: each cell of the table holds the resulting decision, and the row and column the cell sits in represent the conditions.
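A tiny hypothetical example (the conditions and outcomes are made up purely for illustration):

                 credit = good    credit = bad
income = high    approve          approve
income = low     approve          deny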
Model Evaluation: Mis-classification error
Test error for classification, defined as the summed (or averaged) error over the test set, i.e. the count or fraction of false predictions.
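As a fraction over m test examples, with h the hypothesis and 1{...} the indicator function:
err = (1/m) * sum_{i=1}^{m} 1{ h(x_i) != y_i }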
Model Evaluation: Negative class
The class representing the absence of what we are looking for, conventionally labelled 0 (e.g. not having the symptoms).
Model Evaluation: Positive class
The class representing the presence of what we are looking for, conventionally labelled 1 (e.g. having the symptoms).
Model Evaluation: Precision
Of all predicted positives, how many are actually positive?
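In terms of confusion-matrix counts: Precision = TP / (TP + FP)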
Model Evaluation: Recall
Of all actual positives, how many were predicted as positive?
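In terms of confusion-matrix counts: Recall = TP / (TP + FN)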
Model Evaluation: True negative
The hypothesis correctly predicts a negative output.
Model Evaluation: True positive
The hypothesis correctly predicts a positive output.
Model Selection Algorithm
An algorithm that automatically selects a good model function for a given dataset.
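A minimal sketch of one such procedure, selecting a polynomial degree by validation error; the synthetic data, degree range, and 70/30 split are arbitrary choices for illustration:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# hypothetical noisy regression data
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=200)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

best_degree, best_err = None, float("inf")
for degree in range(1, 10):
    # candidate model: polynomial features of the given degree + linear regression
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    err = mean_squared_error(y_val, model.predict(X_val))
    if err < best_err:
        best_degree, best_err = degree, err

print(best_degree, best_err)  # the selected model and its validation error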