Evaluation Flashcards
What is the formal definition of overfitting?
A predictor F is overfit if we can find another predictor F’ where:
- Etrain(F’) > Etrain(F)
- Egen(F’) < Egen(F)
What is the formal definition of underfitting?
A predictor F is underfit if we can find another predictor F’ with both smaller Etrain and smaller Egen
How is Etrain (training error) computed?
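A standard definition, assuming n training instances (xi, yi) and a loss function (0/1 loss for classification):
Etrain(F) = (1/n) * sum over i of loss(F(xi), yi)
i.e. the average loss of F on the data it was trained on; for classification, the fraction of training instances F misclassifies.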
How is Egen (generalization error) calculated?
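A standard formalization, assuming instances are drawn i.i.d. from an unknown distribution D:
Egen(F) = expected value over (x, y) ~ D of loss(F(x), y)
i.e. the expected error of F on a new, unseen instance; it cannot be computed directly because D is unknown.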
How can we estimate Egen?
Set aside test set and compute Etest (same way as Etrain)
As the size of the test set -> infinity, Etest -> Egen
How can you compute the confidence interval for Egen from Etest?
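A minimal sketch using the standard normal approximation to the binomial, assuming Etest is an error rate measured on n independent test instances (the function name is illustrative):

```python
import math

def egen_confidence_interval(e_test, n, z=1.96):
    """Approximate confidence interval for Egen given Etest.

    The observed error rate e_test on n i.i.d. test instances has
    standard error sqrt(e_test * (1 - e_test) / n); z = 1.96 gives
    roughly a 95% confidence interval.
    """
    se = math.sqrt(e_test * (1 - e_test) / n)
    return max(0.0, e_test - z * se), min(1.0, e_test + z * se)

# Example: 15 errors on 100 test instances
low, high = egen_confidence_interval(0.15, 100)
print(f"Egen is in [{low:.3f}, {high:.3f}] with ~95% confidence")
```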
What do we use training/validation/testing sets for?
- Training set: construct classifier
- Validation set: pick algorithm + tune hyperparameters
- Testing set: estimate future error rate (see the split sketch below)
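A minimal sketch of such a three-way split, assuming scikit-learn and toy data (the 60/20/20 proportions are just an illustrative choice):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy data so the snippet runs end to end.
X, y = make_classification(n_samples=500, random_state=0)

# First carve off the test set (20%), then split the remainder into
# training (60% of the total) and validation (20% of the total).
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

# Training set: fit candidate classifiers.
# Validation set: pick the algorithm and tune hyperparameters.
# Test set: estimate the future error rate of the final choice, once.
```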
How does cross-validation work?
- Randomly split data into k sets
- Test on one portion (train on k-1 others)
- Average error over all k folds
- Final classifier is trained on all data (see the sketch below)
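A minimal sketch of k-fold cross-validation, assuming scikit-learn, toy data, and a decision tree as an illustrative classifier:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
k = 10
errors = []

# Randomly split the data into k folds; each fold is the test set exactly once,
# while the other k-1 folds are used for training.
for train_idx, test_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
    clf = DecisionTreeClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    errors.append(np.mean(clf.predict(X[test_idx]) != y[test_idx]))

# Average the error over all k folds.
print(f"estimated error: {np.mean(errors):.3f}")

# The final classifier is then trained on all of the data.
final_clf = DecisionTreeClassifier(random_state=0).fit(X, y)
```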
What is leave-one-out?
Cross validation where k = # of training instances
What is the problem with leave-one-out validation?
Folds can never be stratified: the class of the held-out instance is always under-represented in the training fold
Example (balanced data): testing fold = { 1 of A, 0 of B }, training fold = { n/2 of B, n/2 - 1 of A }
We would always predict B (the most frequent class in training), but we would always be wrong on the held-out A
What does stratification do?
Keeps the class proportions (roughly) the same across training/testing sets
How do you do stratification?
- Split instances by class
- Split each class into k parts
- Assemble the ith fold by combining one part from each class (see the sketch below)
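A minimal sketch of assembling stratified folds by hand, following the steps above (scikit-learn's StratifiedKFold does the equivalent; names here are illustrative):

```python
import numpy as np

def stratified_folds(y, k, seed=0):
    """Return k index arrays whose class proportions (roughly) match the full data."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    # Split instances by class.
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        # Split this class into k parts, then give the ith part to the ith fold.
        for i, part in enumerate(np.array_split(idx, k)):
            folds[i].extend(part)
    return [np.array(fold) for fold in folds]

# Example: 60 instances of class 0 and 40 of class 1 -> each of 5 folds gets 12 and 8.
y = np.array([0] * 60 + [1] * 40)
for i, fold in enumerate(stratified_folds(y, k=5)):
    print(f"fold {i}: class counts = {np.bincount(y[fold])}")
```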
What is true positive?
Classifier predicts positive, and it is positive
What is true negative?
Classifier predicts negative, and it is negative
What is false positive?
Classifier predicts positive, but it is negative
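A minimal sketch of counting these outcomes from predicted vs. true labels, assuming binary labels with 1 as the positive class:

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))  # predicted positive, actually positive
tn = np.sum((y_pred == 0) & (y_true == 0))  # predicted negative, actually negative
fp = np.sum((y_pred == 1) & (y_true == 0))  # predicted positive, actually negative

print(f"TP={tp}, TN={tn}, FP={fp}")
```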