Lecture 7 - Model Selection and Evaluation Flashcards
What are the two goals of evaluating prediction performance?
- Model selection: almost all methods have parameters that need to be chosen
- Model evaluation: after selecting the hypothesis, estimate how well it generalizes to new data.
What does a loss function tell us?
The price paid for predicting y’ when the true value is y
What is misclassification rate?
The average zero-one loss over the data set, aka the fraction of misclassified instances: 10% misclassified => misclassification rate 0.1
What is classification accuracy?
The fraction of correctly classified instances, aka 1 - misclassification rate
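A minimal sketch in plain Python (the labels below are made up for illustration) showing the zero-one loss, misclassification rate, and accuracy:

```python
# Zero-one loss: 1 if the prediction is wrong, 0 if it is correct.
def zero_one_loss(y_true, y_pred):
    return 0 if y_true == y_pred else 1

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # hypothetical true labels
y_pred = [1, 0, 0, 1, 0, 1, 0, 1, 1, 1]   # hypothetical predictions

# Misclassification rate: average zero-one loss over the data set.
misclassification_rate = sum(
    zero_one_loss(t, p) for t, p in zip(y_true, y_pred)
) / len(y_true)

# Classification accuracy: 1 - misclassification rate.
accuracy = 1 - misclassification_rate

print(misclassification_rate)  # 0.2 (2 of the 10 predictions are wrong)
print(accuracy)                # 0.8
```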
What is the generalization error?
The expected loss over the probability distribution of our data, aka the average loss on a random new instance
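As a sketch in symbols (the notation here is assumed, not taken from the lecture): for a hypothesis h, loss function L, and data distribution P, the generalization error is

```latex
\mathrm{Err}(h) = \mathbb{E}_{(x,y)\sim P}\left[ L\big(y,\, h(x)\big) \right]
```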
What is the standard approach for estimating the generalization error?
Splitting the data into a training set and a test set
What are some estimators of generalization error?
- Training set error
- Test set error
- CV error
- Bootstrap error
What is bias?
Expected difference between estimated and true error, ideally close to zero
What is optimistic/pessimistic bias?
Optimistic bias: systematically estimates error to be smaller than it is
Pessimistic bias: systematically estimates error to be larger than it is
What is variance?
How large the magnitude of the difference between the estimated and the true error tends to be on average.
Can method with zero bias have large variance?
Yes: large negative and positive estimation errors can cancel out on average, so the bias is zero even though individual estimates are far off
What is training set error (resubstitution error)?
The average loss on the training examples.
It has a strong optimistic bias, since the model was chosen because it fits this particular data set well
What is testing set error (holdout estimate)?
Splitting the data into training and test sets, training on the training set and computing the average loss on the test set.
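A minimal sketch of both estimates using scikit-learn (the library, the iris data, and the decision tree are illustrative assumptions, not from the lecture):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out 30% of the data as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Training set error (resubstitution error): optimistically biased.
train_error = 1 - model.score(X_train, y_train)
# Test set error (holdout estimate): computed on data not used for training.
test_error = 1 - model.score(X_test, y_test)
print(train_error, test_error)
```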
Is test set error unbiased?
Only if a single hypothesis is evaluated on the test set; evaluating multiple hypotheses (e.g. different hyperparameter values) and picking the best leads to optimistic bias
What is the solution to avoid optimistic bias in the test set error?
Splitting the data into training, validation, and test sets: the validation set is used for model selection and the test set only for the final evaluation
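A sketch of the three-way split with scikit-learn (the split sizes and the classifier are assumptions): the validation error drives the choice of hyperparameter, and the test set is used only once at the end.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# First split off a test set, then split the remainder into training and validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

# Model selection: pick the hyperparameter value with the smallest validation error.
best_depth, best_val_error = None, float("inf")
for depth in [1, 2, 3, 5, 10]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    val_error = 1 - model.score(X_val, y_val)
    if val_error < best_val_error:
        best_depth, best_val_error = depth, val_error

# Model evaluation: the test set is touched only for the finally selected model.
final_model = DecisionTreeClassifier(max_depth=best_depth, random_state=0).fit(X_train, y_train)
print(1 - final_model.score(X_test, y_test))
```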
What is stratification?
Creating the data splits so that the class distribution in each split is as close as possible to the class distribution of the whole data set.
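A minimal sketch with scikit-learn (the imbalanced labels are fabricated for illustration): passing stratify=y makes both splits keep roughly the same class proportions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)      # dummy features
y = np.array([0] * 90 + [1] * 10)      # imbalanced labels: 90% class 0, 10% class 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
# Both subsets keep roughly 10% of class 1.
print(np.mean(y_train == 1), np.mean(y_test == 1))
```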
What is Cross-Validation and why is it used?
Too small test sets give unreliable error estimates, and we want as much data as possible for both training and testing. Cross-validation therefore creates “multiple” training and test sets from the same data: the data is split into folds, each fold is used as the test set once while the remaining folds form the training set, and the errors are averaged.
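A sketch of k-fold cross-validation with scikit-learn (library and classifier are assumptions): each of the 5 folds serves as the test set once, and the fold accuracies are averaged.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(random_state=0)

# 5-fold CV: one accuracy score per fold.
scores = cross_val_score(model, X, y, cv=5)
cv_error = 1 - scores.mean()
print(cv_error)
```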
What is leave-one-out cross-validation?
With data of n instances, the i-th instance is used as the test set and the remaining n-1 instances as the training set; this is repeated for all n instances and the errors are averaged.
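The same sketch with leave-one-out CV (again assuming scikit-learn): every instance is the test set exactly once, so there are n model fits in total.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(random_state=0)

# n splits, each holding out a single instance as the test set.
scores = cross_val_score(model, X, y, cv=LeaveOneOut())
loocv_error = 1 - scores.mean()
print(loocv_error)
```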