Lecture 10 - Model Evaluation Flashcards
What does it mean when a model has a high bias?
The model does not match the training data closely enough to be useful.
Bias - Limited flexibility
What does it mean if a model has a high variance?
It means that it matches the training data too closely.
Variance - sensitivity to the specific training set
What does it mean if we’re too “fit”?
If we are too “fit”, the model conforms too closely to this one data set, so it can’t generalize to new data.
What is the bias-variance trade off?
It is the effort to minimize the two sources of error, bias and variance, that prevent supervised learning algorithms from generalizing beyond their training set.
What is the irreducible error?
The bias-variance decomposition is composed of three terms: bias, variance, and irreducible error. The irreducible error is the noise inherent in the data itself, which no model can remove.
error = bias² + variance + irreducible error (for squared-error loss)
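A minimal sketch (assuming NumPy; not from the lecture) that estimates the three terms empirically: refit a deliberately rigid straight line to many noisy samples of a sine curve, then measure the squared bias and the variance of the fits against the true function.

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin                    # the true function we are trying to learn
sigma = 0.3                   # noise level: irreducible error = sigma**2
x_test = np.linspace(0, np.pi, 50)

# Refit a straight line (a high-bias, low-flexibility model) on many
# randomized training sets drawn from the same population.
preds = []
for _ in range(500):
    x = rng.uniform(0, np.pi, 20)
    y = f(x) + rng.normal(0, sigma, 20)
    coefs = np.polyfit(x, y, deg=1)
    preds.append(np.polyval(coefs, x_test))
preds = np.array(preds)

bias_sq = np.mean((preds.mean(axis=0) - f(x_test)) ** 2)  # bias² term
variance = np.mean(preds.var(axis=0))                     # variance term
print(bias_sq, variance, sigma ** 2)                      # the three terms
```

Raising `deg` makes the model more flexible: bias² falls and variance rises, which is exactly the trade-off described above.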
How do you detect overfitting?
Use a separate set of holdout data. We split the labelled data into two collections: training and evaluation.
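A minimal sketch of such a split, assuming scikit-learn (the lecture does not name a library); `load_iris` is just a stand-in dataset.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# Hold out 20% of the labelled data; the model never trains on it.
X_train, X_eval, y_train, y_eval = train_test_split(
    X, y, test_size=0.2, random_state=42)
```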
What are the two important properties for detecting overfitting using holdout data?
The data were not used in the training, so they can’t have been memorized.
They have labels, so we can measure the model’s accuracy without additional labelling costs.
What is cross-validation and why is it important?
It allows us to see how our model does, on average, across a number of randomized trials.
This average will tend toward the model’s performance on the population as a whole.
Why do we need to be careful using train-test splits?
We don’t want to end up with all of one class in the training set and none of it in the evaluation set.
How do we make sure that we don’t end up with all one class in our training set?
Make several random splits.
We do this with k-fold cross-validation, where k might be 3, 5, or 10 splits (sketched below).
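A minimal sketch of 5-fold cross-validation, again assuming scikit-learn; `StratifiedKFold` additionally preserves class proportions in every fold, which guards against the all-one-class problem from the previous card.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=folds)
print(scores.mean())  # average accuracy across the randomized trials
```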
What is underfitting?
It means our model has not captured the complexity present in our training data.
You will see excellent performance on the training data and much worse performance on the test data if…
…the model is overfit.
If you see poor performance on both the train and the test sets…
…the model is underfit.
Training performance is almost always better than
test performance
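A minimal sketch of the diagnosis, assuming scikit-learn: an unpruned decision tree is flexible enough to memorize the training set, so its training score is near perfect while its test score is lower.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print(model.score(X_tr, y_tr))  # training accuracy: typically ~1.0
print(model.score(X_te, y_te))  # test accuracy: lower; a large gap = overfit
```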
What can you do if your model is overfit?
Collect more data.
Try a simpler model.
Apply regularization (sketched below).
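A minimal sketch of the regularization remedy, assuming scikit-learn: Ridge regression adds an L2 penalty on the coefficients, and a larger alpha shrinks them harder, trading a little bias for less variance.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=100, n_features=20, noise=10.0,
                       random_state=0)
# alpha controls the penalty strength: larger alpha = stronger regularization.
model = Ridge(alpha=1.0).fit(X, y)
```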