Underfitting and Overfitting Flashcards
4 Causes of Overfitting
Limited training data
Noisy training data
Excessive model complexity
Too little regularisation
2 Signs of Overfitting
Training error is low
Generalisation error is high
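A minimal sketch of these two signs, using only the Python standard library. The toy data, the choice of a 1-nearest-neighbour classifier, and the "flip every 5th label" noise are assumptions made for illustration. The model memorises the noisy training labels, so it is perfect on the training set but makes mistakes on unseen points.

```python
# Toy 1-D problem: the true label is 1 when x >= 50, otherwise 0.
def true_label(x):
    return 1 if x >= 50 else 0

# Noisy training set: every 5th label is flipped (deterministic noise,
# chosen only so the example is reproducible).
train_x = list(range(100))
train_y = [1 - true_label(x) if x % 5 == 0 else true_label(x) for x in train_x]

# Unseen test points with clean labels.
test_x = [x + 0.25 for x in range(100)]
test_y = [true_label(x) for x in test_x]

def predict_1nn(x):
    """Return the label of the closest training point (pure memorisation)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def error_rate(xs, ys):
    """Fraction of points whose predicted label differs from the given label."""
    wrong = sum(predict_1nn(x) != y for x, y in zip(xs, ys))
    return wrong / len(xs)

train_error = error_rate(train_x, train_y)
test_error = error_rate(test_x, test_y)
print(f"training error:       {train_error:.2f}")
print(f"generalisation error: {test_error:.2f}")
```

The gap between the two numbers, rather than either number alone, is what signals overfitting.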
3 Remedies for Overfitting
Get more training data
Reduce noise in the training data (e.g., fix data errors, remove outliers)
Simplify or regularise the model
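As a rough illustration of "regularise the model", the sketch below fits a one-weight linear model y ~ w * x with an L2 (ridge) penalty. The function name `fit_slope`, the data, and the penalty value are all hypothetical. A larger penalty `alpha` pulls the weight towards zero, which constrains the model.

```python
def fit_slope(xs, ys, alpha=0.0):
    """Minimise sum((y - w*x)**2) + alpha * w**2 for a single weight w.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + alpha).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w_plain = fit_slope(xs, ys)               # no regularisation
w_ridge = fit_slope(xs, ys, alpha=10.0)   # regularised: weight is shrunk

print(f"unregularised weight: {w_plain:.3f}")
print(f"regularised weight:   {w_ridge:.3f}")
```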
3 Causes of Underfitting
The model is too simple to capture the underlying patterns
The model is too heavily constrained (e.g., over-regularised)
The features are irrelevant or uninformative
3 Remedies for Underfitting
Feed better features to the learning algorithm
Reduce the constraints on the model
Select a more powerful model, with more parameters.
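A small sketch of the last remedy, under the assumption of made-up data with a roughly linear trend. A model that always predicts the mean is too simple and has a high error even on its own training data. A straight line, which has one more parameter, captures the pattern.

```python
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.1, 4.9, 7.2, 9.0, 10.8]   # roughly y = 1 + 2x

def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

# Too-simple model: always predict the mean of y (one parameter).
mean_y = sum(ys) / len(ys)
const_mse = mse([mean_y] * len(ys), ys)

# More powerful model: straight line fitted by ordinary least squares
# (two parameters: slope and intercept).
mean_x = sum(xs) / len(xs)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
line_mse = mse([intercept + slope * x for x in xs], ys)

print(f"constant model training MSE: {const_mse:.3f}")
print(f"linear model training MSE:   {line_mse:.3f}")
```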
2 Error Estimates for Model Selection
Optimistic Error Estimate - Uses the training error as is; does not consider model complexity
Pessimistic Error Estimate - Adds a penalty for model complexity to the training error
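The sketch below contrasts the two estimates for decision trees. It assumes one common textbook form of the pessimistic estimate, in which a fixed penalty is added per leaf node. The penalty value, the function names, and the two example trees are hypothetical.

```python
def optimistic_error(train_errors, n_train):
    """Training error rate, with no allowance for model complexity."""
    return train_errors / n_train

def pessimistic_error(train_errors, n_train, n_leaves, penalty=1.0):
    """Training error plus an assumed fixed penalty per leaf node."""
    return (train_errors + penalty * n_leaves) / n_train

# Two hypothetical trees trained on the same 24 instances.
n_train = 24
tree_a = {"errors": 4, "leaves": 7}   # larger tree, fewer training errors
tree_b = {"errors": 6, "leaves": 4}   # smaller tree, more training errors

opt_a = optimistic_error(tree_a["errors"], n_train)
opt_b = optimistic_error(tree_b["errors"], n_train)
pes_a = pessimistic_error(tree_a["errors"], n_train, tree_a["leaves"])
pes_b = pessimistic_error(tree_b["errors"], n_train, tree_b["leaves"])

print(f"optimistic:  A={opt_a:.3f}  B={opt_b:.3f}")
print(f"pessimistic: A={pes_a:.3f}  B={pes_b:.3f}")
```

With these numbers the optimistic estimate prefers the larger tree A, while the pessimistic estimate prefers the smaller tree B.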
Model Evaluation
Estimates the performance of a classifier on previously unseen data (the test set)
Holdout
Reserve k% for training and (100-k)% for testing.
Random subsampling - Repeat the holdout with different random splits and average the results
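Holdout and random subsampling can be sketched with the standard library alone. The helper names, the majority-class baseline used as the "classifier", and the toy data are assumptions for illustration.

```python
import random
from collections import Counter

def holdout_split(data, train_pct, rng):
    """Shuffle a copy of the data, then reserve train_pct% for training."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) * train_pct // 100
    return shuffled[:cut], shuffled[cut:]

def random_subsampling(data, train_pct, repeats, evaluate, seed=0):
    """Repeat the holdout with fresh random splits and average the scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        train, test = holdout_split(data, train_pct, rng)
        scores.append(evaluate(train, test))
    return sum(scores) / len(scores)

def majority_class_error(train, test):
    """Hypothetical baseline: predict the most common training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return sum(label != majority for _, label in test) / len(test)

# Toy data set: (instance id, label) pairs with 70 zeros and 30 ones.
data = [(i, 0) for i in range(70)] + [(i, 1) for i in range(70, 100)]

train, test = holdout_split(data, 70, random.Random(0))
avg_error = random_subsampling(data, 70, repeats=10, evaluate=majority_class_error)
print(len(train), len(test))
print(f"average test error over 10 holdouts: {avg_error:.3f}")
```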
Cross-validation
Partition data into k disjoint subsets
k-fold - Train on k-1 partitions, test on the remaining one; repeat so that each partition is used for testing once
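A minimal sketch of how k-fold index splits could be generated. The function name is hypothetical, and the folds here are formed by simple interleaving without shuffling, which is an assumption; in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k disjoint folds."""
    # Fold i holds indices i, i + k, i + 2k, ... so the folds are disjoint
    # and together cover every sample exactly once.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(10, 5))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Each sample appears in exactly one test fold, so every instance is used for testing once and for training k-1 times.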
4 Steps of Using a Validation Set
Train multiple models with various hyperparameters on the training set (with the validation set held out)
Select model that performs best on validation set
Train the best model on the full training set (including the validation set)
Evaluate final model on test set (estimate generalisation error)
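The four steps can be sketched as follows. The one-weight ridge model, the candidate `alpha` values, the toy data, and the interleaved split are all assumptions chosen to keep the example dependency-free.

```python
def fit_slope(points, alpha):
    """Fit y ~ w * x with an L2 penalty alpha * w**2 (closed form)."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / (sxx + alpha)

def mse(w, points):
    """Mean squared error of the model y = w * x on the given points."""
    return sum((y - w * x) ** 2 for x, y in points) / len(points)

# Toy data: y is roughly 2 * x with small deterministic noise.
data = [(x, 2.0 * x + (0.3 if x % 2 else -0.3)) for x in range(1, 31)]

# Split into test, validation, and (reduced) training sets.
test = data[0::5]
val = data[1::5]
train = [p for i, p in enumerate(data) if i % 5 not in (0, 1)]

# Step 1: train one model per hyperparameter value on the training set.
candidates = [0.0, 1.0, 100.0, 10000.0]
models = {alpha: fit_slope(train, alpha) for alpha in candidates}

# Step 2: select the model that performs best on the validation set.
best_alpha = min(models, key=lambda alpha: mse(models[alpha], val))

# Step 3: retrain the best configuration on training plus validation data.
final_w = fit_slope(train + val, best_alpha)

# Step 4: estimate the generalisation error on the untouched test set.
test_mse = mse(final_w, test)
print(f"best alpha: {best_alpha}, test MSE: {test_mse:.3f}")
```

The test set is used only once, at the end, so the reported error is not biased by the hyperparameter search.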
2 Ways to Fine-Tune a Model
Grid Search CV - Evaluate every combination in a specified hyperparameter grid
Randomised Search CV - Evaluate a given number of random combinations sampled from the hyperparameter space
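The difference between the two strategies lies in which candidates get evaluated. The sketch below only generates and ranks candidates with the standard library; the parameter names, the values, and the `toy_score` function are hypothetical. In scikit-learn, `GridSearchCV` and `RandomizedSearchCV` additionally score each candidate with cross-validation.

```python
import itertools
import random

# Hypothetical hyperparameter space.
param_space = {
    "max_depth": [2, 4, 8],
    "min_samples_leaf": [1, 5, 10, 20],
}

def grid_candidates(space):
    """Every combination of the listed values (exhaustive)."""
    names = list(space)
    return [dict(zip(names, values))
            for values in itertools.product(*(space[n] for n in names))]

def random_candidates(space, n_iter, seed=0):
    """A fixed number of distinct combinations drawn at random."""
    rng = random.Random(seed)
    return rng.sample(grid_candidates(space), n_iter)

def toy_score(params):
    """Stand-in for a cross-validated score; higher is better."""
    return -(abs(params["max_depth"] - 4) + abs(params["min_samples_leaf"] - 5))

grid = grid_candidates(param_space)
sampled = random_candidates(param_space, n_iter=5)

best_from_grid = max(grid, key=toy_score)
best_from_random = max(sampled, key=toy_score)
print(len(grid), "grid candidates, best:", best_from_grid)
print(len(sampled), "random candidates, best:", best_from_random)
```

Grid search is guaranteed to find the best listed combination but its cost grows multiplicatively with each added hyperparameter. Random search caps the cost at `n_iter` evaluations and may miss the best combination.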