ML-06 - Model and feature selection Flashcards
ML-06 - Model and feature selection
How do you use validation data to check if your problem is due to high bias vs. high variance?
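Compare training error with validation error. High bias: both errors are high and close together. High variance: training error is low, but validation error is much higher.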
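As a rough illustration of that check, here is a minimal sketch in Python. The function name `diagnose` and the thresholds `acceptable_error` and `gap_tolerance` are assumptions made for this example and do not come from the deck; sensible values depend on the problem.

```python
def diagnose(train_error, val_error, acceptable_error, gap_tolerance=0.05):
    """Classify a fit from its training and validation error.

    acceptable_error and gap_tolerance are hypothetical thresholds
    chosen for illustration only.
    """
    # High bias: the model cannot even fit the training data well.
    high_bias = train_error > acceptable_error
    # High variance: validation error is much worse than training error.
    high_variance = (val_error - train_error) > gap_tolerance

    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias (underfit)"
    if high_variance:
        return "high variance (overfit)"
    return "good fit"


print(diagnose(train_error=0.30, val_error=0.32, acceptable_error=0.10))
print(diagnose(train_error=0.02, val_error=0.25, acceptable_error=0.10))
```

The first call describes a model whose two errors are both high and close together; the second describes a model with low training error and a large gap to the validation error.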
ML-06 - Model and feature selection
Which area is high bias and which is high variance?
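(See image) On a plot of error against model complexity, the low-complexity region, where training and validation error are both high, is high bias. The high-complexity region, where training error is low but validation error rises, is high variance.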
ML-06 - Model and feature selection
If you have high bias, is your model underfit/overfit?
Underfit
ML-06 - Model and feature selection
If you have high variance, is your model underfit/overfit?
Overfit
ML-06 - Model and feature selection
If your model is underfit, do you have high bias or variance?
High bias
ML-06 - Model and feature selection
If your model is overfit, do you have high bias or variance?
High variance
ML-06 - Model and feature selection
What is the first tool to try for overfitting problems?
Regularization
ML-06 - Model and feature selection
What does regularization prevent?
Overfitting.
ML-06 - Model and feature selection
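Describe the bias/variance as a function of the regularization lambda parameter.
Small lambda: low bias, high variance (risk of overfitting). Large lambda: high bias, low variance (risk of underfitting).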
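One way to see the effect of lambda is to sweep it and watch training and validation error. The sketch below assumes NumPy, a synthetic sine dataset, degree-9 polynomial features and closed-form ridge regression; all of these are choices made for this example rather than part of the deck.

```python
import numpy as np


def poly_features(x, degree):
    # Columns are x**0, x**1, ..., x**degree.
    return np.vander(x, degree + 1, increasing=True)


def ridge_fit(X, y, lam):
    # Closed-form ridge solution: (X^T X + lam * I) w = X^T y.
    # For simplicity the intercept column is penalized as well.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)


def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))


# Synthetic data (assumption): noisy samples of sin(3x) on [-1, 1].
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, 20)
x_val = rng.uniform(-1.0, 1.0, 200)
y_train = np.sin(3.0 * x_train) + rng.normal(0.0, 0.2, x_train.size)
y_val = np.sin(3.0 * x_val) + rng.normal(0.0, 0.2, x_val.size)

X_train = poly_features(x_train, 9)
X_val = poly_features(x_val, 9)

lambdas = [1e-6, 1e-3, 1e-1, 10.0, 1000.0]
train_errors, val_errors = [], []
for lam in lambdas:
    w = ridge_fit(X_train, y_train, lam)
    train_errors.append(mse(X_train, y_train, w))
    val_errors.append(mse(X_val, y_val, w))
    print(f"lambda={lam:g}  train={train_errors[-1]:.4f}  val={val_errors[-1]:.4f}")
```

Training error can only grow as lambda grows, because the penalty pulls the weights away from the best fit to the training set. Validation error is typically lowest at some intermediate lambda, which is the value one would pick.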
ML-06 - Model and feature selection
Describe how the error vs. training set size looks for a situation with a good bias/variance trade-off.
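Training error rises slowly with training set size while validation error falls. Both level off at a low error with only a small gap between them.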
ML-06 - Model and feature selection
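Describe how the error vs. training set size looks for a situation with high bias.
Training and validation error converge quickly to a similar, high value. The gap between them is small, and adding more data does not lower the plateau.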
ML-06 - Model and feature selection
Describe how the error vs. training set size looks for a situation with high variance.
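Training error stays low while validation error is much higher. The gap between them is large but narrows as the training set grows, so more data is likely to help.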
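The learning curves described in the cards above can be computed by refitting a model on growing prefixes of the training set. This is a minimal sketch assuming NumPy, synthetic sine data and unregularized polynomial least squares; the helper names (`fit_poly`, `poly_mse`, `learning_curve`, `make_data`) are made up for this example.

```python
import numpy as np


def fit_poly(x, y, degree):
    # Least-squares fit of a polynomial of the given degree.
    X = np.vander(x, degree + 1, increasing=True)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def poly_mse(x, y, w):
    X = np.vander(x, w.size, increasing=True)
    return float(np.mean((X @ w - y) ** 2))


def learning_curve(x_train, y_train, x_val, y_val, degree, sizes):
    """Train on the first m samples for each m in sizes; return both error lists."""
    train_errors, val_errors = [], []
    for m in sizes:
        w = fit_poly(x_train[:m], y_train[:m], degree)
        train_errors.append(poly_mse(x_train[:m], y_train[:m], w))
        val_errors.append(poly_mse(x_val, y_val, w))
    return train_errors, val_errors


rng = np.random.default_rng(1)


def make_data(n):
    # Synthetic data (assumption): noisy samples of sin(3x) on [-1, 1].
    x = rng.uniform(-1.0, 1.0, n)
    return x, np.sin(3.0 * x) + rng.normal(0.0, 0.2, n)


x_train, y_train = make_data(200)
x_val, y_val = make_data(200)
sizes = [10, 20, 50, 100, 200]

curves = {}
for degree in (1, 9):  # degree 1 tends to underfit, degree 9 to overfit small sets
    curves[degree] = learning_curve(x_train, y_train, x_val, y_val, degree, sizes)
    for m, tr, va in zip(sizes, *curves[degree]):
        print(f"degree={degree}  m={m:3d}  train={tr:.4f}  val={va:.4f}")
```

Plotting each pair of lists against `sizes` gives the curves the cards refer to. On data like this, the degree-1 model would be expected to show the high-bias shape, and the degree-9 model the high-variance shape at small training set sizes.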
ML-06 - Model and feature selection
What should you try if you have high variance? (3)
- Get more data
- Use a smaller set of features (or a smaller neural network)
- Try increasing the regularization parameter lambda
ML-06 - Model and feature selection
What should you try if you have high bias? (3)
- Get more features
- Engineer new features, e.g. add polynomial features
- Try decreasing the regularization parameter lambda
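To illustrate the polynomial-features remedy from the list above, the following sketch fits the same least-squares model with and without a squared term. It assumes NumPy and a synthetic quadratic target; the data and the helper name `fit_and_score` are invented for this example.

```python
import numpy as np


def fit_and_score(X, y):
    # Ordinary least squares; returns the training mean squared error.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean((X @ w - y) ** 2))


# Synthetic data (assumption): a quadratic target with a little noise.
rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, 100)
y = x ** 2 + rng.normal(0.0, 0.1, x.size)

# A straight line cannot follow the curvature, so it underfits.
X_linear = np.column_stack([np.ones_like(x), x])
# Adding x**2 as an extra feature gives the model the capacity it lacks.
X_poly = np.column_stack([np.ones_like(x), x, x ** 2])

err_linear = fit_and_score(X_linear, y)
err_poly = fit_and_score(X_poly, y)
print(f"linear features:     train MSE = {err_linear:.4f}")
print(f"polynomial features: train MSE = {err_poly:.4f}")
```

A lower training error alone does not show that the fix worked; the validation error should be checked as well to confirm the model has not moved from high bias to high variance.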
ML-06 - Model and feature selection
What are the 3 steps of the ML design guideline?
1) Start with a small model (baseline) that’s quick to implement.
2) Decide whether more data or more features will help (guided by learning curves).
3) Perform error analysis: manually examine the samples where the model made errors.
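Step 3 above can be supported by a small helper that collects the misclassified samples and counts them by category. This is only a sketch: the function name `error_analysis`, the sample identifiers and the tags are all hypothetical, and in practice the tags come from manually inspecting each error.

```python
from collections import Counter


def error_analysis(samples, predictions, labels, tags):
    """Return the misclassified samples and a count of errors per tag."""
    errors = [
        (sample, pred, label, tag)
        for sample, pred, label, tag in zip(samples, predictions, labels, tags)
        if pred != label
    ]
    tag_counts = Counter(tag for _, _, _, tag in errors)
    return errors, tag_counts


# Hypothetical validation samples with hand-assigned tags.
samples = ["img_01", "img_02", "img_03", "img_04", "img_05"]
labels = [1, 0, 1, 1, 0]
predictions = [1, 1, 0, 0, 0]
tags = ["clear", "blurry", "blurry", "dark", "clear"]

errors, tag_counts = error_analysis(samples, predictions, labels, tags)
for sample, pred, label, tag in errors:
    print(f"{sample}: predicted {pred}, actual {label} ({tag})")
print(tag_counts.most_common())
```

The most frequent tag suggests where extra features or extra data would be most likely to pay off.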