15-EvaluationII Flashcards
What is the generalised error formula? Split into major error components
Error = (model bias)^2 + model variance + irreducible error
What is model bias?
Model bias refers to the presence of systematic errors in a model that can cause it to consistently make incorrect predictions
What is model variance?
Model variance refers to the amount by which the model's estimate of the target function would change if different training data were used
Does model bias lead to underfitting or overfitting?
Underfitting
Does model variance lead to underfitting or overfitting?
Overfitting
What is the bias-variance trade-off?
As model complexity rises, bias decreases and variance increases; as complexity falls, the reverse happens. Lowering one tends to raise the other
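The trade-off can be checked with a small simulation. The sketch below is illustrative only: the target function `true_f`, the 1-D k-nearest-neighbour regressor, the query point `x0` and all parameter values are assumptions chosen for the example, not part of the flashcards. It retrains on many fresh training sets, then estimates squared bias and variance of the prediction at `x0`, and compares their sum plus the noise variance with the measured squared error.

```python
import random
import statistics

def true_f(x):
    # Hypothetical target function (an assumption for this sketch).
    return x * x

def knn_predict(train, x0, k):
    # Average the targets of the k training points closest to x0.
    nearest = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def decompose(k, x0=0.9, n_train=30, n_repeats=2000, noise_sd=0.3, seed=0):
    rng = random.Random(seed)
    preds, sq_errors = [], []
    for _ in range(n_repeats):
        # Draw a fresh noisy training set on every repeat.
        train = []
        for _ in range(n_train):
            x = rng.uniform(-1, 1)
            train.append((x, true_f(x) + rng.gauss(0, noise_sd)))
        pred = knn_predict(train, x0, k)
        preds.append(pred)
        # Squared error against a fresh noisy observation at x0.
        y_new = true_f(x0) + rng.gauss(0, noise_sd)
        sq_errors.append((y_new - pred) ** 2)
    bias_sq = (statistics.fmean(preds) - true_f(x0)) ** 2
    variance = statistics.pvariance(preds)
    irreducible = noise_sd ** 2
    return bias_sq, variance, irreducible, statistics.fmean(sq_errors)

if __name__ == "__main__":
    # k=1 is the most flexible model here, k=30 (all points) the simplest.
    for k in (1, 5, 30):
        b, v, n, err = decompose(k)
        print(f"k={k:2d} bias^2={b:.3f} variance={v:.3f} "
              f"noise={n:.3f} sum={b + v + n:.3f} measured error={err:.3f}")
```

In this setup a small `k` plays the role of a complex model (low bias, high variance) and `k` equal to the training set size plays the role of a very simple one (high bias, low variance). The measured error should land close to the sum of the three components, up to simulation noise.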
What is a learning curve?
A plot of model performance against increasing training set size. The x-axis is the number of training instances; the y-axis is a performance metric (e.g. error or accuracy)
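A minimal sketch of how the points on a learning curve could be computed. The synthetic two-class data, the nearest-centroid classifier and the function names are all assumptions made for illustration. For each training set size it records the error on the training instances and on a fixed held-out test set; plotting those two series against the size gives the curve.

```python
import random
import statistics

def make_data(n, rng):
    # Hypothetical two-class problem: class 0 centred at -1, class 1 at +1.
    # Labels alternate so any prefix of length >= 2 contains both classes.
    return [(rng.gauss(2 * (i % 2) - 1, 1.0), i % 2) for i in range(n)]

def fit_centroids(train):
    # Nearest-centroid "model": one mean per class.
    return {c: statistics.fmean(x for x, y in train if y == c) for c in (0, 1)}

def predict(centroids, x):
    # Predict the class whose centroid is closest to x.
    return min(centroids, key=lambda c: abs(x - centroids[c]))

def error_rate(centroids, data):
    wrong = sum(predict(centroids, x) != y for x, y in data)
    return wrong / len(data)

def learning_curve(sizes, n_test=2000, seed=0):
    rng = random.Random(seed)
    pool = make_data(max(sizes), rng)
    test = make_data(n_test, rng)
    curve = []
    for n in sizes:
        train = pool[:n]  # grow the training set by taking longer prefixes
        model = fit_centroids(train)
        curve.append((n, error_rate(model, train), error_rate(model, test)))
    return curve

if __name__ == "__main__":
    for n, train_err, test_err in learning_curve([4, 16, 64, 256]):
        print(f"n={n:4d} train error={train_err:.3f} test error={test_err:.3f}")
```

A single run like this is noisy at small sizes; averaging over several random seeds would give a smoother curve.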
How should underfitting be addressed?
Use more complex model
Add features
Boosting
How should overfitting be addressed?
Add more training data
Reduce features
Reduce model complexity
Bagging
How do we control for bias and variance?
Change the holdout partition size. A larger training split leaves less test data, so evaluation variance is higher. A smaller training split leaves more test data, so evaluation variance is lower.
Use cross-validation: Less variance
Stratification: Less bias
Leave-one-out cross-validation: No sampling bias; in general the lowest bias and variance
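The cross-validation and stratification ideas above can be sketched as a fold-splitting routine. This is an illustrative stdlib-only version, not a reference implementation: the function names `stratified_folds` and `cross_val_splits` are hypothetical. Each class's instances are shuffled and dealt across the folds in turn, so every fold keeps roughly the class proportions of the full dataset.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, seed=0):
    rng = random.Random(seed)
    # Group instance indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the last class stopped,
        # so fold sizes and per-class counts differ by at most one.
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds

def cross_val_splits(labels, k, seed=0):
    # Yield (train indices, test indices), using each fold once as the test set.
    folds = stratified_folds(labels, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

if __name__ == "__main__":
    labels = [0] * 90 + [1] * 10  # imbalanced toy labels (an assumption)
    for train_idx, test_idx in cross_val_splits(labels, k=5):
        positives = sum(labels[i] for i in test_idx)
        print(f"train={len(train_idx)} test={len(test_idx)} positives in test={positives}")
```

Setting `k` to the number of instances puts exactly one instance in each fold, which corresponds to leave-one-out cross-validation. In that case stratification no longer has any effect.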
What is evaluation bias?
Evaluation bias refers to the systematic error in the evaluation of a model that results in consistently overestimating or underestimating the true performance of the model.