ML - Bias vs. Variance Flashcards
Explain each component of this equation:
Y = f(X) + e
We want to find a function f(X) that predicts the target Y from the input variables X:
Y = f(X) + e
where e is the random error (noise) term, assumed to be normally distributed with a mean of 0.
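As a minimal sketch of the equation above, the snippet below simulates data from Y = f(X) + e. The sine function used for f, the noise standard deviation of 0.3, and the sample size are illustrative assumptions, not part of the card.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical "true" function, chosen only for illustration
    return np.sin(2 * np.pi * x)

n = 10_000
X = rng.uniform(0.0, 1.0, size=n)           # input variable
e = rng.normal(loc=0.0, scale=0.3, size=n)  # error term: mean 0, std 0.3 (assumed)
Y = f(X) + e                                # observed target

# The sample mean of e should be close to 0 and its variance close to 0.3 ** 2
print("mean of e:", e.mean())
print("variance of e:", e.var())
```

Even with a perfect estimate of f, predictions of Y would still be off by e, which is why the noise term ends up as the irreducible part of the error.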
What is the expected squared error of the following equation?
Y = f(X) + e
Decompose the expected squared prediction error into its 3 main error components:
Prediction error can be broken down into 3 parts:
- Bias Error
- Variance Error
- Irreducible Error: noise in the data, which may be caused by unknown variables that influence the mapping of input variables to output variables
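The three parts listed above can be written as one equation. This is the standard textbook form rather than something taken from the card itself: the fitted model \hat{f} and the noise variance \sigma_e^2 are notation introduced here, and the identity assumes e has mean 0 and is independent of the training data used to fit \hat{f}.

```latex
\begin{aligned}
\mathrm{Err}(x)
  &= E\big[(Y - \hat{f}(x))^2\big] \\
  &= \underbrace{\big(E[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
   + \underbrace{E\big[(\hat{f}(x) - E[\hat{f}(x)])^2\big]}_{\text{Variance}}
   + \underbrace{\sigma_e^2}_{\text{Irreducible Error}}
\end{aligned}
```

The expectations are taken over different training sets (and over the noise e), so bias and variance describe how the fitted model behaves on average, not on any single training set.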
Draw the picture with 4 targets and label each one with High / Low Bias and High / Low Variance…which target is considered underfitting / overfitting?
Draw 3 scatterplot diagrams.
Which one is overfitting / underfitting / just right?
Which one is high variance / high bias / low variance+bias?
Plot Error vs Model Complexity.
Draw 3 curves, one each representing Total Error / Variance / Bias².
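The three curves can also be estimated numerically. The sketch below is one possible setup, not a definitive recipe: it uses polynomial degree as a stand-in for model complexity, and the sine target, noise level, sample sizes, and degrees are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical true function (assumed for illustration)
    return np.sin(2 * np.pi * x)

sigma = 0.3                           # noise std (assumed)
n_train = 30                          # points per training set
n_trials = 200                        # number of independent training sets
x_test = np.linspace(0.05, 0.95, 50)  # fixed evaluation points
degrees = [1, 3, 9]                   # polynomial degree = model complexity

results = {}
for d in degrees:
    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        # Draw a fresh training set and fit a degree-d polynomial to it
        x = rng.uniform(0.0, 1.0, n_train)
        y = f(x) + rng.normal(0.0, sigma, n_train)
        preds[t] = np.polyval(np.polyfit(x, y, d), x_test)

    mean_pred = preds.mean(axis=0)
    bias2 = np.mean((mean_pred - f(x_test)) ** 2)  # squared bias, averaged over x_test
    variance = np.mean(preds.var(axis=0))          # variance across training sets
    total = bias2 + variance + sigma ** 2          # add the irreducible noise
    results[d] = (bias2, variance, total)
    print(f"degree={d}: bias^2={bias2:.4f} variance={variance:.4f} total={total:.4f}")
```

In this setup, Bias² should fall and Variance should rise as the degree grows, which gives the Total Error curve its U shape when plotted against complexity.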
Bias adds / subtracts terms from the model to make the target function easier to learn.
Bias is the set of simplifying assumptions (subtracting terms) that the model makes so that the target function is easier to learn.
Bias makes models fast / slow.
Bias makes models fast (or simpler).
Bias makes models simpler / more complex.
Bias makes models simpler.
Bias leads to overfitting / underfitting of the training data.
Bias leads to underfitting of the training data.
Bias leads to low / high error on the training + test data.
Bias leads to high error on the training + test data.
Bias can occur because of high / low # of parameters.
Bias can occur because of low # of parameters.
Bias can occur because of high / low amount of training data.
Bias can occur because of low amount of training data.
Bias can occur because of fitting a ______ function to _____ data.
Bias can occur because of fitting a linear function to non-linear data.
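A small sketch of this case: a straight line is fitted to data generated from a quadratic function and compared with a quadratic fit. The quadratic target, the noise level, and the sample sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Hypothetical non-linear target (assumed for illustration)
    return x ** 2

def make_data(n):
    x = rng.uniform(-3.0, 3.0, n)
    y = f(x) + rng.normal(0.0, 0.5, n)
    return x, y

def mse(coefs, x, y):
    # Mean squared error of a fitted polynomial on (x, y)
    return np.mean((np.polyval(coefs, x) - y) ** 2)

x_train, y_train = make_data(200)
x_test, y_test = make_data(200)

linear = np.polyfit(x_train, y_train, 1)     # high-bias model: a straight line
quadratic = np.polyfit(x_train, y_train, 2)  # model that matches the target's shape

linear_train, linear_test = mse(linear, x_train, y_train), mse(linear, x_test, y_test)
quad_train, quad_test = mse(quadratic, x_train, y_train), mse(quadratic, x_test, y_test)

print("linear    train/test MSE:", linear_train, linear_test)
print("quadratic train/test MSE:", quad_train, quad_test)
```

The linear model's error should be high on both the training and the test data, which is the underfitting signature described in the cards above. More training data would not help here, because the line cannot represent the curve.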
More assumptions made about the target function lead to high / low bias.
More assumptions made about the target function lead to high bias.
Fewer assumptions made about the target function lead to high / low bias.
Fewer assumptions made about the target function lead to low bias.
Models with low bias are:
Models with low bias are:
- Decision Trees
- k-Nearest Neighbors
- SVMs
Models with high bias are:
Models with high bias are:
- Linear / Logistic Regression
What is Variance?
Variance is the amount that the estimate of the target function will change if different training data are used. Ideally, we don’t want the target function to change too much from one training set to the next.
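To make this definition concrete, the sketch below refits two models on many different training sets and measures how much each model's prediction at a single point changes. The sine target, the noise level, the query point x0, and the two hand-written predictors (a linear fit and a 1-nearest-neighbor lookup) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    # Hypothetical true function (assumed for illustration)
    return np.sin(2 * np.pi * x)

def sample_training_set(n=30, sigma=0.3):
    x = rng.uniform(0.0, 1.0, n)
    y = f(x) + rng.normal(0.0, sigma, n)
    return x, y

def predict_linear(x, y, x0):
    # Linear regression: fit a straight line, then evaluate it at x0
    return np.polyval(np.polyfit(x, y, 1), x0)

def predict_1nn(x, y, x0):
    # 1-nearest neighbor: return the target of the closest training point
    return y[np.argmin(np.abs(x - x0))]

x0 = 0.5  # query point (assumed)
linear_preds, nn_preds = [], []
for _ in range(500):
    x, y = sample_training_set()  # a different training set each time
    linear_preds.append(predict_linear(x, y, x0))
    nn_preds.append(predict_1nn(x, y, x0))

var_linear = np.var(linear_preds)
var_nn = np.var(nn_preds)
print("variance of linear prediction at x0:", var_linear)
print("variance of 1-NN prediction at x0:  ", var_nn)
```

The 1-nearest-neighbor prediction should vary much more from one training set to the next than the linear one, matching the usual pairing of low-bias models with higher variance.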