ds-machine-learning Flashcards
What is avoidable bias? What strategies can be used to deal with it?
Avoidable bias is the difference between the training error and the Bayes error. We cannot go below the Bayes error without overfitting.
When avoidable bias is high, one strategy is to increase model complexity.
The difference between the training error and the development error is the variance. To reduce variance, we could apply regularization or collect more training data.
What is Bayes error in classification?
Bayes error rate is the lowest possible error rate for any classifier of a random outcome (into, for example, one of two categories) and is analogous to the irreducible error.
For example, in computer vision tasks, human-level error is often used as a proxy for Bayes error because humans are extremely good at these tasks.
source: https://en.wikipedia.org/wiki/Bayes_error_rate
An image classifier achieved the following results.
- Training error: 7%
- Dev. error: 9%
- Estimated human-level error: 5.5%
What is the avoidable bias?
It is the training error minus the estimated human-level error, which serves as a proxy for Bayes error:
7% - 5.5% = 1.5%
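The same arithmetic as a minimal Python sketch, using the numbers from this card. The variable names are illustrative, and human-level error is assumed to stand in for Bayes error. It also computes the variance (dev error minus training error) to show which gap is larger.

```python
# Error rates from the card, in percent.
training_error = 7.0
dev_error = 9.0
human_level_error = 5.5  # assumed proxy for Bayes error

# Avoidable bias: how far the training error is from the best achievable error.
avoidable_bias = training_error - human_level_error

# Variance: how much worse the model does on data it was not trained on.
variance = dev_error - training_error

# The larger gap suggests where to focus first.
focus = "bias" if avoidable_bias > variance else "variance"

print(f"avoidable bias: {avoidable_bias:.1f}%")  # avoidable bias: 1.5%
print(f"variance: {variance:.1f}%")              # variance: 2.0%
print(f"larger gap: {focus}")                    # larger gap: variance
```

Here the variance (2.0%) is slightly larger than the avoidable bias (1.5%), so regularization or more data would be a reasonable first step.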
Which statements about Bayes error are correct?
- Bayes error can be equal to human-level error.
- Bayes error is the lowest possible prediction error that can be achieved.
- Bayes error can be estimated after subtracting the irreducible error from the model’s error.
- Bayes error is always smaller than the training error.
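1 and 2. For 1, it can happen in computer vision tasks, where humans are extremely good.
3 is not correct because Bayes error is analogous to the irreducible error itself; subtracting the irreducible error from the model's error leaves the reducible part of the error, not Bayes error.
4 is not correct because training error can be smaller than Bayes error when the model overfits.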
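A small simulation can illustrate the last point. This is a sketch on a made-up toy problem, not a general proof: labels are flipped with probability 0.1, so the Bayes error is 10% by construction, and a 1-nearest-neighbour classifier (chosen here only because it memorizes the training set) still reaches a training error below that. All names and constants are assumptions for illustration.

```python
import random

rng = random.Random(0)
FLIP_PROB = 0.1  # label noise; by construction this is the Bayes error


def sample(n):
    """Draw n points: x uniform in [0, 1), label 1 if x > 0.5, flipped with FLIP_PROB."""
    data = []
    for _ in range(n):
        x = rng.random()
        clean = int(x > 0.5)
        y = 1 - clean if rng.random() < FLIP_PROB else clean
        data.append((x, y))
    return data


def bayes_rule(x):
    """Best possible classifier for this toy problem: predict the noise-free label."""
    return int(x > 0.5)


def one_nn(train, x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def error_rate(predict, data):
    """Fraction of points in data that predict gets wrong."""
    return sum(predict(x) != y for x, y in data) / len(data)


train, test = sample(500), sample(500)

bayes_test_err = error_rate(bayes_rule, test)
nn_train_err = error_rate(lambda x: one_nn(train, x), train)
nn_test_err = error_rate(lambda x: one_nn(train, x), test)

print(f"Bayes rule, test error: {bayes_test_err:.3f}")
print(f"1-NN, training error:   {nn_train_err:.3f}")
print(f"1-NN, test error:       {nn_test_err:.3f}")
```

The memorizing model scores zero error on its own training points, below the 10% Bayes error, but it does not keep that advantage on fresh data.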
Which statements about cross-validation are most correct?
- Cross-validation is commonly used to tune the hyperparameters of a model.
- Cross-validation helps to estimate how well a model fits the training data.
- Cross-validation is commonly used to tune the trainable parameters of a model.
- Cross-validation helps to estimate how well a model fits data that is independent from the training data.
1 and 4.
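2 is not correct because of 4: cross-validation estimates performance on held-out data, not the fit to the training data. 3 is not correct because of 1: cross-validation is used to select the settings that constrain the model, i.e. hyperparameters, while trainable parameters are fit during training.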
source: https://web.stanford.edu/class/msande226/lecture5_prediction_annotated_2018.pdf
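As a rough sketch of statements 1 and 4, the following pure-Python example uses 5-fold cross-validation to pick a hyperparameter. The model (a 1-D k-nearest-neighbour regressor), the synthetic data, and the candidate values are all assumptions made up for illustration. Each candidate is scored only on folds that were held out from fitting.

```python
import math
import random


def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous, near-equal validation folds."""
    folds, start = [], 0
    for i in range(n_folds):
        size = n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def knn_predict(train_x, train_y, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cv_error(xs, ys, k, n_folds=5):
    """Mean squared error on held-out folds, averaged over the folds."""
    fold_errors = []
    for val_idx in k_fold_indices(len(xs), n_folds):
        held_out = set(val_idx)
        train_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        train_y = [ys[i] for i in range(len(ys)) if i not in held_out]
        sq_errors = [(knn_predict(train_x, train_y, xs[i], k) - ys[i]) ** 2
                     for i in val_idx]
        fold_errors.append(sum(sq_errors) / len(sq_errors))
    return sum(fold_errors) / len(fold_errors)


# Synthetic data: a noisy sine curve, sampled in random order.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]

# k (number of neighbours) is the hyperparameter being tuned.
candidate_ks = [1, 3, 5, 10, 20]
cv_errors = {k: cv_error(xs, ys, k) for k in candidate_ks}
best_k = min(cv_errors, key=cv_errors.get)

for k in candidate_ks:
    print(f"k={k:2d}  cross-validated MSE={cv_errors[k]:.3f}")
print(f"selected k: {best_k}")
```

The trainable part of the model (here, the stored training points) is refit inside every fold; only the hyperparameter k is chosen by comparing cross-validated errors.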