machine-learning-models Flashcards
What is the problem with linear separable data in logistic regression?
When training data is linearly separable, logistic regression behaves in undesired ways.
Learning tries to find decision boundaries that separate the data: this leads over complex boundaries very prone to overfitting.
If data is linearly separable: coefficients can go to infinity and indirectly provide massive conficents and confidence in the answers.
What is linearly separable data?
Data is linearly separable if there a coefficient such that:
For all POSITIVE training data:
- Score (x) = w_hat.T h(x) > 0
For all NEGATIVE training data:
- Score (x) = w_hat.T h(x) < 0
training error (W_hat) = 0. This is very likely an overfitting. Specially with very complex models in high dimensional spaces.
more info: https://www.coursera.org/lecture/ml-classification/optional-another-perspecting-on-overfitting-in-logistic-regression-lzGuI