Bayes Classifier Flashcards
What is the prior probability?
It’s the probability you assign to an event before you run a test or observe any evidence. It is usually the probability of the event happening without considering any context.
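A small worked example may help separate the prior from what you believe after a test. This is only a sketch: the numbers (a 1% prior, a test that is positive 90% of the time when the event is true and 5% of the time when it is not) and the function name `posterior` are made up for illustration.

```python
def posterior(prior, p_pos_given_event, p_pos_given_no_event):
    """Bayes' rule: probability of the event given a positive test."""
    # Total probability of seeing a positive test at all.
    evidence = prior * p_pos_given_event + (1 - prior) * p_pos_given_no_event
    return prior * p_pos_given_event / evidence


# Hypothetical numbers, chosen only for illustration.
prior = 0.01  # belief before the test is run
updated = posterior(prior, p_pos_given_event=0.90, p_pos_given_no_event=0.05)

print(f"prior = {prior:.2f}, posterior = {updated:.3f}")
# prints: prior = 0.01, posterior = 0.154
```

The prior is the 0.01 that goes in; the test result only enters through the two likelihoods.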
Why is Naive Bayes considered “naive”?
Because it assumes that the features are independent of one another given the class (a very strong assumption), so it ignores any correlation that may be present between features.
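Written as a formula, the independence assumption lets the model score a class by multiplying one term per feature. The symbols here are notation chosen for this card, not taken from it: y is the class and x_1, ..., x_n are the feature values.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

Each factor P(x_i | y) is estimated on its own, which is exactly where any correlation between features gets ignored.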
Does multicollinearity impact the performance of a Naive Bayes Classifier? If so, how?
Yes, it will affect the performance of Naive Bayes.
It is called naive because it assumes the features are independent, which is rarely the case in practice, so correlated features go against that assumption. Even so, it has been shown to be fairly robust to such violations and to perform well on real-world problems.
However, correlation is not necessarily good or bad for the performance of your model. Correlation between features in Naive Bayes simply means that if one feature “says” the class is A, then the other feature(s) will often say the same. If your correlated features happen to be good predictors, your model will actually benefit; if they happen to be bad predictors, your model will be worse off.
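The "both features say the same thing" effect can be sketched with the most extreme case of correlation: the same feature counted twice. The likelihood values and the helper name `naive_posterior_a` below are hypothetical, picked only to make the arithmetic easy to follow.

```python
def naive_posterior_a(prior_a, likelihoods_a, likelihoods_b):
    """Two-class Naive Bayes posterior for class A.

    likelihoods_a[i] is P(feature i's observed value | A), and likewise
    for B. Features are multiplied as if independent (the naive step).
    """
    score_a = prior_a
    score_b = 1 - prior_a
    for p_a, p_b in zip(likelihoods_a, likelihoods_b):
        score_a *= p_a
        score_b *= p_b
    return score_a / (score_a + score_b)


# A feature whose observed value is twice as likely under A as under B.
once = naive_posterior_a(0.5, [0.8], [0.4])
# The same feature duplicated (perfectly correlated copy).
twice = naive_posterior_a(0.5, [0.8, 0.8], [0.4, 0.4])

print(f"one copy: {once:.3f}, two copies: {twice:.3f}")
# prints: one copy: 0.667, two copies: 0.800
```

The duplicated feature adds no new information, yet the model becomes more confident in class A. If the feature points to the right class, that extra push helps; if it points to the wrong one, the same push makes the error worse.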