Data Mining - Chapter 8 (Naive Bayes) Flashcards
What is the Naive Bayes Classification?
It is a classification approach for categorical (!!) predictors. You look at the predictor values of the records in your training set: the algorithm estimates, from those records, the probability of each class given the new record's predictor profile. The new record is then assigned to the most probable class. This is of course entirely based on probabilities estimated from counts in the training set.
What is classification?
Training a model that can classify new records into predefined classes.
What is clustering?
Grouping records in clusters based on their similarities. This means there are no predefined classes.
What is meant by conditional probability?
The probability of an event A, given that event B has occurred.
-> Can we play tennis, given that the weather is sunny and windy?
-> With Naive Bayes you compare the probability that a new record falls in each class, given its predictor profile. The class with the highest probability wins.
What is the main difference between the full bayes and the naive bayes?
In full (exact) Bayes you look for conditional probabilities using only the records that match the exact predictor profile of the new record. So you calculate the complete probability based on the exact combination of predictor values.
Naive Bayes uses the whole dataset and looks at the probability of each predictor in relation to the outcome variable individually.
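The contrast can be sketched in a few lines of Python. The toy records and the (outlook, windy) predictors are made up for illustration; full Bayes conditions on exact profile matches, while Naive Bayes multiplies per-predictor conditionals:

```python
# Hypothetical records: (outlook, windy, play) -- toy data, not from the chapter
data = [("sunny", "yes", "no"), ("sunny", "no", "yes"),
        ("rainy", "no", "yes"), ("sunny", "no", "yes")]

def full_bayes(profile, cls):
    # Exact Bayes: only records whose whole predictor profile matches count.
    exact = [r for r in data if r[:-1] == profile]
    return sum(1 for r in exact if r[-1] == cls) / len(exact) if exact else 0.0

def naive_numerator(profile, cls):
    # Naive Bayes: P(class) times each predictor's conditional, taken separately.
    in_class = [r for r in data if r[-1] == cls]
    p = len(in_class) / len(data)
    for i, v in enumerate(profile):
        p *= sum(1 for r in in_class if r[i] == v) / len(in_class)
    return p

print(full_bayes(("sunny", "no"), "yes"))       # 1.0 (both exact matches play)
print(naive_numerator(("sunny", "no"), "yes"))  # 0.5 (3/4 * 2/3 * 3/3)
```

Note that `full_bayes` returns 0.0 when no record matches the exact profile, which is why it needs far more data than the naive variant.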
Why can we leave out the denominator in computing probabilities within Naive Bayes?
The denominator, P(X1, X2, ...), does not depend on y: the feature values are given, so it is the same constant for every class. Dropping it does not change which class scores highest; you simply compare P(y) * P(X1 | y) * P(X2 | y) * ... across the classes.
What is meant by the Naive conditional independence?
The assumption that all features are independent of each other, given the class label Y. It lets you factor the joint probability: P(X1, X2, ... | Y) = P(X1 | Y) * P(X2 | Y) * ...
How do you calculate in which class a new record belongs?
1. You calculate the probability of that class occurring (number of records in that class / total records in the training set)
2. You calculate the probability of each individual predictor value in relation to the specific class outcome (number of records with your desired predictor value and desired class outcome / total records with your desired class outcome)
- Do this for each predictor of that record. Then multiply the probability from step 1 by the probabilities from step 2.
- Do this for every class. The class with the highest probability is the class your record is assigned to.
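The steps above can be sketched in Python. This is a minimal sketch on a made-up toy training set (tuples of two categorical predictors plus a class label); the function names are illustrative, not from the chapter:

```python
# Toy training set (hypothetical): each record is (outlook, windy, play)
train = [
    ("sunny", "yes", "no"),
    ("sunny", "no", "yes"),
    ("rainy", "yes", "no"),
    ("rainy", "no", "yes"),
    ("sunny", "no", "yes"),
    ("overcast", "no", "yes"),
]

def naive_bayes_score(record, cls):
    """Step 1 (class prior) times step 2 (per-predictor conditionals)."""
    in_class = [r for r in train if r[-1] == cls]
    score = len(in_class) / len(train)          # step 1: P(class)
    for i, value in enumerate(record):          # step 2: one predictor at a time
        matches = sum(1 for r in in_class if r[i] == value)
        score *= matches / len(in_class)        # P(predictor value | class)
    return score

def classify(record):
    """Score every class and pick the highest."""
    classes = {r[-1] for r in train}
    return max(classes, key=lambda c: naive_bayes_score(record, c))

print(classify(("sunny", "no")))  # -> 'yes'
```

For ("sunny", "no"): score for "yes" is 4/6 * 2/4 * 4/4 = 1/3, while "no" scores 0 because no "no"-record has windy = "no"; exactly the zero-probability issue discussed further down.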
What are the advantages of the Naive Bayes classifier?
- Simplicity
- Computational efficiency
- Good classification performance, especially when having many predictor variables
- Ability to handle categorical variables
What are the disadvantages of Naive Bayes classifier?
- It requires a large number of records to obtain good results
- Performs well in classifying or ranking, but not as well in estimating the probability of class membership.
- If a predictor category does not appear in the training set for a given class, the estimated conditional probability is 0 for any new record with that category. Because the probabilities are multiplied, that single 0 outvotes all the other predictors and the overall probability becomes 0. The same happens whenever an attribute value never occurs together with your desired class.
Why is naive bayes rarely used in credit scoring?
Because it does not do well at propensity estimation, i.e. estimating the probability of class membership, which is exactly what credit scoring needs.
How can we solve the problem of a probability of 0 in Naive Bayes?
By using Laplace smoothing.
Add a small correction (of 1) to the numerator of each conditional probability, and the number of predictor categories to the denominator so the probabilities still sum to 1.
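A hedged sketch of add-1 (Laplace) smoothing for a single conditional probability; the counts and category number below are made up for illustration:

```python
def smoothed_conditional(count_value_and_class, count_class, n_categories):
    """Laplace (add-1) smoothed estimate of P(value | class).

    Add 1 to the numerator and, so the smoothed probabilities over all
    categories still sum to 1, add the number of categories to the denominator.
    """
    return (count_value_and_class + 1) / (count_class + n_categories)

# A predictor value never seen with this class no longer yields 0:
p = smoothed_conditional(0, 10, 3)
print(p)  # 1/13 instead of 0/10, so it cannot outvote the other predictors
```

With this correction a single unseen category shrinks the product instead of zeroing it out.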
What is the basic formula of conditional probability?
P(a | c) = count(a, c) / count(c)
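The count-based formula translates directly to code. A minimal sketch with made-up toy rows (last element is the class label):

```python
def conditional_probability(records, a_index, a_value, c_value):
    """P(a | c) = count(a, c) / count(c), counted over the training records."""
    count_c = sum(1 for r in records if r[-1] == c_value)
    count_a_c = sum(1 for r in records
                    if r[-1] == c_value and r[a_index] == a_value)
    return count_a_c / count_c

# Toy rows: (outlook, play) -- hypothetical data for illustration
rows = [("sunny", "yes"), ("sunny", "no"), ("rainy", "yes"), ("sunny", "yes")]
print(conditional_probability(rows, 0, "sunny", "yes"))  # 2/3
```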