8 - Naïve Bayes Classification Flashcards
What is the basis of Naïve Bayes classification methods?
Bayes Theorem
Developed by Reverend Thomas Bayes, it updates knowledge about data parameters by combining prior knowledge with new information.
What does the prior distribution represent in Bayes Theorem?
Previous knowledge about the data parameters
It is denoted as p(Y = y*).
What is the posterior distribution in the context of Bayes Theorem?
Updated parameter knowledge after observing data
Denoted as p(Y = y* | X*).
In a dataset with predictors X and response variable Y, how many class values can Y take in the given example?
Three possible class values: y1, y2, and y3
What is the objective of using Bayes Theorem in classification?
Identify the most likely class for a combination of predictor variable values
Specifically, find which of y1, y2, or y3 is most likely for the combination X*.
What does p(Y = y* | X*) represent?
The posterior probability of class value y* given the observed predictor values X*
How can you classify a record using the maximum a posteriori hypothesis?
Classify as the value of Y with the highest posterior probability
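As a compact reference for the cards above, the theorem and the maximum a posteriori rule can be written as follows. The notation follows the cards (y* for a class value, X* for the observed predictor values). The argmax form is a standard way to state the rule and is not quoted from the source.

```latex
p(Y = y^* \mid X^*) = \frac{p(X^* \mid Y = y^*)\, p(Y = y^*)}{p(X^*)}
\qquad
\hat{y}_{\mathrm{MAP}} = \arg\max_{y^* \in \{y_1,\, y_2,\, y_3\}} p(Y = y^* \mid X^*)
```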
What is the class conditional independence assumption?
It allows writing p(X* | Y = y*) as a product of per-predictor conditional probabilities, because the predictors are assumed independent within each class
For example, p(X* | Y = y*) = p(X1 | Y = y*) × p(X2 | Y = y*).
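With two predictors, X* = {X1 = x1, X2 = x2}, the assumption lets the likelihood factorize, so the posterior becomes a product of one-variable terms. A sketch in the cards' notation:

```latex
p(X^* \mid Y = y^*) = p(X_1 = x_1 \mid Y = y^*)\, p(X_2 = x_2 \mid Y = y^*)
\;\Longrightarrow\;
p(Y = y^* \mid X^*) = \frac{p(X_1 = x_1 \mid Y = y^*)\, p(X_2 = x_2 \mid Y = y^*)\, p(Y = y^*)}{p(X^*)}
```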
What are the two predictor variables used in the wine classification example?
Alcohol content and sugar content
What is the prior probability of a wine being Red if there are 500 red wines out of 1000 total?
0.5
What is the marginal probability of Alcohol_flag being High in the wine dataset?
0.486
What is the marginal probability of Sugar_flag being Low in the wine dataset?
0.584
What is p(Alcohol_flag High | Type Red)?
0.436
What is the conditional probability p(Sugar_flag Low | Type White)?
0.4
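The cards quote only one conditional probability per predictor. The other can be recovered from the law of total probability, assuming that the marginals, the conditionals, and the 0.5 priors on these cards all describe the same data set. The helper name `other_conditional` is hypothetical.

```python
# Values quoted on the cards.
prior = {"Red": 0.5, "White": 0.5}
p_alcohol_high = 0.486            # marginal p(Alcohol_flag = High)
p_sugar_low = 0.584               # marginal p(Sugar_flag = Low)
p_alcohol_high_given_red = 0.436  # p(Alcohol_flag = High | Type = Red)
p_sugar_low_given_white = 0.4     # p(Sugar_flag = Low | Type = White)


def other_conditional(marginal, known_conditional, known_prior, other_prior):
    """Solve marginal = known_conditional * known_prior + x * other_prior for x."""
    return (marginal - known_conditional * known_prior) / other_prior


# Assumption: these two values are derived here, not quoted from the source.
p_alcohol_high_given_white = other_conditional(
    p_alcohol_high, p_alcohol_high_given_red, prior["Red"], prior["White"]
)
p_sugar_low_given_red = other_conditional(
    p_sugar_low, p_sugar_low_given_white, prior["White"], prior["Red"]
)

print(round(p_alcohol_high_given_white, 3))
print(round(p_sugar_low_given_red, 3))
```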
How does the Naïve Bayes algorithm classify a wine with low alcohol and low sugar content?
It classifies the wine as Red, because Red has the higher posterior probability
What is the posterior probability of a low alcohol, low sugar wine being Red?
72.15%
What is the prior probability of a wine being White?
0.5
What is the posterior probability of a low alcohol, low sugar wine being White?
30.92%
What happens to the classification when comparing prior and posterior probabilities?
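Posterior probabilities can differ substantially from the priors once the predictor values are observed; for example, the 0.5 prior for Red rises to 72.15% for a low alcohol, low sugar wine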
Fill in the blank: The denominator in Bayes Theorem, p(X*), is known as the _______.
Marginal probability of the data
What is the posterior probability of a wine being red given high alcohol and high sugar?
25.02%
This is calculated using the Naïve Bayes algorithm.
What is the posterior probability of a wine being white given high alcohol and high sugar?
79.53%
This indicates the Naïve Bayes algorithm classifies the wine as white.
What is the Naïve Bayes classification for low alcohol and high sugar wine?
White
This classification is based on the alcohol and sugar content.
What is the Naïve Bayes classification for high alcohol and low sugar wine?
Red
This classification is based on the alcohol and sugar content.
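The four classifications on these cards can be reproduced with a short calculation. This is a sketch under stated assumptions: the values marked "derived" are not quoted on the cards but follow from the quoted marginals and the 0.5 priors (complements and the law of total probability), and the denominator is taken to be the product of the two marginal probabilities, which is what reproduces the quoted figures. The function names are hypothetical.

```python
prior = {"Red": 0.5, "White": 0.5}

# Marginal probabilities; the complements (0.514, 0.416) are derived.
marginal = {
    "Alcohol_flag": {"High": 0.486, "Low": 0.514},
    "Sugar_flag": {"High": 0.416, "Low": 0.584},
}

# Conditional probabilities p(flag | Type). Quoted on the cards: 0.436 and 0.4.
# All other entries are derived (assumption), not taken from the source.
conditional = {
    "Red": {
        "Alcohol_flag": {"High": 0.436, "Low": 0.564},
        "Sugar_flag": {"High": 0.232, "Low": 0.768},
    },
    "White": {
        "Alcohol_flag": {"High": 0.536, "Low": 0.464},
        "Sugar_flag": {"High": 0.600, "Low": 0.400},
    },
}


def posterior(wine_type, alcohol, sugar):
    """Naive Bayes posterior p(Type | Alcohol_flag, Sugar_flag)."""
    numerator = (
        conditional[wine_type]["Alcohol_flag"][alcohol]
        * conditional[wine_type]["Sugar_flag"][sugar]
        * prior[wine_type]
    )
    denominator = marginal["Alcohol_flag"][alcohol] * marginal["Sugar_flag"][sugar]
    return numerator / denominator


def classify(alcohol, sugar):
    """Maximum a posteriori rule: pick the type with the larger posterior."""
    return max(prior, key=lambda wine_type: posterior(wine_type, alcohol, sugar))


for alcohol in ("Low", "High"):
    for sugar in ("Low", "High"):
        red = posterior("Red", alcohol, sugar)
        white = posterior("White", alcohol, sugar)
        print(f"alcohol={alcohol}, sugar={sugar}: "
              f"Red {red:.2%}, White {white:.2%} -> {classify(alcohol, sugar)}")
```

Because the denominator multiplies the two marginals rather than using the joint probability of the predictors, the two posteriors for a given wine need not sum to exactly 100%, which matches the figures on the cards.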
What is the accuracy of the Naïve Bayes model when predicting wine types?
65.93%
This is calculated from the model’s predictions on a test data set.
What is the accuracy of the Naïve Bayes model for classifying red wines?
79.32%
This is the proportion of correctly classified red wines from the test data.
What is the accuracy of the Naïve Bayes model for classifying white wines?
61.48%
This is the proportion of correctly classified white wines from the test data.
What is the baseline accuracy for the wine types if half are red and half are white?
50%
This serves as a comparison to evaluate the performance of the Naïve Bayes model.
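Overall accuracy, per-class accuracy, and the baseline can all be read off a contingency table of actual versus predicted types. The counts below are hypothetical and chosen only to illustrate the arithmetic; they are not the source's test results, and the function names are assumptions.

```python
# Hypothetical counts, indexed as table[actual][predicted].
table = {
    "Red": {"Red": 80, "White": 20},
    "White": {"Red": 30, "White": 70},
}


def overall_accuracy(table):
    """Share of all records whose predicted type equals the actual type."""
    correct = sum(table[cls][cls] for cls in table)
    total = sum(sum(row.values()) for row in table.values())
    return correct / total


def class_accuracy(table, cls):
    """Share of records of one actual type that were classified correctly."""
    return table[cls][cls] / sum(table[cls].values())


def baseline_accuracy(table):
    """Accuracy of always predicting the most common actual type."""
    row_totals = [sum(row.values()) for row in table.values()]
    return max(row_totals) / sum(row_totals)


print(overall_accuracy(table))
print(class_accuracy(table, "Red"), class_accuracy(table, "White"))
print(baseline_accuracy(table))
```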
What are the predictor variables used in the Naïve Bayes classification model?
- Alcohol_flag
- Sugar_flag
These variables are used to classify the type of wine.
What Python library is used to implement the Naïve Bayes algorithm?
sklearn
Specifically, the MultinomialNB class is used for the implementation.
What is the first step in using the Naïve Bayes algorithm in Python?
Import required libraries
Including pandas, numpy, and sklearn.
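A minimal sketch of fitting `MultinomialNB` on two categorical flags follows. The toy records are hypothetical, not the source data, and the indicator-column encoding in the hypothetical `encode` helper is one possible way to present categorical flags to the model, not necessarily the source's approach. Only numpy and sklearn are used here.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy records: (Alcohol_flag, Sugar_flag, Type).
records = (
    [("High", "Low", "Red")] * 3
    + [("Low", "Low", "Red")]
    + [("Low", "High", "White")] * 3
    + [("High", "High", "White")]
)


def encode(alcohol, sugar):
    """Indicator columns: [alcohol_high, alcohol_low, sugar_high, sugar_low]."""
    return [
        int(alcohol == "High"),
        int(alcohol == "Low"),
        int(sugar == "High"),
        int(sugar == "Low"),
    ]


X = np.array([encode(alcohol, sugar) for alcohol, sugar, _ in records])
y = np.array([wine_type for _, _, wine_type in records])

model = MultinomialNB()  # default smoothing parameter alpha=1.0
model.fit(X, y)

# Classify two new wines: high alcohol/low sugar, then low alcohol/high sugar.
new_wines = np.array([encode("High", "Low"), encode("Low", "High")])
predictions = model.predict(new_wines)
print(predictions)
```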
Fill in the blank: The contingency table helps to obtain the _______ needed for Naïve Bayes calculations.
marginal and conditional probabilities
These probabilities are essential for the calculations.
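To illustrate how a contingency table yields these probabilities, the sketch below tabulates a handful of hypothetical (Type, Alcohol_flag) records with the standard library only. The records and variable names are assumptions made for the example.

```python
from collections import Counter

# Hypothetical records: (Type, Alcohol_flag).
records = [
    ("Red", "High"), ("Red", "Low"), ("Red", "Low"),
    ("White", "High"), ("White", "High"), ("White", "Low"),
]

cell_counts = Counter(records)                          # table cells
type_counts = Counter(wine_type for wine_type, _ in records)  # row totals
flag_counts = Counter(flag for _, flag in records)      # column totals
n = len(records)

# Prior p(Type) and marginal p(Alcohol_flag) come from the table totals.
prior = {wine_type: count / n for wine_type, count in type_counts.items()}
marginal = {flag: count / n for flag, count in flag_counts.items()}

# Conditional p(Alcohol_flag | Type): cell count divided by its row total.
conditional = {
    (flag, wine_type): cell_counts[(wine_type, flag)] / type_counts[wine_type]
    for wine_type in type_counts
    for flag in flag_counts
}

print(prior)
print(marginal)
print(conditional)
```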
What R package is used for Naïve Bayes classification?
e1071
This package contains the naiveBayes function for classification.
What command in R is used to run the Naïve Bayes estimator?
naiveBayes()
This function builds the model using the specified formula and data.
What do A-priori probabilities represent in the Naïve Bayes model?
Values of p(Y)
These probabilities indicate the likelihood of each class before any evidence is considered.
What do conditional probabilities represent in the Naïve Bayes model?
Values of p(X | Y)
These probabilities indicate how likely each predictor value is within each class.
What is the purpose of the predict() command in R for Naïve Bayes?
To classify each record in the test data set
This generates predictions based on the trained model.
Fill in the blank: The contingency table of actual versus predicted wine types in R is created using the _______ command.
table()
This command creates a cross-tabulation of the actual and predicted values.
What information does Bayes Theorem update about the data parameters?
Bayes Theorem updates our previous knowledge about the data parameters based on new evidence.
What does the prior probability represent?
The prior probability represents our initial belief about the probability of a parameter before observing any evidence.
What formula represents how the data behave within the target variable’s class values?
The formula representing how the data behave within the target variable’s class values is p(X | Y).
What formula represents how the data behave without reference to the class values?
The formula is p(X).
What is the formula from the previous exercise, p(X), called?
The marginal probability of the data. By contrast, p(X | Y) is called the likelihood.
What does the posterior probability represent?
The posterior probability represents the updated probability of a hypothesis after considering new evidence.
What do we use for a prior probability if we have no prior knowledge about the parameters?
We use a noninformative (uniform) prior, which assigns equal probability to every class value.
How does the maximum a posteriori hypothesis help us to classify a record?
It helps classify a record by selecting the class that maximizes the posterior probability.
What is the class conditional independence assumption?
The assumption that the features are independent given the class label.
How do we write p(X* | Y = y*) if we have two predictor variables X* = {X1 = x1, X2 = x2}?
Under class conditional independence, we write it as p(X1 = x1 | Y = y*) × p(X2 = x2 | Y = y*).
Create two contingency tables for which variables?
One with Type and Alcohol_flag, and another with Type and Sugar_flag.
What is the prior probability of Type = Red and Type = White?
Calculated from the contingency tables.
What can we calculate regarding alcohol content from the contingency tables?
The probability of high and low alcohol content.
What can we calculate regarding sugar content from the contingency tables?
The probability of high and low sugar content.
What are the conditional probabilities for Alcohol_flag given Type = Red?
p(Alcohol_flag = High | Type = Red) and p(Alcohol_flag = Low | Type = Red).
What are the conditional probabilities for Alcohol_flag given Type = White?
p(Alcohol_flag = High | Type = White) and p(Alcohol_flag = Low | Type = White).
What are the conditional probabilities for Sugar_flag given Type = Red?
p(Sugar_flag = High | Type = Red) and p(Sugar_flag = Low | Type = Red).
What are the conditional probabilities for Sugar_flag given Type = White?
p(Sugar_flag = High | Type = White) and p(Sugar_flag = Low | Type = White).
How likely is it that a randomly selected wine is red?
Discussed based on prior probabilities.
How likely is it that a randomly selected wine has high alcohol content?
Discussed based on the marginal probability of Alcohol_flag.
How likely is it that a randomly selected wine has low sugar content?
Discussed based on the marginal probability of Sugar_flag.
What might a typical white wine have as its alcohol and sugar content?
Discussed based on conditional probabilities.
What might a typical red wine have as its alcohol and sugar content?
Discussed based on conditional probabilities.
What do side-by-side bar graphs for Type compare?
They compare the wine types, one graph with an overlay of Alcohol_flag and the other with an overlay of Sugar_flag.
What is the posterior probability of Type = Red for a wine that is low in alcohol and high in sugar?
Calculated based on the relevant probabilities.
What is the posterior probability of Type = White for the same wine?
Calculated based on the relevant probabilities.
Which type is more probable for a wine with low alcohol and high sugar content?
Determined from posterior probabilities.
What is the posterior probability of Type = Red for a wine that is high in alcohol and low in sugar?
Calculated based on the relevant probabilities.
Which type is more probable for a wine with high alcohol and low sugar content?
Determined from posterior probabilities.
What does the Naïve Bayes classifier classify wines based on?
Alcohol and sugar content.
How do we evaluate the Naïve Bayes model on the wines_test data set?
Display results in a contingency table.
What values do we find for the Naïve Bayes model in the contingency table?
Accuracy and error rate.
How often does the Naïve Bayes model correctly classify red wines?
Determined from the contingency table.
How often does the Naïve Bayes model correctly classify white wines?
Determined from the contingency table.
What should be done with the variables Death, Sex, and Educ?
Convert all variables to factors.
What two contingency tables should be created for the framingham_nb data sets?
One with Death and Sex, and another with Death and Educ.
What is the probability a randomly selected person is alive or dead?
Calculated from the contingency tables.
What is the probability a randomly selected person is male?
Calculated from the contingency tables.
What is the probability a randomly selected person has an Educ value of 3?
Calculated from the contingency tables.
What are the probabilities that a dead person is male with education level 1?
Calculated from the contingency tables.
What are the probabilities that a living person is male with education level 1?
Calculated from the contingency tables.
What are the probabilities that a living person is female with education level 2?
Calculated from the contingency tables.
What are the probabilities that a dead person is female with education level 2?
Calculated from the contingency tables.
What do side-by-side bar graphs for Death compare?
They compare the Death values, one graph with an overlay of Sex and the other with an overlay of Educ.
If we know a person is dead, are they more likely to be male or female?
Determined from the bar graphs.
If we know a person is alive, are they more likely to be male or female?
Determined from the bar graphs.
If we know a person is dead, what education level are they most likely to have?
Determined from the bar graphs.
If we know a person is alive, what education level are they most likely to have?
Determined from the bar graphs.
Which education levels are more prevalent for dead persons?
Determined from the graphs.
Which education levels are more prevalent for living persons?
Determined from the graphs.
What is the posterior probability of Death = 0 for a male with education level 1?
Calculated based on the relevant probabilities.
What is the posterior probability of Death = 1 for a male with education level 1?
Calculated based on the relevant probabilities.
What is the posterior probability of Death = 0 for a female with education level 2?
Calculated based on the relevant probabilities.
What is the posterior probability of Death = 1 for a female with education level 2?
Calculated based on the relevant probabilities.
What does the Naïve Bayes classifier classify persons based on?
Sex and education.
How do we evaluate the Naïve Bayes model on the framingham_nb_test data set?
Display results in a contingency table.
How often does the Naïve Bayes model correctly classify dead persons?
Determined from the contingency table.
How often does the Naïve Bayes model correctly classify living persons?
Determined from the contingency table.