Lecture 7 Flashcards
Random Variable
A variable representing an element or event whose outcome is unknown
Domain
The set of values a random variable can take
Conditional probability
The probability that some outcome occurs given that another event has occurred
Joint Probability
The probability that a set of random variables will each take a specific value at the same time
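As a small worked example of the joint and conditional probability cards, the sketch below computes both from a table. The variables (`weather`, `traffic`) and all numbers are made up for illustration; they are not from the lecture.

```python
# Hypothetical joint distribution P(weather, traffic).
# The numbers are invented for illustration and sum to 1.
joint = {
    ("rain", "heavy"): 0.30,
    ("rain", "light"): 0.10,
    ("sun", "heavy"): 0.15,
    ("sun", "light"): 0.45,
}

def marginal_weather(w):
    """P(weather = w), obtained by summing the joint over traffic."""
    return sum(p for (wi, _), p in joint.items() if wi == w)

def conditional_traffic_given_weather(t, w):
    """P(traffic = t | weather = w) = P(w, t) / P(w)."""
    return joint[(w, t)] / marginal_weather(w)

# Joint: both variables take specific values at once.
p_joint = joint[("rain", "heavy")]
# Conditional: traffic is heavy, given that it is raining.
p_cond = conditional_traffic_given_weather("heavy", "rain")
print(p_joint, round(p_cond, 2))
```

Note that the conditional probability is larger than the joint probability here, because conditioning restricts attention to the rainy cases only.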
What are three types of classifiers?
Instance-based, generative, and discriminative classifiers
Instance-based classifiers
Use the stored observations directly, without building a model
e.g. k-nearest neighbors
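A minimal sketch of the k-nearest-neighbors idea, assuming Euclidean distance and a simple majority vote (both are common choices, not something the card specifies). The function name `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_tuple, label) pairs. No model is fitted:
    the stored observations are used directly at prediction time.
    """
    # Sort stored observations by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters in 2D.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (4.9, 5.0)))
```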
Generative classifiers
Build a generative statistical model of the data
e.g. Bayes classifiers
Discriminative classifiers
Directly estimate a decision rule/boundary
e.g. decision trees
Gaussian Naive Bayes classifier
Assumes that the features within each class follow a normal (Gaussian) distribution
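The Gaussian assumption can be sketched as follows: estimate a mean and variance per feature and per class, then pick the class with the highest posterior. This is an illustrative sketch, not the lecture's implementation; the function names, the small variance floor, and the toy data are assumptions.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A tiny constant is added to each variance to avoid division by
        # zero (an assumption made for this sketch).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def log_gaussian(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    def score(label):
        prior, means, variances = model[label]
        return math.log(prior) + sum(
            log_gaussian(x, m, v) for x, m, v in zip(row, means, variances))
    return max(model, key=score)

# Hypothetical toy data with two continuous features.
X = [(1.0, 2.1), (1.2, 1.9), (0.8, 2.0), (3.9, 5.0), (4.1, 5.2), (4.0, 4.8)]
y = ["low", "low", "low", "high", "high", "high"]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, (1.1, 2.0)))
print(predict_gaussian_nb(model, (4.0, 5.1)))
```

Summing per-feature log densities is where the "naive" part enters: the features are treated as independent given the class.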
Multinomial Naive Bayes
Each feature represents an integer count of something, such as how often a word appears in a sentence
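To illustrate count features, here is a rough sketch of a multinomial Naive Bayes text classifier built on word counts. The add-one (Laplace) smoothing controlled by `alpha` is an assumption added so unseen words do not produce zero probabilities; the function names and the example sentences are hypothetical.

```python
import math
from collections import Counter

def fit_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed per-class word log likelihoods."""
    vocab = sorted({w for doc in docs for w in doc.split()})
    counts = {c: Counter() for c in set(labels)}
    for doc, c in zip(docs, labels):
        counts[c].update(doc.split())  # integer word counts per class
    model = {}
    for c, word_counts in counts.items():
        total = sum(word_counts.values())
        log_prior = math.log(labels.count(c) / len(labels))
        # Smoothed estimate of P(word | class); alpha is an assumed choice.
        log_like = {
            w: math.log((word_counts[w] + alpha) / (total + alpha * len(vocab)))
            for w in vocab
        }
        model[c] = (log_prior, log_like)
    return model

def predict_multinomial_nb(model, doc):
    """Return the class with the highest log posterior for the sentence."""
    words = doc.split()
    def score(c):
        log_prior, log_like = model[c]
        # Words never seen in training are skipped in this sketch.
        return log_prior + sum(log_like[w] for w in words if w in log_like)
    return max(model, key=score)

# Hypothetical training sentences.
docs = ["win money now", "win prize money", "meeting at noon", "lunch meeting today"]
labels = ["spam", "spam", "ham", "ham"]
model = fit_multinomial_nb(docs, labels)
print(predict_multinomial_nb(model, "win money"))
print(predict_multinomial_nb(model, "lunch meeting"))
```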
Bernoulli Naive Bayes
Assumes the features are binary, or continuous values that can be split into two groups (binarized) with a predefined threshold
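Binarizing with a threshold can be sketched in a few lines. The threshold value, the function names, and the per-feature probabilities below are all hypothetical, chosen only to show the mechanics.

```python
import math

def binarize(row, threshold=0.5):
    """Map each continuous feature to 1 if it exceeds the threshold, else 0."""
    return [1 if value > threshold else 0 for value in row]

def bernoulli_log_likelihood(bits, probs):
    """Log P(bits | class), where probs[i] is the assumed P(feature i = 1 | class)."""
    return sum(math.log(p if b else 1 - p) for b, p in zip(bits, probs))

bits = binarize([0.1, 0.7, 0.5, 2.3])
print(bits)
# Hypothetical per-feature probabilities for one class.
print(bernoulli_log_likelihood(bits, [0.2, 0.9, 0.3, 0.8]))
```

Unlike the multinomial variant, a feature that is absent (0) still contributes to the likelihood through the `1 - p` term.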
Advantages of Naive Bayes classifiers
They are simple and work well with a small amount of training data; prediction simply picks the class with the highest probability as the most likely class
Disadvantage of Naive Bayes classifiers
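The parameters must be estimated from the data, and the "naive" assumption that features are independent given the class rarely holds exactly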
What is the complexity of a decision tree model determined by?
The depth of the tree
What causes overfitting in decision trees?
Increasing the depth of the tree and thus increasing the number of decision boundaries
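The link between depth and overfitting can be illustrated with a toy one-feature decision tree. This is a simplified sketch: it splits greedily on misclassification count (real implementations often use Gini impurity or entropy), and the data set, with two deliberately mislabeled points, is made up.

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(points, max_depth):
    """points: list of (x, label). Returns a leaf label or (threshold, left, right)."""
    labels = [lab for _, lab in points]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    xs = sorted({x for x, _ in points})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate decision boundary
        left = [p for p in points if p[0] <= t]
        right = [p for p in points if p[0] > t]
        # Training errors if each side predicts its own majority label.
        errors = sum(
            len(side) - Counter(lab for _, lab in side).most_common(1)[0][1]
            for side in (left, right))
        if best is None or errors < best[0]:
            best = (errors, t, left, right)
    if best is None:  # all x values identical, so no split is possible
        return majority(labels)
    _, t, left, right = best
    return (t, build_tree(left, max_depth - 1), build_tree(right, max_depth - 1))

def predict(tree, x):
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[1]) + count_leaves(tree[2])

def accuracy(tree, points):
    return sum(predict(tree, x) == lab for x, lab in points) / len(points)

# Hypothetical data: label is 0 for x < 5 and 1 otherwise,
# except for two mislabeled (noisy) points at x = 2 and x = 7.
data = [(0, 0), (1, 0), (2, 1), (3, 0), (4, 0),
        (5, 1), (6, 1), (7, 0), (8, 1), (9, 1)]

for depth in (1, 2, 4, 10):
    tree = build_tree(data, depth)
    print(depth, count_leaves(tree), accuracy(tree, data))
```

A shallow tree ignores the two noisy points, while a deep enough tree adds extra boundaries just to isolate them. Perfect training accuracy obtained this way is a sign of memorized noise rather than a better model.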