Lecture 8 - Probabilistic models Flashcards
What are the two interpretations of probability?
- Frequentist interpretation
- Bayesian interpretation
What is the frequentist interpretation of probability?
Probabilities represent long run frequencies of events.
What is the Bayesian interpretation of probability?
Probabilities are used to quantify our uncertainty about something and are related to information rather than repeated trials.
What are the 4 components of Bayes' rule?
- Prior P(Y)
- Posterior P(Y|X)
- Likelihood P(X|Y)
- Evidence P(X)
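What would the 4 components represent in the spam vs ham example?
- Y is the class: ham or spam
- X represents the features (the words in the email)
- P(Y) is the prior: the probability of receiving a spam or ham email in general
- P(Y|X) is the posterior: the probability that an email is spam or ham given the words it contains
- P(X|Y) is the likelihood: the probability of seeing these words in an email known to be spam or ham
- P(X) is the evidence: the probability of a specific email with certain words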
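Worked example: a minimal sketch of Bayes' rule for the spam vs ham case. The numbers and the word "lottery" are made up for illustration and the function name `posterior` is hypothetical; the evidence P(X) is computed by summing P(X|Y) P(Y) over both classes.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule: P(Y|X) = P(X|Y) * P(Y) / P(X)."""
    # Evidence P(X): sum of P(X|Y) * P(Y) over all classes Y.
    evidence = sum(likelihood[y] * prior[y] for y in prior)
    return {y: likelihood[y] * prior[y] / evidence for y in prior}


# Hypothetical numbers: 30% of all email is spam, and the word
# "lottery" appears in 40% of spam emails but only 1% of ham emails.
prior = {"spam": 0.3, "ham": 0.7}
likelihood = {"spam": 0.4, "ham": 0.01}  # P("lottery" in email | Y)

post = posterior(prior, likelihood)
print(post)  # P(spam | "lottery") is roughly 0.945
```

Even with a prior that favours ham, the word is so much more likely under spam that the posterior ends up strongly favouring spam.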
When should I use the likelihood and when should I use the posterior?
Use likelihoods if you want to ignore the prior distribution or assume it is uniform, and posterior probabilities otherwise.
Posterior odds = likelihood ratio x prior odds
True
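Worked example: a small sketch checking the odds form of Bayes' rule with made-up numbers (all values below are hypothetical). Odds are P(spam)/P(ham), and odds convert back to a probability via odds / (1 + odds).

```python
def odds(p):
    """Convert a probability into odds: p / (1 - p)."""
    return p / (1.0 - p)


# Hypothetical values for a two-class spam/ham problem.
prior_spam = 0.3                 # P(spam)
lik_spam, lik_ham = 0.4, 0.01    # P(X|spam), P(X|ham)

prior_odds = odds(prior_spam)            # P(spam) / P(ham)
likelihood_ratio = lik_spam / lik_ham    # P(X|spam) / P(X|ham)
posterior_odds = likelihood_ratio * prior_odds

# Convert the posterior odds back into a probability P(spam|X).
posterior_spam = posterior_odds / (1.0 + posterior_odds)
print(posterior_odds, posterior_spam)
```

If the prior is uniform, the prior odds are 1 and the posterior odds equal the likelihood ratio, which is why likelihoods alone are enough in that case.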
What are the two types of probability density functions of the Gaussian distribution?
- Uni-variate
- Multi-variate
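Worked example: a sketch of both densities using only the standard library. The function names are hypothetical, and the multivariate version is restricted to two dimensions so the covariance matrix can be inverted by hand; a general implementation would use a linear algebra library.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_pdf_2d(x, mu, cov):
    """Bivariate Gaussian density; cov is a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mu[0], x[1] - mu[1])
    # Squared Mahalanobis distance: dx^T * inv * dx.
    m = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.exp(-0.5 * m) / (2.0 * math.pi * math.sqrt(det))


print(gaussian_pdf(0.0, 0.0, 1.0))
print(gaussian_pdf_2d((0.0, 0.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))))
```

With an identity covariance matrix the bivariate density is just the product of two univariate standard normal densities.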
What is the maximum-likelihood decision rule?
The maximum-likelihood decision rule classifies using the likelihood ratio:
LR(x) = P(x|+) / P(x|-)
Predict the positive class if LR(x) > 1 and the negative class otherwise.
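Worked example: a sketch of the likelihood-ratio rule, assuming for illustration that each class has a univariate Gaussian likelihood. The class parameters and the function name `ml_classify` are hypothetical, and ties (LR = 1) are sent to the negative class by convention here.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density, used here as the class likelihood P(x|class)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def ml_classify(x, pos, neg):
    """Return the predicted label and LR(x) = P(x|+) / P(x|-)."""
    lr = gaussian_pdf(x, *pos) / gaussian_pdf(x, *neg)
    return ("+" if lr > 1.0 else "-"), lr


# Hypothetical class models given as (mean, standard deviation).
pos_model = (2.0, 1.0)
neg_model = (0.0, 1.0)

label_a, lr_a = ml_classify(1.5, pos_model, neg_model)  # closer to the + mean
label_b, lr_b = ml_classify(0.5, pos_model, neg_model)  # closer to the - mean
print(label_a, lr_a)
print(label_b, lr_b)
```

Note that the prior never appears: this rule is equivalent to the posterior-based rule only when both classes are equally likely a priori.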
What is MLE?
Maximum likelihood estimation is a method of estimating the parameters of a probability distribution by maximising a likelihood function, so that under the assumed statistical model the observed data is most probable.
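Worked example: a sketch of MLE for a univariate Gaussian, chosen here only as an illustration. For this model the maximising parameters have a closed form: the sample mean, and the average squared deviation (dividing by n, not n - 1). The data and function names are hypothetical.

```python
import math


def gaussian_mle(data):
    """Closed-form maximum likelihood estimates (mean, variance) for a Gaussian."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # divides by n, not n - 1
    return mu, var


def log_likelihood(data, mu, var):
    """Log-likelihood of the data under a Gaussian with the given mean and variance."""
    return sum(-0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)
               for x in data)


data = [2.0, 4.0, 4.0, 6.0]  # made-up observations
mu_hat, var_hat = gaussian_mle(data)
print(mu_hat, var_hat)  # 4.0 2.0

# Any other parameter setting gives a lower log-likelihood.
print(log_likelihood(data, mu_hat, var_hat))
print(log_likelihood(data, 4.5, 2.0))
```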
What is the multivariate Bernoulli distribution?
Models whether or not a word occurs in a document. It is a joint distribution over a bit vector with one bit per vocabulary word.
What is the multinomial distribution?
Models how many times each word occurs. It is a joint distribution over a count vector: a histogram of the number of occurrences of all vocabulary words in a document.
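Worked example: a sketch of the two document representations these models are defined over. The vocabulary and the document are made up for illustration.

```python
from collections import Counter

# Hypothetical vocabulary and document.
vocabulary = ["free", "lottery", "meeting", "win"]
document = "win win free lottery win".split()

counts = Counter(document)

# Multinomial model: how many times each vocabulary word occurs.
count_vector = [counts[w] for w in vocabulary]

# Multivariate Bernoulli model: whether each vocabulary word occurs at all.
bit_vector = [1 if counts[w] > 0 else 0 for w in vocabulary]

print(count_vector)  # [1, 1, 0, 3]
print(bit_vector)    # [1, 1, 0, 1]
```

The bit vector throws away the fact that "win" appears three times, which the count vector keeps.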
What is the naive Bayes assumption?
The naive Bayes assumption is that features are independent of each other given the class. The multinomial and multivariate Bernoulli models both assume that words are drawn independently from the same categorical distribution.
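Worked example: a sketch of how the independence assumption is used with a multivariate Bernoulli model. The class likelihood of a bit vector becomes a simple product of per-word probabilities (a sum in log space). All parameters below are made up, and the function names are hypothetical.

```python
import math


def bernoulli_log_likelihood(bits, theta):
    """log P(x|y) under independence: sum over words of log P(x_i|y)."""
    return sum(math.log(t if b else 1.0 - t) for b, t in zip(bits, theta))


def naive_bayes_posterior(bits, params, prior):
    """Posterior P(y|x) from per-class word probabilities and class priors."""
    log_joint = {y: math.log(prior[y]) + bernoulli_log_likelihood(bits, params[y])
                 for y in prior}
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_joint.values())
    unnorm = {y: math.exp(v - m) for y, v in log_joint.items()}
    z = sum(unnorm.values())
    return {y: u / z for y, u in unnorm.items()}


# Hypothetical P(word occurs | class) for the vocabulary
# ["free", "lottery", "meeting", "win"].
params = {"spam": [0.6, 0.4, 0.05, 0.5], "ham": [0.1, 0.01, 0.4, 0.05]}
prior = {"spam": 0.3, "ham": 0.7}

bits = [1, 1, 0, 1]  # "free", "lottery" and "win" occur; "meeting" does not
post = naive_bayes_posterior(bits, params, prior)
print(post)
```

Without the independence assumption, the model would need a probability for every possible bit vector per class, which grows exponentially with the vocabulary size.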
Is the naive Bayes assumption true?
No.
In general, words are not drawn independently from a dictionary of words.