Lecture 8 - Probabilistic models Flashcards

1
Q

What are the two interpretations of probability?

A
  1. Frequentist interpretation
  2. Bayesian interpretation
2
Q

What is the frequentist interpretation of probability?

A

Probabilities represent long-run frequencies of events.
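
A minimal sketch of this idea, simulating coin flips in Python (the coin's probability 0.5 and the sample size are invented for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    flips = rng.random(100_000) < 0.5   # simulated fair-coin flips (True = heads)
    # The empirical frequency of heads approaches the probability 0.5
    print(flips.mean())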

3
Q

What is the Bayesian interpretation?

A

Probabilities are used to quantify our uncertainty about something and are related to information rather than repeated trials.

4
Q

What are the 4 components of Bayes' rule?

A
  • Prior P(Y)
  • Posterior P(Y|X)
  • Likelihood P(X|Y)
  • Evidence P(X)
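
These combine as Posterior = Likelihood × Prior / Evidence, i.e. P(Y|X) = P(X|Y)P(Y)/P(X). A minimal sketch in Python, with every number invented for illustration:

    # Bayes' rule: posterior = likelihood * prior / evidence
    def posterior(likelihood, prior, evidence):
        return likelihood * prior / evidence

    # Hypothetical values: P(X|Y) = 0.8, P(Y) = 0.3, P(X) = 0.4
    print(posterior(0.8, 0.3, 0.4))   # 0.6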
5
Q

What would the 4 components represent in the spam vs ham example?

A
  • Y is Ham/Spam
  • X represents the features (words in the email)
  • P(Y) is the Prior: the probability of receiving a spam or ham email in general
  • P(X|Y) is the Likelihood: the probability of those words appearing given the class
  • P(Y|X) is the Posterior: the probability that the email is spam (or ham) given its words
  • P(X) is the Evidence: the probability of a specific email with certain words
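
A worked numeric sketch for a single word feature; the word 'free' and all probabilities are invented for illustration:

    # Invented numbers for the spam/ham example
    p_spam, p_ham = 0.4, 0.6    # priors P(Y = spam), P(Y = ham)
    p_word_spam = 0.25          # likelihood P('free' | spam)
    p_word_ham = 0.05           # likelihood P('free' | ham)

    # Evidence P(X): marginalise the word over both classes
    p_word = p_word_spam * p_spam + p_word_ham * p_ham   # 0.13

    # Posterior P(spam | 'free') via Bayes' rule
    print(p_word_spam * p_spam / p_word)                 # ~0.769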
6
Q

When should I use the likelihood and when should I use the posterior?

A

Use likelihoods if you want to ignore the prior distribution or assume it is uniform, and posterior probabilities otherwise.

7
Q

Posterior odds = likelihood ratio × prior odds

A

True.
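
A quick numeric check, reusing the invented spam/ham numbers from the previous card:

    prior_odds = 0.4 / 0.6            # P(spam) / P(ham)
    likelihood_ratio = 0.25 / 0.05    # P('free' | spam) / P('free' | ham) = 5
    posterior_odds = likelihood_ratio * prior_odds
    print(posterior_odds)             # ~3.33, matching 0.769 / 0.231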

8
Q

What are the two types of probability density functions of the Gaussian distribution?

A
  • Univariate
  • Multivariate
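
A minimal sketch of both densities using scipy.stats; all parameter values are arbitrary choices for illustration:

    import numpy as np
    from scipy.stats import norm, multivariate_normal

    # Univariate Gaussian: one mean and one standard deviation
    print(norm.pdf(0.5, loc=0.0, scale=1.0))

    # Multivariate Gaussian: a mean vector and a covariance matrix
    mean = np.zeros(2)
    cov = np.array([[1.0, 0.3],
                    [0.3, 1.0]])
    print(multivariate_normal.pdf([0.5, -0.2], mean=mean, cov=cov))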
9
Q

What is the maximum-likelihood decision rule?

A

The maximum-likelihood decision rule classifies using the likelihood ratio:
LR(x) = P(x|+) / P(x|-)
Predict + when LR(x) > 1, and - otherwise.
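
A minimal sketch of the rule with two Gaussian class-conditional densities; the means and shared standard deviation are invented for illustration:

    from scipy.stats import norm

    def ml_classify(x, mu_pos=1.0, mu_neg=-1.0, sigma=1.0):
        # Likelihood ratio LR(x) = P(x|+) / P(x|-)
        lr = norm.pdf(x, mu_pos, sigma) / norm.pdf(x, mu_neg, sigma)
        return '+' if lr > 1 else '-'

    print(ml_classify(0.3))   # '+': 0.3 is more likely under the + density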

10
Q

What is MLE?

A

Maximum likelihood estimation is a method of estimating the parameters of a probability distribution by maximising a likelihood function, so that under the assumed statistical model the observed data is most probable.
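
For a Gaussian, maximising the likelihood yields the sample mean and the (biased) sample variance; a minimal sketch on simulated data, with the true parameters invented:

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(loc=2.0, scale=0.5, size=10_000)

    # Gaussian MLE: sample mean and (biased) sample variance
    mu_hat = data.mean()
    var_hat = ((data - mu_hat) ** 2).mean()
    print(mu_hat, var_hat)   # close to the true 2.0 and 0.25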

11
Q

What is the multivariate Bernoulli distribution?

A

Models whether or not a word occurs in a document: the joint distribution over a bit vector of word occurrences.
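
A minimal sketch of one document's likelihood under this model, with a three-word vocabulary and invented occurrence probabilities:

    import numpy as np

    theta = np.array([0.9, 0.2, 0.05])   # invented P(word j occurs | class)
    x = np.array([1, 0, 1])              # bit vector: does word j occur?

    # Multivariate Bernoulli likelihood: product over the whole vocabulary
    print(np.prod(theta**x * (1 - theta)**(1 - x)))   # 0.9 * 0.8 * 0.05 = 0.036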

12
Q

What is the multinomial distribution?

A

Models how many times each word occurs: the joint distribution over a count vector, a histogram of the number of occurrences of all vocabulary words in a document.
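
A minimal sketch of one document's likelihood under this model using scipy.stats; the vocabulary, probabilities, and counts are all invented:

    from scipy.stats import multinomial

    theta = [0.5, 0.3, 0.2]   # invented categorical word probabilities
    counts = [4, 1, 0]        # count vector for a 5-word document

    print(multinomial.pmf(counts, n=5, p=theta))   # 5 * 0.5**4 * 0.3 = 0.09375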

13
Q

What is the Naive Bayes assumption?

A

The Naive Bayes assumption is that features are independent of each other given the class. The multinomial and multivariate Bernoulli models assume that variables are drawn independently from the same categorical distribution.
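
Under the assumption, the class-conditional joint probability factorises into a product over the features; a minimal sketch with invented numbers:

    import numpy as np

    # Invented per-feature probabilities P(x_j | class) for three features
    p_feature_given_class = np.array([0.7, 0.1, 0.6])

    # Naive Bayes: P(x | class) is the product of the per-feature terms
    print(np.prod(p_feature_given_class))   # 0.7 * 0.1 * 0.6 = 0.042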

14
Q

Is the Naive Bayes assumption true?

A

No.
In general, words are not drawn independently from a dictionary of words.
