Chapter 2 Flashcards
what are 2 approaches to probability?
- Frequentist interpretation
- Bayesian interpretation
describe the Frequentist interpretation.
- probabilities represent long-run frequencies of events.
- For example: if we flip a fair coin many times, we expect it to land heads about half the time.
describe the Bayesian interpretation.
- probability is used to quantify our uncertainty about something; hence it is fundamentally related to information rather than repeated trials.
- For example: before the next toss of a fair coin, we believe it is equally likely to land heads or tails.
______ can be used to model uncertainty about events that do not have long-run frequencies? give an example.
- Bayesian interpretation
- For example, we might have received a specific email message, and want to compute the probability it is spam.
how can we calculate union probability?
p(A ∨ B) = p(A) + p(B) − p(A ∧ B)
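As a quick sanity check, here is a hypothetical card-deck example (not from the source) of the union formula:

```python
# Hypothetical example: probability that a card drawn from a standard
# 52-card deck is a heart (A) or a face card (B).
p_a = 13 / 52        # p(A): 13 hearts
p_b = 12 / 52        # p(B): 12 face cards (J, Q, K in 4 suits)
p_a_and_b = 3 / 52   # p(A ∧ B): the 3 heart face cards

p_a_or_b = p_a + p_b - p_a_and_b  # p(A ∨ B)
print(p_a_or_b)  # 22/52 ≈ 0.423
```

Subtracting p(A ∧ B) avoids double-counting the heart face cards.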
how can we calculate joint probability?
- p(A ∧ B) = p(A|B) * p(B)
- p(A ∧ B) = p(B|A) * p(A)
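With made-up numbers (a hypothetical check, not from the source), the two factorizations of the joint must agree:

```python
# Both product-rule factorizations of p(A ∧ B) give the same value.
p_b = 0.4           # p(B), made up
p_a_given_b = 0.5   # p(A|B), made up
p_joint = p_a_given_b * p_b  # p(A ∧ B) = 0.2

p_a = 0.25                    # p(A), chosen to stay consistent
p_b_given_a = p_joint / p_a   # p(B|A) = 0.8
assert abs(p_b_given_a * p_a - p_joint) < 1e-12
print(p_joint)  # 0.2
```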
how can we calculate conditional probability?
- p(A|B) = p(A ∧ B) / p(B), provided p(B) > 0
- the probability of A given B
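A minimal numeric sketch of the definition, using made-up probabilities:

```python
# Conditional probability as the ratio of joint to marginal.
p_b = 0.5         # p(B), made up
p_a_and_b = 0.2   # p(A ∧ B), made up

if p_b > 0:  # the definition requires p(B) > 0
    p_a_given_b = p_a_and_b / p_b
    print(p_a_given_b)  # 0.4
```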
describe the Bayes prior / state of nature.
- It is the probability of an event occurring before new data is collected, based on current knowledge.
- It reflects our prior knowledge of how likely each class (state of nature) is.
- Priors are known before the training process.
- The state of nature ωi is treated as a random variable, with prior probability P(ωi).
- If there are only two classes, the priors sum to 1: P(ω1) + P(ω2) = 1.
describe the Bayes likelihood / class-conditional probabilities.
- It is the probability that a feature x occurs given that it belongs to a particular class ωi.
- It is denoted P(X|ωi).
- It is the quantity we estimate from the data during training.
describe the Bayes evidence.
- It is the probability of occurrence of a particular feature, i.e. P(X).
- Evidence values are also computed during training.
how can we calculate the evidence in Bayes theory?
using the law of total probability: P(X) = Σi P(X|ωi)P(ωi), summing over all classes i = 1..n
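A minimal sketch of this sum, with made-up priors and class-conditional likelihoods:

```python
# Evidence P(X) as a prior-weighted sum of likelihoods over all classes.
priors = [0.6, 0.4]       # P(ω1), P(ω2); made up, must sum to 1
likelihoods = [0.3, 0.7]  # P(X|ω1), P(X|ω2) for one observed X; made up

evidence = sum(l * p for l, p in zip(likelihoods, priors))  # P(X)
print(evidence)  # 0.6*0.3 + 0.4*0.7 = 0.46
```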
describe the Bayes posterior probabilities.
- It is the probability of class ωi given the observed features: P(ωi|X).
- It is what we aim to compute in the test phase, where we are given test features and must find how likely it is, under the trained model, that they belong to a particular class ωi.
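Putting the pieces together via Bayes' theorem, P(ωi|X) = P(X|ωi)P(ωi) / P(X), with the same made-up numbers as assumptions:

```python
# Posteriors: likelihood times prior, normalized by the evidence.
priors = [0.6, 0.4]       # P(ω1), P(ω2); made up
likelihoods = [0.3, 0.7]  # P(X|ω1), P(X|ω2); made up

evidence = sum(l * p for l, p in zip(likelihoods, priors))  # P(X) = 0.46
posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]
print(posteriors)  # [0.18/0.46, 0.28/0.46] ≈ [0.391, 0.609]
```

Note that even though ω1 has the larger prior, the stronger likelihood for ω2 makes its posterior larger.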
what should our predictions be in the case of:
1. P(ω1) ≫ P(ω2)
2. P(ω1) = P(ω2)
3. what is the probability of error?
- 1: the decision should favor ω1
- 2: either choice is equally good; our prediction is right only half the time
- 3: the minimum of P(ω1) and P(ω2)
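The prior-only decision rule above can be sketched as follows (made-up prior values, with P(ω1) ≫ P(ω2)):

```python
# Decide the class with the larger prior; with no features observed,
# the probability of error is the smaller prior.
priors = {"w1": 0.8, "w2": 0.2}  # made-up P(ω1), P(ω2)

decision = max(priors, key=priors.get)  # class with the largest prior
p_error = min(priors.values())          # min(P(ω1), P(ω2))
print(decision, p_error)  # w1 0.2
```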