3. Bayesian inference, Empirical Bayes Flashcards
Write Bayes' rule
P(A|B) = P(B|A)P(A)/P(B)
What does Bayes rule allow for?
It provides a way to incorporate prior information into statistical models.
How do we prove Bayes' rule?
P(A,B) = P(A|B)P(B)
P(A,B) = P(B|A)P(A)
P(A|B)P(B) = P(B|A)P(A)
P(A|B) = P(B|A)P(A)/P(B)
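The derivation can be checked numerically. A minimal sketch, using a small made-up joint distribution for two binary events (the numbers are illustrative only, not from the card):

```python
# Verify Bayes' rule on a made-up joint distribution of two binary events.
p_joint = {  # P(A=a, B=b); illustrative numbers only
    (True, True): 0.12, (True, False): 0.28,
    (False, True): 0.18, (False, False): 0.42,
}
p_a = sum(p for (a, _), p in p_joint.items() if a)  # marginal P(A) = 0.4
p_b = sum(p for (_, b), p in p_joint.items() if b)  # marginal P(B) = 0.3
p_a_given_b = p_joint[(True, True)] / p_b           # P(A|B) from the joint
p_b_given_a = p_joint[(True, True)] / p_a           # P(B|A) from the joint

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
assert abs(p_a_given_b - p_b_given_a * p_a / p_b) < 1e-12
```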
Give an example where we use Bayes' rule
Use the physicist's twins example:
We know the physicist is having twin boys. She asks: what is the probability that they are identical?
To answer correctly, we must account for the fact that we know she is having two children of the same sex; we cannot rely only on prior knowledge (that 1/3 of twin births are identical and 2/3 are fraternal).
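A worked sketch of the calculation, assuming (beyond what the card states) that identical twins are always the same sex, so P(two boys | identical) = 1/2, and that fraternal twins' sexes are independent, so P(two boys | fraternal) = 1/4:

```python
# Posterior probability that the twins are identical, given two boys.
prior_identical = 1 / 3   # from the card: 1/3 of twin births are identical
prior_fraternal = 2 / 3
lik_identical = 1 / 2     # assumed: P(two boys | identical)
lik_fraternal = 1 / 4     # assumed: P(two boys | fraternal)

# Bayes' rule with the evidence P(two boys) in the denominator.
evidence = prior_identical * lik_identical + prior_fraternal * lik_fraternal
posterior_identical = prior_identical * lik_identical / evidence
print(posterior_identical)  # 0.5
```

Note how the data (two boys) moves the probability of identical twins from the prior 1/3 up to 1/2.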
What is the prior based on?
It is different for every situation: sometimes we can obtain it from a data set with information, but sometimes we have to estimate it.
What is the frequentist's definition of probability?
The probability of an event is the number of times the event occurs out of the total number of trials, as the number of trials goes to infinity. The problem here is that not all events can be repeated numerous times.
What do we mean when we say probability is a measure of uncertainty?
We know that a fair coin is 1/2 for heads.
- Frequency argument: probability = relative frequency obtained in a long sequence of tosses.
- Symmetry or exchangeability argument: probability = number of favorable cases / number of possibilities.
The symmetry argument allows us to assign probability to a single experiment!
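The two arguments can be contrasted with a quick simulation (a hypothetical sketch, not from the card): the relative frequency of heads over many tosses approaches 1/2, while the symmetry argument assigns 1/2 before any toss is made.

```python
import random

random.seed(0)  # reproducible run

# Frequency argument: estimate P(heads) from a long sequence of tosses.
n_tosses = 100_000
heads = sum(random.random() < 0.5 for _ in range(n_tosses))
rel_freq = heads / n_tosses

# Symmetry argument: one favorable case (heads) out of two possibilities,
# assigned without performing a single toss.
symmetry_prob = 1 / 2

print(rel_freq)  # close to 0.5 for large n_tosses
```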
Explain Bayesian philosophy in a nutshell
- Probabilities do not have to be based on the frequencies of outcomes of experiments (contrary to the frequentist viewpoint).
- We may assign probabilities to e.g. parameters and models.
- Probabilities express the degree of uncertainty about our knowledge of a variable.
What is Bayesian inference?
Updating our beliefs about unknown quantities (e.g. parameters θ) with Bayes' rule after observing data, i.e. computing the posterior p(θ|x) from the prior and the likelihood.
Explain: prior, likelihood, posterior
Prior: Assumptions about θ before seeing the data.
Likelihood: The effect of the observed data x.
Posterior: The uncertainty in θ after observing x (what are the parameters given the data we have seen?).
Explain the three steps in the Bayesian method:
- Choose a prior density p(θ) on the parameters θ ∈ Ω.
- Choose a statistical model fθ(x) ≡ p(x|θ) that reflects our beliefs about x given θ.
- After observing data x, update our beliefs and calculate the posterior distribution p(θ|x).
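The three steps can be sketched with a conjugate beta-binomial example (a hypothetical coin-bias problem, not from the card): a Beta(a, b) prior on θ, a binomial likelihood, and a Beta(a + k, b + n − k) posterior.

```python
# Three-step Bayesian method for a coin-bias parameter theta in [0, 1].

# Step 1: choose a prior p(theta) = Beta(a, b); a = b = 1 is a flat prior.
a, b = 1.0, 1.0

# Step 2: choose a model p(x | theta) = Binomial(n, theta);
# suppose we observe k = 7 heads in n = 10 tosses (made-up data).
n, k = 10, 7

# Step 3: update to the posterior p(theta | x) = Beta(a + k, b + n - k),
# which follows from conjugacy of the beta prior with the binomial model.
a_post, b_post = a + k, b + (n - k)

posterior_mean = a_post / (a_post + b_post)  # (a + k) / (a + b + n)
print(a_post, b_post, posterior_mean)  # 8.0 4.0 0.6666666666666666
```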
Note: Bayes' rule says the posterior (the parameters given the data) is proportional to the likelihood times the prior: p(θ|x) ∝ p(x|θ)p(θ).
Option 1: Flat prior, p(θ) = 0.5 on θ ∈ [−1, 1] and zero otherwise.
Option 2: Jeffreys' prior, p(θ) ∝ 1/√(1 − θ²).
Option 3: Triangular prior, p(θ) = 1 − |θ|.
Option 1 would not be good: it assumes all values of θ are equally likely.
Option 2 is better because ?
Option 3:
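As a quick sanity check (a sketch, not from the card), the flat prior p(θ) = 0.5 and the triangular prior p(θ) = 1 − |θ| can be verified numerically to be proper densities on [−1, 1], i.e. to integrate to 1:

```python
def integrate(f, lo=-1.0, hi=1.0, n=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

flat = lambda t: 0.5          # flat prior on [-1, 1]
triangular = lambda t: 1 - abs(t)  # triangular prior on [-1, 1]

print(integrate(flat))        # ≈ 1.0
print(integrate(triangular))  # ≈ 1.0
```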
Explain the flat prior, Jeffreys' prior, and the triangular prior
What is an uninformative prior distribution?
A prior that carries little or no information about θ, so that the data dominate the posterior (a flat prior is a common example).
Given a convincing prior distribution, Bayesian methods are more satisfactory than frequentist methods.