L5 part 2 - CT about Bayesian statistical inference Flashcards
What is Bayesian analysis about?
Updating our beliefs in accordance with the rules of probability
Applying the conditional probability formula to combine prior beliefs with new evidence about the world
Picture 1
Bayesian inference is reallocation of credibility across possibilities. (this becomes clear over the next few flashcards)
What are possibilities and credibility?
Possibilities - the range of potential outcomes (or hypotheses)
Credibility - the degree of belief assigned to these possibilities
E.g. (picture 2) in Cluedo there are 3 possible murderers (possibilities) and the credibility is how likely each of these possibilities is to be the murderer (the height of the card)
How does the credibility change with new data being gathered?
In Bayesian analysis, we’re reallocating the credibility (degree of belief) based on new evidence that we obtained (found the murder weapon with Rosa’s fingerprints - so Rosa is now more likely to be the murderer than the others)
picture 3
The allocation of the degree of belief changes over the course of gathering data
How do we get from prior beliefs to posterior beliefs?
We apply Bayes' theorem
Picture 4
What is the formula of Bayes' theorem? And what does each part mean?
D = data; θ = parameter/hypothesis
P(θ|D) = (P(D|θ) * P(θ))/P(D)
P(θ) - Prior beliefs
P(D|θ) - The likelihood
P(D) - Marginal likelihood
P(θ|D) - Posterior beliefs
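To make the formula concrete, here's a minimal Python sketch of one pass through it, using the Cluedo fingerprint example. All likelihood numbers are invented for illustration, and "Third suspect" is a placeholder since the deck only names Rosa and Professor Pimple.

```python
# One numeric pass through Bayes' theorem for the Cluedo example.
# D = "murder weapon found with Rosa's fingerprints".
# Likelihood values are invented; "Third suspect" is a placeholder.
prior = {"Rosa": 1 / 3, "Pimple": 1 / 3, "Third suspect": 1 / 3}   # P(theta)
likelihood = {"Rosa": 0.7, "Pimple": 0.15, "Third suspect": 0.15}  # P(D|theta)

# Marginal likelihood P(D): the likelihood averaged over all hypotheses
p_data = sum(likelihood[h] * prior[h] for h in prior)

# Posterior P(theta|D) = P(D|theta) * P(theta) / P(D)
posterior = {h: likelihood[h] * prior[h] / p_data for h in prior}
print(posterior)  # Rosa's credibility rises to ~0.7, the others drop to ~0.15
```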
Which type of probability do we talk about in Bayesian stats?
From now on, I'm not gonna keep writing "in Bayesian stats" - everything refers to Bayesian stats unless stated otherwise
Subjective probabilities - the degree to which we believe something to be true
What is the prior? How does it compare to frequentist statistics?
P(θ)
Our belief in a hypothesis (or parameter) before considering the data
This is what is missing in frequentist statistics. Because the prior is a subjective probability, we now have the base rate of the hypothesis, and so we can invert the conditional probabilities
What is the prior distribution?
The prior probability is a distribution over possibilities, so the prior distribution represents the plausibility (credibility) of values
- picture 8
What are the two types of prior distribution?
- Informative - narrows down our beliefs
↪ the peak indicates the value that we deem to be the most likely one and the tails represent values with low probability
- Uninformative - we believe all possible outcomes with equal probability (you basically have no idea or don't want to risk betting on one specific value)
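A small Python sketch of what these two prior types could look like over a grid of candidate parameter values (the grid, the peak at 0.7, and its width are arbitrary choices for illustration):

```python
from math import exp

# Grid of candidate parameter values (the grid itself is an arbitrary choice)
thetas = [i / 100 for i in range(101)]

# Uninformative prior: every value is equally credible
uninformative = [1 / len(thetas)] * len(thetas)

# Informative prior: a peak at 0.7, tails get very little credibility
bumps = [exp(-0.5 * ((t - 0.7) / 0.05) ** 2) for t in thetas]
total = sum(bumps)
informative = [b / total for b in bumps]
```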
What is the likelihood?
P(D|θ)
It’s the probability that the observed data could be produced given the hypothesis or model being considered
E.g. let's say we find a bloody bow tie. How likely am I to find a bloody bow tie given the different possibilities of murderers? Professor Pimple wears a bow tie every day and the other two don't. If Professor Pimple was the murderer, it would be more likely for us to find a bloody bow tie than if any of the others were the murderer. So the possibility of Professor Pimple being the murderer gains more credibility
Picture 5
What is the likelihood function?
It gives the distribution of the likelihood over the full range of possible parameter values
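A minimal Python sketch of a likelihood function, assuming a hypothetical dataset of 7 heads in 10 coin flips and a binomial model:

```python
from math import comb

# Hypothetical data: 7 heads in 10 coin flips; theta is the coin's bias
n, k = 10, 7
thetas = [i / 100 for i in range(101)]

# The likelihood function: P(D|theta) evaluated at every candidate theta
likelihood = [comb(n, k) * t**k * (1 - t) ** (n - k) for t in thetas]

# The theta under which the observed data would be most probable
best_theta, _ = max(zip(thetas, likelihood), key=lambda pair: pair[1])
print(best_theta)  # 0.7
```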
What is the posterior probability?
P(θ|D) - probability of our hypothesis given the data
It’s our belief in a hypothesis (or parameter) after we have considered the data
E.g. We started out with our prior belief that all of the possibilities (suspects) are equally likely to be the murderer. We collected data (the bloody bow tie) and looked at how predictive the different models (hypotheses/suspects) are of that data. We use this evidence to update our prior beliefs and get our posterior beliefs (displayed as a posterior distribution of the credibility of each possibility)
picture 7
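A sketch of this updating process in Python: each posterior becomes the prior for the next piece of evidence. The likelihood values are invented, and "Third suspect" is again a placeholder name.

```python
# Sequential updating: the posterior after one piece of evidence
# becomes the prior for the next. Likelihood values are invented.
def update(prior, likelihood):
    p_data = sum(likelihood[h] * prior[h] for h in prior)         # P(D)
    return {h: likelihood[h] * prior[h] / p_data for h in prior}  # P(theta|D)

belief = {"Rosa": 1 / 3, "Pimple": 1 / 3, "Third suspect": 1 / 3}
# Evidence 1: fingerprints on the weapon (points at Rosa)
belief = update(belief, {"Rosa": 0.7, "Pimple": 0.15, "Third suspect": 0.15})
# Evidence 2: the bloody bow tie (points at Pimple)
belief = update(belief, {"Rosa": 0.05, "Pimple": 0.9, "Third suspect": 0.05})
print(belief)  # credibility gets reallocated with each new piece of data
```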
How does our posterior distribution change depending on our prior?
- If the prior is uninformative - the posterior is heavily influenced by the data (the data have a lot of scope to shape our posterior)
- If the prior is strongly informative then the data will influence the posterior less (the data are working against our prior, or will reinforce the prior even more if the data are consistent with it)
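A quick Python sketch of this effect, reusing the hypothetical 7-in-10 coin-flip data from before: the same likelihood is combined with a flat (uninformative) prior and with a strongly informative prior peaked at 0.2. All numbers are arbitrary illustrations.

```python
from math import comb, exp

thetas = [i / 100 for i in range(101)]
# Same hypothetical data as before: 7 heads in 10 coin flips
likelihood = [comb(10, 7) * t**7 * (1 - t) ** 3 for t in thetas]

flat = [1.0] * len(thetas)                                      # uninformative
strong = [exp(-0.5 * ((t - 0.2) / 0.03) ** 2) for t in thetas]  # peaked at 0.2

for prior in (flat, strong):
    post = [lik * pri for lik, pri in zip(likelihood, prior)]
    total = sum(post)
    post = [x / total for x in post]
    print(max(zip(thetas, post), key=lambda pair: pair[1])[0])
# Flat prior: posterior peaks at 0.7 (the data dominate).
# Strong prior: posterior peaks near 0.22, barely pulled away from 0.2.
```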
What relationship do posterior and prior * likelihood have?
P(θ|D) is proportional to P(D|θ) * P(θ)
So my degree of belief that Professor Pimple is the killer, given that we found a bloody bow tie, is proportional to the probability of finding a bloody bow tie given that Pimple is the killer, times my a priori degree of belief that Pimple is the killer
What is the marginal likelihood?
P(D)
It's the probability of the observed data, i.e. evidence, irrespective of what the hypothesis is
Picture 6
- So we take into consideration any hypothesis there could be
- We look at the ratio of how much more predictive of the data the specific hypothesis (θ) is, compared to any other assumption we could make
picture 9
- This gives us the Bayes factor
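A tiny Python sketch of a Bayes factor, with invented likelihood values in the spirit of the bow-tie example:

```python
# Bayes factor: the ratio of how well two hypotheses predict the data.
# Likelihood values are invented for illustration.
p_d_given_pimple = 0.9   # P(D | Pimple is the murderer)
p_d_given_other = 0.05   # P(D | another suspect is the murderer)

bayes_factor = p_d_given_pimple / p_d_given_other
print(bayes_factor)  # ~18: the data are about 18x more likely under Pimple
```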