Chapter 4 Poisson Flashcards
How do we model an experiment where an outcome variable is binary
binomial likelihood
What model can be used for an experiment that produces count data, and what are its properties
Poisson model - sample space is {0, 1, 2, ...} and it is equidispersed (the mean equals the variance)
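As a quick check of equidispersion, this R sketch simulates Poisson counts and compares the sample mean with the sample variance. The rate of 3 and the number of draws are arbitrary illustrative choices, not values from these notes.

```r
# Illustrative values (assumptions, not from the notes)
set.seed(1)
theta <- 3
y <- rpois(100000, theta)

# For a Poisson model the mean and the variance should both be close to theta
y_mean <- mean(y)
y_var <- var(y)
print(c(mean = y_mean, variance = y_var))
```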
How does the rate of a Poisson affect the distribution
It affects its shape - most of the probability mass will be around where the rate (mean) is
What does a sufficient statistic mean
All the information about theta that is available in the data can be contained in a sufficient statistic
What is the sufficient statistic for theta in P(theta | y) if the data are iid Poisson
The sum of all the data, i.e. the sum of the yi's, is the sufficient statistic
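One way to see this in R: two data sets with the same size and the same sum have likelihoods that differ only by a constant factor, so they carry the same information about theta. The two data sets and the grid of theta values below are made up for illustration.

```r
# Two hypothetical data sets with the same n (3) and the same sum (6)
y_a <- c(1, 2, 3)
y_b <- c(0, 0, 6)

# Poisson likelihood of a data set at a given theta
lik <- function(y, theta) prod(dpois(y, theta))

# The ratio of the two likelihoods does not depend on theta
theta_grid <- c(0.5, 1, 2, 4)
ratio <- sapply(theta_grid, function(th) lik(y_a, th) / lik(y_b, th))
print(ratio)
```

Because the ratio is constant in theta, both data sets lead to the same posterior under any prior.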
What is the conjugate prior for the Poisson sampling model
Gamma distribution
The gamma family is conjugate for the poisson sampling model
How do the gamma distribution and the exponential distribution relate
For gamma with a=1 we have the exponential distribution with rate b (so a=1, b=1 gives the exponential distribution with rate 1)
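A short R check of this relationship, assuming the shape/rate parameterisation of the gamma distribution: with shape 1, the gamma density matches the exponential density with the same rate. The rate b = 2 is an arbitrary illustrative choice.

```r
# Compare the gamma(shape = 1, rate = b) density with the exponential(rate = b) density
x <- seq(0.1, 5, by = 0.1)
b <- 2  # illustrative rate (an assumption)
max_diff <- max(abs(dgamma(x, shape = 1, rate = b) - dexp(x, rate = b)))
print(max_diff)
```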
How to examine the posterior probability that theta in the Poisson sampling model is greater than 1.5 in R
1 - pgamma(1.5, a + sum(y), b + n), where sum(y) is the sum of the yi's and n is the sample size
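A runnable version of this calculation might look as follows. The prior parameters and the data vector are hypothetical, chosen only to make the example self-contained.

```r
# Hypothetical prior parameters and data (assumptions for illustration)
a <- 2
b <- 1
y <- c(0, 1, 2, 1, 3, 2, 0, 1, 4, 2)
n <- length(y)

# The posterior is gamma(a + sum(y), b + n); compute P(theta > 1.5 | y)
p_upper <- 1 - pgamma(1.5, a + sum(y), b + n)

# Equivalent form that asks for the upper tail directly
p_upper2 <- pgamma(1.5, a + sum(y), b + n, lower.tail = FALSE)
print(p_upper)
```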
Posterior expectation can be expressed as a convex combination of the prior expectation and the sample average - what interpretation does this give for a and b
b is interpreted as the number of “prior observations”.
a is interpreted as “the sum of counts from b prior observations”.
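The convex combination can be verified numerically. In this sketch the prior parameters and data are hypothetical; the weights b/(b+n) and n/(b+n) come from rewriting the posterior mean (a + sum(y))/(b + n).

```r
# Hypothetical prior parameters and data (assumptions for illustration)
a <- 2
b <- 1
y <- c(0, 1, 2, 1, 3, 2, 0, 1, 4, 2)
n <- length(y)

# Posterior mean of the gamma(a + sum(y), b + n) posterior
post_mean <- (a + sum(y)) / (b + n)

# The same quantity as a weighted average of the prior mean a/b and the sample mean
w_prior <- b / (b + n)
w_data <- n / (b + n)
combo <- w_prior * (a / b) + w_data * mean(y)
print(c(post_mean = post_mean, combo = combo))
```

The weight on the prior mean is b/(b+n), which is why b behaves like a number of prior observations.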
Posterior expectation can be expressed as a convex combination of the prior expectation and the sample average - what does this mean for large n
There is much more influence from the likelihood than from the prior on the posterior mean.
When n is large, the information from the data dominates the prior information. So we can assume that n >> a and n >> b. In this case the posterior expectation tends to Ybar, the sample mean
For very large values of n what does this mean for the variance of the posterior
The variance tends towards Ybar/n
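The large-n behaviour of the posterior mean and variance can be illustrated with simulated data. The prior parameters, the true rate of 2.5, and the sample size are assumptions made for this sketch.

```r
# Simulated data with a large sample size (all values are illustrative assumptions)
set.seed(2)
a <- 2
b <- 1
y <- rpois(5000, 2.5)
n <- length(y)

# Exact posterior mean and variance of the gamma(a + sum(y), b + n) posterior
post_mean <- (a + sum(y)) / (b + n)
post_var <- (a + sum(y)) / (b + n)^2

# Large-n approximations: Ybar and Ybar / n
print(c(post_mean = post_mean, ybar = mean(y)))
print(c(post_var = post_var, approx = mean(y) / n))
```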
For the posterior predictive distribution when n is large what is significant about the expectation and variance
The posterior predictive expectation equals the posterior expectation, and for large n the posterior predictive variance tends to this same value. So for large n the expectation equals the variance, and the predictive distribution is approximately Poisson again.
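Under the gamma prior the posterior predictive is negative binomial, with mean (a + sum(y))/(b + n) and variance equal to that mean times (b + n + 1)/(b + n). The sketch below, with a hypothetical prior rate b = 1, shows the variance-to-mean ratio approaching 1 as n grows.

```r
# Variance / mean of the posterior predictive as a function of n
b <- 1  # hypothetical prior rate parameter
pred_ratio <- function(n) (b + n + 1) / (b + n)

# The ratio shrinks towards 1 (the Poisson value) as the sample size grows
n_values <- c(5, 50, 5000)
ratios <- sapply(n_values, pred_ratio)
print(ratios)
```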
If a question says to what extent do we expect that … what distribution are you talking about
The posterior predictive - events unfolding in the future
If there is a gap between the probability that Theta1>Theta2 and the probability that y_tilde1>y_tilde2 is this surprising?
No, not really! It is important to make a distinction between the events Theta1>Theta2 and y_tilde1>y_tilde2
Strong evidence of a difference between two populations does not mean that the difference itself is large.
What is the strategy to find the Monte Carlo approximation of the posterior predictive
Sample theta from the posterior, then for each sampled theta draw a y_tilde from the likelihood with that theta as its parameter
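A minimal sketch of this two-step sampling in R, assuming a gamma(a, b) prior and hypothetical values for the prior parameters, the sum of the data, and the sample size:

```r
# Hypothetical prior parameters and data summaries (assumptions for illustration)
set.seed(3)
a <- 2
b <- 1
sum_y <- 16
n <- 10
S <- 10000  # number of Monte Carlo draws

# Step 1: sample theta from the gamma posterior
theta_mc <- rgamma(S, a + sum_y, b + n)

# Step 2: for each sampled theta, draw one y_tilde from the Poisson likelihood
y_tilde_mc <- rpois(S, theta_mc)

# Monte Carlo estimates of the posterior predictive mean and variance
print(c(mean = mean(y_tilde_mc), variance = var(y_tilde_mc)))
```

rpois is vectorised over its rate argument, so each y_tilde is drawn with its own theta.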
Using R how to find the probability that theta_1>theta_2 if you have posterior samples of theta_1 and theta_2 in vectors
mean(theta_1>theta_2)
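The same idea extends to comparing two groups, and it shows how the probability for the parameters can differ from the probability for future observations. The prior parameters and the group summaries below are illustrative assumptions, not data from these notes.

```r
# Illustrative prior and group summaries (assumptions, not from the notes)
set.seed(4)
a <- 2
b <- 1
sum_y1 <- 217; n1 <- 111
sum_y2 <- 66;  n2 <- 44
S <- 10000

# Posterior samples of the two rates
theta_1 <- rgamma(S, a + sum_y1, b + n1)
theta_2 <- rgamma(S, a + sum_y2, b + n2)

# Posterior predictive samples, one future observation per group
y_tilde1 <- rpois(S, theta_1)
y_tilde2 <- rpois(S, theta_2)

# Monte Carlo estimates of the two probabilities
p_theta <- mean(theta_1 > theta_2)
p_pred <- mean(y_tilde1 > y_tilde2)
print(c(p_theta = p_theta, p_pred = p_pred))
```

With these inputs the first probability is high while the second is much lower, which is the gap between the two events described above.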
Explain model checking
Important part of the Bayesian workflow - checking how well the model fits the data. This usually involves comparing the predictive distribution with the data.
If the model is telling us something different to the data is this a discrepancy or not the right model
Key question in model checking
If the observed data has this ratio, why should we be predicting otherwise?
What are two possible explanations for a discrepancy between the empirical distribution of the data and the predictive distribution
Firstly - the empirical distribution of the data does not necessarily have to match the distribution of the population from which the data were sampled, e.g. if the sample size is small.
Secondly - the large discrepancy could be due to a genuine feature of the population, so the data simply reflect this feature, and the model may be unable to replicate that peak/trough because of the nature of the distribution
Explain how to assess the discrepancy between the empirical and predictive distributions using Monte Carlo model checking
Let the statistic be a specific ratio t(y)
Find t(y_observed) from the observed data
Now make a replicated sample y_tilde of the same size as the observed data and record t(y_tilde)
This process can be repeated many times to get a sample of t(y_tilde) values
We can then compute mean(t.mc>=t.obs), the proportion of replicated data sets whose t(y_tilde) equalled or exceeded t(y_observed)
If this is very small the model may be flawed, as it predicts that we would hardly ever see a dataset that resembled our observed one
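The steps above can be sketched in R as follows. The observed data, the prior parameters, and the choice of t(y) as the ratio of the number of 2s to the number of 1s are all hypothetical, picked only to make the example concrete.

```r
# Hypothetical observed counts with an excess of 2s (assumption for illustration)
set.seed(5)
a <- 2
b <- 1
y_obs <- rep(0:4, times = c(10, 8, 22, 7, 3))
n <- length(y_obs)

# Test statistic: number of 2s divided by number of 1s (a hypothetical choice)
t_stat <- function(y) sum(y == 2) / sum(y == 1)
t_obs <- t_stat(y_obs)

# Replicate data sets from the posterior predictive and record t for each
S <- 5000
t_mc <- numeric(S)
for (s in 1:S) {
  theta <- rgamma(1, a + sum(y_obs), b + n)  # one posterior draw of theta
  y_rep <- rpois(n, theta)                   # replicated data set of size n
  t_mc[s] <- t_stat(y_rep)
}

# Proportion of replicates at least as extreme as the observed statistic;
# na.rm drops replicates where the ratio is 0/0
p_val <- mean(t_mc >= t_obs, na.rm = TRUE)
print(c(t_obs = t_obs, p_val = p_val))
```

A very small proportion here would indicate that the Poisson model rarely reproduces the excess of 2s seen in the hypothetical data.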
What if model checking suggests the Poisson model is inadequate for the true probability distribution of y_tilde
An alternative count model may perform better, or a simple Poisson model may suffice if we are only interested in certain aspects of the probability distribution of y