Chapter 3 Monte Carlo Approximation Flashcards
Explain the concept of Monte Carlo approximation.
Monte Carlo experimentation is the use of simulated random numbers to estimate functions of a probability distribution. It allows us to approximate quantities of interest that are hard to calculate analytically.
How does the Monte Carlo method approximate the posterior?
We sample S independent theta values from the posterior distribution. The empirical distribution of these S sampled values then approximates the posterior; the larger the sample size, the more accurate the approximation.
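A minimal R sketch of the idea, assuming a hypothetical Beta(a + y, b + n - y) posterior with illustrative values a = 1, b = 1, y = 12, n = 20:
a <- 1; b <- 1; y <- 12; n <- 20           # hypothetical prior parameters and data
S <- 10000                                 # number of Monte Carlo draws
theta_sim <- rbeta(S, a + y, b + n - y)    # S independent draws from the posterior
mean(theta_sim)                            # Monte Carlo estimate of the posterior mean
(a + y) / (a + b + n)                      # exact posterior mean, for comparison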
What should we be cautious of when estimating tail probabilities by Monte Carlo inference?
If events are very rare, a lot of simulated draws may be needed to estimate the probability accurately, as tail probabilities can be very small.
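A hedged sketch of why rare events need many draws, using a hypothetical Beta(13, 9) posterior:
mean(rbeta(100, 13, 9) > 0.9)       # with only 100 draws this small tail probability is often estimated as exactly 0
mean(rbeta(100000, 13, 9) > 0.9)    # 100,000 draws give a much more stable estimate
1 - pbeta(0.9, 13, 9)               # exact tail probability, for comparison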
What theorem/result allows us to treat Monte Carlo approximations of quantities of interest as approximately equal to the actual posterior values?
The law of large numbers
What effect does the sample size have on Monte Carlo inference?
The bigger the sample size, the more accurate the approximation; the estimate gets closer to the true value.
How do we estimate credible intervals using Monte Carlo simulation
By drawing a large sample from the posterior with rbeta() and taking the appropriate sample quantiles with quantile().
What are the parameters of the quantile function?
quantile(sample, c(q1, q2)), where sample is the vector of posterior draws and q1, q2 are the desired quantile levels.
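For example, an approximate 95% equal-tail interval from a hypothetical Beta(13, 9) posterior:
theta_sim <- rbeta(10000, 13, 9)        # draws from the posterior
quantile(theta_sim, c(0.025, 0.975))    # approximate 95% credible interval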
How can the exact solution of a beta credible interval be found?
beta_interval(p, c(a, b), color = crcblue), where p is the probability content of the interval (e.g. 0.95)
or
qbeta(p, a, b), where p is the quantile level, applied to both endpoint probabilities.
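For example, the exact equal-tail 95% interval for a hypothetical Beta(13, 9) posterior:
qbeta(c(0.025, 0.975), 13, 9)    # exact 2.5% and 97.5% posterior quantiles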
What is the interpretation of the posterior odds for a beta posterior?
The odds are o = theta / (1 - theta). For a beta posterior: if o > 1 then theta > 0.5, and if o < 1 then theta < 0.5.
Why are the posterior odds much easier to handle using Monte Carlo inference?
The posterior density of the odds, p(o|y), has a complicated formula, so working with it directly is tricky; with Monte Carlo we simply transform posterior draws of theta into draws of the odds.
How do we find p(o|y)?
Sub theta in terms of the odds into the posterior distribution p(theta | y). Theta = o/1+o and sub this into the posterior and multiply by dtheta/do
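As a worked sketch of this change of variables (writing p_theta for the posterior density of theta):
\theta = \frac{o}{1+o}, \qquad \frac{d\theta}{do} = \frac{1}{(1+o)^2}, \qquad p(o \mid y) = p_\theta\left(\frac{o}{1+o} \mid y\right)\frac{1}{(1+o)^2}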
How do we draw 1000 independent samples from p(o|y), the posterior distribution of the odds?
Draw 1000 theta values from the beta posterior with rbeta(), then transform each draw into the odds, o = theta / (1 - theta).
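A minimal sketch, with a, b, y, n as hypothetical prior parameters and data:
betasamples <- rbeta(1000, a + y, b + n - y)   # 1000 draws of theta from the beta posterior
odds <- betasamples / (1 - betasamples)        # transform each theta into the odds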
If the variable odds contains samples from p(o|y), how do you estimate the posterior probability that the odds are less than 1 using Monte Carlo inference?
mean(odds<1)
If the variable odds contains samples from p(o|y), how do you estimate the posterior mean of the odds?
mean(odds)
If the variable odds contains samples from p(o|y), how do you estimate a 95% posterior credible interval for the odds?
quantile(odds, c(0.025,0.975))
What is the exact posterior predictive distribution for a beta prior, binomial likelihood and beta posterior, and why is Monte Carlo normally used?
Y_tilde | Y = y ~ Beta-Binomial(m, a + y, b + n - y).
This is a complicated distribution, so Monte Carlo approximation is usually easier.
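For reference, one standard form of the Beta-Binomial(m, a + y, b + n - y) probability mass function, written with the Beta function B:
p(\tilde{y} = k \mid y) = \binom{m}{k}\,\frac{B(a + y + k,\; b + n - y + m - k)}{B(a + y,\; b + n - y)}, \qquad k = 0, 1, \ldots, m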
What is the R command used to compute the Beta-Binomial posterior predictive distribution?
pbetap() (from the same ProbBayes/LearnBayes package as discint()); for example prob <- pbetap(c(a + y, b + n - y), m, 0:m) gives the predictive probabilities for 0, 1, ..., m future successes.
What is the R command used to find a set of values of the Beta-Binomial posterior predictive distribution that covers a certain probability mass?
discint(pred_distribution, 0.9)
or
discint(pred_distribution, 0.8)
etc
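A hedged sketch combining the two commands, assuming the ProbBayes/LearnBayes functions pbetap() and discint() and the illustrative values a = 1, b = 1, y = 12, n = 20, m = 20:
library(ProbBayes)                             # provides pbetap() and discint()
a <- 1; b <- 1; y <- 12; n <- 20; m <- 20      # hypothetical prior parameters, data and future sample size
pred <- pbetap(c(a + y, b + n - y), m, 0:m)    # Beta-Binomial predictive probabilities for 0..m successes
pred_distribution <- cbind(0:m, pred)          # first column: values, second column: probabilities
discint(pred_distribution, 0.9)                # set of values covering at least 90% predictive probability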
How do we simulate the posterior predictive distribution in R?
Draw theta samples from the beta posterior, then draw y_tilde values from the binomial likelihood using each sampled theta as the success probability.
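A minimal sketch, with a, b, y, n, m as hypothetical known quantities:
S <- 1000
theta_sim <- rbeta(S, a + y, b + n - y)               # draws from the beta posterior
ytilde_sim <- rbinom(S, size = m, prob = theta_sim)   # one binomial predictive draw per sampled theta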
Explain model checking.
Model checking asks whether the model fits the data well. Key idea: the observed data should be similar, in some sense, to the data predicted from the Bayesian model.
How do we use Bayesian prediction by the Monte Carlo method to carry out model checking, and what do we require?
Sample theta from the Beta(a + y, b + n - y) distribution, then use this theta to sample y_tilde from Binomial(m, theta). We require m = n so that the observed data can be compared with simulated predicted data of the same size.
What conclusion should be drawn from model checking
If the observed data y is very different from the samples of y_tilde drawn from the posterior predictive distribution, we have evidence that the Bayesian model is not a good fit for the data.
Discuss the use of a histogram for model checking.
Plot a histogram of the samples of y_tilde; the observed y value should sit in the middle of the graph. If it does, the observed data is consistent with the simulated replicate data from the predictive distribution.
Discuss the use of tail probabilities for model checking.
If P(y_tilde <= y | y) or P(y_tilde >= y | y) is very small, the observed y lies in the tail of the predictive distribution, which suggests the model does not describe y very well.
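A hedged sketch of estimating these tail probabilities from predictive simulations, with a, b, y, n as hypothetical known quantities:
theta_sim <- rbeta(10000, a + y, b + n - y)              # posterior draws of theta
ytilde_sim <- rbinom(10000, size = n, prob = theta_sim)  # replicate data sets of the same size n
mean(ytilde_sim <= y)                                    # Monte Carlo estimate of P(y_tilde <= y | y)
mean(ytilde_sim >= y)                                    # Monte Carlo estimate of P(y_tilde >= y | y)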
What is the basic requirement for being able to use Monte Carlo approximation?
We need to be able to draw independent samples from the posterior distribution.