Bayesian Statistics Flashcards
Give two reasons why statistics is needed
- To make inferences about the population
- To assess the uncertainty of statements
What is conditional probability?
Chance of one event occurring given that another event has occurred
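A minimal sketch of the definition P(A|B) = P(A and B) / P(B). The die roll and the two events are hypothetical examples chosen for illustration; they are not part of the flashcards.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
outcomes = range(1, 7)
even = {o for o in outcomes if o % 2 == 0}           # event B: the roll is even
greater_than_three = {o for o in outcomes if o > 3}  # event A: the roll is > 3


def prob(event):
    """Probability of an event when all six outcomes are equally likely."""
    return Fraction(len(event), 6)


# P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(greater_than_three & even) / prob(even)
print(p_a_given_b)  # 2/3
```

Of the three even outcomes (2, 4, 6), two are greater than 3, which is where the 2/3 comes from.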
What does P(θ) signify? Give a term for this
The plausibility of a certain value of θ before seeing the data (prior beliefs)
What does P(data|θ)/P(data) signify? Give a term for this
How well did this value of θ predict the data, compared to the average prediction across all values of θ? (predictive updating factor)
What does P(θ|data) signify? Give a term for this
The plausibility of θ being equal to a particular value after seeing the data (posterior beliefs)
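The three quantities above can be sketched numerically on a grid of candidate values for the proportion. This is an illustrative approximation only: the data (3 successes in 10 trials), the 101-point grid, the binomial likelihood, and the function name `grid_posterior` are all assumptions made for the example.

```python
from math import comb


def grid_posterior(k, n, prior, thetas):
    """Posterior = prior * (likelihood / evidence), on a grid of theta values."""
    # Likelihood P(data | theta): binomial probability of k successes in n trials.
    likelihood = [comb(n, k) * t**k * (1 - t) ** (n - k) for t in thetas]
    # Evidence P(data): likelihood averaged over theta, weighted by the prior.
    evidence = sum(l * p for l, p in zip(likelihood, prior))
    # Predictive updating factor P(data | theta) / P(data).
    updating_factor = [l / evidence for l in likelihood]
    posterior = [p * u for p, u in zip(prior, updating_factor)]
    return posterior, updating_factor, evidence


thetas = [i / 100 for i in range(101)]
prior = [1 / len(thetas)] * len(thetas)  # flat prior over the grid

# Hypothetical data: 3 successes in 10 trials.
posterior, factor, evidence = grid_posterior(3, 10, prior, thetas)
best = thetas[posterior.index(max(posterior))]
print(best)  # 0.3
```

Values of the proportion with an updating factor above 1 predicted the data better than average and gain plausibility; values with a factor below 1 lose it.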
Give 3 characteristics of beta distributions (range, shape, when values = 1)
- Ranges from 0 to 1
- Shape is determined by two values, a and b
- If a and b both equal 1, it is uniform
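As a rough check of these characteristics, the beta density can be written with the standard library alone. The helper name `beta_pdf` is hypothetical, and the sketch only covers points strictly between 0 and 1.

```python
from math import gamma


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))  # normalising constant
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


# With a = b = 1 the density is the same everywhere, i.e. uniform.
for x in (0.1, 0.5, 0.9):
    print(x, beta_pdf(x, 1, 1))  # 1.0 at every point
```

Changing a and b reshapes the curve: for instance, `beta_pdf(0.5, 5, 5)` is larger than `beta_pdf(0.1, 5, 5)`, so that density peaks in the middle.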
What does a flat prior distribution reflect? (a=1 b=1)
A prior distribution that reflects the belief that all values of the proportion are equally plausible, a priori. We call this an uninformative prior
What does a symmetric, bell-shaped prior distribution reflect? (a=5 b=5)
A prior distribution that reflects the belief that values close to 0.5 are more plausible (i.e., there are roughly equal numbers of left- and right-handed people), a priori
What does a right-skewed prior distribution reflect? (a=2 b=6)
A prior distribution that reflects the belief that values below 0.5 are more plausible (i.e., there are more right-handed people), a priori
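The three priors above can be compared through their mean and mode, using the standard beta formulas mean = a / (a + b) and mode = (a - 1) / (a + b - 2) for a, b > 1. The helper names below are hypothetical and the sketch is only meant as a quick check of the shapes described.

```python
def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_mode(a, b):
    """Mode of Beta(a, b); only a single interior peak when a > 1 and b > 1."""
    if a > 1 and b > 1:
        return (a - 1) / (a + b - 2)
    return None  # e.g. the flat Beta(1, 1) has no single peak


for a, b in [(1, 1), (5, 5), (2, 6)]:
    print((a, b), beta_mean(a, b), beta_mode(a, b))
```

For a=5, b=5 both the mean and the mode sit at 0.5, while for a=2, b=6 both fall below 0.5, matching the answers on the cards.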
What statistic is used in this situation?
The observed proportion
How do we obtain the x% central credible interval?
We take the central x% of the posterior mass and find the two points that bound it, leaving (100 - x)/2 % of the mass in each tail
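A minimal sketch of that procedure on a discretised posterior. The grid, the example posterior (flat prior, 3 successes in 10 trials), and the function name `central_interval` are assumptions made for illustration; the result is only as precise as the grid spacing.

```python
def central_interval(thetas, posterior, x=95):
    """Bounds of the central x% of posterior mass on a grid (0 < x < 100)."""
    tail = (1 - x / 100) / 2  # mass left out in each tail
    cumulative = 0.0
    lower = upper = None
    for theta, mass in zip(thetas, posterior):
        cumulative += mass
        if lower is None and cumulative >= tail:
            lower = theta  # first point where the lower tail is used up
        if cumulative >= 1 - tail:
            upper = theta  # first point where only the upper tail remains
            break
    return lower, upper


# Hypothetical posterior: flat prior, 3 successes in 10 trials.
thetas = [i / 1000 for i in range(1001)]
unnormalised = [t**3 * (1 - t) ** 7 for t in thetas]
total = sum(unnormalised)
posterior = [u / total for u in unnormalised]

lower, upper = central_interval(thetas, posterior, x=95)
print(lower, upper)
```

The two printed values are the thresholds: 2.5% of the posterior mass lies below the first and 2.5% above the second.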
What three flaws of frequentist statistics does Bayesian statistics address?
- A p-value computed for a fixed-size sample statistic depends on the stopping intention and the sample size. If two people work on the same data with different stopping intentions, they may get two different p-values for the same data, which is undesirable.
- A confidence interval (C.I.), like a p-value, depends heavily on the sample size and the stopping intention. This is unreasonable, because the results should be consistent no matter how many people perform the tests on the same data.
- Confidence intervals (C.I.) are not probability distributions, so they do not tell us which values of a parameter are most probable.
What are parameters?
Factors in the model that affect the observed data
What is meant by the term models?
Models are the mathematical formulation of the observed events
What is signified by P(data)?
P(D) is the evidence. This is the probability of data as determined by summing (or integrating) across all possible values of θ, weighted by how strongly we believe in those particular values of θ
What two components contribute to the posterior belief P(θ|D)?
The prior belief P(0) and the likelihood function P(D|θ)
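Written out as an equation, the cards above combine into Bayes' rule. The integral form of the evidence assumes a continuous θ; for a discrete θ it becomes a sum.

```latex
P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)},
\qquad
P(D) = \int P(D \mid \theta)\, P(\theta)\, d\theta
```

Since P(D) does not depend on θ, the posterior is proportional to the prior times the likelihood.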
What mathematical function is used to represent the prior beliefs?
Beta distribution
What is denoted by P(H1)/P(H0)?
Prior beliefs about the hypotheses (the prior odds of H1 relative to H0)