F7 Intro to Bayesian data analysis Flashcards

1
Q

What are the three steps in a Bayesian approach to forecasting?

A

1) Use fundamentals to predict the popular vote

2) Add in new data from polling

3) Update the prior to a posterior prediction using Markov chain Monte Carlo

2
Q

What is Markov Chain Monte Carlo?

A

MCMC.

A sampling method that explores thousands of different values for each parameter in our model and evaluates both how well they explain the patterns in the data and how plausible they are given the expectations from our prior.
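To make the idea concrete, here is a minimal sketch of one MCMC variant, a Metropolis sampler, for a single parameter (a candidate's vote share). The poll numbers, the normal prior, the poll noise level, and all function names are assumptions made up for illustration; they are not taken from the forecasting model the cards describe.

```python
import math
import random

def log_prior(mu, prior_mean=50.0, prior_sd=5.0):
    # Log density (up to a constant) of an assumed normal prior on the vote share.
    return -0.5 * ((mu - prior_mean) / prior_sd) ** 2

def log_likelihood(mu, polls, poll_sd=3.0):
    # Log density (up to a constant) of the polls given a candidate value mu.
    return sum(-0.5 * ((p - mu) / poll_sd) ** 2 for p in polls)

def metropolis(polls, n_steps=20000, step_sd=1.0, seed=0):
    rng = random.Random(seed)
    mu = 50.0  # starting value
    current = log_prior(mu) + log_likelihood(mu, polls)
    samples = []
    for _ in range(n_steps):
        # Propose a nearby parameter value.
        proposal = mu + rng.gauss(0.0, step_sd)
        candidate = log_prior(proposal) + log_likelihood(proposal, polls)
        # Accept with probability min(1, posterior ratio); otherwise stay put.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples[n_steps // 4:]  # drop the first quarter as burn-in

polls = [52.0, 51.0, 53.5, 50.5]  # made-up poll results (percent)
samples = metropolis(polls)
posterior_mean = sum(samples) / len(samples)
print(round(posterior_mean, 2))
```

Each proposed value is scored by both the likelihood (fit to the polls) and the prior (plausibility), which is the two-part evaluation described in the answer above.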

3
Q

What are two considerations for the prior?

A

1) Avoid overfitting: parsimonious models are rewarded.

2) Leave-one-out cross-validation: training models on data from some elections and testing their performance on the others.
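A minimal sketch of leave-one-out cross-validation, assuming a simple one-predictor linear model. The `growth` and `vote_share` figures are invented, and the helper names are hypothetical; the point is only the hold-one-out loop.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = intercept + slope * x.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_errors(xs, ys):
    # Hold out each election in turn, fit on the rest, predict the held-out one.
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(ys[i] - (intercept + slope * xs[i]))
    return errors

growth = [1.0, 2.5, -0.5, 3.0, 0.5, 2.0]           # made-up fundamentals index
vote_share = [49.0, 52.5, 46.5, 53.0, 48.5, 51.0]  # made-up incumbent vote share
errors = loo_errors(growth, vote_share)
rmse = (sum(e ** 2 for e in errors) / len(errors)) ** 0.5
print(round(rmse, 2))
```

A model that overfits tends to show a much larger out-of-sample error here than its in-sample fit would suggest.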

4
Q

What can be said about the 10,000 simulations from the model?

A

Hypothetical paths the election could take (Trump, Harris or tied).

The more likely a scenario, the more often it will appear.

Some of them involve large nationwide, regional, or demographic polling errors benefiting one party or another.

5
Q

How is the probability of Harris winning calculated?

A

Simply the fraction of simulations in which Harris wins.
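As a rough illustration, the calculation is a count divided by the number of simulations. The toy simulator below draws a national margin from a normal distribution; that simulator, its parameters, and the function names are assumptions for the sketch, not the real model.

```python
import random

def win_probability(outcomes, candidate):
    # Share of simulated elections won by the given candidate.
    return sum(1 for o in outcomes if o == candidate) / len(outcomes)

def simulate_winners(n_sims=10000, mean_margin=1.0, sd_margin=4.0, seed=1):
    # Toy stand-in for the model: draw a Harris-minus-Trump margin per simulation.
    rng = random.Random(seed)
    winners = []
    for _ in range(n_sims):
        margin = rng.gauss(mean_margin, sd_margin)
        if margin > 0:
            winners.append("Harris")
        elif margin < 0:
            winners.append("Trump")
        else:
            winners.append("Tied")
    return winners

winners = simulate_winners()
for candidate in ("Harris", "Trump", "Tied"):
    print(candidate, win_probability(winners, candidate))
```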

6
Q

What is the Bayesian logic (short)? Draw it

A

We update our prior belief (the fundamentals) with new data (polls, distributed according to a covariance matrix). This gives a posterior distribution of outcomes, which we sample from to calculate the probability of Harris winning.

7
Q

What are two fundamental components of Bayesian data analysis?

A

We reallocate credibility across all possible outcomes (Harris, Trump or tied).

The possible outcomes over which we allocate credibility are parameter values in meaningful mathematical models

8
Q

What does ‘Data is noisy’ mean?

A

Data have a probabilistic rather than deterministic relation to their underlying cause.

9
Q

What are five key steps in a Bayesian analysis?

A
  1. Identify data
  2. Define descriptive model (mathematical form and its parameters)
  3. Specify a prior distribution on the parameters
  4. Use Bayesian inference to reallocate credibility across parameter values
  5. Evaluate
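The five steps can be walked through with a small grid approximation. Everything specific here is an assumption chosen for illustration: the poll counts, the binomial model, the Beta(20, 20)-shaped prior, and the very simple evaluation at the end.

```python
# 1. Identify data: a made-up poll with 56 of 100 respondents supporting a candidate.
n_respondents, n_support = 100, 56

# 2. Descriptive model: binomial likelihood with one parameter, theta (true support).
def likelihood(theta):
    return theta ** n_support * (1 - theta) ** (n_respondents - n_support)

# 3. Prior on theta: unnormalised Beta(20, 20) shape, centred on 0.5.
def prior(theta):
    return theta ** 19 * (1 - theta) ** 19

# 4. Bayesian inference: reallocate credibility across a grid of theta values.
grid = [i / 200 for i in range(1, 200)]
unnormalised = [likelihood(t) * prior(t) for t in grid]
total = sum(unnormalised)
posterior = [u / total for u in unnormalised]

# 5. Evaluate: compare the posterior summary with the observed proportion.
posterior_mean = sum(t * p for t, p in zip(grid, posterior))
prob_above_half = sum(p for t, p in zip(grid, posterior) if t > 0.5)
print(round(posterior_mean, 3), round(prob_above_half, 3))
print("observed proportion:", n_support / n_respondents)
```

The posterior mean lands between the prior centre (0.5) and the observed proportion (0.56), which is the reallocation of credibility that step 4 refers to.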
10
Q

What are parameter values?

A

Control knobs on mathematical formulas that determine the shape of the distribution, e.g. location and scale.
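A short sketch of the "control knobs" idea, using the normal distribution as an assumed example: `mu` sets the location and `sigma` sets the scale. The specific values are arbitrary.

```python
from statistics import NormalDist

# Same location, different scale: the wide curve is flatter at its centre.
narrow = NormalDist(mu=50.0, sigma=2.0)
wide = NormalDist(mu=50.0, sigma=5.0)
# Same scale, different location: the whole curve moves to the right.
shifted = NormalDist(mu=53.0, sigma=2.0)

for name, dist in [("narrow", narrow), ("wide", wide), ("shifted", shifted)]:
    # Density at 50 and probability of a value below 50.
    print(name, round(dist.pdf(50.0), 4), round(dist.cdf(50.0), 4))
```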

11
Q

What is the sample space for the election?

A

Three outcomes that are mutually exclusive:
Harris
Trump
Tied

12
Q

What is the difference between frequentist and Bayesian statistics?

A

Frequentist: Empirical and objective. With greater sample size, the result converges to the ‘true’ underlying value.

Bayesian: There is only one election and it can’t be repeated. Bayesian statistics is more about subjective beliefs about the likelihood of an event occurring.

13
Q

What is Bayes’ rule?

A

Posterior = likelihood * prior / evidence.

Probability of Harris winning = number of simulations where Harris wins (given the likelihood of the polls and the prior) / total number of simulations

P(A|B) = P(B|A) * P(A) / P(B)
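A small worked example of the rule with two hypotheses. Here A is "Harris wins" and B is "a poll shows Harris ahead"; all three input probabilities are made-up numbers for illustration, and the function name is hypothetical.

```python
def bayes_posterior(prior, likelihood, likelihood_if_not):
    # Evidence P(B) via the law of total probability over A and not-A.
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    # Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B).
    return likelihood * prior / evidence

p_a = 0.5              # assumed prior P(A)
p_b_given_a = 0.8      # assumed P(B|A)
p_b_given_not_a = 0.4  # assumed P(B|not A)

posterior = bayes_posterior(p_a, p_b_given_a, p_b_given_not_a)
print(round(posterior, 3))
```

With these inputs the evidence is 0.8 * 0.5 + 0.4 * 0.5 = 0.6, so the posterior is 0.4 / 0.6, or two thirds.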

14
Q

What are two key takeaways from the relation between prior uncertainty and number of polls?

A

1) The more certain the prior, the less impact the polls have

2) The more polls, the less weight the prior carries
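Both takeaways show up in a normal-normal update with known poll noise, sketched below. This conjugate setup, the numbers, and the function name are assumptions for illustration rather than the model behind the cards.

```python
def posterior_mean_and_prior_weight(prior_mean, prior_sd, poll_mean, poll_sd, n_polls):
    # Precision is 1 / variance; more certainty means more precision.
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n_polls / poll_sd ** 2
    # The posterior mean is a precision-weighted average of prior and polls.
    prior_weight = prior_precision / (prior_precision + data_precision)
    post_mean = prior_weight * prior_mean + (1 - prior_weight) * poll_mean
    return post_mean, prior_weight

# 1) A more certain prior (smaller sd) keeps more weight against the same polls.
print(posterior_mean_and_prior_weight(50.0, 1.0, 53.0, 3.0, 5))
print(posterior_mean_and_prior_weight(50.0, 5.0, 53.0, 3.0, 5))

# 2) More polls shrink the weight of the same prior.
print(posterior_mean_and_prior_weight(50.0, 2.0, 53.0, 3.0, 5))
print(posterior_mean_and_prior_weight(50.0, 2.0, 53.0, 3.0, 50))
```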
