Week 8 Flashcards

1
Q

What is the goal of the Gibbs sampler algorithm?

A

To obtain a sample from a multivariate posterior distribution π(θ|x) where θ = (θ₁, …, θd).

2
Q

Describe the initialization step (step 1) of the Gibbs sampler.

A

Initialize the parameter vector with starting values: θ⁽⁰⁾ = (θ₁⁽⁰⁾, …, θd⁽⁰⁾).

3
Q

Describe the iterative sampling step (step 2) in the Gibbs sampler for θ = (θ₁, …, θd).

A

For iteration i = 1 to m: Simulate θ₁⁽ⁱ⁾ from the conditional π(θ₁ | θ₂⁽ⁱ⁻¹⁾, …, θd⁽ⁱ⁻¹⁾, x). Then simulate θ₂⁽ⁱ⁾ from π(θ₂ | θ₁⁽ⁱ⁾, θ₃⁽ⁱ⁻¹⁾, …, θd⁽ⁱ⁻¹⁾, x). Continue this process, simulating θj⁽ⁱ⁾ from π(θj | θ₁⁽ⁱ⁾, …, θ_{j-1}⁽ⁱ⁾, θ_{j+1}⁽ⁱ⁻¹⁾, …, θd⁽ⁱ⁻¹⁾, x), up to θd⁽ⁱ⁾ from π(θd | θ₁⁽ⁱ⁾, …, θ_{d-1}⁽ⁱ⁾, x).
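
A minimal Python sketch of this sweep structure, assuming a user-supplied helper sample_conditional(j, theta, rng) that draws θj from its full conditional; the helper name and interface are illustrative, not part of the card.

```python
import numpy as np

def gibbs(sample_conditional, theta0, m, seed=0):
    """Run m Gibbs sweeps starting from theta0 = theta^(0).

    sample_conditional(j, theta, rng) is a hypothetical helper that draws
    theta_j from pi(theta_j | theta_{-j}, x) given the current vector theta.
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    draws = np.empty((m, theta.size))
    for i in range(m):               # iteration i = 1, ..., m
        for j in range(theta.size):  # update each component in turn
            # components before j already hold iteration-i values,
            # components after j still hold iteration-(i-1) values
            theta[j] = sample_conditional(j, theta, rng)
        draws[i] = theta             # one completed sweep / scan
    return draws
```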

4
Q

What constitutes one ‘sweep’ or ‘scan’ of the Gibbs sampler?

A

Completing the simulation of all components θ₁⁽ⁱ⁾ through θd⁽ⁱ⁾ for a single iteration i.

5
Q

What is the final output of the Gibbs sampler algorithm after m iterations?

A

A collection of samples (θ⁽¹⁾, θ⁽²⁾, …, θ⁽ᵐ⁾), which represents draws from the target posterior distribution π(θ|x) after a suitable burn-in period.
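
A minimal illustration of keeping only post-burn-in draws; the array, the burn-in length, and the stand-in values below are all assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
draws = rng.normal(size=(5000, 2))    # stand-in for the Gibbs output (theta^(1), ..., theta^(m))
burn_in = 1000                        # assumed burn-in length; not prescribed by the card
posterior_sample = draws[burn_in:]    # retained draws treated as coming from pi(theta | x)
```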

6
Q

When is the Gibbs sampler particularly useful?

A

When the full conditional distributions π(θj | θ_{-j}, x) are easy to sample from, often due to conditional conjugacy.
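
As a concrete illustration of conditional conjugacy, here is a hedged Python sketch for y₁, …, yn ~ N(μ, 1/τ) with priors μ ~ N(μ₀, 1/τ₀) and τ ~ Gamma(a, b); both full conditionals are then standard distributions. The prior settings, starting values, and function name are assumptions made for the example.

```python
import numpy as np

def normal_gibbs(y, m=5000, mu0=0.0, tau0=1e-2, a=1.0, b=1.0, seed=0):
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), y.mean()
    mu, tau = ybar, 1.0                       # crude starting values theta^(0)
    draws = np.empty((m, 2))
    for i in range(m):
        # mu | tau, y is Normal (conjugate update)
        prec = tau0 + n * tau
        mean = (tau0 * mu0 + tau * n * ybar) / prec
        mu = rng.normal(mean, 1.0 / np.sqrt(prec))
        # tau | mu, y is Gamma (conjugate update); numpy's gamma uses scale = 1/rate
        rate = b + 0.5 * np.sum((y - mu) ** 2)
        tau = rng.gamma(a + n / 2, 1.0 / rate)
        draws[i] = mu, tau
    return draws

# usage sketch: draws = normal_gibbs(np.random.default_rng(1).normal(2.0, 1.5, 100))
```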

7
Q

What are the three stages typically defined in a Bayesian hierarchical model?

A

Stage I: Data model/likelihood (x_i | θ_i ~ f(x_i | θ_i)). Stage II: Prior for parameters (θ_i | φ ~ π₀(θ_i | φ)). Stage III: Hyperprior for hyperparameters (φ ~ π₀(φ)).
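
A brief simulation through the three stages, assuming a Normal-Normal hierarchy (Stage I: N(θ_i, 1); Stage II: N(φ, 1); Stage III: φ ~ N(0, 10²)); these distributional choices are illustrative assumptions, not taken from the card.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
phi = rng.normal(0.0, 10.0)             # Stage III: hyperparameter phi ~ pi0(phi)
theta = rng.normal(phi, 1.0, size=n)    # Stage II: theta_i | phi ~ pi0(theta_i | phi)
x = rng.normal(theta, 1.0)              # Stage I: x_i | theta_i ~ f(x_i | theta_i)
```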

8
Q

In a hierarchical model, what does it mean for the parameters θ = (θ₁, …, θn) to be generated exchangeably?

A

They are assumed to be drawn from a common population distribution governed by a hyperparameter φ.

9
Q

Write the proportionality relationship for the joint posterior distribution π(φ, θ | x) in a standard hierarchical model.

A

π(φ, θ | x) ∝ [Π_{i=1}^n f(x_i | θ_i) × π₀(θ_i | φ)] × π₀(φ).
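
To make the proportionality concrete, here is a sketch of the unnormalized log posterior for an assumed Normal-Normal hierarchy (f = N(θ_i, 1), π₀(θ_i | φ) = N(φ, 1), π₀(φ) = N(0, 10²)); the specific densities are illustrative, not from the card.

```python
from scipy.stats import norm

def log_joint(phi, theta, x):
    stage1 = norm.logpdf(x, loc=theta, scale=1.0).sum()    # sum_i log f(x_i | theta_i)
    stage2 = norm.logpdf(theta, loc=phi, scale=1.0).sum()  # sum_i log pi0(theta_i | phi)
    stage3 = norm.logpdf(phi, loc=0.0, scale=10.0)         # log pi0(phi)
    return stage1 + stage2 + stage3                        # = log pi(phi, theta | x) + const
```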

10
Q

What is the main idea behind using auxiliary variables in MCMC (like Gibbs sampling)?

A

To introduce additional variables U such that the joint distribution π(θ, u | x) has the target marginal π(θ|x), but the full conditionals π(θ | u, x) and π(u | θ, x) are easier to sample from.

11
Q

What are two desired properties when introducing auxiliary variables U?

A
1. The full conditionals π(θ | u, x) and π(u | θ, x) are straightforward to sample from.
2. The introduction of U breaks complex dependence structures among the original variables θ.

12
Q

In the context of a K-component finite mixture model fy(y|θ) = Σ_{k=1}^K ωk fk(y|θk), what auxiliary variables U₁, …, Un are introduced?

A

Discrete labels U_i indicating which component density (f₁, …, fK) generated the i-th data point y_i.

13
Q

What is the distribution of the auxiliary variable U_i in the mixture model example?

A

U_i follows a categorical distribution (equivalently, Multinomial(1, ω) with ω = (ω₁, …, ωK)), with P(U_i = k) = ωk for k = 1, …, K.

14
Q

In the mixture model Gibbs sampler with auxiliary variables u, the conditional posterior π(θ|u, y) factorizes into which two independent parts?

A

It factorizes into π(θ₁, …, θK | u, y) and π(ω₁, …, ωK | u, y).

15
Q

If the prior on θk factorizes as Π π₀(θk), how is the conditional posterior π(θk | u, y) updated for component k?

A

It depends only on the data points y_i for which u_i = k. Specifically, π(θk | u, y) ∝ [Π_{i: u_i=k} fk(yi | θk)] π₀(θk).
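
A hedged sketch of this component-wise update, assuming Gaussian components fk(y | θk) = N(y | μk, σ²) with known σ and a conjugate N(m₀, s₀²) prior on each μk; the function name and these modelling choices are assumptions made for the example.

```python
import numpy as np

def sample_means(y, u, K, sigma, m0, s0, rng):
    """Draw each mu_k from pi(mu_k | u, y) using only the y_i with u_i = k."""
    mu_new = np.empty(K)
    for k in range(K):
        yk = y[u == k]                                # data currently assigned to component k
        prec = 1.0 / s0**2 + len(yk) / sigma**2       # posterior precision
        mean = (m0 / s0**2 + yk.sum() / sigma**2) / prec
        mu_new[k] = rng.normal(mean, 1.0 / np.sqrt(prec))
    return mu_new
```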

16
Q

If the prior for the mixture weights ω = (ω₁, …, ωK) is Dirichlet(α₁, …, αK), what is the conditional posterior distribution π(ω | u, y)?

A

It is also a Dirichlet distribution: Dirichlet(n₁ + α₁, …, nK + αK), where nk = Σ_{i=1}^n 1{u_i = k} is the count of data points assigned to component k.
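
A short sketch of this conjugate weight update, assuming the labels u are coded 0, …, K−1 in a NumPy array; the function name is illustrative.

```python
import numpy as np

def sample_weights(u, alpha, rng):
    """Draw omega from pi(omega | u, y) = Dirichlet(n_1 + alpha_1, ..., n_K + alpha_K)."""
    counts = np.bincount(u, minlength=len(alpha))   # n_k = number of i with u_i = k
    return rng.dirichlet(counts + np.asarray(alpha))
```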

17
Q

What is the form of the conditional distribution π(u | θ, y) for the auxiliary variables in the mixture model?

A

It factorizes as π(u | θ, y) = Π_{i=1}^n π(u_i | θ, y_i).

18
Q

How is the probability P(U_i = k | θ, y_i) calculated for the discrete auxiliary variable U_i in the mixture model?

A

Using Bayes’ theorem: P(U_i = k | θ, y_i) = [ωk fk(y_i | θk)] / [Σ_{j=1}^K ωj fj(y_i | θj)], which defines a discrete probability distribution over {1, …, K}.
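
A hedged sketch of drawing the labels from these probabilities, assuming Gaussian component densities fk(y | θk) = N(y | μk, σk²) as a concrete choice and NumPy arrays for the inputs; any component density would work, and the function name is illustrative.

```python
import numpy as np
from scipy.stats import norm

def sample_labels(y, w, mu, sigma, rng):
    """Draw u_i from P(U_i = k | theta, y_i) for each observation y_i."""
    y = np.asarray(y, dtype=float)
    # unnormalized probabilities: w_k * f_k(y_i | theta_k), here Gaussian densities
    p = w * norm.pdf(y[:, None], loc=mu[None, :], scale=sigma[None, :])
    p /= p.sum(axis=1, keepdims=True)               # normalize over k = 1, ..., K
    return np.array([rng.choice(len(w), p=row) for row in p])
```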