Week 8 Flashcards
What is the goal of the Gibbs sampler algorithm?
To obtain a sample from the multivariate posterior distribution π(θ|x), where θ = (θ₁, ..., θd).
Describe the initialization step (step 1) of the Gibbs sampler.
Initialize the parameter vector with starting values: θ⁽⁰⁾ = (θ₁⁽⁰⁾, ..., θd⁽⁰⁾).
Describe the iterative sampling step (step 2) of the Gibbs sampler for θ = (θ₁, ..., θd).
For iteration i = 1 to m:
- Simulate θ₁⁽ⁱ⁾ from the conditional π(θ₁ | θ₂⁽ⁱ⁻¹⁾, ..., θd⁽ⁱ⁻¹⁾, x).
- Simulate θ₂⁽ⁱ⁾ from π(θ₂ | θ₁⁽ⁱ⁾, θ₃⁽ⁱ⁻¹⁾, ..., θd⁽ⁱ⁻¹⁾, x).
- Continue in this way, simulating θj⁽ⁱ⁾ from π(θj | θ₁⁽ⁱ⁾, ..., θj₋₁⁽ⁱ⁾, θj₊₁⁽ⁱ⁻¹⁾, ..., θd⁽ⁱ⁻¹⁾, x).
- Finish with θd⁽ⁱ⁾ from π(θd | θ₁⁽ⁱ⁾, ..., θd₋₁⁽ⁱ⁾, x).
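A minimal runnable sketch of steps 1 and 2, assuming (purely for illustration; not from the cards) that the target is a standard bivariate Normal with correlation ρ, whose full conditionals are θ₁|θ₂ ~ N(ρθ₂, 1−ρ²) and θ₂|θ₁ ~ N(ρθ₁, 1−ρ²):

```python
import numpy as np

def gibbs_bivariate_normal(m=5000, rho=0.8, seed=0):
    """Gibbs sampler for a standard bivariate Normal with correlation rho."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)            # step 1: starting value theta^(0)
    draws = np.empty((m, 2))
    sd = np.sqrt(1.0 - rho**2)     # conditional standard deviation
    for i in range(m):             # step 2: one sweep per iteration
        # theta_1^(i) ~ pi(theta_1 | theta_2^(i-1))
        theta[0] = rng.normal(rho * theta[1], sd)
        # theta_2^(i) ~ pi(theta_2 | theta_1^(i))
        theta[1] = rng.normal(rho * theta[0], sd)
        draws[i] = theta
    return draws                   # discard an initial burn-in before use
```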
What constitutes one ‘sweep’ or ‘scan’ of the Gibbs sampler?
Completing the simulation of all components θ₁⁽ⁱ⁾ through θd⁽ⁱ⁾ for a single iteration i.
What is the final output of the Gibbs sampler algorithm after m iterations?
A collection of samples (θ⁽¹⁾, θ⁽²⁾, ..., θ⁽ᵐ⁾), which represents draws from the target posterior distribution π(θ|x) after a suitable burn-in period.
When is the Gibbs sampler particularly useful?
When the full conditional distributions π(θj | θ₋j, x) are easy to sample from (e.g., standard distributions), often due to conditional conjugacy.
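A concrete sketch of conditional conjugacy, under assumed semi-conjugate priors (m0, s0, a0, b0 are illustrative choices, not from the cards): for Normal data with unknown mean μ and precision τ, priors μ ~ N(m0, s0²) and τ ~ Gamma(a0, b0) make both full conditionals standard distributions.

```python
import numpy as np

def gibbs_normal(x, m=2000, m0=0.0, s0=10.0, a0=1.0, b0=1.0, seed=0):
    """Gibbs sampler for N(mu, 1/tau) data under semi-conjugate priors:
    mu | tau, x is Normal and tau | mu, x is Gamma."""
    rng = np.random.default_rng(seed)
    n, xbar = len(x), np.mean(x)
    mu, tau = xbar, 1.0                        # starting values
    out = np.empty((m, 2))
    for i in range(m):
        # mu | tau, x: precision-weighted combination of prior and data
        prec = 1.0 / s0**2 + n * tau
        mean = (m0 / s0**2 + tau * n * xbar) / prec
        mu = rng.normal(mean, np.sqrt(1.0 / prec))
        # tau | mu, x ~ Gamma(a0 + n/2, rate b0 + 0.5*sum((x-mu)^2));
        # numpy parameterizes by scale = 1/rate
        tau = rng.gamma(a0 + n / 2, 1.0 / (b0 + 0.5 * np.sum((x - mu)**2)))
        out[i] = mu, tau
    return out
```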
What are the three stages typically defined in a Bayesian hierarchical model?
- Stage I: Data model/likelihood: x_i | θ_i ~ f(x_i | θ_i).
- Stage II: Prior for parameters: θ_i | φ ~ π₀(θ_i | φ).
- Stage III: Hyperprior for hyperparameters: φ ~ π₀(φ).
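A generative sketch of the three stages, with Normal distributions assumed at every level purely for illustration (the cards leave f and π₀ generic):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

# Stage III: hyperprior, here phi ~ N(0, 5^2)   (illustrative assumption)
phi = rng.normal(0.0, 5.0)

# Stage II: theta_i | phi ~ pi_0(. | phi), drawn exchangeably; here N(phi, 1)
theta = rng.normal(phi, 1.0, size=n)

# Stage I: data model, x_i | theta_i ~ f(. | theta_i); here N(theta_i, 1)
x = rng.normal(theta, 1.0)
```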
In a hierarchical model, what does it mean for the parameters θ = (θ₁, ..., θn) to be generated exchangeably?
They are assumed to be drawn from a common population distribution governed by a hyperparameter φ.
Write the proportionality relationship for the joint posterior distribution π(φ, θ | x) in a standard hierarchical model.
π(φ, θ | x) ∝ [Πᵢ₌₁ⁿ f(x_i | θ_i) × π₀(θ_i | φ)] × π₀(φ).
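This proportionality translates directly into an unnormalized log posterior; a sketch, reusing the illustrative all-Normal choices from the generative sketch above:

```python
import numpy as np
from scipy.stats import norm

def log_joint_posterior(phi, theta, x):
    """Unnormalized log pi(phi, theta | x) = sum_i [log f(x_i | theta_i)
    + log pi0(theta_i | phi)] + log pi0(phi). The all-Normal choices are
    illustrative assumptions, not specified by the cards."""
    lp = np.sum(norm.logpdf(x, theta, 1.0))        # Stage I: likelihood terms
    lp += np.sum(norm.logpdf(theta, phi, 1.0))     # Stage II: prior terms
    lp += norm.logpdf(phi, 0.0, 5.0)               # Stage III: hyperprior term
    return lp
```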
What is the main idea behind using auxiliary variables in MCMC (like Gibbs sampling)?
To introduce additional variables U such that the joint distribution π(θ, u | x) has the target marginal π(θ|x), but the full conditionals π(θ | u, x) and π(u | θ, x) are easier to sample from.
What are two desired properties when introducing auxiliary variables U?
1. The full conditionals π(θ | u, x) and π(u | θ, x) are straightforward to sample.
2. The introduction of U breaks complex dependence structures among the original variables θ.
In the context of a K-component finite mixture model f_y(y|θ) = Σₖ₌₁ᴷ ωk fk(y|θk), what auxiliary variables U₁, ..., Un are introduced?
Discrete labels U_i indicating which component density (f₁, ..., fK) generated the i-th data point y_i.
What is the distribution of the auxiliary variable U_i in the mixture model example?
U_i follows a Categorical (or Multinomial(1, ω₁, ..., ωK)) distribution with P(U_i = k) = ωk for k = 1, ..., K.
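Drawing such labels is a one-liner in NumPy (the weights below are assumed values for a K = 3 example; note the 0-based labels k = 0, ..., K−1):

```python
import numpy as np

rng = np.random.default_rng(2)
omega = np.array([0.5, 0.3, 0.2])   # assumed mixture weights, K = 3
n = 10

# U_i ~ Categorical(omega), i.e. P(U_i = k) = omega_k
u = rng.choice(len(omega), size=n, p=omega)
```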
In the mixture model Gibbs sampler with auxiliary variables u, the conditional posterior π(θ|u, y) factorizes into which two independent parts?
It factorizes into π(θ₁, ..., θK | u, y) and π(ω₁, ..., ωK | u, y).
If the prior on θk factorizes as Π π₀(θk), how is the conditional posterior π(θk | u, y) updated for component k?
It depends only on the data points y_i for which u_i = k. Specifically, π(θk | u, y) ∝ [Π_{i: u_i=k} fk(y_i | θk)] π₀(θk).
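A sketch of this update for one component, assuming N(θk, 1) component densities and a N(0, s0²) prior on θk (both assumed; the cards keep fk and π₀ generic):

```python
import numpy as np

def sample_theta_k(y, u, k, s0=10.0, rng=None):
    """Draw theta_k from pi(theta_k | u, y): only the y_i with u_i == k
    enter the update. With N(theta_k, 1) components and a N(0, s0^2)
    prior, the conditional posterior is again Normal."""
    rng = rng or np.random.default_rng()
    yk = y[u == k]                    # data currently allocated to component k
    prec = 1.0 / s0**2 + len(yk)      # posterior precision
    mean = yk.sum() / prec            # posterior mean (prior mean is 0)
    return rng.normal(mean, np.sqrt(1.0 / prec))
```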
If the prior for the mixture weights ω = (ω₁, ..., ωK) is Dirichlet(α₁, ..., αK), what is the conditional posterior distribution π(ω | u, y)?
It is also a Dirichlet distribution: Dirichlet(n₁ + α₁, ..., nK + αK), where nk = Σᵢ₌₁ⁿ 1{u_i = k} is the count of data points assigned to component k.
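This conjugate update amounts to counting the current allocations and drawing from the resulting Dirichlet; a minimal sketch (0-based labels assumed):

```python
import numpy as np

def sample_weights(u, K, alpha, rng=None):
    """Draw omega from Dirichlet(n_1 + alpha_1, ..., n_K + alpha_K),
    where n_k counts the points currently allocated to component k."""
    rng = rng or np.random.default_rng()
    n_k = np.bincount(u, minlength=K)     # n_k = sum_i 1{u_i = k}
    return rng.dirichlet(n_k + alpha)
```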
What is the form of the conditional distribution π(u | θ, y) for the auxiliary variables in the mixture model?
It factorizes as π(u | θ, y) = Πᵢ₌₁ⁿ π(u_i | θ, y_i).
How is the probability P(U_i = k | θ, y_i) calculated for the discrete auxiliary variable U_i in the mixture model?
Using Bayes’ theorem: P(U_i = k | θ, y_i) = [fk(y_i | θk) × ωk] / [Σⱼ₌₁ᴷ fj(y_i | θj) × ωj]. It’s a discrete probability distribution over {1, ..., K}.
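A vectorized sketch of this update, again assuming N(θk, 1) component densities: each row of the weight matrix is normalized per Bayes’ theorem, then one label is drawn per observation by inverse-CDF sampling.

```python
import numpy as np
from scipy.stats import norm

def sample_labels(y, theta, omega, rng=None):
    """Draw each U_i from P(U_i = k | theta, y_i), proportional to
    omega_k * f_k(y_i | theta_k); N(theta_k, 1) components assumed."""
    rng = rng or np.random.default_rng()
    w = omega * norm.pdf(y[:, None], loc=theta, scale=1.0)  # n x K numerators
    w /= w.sum(axis=1, keepdims=True)    # normalize each row (Bayes' theorem)
    cum = np.cumsum(w, axis=1)           # row-wise CDFs
    # first index where a uniform draw falls below the CDF = categorical draw
    return (rng.random(len(y))[:, None] < cum).argmax(axis=1)
```

Alternating this draw with sample_theta_k and sample_weights above gives one full sweep of the mixture-model Gibbs sampler.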