L7: Markov Chains, Monte Carlo Estimation Flashcards
Equation for conditional probability?
p(X1 | X2) = p(X1 and X2)/p(X2)
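A minimal sketch (the joint table below is made up) of reading a conditional PMF off a joint PMF by dividing by the marginal:

```python
# Minimal sketch: p(X1 | X2) = p(X1, X2) / p(X2), using a made-up joint PMF.
import numpy as np

# Joint PMF p(X1, X2) over two binary variables; rows index X1, columns index X2.
joint = np.array([[0.3, 0.1],
                  [0.2, 0.4]])

p_x2 = joint.sum(axis=0)   # marginal p(X2): sum out X1
cond = joint / p_x2        # p(X1 | X2); column j is the conditional PMF given X2 = j

print(cond)                # each column sums to 1
```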
Name some dynamical systems with uncertainty
- Natural language analysis
- Mathematical finance
- Epidemiology
- Robot and autonomous vehicle control
Definition of a stochastic process
Sequence of RVs. The range (possible values) of the random variables in a stochastic process is called the state space of the process. Index n represents the time index.
How can we interpret a stochastic process as a BN
We can interpret a stochastic process as a Bayesian network (with infinitely many vertices)
Need initial distribution p(X1)
Conditional distributions (CPTs) p(X_k+1 | X_1, ..., X_k) describe what happens next after a specific history up to time k (no independence assumptions unless given, as in basic BNs)
The number of parameters therefore grows exponentially with time, which is not workable (and the process has infinitely many time steps)
When were Markov chains introduced and by who?
Andrey Markov in 1906
What is the Markov property?
Assumes that X_k+1 depends only on X_k: given X_k, it is conditionally independent of the earlier history X_1, ..., X_k-1
A Markov Chain is an example of what (based on Markov property)?
Directed Acyclic Graph
What are the conditional PMFs in a Markov Chain called?
The transition probabilities
How many parameters are needed to model a Markov chain?
Grows linearly with the time index n (one transition CPT per step), rather than exponentially. The count is not literally n, though: with m possible states, each CPT has m(m-1) parameters, so the total is roughly n * m^2
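A rough, illustrative count for a finite state space with m states (the helper names params_general and params_markov are made up), comparing full-history CPTs with Markov-chain CPTs:

```python
# Parameter counts for a process on m states observed for n time steps.
# Without independence assumptions, the CPT at time k conditions on the whole
# history and needs (m - 1) * m**(k - 1) parameters; with the Markov property
# each step only needs an m-by-m transition CPT, i.e. m * (m - 1) parameters.
def params_general(m, n):
    return (m - 1) + sum((m - 1) * m ** (k - 1) for k in range(2, n + 1))

def params_markov(m, n):
    return (m - 1) + (n - 1) * m * (m - 1)

for n in (2, 5, 10):
    print(n, params_general(2, n), params_markov(2, n))   # exponential vs linear in n
```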
What is the meaning of the time homogeneity assumption?
Also called stationarity (of the transition probabilities). The assumption is that p(X_k+1 | X_k) = p(X_2 | X_1) for all k, i.e. the transition probabilities do not depend on the time index k. The number of required parameters is then constant with respect to the time index.
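A minimal sketch of simulating a time-homogeneous chain; the 3-state transition matrix P is made up, and the same transition CPT is reused at every step:

```python
# Time-homogeneous Markov chain on 3 states: row i of P is p(X_{k+1} | X_k = i),
# and it is the same for every time index k.
import numpy as np

P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.1, 0.3, 0.6]])

rng = np.random.default_rng(0)
state = 0                              # initial state (a degenerate p(X_1))
trajectory = [state]
for _ in range(20):
    state = rng.choice(3, p=P[state])  # one transition of the chain
    trajectory.append(state)
print(trajectory)
```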
Can you do the question on slide 25?
Yes?
What is the limit distribution?
The marginal probability p(X_infinity) = lim_{k -> infinity} p(X_k)
We consider this the ‘typical’ state of the system after burn-in
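A sketch of approximating the limit distribution by repeatedly applying the same made-up transition matrix P to an initial distribution (iteration plays the role of burn-in):

```python
# Approximate lim_{k -> infinity} p(X_k) by iterating p(X_{k+1}) = p(X_k) @ P.
import numpy as np

P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.1, 0.3, 0.6]])

p = np.array([1.0, 0.0, 0.0])   # p(X_1): start in state 0 with certainty
for _ in range(1000):           # burn-in: propagate the marginal forward in time
    p = p @ P
print(p)                        # approximately the limit distribution (sums to 1)
```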
What properties does a continuous random variable have?
No jumps between values; there is an integrable PDF for the RV; probabilities are measured over intervals rather than at single points.
A continuous random variable can be defined as a random variable that can take on an infinite number of possible values. Due to this, the probability that a continuous random variable will take on an exact value is 0. The cumulative distribution function and the probability density function are used to describe the characteristics of a continuous random variable.
The PDF integrates to 1 over the whole range of X
The probability (pushforward) measure is obtained by integrating the PDF over the range we’re interested in
PDF IS NOT A PROBABILITY (PMF is but not PDF, need to integrate it for prob)
Note: there are real-valued RVs with no PDF, and when a PDF exists it is not unique
Note: any function that maps R to [0, inf) and integrates to 1 is the PDF of some continuous RV
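A sketch of these notes using an Exponential(1) PDF (the choice of distribution and grid are just illustrative): the PDF integrates to 1 overall, and integrating it over an interval gives a probability:

```python
# Numerically integrate an Exponential(rate = 1) PDF with a simple Riemann sum.
import numpy as np

def pdf(x, rate=1.0):
    return rate * np.exp(-rate * x)    # PDF of an Exponential(rate) RV, x >= 0

xs = np.linspace(0.0, 50.0, 500_001)   # grid covering essentially all the mass
dx = xs[1] - xs[0]
print(np.sum(pdf(xs)) * dx)            # ~ 1.0: the PDF integrates to 1

a, b = 0.5, 2.0
xs_ab = np.linspace(a, b, 15_001)
dx_ab = xs_ab[1] - xs_ab[0]
print(np.sum(pdf(xs_ab)) * dx_ab)      # ~ P(a <= X <= b) = exp(-a) - exp(-b) ~ 0.471
```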
Difference between PMF and PDF?
PMF = probability mass function (discrete) - already probability, PDF = probability density function (continuous)- needs to be integrated over interval to become probability
Both are referred to as distribution functions
What is CDF?
Defined for both continuous and discrete RVs: F_X(x) = probability that X takes a value less than or equal to x (a staircase for discrete RVs, a non-decreasing curve for continuous RVs)
What is the expected value?
The probability weighted average of the possible outcomes of X
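A minimal sketch of the probability-weighted average for a discrete RV (the values and PMF are made up):

```python
# Expected value of a discrete RV: sum of (outcome * probability of that outcome).
values = [1, 2, 3, 4]
pmf = [0.1, 0.2, 0.3, 0.4]
expected = sum(x * p for x, p in zip(values, pmf))
print(expected)   # 1*0.1 + 2*0.2 + 3*0.3 + 4*0.4 = 3.0
```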
Why is calculating the expected value of an RV often infeasible?
- Distribution function pX may not exist
- Distribution function pX may exist but be hard to compute
- In general, problem may involve high-dimensional sums or integrals
What is Monte Carlo estimation?
Process of approximating expected value of RV by sampling distribution.
Given k independent samples x_1, ..., x_k, estimate the expected value as (1/k) * (x_1 + ... + x_k). By the law of large numbers this tends towards the true value as the number of samples grows.
Requires being able to sample from distribution
The estimation error generally shrinks like O(1/sqrt(k)), where k = number of samples
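A minimal sketch of Monte Carlo estimation where the true value is known, so the shrinking error is visible; the choice of X ~ Uniform(0, 1) and f(x) = x^2 (exact E[f(X)] = 1/3) is just illustrative:

```python
# Monte Carlo estimate of E[f(X)] for X ~ Uniform(0, 1), f(x) = x**2 (exact: 1/3).
import numpy as np

rng = np.random.default_rng(0)
for k in (100, 10_000, 1_000_000):
    samples = rng.random(k)             # k independent samples of X
    estimate = np.mean(samples ** 2)    # (1/k) * sum of f(x_i)
    print(k, estimate, abs(estimate - 1 / 3))   # error shrinks roughly like 1/sqrt(k)
```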
How do we sample from random variables?
- Find the inverse CDF / quantile function (this must be known/computable), sample U uniformly at random from [0, 1], and return X = F^{-1}(U)
If multiple x values have the same probability in the CDF, take the lowest one: the quantile function is defined via the infimum, F^{-1}(u) = inf{x : F(x) >= u}
Often X will have a distribution with a known quantile function, e.g. exponential-family distributions (Gaussian, Gamma, Poisson, etc.)
BUT NOT ALWAYS
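A sketch of inverse-CDF sampling for an Exponential(rate) target, whose quantile function is known in closed form (sample_exponential is a made-up helper name):

```python
# Inverse-CDF sampling: draw U ~ Uniform[0, 1], return X = F^{-1}(U).
# For Exponential(rate), the quantile function is F^{-1}(u) = -ln(1 - u) / rate.
import numpy as np

def sample_exponential(rate, size, rng):
    u = rng.uniform(0.0, 1.0, size)
    return -np.log(1.0 - u) / rate

rng = np.random.default_rng(0)
x = sample_exponential(rate=2.0, size=100_000, rng=rng)
print(x.mean())   # should be close to the true mean 1 / rate = 0.5
```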
In what situations do we need to compute expected values?
Inference with probabilistic AI methods tends to involve computing E[f(X)]:
- Moment statistics (e.g. mean, variance)
- Loss functions (e.g. model fitting, decision problems)
- Computing the probability P_X(A) of an event A
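As a sketch of the last point, the probability P_X(A) is the expected value of an indicator function and can be estimated by Monte Carlo; the standard normal X and the event A = {x : x > 1} are just illustrative:

```python
# P_X(A) = E[1_A(X)], estimated as the fraction of samples that land in A.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)   # samples of X ~ N(0, 1)
print(np.mean(x > 1.0))              # ~ 0.1587 = 1 - Phi(1)
```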