L7: Markov Chains, Monte Carlo Estimation Flashcards

1
Q

Equation for conditional probability?

A

p(X1 | X2) = p(X1 and X2)/p(X2)
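The formula can be sketched for a discrete case: given a joint PMF over two binary variables, condition by dividing the joint by the marginal. The table values below are made-up illustrations.

```python
# Hypothetical joint PMF p(X1, X2) over two binary variables.
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

def p_x2(x2):
    # Marginal p(X2) by summing the joint over X1.
    return sum(p for (x1, b), p in joint.items() if b == x2)

def p_x1_given_x2(x1, x2):
    # p(X1 | X2) = p(X1 and X2) / p(X2)
    return joint[(x1, x2)] / p_x2(x2)

print(p_x1_given_x2(1, 1))  # 0.4 / 0.6
```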

2
Q

Name some dynamical systems with uncertainty

A
  • Natural language analysis
  • Mathematical finance
  • Epidemiology
  • Robot and autonomous vehicle control
3
Q

Definition of a stochastic process

A

A sequence of random variables (RVs). The range (the set of possible values) of the random variables in a stochastic process is called the state space of the process. The index n represents time.

4
Q

How can we interpret a stochastic process as a BN

A

We can interpret a stochastic process as a Bayesian network (with infinitely many vertices).

We need an initial distribution p(X1), plus conditional distributions (CPTs) that describe what happens next after each specific history up to time k (no independence assumptions unless given, as in basic BNs).

The number of parameters therefore grows exponentially with time, which is not workable (in principle we would need infinitely many).

5
Q

When were Markov chains introduced and by who?

A

Andrey Markov, in 1906

6
Q

What is the Markov property?

A

Assumes that X_k+1 depends only on X_k: given X_k, it is conditionally independent of the earlier history X_1, ..., X_k-1.
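A minimal sketch of the Markov property: the next state is sampled using only the current state X_k, never the earlier history. The two-state weather chain and its transition probabilities below are made-up illustrations, not from the slides.

```python
import random

# Hypothetical two-state chain; transition probabilities are assumptions.
P = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def step(state, rng):
    # Sample X_{k+1} from p(X_{k+1} | X_k = state) -- the Markov property:
    # nothing but the current state is consulted.
    states, probs = zip(*P[state].items())
    return rng.choices(states, weights=probs, k=1)[0]

rng = random.Random(0)
x = "sunny"
path = [x]
for _ in range(5):
    x = step(x, rng)  # depends only on the current x, not on path[:-1]
    path.append(x)
print(path)
```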

7
Q

A Markov Chain is an example of what (based on Markov property)?

A

Directed Acyclic Graph

8
Q

What are the conditional PMFs in a Markov Chain called?

A

The transition probabilities

9
Q

How many parameters are needed to model a Markov chain?

A

Grows linearly with the time index n: each of the n-1 transition steps needs its own CPT. Note that each CPT is itself roughly quadratic in the number of states m (about m^2 entries), so the total is not a small linear count.

10
Q

What is the meaning of the time homogeneity assumption?

A

Also called stationarity. The assumption is that p(X_k+1 | X_k) = p(X_2 | X_1) for all k, so the transition probabilities do not depend on the time index. The number of required parameters is then constant with respect to the time index.
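A sketch of what time homogeneity buys us: one transition matrix P is reused at every step, so the marginal at any time is obtained by repeatedly applying the same P to the initial distribution. The matrix entries here are illustrative assumptions.

```python
# Assumed 2-state transition matrix (rows sum to 1) and initial p(X_1).
P = [[0.9, 0.1],
     [0.5, 0.5]]
p = [1.0, 0.0]

def apply(p, P):
    # One step: p(X_{k+1} = j) = sum_i p(X_k = i) * P[i][j].
    return [sum(p[i] * P[i][j] for i in range(len(P))) for j in range(len(P))]

for _ in range(3):
    p = apply(p, P)  # same P at every step -- time homogeneity
print(p)  # marginal p(X_4)
```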

11
Q

Can you do the question on slide 25?

A

Yes?

12
Q

What is the limit distribution?

A

The marginal probability p(X_infinity) = lim (k -> infinity) p(X_k)

We consider this the ‘typical’ state of the system after burn-in
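A sketch of approximating the limit distribution numerically: iterate an (assumed, made-up) transition matrix on an initial distribution until the marginal stops changing, i.e. run the chain past burn-in.

```python
# Assumed 2-state transition matrix; its limit distribution is [5/6, 1/6].
P = [[0.9, 0.1],
     [0.5, 0.5]]
p = [1.0, 0.0]  # initial distribution p(X_1)

def apply(p, P):
    # p(X_{k+1} = j) = sum_i p(X_k = i) * P[i][j]
    return [sum(p[i] * P[i][j] for i in range(len(P))) for j in range(len(P))]

for _ in range(200):  # burn-in: iterate until p barely changes
    p = apply(p, P)
print(p)  # close to the limit distribution [5/6, 1/6] for this chain
```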

13
Q

What properties does a continuous random variable have?

A

No jumps between values. The RV has an integrable PDF. Probabilities are measured over intervals.

A continuous random variable can be defined as a random variable that can take on an infinite number of possible values. Because of this, the probability that a continuous random variable takes on any exact value is 0. The cumulative distribution function and the probability density function are used to describe the characteristics of a continuous random variable.

The PDF integrates to 1 over all X.

The pushforward probability is obtained by integrating the PDF over the ranges we're interested in.

THE PDF IS NOT A PROBABILITY (the PMF is, but the PDF must be integrated over an interval to give a probability).

Note: there are real-valued RVs with no PDF, and when a PDF exists it is often not unique.

Note: any function that maps R to [0, inf) and integrates to 1 is the PDF of some continuous RV.

14
Q

Difference between PMF and PDF?

A

PMF = probability mass function (discrete): it is already a probability. PDF = probability density function (continuous): it must be integrated over an interval to become a probability.

Collectively we refer to them as distribution functions.

15
Q

What is CDF?

A

For both continuous and discrete RVs, the CDF is the probability that X takes a value less than or equal to a given value (a staircase for discrete RVs, an increasing curve for continuous ones).

16
Q

What is the expected value?

A

The probability weighted average of the possible outcomes of X

17
Q

Why is calculating the expected value of an RV often infeasible?

A
  • The distribution function p_X may not exist
  • The distribution function p_X may exist but be hard to compute
  • In general, the problem may involve high-dimensional sums or integrals
18
Q

What is Monte Carlo estimation?

A

The process of approximating the expected value of an RV by sampling from its distribution.

Given a sequence of k independent samples, the estimate is 1/k times the sum of the k samples. This tends towards the true value as more samples are taken.

Requires being able to sample from the distribution.

The estimation error generally shrinks like O(1/sqrt(k)), where k = number of samples.
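A minimal sketch of Monte Carlo estimation: approximate E[f(X)] by the sample mean of f over k independent draws. The choice X ~ Uniform(0, 1) with f(x) = x^2 is an illustrative assumption; for it the true value is E[X^2] = 1/3.

```python
import random

def mc_estimate(f, sampler, k, rng):
    # Monte Carlo estimate: (1/k) * sum of f applied to k i.i.d. samples.
    return sum(f(sampler(rng)) for _ in range(k)) / k

rng = random.Random(42)
# Estimate E[X^2] for X ~ Uniform(0, 1); true value is 1/3.
est = mc_estimate(lambda x: x * x, lambda r: r.random(), 100_000, rng)
print(est)  # close to 1/3; the error shrinks like O(1/sqrt(k))
```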

19
Q

How do we sample from random variables?

A
  • Find the inverse CDF / quantile function (this must be known), sample u uniformly at random from [0, 1], and return the quantile function evaluated at u

If multiple values have the same probability in the CDF then take the lowest one (the infimum)

Often X will have a distribution with a known quantile function, e.g. exponential-family distributions (Gaussian, Gamma, Poisson, etc.)
BUT NOT ALWAYS
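A sketch of inverse-transform sampling for a case where the quantile function is known in closed form: for Exponential(rate), the quantile function is F^-1(u) = -ln(1 - u) / rate. The rate value and sample count below are arbitrary choices for illustration.

```python
import math
import random

def sample_exponential(rate, rng):
    # Inverse-transform sampling: draw U ~ Uniform(0, 1), return F^-1(U).
    u = rng.random()
    return -math.log(1.0 - u) / rate

rng = random.Random(0)
samples = [sample_exponential(2.0, rng) for _ in range(200_000)]
mean = sum(samples) / len(samples)
print(mean)  # close to the true mean 1/rate = 0.5
```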

20
Q

In what situations do we need to compute expected values?

A

Inference with probabilistic AI methods tends to involve computing E[f(X)]:
* Moment statistics (e.g. mean, variance)
* Loss functions (e.g. model fitting, decision problems)
* Computing the probability P_X(A) of an event A.
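The last point can be sketched concretely: a probability P_X(A) is the expected value of the indicator function 1_A(X), so it can itself be estimated by Monte Carlo. The event A = {X > 0.7} for X ~ Uniform(0, 1) is a made-up example, with true probability 0.3.

```python
import random

# P_X(A) = E[1_A(X)]: average the indicator of the event over k samples.
rng = random.Random(1)
k = 100_000
estimate = sum(1 for _ in range(k) if rng.random() > 0.7) / k
print(estimate)  # close to P(X > 0.7) = 0.3
```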