Dynamic probabilistic models Flashcards

1
Q

Describe how we view the world when using these models

A

We view the world in snapshots or time slices

Each time slice contains a set of random variables

Some of these are hidden and some are observable

2
Q

How do we denote the state variables at time t?

A

These unobservable variables are denoted as:

Xt

3
Q

How do we denote the observable evidence variables at time t?

A

Et

where the observation at time t is:

Et = et for some set of values et

4
Q

What running example do we use (the rain situation)?

A

You are a security guard in an underground location.

You want to know whether it is raining today, but you can only tell by seeing whether someone brings in an umbrella.

For each day t, the set Et contains a single evidence variable,

Umbrellat / Ut

The set Xt contains a single state variable

Raint / Rt

We then assume that the time between slices is fixed

and that the evidence starts arriving at t = 1

5
Q

What is the difference between a first-order Markov process and a second-order one?

A

See picture. In a first-order Markov process, the current state depends only on the previous state; in a second-order process, it depends on the two previous states.

6
Q

Explain the transition model

A

The transition model specifies the probability distribution over the latest state variables, given the previous values.

Can be written as:

P(Xt | X0:t-1)

Conditioning on the entire history would make this distribution grow without bound as t increases, but with the Markov assumption we can write the following for a first-order Markov process:

P(Xt | X0:t-1) = P(Xt | Xt-1)

Extra info:

For a second-order Markov process it would be:

P(Xt | X0:t-1) = P(Xt | Xt-2, Xt-1)
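
To make this concrete, here is a minimal sketch in Python of a first-order transition model for the umbrella world. The 0.7/0.3 values are an assumption (the standard textbook numbers), but they are consistent with the <0.45, 0.1> computation on the filtering card later in this deck:

# Transition model P(Rain_t | Rain_t-1) as a lookup table.
# TRANSITION[previous_state][next_state] = probability.
TRANSITION = {
    True:  {True: 0.7, False: 0.3},  # if it rained yesterday
    False: {True: 0.3, False: 0.7},  # if it did not rain yesterday
}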

7
Q

Explain the sensor model

A

See figure

The evidence Et could in principle depend on all previous state and evidence variables as well as the current state variable Xt.

We can make the following sensor Markov assumption:

P(Et | X0:t, E0:t-1) = P(Et | Xt)

So the sensor model will therefore be:

P(Et | Xt)
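
Continuing the sketch from the transition-model card, here is the sensor model for the umbrella world as a lookup table. The 0.9/0.2 values are again the standard textbook numbers, assumed here:

# Sensor model P(Umbrella_t | Rain_t): the observation depends only on
# the current state, per the sensor Markov assumption.
# SENSOR[rain_state][umbrella_observed] = probability.
SENSOR = {
    True:  {True: 0.9, False: 0.1},  # it is raining
    False: {True: 0.2, False: 0.8},  # it is not raining
}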

8
Q

Why do the arrows go from X to E?

A

The arrows go from the actual state of the world to the sensor values because the rain causes the umbrella to appear.

Bonus info:

For inference we go in the other direction: from observed evidence back to the hidden state.

9
Q

What are the requirements for specifying a Markov process?

A

A prior probability distribution over the initial state, P(X0)

A transition model, P(Xt | Xt-1)

A sensor model, P(Et | Xt)
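
Together with the TRANSITION and SENSOR tables sketched on the earlier cards, a prior over the initial state completes the specification. A minimal sketch, assuming the uniform prior that the filtering card later starts from:

# Prior probability distribution P(R_0) over the initial state.
PRIOR = {True: 0.5, False: 0.5}
# PRIOR, TRANSITION and SENSOR together fully specify the Markov process.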

10
Q

What are the basic inference tasks for a Markov process / dynamic model?

A

Filtering

Prediction

Smoothing

Most likely explanation

11
Q

What is filtering?

A

Compute the belief state – the posterior distribution of the current state – given the evidence so far, i.e., P(Xt | e1:t). For example, P(Raint | umbrella1:t).

In our example this would mean computing the probability of rain today, given all the observations of the umbrella carrier made so far.

Filtering is what a rational agent does to keep track of the current state, so that rational decisions can be made.

12
Q

What is prediction?

A

This is the task of computing the posterior distribution over a future state, given all the evidence we have so far.

So we wish to compute P(Xt+k | e1:t) for some k > 0

In the umbrella example, this might mean computing the probability of rain three days from now, given all evidence to date.

Prediction is useful for evaluating possible courses of action based on their expected outcomes.

13
Q

What is smoothing?

A

This is the task of computing the posterior distribution over a past state, given all the evidence up to the present. So we might compute:

P(Xk | e1:t)

for some k such that

0 <= k <= t.

In other words: computing the probability that it rained Wednesday, given all the observations of the umbrella carrier made up to today.

Why?

Smoothing provides a better estimate of the state than what was available at that time, because it incorporates more evidence.

14
Q

What is the most likely explanation?

A

Given a sequence of observations, we might wish to find the sequence of states that is most likely to have generated those observations. That is, we wish to compute

argmax_x1:t P(x1:t | e1:t)

For example, if the umbrella appears on each of the first three days and is not there the fourth, then the most likely explanation is that it rained on the three days and not on the fourth.

Bonus info:

This is also useful in the field of speech recognition.

15
Q

How do we perform filtering?

A

See figure

https://gyazo.com/65025743759cbf9429cd9462db282e68

  1. Get the prior probability

https://gyazo.com/1f225f8079f051f1f0917f4cdcb54bc7

  2. Predict R1

https://gyazo.com/14b249e5bcc573c5e250ea4485aab559

  3. Update by multiplying this with the probability of the evidence, then normalise

https://gyazo.com/19e8c6a4e2d30f3bb029a27390b6c5e7

we normalize <0.45, 0.1> by computing:

<0.45 / (0.45 + 0.1), 0.1 / (0.45 + 0.1)> ≈ <0.818, 0.182>

For day two we use this last update in place of the prior probability.

https://gyazo.com/a1fdc4441bbf996e8ab37e18969732af
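
The same predict-update-normalise cycle as a minimal Python sketch, reusing the PRIOR, TRANSITION and SENSOR tables from the earlier cards (the helper name filter_step is mine, not from the deck):

def filter_step(belief, umbrella):
    # 1. Predict: P(R_t | e_1:t-1) = sum over p of P(R_t | p) * belief(p)
    predicted = {r: sum(TRANSITION[p][r] * belief[p] for p in (True, False))
                 for r in (True, False)}
    # 2. Update: weight each state by the evidence likelihood P(u_t | R_t)
    updated = {r: SENSOR[r][umbrella] * predicted[r] for r in (True, False)}
    # 3. Normalise so the probabilities sum to 1
    total = sum(updated.values())
    return {r: updated[r] / total for r in (True, False)}

belief = PRIOR
for u in [True, True]:              # umbrella seen on days 1 and 2
    belief = filter_step(belief, u)
# Day 1 gives <0.45, 0.1> before normalising, i.e. about <0.818, 0.182>;
# day 2 then gives about <0.883, 0.117>.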

16
Q

How do we do prediction?

A

The task of prediction can be seen as filtering without the addition of new evidence.

https://gyazo.com/6c6a75dd423882ee2e2ca310371b1738

Bonus info: as k becomes very large (looking far into the future), the predicted distribution for rain converges to <0.5, 0.5>, the stationary distribution of the process.
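
As a sketch, continuing the filtering code above: prediction is just the predict step applied repeatedly, with no evidence update (predict_step is again my name for it):

def predict_step(belief):
    # Push the belief one step forward through the transition model only.
    return {r: sum(TRANSITION[p][r] * belief[p] for p in (True, False))
            for r in (True, False)}

future = belief
for _ in range(20):                 # look 20 days ahead
    future = predict_step(future)
# future is now very close to the stationary distribution <0.5, 0.5>.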

17
Q

What are the three components for filtering?

A

https://gyazo.com/75c4febb3d73e60694d253fb8df65860

Sensor model

Transition model

Last update

18
Q

What are the components of the update function?

A

Transition model

Last update

19
Q

What are the components of the smoothing function?

A

Backward message

Forward message

https://gyazo.com/e03237a9d30d6297c1915d5a96f41c01

20
Q

How do we perform smoothing?

A

This is a recursive process; follow the example here:

Step 1:

https://gyazo.com/3d045798aa7dd7c1253dd1f915c55d7c

Step 2: initial backward message:

https://gyazo.com/b155cadbfcc0287465d89bfb51f4a7aa

Step 3: calculate probability given forward and backward
https://gyazo.com/9ac4f440c90a1c8e6d6e3714e98378ef

Step 4: calculate backward for b2:2

https://gyazo.com/0855da7a239fc43580d298b0cbc2a59c

Step 5 (last): calculate probability for R1

https://gyazo.com/b57baf95a2ebb1a008cd0c8e5c4ada22
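
The same recursion as a minimal Python sketch, reusing PRIOR, TRANSITION, SENSOR and filter_step from the earlier cards (the function names are mine):

def backward_step(b, umbrella):
    # b_k:t(r) = sum over r' of P(u_k | r') * b_k+1:t(r') * P(r' | r)
    return {r: sum(SENSOR[n][umbrella] * b[n] * TRANSITION[r][n]
                   for n in (True, False))
            for r in (True, False)}

def smooth(evidence):
    # Forward pass: cache every filtered (forward) message.
    forward = [PRIOR]
    for u in evidence:
        forward.append(filter_step(forward[-1], u))
    # Backward pass: combine forward and backward messages at each step.
    b = {True: 1.0, False: 1.0}
    smoothed = [None] * len(evidence)
    for k in range(len(evidence), 0, -1):
        unnorm = {r: forward[k][r] * b[r] for r in (True, False)}
        total = sum(unnorm.values())
        smoothed[k - 1] = {r: unnorm[r] / total for r in (True, False)}
        b = backward_step(b, evidence[k - 1])
    return smoothed

# smooth([True, True])[0] gives P(R1 | u1, u2), about <0.883, 0.117>.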

21
Q

What are some extra observations about smoothing?

A

The forward and backward recursions each take constant time per step, so smoothing a single time step k takes O(t) time.

How can we make the algorithm run in O(t) time for all k at once? Simply cache the results of the forward pass before starting the backward pass.

The algorithm is also called the Forward-backward algorithm.

22
Q

Why is the most likely explanation useful?

A

If we see a sequence [true, true, false, true, true]

then perhaps it did not rain on the third day, or perhaps it did rain and the courier simply left his umbrella behind that day. Most likely explanation judges the whole state sequence against the whole observation sequence instead of judging each day in isolation.

23
Q

Explain the MLE algorithm:

A

By viewing every possible state sequence as a path through a trellis with one node per state per time step, we can proceed as in the figure:

https://gyazo.com/1a9307e21f846b037b3d96dff2346ef4

This is the Viterbi algorithm, which we use to compute the most likely explanation.
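
A minimal sketch of the Viterbi recursion in Python, again reusing PRIOR, TRANSITION and SENSOR from the earlier cards; m holds, for each state, the probability of the most likely path ending in that state:

def viterbi(evidence):
    # m(r) = probability of the most likely path ending in state r.
    m = {r: SENSOR[r][evidence[0]] *
            sum(TRANSITION[p][r] * PRIOR[p] for p in (True, False))
         for r in (True, False)}
    backpointers = []
    for u in evidence[1:]:
        # For each state, remember which predecessor maximises the path.
        best = {r: max((True, False), key=lambda p: m[p] * TRANSITION[p][r])
                for r in (True, False)}
        m = {r: SENSOR[r][u] * m[best[r]] * TRANSITION[best[r]][r]
             for r in (True, False)}
        backpointers.append(best)
    # Recover the most likely sequence by walking the pointers backwards.
    state = max((True, False), key=lambda r: m[r])
    path = [state]
    for best in reversed(backpointers):
        path.append(best[path[-1]])
    return list(reversed(path))

# viterbi([True, True, False, True, True]) -> [True, True, False, True, True]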

24
Q

What are some final comments on the MLE algorithm?

A

For long observation sequences we are likely to run into numerical underflow, because we multiply together many probabilities that are all less than 1.

Solution: use logarithms of the probabilities and replace the multiplications with sums.
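
A sketch of the log-space version of the Viterbi update from the previous card (log_viterbi_step is a hypothetical helper, not from the deck): sums of log-probabilities replace products, so long sequences no longer underflow:

import math

def log_viterbi_step(log_m, u):
    # Same recursion as in viterbi(), but with log-probabilities:
    # multiplications become additions, max is unchanged.
    return {r: math.log(SENSOR[r][u]) +
               max(log_m[p] + math.log(TRANSITION[p][r])
                   for p in (True, False))
            for r in (True, False)}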