markov models Flashcards
how do you solve a POMDP?
the model has to find, for any action/observation history, the action that maximizes the expected discounted reward.
a POMDP model contains: (3)
- A state transition (probability) function: P(s_{t+1}|s_t, a_t) (probability of the next state, given the current state and the action)
- An observation function: P(o_t|s_t,a_t) (probability of the observation, given the current state and the action)
- A reward function: E(r_t|s_t,a_t) (expected reward, given the current state and the action)
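A minimal sketch (not part of the original cards) of how these three functions could be stored as arrays; the 2-state/2-action/2-observation sizes and the names T, O, R are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical POMDP with 2 states, 2 actions, 2 observations (all numbers made up).
n_states, n_actions, n_obs = 2, 2, 2

# State transition function: T[s, a, s2] = P(s_{t+1} = s2 | s_t = s, a_t = a)
T = np.full((n_states, n_actions, n_states), 0.5)

# Observation function: O[s, a, o] = P(o_t = o | s_t = s, a_t = a)
O = np.full((n_states, n_actions, n_obs), 0.5)

# Reward function: R[s, a] = E(r_t | s_t = s, a_t = a)
R = np.zeros((n_states, n_actions))

# Each transition and observation distribution must sum to 1.
assert np.allclose(T.sum(axis=2), 1.0) and np.allclose(O.sum(axis=2), 1.0)
```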
Partially Observable Markov Decision Processes are
MDPs in which the states are only partially observable, so the agent does not know exactly what the current state is.
what is the difference between POMDPs and MDPs?
in MDPs, an agent knows exactly what the current state is; in POMDPs, it only receives observations that depend on the state.
what are 2 methods to find the optimal policy of an MDP
value iteration
policy iteration
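As an illustration, a minimal value-iteration sketch (not part of the original cards), assuming the MDP is given as a transition array T[s, a, s'], a reward array R[s, a], and a discount factor gamma; the toy numbers are made up purely for illustration.

```python
import numpy as np

def value_iteration(T, R, gamma=0.9, tol=1e-6):
    """T[s, a, s2] = P(s2 | s, a); R[s, a] = expected reward. Returns (utilities, policy)."""
    n_states, n_actions, _ = T.shape
    U = np.zeros(n_states)
    while True:
        # Bellman update: Q(s, a) = R(s, a) + gamma * sum_s' P(s'|s,a) * U(s')
        Q = R + gamma * (T @ U)          # shape (n_states, n_actions)
        U_new = Q.max(axis=1)
        if np.max(np.abs(U_new - U)) < tol:
            return U_new, Q.argmax(axis=1)
        U = U_new

# Toy 2-state, 2-action MDP (numbers made up for illustration).
T = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
U, policy = value_iteration(T, R)
print(U, policy)
```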
what is the utility in an MDP?
the utility of a state is the expected sum of discounted rewards if the agent executes the policy pi. The true utility of a state corresponds to the optimal policy pi*.
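In symbols (a standard formulation, not written out on the original card), with discount factor gamma between 0 and 1: U^pi(s) = E(\sum_{t=0}^{\infty} \gamma^t r_t | s_0 = s, pi), and the true utility is U(s) = U^{pi*}(s) = max_pi U^pi(s).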
what is a markov decision process?
markov decision processes are models for choosing actions given the state of the world; the functions from states to actions are generally called policies. An optimal policy associates an optimal decision with every state that the agent might reach in an uncertain environment.
why / when are HMMs used?
because an experimenter cannot always observe the states directly; they can only be measured through their observable outcomes.
how is a hidden markov model different from a markov model?
hidden markov models do not make the states directly observable; instead, each state is represented by a probability distribution over the possible observations that can be expected to occur in that state: the probability that we observe k given that we’re in state i.
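A small generative sketch of this idea (not from the original cards): the emission matrix B[i, k] plays the role of "the probability that we observe k given we're in state i"; all names and numbers are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state HMM with 3 possible observations (numbers made up).
pi = np.array([0.6, 0.4])                 # starting distribution over hidden states
A  = np.array([[0.7, 0.3],                # A[i, j] = P(next state j | current state i)
               [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1],           # B[i, k] = P(observe k | hidden state i)
               [0.1, 0.3, 0.6]])

# Generate a short sequence: the states stay hidden, only the observations are seen.
state = rng.choice(2, p=pi)
observations = []
for _ in range(5):
    observations.append(rng.choice(3, p=B[state]))
    state = rng.choice(2, p=A[state])
print(observations)
```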
how do markov chains work?
1) starts in a state decided by the starting distribution probabilities
2) visits states based on the probability of going from one state to the next
3) the resulting sequence of states is a stochastic process
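A minimal sketch of these three steps (not part of the original cards), assuming a made-up 3-state transition matrix P and starting distribution:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-state markov chain (transition probabilities made up).
start = np.array([1.0, 0.0, 0.0])          # 1) starting distribution
P = np.array([[0.9, 0.1, 0.0],             # 2) P[i, j] = probability of going from state i to j
              [0.0, 0.8, 0.2],
              [0.5, 0.0, 0.5]])

state = rng.choice(3, p=start)
path = [state]
for _ in range(10):                         # 3) repeated sampling yields a stochastic process
    state = rng.choice(3, p=P[state])
    path.append(state)
print(path)
```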
when we have control over the state transitions but the states aren’t completely observable then we have a:
partially observable markov decision process: POMDP
when we do not have control over the state transitions and the states are not completely observable then we have a:
Hidden markov model (HMM)
when we do have control over the state transitions and the states are completely observable then we have a:
markov decision process (MDP)
when we do not have control over the state transitions but the states are completely observable then we have a:
markov chain