Markov Decision Making Flashcards
1
Q
Deterministic Dynamic Programming (DDP)
A
for any node, the next node is fully determined by the chosen action k
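As a minimal sketch of this idea (the transition function T_n, stage return r_n, and value function f_n are notation added here, not from the card), the deterministic recursion can be written as:

s_{n+1} = T_n(s_n, k), \qquad f_n(s) = \max_{k} \left\{ r_n(s, k) + f_{n+1}\!\left( T_n(s, k) \right) \right\}

so the only choice at each node is the action k; there is no randomness in where the arc leads.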
2
Q
Stages
A
vertical levels of the network diagram
e.g. time periods
3
Q
States
A
horizontal levels of the network diagram
e.g. stock levels, success
4
Q
States at a stage
A
nodes
5
Q
Actions
A
arcs connecting nodes in consecutive stages
6
Q
Return
A
immediate gains from an action (associated with an arc)
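To tie cards 1-6 together, here is a minimal deterministic DP sketch in Python; the stage/state labels, action names, and return values are made up for illustration and are not from the cards.

# arcs[n][s] maps each action k to (next state at stage n+1, immediate return)
arcs = {
    0: {"A": {"k1": ("B", 4), "k2": ("C", 2)}},
    1: {"B": {"k1": ("D", 3), "k2": ("E", 6)},
        "C": {"k1": ("D", 5), "k2": ("E", 1)}},
}
terminal_value = {"D": 0, "E": 0}   # value of the final-stage nodes


def solve(arcs, terminal_value):
    """Backward recursion: best return-to-go and best action for each node."""
    f = dict(terminal_value)        # f[s] = best total return from state s onwards
    policy = {}
    for n in sorted(arcs, reverse=True):          # work backwards through the stages
        for s, actions in arcs[n].items():
            best_k, best_val = max(
                ((k, r + f[s_next]) for k, (s_next, r) in actions.items()),
                key=lambda kv: kv[1],
            )
            f[s], policy[s] = best_val, best_k
    return f, policy


values, policy = solve(arcs, terminal_value)
print(values["A"], policy)   # best total return from node A, best action at each node

Running it prints the best total return achievable from node A and the action chosen at every node, found by working backwards from the last stage and adding each arc's immediate return to the value of the node it leads to.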
7
Q
Stochastic Dynamic Programming (SDP)
A
for any node, the next node depends on both the chosen action k and the outcome d, which is a random variable
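A minimal stochastic counterpart of the earlier sketch, again with made-up stages, probabilities, and returns: after choosing action k, a random outcome d (with known probabilities) decides which node comes next, so the recursion maximises the expected return-to-go rather than a known return.

# transitions[n][s][k] is a list of (probability, next state, immediate return)
transitions = {
    0: {"A": {"k1": [(0.7, "B", 4), (0.3, "C", 0)],
              "k2": [(1.0, "C", 2)]}},
    1: {"B": {"k1": [(0.5, "D", 6), (0.5, "D", 2)]},
        "C": {"k1": [(1.0, "D", 3)]}},
}
terminal_value = {"D": 0.0}


def solve(transitions, terminal_value):
    """Backward recursion on the expected return-to-go."""
    f = dict(terminal_value)
    policy = {}
    for n in sorted(transitions, reverse=True):
        for s, actions in transitions[n].items():
            best_k, best_val = max(
                ((k, sum(p * (r + f[s_next]) for p, s_next, r in outcomes))
                 for k, outcomes in actions.items()),
                key=lambda kv: kv[1],
            )
            f[s], policy[s] = best_val, best_k
    return f, policy


values, policy = solve(transitions, terminal_value)
print(values["A"], policy)   # maximum expected return from node A

The only change from the deterministic sketch is that each action now leads to a probability-weighted set of outcomes, so the value of an action is an expectation over d instead of a single arc's return.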