CS7642_Readings Flashcards
Any finite-length, discrete-time MDP can be converted to a TTD-MDP? (True/False)
True. This is from the TTD-MDP paper in the week 10 readings.
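A rough sketch of the construction (notation assumed here, loosely following the TTD-MDP papers): the TTD-MDP's "states" are the MDP's partial trajectories, which form a tree for a finite horizon L, and the reward function is replaced by a target distribution P(T) over complete trajectories.

```latex
\underbrace{\langle S, A, P, R \rangle}_{\text{finite-horizon MDP}}
\;\longrightarrow\;
\underbrace{\langle \mathcal{T}, A, P', P(\mathcal{T}) \rangle}_{\text{TTD-MDP}},
\qquad
\mathcal{T} = \{\, (s_0, s_1, \ldots, s_k) : 0 \le k \le L \,\}
```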
It is always possible to find a stochastic policy that exactly solves a TTD-MDP? (True/False)
False. Not every target trajectory distribution is achievable, because the environment's transition probabilities constrain what any policy can produce; for example, if the only available action leads to two trajectories with probability 0.5 each, no policy can realize a 0.9/0.1 target. See Bhat et al. (2007).
In policy shaping, the most important parameter for learning is gamma? (True/False)
False. The most important parameter for learning in policy shaping is the probability that a human's evaluation of an action choice is correct for a given learner, denoted C.
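A minimal sketch of why C dominates, assuming the Bayes-optimal feedback rule from Griffith et al. (2013): for each action, the probability that it is optimal given Δ = (# "good" labels) − (# "bad" labels) is C^Δ / (C^Δ + (1−C)^Δ), and this is multiplied with the agent's own action distribution. The function name and example numbers below are illustrative, not from the paper.

```python
import numpy as np

def policy_shaping_combine(agent_probs, delta, C):
    """Combine an agent's action distribution with human feedback.

    agent_probs : the learner's own P(a) over actions (e.g., from Q-values)
    delta       : per-action (# "good" labels) - (# "bad" labels)
    C           : probability a single piece of feedback is correct --
                  the key policy-shaping parameter.
    """
    agent_probs = np.asarray(agent_probs, dtype=float)
    delta = np.asarray(delta, dtype=float)
    # P(action is optimal | feedback) under the Bayes-optimal feedback model
    feedback_probs = C**delta / (C**delta + (1.0 - C)**delta)
    combined = agent_probs * feedback_probs
    return combined / combined.sum()

# C = 0.5 makes feedback uninformative; C near 1 lets a few labels dominate.
print(policy_shaping_combine([0.25, 0.25, 0.25, 0.25], [3, -1, 0, 0], C=0.5))
print(policy_shaping_combine([0.25, 0.25, 0.25, 0.25], [3, -1, 0, 0], C=0.95))
```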
An oracle is always a better teacher than a human in policy shaping? (True/False)
False. Cederborg et al. (2015) report that this is not the case.
Like MDPs, every Markov game has a non-empty set of optimal policies, at least one of which is stationary? (True/False)
True. See Littman (1994).
There is always at least one optimal stationary deterministic policy for a Markov game? (True/False)
False. Only regular (single-agent) MDPs make this guarantee. In a Markov game the optimal policy may need to be stochastic, since a deterministic policy can be exploited by the opponent in a zero-sum game such as matching pennies (see the sketch below).
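A small sketch of the standard counterexample (matching pennies, a zero-sum game of the kind Littman (1994) considers); the payoff matrix and helper below are illustrative.

```python
import numpy as np

# Matching pennies, row player's payoff: the opponent best-responds,
# so a policy is judged by its worst-case expected payoff.
R = np.array([[+1, -1],
              [-1, +1]])

def worst_case_value(policy):
    """Worst-case expected payoff of a (possibly stochastic) row policy."""
    return min(float(policy @ R[:, j]) for j in range(R.shape[1]))

print(worst_case_value(np.array([1.0, 0.0])))  # deterministic: -1.0 (exploitable)
print(worst_case_value(np.array([0.5, 0.5])))  # uniform mixed:  0.0 (minimax-optimal)
```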