Reinforcement Learning Flashcards
1
Q
when you find an optimal policy, it has maximized:
A
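the expected discounted return, i.e. the expected sum of rewards where a reward received t steps in the future is weighted by gamma^t (0 <= gamma <= 1)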
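As a rough illustration of the quantity being maximized, the sketch below computes the discounted sum for a single, already observed reward sequence. The function name `discounted_return` and the sample rewards are made up for this card; an optimal policy maximizes the expectation of this sum over all the trajectories it could produce.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one episode's reward sequence."""
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Illustrative rewards only: 1.0 + 0.5 * 0.0 + 0.25 * 2.0
example = discounted_return([1.0, 0.0, 2.0], gamma=0.5)
print(example)
```

A smaller gamma makes the agent care mostly about immediate rewards; a gamma close to 1 makes distant rewards count almost as much as near ones.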
2
Q
what 3 things are needed for reinforcement learning?
A
1) a transition model -> how actions influence states
2) reward r -> immediate value of state-action transition
3) policy pi -> maps states to actions
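One way to picture these three pieces is as plain lookup tables. The sketch below assumes a tiny deterministic world with two states; the state names, action names, and reward numbers are all invented for illustration and do not come from any particular library.

```python
# Hypothetical two-state world; every name and number here is illustrative.

# 1) Transition model: (state, action) -> next state
transition = {
    ("cold", "heat"): "warm",
    ("cold", "wait"): "cold",
    ("warm", "heat"): "warm",
    ("warm", "wait"): "cold",
}

# 2) Reward: immediate value of each state-action transition
reward = {
    ("cold", "heat"): -1.0,
    ("cold", "wait"): 0.0,
    ("warm", "heat"): 1.0,
    ("warm", "wait"): 2.0,
}

# 3) Policy pi: state -> action
policy = {"cold": "heat", "warm": "heat"}


def rollout(start, steps):
    """Follow the policy for a fixed number of steps, collecting rewards."""
    state, rewards = start, []
    for _ in range(steps):
        action = policy[state]
        rewards.append(reward[(state, action)])
        state = transition[(state, action)]
    return state, rewards


final_state, collected = rollout("cold", 3)
print(final_state, collected)
```

In a real problem the transition model is usually stochastic, so it would map each state-action pair to a probability distribution over next states rather than to a single state.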
3
Q
what is reinforcement learning?
A
the agent receives no examples and starts without a model of the environment or a utility function. The agent gets feedback only through rewards. The task is to learn, through trial and error, how to behave successfully in order to achieve a goal while interacting with an external environment.
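The trial-and-error loop can be sketched with tabular Q-learning, which is one common algorithm among several and is not named on this card. The example assumes a made-up chain of five states with a reward of 1 for reaching the last one. The agent never reads the `step` function's rules; it only observes the next state and reward that come back. All constants (learning rate, discount, exploration rate, episode count) are arbitrary choices for illustration.

```python
import random

N_STATES, GOAL = 5, 4   # hypothetical chain: states 0..4, goal at state 4
ACTIONS = (-1, +1)      # step left or step right


def step(state, action):
    """Environment dynamics; the agent only sees the returned outcome."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Start with no knowledge: every action value is zero.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < epsilon:
                # Explore: try a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick the best-known action, breaking ties randomly.
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            nxt, r = step(state, action)
            # Move the estimate toward reward plus discounted best next value.
            target = r + gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


q = q_learning()
# Greedy policy read off the learned table, for each non-goal state.
greedy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(greedy)
```

The learned table plays the role of the utility function the agent started without, and the greedy policy derived from it is the behavior the rewards have shaped.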