Reinforcement Learning Flashcards

Question 1

Q

Define reward in RL

Answer

A

The numerical signal (a scalar value) that implicitly expresses the agent's goal by encouraging goal-directed state transitions and punishing unwanted ones. (2) C06 S 25

Question 2

Q

Define action-value function in RL

Answer

A

The action-value function q describes the expected cumulative discounted reward obtained by selecting a specific action in a particular state and following a specific policy thereafter. (2) C06 S 37
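
To make "cumulative discounted reward" concrete, here is a minimal sketch of the discounted return for one observed reward sequence. The function name, the default discount factor, and the example rewards are assumptions chosen for illustration and do not come from the cited slides.

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum over k of gamma**k * rewards[k] for one reward sequence."""
    g = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical rewards received after taking some action in some state.
example = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```

The action value q(s, a) is the expectation of this quantity over the sequences that can follow taking action a in state s under the policy.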

Question 3

Q

Define approximate RL

Answer

A

The agent predicts values (or actions, in the case of policy-gradient methods) with the help of non-linear function approximators (such as neural networks) that generalize across states. (2) C06 S 61
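
The idea can be sketched with a very small non-linear approximator. The class below is a hypothetical one-hidden-layer tanh network over a scalar state, trained with a semi-gradient TD(0) style update. The architecture, learning rate, and transition values are assumptions for illustration, not a method taken from the cited slides.

```python
import math
import random


class TinyValueNet:
    """v(s) ~ sum_j w2[j] * tanh(w1[j] * s + b1[j]) for a scalar state s."""

    def __init__(self, hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.w2 = [rng.uniform(-1, 1) for _ in range(hidden)]

    def predict(self, s):
        return sum(w2 * math.tanh(w1 * s + b1)
                   for w1, b1, w2 in zip(self.w1, self.b1, self.w2))

    def update(self, s, target, lr=0.01):
        """One gradient step on 0.5 * (target - v(s))**2, target held fixed."""
        error = target - self.predict(s)
        for j in range(len(self.w2)):
            h = math.tanh(self.w1[j] * s + self.b1[j])
            grad_pre = self.w2[j] * (1.0 - h * h)  # dv / d(pre-activation j)
            self.w2[j] += lr * error * h
            self.w1[j] += lr * error * grad_pre * s
            self.b1[j] += lr * error * grad_pre
        return error


# Hypothetical transition (s, r, s_next) used for a TD(0) style update.
net = TinyValueNet()
gamma = 0.9
s, r, s_next = 0.2, 1.0, 0.3
td_target = r + gamma * net.predict(s_next)  # bootstrapped target
net.update(s, td_target)
```

Because the weights are shared across all inputs, an update at one state also shifts the predictions for nearby states, which is the generalization the card refers to.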

Question 4

Q

Define the state-value function in RL

Answer

A

The state-value function is the expected return when a specific policy is followed after
visiting a particular state: $v_\pi(s) = \mathbb{E}_\pi[G_t \mid S_t = s]$.
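
As a hedged illustration of how this card relates to the action-value cards: under a policy π, the state value equals the policy-weighted average of the action values, v_π(s) = Σ_a π(a|s) q_π(s, a). The sketch below assumes a small discrete action set; the function name and the example numbers are hypothetical.

```python
def state_value(policy_probs, action_values):
    """Average the action values of one state, weighted by the policy.

    policy_probs:  dict mapping action -> pi(a | s)
    action_values: dict mapping action -> q_pi(s, a)
    """
    return sum(policy_probs[a] * action_values[a] for a in policy_probs)


# Hypothetical state with two actions.
pi_s = {"left": 0.25, "right": 0.75}
q_s = {"left": 2.0, "right": 4.0}
v_s = state_value(pi_s, q_s)  # 0.25 * 2.0 + 0.75 * 4.0
```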

Question 5

Q

Define the action-value function in RL

Answer

A

The action-value function q is the expected return when a specific policy is followed after choosing an action in a particular state.
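
One common way to estimate this expectation from experience is first-visit Monte Carlo averaging, sketched below under stated assumptions: episodes are complete, each step is a (state, action, reward) tuple where the reward follows the action, and all names and data are hypothetical.

```python
from collections import defaultdict


def mc_action_values(episodes, gamma=1.0):
    """First-visit Monte Carlo estimate of q(s, a) from complete episodes."""
    returns = defaultdict(list)
    for episode in episodes:
        g = 0.0
        first_visit_return = {}
        # Walk backwards, accumulating the discounted return. Overwriting
        # means the value kept for each pair belongs to its earliest visit.
        for state, action, reward in reversed(episode):
            g = reward + gamma * g
            first_visit_return[(state, action)] = g
        for pair, value in first_visit_return.items():
            returns[pair].append(value)
    # Average the sampled returns for every (state, action) pair.
    return {pair: sum(vals) / len(vals) for pair, vals in returns.items()}


# Two hypothetical episodes collected while following the same policy.
episodes = [
    [("s0", "a", 1.0), ("s1", "b", 2.0)],
    [("s0", "a", 0.0), ("s1", "b", 4.0)],
]
q_hat = mc_action_values(episodes, gamma=0.5)
```

With more sampled episodes, the averages approach the expected return that the definition describes.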

Question 6

Q

What is the reward hypothesis in RL?

Answer

A

The hypothesis that all of what we mean by goals and purposes can be well thought of as the maximization
of the expected value of the cumulative sum of a received scalar signal (called reward).

(6 cards)