Reinforcement Learning Flashcards

1
Q

Define reward in RL

A

The numerical signal (a scalar value) that implicitly expresses the agent's goal by encouraging goal-directed state transitions and punishing unwanted ones. (2) C06 S 25

2
Q

Define action-value function in RL

A

The action-value function q describes the expected cumulative discounted reward obtained when selecting a specific action in a particular state and following a specific policy afterwards. (2) C06 S 37
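
To make "expected cumulative discounted reward" concrete, here is a minimal sketch in Python. It assumes we already have a few reward sequences, each observed after taking the same action in the same state and then following the policy. The function names, the sample rewards, and the discount factor gamma = 0.5 are illustrative assumptions, not part of the original card.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the k-th reward is weighted by gamma**k."""
    g = 0.0
    # Work backwards so each step applies G = r + gamma * G_next.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def estimate_q(reward_sequences, gamma):
    """Monte Carlo estimate of q(s, a): the average discounted return
    over episodes that all started with action a in state s
    (hypothetical sample data, assumed to come from one fixed policy)."""
    returns = [discounted_return(seq, gamma) for seq in reward_sequences]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    episodes = [[1, 1, 1], [0, 0, 4]]  # hypothetical reward sequences
    print(estimate_q(episodes, gamma=0.5))
```

Averaging sampled returns is only one way to approximate the expectation. It is used here because it follows the wording of the definition most directly.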

3
Q

Define approximate RL

A

The agent predicts values (or actions, in the case of policy gradient methods) with the help of function approximators, typically non-linear ones such as neural networks, that generalize across states. (2) C06 S 61
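
The idea of a function approximator that generalizes across states can be sketched with a very small neural network in plain Python. This is a simplified illustration, not a full RL algorithm. It assumes the states are single numbers in [0, 1] and that target values are already given (here the made-up function 1 + s^2). In practice the targets would come from sampled returns or bootstrapped estimates. All names, sizes, and hyperparameters are assumptions.

```python
import math
import random


def init_params(hidden, rng):
    # One hidden tanh layer: v(s) = sum_j w2[j] * tanh(w1[j] * s + b1[j]) + b2
    return {
        "w1": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b2": 0.0,
    }


def forward(params, s):
    """Return the predicted value and the hidden activations for state s."""
    h = [math.tanh(w * s + b) for w, b in zip(params["w1"], params["b1"])]
    value = sum(w * hj for w, hj in zip(params["w2"], h)) + params["b2"]
    return value, h


def sgd_step(params, s, target, lr):
    """One stochastic gradient step on the squared error 0.5 * (v - target)**2."""
    value, h = forward(params, s)
    err = value - target
    for j, hj in enumerate(h):
        # Backpropagate through tanh before w2[j] is changed.
        grad_pre = err * params["w2"][j] * (1.0 - hj * hj)
        params["w2"][j] -= lr * err * hj
        params["w1"][j] -= lr * grad_pre * s
        params["b1"][j] -= lr * grad_pre
    params["b2"] -= lr * err


def mean_loss(params, data):
    return sum(0.5 * (forward(params, s)[0] - t) ** 2 for s, t in data) / len(data)


def train(data, hidden=8, lr=0.02, epochs=2000, seed=0):
    rng = random.Random(seed)
    params = init_params(hidden, rng)
    before = mean_loss(params, data)
    for _ in range(epochs):
        for s, target in data:
            sgd_step(params, s, target, lr)
    return params, before, mean_loss(params, data)


# Hypothetical training targets for the states 0.0, 0.2, ..., 1.0.
DATA = [(s / 5, 1.0 + (s / 5) ** 2) for s in range(6)]

if __name__ == "__main__":
    params, before, after = train(DATA)
    print("mean loss before/after:", before, after)
    # State 0.5 was never trained on; the network still yields a prediction.
    print("predicted value at unseen state 0.5:", forward(params, 0.5)[0])
```

The point of the sketch is the last line: a table of values would have no entry for the unseen state 0.5, whereas the approximator produces an estimate for it from what it learned on neighbouring states.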

4
Q

Define the state-value function in RL

A

The state-value function is the expected return when a specific policy is followed after visiting a particular state.
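
The formula that belongs to this card did not survive in the text. In the notation commonly used in RL textbooks (an assumption about notation, not a quotation from the card), with policy \pi, discount factor \gamma and return G_t, the definition reads:

```latex
v_\pi(s) = \mathbb{E}_\pi\left[\, G_t \mid S_t = s \,\right],
\qquad
G_t = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}
```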

5
Q

Define the action-value function in RL

A

The action-value function q is the expected return when a specific policy is followed after choosing an action in a particular state.
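
Written as a formula in commonly used textbook notation (an assumption about notation, not taken from the card), where G_t denotes the return from time step t onward and \pi the policy being followed:

```latex
q_\pi(s, a) = \mathbb{E}_\pi\left[\, G_t \mid S_t = s,\ A_t = a \,\right]
```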

6
Q

What is the reward hypothesis in RL?

A

That all of what we mean by goals and purposes can be well thought of as the maximization of the expected value of the cumulative sum of a received scalar signal (called reward).
