RL: Chapter 1: Introduction Flashcards

Question 1

Q

Reinforcement learning

Answer

A

Learning what to do - how to map situations to actions - so as to maximise a numerical reward signal.

The learner is not told which actions to take, but instead must discover which actions yield the most reward by trying them.

Question 2

Q

Main challenge in reinforcement learning that does not arise in other kinds of learning

Answer

A

Exploration vs exploitation.

The agent has to exploit what it has already experienced in order to obtain reward.

But it also has to explore in order to make better action selections in the future.
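
The trade-off can be sketched with an epsilon-greedy rule, one common way to balance the two. This is a minimal sketch, not something the card itself prescribes: the function name `epsilon_greedy`, the list `value_estimates` (the agent's current reward estimate per action) and the parameter `epsilon` are assumptions made for illustration.

```python
import random


def epsilon_greedy(value_estimates, epsilon, rng=random):
    """Return an action index, exploring with probability epsilon.

    value_estimates: hypothetical list with the agent's current
    estimate of the reward for each action.
    """
    if rng.random() < epsilon:
        # Explore: try any action, including ones that look worse so far.
        return rng.randrange(len(value_estimates))
    # Exploit: take the action with the highest estimate so far.
    return max(range(len(value_estimates)), key=lambda a: value_estimates[a])
```

With `epsilon = 0` the agent only exploits and may never find a better action; with `epsilon = 1` it only explores and never uses what it has learned.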

Question 3

Q

6 Main subelements of a reinforcement learning system

Answer

A

Agent
Environment

Beyond the agent and the environment, four main subelements:

Policy
Reward signal
Value function
A model of the environment (optional)

Question 4

Q

6 Main subelements of a reinforcement learning system

Policy

Answer

A

Defines the learning agent’s way of behaving at a given time.

Roughly, a policy is a mapping from perceived states of the environment to actions to be taken when in those states.
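
In the simplest case the mapping can be written as a lookup table. A minimal sketch, assuming a made-up problem: the state names, action names and the helper `act` are hypothetical and chosen only for illustration.

```python
# Hypothetical states and actions for a tiny illustrative problem.
policy = {
    "low_battery": "recharge",
    "high_battery": "search",
}


def act(policy, state):
    """Return the action the policy prescribes for the perceived state."""
    return policy[state]
```

For example, `act(policy, "low_battery")` returns `"recharge"`. A policy does not have to be a table; it can also be a function or a search process, and it may be stochastic.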

Question 5

Q

6 Main subelements of a reinforcement learning system

Reward signal

Answer

A

Defines the goal of a reinforcement learning problem.

On each time step, the environment sends to the reinforcement learning agent a single number called the reward. The agent’s sole objective is to maximize the total reward it receives over the long run.

Question 6

Q

6 Main subelements of a reinforcement learning system

Value function

Answer

A

Whereas the reward signal indicates what is good in an immediate sense, a value function specifies what is good in the long run.

The value of a state is the total amount of reward an agent can expect to accumulate over the future, starting from that state.
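
The difference between reward and value can be sketched by summing the rewards that follow each step. This is a minimal sketch under stated assumptions: episodes are finite, future rewards are not discounted, and the function names and reward lists are hypothetical.

```python
def returns_from_each_step(rewards):
    """Total future reward from each step of one episode.

    rewards[t] is the reward received after step t, so the result at
    index t is rewards[t] + rewards[t + 1] + ... (undiscounted).
    """
    totals = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        totals[t] = running
    return totals


def estimate_value(episodes):
    """Average total reward over episodes that start in the same state.

    episodes: hypothetical list of reward lists, one per sampled episode.
    """
    totals = [sum(rewards) for rewards in episodes]
    return sum(totals) / len(totals)
```

With rewards `[0, 0, 1]` the first step earns no immediate reward, yet its total future reward is 1, the same as the final step's. A state with low reward can therefore still have high value if it tends to be followed by rewarding states.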

(6 cards)