Introduction to RL. Multiarmed bandits Flashcards by Carlos Sequeira

Reinforcement Learning is both a class of ___ and a class of ___

problems

algorithms

Policy

A mapping from what the agent observes to what the agent chooses to do
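
As a rough illustration, a policy can be written as a lookup table (deterministic) or as a function that samples an action (stochastic). The observation names, action names, and probabilities below are hypothetical and chosen only for this sketch.

```python
import random

# Hypothetical observations and actions, used only for illustration.
deterministic_policy = {
    "light_red": "stop",
    "light_green": "go",
}


def stochastic_policy(observation, rng):
    """Sample an action from a per-observation probability table."""
    probabilities = {
        "light_red": {"stop": 0.9, "go": 0.1},
        "light_green": {"stop": 0.2, "go": 0.8},
    }[observation]
    actions = list(probabilities)
    weights = [probabilities[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(0)
print(deterministic_policy["light_red"])       # always "stop"
print(stochastic_policy("light_green", rng))   # "go" most of the time
```

In both cases the policy takes what the agent sees as input and returns what it does.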

Rewards

An (immediate) numerical signal that tells the agent which actions are good or bad
The agent's goal is to collect as much reward as possible

Value Functions

Long-term functions of reward

We need to look at how well the agent does over the long term, not just at the immediate reward
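
One common way to turn a sequence of rewards into a single long-term number is the discounted return. This is a minimal sketch: the discount factor `gamma` and the reward sequences are assumptions made for the example, not something the card specifies.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum rewards, weighting the reward k steps ahead by gamma**k."""
    total = 0.0
    # Work backwards so each step is: reward now + gamma * (return afterwards).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical reward sequences: one pays early, the other pays late.
early = [5.0, 0.0, 0.0]
late = [0.0, 0.0, 10.0]
print(discounted_return(early))
print(discounted_return(late))
```

With these numbers the second sequence has the larger return even though its immediate reward is zero, which is the kind of comparison a value function is meant to capture.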

Models

(of the problem/environment)

State

Represents the information relevant to solving the task

Actions

what the agent can do

Goal

Drives the behaviour of the agent (expressed through rewards)

Dynamics

Describes how the agent's actions influence the environment
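
A minimal sketch of dynamics, assuming a hypothetical one-dimensional corridor with positions 0 to 4 and two actions. The function maps the current state and an action to the next state. The names `step`, `left`, `right`, and `size` are made up for this example.

```python
def step(state, action, size=5):
    """Return the next position in a corridor with cells 0 .. size - 1."""
    move = {"left": -1, "right": 1}[action]
    # The walls at both ends keep the agent inside the corridor.
    return min(max(state + move, 0), size - 1)


state = 2
for action in ["right", "right", "right", "left"]:
    state = step(state, action)
    print(action, "->", state)
```

Here the dynamics are deterministic. In general the next state may be random, and the agent usually has to learn this relationship from experience.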

The agent does not know the ___ and the ___

Goal

Dynamics

The agent should interact with (explore / exploit) the environment to figure out what the goal and the dynamics are

…
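
In the multi-armed bandit setting, one simple way to balance exploring and exploiting is an epsilon-greedy rule: with a small probability pick a random arm, otherwise pick the arm with the best estimated reward so far. The sketch below is only illustrative. The arm means, the Gaussian reward noise, `epsilon`, and the function name `run_epsilon_greedy` are all assumptions.

```python
import random


def run_epsilon_greedy(true_means, steps=2000, epsilon=0.1, seed=0):
    """Play a Gaussian bandit with an epsilon-greedy agent."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random arm
        else:
            # exploit: pick the arm that currently looks best
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # The agent only sees this noisy reward, never true_means itself.
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample average for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward


estimates, counts, total_reward = run_epsilon_greedy([0.1, 0.5, 0.9])
print("estimated values:", [round(e, 2) for e in estimates])
print("pull counts:", counts)
```

The agent is never told which arm is best. It has to discover that from the rewards it receives while interacting with the environment.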
