Reinforcement Learning Flashcards

Question 1

Q

What is the formal definition of an agent?

Answer

A

An entity that has a set of sensors to observe the state of its environment, and a set of actions it can perform to alter the state

Question 2

Q

What is the task of an agent?

Answer

A

To learn a control strategy (policy) for choosing actions to achieve its goals

Question 3

Q

How do we provide reinforcement to an agent?

Answer

A

By rewarding it with a positive score for actions that move it towards the goal, and a negative score for actions that move it away from the goal

Question 4

Q

Why can sparse reward spaces result in a failure of reinforcement learning?

Answer

A

The number of steps needed to reach any reward is too high
Finding a reward by random choice of actions is computationally inefficient

Question 5

Q

What is reward shaping?

Answer

A

Manually designing a reward function.

This guides the policy to the final goal
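As an illustration only, the sketch below contrasts a sparse reward with a hand-shaped one. It assumes a hypothetical 1-D world where positions and the goal are integers; the function names and the 0.1 bonus weight are made up for this example, not taken from any library.

```python
def sparse_reward(position, goal):
    """Reward is given only when the goal itself is reached."""
    return 1.0 if position == goal else 0.0


def shaped_reward(position, next_position, goal):
    """Sparse reward plus a hand-designed bonus for moving closer to the goal.

    The 0.1 weight is an arbitrary choice for this illustration.
    """
    # Positive when the step reduced the distance to the goal, negative otherwise.
    progress = abs(goal - position) - abs(goal - next_position)
    return sparse_reward(next_position, goal) + 0.1 * progress
```

With the shaped version, a step from 2 to 3 towards a goal at 5 already earns a small positive reward, so the agent gets feedback long before it reaches the goal.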

Question 6

Q

What are some drawbacks of reward shaping?

Answer

A

Ad-hoc process depending on environment
New reward function for every problem
Agent may learn to maximise reward without achieving goal

Question 7

Q

What is the law of unintended consequences?

Answer

A

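Actions can have outcomes that were not intended or foreseen. These may be:
Unexpected benefits
Unexpected drawbacks
Perverse results (the opposite of what was intended)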

Question 8

Q

What should an agent learn to do?

Answer

A

It should learn to choose actions that maximise the total reward it expects to gain over time, not just the immediate reward from a single action

Question 9

Q

What is a utility based agent?

Answer

A

It learns a utility function on states and uses it to select actions that maximise the expected outcome utility

Question 10

Q

What do utility based agents need?

Answer

A

A model of the environment

Question 11

Q

What is Q-Learning?

Answer

A

A Q-Learning agent learns an action-utility function (Q-function) giving the expected utility of taking a given action in a given state
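The card does not give an update rule, so the following is a minimal sketch of the standard tabular Q-learning update, under stated assumptions: states and actions are hashable labels, and the names `q_update`, `alpha` (learning rate) and `gamma` (discount factor) are chosen for this example.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Target: observed reward plus discounted value of the next state.
    target = reward + gamma * best_next
    # Move the current estimate part of the way towards the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; unseen entries default to 0.0.
q = defaultdict(float)
actions = ["left", "right"]
q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
```

The update uses only an observed transition (state, action, reward, next state), so no model of the environment appears anywhere in the code.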

Question 12

Q

What don’t Q-Learning agents need?

Answer

A

A model of the environment

Question 13

Q

What is a reflex agent?

Answer

A

An agent that learns a policy which maps directly from states to actions

Question 14

Q

How do utility based agents work out which action is most efficient?

Answer

A

With a utility function:
- Map each state after each action to a number
- This number represents how efficiently each action achieves the goal
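A minimal sketch of this selection step, assuming a deterministic hypothetical 1-D world so that the expected utility reduces to the utility of a single predicted state. The names `model`, `utility` and `choose_action`, and the goal position, are illustrative assumptions.

```python
GOAL = 5  # hypothetical goal position


def model(state, action):
    """Assumed environment model: predicts the state reached after an action."""
    return state + action


def utility(state):
    """Assumed utility function: states closer to the goal score higher."""
    return -abs(GOAL - state)


def choose_action(state, actions):
    """Pick the action whose predicted next state has the highest utility."""
    return max(actions, key=lambda a: utility(model(state, a)))
```

For example, `choose_action(2, [-1, 0, 1])` selects the step towards the goal. The call to `model` is why this kind of agent needs a model of the environment.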

Question 15

Q

How do reflex agents work?

Answer

A

The agent selects actions based only on its current perception of the environment
Past experience is not considered
Only one possibility is acted on

This is called a condition-action rule:
IF battery low THEN charge
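A condition-action rule like this can be sketched as a list of (condition, action) pairs checked in order. The percept keys, the 0.2 battery threshold, the second rule and the default action are hypothetical, added only to make the example runnable.

```python
# Each rule pairs a test on the current percept with the action to take.
RULES = [
    (lambda percept: percept["battery"] < 0.2, "charge"),   # IF battery low THEN charge
    (lambda percept: percept["obstacle_ahead"], "turn"),    # hypothetical extra rule
]


def reflex_agent(percept, default="move_forward"):
    """Return the action of the first rule whose condition matches the percept."""
    for condition, action in RULES:
        if condition(percept):
            return action
    return default
```

The function reads nothing but the current percept, so past experience plays no part, and it returns exactly one action.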

Question 16

Q

What are some key features of Reinforcement learning?

Answer

A

Delayed reward
Exploration vs Exploitation
Partially observable states
Life-long learning

Question 17

Q

What is Delayed Reward?

Answer

A

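Feedback is provided only as rewards while the agent executes its sequence of actions, so the agent must work out which earlier actions deserve credit for a reward that arrives later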

Question 18

Q

What is Exploration vs Exploitation?

Answer

A

A trade-off between exploring the search space to find new sources of reward and exploiting actions already known to give reward
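One common way to handle this trade-off is epsilon-greedy selection; the card does not name a strategy, so this is offered only as an example. The sketch assumes `q_values` is a dict mapping each action to its current estimated value, and `epsilon` is the probability of exploring.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.choice(list(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(q_values, key=q_values.get)
```

Setting `epsilon` to 0 gives pure exploitation and setting it to 1 gives pure exploration; values in between mix the two.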

Question 19

Q

What is a partially observable state?

Answer

A

Sensors may only provide partial information
Actions may aim at improving observability

Question 20

Q

What is life-long learning?

Answer

A

The possibility for an agent to use previously obtained experience or knowledge to guide it when learning new tasks

Question 21

Q

What is the utility of a state?

Answer

A

The expected total reward from that state onwards
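As a rough sketch, this expectation can be estimated by averaging the total reward over several sampled episodes that start in the state. The function names are hypothetical, and the optional discount factor `gamma` is an assumption not mentioned on the card; the default of 1.0 gives the plain total.

```python
def total_reward(rewards, gamma=1.0):
    """Sum a sequence of rewards, discounting later rewards by gamma per step."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def estimate_utility(episodes, gamma=1.0):
    """Average the total reward over sampled episodes starting from one state."""
    returns = [total_reward(rewards, gamma) for rewards in episodes]
    return sum(returns) / len(returns)
```

Here `episodes` is a list of reward sequences, each recorded from the state of interest until the episode ends.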

(21 cards)