Reinforcement Learning Flashcards

1
Q

What is the formal definition of an agent?

A

An entity that has a set of sensors to observe the state of its environment, and a set of actions it can perform to alter the state

2
Q

What is the task of an agent?

A

To learn a control strategy (policy) for choosing actions to achieve its goals

3
Q

How do we provide reinforcement to an agent?

A

By rewarding it with a positive score for actions that move it towards the goal, and a negative score for actions that move it away from the goal

4
Q

Why can sparse reward spaces result in a failure of reinforcement learning?

A
  • The number of steps needed to reach a reward is too high
  • Random exploration is computationally inefficient
5
Q

What is reward shaping?

A

Manually designing a reward function.

This guides the policy to the final goal
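
As an illustration, here is a minimal sketch of a hand-shaped reward for a hypothetical grid world. The grid layout, the goal cell, the names `manhattan` and `shaped_reward`, and the 0.1 scale of the progress bonus are all assumptions made for this example.

```python
def manhattan(a, b):
    """Grid distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def shaped_reward(state, next_state, goal):
    """Sparse goal reward plus a small hand-designed bonus for progress."""
    if next_state == goal:
        return 1.0  # the original sparse reward for reaching the goal
    # Shaping term: positive when the step moves closer to the goal,
    # negative when it moves away (hypothetical scale of 0.1).
    return 0.1 * (manhattan(state, goal) - manhattan(next_state, goal))

goal = (0, 3)
closer = shaped_reward((0, 0), (0, 1), goal)   # step towards the goal
farther = shaped_reward((0, 1), (0, 0), goal)  # step away from the goal
```

The agent now receives feedback on every step instead of only at the goal, but the bonus has to be redesigned for each new environment.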

6
Q

What are some drawbacks of reward shaping?

A
  • Ad-hoc process depending on environment
  • New reward function for every problem
  • Agent may learn to maximise reward without achieving goal
7
Q

What is the law of unintended consequences?

A
  • Actions can have outcomes that were not intended or foreseen:
  • Unexpected benefits
  • Unexpected drawbacks
  • Perverse results
8
Q

What should an agent learn to do?

A

It should learn to choose actions that maximise the total reward it expects to gain over time, not just the immediate reward from each action

9
Q

What is a utility based agent?

A

It learns a utility function on states and uses it to select actions that maximise the expected outcome utility
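
A minimal sketch of this selection rule, assuming the utilities have already been learned and that a transition model is available. The state names, the numbers, and the `utility` and `model` tables are hypothetical.

```python
# Hypothetical learned utilities U(s) for three states.
utility = {"A": 0.0, "B": 1.0, "C": 5.0}

# Hypothetical transition model: model[state][action] -> {next_state: probability}.
model = {
    "A": {
        "left":  {"A": 0.9, "B": 0.1},
        "right": {"B": 0.5, "C": 0.5},
    }
}

def expected_utility(state, action):
    """Probability-weighted utility of the states the action can lead to."""
    return sum(p * utility[s2] for s2, p in model[state][action].items())

def choose_action(state):
    """Pick the action with the highest expected outcome utility."""
    return max(model[state], key=lambda a: expected_utility(state, a))

best = choose_action("A")
```

Without `model`, the agent could not predict which states an action leads to, so it could not compare actions by their outcome utility.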

10
Q

What do utility based agents need?

A

A model of the environment

11
Q

What is Q-Learning?

A

An agent that learns an action-utility function (Q-function) giving the expected utility of taking a given action in a given state
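
A minimal sketch of the standard tabular Q-learning update, which the cards do not spell out. The learning rate `ALPHA`, the discount factor `GAMMA`, the action set, and the state names are assumed values chosen for illustration.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)
ACTIONS = ["left", "right"]  # hypothetical action set

# Q[(state, action)] -> estimated expected utility, defaulting to 0.
Q = defaultdict(float)

def q_update(state, action, reward, next_state):
    """One tabular Q-learning step from an observed transition."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])

# The update uses only an observed (state, action, reward, next_state) tuple.
q_update("s0", "right", 1.0, "s1")
```

Note that the update never consults transition probabilities, only transitions the agent has actually experienced.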

12
Q

What don’t Q-Learning agents need?

A

A model of the environment

13
Q

What is a reflex agent?

A

An agent that learns a policy which maps directly from states to actions

14
Q

How do utility based agents work out which action is most efficient?

A

With a utility function:
  • Map each state after each action to a number
  • This number represents how efficiently each action achieves the goal

15
Q

How do reflex agents work?

A
  • Selects actions based on its current perception of the environment
  • Past experience not considered
  • Only one possibility is acted on

This is called a condition-action rule:
IF battery low THEN charge
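
The rule above could be sketched in code as follows. The percept keys and the extra `clean` and `move` rules are hypothetical additions for illustration.

```python
def reflex_agent(percept):
    """Pick an action from the current percept only; no memory is kept."""
    # Condition-action rules are checked in order; the first match fires.
    if percept["battery_low"]:
        return "charge"
    if percept["dirt_detected"]:
        return "clean"
    return "move"

action = reflex_agent({"battery_low": True, "dirt_detected": True})
```

Even though two conditions hold in this percept, only the first matching rule is acted on.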

16
Q

What are some key features of Reinforcement learning?

A

Delayed reward
Exploration vs Exploitation
Partially observable states
Life-long learning

17
Q

What is Delayed Reward?

A

Feedback is only provided as rewards while the agent executes its sequence of actions, so the payoff for an action may arrive many steps later

18
Q

What is Exploration vs Exploitation?

A

A trade-off between exploring the search space and exploiting known actions that already yield reward
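
One common way to balance the two is an epsilon-greedy rule, sketched below. The technique is not named in the cards, and the function name, the `epsilon` parameter, and the reward estimates are assumptions for this example.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))  # explore: pick any action
    return max(q_values, key=q_values.get)  # exploit: highest estimate

q_values = {"left": 0.2, "right": 0.8}  # hypothetical reward estimates
greedy = epsilon_greedy(q_values, epsilon=0.0)
```

With `epsilon=0.0` the agent always exploits; raising `epsilon` makes it try other actions more often.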

19
Q

What is a partially observable state?

A

Sensors may only provide partial information
Actions may aim at improving observability

20
Q

What is life-long learning?

A

The possibility for an agent to use previous experience to guide it

21
Q

What is the utility of a state?

A

The expected total reward from that state onwards
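
A minimal sketch of estimating this quantity by averaging the total reward over sampled reward sequences that start from the same state. The function names and the sample data are hypothetical, and the optional discount factor `gamma` is an assumption that goes beyond the card (with `gamma=1.0` the sum is the plain total).

```python
def total_reward(rewards, gamma=1.0):
    """Sum of rewards from a state onwards; gamma < 1 discounts later rewards."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def estimate_utility(episodes, gamma=1.0):
    """Average the total reward over sampled reward sequences from one state."""
    return sum(total_reward(ep, gamma) for ep in episodes) / len(episodes)

# Two hypothetical reward sequences observed after visiting the same state.
episodes = [[0, 0, 1], [0, 1]]
u = estimate_utility(episodes)
```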