C8 Flashcards

1
Q

what is hierarchical reinforcement learning?

A

Reinforcement learning in which the agent works with abstractions whose granularity is coarser than the primitive actions of the environment (for example, taking a train instead of individual steps).

2
Q

advantages of hierarchical methods

A
  1. simplify problems through abstraction. The agent creates subgoals and solves these finer-grained subtasks first. Actions are abstracted into macro actions that solve the subtasks
  2. increased sample efficiency: subpolicies are learned to solve subtasks, reducing the environment interactions. Subtasks can be transferred to other problems
  3. policies become more general, and are able to adapt to changes in the environment more easily
  4. the higher level of abstraction allows agents to solve larger, more complex problems
3
Q

disadvantages of hierarchical methods

A
  1. Many assume that domain knowledge is available to subdivide the environment so that hierarchical RL can be applied
  2. algorithmic complexity: identify subgoals, learn subpolicies etc.
  3. macros are combinations of actions, and the number of such combinations is exponential in their length, so introducing macro actions increases the computational complexity of planning and learning
  4. a behavioral policy that includes macro actions may be of lower quality than a policy consisting only of primitive actions, because macros may skip over shorter routes that the primitive actions would have found
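
The exponential growth in point 3 can be made concrete with a small sketch. It assumes a macro is simply a fixed-length sequence of primitive actions; the four-action set and the helper names `count_macros` and `enumerate_macros` are illustrative and not part of the original cards.

```python
from itertools import product


def count_macros(num_actions: int, length: int) -> int:
    """Number of distinct action sequences of a given length: |A| ** length."""
    return num_actions ** length


def enumerate_macros(actions, length):
    """List every fixed-length macro explicitly (only feasible for small lengths)."""
    return list(product(actions, repeat=length))


# Hypothetical primitive action set for a grid world.
actions = ["up", "down", "left", "right"]

for k in range(1, 6):
    # The count grows by a factor of len(actions) with every extra step.
    print(f"macros of length {k}: {count_macros(len(actions), k)}")
```

Even with four primitive actions, the candidate set grows by a factor of four per additional step, which is why planning over all possible macros quickly becomes expensive.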
4
Q

what is the options framework?

A

Whenever the agent reaches a state that is a subgoal, it can either take a primitive action under the main policy or follow an option policy: a macro action consisting of a different subpolicy aimed specifically at satisfying the subgoal in one large step. In this way macros are incorporated into the reinforcement learning framework.

5
Q

what is an option?

A

a group of actions with a termination condition

6
Q

what is an option?

A

a group of actions with a termination condition. An option takes in environment observations and outputs actions until its termination condition is met.

7
Q

what are the three elements of an option ω?

A
  1. initialization set I_ω: the states that the option can start from
  2. subpolicy π_ω(a|s): internal to this particular option
  3. termination condition β_ω(s): tells us if ω terminates in s
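
The three elements can be sketched as a small data structure. This is a minimal illustration rather than a reference implementation: the `Option` class, the `run_option` helper, the deterministic subpolicy, and the six-state corridor environment are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Option:
    initiation_set: Set[int]             # I_omega: states the option can start from
    policy: Callable[[int], int]         # pi_omega: state -> action (deterministic here)
    termination: Callable[[int], float]  # beta_omega(s): probability of stopping in s

    def can_start(self, state: int) -> bool:
        return state in self.initiation_set


def run_option(option, state, step, rng, max_steps=100):
    """Follow the option's subpolicy until beta_omega says to stop."""
    if not option.can_start(state):
        raise ValueError(f"option cannot start in state {state}")
    trajectory = [state]
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        trajectory.append(state)
        if rng.random() < option.termination(state):
            break
    return trajectory


# Hypothetical corridor with states 0..5; actions are -1 (left) and +1 (right).
def corridor_step(state: int, action: int) -> int:
    return min(max(state + action, 0), 5)


walk_to_end = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: +1,                          # always move right
    termination=lambda s: 1.0 if s == 5 else 0.0, # stop only at the subgoal
)

print(run_option(walk_to_end, 2, corridor_step, random.Random(0)))
```

From the caller's point of view, the whole trajectory from state 2 to the subgoal is one macro step, while the subpolicy handles the primitive actions internally.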
8
Q

what are macros?

A

any group of actions, possibly open-ended

9
Q

what is intrinsic motivation?

A

An inner drive to explore, so named to contrast it with classic extrinsic motivation (the conventional RL reward signal). It is related to reward signals for achieving subgoals.
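
One common way to turn such a drive into a number is a novelty bonus added to the extrinsic reward. The sketch below uses a count-based bonus, which is an assumption chosen for illustration and not something the card specifies; the class name `CountBonus` and the `scale` parameter are hypothetical.

```python
import math
from collections import Counter


class CountBonus:
    """Adds an exploration bonus that shrinks as a state is visited more often."""

    def __init__(self, scale: float = 0.1):
        self.scale = scale
        self.counts = Counter()

    def reward(self, state, extrinsic: float) -> float:
        # Intrinsic part: scale / sqrt(visit count), largest for novel states.
        self.counts[state] += 1
        intrinsic = self.scale / math.sqrt(self.counts[state])
        return extrinsic + intrinsic


bonus = CountBonus(scale=1.0)
first = bonus.reward("room_a", extrinsic=0.0)   # first visit: full bonus
second = bonus.reward("room_a", extrinsic=0.0)  # repeat visit: smaller bonus
print(first, second)
```

Even when the extrinsic reward stays at zero, the agent still receives a learning signal that favours states it has rarely seen.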

10
Q

How do multi agent and hierarchical reinforcement learning fit together?

A

agents often work together in teams or other hierarchical structures

11
Q

what is so special about Montezuma’s Revenge?

A

It is a difficult environment for RL to learn, because it has little reward signal and the reward signal is delayed. It consists of long stretches in which the agent has to walk without the reward changing.
