Reinforcement Learning Flashcards

1
Q

The set of all states that the agent might currently be in

A

Belief states

2
Q

If the environment is deterministic, actions taken by the agent should result in a belief state of __ size compared to the original

A

Smaller or equal

3
Q

If the environment is STOCHASTIC, actions taken by the agent should result in a belief state of ___ size compared to the original

A

GREATER SIZE
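
The two belief-state cards above can be illustrated with a minimal sketch. The corridor world, the `predict` helper, and both transition functions are hypothetical names made up for this example; they are not part of the original deck. The sketch only shows how a belief state can shrink or stay the same size under deterministic actions and can grow under stochastic ones.

```python
def predict(belief, action, results):
    """Return the belief state reached by taking `action` from every state in `belief`.

    `results(state, action)` returns the set of possible successor states
    (exactly one state when the environment is deterministic).
    """
    successors = set()
    for state in belief:
        successors |= results(state, action)
    return successors


# Hypothetical corridor with cells 0..3; "right" moves one cell and stops at the wall.
def deterministic_right(state, action):
    return {min(state + 1, 3)}


# Assumed stochastic variant: the move may fail and leave the agent where it was.
def stochastic_right(state, action):
    return {state, min(state + 1, 3)}


shrunk = predict({2, 3}, "right", deterministic_right)  # both states end up in cell 3
grown = predict({1, 2}, "right", stochastic_right)      # cell 1 may stay or move, and so may cell 2
print(len(shrunk), len(grown))
```

In the deterministic case each state maps to exactly one successor, so the new belief state can never contain more states than the old one. In the stochastic case each state can map to several successors, which is why the belief state may grow.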

4
Q

Repeatedly executing one action from a solution path and then observing the local environment is called

A

Act-observe cycle

5
Q

Can learn to EXPLORE THE TERRITORY, learn WHERE THE REWARDS ARE, and then LEARN THE OPTIMAL POLICY

Uses OBSERVED REWARDS/PUNISHMENTS to learn an OPTIMAL POLICY for an environment

A

Reinforcement learning

6
Q

Type of RL that is given a fixed policy to execute and learns the reward function and state utilities while executing that policy

A

PASSIVE RL

7
Q

Type of RL that changes its policy as it searches for the reward function and optimal control

A

Active RL

8
Q

Update formula for U[S], the utility of state S

A

U[S] + learning rate * (reward[S] + discount factor * U[S'] - U[S])

Learning rate = 1 / (N[S] + 1), where N[S] is the number of times state S has been visited and S' is the next state
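
As a worked illustration of the update above, here is a minimal sketch. The function name `td_update`, the state labels "A" and "B", the reward of -0.04, and the discount factor of 0.9 are all assumptions chosen for the example; the sketch also assumes N[S] counts the visits made before the current update.

```python
from collections import defaultdict


def td_update(U, N, s, reward, s_next, gamma=0.9):
    """Apply one update to U[s] after moving from s to s_next; return the new U[s]."""
    alpha = 1.0 / (N[s] + 1)  # learning rate from the card: 1 / (N[S] + 1)
    U[s] += alpha * (reward + gamma * U[s_next] - U[s])
    N[s] += 1                 # assumption: count the visit after using it
    return U[s]


U = defaultdict(float)  # utility estimates, starting at 0
N = defaultdict(int)    # visit counts, starting at 0
U["B"] = 1.0            # pretend the next state already has utility 1

# First visit to A: alpha = 1, so U[A] = 0 + 1 * (-0.04 + 0.9 * 1.0 - 0) = 0.86
first = td_update(U, N, "A", reward=-0.04, s_next="B")
print(round(first, 2))
```

Because the learning rate shrinks as N[S] grows, later visits move the estimate less than early ones, which lets U[S] settle toward an average over the observed transitions.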

9
Q

Active reinforcement learning algorithm that CHANGES THE CONTROL POLICY AFTER K ITERATIONS of temporal difference learning

A

Greedy reinforcement learning

10
Q

Agent knows nothing except WHAT IS LOCALLY AVAILABLE;

AGENT EXPLORES ITS SURROUNDINGS, CHECKING FOR REWARDS BASED ON GOALS

A

Reinforcement learning

11
Q

Allows us to make sense of previous data

A

Machine learning

12
Q

A passive RL algorithm that moves from one state to another

Takes note of the difference between two successive states and computes the value of each state depending on WHERE IT LEADS

A

Temporal difference learning

13
Q

Agent knows nothing except what is locally available

Agent explores its surroundings, checking for rewards and penalties, then learns what to do

A

Reinforcement learning

14
Q

Agent finds the program / control policy for a given problem using the data supplied to the agent

A

Machine learning
