Reinforcement Learning Flashcards

Question 1

Q

Set of all states that the agent may currently be in

Answer

A

Belief state

Question 2

Q

If the environment is deterministic, actions taken by the agent should result in a belief state with __ size compared to the original

Answer

A

Smaller or equal

Question 3

Q

If the environment is STOCHASTIC, actions taken by the agent should result in a belief state with ___ size compared to the original

Answer

A

GREATER SIZE
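
The two cards above can be illustrated with a minimal sketch of a belief-state prediction step. The corridor world, the `predict` helper, and both transition functions are hypothetical names made up for this example; they are not part of the deck.

```python
def predict(belief, action, transition):
    """Return the belief state reached by taking `action` from every state in `belief`.

    `transition(state, action)` returns the set of possible successor states.
    """
    successors = set()
    for state in belief:
        successors |= transition(state, action)
    return successors


# Hypothetical 1-D corridor with cells 0..3; a wall blocks movement past cell 3.
def deterministic_right(state, action):
    # Exactly one successor per state.
    return {min(state + 1, 3)}


def stochastic_right(state, action):
    # Assumption for this sketch: the move may fail and leave the agent in place.
    return {state, min(state + 1, 3)}


belief = {1, 2}
after_det = predict(belief, "right", deterministic_right)   # {2, 3}: same size
after_sto = predict(belief, "right", stochastic_right)      # {1, 2, 3}: larger
shrunk = predict({2, 3}, "right", deterministic_right)      # {3}: smaller

print(len(belief), len(after_det), len(after_sto), len(shrunk))  # 2 2 3 1
```

In the deterministic case each state maps to a single successor, so the belief state can only stay the same size or shrink when two states merge. In the stochastic case each state can map to several successors, so the belief state can grow.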

Question 4

Q

Repeatedly executing one action from a solution path and then observing the local environment is called

Answer

A

Act-observe cycle

Question 5

Q

Can learn to EXPLORE THE TERRITORY, learn WHERE THE REWARDS ARE, and then LEARN THE OPTIMAL POLICY

Uses OBSERVED REWARDS/PUNISHMENTS to learn an OPTIMAL POLICY for an environment

Answer

A

Reinforcement learning

Question 6

Q

Type of RL that has a fixed policy to execute and learns the reward function and state utilities while executing the fixed policy

Answer

A

PASSIVE RL

Question 7

Q

Type of RL that changes its policy as it looks for the reward function and optimal control

Answer

A

Active RL

Question 8

Q

Update formula for U[S]

Answer

A

U[S] ← U[S] + learning rate * (rewards[S] + discount factor * U[S'] - U[S])

Learning rate = 1 / (N[S] + 1)
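
As a rough sketch, the update above could be coded as follows. The function name `td_update`, the state labels, and the reward value are illustrative assumptions. The sketch also assumes that the visit count N[S] is incremented after the update, so the first visit uses a learning rate of 1; the card does not specify this ordering.

```python
from collections import defaultdict


def td_update(U, N, state, reward, next_state, gamma=1.0):
    """Apply one temporal difference update to U[state] in place."""
    alpha = 1.0 / (N[state] + 1)   # learning rate = 1 / (N[S] + 1)
    U[state] += alpha * (reward + gamma * U[next_state] - U[state])
    N[state] += 1                  # assumption: count is incremented after the update
    return U[state]


U = defaultdict(float)   # utility estimates, unseen states default to 0.0
N = defaultdict(int)     # visit counts, unseen states default to 0

U["goal"] = 1.0          # hypothetical terminal state with known utility
td_update(U, N, "s1", reward=-0.04, next_state="goal")
print(round(U["s1"], 2))  # 0.96
```

With N["s1"] = 0 the learning rate is 1, so U["s1"] jumps straight to -0.04 + 1.0 = 0.96. Later visits use smaller learning rates, so each new sample moves the estimate less.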

Question 9

Q

Active reinforcement learning algorithm that CHANGES THE CONTROL POLICY AFTER K ITERATIONS of temporal difference learning

Answer

A

Greedy reinforcement learning
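
The policy-change step of this card might look like the sketch below. It covers only the step that happens after K iterations of temporal difference learning, and it assumes a known, deterministic `successor` function, which the card does not state. The corridor world and the utility values are made up for illustration.

```python
def greedy_policy(U, states, actions, successor):
    """For each state, pick the action whose successor has the highest utility."""
    return {s: max(actions, key=lambda a: U[successor(s, a)]) for s in states}


# Hypothetical corridor: cells 0..3, actions move left or right, walls block.
def successor(state, action):
    step = 1 if action == "right" else -1
    return min(max(state + step, 0), 3)


# Made-up utilities, standing in for estimates learned over K iterations.
U = {0: 0.1, 1: 0.3, 2: 0.6, 3: 1.0}

policy = greedy_policy(U, states=[0, 1, 2], actions=["left", "right"],
                       successor=successor)
print(policy)  # {0: 'right', 1: 'right', 2: 'right'}
```

A full agent would alternate between running temporal difference learning under the current policy and calling a step like `greedy_policy` to replace that policy.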

Question 10

Q

Agent knows nothing except WHAT IS LOCALLY AVAILABLE;

AGENT EXPLORES SURROUNDINGS CHECKING FOR REWARDS BASED ON GOALS

Answer

A

Reinforcement learning

Question 11

Q

Allows us to make sense of previous data

Answer

A

Machine learning

Question 12

Q

A passive RL algorithm that moves from one state to another

Takes note of the difference between the two states and computes the value of each state depending on WHERE THEY LEAD

Answer

A

Temporal difference learning

Question 13

Q

Agent knows nothing except what is locally available

Agent explores its surroundings, checking for rewards and penalties, then learns what to do

Answer

A

Reinforcement learning

Question 14

Q

Agent finds the program / control policy for a given problem using the data supplied to the agent

Answer

A

Machine learning

(14 cards)