Reinforcement Learning Flashcards
Base components of every RL framework
Action, environment (state), reward
What is a discount rate (discount factor)?
The discount factor determines how much the reinforcement learning agent cares about rewards in the distant future relative to those in the immediate future.
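A minimal sketch of how a discount factor weights future rewards, assuming the common convention that the reward at time step t is multiplied by gamma to the power t. The function name discounted_return and the sample reward list are illustrative, not from any particular library.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, each weighted by gamma**t for its time step t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Hypothetical episode: a reward of 1.0 at each of three steps.
rewards = [1.0, 1.0, 1.0]

print(discounted_return(rewards, 0.0))  # 1.0  (only the immediate reward counts)
print(discounted_return(rewards, 0.5))  # 1.75 (later rewards count for less)
print(discounted_return(rewards, 1.0))  # 3.0  (all rewards count equally)
```

A gamma near 0 makes the agent short-sighted; a gamma near 1 makes it weigh distant rewards almost as much as immediate ones.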
What is a policy?
A function mapping states to actions, often denoted by π
What is a greedy policy?
A policy where the agent always chooses the action with the highest expected return
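A greedy choice can be sketched as an argmax over estimated returns. This assumes the agent already holds an estimate of the expected return for each action in the current state; the dictionary q and the action names are hypothetical.

```python
def greedy_action(q_values):
    """Return the action whose estimated return is highest."""
    return max(q_values, key=q_values.get)


# Hypothetical return estimates for three actions in one state.
q = {"left": 0.2, "right": 0.7, "stay": 0.5}

print(greedy_action(q))  # right
```

A purely greedy agent never explores, so it can only be as good as its current estimates.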
What is a discrete space?
A discrete space has a finite number of states and a finite number of actions
What is a continuous space?
A continuous space takes values from a range of real numbers rather than a finite set; most actions in physical space are continuous by nature
When do you want to use tabular methods for MDPs, like Monte Carlo/TD learning?
For finite spaces where the number of states and actions is limited
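One way to see why these methods suit finite spaces: they keep one stored value per state, which is only possible when the states can be enumerated. The sketch below shows a single TD(0) value update under that assumption; the state names, the step size alpha, and the discount gamma are illustrative choices.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Move values[state] a step of size alpha toward the TD target."""
    target = reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values[state]


# A value table with one entry per state; feasible only for finite spaces.
values = {"A": 0.0, "B": 1.0}

# Hypothetical transition: from A to B with reward 0.0.
td0_update(values, "A", reward=0.0, next_state="B")
print(values["A"])  # approximately 0.09
```

With a continuous state there is no finite table to index, which is where function approximation comes in.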
When do you want to use Deep reinforcement learning?
For continuous-space tasks