Class 6 - Guest lecture - Deep Reinforcement learning Flashcards
Difference between reinforcement learning and deep learning
Deep learning learns from a training set and then applies what it learned to new data, while reinforcement learning learns dynamically, adjusting its actions based on continuous feedback to maximize a reward.
Traditional methods for robotics
work well for demos and narrow applications, but they don’t generalize well and require expensive and tedious adaptation to any new task or environment (e.g., Boston Dynamics).
Deep Learning for Robotics
has proven effective at achieving (super)human-level performance on many tasks:
- object detection and recognition of faces
- speech recognition
- dexterous manipulation
- still in the very early stages
Options for applying DL to robotics:
- “easy” fix (what is it about)?
- “harder” fix (what is it about)?
- easy fix: replace some components with neural networks. However, we still have to engineer the entire system and design (and train) the different components separately. Issues that may arise: mistakes in the pipeline and movements that are not great in general.
- hard fix: end-to-end learning, a technique in which the model learns every step between the initial input and the final output. It takes the input and returns a distribution over actions.
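A minimal sketch of what "takes the input and returns a distribution over actions" can look like. The action names, the observation values, and the single linear layer standing in for a deep network are all assumptions made for illustration, not the lecture's model.

```python
import math
import random

def softmax(logits):
    """Turn arbitrary scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy(observation, weights):
    """Map a raw observation directly to action probabilities.

    One linear layer is used here as a stand-in for a deep network.
    """
    logits = [sum(w * x for w, x in zip(row, observation)) for row in weights]
    return softmax(logits)

rng = random.Random(0)
ACTIONS = ["left", "stay", "right"]      # hypothetical action set
observation = [0.2, -1.0, 0.5, 0.1]      # hypothetical sensor reading
weights = [[rng.uniform(-1, 1) for _ in observation] for _ in ACTIONS]

probs = policy(observation, weights)
# The agent samples its action from the distribution instead of
# always taking the most likely one.
action = rng.choices(ACTIONS, weights=probs, k=1)[0]
```

In an end-to-end setup the weights would be learned from data rather than drawn at random as they are here.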
The reinforcement approach to learning a solution…(pick one):
A. uses simulations to train the agent
B. places the agent in an environment and lets it explore this environment by performing actions, each of which produces a new state and a reward for the agent.
B
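The loop described in option B can be sketched as follows. The `Corridor` environment, its reward values, and the purely random action choice are hypothetical and only illustrate the action, new state, reward cycle.

```python
import random

class Corridor:
    """Hypothetical toy environment: positions 0..4, goal at position 4."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        # action is -1 (left) or +1 (right); walls clamp the position
        self.position = min(4, max(0, self.position + action))
        done = self.position == 4
        reward = 1.0 if done else -0.1  # small penalty per step, bonus at goal
        return self.position, reward, done

def run_episode(env, rng, max_steps=200):
    """Let the agent explore and record (state, action, reward, next_state)."""
    state = env.position
    total_reward = 0.0
    transitions = []
    for _ in range(max_steps):
        action = rng.choice([-1, 1])  # pure random exploration, no learning yet
        next_state, reward, done = env.step(action)
        transitions.append((state, action, reward, next_state))
        total_reward += reward
        state = next_state
        if done:
            break
    return transitions, total_reward

transitions, total_reward = run_episode(Corridor(), random.Random(0))
```

A learning agent would use the recorded transitions to prefer actions that led to higher reward.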
A solution to the fact that reinforcement learning generally requires a lot of time and a lot of repetitions is…
to use simulations to train thousands of agents in parallel
Although deep learning has proven to be more robust against perturbations during training and testing, it still has one main issue, namely…
even the best simulations are too different from reality
reality gap (in the context of deep learning)
you might lose a lot when moving from simulation to reality
Fill in:
In the context of deep learning simulations, a small error that compounds at each time step might result in very - similar / different - trajectories between simulation and the real world.
different
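A toy illustration of the compounding effect, assuming a made-up one-dimensional system whose state is multiplied by a constant at every step. The two growth factors are arbitrary. The only point is that a 0.01 mismatch per step turns into a large gap.

```python
def rollout(growth, steps, x0=1.0):
    """Return the trajectory of x[t+1] = growth * x[t]."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] * growth)
    return xs

sim = rollout(1.10, 50)   # hypothetical simulator dynamics
real = rollout(1.11, 50)  # hypothetical real dynamics, slightly different

# Gap between the simulated and the real trajectory at each time step.
gaps = [abs(r - s) for r, s in zip(real, sim)]
print(f"gap after 1 step: {gaps[1]:.4f}, gap after 50 steps: {gaps[-1]:.1f}")
```

The gap is tiny after one step and keeps growing at every later step, so the two trajectories end up far apart.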
One approach that tries to solve the reality gap issue is the…
Sim2Real approach
The Sim2Real approach uses dynamics randomization to…
train robots in simulation using a wide range of physics (e.g., amount of gravity, size of each component of the robot, friction, visual appearance and lighting, etc.) to force the robot to work over many different environments, with the hope that the real world ends up included.
Disadvantage: very computationally expensive
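Dynamics randomization can be sketched as sampling a new set of physics parameters before every training episode. The parameter names and ranges below are hypothetical, and `run_episode` is a placeholder for the actual simulate-and-update step, which depends on the simulator and the learning algorithm.

```python
import random

# Hypothetical parameter ranges; real ranges depend on the robot and simulator.
PARAM_RANGES = {
    "gravity": (8.8, 10.8),
    "friction": (0.5, 1.5),
    "link_length_scale": (0.9, 1.1),
    "light_intensity": (0.3, 1.0),
}

def sample_physics(rng):
    """Draw one random physics configuration."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

def train_with_randomization(num_episodes, rng, run_episode):
    """Run every episode under freshly sampled physics."""
    sampled = []
    for _ in range(num_episodes):
        physics = sample_physics(rng)
        run_episode(physics)  # placeholder: simulate and update the policy
        sampled.append(physics)
    return sampled

history = train_with_randomization(
    1000, random.Random(0), run_episode=lambda physics: None
)

# The hope is that real-world values fall inside what was seen in training.
real_gravity = 9.81
gravities = [h["gravity"] for h in history]
covered = min(gravities) <= real_gravity <= max(gravities)
```

Every extra randomized parameter widens the space the policy has to cover, which is one reason the approach is computationally expensive.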
One big problem in reinforcement learning is that (multiple picks are possible):
A. we have to design the reward function by hand
B. the reward function is always the same
C. the reward function we choose may not result in the behavior we want
A, C
The idea behind imitation learning is to…
collect demonstrations from humans solving the target task (in the demonstration phase), and use them to train an agent (training phase + test phase).
3 main approaches to imitation learning in the context of deep learning
- behavior cloning
- inverse reinforcement learning
- sequence modelling
In the context of imitation learning in deep learning, behavior cloning…
- treats the problem as supervised learning.
- collects (state, action) pairs from many demonstration episodes.
- trains a neural network to produce the same actions on the same states.
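The three steps above can be sketched as follows. The scripted `expert_action` stands in for a human demonstrator, and a single logistic unit stands in for the neural network. Both are simplifying assumptions, as are the state layout and the hyperparameters.

```python
import math
import random

def expert_action(state):
    """Hypothetical demonstrator: move right (1) if left of the target, else left (0)."""
    position, target = state
    return 1 if position < target else 0

def collect_demonstrations(num_pairs, rng):
    """Demonstration phase: record (state, action) pairs from the expert."""
    demos = []
    for _ in range(num_pairs):
        state = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        demos.append((state, expert_action(state)))
    return demos

def sigmoid(z):
    # Numerically stable for large positive and negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_prob(weights, bias, state):
    """Probability that the cloned policy picks action 1."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, state)))

def train(demos, epochs=50, lr=0.5):
    """Training phase: plain supervised learning with SGD on the log loss."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for state, action in demos:
            error = predict_prob(weights, bias, state) - action
            weights = [w - lr * error * x for w, x in zip(weights, state)]
            bias -= lr * error
    return weights, bias

rng = random.Random(0)
demos = collect_demonstrations(500, rng)
weights, bias = train(demos)

# Test phase: how often does the clone agree with the expert on new states?
test_states = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
accuracy = sum(
    (predict_prob(weights, bias, s) > 0.5) == bool(expert_action(s))
    for s in test_states
) / len(test_states)
```

No reward function appears anywhere in this sketch: the agent only learns to match the demonstrated actions.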