Lecture 2 - Machine Learning Flashcards
Learning
Acquiring new knowledge or skills and improving one’s performance
A robot can learn about .. (2)
- Itself: sensors or actuator info that might vary over time
- Its environment: learning maps, how to achieve a goal
(3) benefits of learning
- Enabling the robot to perform its task better
- Adapting to changes in environment (hard to preprogram)
- Simplifying the designer’s programming work
3 Forms of Learning
1. Supervised learning
With external supervisor/teacher
Input-output pairs are presented, and the function that maps each input to its output is learned
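A minimal sketch of learning a function from input-output pairs, assuming the simplest possible case: a straight line fitted by least squares. The function name `fit_line` and the training pairs are made up for illustration; they are not from the lecture.

```python
def fit_line(pairs):
    """Least-squares fit of y = a*x + b to a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Slope: covariance of x and y divided by variance of x.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


# Hypothetical training set: the "teacher" supplies the target for each input.
training_pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
a, b = fit_line(training_pairs)

# The learned function can now be applied to an input that was never shown.
prediction = a * 4.0 + b
```

The pairs here lie exactly on y = 2x + 1, so the fit recovers that line; with noisy targets the same code returns the best-fitting line instead.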
3 Forms of Learning
2. Unsupervised learning
All information should be taken from the inputs alone
-> it can be useful to preprocess the inputs, e.g. divide them into meaningful clusters
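As an illustration of dividing inputs into clusters without any teacher, here is a small k-means sketch on one-dimensional values. k-means is one common clustering method, chosen here as an assumption; the lecture does not name a specific algorithm, and `kmeans_1d` and the sensor readings are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the nearest of the given starting centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical sensor readings that fall into two obvious groups.
readings = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers, clusters = kmeans_1d(readings, centers=[0.0, 5.0])
```

No target outputs appear anywhere: the grouping comes from the inputs alone.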
3 Forms of Learning
3. Reinforcement learning
With an evaluation signal.
What is feedback in supervised learning?
The target action or output for a specific input.
Example of supervised learning + name
Neural network learning
-> the weights of the connections between nodes are learned (connectionist learning)
How did ALVINN learn to drive?
1. ALVINN steers the way it thinks it should
2. A human shows how it should steer
3. ALVINN computes the error between the two
4. ALVINN uses this error to update its weights
-> REPEAT
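The loop above (guess, compare with the teacher, compute the error, update the weights, repeat) can be sketched with a single linear unit trained by the delta rule. This is only an assumed toy stand-in: ALVINN itself was a multi-layer neural network driven by camera images, and the names `train_linear_unit` and `demonstrations` are hypothetical.

```python
def train_linear_unit(examples, learning_rate=0.1, epochs=200):
    """Error-driven learning of output = weight * x + bias."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):                      # -> REPEAT
        for x, target in examples:
            output = weight * x + bias           # learner's own steering guess
            error = target - output              # difference from the teacher
            weight += learning_rate * error * x  # update weights using the error
            bias += learning_rate * error
    return weight, bias


# Hypothetical demonstrations: (road offset, steering command shown by a human).
demonstrations = [(-1.0, -0.5), (0.0, 0.0), (1.0, 0.5)]
weight, bias = train_linear_unit(demonstrations)
```

After training, the unit reproduces the demonstrated mapping (steering is about half the offset) without that rule ever being programmed explicitly.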
Disadvantages of supervised learning (3)
- A trainer is needed (less autonomous)
- Not online (first training then operating phase)
- Not incremental (when it is operating, it is not learning)
4 Characteristics of Reinforcement Learning
- Learning from interaction
- Goal-oriented learning
- Learning about, from, and while interacting with an external environment.
- Learning what to do (how to map situations to actions) so as to maximize a numerical reward signal
5 Key features of Reinforcement Learning
- Learner is not told which actions to take
- Trial and Error search
- Possibility of delayed reward (sacrificing short-term gains for greater long-term gains)
- The need to explore and exploit
- Considers the whole problem of a goal-directed agent interacting with an uncertain environment
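Several of these features (no teacher naming the right action, trial-and-error search, the need to explore and exploit) show up in a two-armed bandit with an epsilon-greedy learner. This is a sketch under assumptions: epsilon-greedy is one simple way to balance exploration and exploitation, and the function name, reward probabilities, and parameter values are all made up for illustration.

```python
import random


def epsilon_greedy_bandit(success_probs, steps=2000, epsilon=0.1, seed=0):
    """Learn by trial and error which arm pays off most often."""
    rng = random.Random(seed)
    estimates = [0.0] * len(success_probs)  # learned estimate of each arm's reward
    counts = [0] * len(success_probs)       # how often each arm was tried
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm to improve the estimates.
            action = rng.randrange(len(success_probs))
        else:
            # Exploit: pick the arm that currently looks best.
            action = max(range(len(estimates)), key=lambda a: estimates[a])
        # The environment returns only a numerical reward, never the "right" action.
        reward = 1.0 if rng.random() < success_probs[action] else 0.0
        counts[action] += 1
        # Incremental running average of the rewards seen for this arm.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, counts, total_reward


# Hypothetical arms: arm 1 pays off far more often than arm 0.
estimates, counts, total_reward = epsilon_greedy_bandit([0.2, 0.8])
best_arm = estimates.index(max(estimates))
```

With `epsilon=0` the learner could lock onto the first arm it tries and never discover the better one; a small amount of exploration avoids that at the cost of some reward.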
4 Characteristics of a complete agent
- It is temporally situated
- It learns and plans continually
- The object is to affect the environment
- The environment is stochastic / uncertain
4 Elements of Reinforcement Learning (inwards to outwards)
- Policy
- Reward
- Value
- Model
- Policy: What to do?
- Reward: What is good?
- Value: What is good because it predicts reward?
- Model: What follows what?
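To make the four elements concrete, here is a sketch on an assumed toy problem: a corridor of four cells where only entering the last cell is rewarded. The corridor, the discount factor `GAMMA`, and all function names are hypothetical, and value iteration is used as one standard way to compute values when the model is known.

```python
STATES = [0, 1, 2, 3]
GOAL = 3
ACTIONS = ["left", "right"]
GAMMA = 0.9  # assumed discount factor for future reward


def model(state, action):
    """Model: what follows what (next cell for a given cell and action)."""
    step = 1 if action == "right" else -1
    return min(max(state + step, 0), GOAL)


def reward(state, next_state):
    """Reward: what is good right now (only reaching the goal pays)."""
    return 1.0 if next_state == GOAL and state != GOAL else 0.0


def backup(state, action, values):
    """Immediate reward plus discounted value of the predicted next state."""
    next_state = model(state, action)
    return reward(state, next_state) + GAMMA * values[next_state]


def compute_values(sweeps=50):
    """Value: what is good because it predicts reward (value iteration)."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES:
            if s != GOAL:  # the goal is terminal and keeps value 0
                values[s] = max(backup(s, a, values) for a in ACTIONS)
    return values


def greedy_policy(values):
    """Policy: what to do (choose the action with the best backup)."""
    return {
        s: max(ACTIONS, key=lambda a: backup(s, a, values))
        for s in STATES
        if s != GOAL
    }


values = compute_values()
policy = greedy_policy(values)
```

Cells far from the goal earn no immediate reward, yet they get a nonzero value because they predict reward later; the policy then simply follows the values toward the goal.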
Actuator space
Set of all possible actions