Lecture 2 - Machine Learning Flashcards

1
Q

Learning

A

Acquiring new knowledge or skills and improving one's performance

2
Q

A robot can learn about … (2)

A
  1. Itself: sensor or actuator information that might vary over time
  2. Its environment: learning maps, how to achieve goals
3
Q

(3) benefits of learning

A
  1. Enabling the robot to perform its task better
  2. Adapting to changes in the environment (hard to preprogram)
  3. Simplifying the designer's programming work
4
Q

3 Forms of Learning

1. Supervised learning

A

With an external supervisor/teacher

Input-output pairs are presented and the function mapping inputs to outputs is learned
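
A minimal sketch of this idea, fitting hypothetical input-output pairs with scikit-learn (the data and model choice are illustrative, not from the lecture):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical input-output pairs provided by the teacher.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

model = LinearRegression().fit(X, y)     # learn the function between the pairs
print(model.predict(np.array([[4.0]])))  # ~9.0 for an unseen input
```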

5
Q

3 Forms of Learning

2. Unsupervised learning

A

All information must be taken from the inputs alone

-> it can be useful to preprocess the inputs, e.g. divide them into meaningful clusters
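
For instance, a minimal sketch of such preprocessing, clustering hypothetical sensor readings with k-means (the data and cluster count are made-up assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 2-D sensor readings (e.g. two distance sensors).
readings = np.array([[0.10, 0.20], [0.15, 0.22], [5.0, 4.8], [5.1, 5.2]])

# Group the raw inputs into 2 clusters; no labels or teacher involved.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(readings)
print(kmeans.labels_)  # e.g. [0 0 1 1]: each input assigned to a cluster
```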

6
Q

3 Forms of Learning

3. Reinforcement learning

A

With an evaluation signal.

7
Q

What is feedback in supervised learning?

A

The target action or output for a specific input.

8
Q

Example of supervised learning + name

A

Neural network learning

-> the weights of the connections between nodes are learned (connectionist learning)

9
Q

How did ALVINN learn to drive?

A
ALVINN steers as it thinks it should
A human shows how it should steer
ALVINN computes the error
ALVINN uses this error to update its weights
-> REPEAT
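
A minimal sketch of this error-driven loop, assuming a single linear unit and a made-up learning rate (ALVINN's actual network was multi-layer; this only illustrates the update idea):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=3)  # hypothetical 3-feature input
lr = 0.1                      # assumed learning rate

# Hypothetical (features, human_steering) demonstrations.
demos = [(np.array([1.0, 0.2, -0.5]), 0.3),
         (np.array([0.4, -0.1, 0.9]), -0.2)]

for features, target in demos:        # REPEAT over demonstrations
    prediction = weights @ features   # steer as it thinks it should
    error = target - prediction       # human shows how it should steer
    weights += lr * error * features  # use the error to update the weights
```
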
10
Q

Disadvantages of supervised learning (3)

A
  1. A trainer is needed (less autonomous)
  2. Not online (first a training phase, then an operating phase)
  3. Not incremental (while it is operating, it is not learning)
11
Q

4 Characteristics of Reinforcement Learning

A
  1. Learning from interaction
  2. Goal-oriented learning
  3. Learning about, from, and while interacting with an external environment.
  4. Learning what to do (how to map situations to actions) so as to maximize a numerical reward signal
12
Q

6 Key features of Reinforcement Learning

A
  1. Learner is not told which actions to take
  2. Trial and Error search
  3. Possibility of delayed reward
  4. Sacrifice short-term gains for greater long-term gains
  5. The need to explore and exploit
  6. Considers the whole problem of a goal-directed agent interacting with an uncertain environment
13
Q

4 Characteristics of a complete agent

A
  1. It is temporally situated
  2. It learns and plans continually
  3. The objective is to affect the environment
  4. The environment is stochastic / uncertain
14
Q

4 Elements of Reinforcement Learning (inwards to outwards)

  1. Policy
  2. Reward
  3. Value
  4. Model
A
  1. Policy: What to do?
  2. Reward: What is good?
  3. Value: What is good because it predicts reward?
  4. Model: What follows what?
15
Q

Actuator space

A

Set of all possible actions

16
Q

When the robot knows/has learned which action to perform in each state, it has learned a …

A

Reactive controller

17
Q

Exploration (RL)

A

In order to learn the optimal action, the robot has to try everything (trial and error)

18
Q

Exploitation (RL)

A

Simultaneously with exploration, the robot should perform well and exploit what it has learned.

19
Q

Once the mapping between inputs and actions is learned, the robot can just exploit the learned knowledge and stop exploring … right?

A

No, not always. There might be (1) sensor errors (uncertainty) and (2) a changing environment.

20
Q

Exploitation/Exploration dilemma (trade-off between … (2))

A
  1. Constantly learning (exploration): possibly doing things less than perfectly
  2. Constantly using what it knows (exploitation): cannot improve its predictions of other actions
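
A common way to handle this dilemma is an epsilon-greedy rule: exploit most of the time, explore occasionally. A minimal sketch (the epsilon value and the Q-value lookup are illustrative assumptions):

```python
import random

def choose_action(q_values, actions, epsilon=0.1):
    """With probability epsilon explore (random action), else exploit."""
    if random.random() < epsilon:
        return random.choice(actions)               # exploration
    return max(actions, key=lambda a: q_values[a])  # exploitation
```
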
21
Q

What is learned in RL? Consider robot’s actuator and sensor space!

A

The robot learns a value function (possibly as a table) listing all possible state-action pairs along with their Q-values.
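
A minimal sketch of such a table for discrete states and actions (the states and actions below are hypothetical placeholders):

```python
# One Q-value per (state, action) pair, initialised to zero.
states = ["corridor", "junction", "goal"]
actions = ["forward", "left", "right"]
q_table = {(s, a): 0.0 for s in states for a in actions}

# The learned reactive controller: pick the best-valued action per state.
def policy(state):
    return max(actions, key=lambda a: q_table[(state, a)])
```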

22
Q

Q-value

A

Grows if good things happen and shrinks if bad things happen.
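
One standard way this is implemented is the Q-learning update; the learning rate and discount below are typical values, not from the lecture:

```python
alpha, gamma = 0.1, 0.9  # assumed learning rate and discount factor

def q_update(q_table, actions, state, action, reward, next_state):
    # Move Q(s, a) toward the reward plus the discounted value of the best
    # next action: it grows after good outcomes, shrinks after bad ones.
    best_next = max(q_table[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
```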

23
Q

When is the RL table learning method efficient (2)?

A
  1. When the state space is not too big
  2. When states and actions are discrete

24
Q

What if table learning method alone is inefficient?

A

Combine RL with function approximators such as neural networks
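
A minimal sketch of the idea: a tiny network maps a state vector to one Q-value per action instead of a table row (the layer sizes are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.1, size=(8, 4)), np.zeros(8)  # 4 state features
W2, b2 = rng.normal(scale=0.1, size=(3, 8)), np.zeros(3)  # 3 actions

def q_values(state):
    hidden = np.tanh(W1 @ state + b1)
    return W2 @ hidden + b2  # one estimated Q-value per action

print(q_values(np.array([0.5, -1.0, 0.2, 0.0])))
```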

25
Q

Temporal Credit Assignment (RL)

A

In a maze, the result of a tested state-action pair may come long after the action
-> rewards and punishments have to be propagated back and assigned to multiple previous state-action pairs.
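
A minimal sketch of propagating a delayed reward back over the pairs that led to it, using a discount factor (the episode and gamma are made up):

```python
gamma = 0.9  # assumed discount: earlier pairs receive less of the credit

# Hypothetical maze episode; the reward arrives only at the end.
episode = [("s0", "right"), ("s1", "forward"), ("s2", "forward")]
final_reward = 1.0

credit = {}
G = final_reward
for state, action in reversed(episode):
    credit[(state, action)] = G  # assign the (discounted) reward to this pair
    G *= gamma                   # earlier pairs get a smaller share
print(credit)
```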

26
Q

Notable RL Applications (2)

A
  1. TD-Gammon: world's best backgammon program
  2. Elevator control

27
Q

Learning by Imitation
What does it free from?
What does it involve?

A

It frees the robot from trial and error, but it is not trivial!

It involves careful decisions about internal representations

28
Q

Learning from Demonstration

A

Learning by experiencing a task directly (a human controls the robot to let it experience the result of good actions)

29
Q

What does the robot need to learn in imitation/demonstration learning (2)?

A
  1. What it experienced during trying
  2. How it can generate that behavior again

30
Q

Why is forgetting important (2)?

A
  1. Making room for new information
  2. Replacing old information that is no longer correct

31
Q

What determines which types of learning methods are possible for a particular learning problem?

A

The amount and type of information available (feedback, reward, punishment, error)

32
Q

Can a robot use multiple learning methods at the same time?

A

YES!