Class 4 - Adaptation Flashcards

1
Q

3 types of adaptive processes

A
  1. Evolutionary algorithms
  2. Reinforcement learning
  3. Learning by demonstration
2
Q

Evolutionary robots…

A
  1. are created to adapt to their environment via evolutionary computing
  2. can generate offspring
  3. mainly focus on evolving the brain
  4. can fix issues by themselves
3
Q

Genotypes

A

numbers that describe the phenotype of the evolutionary robot

4
Q

Genotypes can be…

A. discrete numbers in [0, 1]
B. continuous numbers in [0, 1]
C. both A and B
D. neither A nor B

A

C. both A and B

5
Q

T/F: Different parts of the genotype can describe different parts of the phenotype of the robot

A

True
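For intuition, a minimal sketch (the slice boundaries and part names are assumptions for illustration): one genotype vector whose different slices describe different parts of the phenotype.

```python
import random

# Hypothetical genotype: a flat vector of numbers in [0, 1].
genotype = [random.random() for _ in range(10)]

# Different slices of the genotype describe different parts of the
# phenotype (the split and part names are illustrative).
phenotype = {
    "body": genotype[:4],      # e.g., limb lengths
    "sensors": genotype[4:6],  # e.g., sensor ranges
    "brain": genotype[6:],     # e.g., controller weights
}
print(phenotype)
```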

6
Q

An evolutionary algorithm can be split into two main phases…

A
  1. the “testing phase”, where the robot is put in the environment and its fitness function is evaluated
  2. the “generation phase”, where the resulting fitness values are used to create the next generation of offspring (see the sketch below)
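A minimal sketch of this two-phase loop, assuming a toy fitness function and a mutate-the-fittest generation step (all names and parameters are illustrative, not the exact method from class):

```python
import random

POP_SIZE, GENERATIONS = 10, 20

def fitness(genotype):
    # Stand-in for the testing phase: in a real setup the robot is
    # evaluated in its environment; here, genes closer to 0.5 score higher.
    return -sum((g - 0.5) ** 2 for g in genotype)

population = [[random.random() for _ in range(5)] for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    # Testing phase: evaluate the fitness of every individual.
    ranked = sorted(population, key=fitness, reverse=True)
    # Generation phase: the fittest half survives and produces
    # mutated offspring to refill the population.
    parents = ranked[: POP_SIZE // 2]
    offspring = [[g + random.gauss(0, 0.1) for g in p] for p in parents]
    population = parents + offspring

print(round(max(fitness(ind) for ind in population), 4))
```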
7
Q

3 methods to create offspring (in the context of evolutionary algorithms)…

A
  1. genetic algorithm
  2. evolutionary strategy
  3. modern evolutionary strategy
8
Q

Match the offspring method (in the context of evolutionary algorithms) to its description…

A. genetic algorithm
B. evolutionary strategy
C. modern evolutionary strategy

  1. only allows continuous values in the genotype, which are then crossed over with mutation (which can produce any number between 0 and 1) to create the offspring’s genotype. The first generation of genotypes can be random, and parents can outlive the children if they have a better fitness value.
  2. only allows discrete values in the genotype, which are then crossed over with flipped-value mutation to create the offspring’s genotype. Here too, the first generation of genotypes can be random.
  3. uses correlation to check whether the previous children performed better than the new version. Only uses one parent, since the approach is computationally challenging.
A

A-2
B-1
C-3
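A minimal sketch of the genetic-algorithm style (A-2): discrete genotypes combined by one-point crossover, then flipped-value mutation. The crossover point and mutation rate are illustrative assumptions.

```python
import random

def crossover(parent_a, parent_b):
    # One-point crossover of two discrete (binary) genotypes.
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(genotype, rate=0.1):
    # Flipped-value mutation: each gene flips with probability `rate`.
    return [1 - g if random.random() < rate else g for g in genotype]

parent_a = [random.randint(0, 1) for _ in range(8)]
parent_b = [random.randint(0, 1) for _ in range(8)]
print(mutate(crossover(parent_a, parent_b)))
```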

9
Q

Advantages of evolutionary algorithms

A

If one component is weaker, the others can compensate!

10
Q

Disadvantages of evolutionary algorithms

A
  1. unbounded complexity does not emerge
  2. adaptations are short-term → problems need to be fixed immediately
  3. exploring new ideas is limited, because offspring come from a certain family
11
Q

T/F: in RL, only the offspring “carries” the improvement

A

False; that’s generally the case for evolutionary algorithms. In RL, the robot improves itself continuously, given a reward

12
Q

Policy

A

explains to the robot which action to take in a certain state

13
Q

Is the policy “ideal” before the robot starts exploring?

A

No, the robot updates the policy as it explores the environment

14
Q

Two types of policies

A
  1. deterministic
  2. stochastic

15
Q

Gaussian policy is an example of…

A. deterministic policy
B. stochastic policy

A

B. stochastic policy

16
Q

In a deterministic policy…

A

the robot has n discrete actions (categories) it can pick from, and a given state always maps to the same action

17
Q

In a stochastic policy…

A

each action has a probability attached to its execution: when the robot is in a given field, there is e.g. a 60% chance that it will move right, a 20% chance it will move left, a 10% chance it will move up, and a 10% chance it will move down.
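A minimal sketch contrasting the two policy types on a toy grid world (state names and probabilities are illustrative):

```python
import random

# Deterministic policy: each state maps to exactly one action.
deterministic_policy = {"cell_A": "right", "cell_B": "up"}

# Stochastic policy: each state maps to a probability distribution
# over actions, and the action is sampled from it.
stochastic_policy = {
    "cell_A": {"right": 0.6, "left": 0.2, "up": 0.1, "down": 0.1},
}

state = "cell_A"
print(deterministic_policy[state])                # always "right"
actions, probs = zip(*stochastic_policy[state].items())
print(random.choices(actions, weights=probs)[0])  # usually "right"
```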

18
Q

T/F: A Gaussian policy is used when the action space is continuous

A

True
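A minimal sketch of a Gaussian policy for a continuous action space: the action is sampled from a Gaussian centred on a state-dependent mean. The state-to-mean mapping here is an assumed placeholder; in practice it would be learned.

```python
import random

def gaussian_policy(state, std=0.1):
    # Placeholder state-to-mean mapping; in practice this would be a
    # learned function (e.g., a neural network).
    mean = 0.5 * state
    # Continuous action: a sample from a Gaussian centred on the mean.
    return random.gauss(mean, std)

print(gaussian_policy(state=1.0))  # a value near 0.5
```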

19
Q

Finite-horizon undiscounted return

A

summing all rewards over a fixed number of steps, and picking the policy that corresponds to the highest sum

20
Q

Infinite-horizon discounted return

A

summing all rewards, with each reward discounted by how far in the future it arrives (multiplied by γ^t, where 0 < γ < 1). The effect is like penalising the number of steps taken: a trajectory that needs many steps to collect its reward scores worse, while a high total reward collected in few steps scores best, which is what we want
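A minimal sketch of both returns on an assumed reward sequence:

```python
rewards = [1.0, 0.0, 0.5, 2.0]  # assumed reward sequence, one per step

# Finite-horizon undiscounted return: a plain sum over a fixed horizon.
undiscounted = sum(rewards)

# Infinite-horizon discounted return (truncated here): each reward is
# scaled by gamma**t, so rewards that arrive later count less.
gamma = 0.9
discounted = sum(gamma ** t * r for t, r in enumerate(rewards))

print(undiscounted, round(discounted, 4))
```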

21
Q

Q-learning

A
  1. the robot explores the environment itself
  2. Q-learning is an off-policy reinforcement learning algorithm that seeks to find the best action to take given the current state. It’s considered off-policy because the Q-learning function learns from actions that are outside the current policy, like taking random actions, and therefore a policy isn’t needed (see the sketch below).
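A minimal sketch of the tabular Q-learning update (the states, actions, and hyperparameters are assumed placeholders):

```python
import random
from collections import defaultdict

alpha, gamma, epsilon = 0.1, 0.9, 0.1  # assumed hyperparameters
actions = ["left", "right", "up", "down"]
Q = defaultdict(float)                 # (state, action) -> value estimate

def choose_action(state):
    # Epsilon-greedy exploration: mostly greedy, sometimes random.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    # The update bootstraps from the best next action, regardless of
    # which action the exploring policy would actually take -- this is
    # what makes Q-learning off-policy.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

update("cell_A", "right", reward=1.0, next_state="cell_B")
print(Q[("cell_A", "right")])  # 0.1 after one update
```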
22
Q

Monte Carlo approach

A

a class of computational algorithms that rely on repeated random sampling to obtain numerical results
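A classic minimal example of the idea: estimating π by repeated random sampling.

```python
import random

# Estimate pi by sampling random points in the unit square and counting
# the fraction that lands inside the quarter circle of radius 1.
n = 100_000
inside = sum(random.random() ** 2 + random.random() ** 2 <= 1 for _ in range(n))
print(4 * inside / n)  # close to 3.14159
```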

23
Q

T/F: the “learning by demonstration” adaptation method has 3 main phases.

If TRUE, name them
If FALSE, give the exact number

A

True;

demonstration phase; training phase; testing phase

24
Q

In the context of learning by demonstration, in the demonstration phase…

…Match the following types of demonstrations with their description

A. kinesthetic teaching
B. tele-operated teaching
C. direct imitation of human behavior

  1. imitating a human’s behavior from observational human data
  2. the teacher uses a controller to show the robot the correct behavior
  3. the teacher places the robot’s body in the correct position
A

C-1
A-3
B-2

25
Q

In the context of learning by demonstration, in which phase (demonstration / training / testing) does the robot actually learn from the observations?

A

Training phase