Algorithms Flashcards

1
Q

SARSA vs Q-learning

A
2
Q

SARSA Algorithm

A
3
Q

REINFORCE Algorithm

A
4
Q

Q-learning update rule

A
5
Q

SARSA update rule

A
6
Q

Boltzmann vs Softmax policy

A
7
Q

Boltzmann policy

High values of τ (e.g., τ = 5) push the action probabilities toward a uniform distribution, so the agent acts almost randomly. Low values of τ (e.g., τ = 0.1) concentrate probability on the action with the largest Q-value, so the agent acts more greedily. At τ = 1 the Boltzmann policy reduces to the standard softmax over the Q-values.
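
The effect of τ can be checked numerically. The sketch below is a minimal illustration, not a reference implementation: it assumes the usual form p(a) ∝ exp(Q(a) / τ), and the function names `boltzmann_probs` and `sample_action` and the three Q-values are hypothetical choices made for the example.

```python
import math
import random


def boltzmann_probs(q_values, tau=1.0):
    """Return action probabilities proportional to exp(Q(a) / tau)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    scaled = [q / tau for q in q_values]
    # Subtracting the maximum leaves the probabilities unchanged
    # but avoids overflow in exp() when tau is small.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(q_values, tau=1.0, rng=random):
    """Draw one action index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, tau)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]  # hypothetical Q-values for three actions
    for tau in (5.0, 1.0, 0.1):
        print(tau, [round(p, 3) for p in boltzmann_probs(q, tau)])
```

With τ = 5 the three probabilities come out close to one another, with τ = 0.1 nearly all of the mass sits on the last action, and with τ = 1 the result matches a plain softmax of the Q-values.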

A
8
Q

DQN Algorithm

A