Pseudocode Flashcards
1
Q
Policy evaluation MDP
A
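A minimal sketch of iterative policy evaluation for a known MDP, assuming a tabular model. The layout P[s][a] = [(prob, next_state, reward), ...], the dictionary-based policy, and the two-state chain are illustrative assumptions, not taken from the card.

```python
def policy_evaluation(P, policy, gamma=0.9, theta=1e-8):
    """Sweep the Bellman expectation backup in place until values stop changing.

    P[s][a] is a list of (prob, next_state, reward) transitions (assumed layout).
    policy[s][a] is the probability of taking action a in state s.
    States that do not appear as keys of P are treated as terminal (value 0).
    """
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = sum(
                policy[s][a] * prob * (r + gamma * V.get(s2, 0.0))
                for a in P[s]
                for prob, s2, r in P[s][a]
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            return V


# Hypothetical chain: 0 -> 1 -> terminal "T", reward 1 on every step.
P = {0: {"go": [(1.0, 1, 1.0)]}, 1: {"go": [(1.0, "T", 1.0)]}}
policy = {0: {"go": 1.0}, 1: {"go": 1.0}}
V = policy_evaluation(P, policy, gamma=0.5)
```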
2
Q
Policy iteration MDP
A
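A minimal sketch of policy iteration, alternating evaluation and greedy improvement until the policy is stable. It assumes a deterministic policy, the same hypothetical model layout P[s][a] = [(prob, next_state, reward), ...], and either gamma < 1 or a task in which every policy terminates, so the evaluation loop converges.

```python
def q_value(P, V, s, a, gamma):
    """One-step lookahead value of taking action a in state s."""
    return sum(p * (r + gamma * V.get(s2, 0.0)) for p, s2, r in P[s][a])


def policy_iteration(P, gamma=0.9, theta=1e-8):
    pi = {s: next(iter(P[s])) for s in P}  # arbitrary initial policy
    V = {s: 0.0 for s in P}
    while True:
        # 1. Policy evaluation for the current deterministic policy.
        while True:
            delta = 0.0
            for s in P:
                v_new = q_value(P, V, s, pi[s], gamma)
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
            if delta < theta:
                break
        # 2. Policy improvement: switch only on a strict gain, to avoid cycling on ties.
        stable = True
        for s in P:
            best = max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
            if q_value(P, V, s, best, gamma) > q_value(P, V, s, pi[s], gamma) + 1e-12:
                pi[s] = best
                stable = False
        if stable:
            return pi, V


# Hypothetical task: from state 0, "fast" ends now with reward 1,
# "slow" moves to state 1, where "go" ends with reward 5.
P = {
    0: {"fast": [(1.0, "T", 1.0)], "slow": [(1.0, 1, 0.0)]},
    1: {"go": [(1.0, "T", 5.0)]},
}
pi, V = policy_iteration(P, gamma=0.9)
```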
3
Q
Value iteration
A
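A minimal sketch of value iteration, which folds the improvement step into each sweep by backing up the maximum over actions. The model layout and the toy task are the same illustrative assumptions as in a tabular DP setting: P[s][a] = [(prob, next_state, reward), ...].

```python
def value_iteration(P, gamma=0.9, theta=1e-8):
    V = {s: 0.0 for s in P}

    def q(s, a):
        # States missing from P are treated as terminal (value 0).
        return sum(p * (r + gamma * V.get(s2, 0.0)) for p, s2, r in P[s][a])

    while True:
        delta = 0.0
        for s in P:
            v_new = max(q(s, a) for a in P[s])  # Bellman optimality backup
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            break
    # Read off a greedy deterministic policy from the converged values.
    pi = {s: max(P[s], key=lambda a: q(s, a)) for s in P}
    return pi, V


# Hypothetical task: "slow" is worth 0.9 * 5 = 4.5, "fast" is worth 1.
P = {
    0: {"fast": [(1.0, "T", 1.0)], "slow": [(1.0, 1, 0.0)]},
    1: {"go": [(1.0, "T", 5.0)]},
}
pi, V = value_iteration(P, gamma=0.9)
```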
4
Q
Policy evaluation MC
A
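A minimal sketch of first-visit Monte Carlo prediction, assuming episodes have already been sampled under the policy being evaluated. The episode format [(S_0, R_1), (S_1, R_2), ...] and the sample episodes are illustrative assumptions.

```python
from collections import defaultdict


def mc_first_visit_prediction(episodes, gamma=1.0):
    """Average the return that follows the first visit to each state."""
    returns_sum = defaultdict(float)
    returns_cnt = defaultdict(int)
    for episode in episodes:
        # Record the time step of the first visit to each state.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk backwards so the return G can be accumulated incrementally.
        G = 0.0
        for t in reversed(range(len(episode))):
            s, r = episode[t]
            G = gamma * G + r
            if first_visit[s] == t:
                returns_sum[s] += G
                returns_cnt[s] += 1
    return {s: returns_sum[s] / returns_cnt[s] for s in returns_sum}


# Hypothetical episodes; in the second one, state "A" is visited twice.
episodes = [
    [("A", 1.0), ("B", 2.0)],
    [("A", 0.0), ("A", 3.0)],
]
V = mc_first_visit_prediction(episodes)
```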
5
Q
Exploring starts (Monte Carlo ES)
A
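A minimal sketch of Monte Carlo control with exploring starts, assuming every episode terminates. The step function, the two-state task, and the incremental averaging of returns are illustrative assumptions.

```python
import random
from collections import defaultdict


def step(s, a):
    """Hypothetical toy task: returns (next_state, reward, done)."""
    if s == 0:
        return (1, 0.0, False) if a == 1 else (None, 0.0, True)
    return (None, 1.0 if a == 1 else 0.0, True)


def mc_exploring_starts(states, actions, step, episodes=500, gamma=1.0, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)
    N = defaultdict(int)
    pi = {s: actions[0] for s in states}  # arbitrary initial policy
    for _ in range(episodes):
        # Exploring start: every state-action pair can begin an episode.
        s, a = rng.choice(states), rng.choice(actions)
        episode, done = [], False
        while not done:
            s2, r, done = step(s, a)
            episode.append((s, a, r))
            if not done:
                s, a = s2, pi[s2]  # afterwards, follow the current policy
        G = 0.0
        for t in reversed(range(len(episode))):
            s, a, r = episode[t]
            G = gamma * G + r
            # First-visit check on the state-action pair.
            if all((s, a) != (x, y) for x, y, _ in episode[:t]):
                N[(s, a)] += 1
                Q[(s, a)] += (G - Q[(s, a)]) / N[(s, a)]
                pi[s] = max(actions, key=lambda b: Q[(s, b)])  # greedy improvement
    return pi, Q


pi, Q = mc_exploring_starts(states=[0, 1], actions=[0, 1], step=step)
```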
6
Q
Epsilon-soft policies MC
A
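A minimal sketch of on-policy first-visit Monte Carlo control with an epsilon-soft policy, here epsilon-greedy with respect to the current Q. The step function, the toy task, and the fixed start state are illustrative assumptions.

```python
import random
from collections import defaultdict


def step(s, a):
    """Hypothetical toy task: returns (next_state, reward, done)."""
    if s == 0:
        return (1, 0.0, False) if a == 1 else (None, 0.0, True)
    return (None, 1.0 if a == 1 else 0.0, True)


def mc_epsilon_soft(actions, step, start, episodes=2000, gamma=1.0,
                    epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)
    N = defaultdict(int)

    def policy(s):
        # Epsilon-soft: every action keeps probability >= epsilon / len(actions).
        if rng.random() < epsilon:
            return rng.choice(actions)
        return max(actions, key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s, done, episode = start, False, []
        while not done:
            a = policy(s)
            s2, r, done = step(s, a)
            episode.append((s, a, r))
            s = s2
        G = 0.0
        for t in reversed(range(len(episode))):
            s, a, r = episode[t]
            G = gamma * G + r
            if all((s, a) != (x, y) for x, y, _ in episode[:t]):  # first visit
                N[(s, a)] += 1
                Q[(s, a)] += (G - Q[(s, a)]) / N[(s, a)]
    return Q


actions = [0, 1]
Q = mc_epsilon_soft(actions, step, start=0)
greedy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in (0, 1)}
```

Because the policy is defined through Q, updating Q implicitly updates the policy for the next episode.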
7
Q
Policy evaluation MC off-policy
A
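A minimal sketch of off-policy Monte Carlo prediction using weighted importance sampling in its incremental form. The episode format [(S_t, A_t, R_{t+1}), ...], the probability tables for the target and behavior policies, and the sample episodes are illustrative assumptions.

```python
from collections import defaultdict


def off_policy_mc_prediction(episodes, target, behavior, gamma=1.0):
    """Estimate Q for the target policy from episodes sampled under behavior."""
    Q = defaultdict(float)
    C = defaultdict(float)  # cumulative importance weights per (state, action)
    for episode in episodes:
        G, W = 0.0, 1.0
        for s, a, r in reversed(episode):
            G = gamma * G + r
            C[(s, a)] += W
            Q[(s, a)] += (W / C[(s, a)]) * (G - Q[(s, a)])
            # Reweight earlier steps by how likely the target policy was to take a.
            W *= target[s][a] / behavior[s][a]
            if W == 0.0:
                break  # earlier steps of this episode carry no weight
    return Q


# Hypothetical policies: the target always picks "r", the behavior is uniform.
target = {"A": {"l": 0.0, "r": 1.0}, "B": {"l": 0.0, "r": 1.0}}
behavior = {"A": {"l": 0.5, "r": 0.5}, "B": {"l": 0.5, "r": 0.5}}
episodes = [
    [("A", "r", 0.0), ("B", "r", 2.0)],
    [("A", "r", 0.0), ("B", "l", 4.0)],  # "l" at B is impossible under the target
]
Q = off_policy_mc_prediction(episodes, target, behavior)
```

The second episode updates only Q(B, l): once the weight reaches zero, the return that passed through an action the target policy never takes is not propagated back to state A.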
8
Q
Policy evaluation TD(0)
A
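A minimal sketch of tabular TD(0) prediction, assuming transitions were sampled under the policy being evaluated. The episode format [(S, R, S_next), ...], with None marking the terminal state, is an illustrative assumption.

```python
from collections import defaultdict


def td0_prediction(episodes, alpha=0.1, gamma=1.0):
    """Move V(S) toward the bootstrapped target R + gamma * V(S_next) after each step."""
    V = defaultdict(float)
    for episode in episodes:
        for s, r, s_next in episode:
            # The terminal state has value 0 by definition.
            bootstrap = gamma * V[s_next] if s_next is not None else 0.0
            V[s] += alpha * (r + bootstrap - V[s])
    return V


# Hypothetical task: A -> B -> terminal, reward 1 on the last step, seen twice.
episode = [("A", 0.0, "B"), ("B", 1.0, None)]
V = td0_prediction([episode, episode], alpha=0.5)
```

Unlike the Monte Carlo version, the update happens after every step rather than at the end of the episode, so V(A) only starts to move once V(B) is nonzero.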
9
Q
Sarsa
A
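A minimal sketch of Sarsa (on-policy TD control) with an epsilon-greedy policy. The step function, the toy task, and the hyperparameter values are illustrative assumptions.

```python
import random
from collections import defaultdict


def step(s, a):
    """Hypothetical toy task: returns (next_state, reward, done)."""
    if s == 0:
        return (1, 0.0, False) if a == 1 else (None, 0.0, True)
    return (None, 1.0 if a == 1 else 0.0, True)


def sarsa(actions, step, start, episodes=2000, alpha=0.1, gamma=1.0,
          epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)

    def eps_greedy(s):
        if rng.random() < epsilon:
            return rng.choice(actions)
        return max(actions, key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s = start
        a = eps_greedy(s)
        done = False
        while not done:
            s2, r, done = step(s, a)
            if done:
                target = r
            else:
                # On-policy: bootstrap from the action that will actually be taken.
                a2 = eps_greedy(s2)
                target = r + gamma * Q[(s2, a2)]
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            if not done:
                s, a = s2, a2
    return Q


actions = [0, 1]
Q = sarsa(actions, step, start=0)
greedy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in (0, 1)}
```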
10
Q
Q-learning
A
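A minimal sketch of Q-learning (off-policy TD control): the agent acts epsilon-greedily but bootstraps from the greedy value of the next state. The step function, the toy task, and the hyperparameter values are illustrative assumptions.

```python
import random
from collections import defaultdict


def step(s, a):
    """Hypothetical toy task: returns (next_state, reward, done)."""
    if s == 0:
        return (1, 0.0, False) if a == 1 else (None, 0.0, True)
    return (None, 1.0 if a == 1 else 0.0, True)


def q_learning(actions, step, start, episodes=2000, alpha=0.1, gamma=1.0,
               epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = start, False
        while not done:
            # Behavior policy: epsilon-greedy with respect to Q.
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(s, b)])
            s2, r, done = step(s, a)
            # Target policy: greedy, so bootstrap from the max over next actions.
            best_next = 0.0 if done else max(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q


actions = [0, 1]
Q = q_learning(actions, step, start=0)
greedy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in (0, 1)}
```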
11
Q
Expected Sarsa
A
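A minimal sketch of Expected Sarsa, which replaces the sampled next action in the Sarsa target with the expectation of Q under the epsilon-greedy policy. The step function, the toy task, and the hyperparameter values are illustrative assumptions.

```python
import random
from collections import defaultdict


def step(s, a):
    """Hypothetical toy task: returns (next_state, reward, done)."""
    if s == 0:
        return (1, 0.0, False) if a == 1 else (None, 0.0, True)
    return (None, 1.0 if a == 1 else 0.0, True)


def expected_sarsa(actions, step, start, episodes=2000, alpha=0.1, gamma=1.0,
                   epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)

    def greedy_action(s):
        return max(actions, key=lambda a: Q[(s, a)])

    def expected_value(s):
        # Expectation of Q(s, .) under the epsilon-greedy policy.
        g = greedy_action(s)
        return sum(
            (epsilon / len(actions) + (1.0 - epsilon) * (a == g)) * Q[(s, a)]
            for a in actions
        )

    for _ in range(episodes):
        s, done = start, False
        while not done:
            a = rng.choice(actions) if rng.random() < epsilon else greedy_action(s)
            s2, r, done = step(s, a)
            target = r if done else r + gamma * expected_value(s2)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q


actions = [0, 1]
Q = expected_sarsa(actions, step, start=0)
greedy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in (0, 1)}
```

Taking the expectation removes the variance that Sarsa incurs from sampling the next action, at the cost of a sum over actions on every step.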