Algorithms Flashcards by Pier-Olivier Marquis

SARSA vs Q

SARSA Algorithm

REINFORCE Algorithm

Q-learning update rule
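
For reference, the standard tabular form of the rule can be written as follows. The step size α and discount factor γ are assumed symbols; they are not given on the card.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

The max over a' makes the rule off-policy: the target assumes the greedy action in the next state, whatever action the agent actually takes.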

SARSA update rule
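
For reference, the standard tabular form of the rule can be written as follows. The step size α and discount factor γ are assumed symbols; they are not given on the card.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]
```

The target uses a_{t+1}, the action the current policy actually selects in the next state, which is what makes the rule on-policy.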

Boltzmann vs Softmax policy

Boltzmann policy

High values of the temperature τ (e.g., τ = 5) flatten the action distribution toward uniform, so the agent acts almost randomly. Low values of τ (e.g., τ = 0.1) concentrate probability on the action with the largest Q-value, so the agent acts more greedily. With τ = 1, the Boltzmann policy reduces to the plain softmax over the Q-values.
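
The effect of τ can be checked numerically. The following is a minimal sketch, assuming the usual form p(a) = exp(Q(a)/τ) / Σ_b exp(Q(b)/τ). The function names `boltzmann_probs` and `sample_action` and the example Q-values are illustrative, not taken from the card.

```python
import math
import random


def boltzmann_probs(q_values, tau=1.0):
    """Return action probabilities proportional to exp(Q(a) / tau)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    scaled = [q / tau for q in q_values]
    # Subtracting the max leaves the probabilities unchanged
    # but avoids overflow in exp() when tau is small.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(q_values, tau=1.0, rng=random):
    """Sample an action index from the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, tau)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]  # hypothetical Q-values for three actions
    for tau in (5.0, 1.0, 0.1):
        probs = boltzmann_probs(q, tau)
        print(f"tau={tau}: {[round(p, 3) for p in probs]}")
```

Running the script prints the three distributions side by side: the τ = 5 row should be close to uniform, while at τ = 0.1 nearly all probability sits on the last action.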

DQN Algorithm
