8 - Value Function Approximation Flashcards
What are some advantages of Value Function Approximation methods?
One advantage is that we don’t have to explicitly store or learn a reward model, value, state-action value, or policy for every single state. Instead, we get a more compact representation that can generalize across states and actions, which reduces the memory, computation, and experience required. (This is a reasonable assumption when the underlying problem has a “smooth” structure.)
Neural Networks / Deep Learning is the most popular differentiable function approximator. What is the second most popular differentiable function approximator?
Linear Feature Representations (i.e., linear combinations of features)
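As a minimal sketch of what a linear feature representation looks like in practice (the feature map and names like `features`, `v_hat`, and `w` are illustrative, not from the original cards):

```python
import numpy as np

def features(state):
    # Hypothetical hand-designed feature map for a 1-D state: [1, s, s^2]
    return np.array([1.0, state, state ** 2])

def v_hat(state, w):
    # Linear function approximator: V_hat(s; w) = x(s) . w
    return features(state) @ w

w = np.zeros(3)        # one learnable weight per feature
print(v_hat(0.5, w))   # 0.0 before any learning
```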
Gradient descent is guaranteed to find a _________ {local, global} optimum.
Gradient descent is guaranteed to find a local optimum.
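A minimal sketch of the gradient descent update itself, assuming a toy objective J(w) = (w − 3)² chosen only for illustration (on this convex objective the local optimum happens to also be global):

```python
# Plain gradient descent: repeatedly step against the gradient of J(w).
def grad(w):
    return 2.0 * (w - 3.0)    # dJ/dw for J(w) = (w - 3)^2

w, alpha = 0.0, 0.1           # initial point and step size
for _ in range(100):
    w -= alpha * grad(w)      # w <- w - alpha * dJ/dw

print(round(w, 4))            # converges to ~3.0, the minimum
```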
What does the linear part of Linear Value Function Approximation mean?
This means that we represent a value function (or state-action value function) for a particular policy using a weighted linear combination of features.
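In symbols, writing x(s) for a feature vector of state s and w for the learned weight vector (notation assumed here, not given on the card):

$$\hat{V}^{\pi}(s; \mathbf{w}) = \mathbf{x}(s)^{\top}\mathbf{w} = \sum_{j=1}^{n} x_j(s)\, w_j$$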