9 - Deep RL Flashcards

1
Q

Deep Neural Networks (DNN) are function approximators that are a composition of a number of _______________ {features, functions}.

A

Deep Neural Networks (DNN) are function approximators that are a composition of a number of functions.
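As a minimal sketch of what "composition of functions" means here, the snippet below chains two affine layers around a tanh nonlinearity. The helper names (`linear`, `compose`) and the weights are hypothetical and chosen only for illustration; real networks use vector-valued layers.

```python
import math
from functools import reduce


def linear(weight, bias):
    """Return a one-dimensional affine layer x -> weight * x + bias."""
    return lambda x: weight * x + bias


def compose(*layers):
    """Compose layers left to right: compose(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda value, layer: layer(value), layers, x)


# Hypothetical weights, for illustration only.
# network(x) = -1.0 * tanh(2.0 * x + 1.0) + 0.5
network = compose(linear(2.0, 1.0), math.tanh, linear(-1.0, 0.5))
```

Calling `network(0.0)` evaluates the three functions in order, which is all a forward pass is in this toy setting.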

2
Q

The nonlinearity in a DNN is called the ___________ function.

A

The nonlinearity in a DNN is called the activation function.

3
Q

Why are Convolutional Neural Networks (CNNs) popular in Deep Learning and Deep Reinforcement Learning?

A

They are used throughout computer vision because they reduce the space and time complexity of the model and make use of the local structure surrounding each pixel, which a traditional (fully connected) neural network has trouble capturing.

4
Q

What is a stride for a CNN?

A

A stride is the step size that a filter or mask takes when moving across an array of input pixels. For example, a stride of 1 means we shift a 5x5 mask by 1 pixel at a time.
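The effect of the stride can be sketched along a single axis. The function names below are hypothetical, and the 9-pixel row is an assumed example size, not something taken from the card.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Number of positions a filter occupies along one axis of the input."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def filter_positions(input_size, kernel_size, stride=1):
    """Starting offsets the filter visits along one axis (no padding)."""
    return list(range(0, input_size - kernel_size + 1, stride))


# A 5-wide mask over an assumed 9-pixel row.
positions_stride_1 = filter_positions(9, 5, stride=1)  # moves 1 pixel at a time
positions_stride_2 = filter_positions(9, 5, stride=2)  # skips every other offset
```

A larger stride visits fewer positions, so the output feature map shrinks.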

5
Q

What are the 3 elements of the Deadly Triad?

A

1) Function approximation
2) Bootstrapping
3) Off-policy learning
…when all three are combined, learning can diverge and the value estimates can become unbounded.
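A toy illustration of how the three elements interact, assuming a two-state setup in the spirit of the standard textbook counterexample: one weight w, state values approximated as 1*w and 2*w (function approximation), a TD target built from the next state's own estimate (bootstrapping), and updates applied only to the first transition regardless of how often the policy would visit it (off-policy). All numbers and names here are assumptions for illustration.

```python
def semi_gradient_td_updates(w=1.0, alpha=0.1, gamma=0.99, steps=50):
    """Repeatedly apply a semi-gradient TD(0) update to one transition.

    The transition goes from a state with feature 1 to a state with
    feature 2, with reward 0. Returns the history of the weight w.
    """
    history = [w]
    for _ in range(steps):
        # Bootstrapped TD error: reward + gamma * V(next) - V(current).
        td_error = 0.0 + gamma * (2.0 * w) - (1.0 * w)
        # Gradient of V(current) = 1 * w with respect to w is 1.
        w += alpha * td_error * 1.0
        history.append(w)
    return history


diverging = semi_gradient_td_updates(gamma=0.99)   # gamma > 0.5: w grows each step
converging = semi_gradient_td_updates(gamma=0.4)   # gamma < 0.5: w shrinks toward 0
```

Each update multiplies w by 1 + alpha * (2 * gamma - 1), so in this sketch the estimate grows without bound whenever gamma exceeds 0.5.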

6
Q

For DQN in Atari, what is the specific s in Q(s, a)?

A

The input state s is a stack of raw pixels from the last 4 frames (or however many frames are chosen).

7
Q

What is Experience Replay in DQNs?

A

Experience Replay is one method used to overcome unstable learning. DNNs easily overfit to the most recent episodes. To counter this, Experience Replay stores experiences (states, actions, rewards and the resulting state transitions) in a buffer and samples mini-batches from it to update the neural net. This reduces the correlation between the experiences used in each update, speeds up learning through mini-batching, and reuses past transitions to avoid catastrophic forgetting.
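The store-then-sample idea can be sketched as a small buffer. The class name, field names, and capacity below are assumptions for illustration, and uniform random sampling is assumed as the sampling rule.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity, seed=None):
        self.memory = deque(maxlen=capacity)  # oldest transitions fall out first
        self.rng = random.Random(seed)

    def store(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a mini-batch without replacement, breaking temporal correlation."""
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


buffer = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buffer.store(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
batch = buffer.sample(batch_size=2)
```

Because each mini-batch mixes transitions from different time steps, consecutive updates are less correlated than they would be when training on the latest transition alone.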

8
Q

In the seminal 2015 paper by Mnih et al., the DQN model was better than humans on _______________ {none, some, most, all} Atari games.

A

In the seminal 2015 paper by Mnih et al., the DQN model was better than humans on most Atari games.

9
Q

According to the ablation study, what was the most important feature?

A

The most important feature was found to be Experience Replay. (An “ablation” study removes some part of the model or algorithm and checks whether the performance changes.)
