Deep Q Network MLM Flashcards
Deep Q Networks (DQN)
Deep Q Networks (DQN) are a class of reinforcement learning methods that combine deep learning with Q-learning, a model-free reinforcement learning algorithm.
- Introduction
Deep Q Network (DQN) is a variant of Q-learning that uses a deep neural network to approximate the Q-value function, which estimates the expected cumulative reward of taking each action in a given state. The agent uses these estimates to choose good actions, for example when learning to play a game.
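Since DQN builds on Q-learning, it may help to see the update that plain (tabular) Q-learning applies before any neural network is involved. The following is a minimal sketch, not DQN itself; the function name `q_learning_update`, the learning rate `alpha`, the discount factor `gamma`, and the toy table size are all illustrative assumptions.

```python
import numpy as np

def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Target: reward plus the discounted best value of the next state.
    # When the episode has ended there is no next state to bootstrap from.
    best_next = 0.0 if done else np.max(q_table[next_state])
    target = reward + gamma * best_next
    td_error = target - q_table[state, action]
    # Move the current estimate a small step toward the target.
    q_table[state, action] += alpha * td_error
    return td_error

# Toy example (assumed sizes): 3 states, 2 actions, all estimates start at 0.
q_table = np.zeros((3, 2))
td = q_learning_update(q_table, state=0, action=1, reward=1.0,
                       next_state=2, done=False)
```

DQN keeps the same target but replaces the lookup table with a neural network, so that states never seen before can still receive a Q-value estimate.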
- Neural Networks
In DQN, a neural network serves as a function approximator for the Q-value function. The input to the network is the current state of the game, and the output is one estimated Q-value for each possible action in that state.
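The state-in, Q-values-out shape of the network can be sketched with a tiny two-layer network written in plain NumPy. This is only an illustration under assumed sizes (4 state features, 2 actions, 32 hidden units); the names `init_q_network` and `q_values` are hypothetical, and networks that learn from pixels typically use convolutional layers instead.

```python
import numpy as np

def init_q_network(state_dim, num_actions, hidden=32, seed=0):
    """Create small random weights for a two-layer Q-network (illustrative)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, size=(state_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, size=(hidden, num_actions)),
        "b2": np.zeros(num_actions),
    }

def q_values(params, states):
    """Map states of shape (batch, state_dim) to Q-values of shape (batch, num_actions)."""
    hidden = np.maximum(0.0, states @ params["W1"] + params["b1"])  # ReLU layer
    return hidden @ params["W2"] + params["b2"]  # one output per action

params = init_q_network(state_dim=4, num_actions=2)
batch = np.ones((3, 4))                    # three example states
q = q_values(params, batch)                # shape (3, 2)
greedy_actions = q.argmax(axis=1)          # highest-valued action per state
```

Producing all action values in a single forward pass means the greedy action is just an `argmax` over the output, rather than one network evaluation per action.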
- Experience Replay
DQN uses a technique called experience replay, in which past transitions (state, action, reward, next state) are stored in a replay memory. During training, minibatches of transitions are sampled at random from this memory to update the network. Random sampling breaks the correlation between consecutive samples, which stabilizes training.
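A replay memory of this kind can be sketched with a fixed-size queue and uniform random sampling. The class name `ReplayBuffer`, its method names, and the tiny capacity are assumptions made for illustration; a real agent would store far more transitions.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of transitions with uniform random sampling (illustrative)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen discards the oldest transition once it is full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement, ignoring the order of collection.
        batch = self.rng.sample(list(self.memory), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.memory)

buffer = ReplayBuffer(capacity=5, seed=0)
for t in range(8):  # push 8 transitions into a buffer that holds 5
    buffer.push(state=t, action=t % 2, reward=1.0, next_state=t + 1, done=False)
states, actions, rewards, next_states, dones = buffer.sample(batch_size=3)
```

After the loop only the five most recent transitions remain, and each sampled minibatch mixes transitions from different time steps.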
- Target Network
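DQN also incorporates a technique known as a target network, which is a copy of the main network whose weights are held fixed and only refreshed from the main network periodically. The target network is used to calculate the target Q-value during updates, which keeps the learning targets from shifting at every training step and so makes learning more stable.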
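How the target network enters the update can be sketched as follows. The example assumes the next-state Q-values have already been computed by the target network and are passed in as an array; the names `td_targets` and `sync_target`, the discount factor `gamma`, and the numbers used are illustrative only.

```python
import copy
import numpy as np

def td_targets(target_q_next, rewards, dones, gamma=0.99):
    """Compute learning targets from target-network Q-values for the next states.

    target_q_next: array of shape (batch, num_actions).
    dones: 1.0 where the episode ended on that transition, else 0.0.
    """
    max_next = target_q_next.max(axis=1)
    # Terminal transitions use the reward alone as their target.
    return rewards + gamma * (1.0 - dones) * max_next

def sync_target(online_params):
    """Return an independent copy of the main network's weights."""
    return copy.deepcopy(online_params)

target_q_next = np.array([[1.0, 2.0], [0.5, 0.0]])
rewards = np.array([1.0, -1.0])
dones = np.array([0.0, 1.0])
targets = td_targets(target_q_next, rewards, dones, gamma=0.9)

# The copy does not change when the main network's weights are updated.
online = {"W": np.ones((2, 2))}
target = sync_target(online)
online["W"] += 1.0
```

Calling `sync_target` only every so many training steps is what keeps the targets stable between refreshes; how often to refresh is a tuning choice that the text above does not specify.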
- Epsilon-Greedy Policy
DQN typically employs an epsilon-greedy policy for exploration: with probability epsilon the agent takes a random action, and otherwise it takes the action with the highest estimated Q-value. This balance between exploration and exploitation allows the agent to learn a more robust policy.
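Epsilon-greedy action selection takes only a few lines. In the sketch below, the function names and the linear decay schedule are assumptions added for illustration; the schedule is one common choice, not something the text above prescribes.

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy action."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore
    return int(np.argmax(q_values))              # exploit

def linear_epsilon(step, start=1.0, end=0.1, decay_steps=1000):
    """Assumed schedule: decay epsilon linearly from start to end, then hold."""
    fraction = min(step / decay_steps, 1.0)
    return start + fraction * (end - start)

rng = np.random.default_rng(0)
q_for_state = np.array([0.2, 1.5, -0.3])
greedy_action = epsilon_greedy(q_for_state, epsilon=0.0, rng=rng)   # always greedy
random_action = epsilon_greedy(q_for_state, epsilon=1.0, rng=rng)   # always random
```

Starting with a high epsilon and lowering it over time lets the agent explore widely early on and rely more on its learned Q-values later.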
- Reward and Punishment
DQN agents learn from both positive rewards and punishments, which are represented as negative rewards. For example, if an action leads to a higher score in a game, the agent receives a positive reward. If an action causes the game to end, the agent receives a negative reward.
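The score-and-game-over example above could be written as a small reward function. This is a hypothetical sketch: the function name `game_reward`, the use of the score change as the reward, and the penalty value of -1.0 are all assumptions chosen for illustration.

```python
def game_reward(score_delta, game_over, game_over_penalty=-1.0):
    """Turn one step's outcome into a scalar reward (illustrative values)."""
    reward = float(score_delta)       # positive when the score went up
    if game_over:
        reward += game_over_penalty   # punishment expressed as a negative reward
    return reward

scored = game_reward(score_delta=10, game_over=False)
lost = game_reward(score_delta=0, game_over=True)
```

The agent only ever sees the resulting number, so "reward" and "punishment" differ only in sign.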
- Challenges
While DQNs have shown remarkable success, particularly in learning to play video games from raw pixel inputs, they are not without challenges. They can be sample-inefficient, meaning they require a lot of experience (gameplay) to learn effectively. The choice of reward function can also greatly affect the agent’s learning, and designing a good reward function can be non-trivial.
- Applications
DQN was notably used by Google’s DeepMind to train an agent to play Atari 2600 games directly from raw pixel inputs, matching or exceeding human-level performance on many of them.