Game Playing AI Flashcards
The first case study
What historical figure(s) contributed to early AI game playing systems in the 1950s?
Claude Shannon and Alan Turing worked on early chess-playing systems in the 1950s.
What notable AI defeated a world champion in chess, and when?
Deep Blue, developed by IBM, defeated world chess champion Garry Kasparov in 1997.
What AI agent achieved human-level performance on Atari games in 2015?
DeepMind’s Deep Q-Network (DQN) agent.
Why are games considered interesting for AI research?
Games are difficult to solve and present challenges in decision-making, learning, and strategy, making them an ideal testing ground for AI.
What is Moravec’s Paradox, and how does it relate to game-playing AI?
Moravec’s Paradox observes that high-level reasoning tasks humans find difficult (like chess) are comparatively easy for computers, while sensorimotor tasks humans find easy (like vision and movement) are computationally hard. This influences the design of generalized game-playing AIs.
What are some commercial motivations for game-playing AIs?
Game-playing AIs can be used for playtesting, saving time and money by automating game testing processes and allowing multiple play sessions simultaneously.
What approach is most effective for creating game-playing AIs?
Reinforcement learning is most effective, as it allows the AI to learn by interacting with the game environment and improving based on rewards.
What are the three main components of a reinforcement learning algorithm in a game-playing AI?
- States: Input data like game pixels, score, and game-over status.
- Actions: Movements such as left, right, or doing nothing.
- Rewards: Points and game completion.
What is the role of OpenAI Gym in training game-playing AIs?
OpenAI Gym provides a unified interface for AIs to interact with multiple tasks, helping in training and testing game-playing agents.
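The value of that unified interface is that one agent loop can drive many different tasks. A minimal sketch of the Gym-style `reset`/`step` contract, using an invented toy environment rather than a real Gym game (the task and names here are illustrative assumptions):

```python
import random

class ToyEnv:
    """Hypothetical toy task exposing the Gym-style reset/step interface:
    reach position 3 on a 1-D line within 10 steps."""

    def __init__(self):
        self.pos = 0
        self.steps = 0

    def reset(self):
        self.pos = 0
        self.steps = 0
        return self.pos  # initial observation

    def step(self, action):
        # action: 0 = move left, 1 = move right
        self.pos += 1 if action == 1 else -1
        self.steps += 1
        done = self.pos == 3 or self.steps >= 10
        reward = 1.0 if self.pos == 3 else 0.0
        return self.pos, reward, done, {}  # observation, reward, done, info

# The same agent loop works for any environment with this interface:
env = ToyEnv()
obs, done, total = env.reset(), False, 0.0
while not done:
    action = random.choice([0, 1])  # a random policy, for illustration
    obs, reward, done, info = env.step(action)
    total += reward
```

Swapping `ToyEnv()` for a real Gym environment leaves the agent loop unchanged, which is exactly what makes training and testing across many games practical. (Newer versions of the library split `done` into two flags, so check the API version you have installed.)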
What are the ethical considerations involved in building game-playing AIs?
Concerns include potential job displacement in game testing and broader ethical issues related to AI surpassing human capabilities in decision-making and learning.
When is an AI considered to have “beaten” a game?
When the AI consistently defeats human players and is publicly available for anyone to play against over a long period of time.
What is the purpose of competitions in AI game playing research?
Competitions provide a way to compare different AIs on a standardized game, encouraging innovation and improving the quality of AI systems.
What is reinforcement learning?
Reinforcement learning is a framework in which AI agents learn what to do through trial and error, guided by reward signals from the environment rather than labelled data.
What are the key components of a Markov Decision Process (MDP) used in game AI?
States, actions, rewards, and the state transition matrix.
How can a game like “Breakout” be described using a Markov Decision Process?
In “Breakout,” states are the frames (pixels), actions are moving the paddle left, right, or doing nothing, and rewards are the points earned for breaking bricks.
What does it mean when a problem is “stochastic” in the context of game AI?
Stochastic means there is a probabilistic element in the system. For example, the ball in “Breakout” could move in different directions from the same initial state.
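A stochastic MDP can be written down directly as a table of transition probabilities; the two states, actions, and probabilities below are invented for illustration:

```python
import random

# A toy stochastic MDP: transitions[state][action] is a list of
# (probability, next_state, reward) outcomes.
transitions = {
    "ball_low":  {"left":  [(0.8, "ball_low", 0), (0.2, "ball_high", 1)],
                  "right": [(0.5, "ball_low", 0), (0.5, "ball_high", 1)]},
    "ball_high": {"left":  [(1.0, "ball_low", 0)],
                  "right": [(0.6, "ball_high", 1), (0.4, "ball_low", 0)]},
}

def sample_transition(state, action):
    """Sample one (next_state, reward) outcome. The same (state, action)
    pair can produce different results -- that is what 'stochastic' means."""
    r = random.random()
    cumulative = 0.0
    for prob, next_state, reward in transitions[state][action]:
        cumulative += prob
        if r < cumulative:
            return next_state, reward
    return next_state, reward  # guard against floating-point rounding
```

In a deterministic MDP every outcome list would contain a single entry with probability 1.0; the probabilistic entries are what force the agent to reason about expected, rather than certain, rewards.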
What is Q-learning, and why is it necessary?
Q-learning is a reinforcement learning algorithm that learns the action-value (Q) function from experience. It is necessary because in many cases, such as Atari games, the state transition probabilities and rewards are not known in advance.
What is the Q-function in Q-learning?
The Q-function estimates the expected cumulative discounted reward of taking a given action in a given state and acting well thereafter.
What is the Bellman equation used for in reinforcement learning?
The Bellman equation calculates the value of a given action in a state, considering immediate rewards and future discounted rewards. (See Week 2 Notes for the full equation)
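The Bellman idea (immediate reward plus discounted future value) becomes the update rule at the heart of tabular Q-learning. A minimal sketch, with made-up states and parameter values:

```python
from collections import defaultdict

Q = defaultdict(float)       # Q[(state, action)] -> estimated value, starts at 0
alpha, gamma = 0.1, 0.99     # learning rate and discount factor (illustrative)
actions = ["left", "right", "noop"]

def q_update(state, action, reward, next_state):
    """One Q-learning step: move Q(s, a) toward the Bellman target
    reward + gamma * max over a' of Q(next_state, a')."""
    best_next = max(Q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])

# Example: the agent earned +1 for moving right in state "s0".
q_update("s0", "right", 1.0, "s1")  # Q[("s0", "right")] moves from 0.0 to 0.1
```

DQN replaces the table `Q` with a neural network, but the target it trains toward is this same Bellman target.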
What are the two main challenges addressed by the DQN agent architecture?
- The lack of a state transition matrix.
- The absence of an action policy.
What is the replay buffer in a DQN agent, and what is its purpose?
The replay buffer stores the agent’s experiences (state, action, reward, next state). It lets the agent learn by revisiting past interactions, and sampling from it at random breaks the correlation between consecutive frames, which stabilises training.
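A replay buffer needs very little machinery: a fixed-capacity queue plus uniform random sampling. A minimal sketch (capacity and tuple layout are illustrative assumptions, not DQN's exact configuration):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done)
    tuples; the oldest experiences are discarded once capacity is reached."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling decorrelates consecutive experiences.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(5):
    buf.push(t, 0, 0.0, t + 1, False)
batch = buf.sample(3)
```

During training, the agent pushes one experience per environment step and periodically samples a minibatch from the buffer to update its Q-network.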