Game Playing AI Flashcards
The first case study
What historical figure(s) contributed to early AI game playing systems in the 1950s?
Claude Shannon and Alan Turing worked on early chess-playing systems in the 1950s.
What notable AI defeated a world champion in chess, and when?
Deep Blue, developed by IBM, defeated world chess champion Garry Kasparov in 1997.
What AI agent achieved human-level performance on Atari games in 2015?
DeepMind’s Deep Q-Network (DQN) agent.
Why are games considered interesting for AI research?
Games are difficult to solve and present challenges in decision-making, learning, and strategy, making them an ideal testing ground for AI.
What is Moravec’s Paradox, and how does it relate to game-playing AI?
Moravec’s Paradox observes that high-level reasoning tasks humans find difficult (like chess) are comparatively easy for computers, while sensorimotor tasks humans find easy (like vision and movement) are computationally hard. This influences the design of generalized game-playing AIs.
What are some commercial motivations for game-playing AIs?
Game-playing AIs can be used for playtesting, saving time and money by automating game testing processes and allowing multiple play sessions simultaneously.
What approach is most effective for creating game-playing AIs?
Reinforcement learning is most effective, as it allows the AI to learn by interacting with the game environment and improving based on rewards.
What are the three main components of a reinforcement learning algorithm in a game-playing AI?
- States: Input data like game pixels, score, and game-over status.
- Actions: Movements such as left, right, or doing nothing.
- Rewards: Points and game completion.
What is the role of OpenAI Gym in training game-playing AIs?
OpenAI Gym provides a unified interface for AIs to interact with multiple tasks, helping in training and testing game-playing agents.
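The value of that unified interface is that one agent loop can drive many different tasks. A minimal sketch of the Gym-style `reset`/`step` contract, using an invented toy environment rather than a real Gym game (the task and names here are illustrative assumptions):

```python
import random

class ToyEnv:
    """Hypothetical toy task exposing the Gym-style reset/step interface:
    reach position 3 on a 1-D line within 10 steps."""

    def __init__(self):
        self.pos = 0
        self.steps = 0

    def reset(self):
        self.pos = 0
        self.steps = 0
        return self.pos  # initial observation

    def step(self, action):
        # action: 0 = move left, 1 = move right
        self.pos += 1 if action == 1 else -1
        self.steps += 1
        done = self.pos == 3 or self.steps >= 10
        reward = 1.0 if self.pos == 3 else 0.0
        return self.pos, reward, done, {}  # observation, reward, done, info

# The same agent loop works for any environment with this interface:
env = ToyEnv()
obs, done, total = env.reset(), False, 0.0
while not done:
    action = random.choice([0, 1])  # a random policy, for illustration
    obs, reward, done, info = env.step(action)
    total += reward
```

Swapping `ToyEnv()` for a real Gym environment leaves the agent loop unchanged, which is exactly what makes training and testing across many games practical. (Newer versions of the library split `done` into two flags, so check the API version you have installed.)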
What are the ethical considerations involved in building game-playing AIs?
Concerns include potential job displacement in game testing and broader ethical issues related to AI surpassing human capabilities in decision-making and learning.
When is an AI considered to have “beaten” a game?
When the AI consistently defeats human players and is publicly available for anyone to play against over a long period of time.
What is the purpose of competitions in AI game playing research?
Competitions provide a way to compare different AIs on a standardized game, encouraging innovation and improving the quality of AI systems.
What is reinforcement learning?
Reinforcement learning is a framework in which AI agents learn what to do through trial and error, guided by reward signals from the environment rather than labelled data.
What are the key components of a Markov Decision Process (MDP) used in game AI?
States, actions, rewards, and the state transition matrix.
How can a game like “Breakout” be described using a Markov Decision Process?
In “Breakout,” states are the frames (pixels), actions are moving the paddle left, right, or doing nothing, and rewards are the points earned for breaking bricks.
What does it mean when a problem is “stochastic” in the context of game AI?
Stochastic means there is a probabilistic element in the system. For example, the ball in “Breakout” could move in different directions from the same initial state.
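A stochastic MDP can be written down directly as a table of transition probabilities; the two states, actions, and probabilities below are invented for illustration:

```python
import random

# A toy stochastic MDP: transitions[state][action] is a list of
# (probability, next_state, reward) outcomes.
transitions = {
    "ball_low":  {"left":  [(0.8, "ball_low", 0), (0.2, "ball_high", 1)],
                  "right": [(0.5, "ball_low", 0), (0.5, "ball_high", 1)]},
    "ball_high": {"left":  [(1.0, "ball_low", 0)],
                  "right": [(0.6, "ball_high", 1), (0.4, "ball_low", 0)]},
}

def sample_transition(state, action):
    """Sample one (next_state, reward) outcome. The same (state, action)
    pair can produce different results -- that is what 'stochastic' means."""
    r = random.random()
    cumulative = 0.0
    for prob, next_state, reward in transitions[state][action]:
        cumulative += prob
        if r < cumulative:
            return next_state, reward
    return next_state, reward  # guard against floating-point rounding
```

In a deterministic MDP every outcome list would contain a single entry with probability 1.0; the probabilistic entries are what force the agent to reason about expected, rather than certain, rewards.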
What is Q-learning, and why is it necessary?
Q-learning is a reinforcement learning algorithm that learns the action-value (Q) function from experience. It is necessary because in many cases, such as Atari games, the state transition probabilities and rewards are not known in advance.
What is the Q-function in Q-learning?
The Q-function estimates the expected cumulative discounted reward of taking a given action in a given state and acting well thereafter.
What is the Bellman equation used for in reinforcement learning?
The Bellman equation calculates the value of a given action in a state, considering immediate rewards and future discounted rewards. (See Week 2 Notes for the full equation)
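The Bellman idea (immediate reward plus discounted future value) becomes the update rule at the heart of tabular Q-learning. A minimal sketch, with made-up states and parameter values:

```python
from collections import defaultdict

Q = defaultdict(float)       # Q[(state, action)] -> estimated value, starts at 0
alpha, gamma = 0.1, 0.99     # learning rate and discount factor (illustrative)
actions = ["left", "right", "noop"]

def q_update(state, action, reward, next_state):
    """One Q-learning step: move Q(s, a) toward the Bellman target
    reward + gamma * max over a' of Q(next_state, a')."""
    best_next = max(Q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])

# Example: the agent earned +1 for moving right in state "s0".
q_update("s0", "right", 1.0, "s1")  # Q[("s0", "right")] moves from 0.0 to 0.1
```

DQN replaces the table `Q` with a neural network, but the target it trains toward is this same Bellman target.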
What are the two main challenges addressed by the DQN agent architecture?
- The lack of a state transition matrix.
- The absence of an action policy.
What is the replay buffer in a DQN agent, and what is its purpose?
The replay buffer stores the agent’s experiences (state, action, reward, next state). It lets the agent learn by revisiting past interactions, and sampling from it at random breaks the correlation between consecutive frames, which stabilises training.
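A replay buffer needs very little machinery: a fixed-capacity queue plus uniform random sampling. A minimal sketch (capacity and tuple layout are illustrative assumptions, not DQN's exact configuration):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done)
    tuples; the oldest experiences are discarded once capacity is reached."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling decorrelates consecutive experiences.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(5):
    buf.push(t, 0, 0.0, t + 1, False)
batch = buf.sample(3)
```

During training, the agent pushes one experience per environment step and periodically samples a minibatch from the buffer to update its Q-network.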