Game Playing Flashcards
Definition of games
zero-sum, discrete, finit, determinist, perfect information
zero-sum
one player’s gain is the other player’s loss; does not mean fair
perfect information
each player can see the complete game state; no simultaneous decisons
game playing as search
states - board configuration
actions - legal moves
initial state - starting board configuration
goal state - game over/terminal board configuration
Utility function
used to map each terminal state of the board to a score indicating the value of that outcome to the computer
Greedy Search Using an evaluation function
expand the search tree to the terminal states on each branch; evaluate the utility of each terminal board configuration; make the initial move that results in the board configurations with the maximum value
problem with greedy search using eval function
ignores what the opponent might do
Minimax principle
Assume both players play optimally; computer should choose maximizing moves (high utility), opponent = minimizing moves (low utility)
Minimax principle
computer chooses the best move considering both its move and the opponent’s optimal move
Minimax Propagation Up Tree
Start at leaves; assign value to parent - min if opponent’s move; max if computer’s move
General Minimax Algorithm
For each move done by computer:
perform DFS, stop at terminal states, evaluate each terminal state, propagate upwards the minimax values; choose the move at root with the maximum of the minimax values of its children
Time/space complexity of minimax algorithm
space O(bd) (DFS); time: O(b^d)
Static Evaluation function
impractical to search game space to all terminal states and apply utility function; SEF - use heuristics to estimate the value of non-terminal state
SBE used for
estimating how good the current board configuration is for the computer; reflects chances of winning from that node; must be easy to calculate
how is SBE calculated
one subtracts how good it is for the opponent from how good it is for the computer; SBE should agree with the Utility function when calculated at terminal nodes
Minimax Algorithm
MaxValues if (terminal state or depth limit); return SBE value of s else v = -infinity for each s in successors(s) v = max (v, Min-Values)
Alpha Beta Idea
At maximizing levels; keep track of highest SBE value seen so far in each none
At minimizing levels, lowest SBE value seen so far in tree below
Alpha Cutoff
At each min node, keep track of the minimum value return so far from its visited children, store in v; each time v is updated, check its value against the alpha value of all its Max node ancesters; if alpha >= v don’t visit any more of MIN node’s children
Beta Cutoff
at each max node, keep track of the maximum value returned so far from its visited children, store in v; each time v is updated at a max node, check its value against the beta value of all its MIN node ancestors; if v >= beta don’t vist any more of the current Max node’s children