Week 4 Flashcards
Minimax search
Minimax search is depth first search on the game tree.
1. The graph is built in real time
2. The goal is to evaluate the children of the current node to find the best node
When using mini-max, should you load the entire game tree?
No, this puts too much of a limit on the size of trees which can be searched.
Use depth-first search instead
How can we remove the assumption of two-players from minimax?
By treating all other players as a single MIN player.
- No longer produces an equilibrium
- Assumes the opponents are working together to gang up on Player 1
- Still produces worst-case analysis for layer 1
Why does treating all opponents in minimax as one player not produce an equilibria?
Suppose you are player 1, and player 2 has two possible moves:
1. Move A: a move which causes player 3 to win
2. Move B: a move which continues the game
Player 2 will always make move B
Minimax (Assuming Player 2 and 3 work together), assumes player 2 will make move A
How can we remove the assumption of no chance from the minimax algorithm
Treat the value of a chance node as the expectation of all its children. Sometimes called “expectimax”
Uses a method called “Counterfactual Regret Minimization”, beyond the scope of this course
Win-loss solve
Value of a MAX node:
- WIN if any of it’s children are WIN nodes
- LOSS if all of its children are LOSS nodes
Value of a MIN node:
- WIN if all of it’s children are WIN nodes
- LOSS if any of its children are LOSS nodes
Why can win-loss solve be much faster than minimax?
It doesn’t have to search the entire tree
What is alpha-beta pruning
Each node contains a range [alpha,beta] where:
- alpha is the maximum lower bound of the value of the node
- beta is the minimum upper bound of the value of the node
- where beta <= alpha prune the subtree containing that node
- Start with the range [-inf,inf]
- Only MAX nodes can change alpha; only MIN nodes change beta
- The range is passed down the tree unchanged
- A child can change the range of its parent
What do we do for large games using minimax
Use an “evaluation function” (“heuristic”) to approximate the value of the deepest nodes we can reach, if they are not terminal nodes
What would be a good heuristic for tic-tac-toe?
Number of lines where X could win - number of lines where O could win
How to find good heuristics for minimax?
Admissible and monotonic does not apply.
- Requires knowledge of the game
- Need to make sure the heuristic is correctly normalised against values returned at actual terminal nodes
Often trial and error is used:
- Player them against each other in multiple games (from both starting positions)
- Play them both against other heuristics
- The one that wins the most is likely better
How to find the best weights for a linear combination of a set of k heuristics?
Hand tweaking
- Often done by guess work
Stochastic hillclimber
- Set random weights with a zero-mean over and over and find the combination that evaluates the best
What is the standard way to play two-player, perfect information games?
Use a good mini-max tree search algorithm with effective pruning.
Use effective heuristic evaluation function
Major issue: Finding good heuristic evaluation functions
What are some possible extensions to the minimax algorithm
- Database of opening moves
- Database of end games
- Move reordering to increase pruning
What game did minimax fail to win?
Go