Minimax and a-b pruning Flashcards
Minimax value
=best Max payoff against best Min play
Minimax value properties
It is a property of a state
- Directly proportional to Max’s utility
- indirectly proportional to Min’s utility
- each state has its own minimax value
- max chooses action that brings state to successor state with the HIGHEST minimax value
-min chooses the action that bring the state to the successor state with the LOWEST minimax value
minimax algorithm
recursively performs a complete depth-first exploration throughout the entire tree and assigns minimax values to each successor state of the state the minimax algorithm was called with
- The one given in class finds the max successor value and as part of the algorithm returns the action corresponding with that value
- at the very end of recursion it returns the Utility for a terminal test.
- completely unrealistic, but serves as a base for practical algorithms
minimax algorithm move assignments
- for each Max move, the Max value of the successor states is returned, as that is the best move for Max
- for each Min move, the minimum value of the successor states is returned as that is the best move for Min.
- all these moves keep getting passed up through higher levels of recursion to figure out the best possible move for each player at each point.
minimax properties
Completeness: yes
optimal: yes
time: O(b^m)
space: O(bm)
a-b pruning
(means alpha beta pruning)
not all nodes need to be explored to take the best decision
-prune away non-useful subtrees
-when expanding look at successors that are more likely to be best first
a-b properties
- does not affect final result
- good move ordering improves pruning effectiveness
- with “perfect ordering,” time complexity = O(b^(m/2))
- -doubles the depth of the search
- –demonstrates value of figuring out relevant computation
what is alpha?
a is the value of the best (i.e., highest value) choice found so far at any choice point along the path for max
- if v is worse (lower value) than a, max will avoid it
- -prune that branch
what is beta?
b is the value of the best(i.e., lowest value) choice found so far at any choice point along the path for min
- if v is a higher value than b, min will avoid it
- -prune that branch
resource limits for a-b
enforce a depth limit
replace utility with a heuristic evaluation function
-estimated desirability of position
-to be applied in states that are non-terminal(since were cutting off early)
evaluation functions for a-b
- require lighter lighter computation
- for nodes, directly connected to winning chances
- usually defined in terms of features
- -i.e. chess: # of pawns, queens etc.
- for chess, typically linear but recently some nonlinear weighted some of features
- -nonlinear: weight of bishops increases later in game