Week 2 - Search with A* & MCTS Flashcards
Intelligent agents
Perceive their environment through sensors and act on it through actuators. Environments can be classified as:
Fully/partially observable
Deterministic/stochastic
Episodic/sequential
Static/dynamic
Discrete/continuous
Static agents vs learning agents
Static agents have all their knowledge built in at the start.
Learning agents improve their behaviour from experience.
Breadth-first search
Expand all nodes at the current depth before moving on to the next layer. Complete, and optimal when all step costs are equal.
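The breadth-first idea can be sketched in a few lines of Python. The graph and the `neighbours` callable here are made-up illustrations, not from the notes:

```python
from collections import deque

def bfs(start, goal, neighbours):
    """Expand every node at depth d before any node at depth d+1."""
    frontier = deque([[start]])          # FIFO queue of paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbours(node):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

# Hypothetical toy graph for illustration
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs("A", "D", graph.__getitem__))  # ['A', 'B', 'D'] — fewest edges
```

Because the frontier is FIFO, the first path that reaches the goal uses the fewest edges.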
Uniform cost search
Expand the frontier node with the lowest path cost g(n). Optimal for any non-negative step costs.
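A minimal uniform-cost sketch: the only change from BFS is that the frontier becomes a priority queue ordered by path cost. The weighted graph is a made-up example:

```python
import heapq

def uniform_cost_search(start, goal, edges):
    """Always pop the frontier entry with the lowest path cost g(n)."""
    frontier = [(0, start, [start])]     # priority queue ordered by cost
    best = {start: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        for nxt, step in edges.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, path + [nxt]))
    return None

# Hypothetical weighted graph: the direct A→C edge is more expensive
edges = {"A": [("B", 1), ("C", 5)], "B": [("C", 1)], "C": []}
print(uniform_cost_search("A", "C", edges))  # (2, ['A', 'B', 'C'])
```

Note it prefers the two-step path of cost 2 over the direct edge of cost 5.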
Depth first search
Always expand the deepest unexpanded node. Uses little memory, but it can commit to one deep branch and miss shallower solutions, so it is not optimal.
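Depth-first search is the same sketch as breadth-first but with a LIFO stack, so the deepest path is expanded first. Same made-up graph as in the BFS example:

```python
def dfs(start, goal, neighbours):
    """Always expand the most recently generated (deepest) node."""
    stack = [[start]]                    # LIFO stack of paths
    visited = {start}
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbours(node):
            if nxt not in visited:
                visited.add(nxt)
                stack.append(path + [nxt])
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(dfs("A", "D", graph.__getitem__))  # ['A', 'C', 'D'] — a path, not necessarily the shortest
```

It still finds the goal here, but via whichever branch happens to be popped first, not the fewest-edge route.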
Bidirectional Search
Search from both the initial and goal state, hoping that they meet in the middle.
Informed (Heuristic) Search
Use a heuristic estimate of the cost from the current state to the goal.
E.g. a navigation heuristic from Loughborough to Edinburgh might be "head north" (or the straight-line distance).
A* Search
Combines the cost to get to a node with estimated cost to goal.
f(n) = g(n) + h(n)
(g(n) is the path cost from the initial state to n)
(h(n) is the estimated cost from n to the goal)
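A* is uniform cost search with the priority changed from g(n) to f(n) = g(n) + h(n). A minimal sketch — the graph and heuristic table are invented for illustration, with h chosen to be admissible:

```python
import heapq

def a_star(start, goal, edges, h):
    """Pop the frontier node with the lowest f(n) = g(n) + h(n)."""
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nxt, step in edges.get(node, []):
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt, path + [nxt]))
    return None

# Hypothetical graph; h never overestimates the true remaining cost (admissible)
edges = {"S": [("A", 1), ("B", 4)], "A": [("G", 5)], "B": [("G", 1)], "G": []}
h = {"S": 4, "A": 5, "B": 1, "G": 0}.get
print(a_star("S", "G", edges, h))  # (5, ['S', 'B', 'G'])
```

The cheap-looking first step through A (g = 1) is never committed to, because its f value of 6 stays above the true optimum of 5 found through B.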
A* Search Conditions for optimality
h(n) must be an admissible heuristic (never overestimate the true cost to the goal).
Consistency: h(n) ≤ c(n, n') + h(n') for every successor n', so the estimated total cost f(n) never decreases along a path.
When is each (tree/graph) version of A* optimal?
Tree - if h(n) admissible
Graph - if h(n) is consistent
A* search always expands the… (cost)
frontier node with the lowest estimated total cost f(n).
Minimax
The MAX player tries to maximise the score.
The MIN player tries to minimise MAX's score.
In other words, MAX maximises the worst-case outcome, assuming MIN plays optimally.
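The worst-case idea can be seen in a minimal minimax sketch. The two-ply tree and leaf values are invented for illustration:

```python
def minimax(node, maximising, children, value):
    """MAX picks the child with the highest score, MIN the lowest."""
    kids = children(node)
    if not kids:                          # leaf: static evaluation
        return value(node)
    scores = [minimax(k, not maximising, children, value) for k in kids]
    return max(scores) if maximising else min(scores)

# Hypothetical two-ply game: MAX moves first, then MIN
tree = {"root": ["L", "R"], "L": ["L1", "L2"], "R": ["R1", "R2"]}
leaves = {"L1": 3, "L2": 5, "R1": 2, "R2": 9}
# MAX's guaranteed value: max(min(3, 5), min(2, 9)) = 3
print(minimax("root", True, lambda n: tree.get(n, []), leaves.get))  # 3
```

MAX does not chase the 9 on the right: MIN would answer with 2, so the left branch's guaranteed 3 is the better worst case.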
Alpha beta pruning
Pruning (skipping) branches that cannot affect the final minimax decision.
Returns the same move as full minimax, but faster.
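A minimal alpha-beta sketch on the same kind of invented two-ply tree used for minimax; alpha and beta track the best scores MAX and MIN can already guarantee:

```python
def alphabeta(node, maximising, children, value,
              alpha=float("-inf"), beta=float("inf")):
    """When alpha >= beta, the remaining siblings cannot matter: prune."""
    kids = children(node)
    if not kids:
        return value(node)
    if maximising:
        best = float("-inf")
        for k in kids:
            best = max(best, alphabeta(k, False, children, value, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break                     # MIN would never let play reach here
        return best
    best = float("inf")
    for k in kids:
        best = min(best, alphabeta(k, True, children, value, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best

# Hypothetical tree: R2 is never evaluated, since R1 = 2 already makes
# the right branch worse for MAX than the left branch's guaranteed 3
tree = {"root": ["L", "R"], "L": ["L1", "L2"], "R": ["R1", "R2"]}
leaves = {"L1": 3, "L2": 5, "R1": 2, "R2": 9}
print(alphabeta("root", True, lambda n: tree.get(n, []), leaves.get))  # 3
```

The value is identical to plain minimax; the saving is in the leaves never visited.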
MCTS (Monte Carlo Tree Search)
Explores the tree using a stochastic sampling process (Monte Carlo, as in the casino) when the tree is too large to search exhaustively, even with pruning. Each iteration has four phases: selection, expansion, simulation (random rollout), and backpropagation.
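A stripped-down sketch of the core loop: flat Monte Carlo over the root moves with UCB1 selection (a full MCTS would grow a tree below each move rather than stopping at one ply). The two-move game and its win rates are invented for illustration:

```python
import math, random

def mcts_choose(moves, simulate, iterations=1000, c=1.4):
    """Each iteration: select a move by UCB1, run a random playout,
    back up the reward. Returns the most-visited move."""
    wins = {m: 0.0 for m in moves}
    visits = {m: 0 for m in moves}
    for t in range(1, iterations + 1):
        # Selection: UCB1 balances exploitation (mean reward) and exploration
        move = max(moves, key=lambda m: float("inf") if visits[m] == 0
                   else wins[m] / visits[m] + c * math.sqrt(math.log(t) / visits[m]))
        wins[move] += simulate(move)      # simulation, then backpropagation
        visits[move] += 1
    return max(moves, key=lambda m: visits[m])

# Hypothetical game: "left" wins 90% of random playouts, "right" only 10%
random.seed(0)
best = mcts_choose(["left", "right"],
                   lambda m: 1.0 if random.random() < (0.9 if m == "left" else 0.1) else 0.0)
print(best)
```

With enough iterations the sampling concentrates on "left", even though no branch is ever searched exhaustively.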