L9: Monte Carlo Tree Search, Current AI, Timeline Overview Flashcards
How does Monte Carlo Tree Search work?
Uses sampling to evaluate the game tree. Replaces the evaluation function with random playouts. Only partially explores the game tree (balancing exploration and exploitation).
Steps of Monte Carlo Tree Search
1. Leaf selection (find a promising leaf, balancing exploration and exploitation)
2. Leaf expansion (add a new child node to the selected leaf)
3. Simulation (random playout until the end of the game; if the node is already a terminal position, skip straight to step 4)
4. Backpropagation (update node values along the path using the simulation result; in the examples q() is updated by +/-1 for each player after a win, and after a loss only n() is updated, not q())
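A minimal sketch of the four steps in Python (not from the slides): it uses a toy subtraction game (take 1 or 2 from a pile, whoever takes the last object wins) as a stand-in for a real game, and one common backpropagation convention (q counted from the perspective of the player who just moved; +1 for a win, only n updated for a loss).

```python
import math
import random

class Node:
    """Node of the search tree for a toy subtraction game:
    take 1 or 2 from a pile; whoever takes the last object wins."""
    def __init__(self, pile, player, parent=None, move=None):
        self.pile, self.player = pile, player      # objects left, player to move (1 or 2)
        self.parent, self.move = parent, move
        self.children = []
        self.untried = [m for m in (1, 2) if m <= pile]
        self.q = 0.0                               # total reward (parent player's perspective)
        self.n = 0                                 # visit count

def uct_child(v, C=2.0):
    # selection rule from the UCT card: q(w)/n(w) + C*sqrt(ln(n(v))/n(w))
    return max(v.children,
               key=lambda w: w.q / w.n + C * math.sqrt(math.log(v.n) / w.n))

def mcts(root_pile, iterations=2000):
    root = Node(root_pile, player=1)
    for _ in range(iterations):
        node = root
        # 1. Leaf selection: descend while fully expanded and non-terminal
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Leaf expansion: add one new child, if any move is still untried
        if node.untried:
            m = node.untried.pop()
            node.children.append(Node(node.pile - m, 3 - node.player, node, m))
            node = node.children[-1]
        # 3. Simulation: random playout until the game ends
        pile, player = node.pile, node.player
        while pile > 0:
            pile -= random.choice([m for m in (1, 2) if m <= pile])
            player = 3 - player
        winner = 3 - player                        # the player who took the last object
        # 4. Backpropagation: +1 to q where the winner made the move, only n otherwise
        while node is not None:
            node.n += 1
            if node.parent is not None and node.parent.player == winner:
                node.q += 1
            node = node.parent
    return max(root.children, key=lambda w: w.n).move   # robust child (see the move selection card below)

print(mcts(10))   # expected best move: 1 (leaves a pile that is a multiple of 3)
```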
What is the formula for the Upper Confidence Bound for Trees (UCT)?
From node v, select the child u such that
u = argmax over children w of ( q(w)/n(w) + C * sqrt( ln(n(v)) / n(w) ) )
The first part is the average score; the second part penalises frequently visited nodes. Make C bigger to prioritise exploration and smaller to prioritise exploitation. C is often set to 2.
If n(w) = 0 then just choose that child to prevent division by zero
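A small illustrative helper for the formula, assuming per-child (q, n) statistics are already available; the function name and the numbers in the example are made up.

```python
import math

def uct_select(parent_n, children, C=2.0):
    """children: list of (q, n) pairs; returns the index of the child to descend into."""
    best_i, best_score = None, float("-inf")
    for i, (q, n) in enumerate(children):
        if n == 0:
            return i        # unvisited child: take it straight away (no division by zero)
        score = q / n + C * math.sqrt(math.log(parent_n) / n)   # average score + exploration term
        if score > best_score:
            best_i, best_score = i, score
    return best_i

# Child 0 has the better average (0.6 vs 0.5) but child 1 has been visited less,
# so with C = 2 the exploration term tips the choice towards child 1.
print(uct_select(20, [(6, 10), (3, 6)]))   # prints 1
```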
Can you do the question on slide 60?
Yes?
Describe the two move selection methods
Max child: Select child w of root v that maximises q(w)/n(w)
Robust child: Select child w of root v that maximises n(w), i.e. the most-explored child
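A quick illustration of the two rules on made-up root statistics, one (q, n) pair per move:

```python
# Made-up root statistics: move -> (total score q, visit count n)
stats = {"a": (30.0, 50), "b": (12.0, 15), "c": (3.0, 5)}

max_child    = max(stats, key=lambda m: stats[m][0] / stats[m][1])   # highest average q/n
robust_child = max(stats, key=lambda m: stats[m][1])                 # most visits n

print(max_child, robust_child)   # "b" (average 0.8) vs "a" (50 visits)
```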
Name some possible MCTS extensions
- Pruning heuristics
- Non-uniform playouts
- Replacing random playouts with an evaluation function
- Game simplification
- Incorporating deep learning techniques
Describe the developments in AI during the 1990s and 2000s
Increased computing power had several consequences:
* “Old” methods were finally getting the power they needed
For example, Kasparov vs. Deep Blue
* Expensive approximate methods were becoming “good enough” (Monte Carlo methods in general; heuristic optimisation methods such as genetic algorithms and ant colony optimisation)
* Existing methods were becoming practical enough to embed in applications - “expert systems” are now just one of many functionalities embedded in MS Excel
What is the deep learning revolution?
Took place in the 2010s.
* GPU-accelerated neural nets enable practical deep learning
* Specialised neural net topologies successful on certain tasks
* Convolutional neural nets for images and video
* LSTM networks for handwriting, speech, temporal data
* Initially fuelled by the availability of “big data”
* Combination with reinforcement learning led to breakthroughs in dynamic settings