L9: Monte Carlo Tree Search, Current AI, Timeline Overview Flashcards

1
Q

How does Monte Carlo Tree Search work?

A

Uses sampling to evaluate the game tree: the static evaluation function is replaced with random playouts. Only partially explores the game tree, balancing exploration and exploitation.

2
Q

Steps of Monte Carlo Tree Search

A
  1. Leaf selection (find a promising leaf, balancing exploration and exploitation)
  2. Leaf expansion (add a new child node to the selected leaf)
  3. Simulation (random playout to the end of the game; if the node is already terminal, skip straight to step 4)
  4. Backpropagation (update node values along the path using the simulation result; in the examples, update q() with +/−1 per player on a win, and on a loss update only n(), leaving q() unchanged)
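The four steps above can be sketched in Python. The `Node` class, callback names, and toy one-player "counting" game here are hypothetical illustrations, not from the slides:

```python
import random

class Node:
    """Minimal MCTS tree node (hypothetical structure for illustration)."""
    def __init__(self, state, parent=None):
        self.state = state      # game state at this node
        self.parent = parent
        self.children = []
        self.n = 0              # visit count n(v)
        self.q = 0.0            # accumulated reward q(v)

def mcts(root, iterations, select, expand, simulate):
    """Run the four MCTS steps for a fixed number of iterations."""
    for _ in range(iterations):
        # 1. Leaf selection: descend until we reach a node with no children
        node = root
        while node.children:
            node = select(node)
        # 2. Leaf expansion: add a new child (returns None for terminal states)
        child = expand(node)
        leaf = child if child is not None else node
        # 3. Simulation: random playout from the leaf to the end of the game
        reward = simulate(leaf.state)
        # 4. Backpropagation: update counts back up to the root
        while leaf is not None:
            leaf.n += 1
            leaf.q += reward    # reward 0 on a loss leaves q unchanged
            leaf = leaf.parent
    return root

# Toy usage: a made-up one-player "game" where states count up to 2 (terminal)
def expand(node):
    if node.state < 2:
        child = Node(node.state + 1, parent=node)
        node.children.append(child)
        return child
    return None                 # already at the end of play: skip straight to step 4

root = mcts(Node(0), iterations=10,
            select=lambda v: random.choice(v.children),
            expand=expand,
            simulate=lambda s: 1)   # playout stub that always "wins"
```

In a real game, `select` would use UCT and `simulate` would play random legal moves to the end of the game.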
3
Q

What is the formula for the Upper Confidence Bound for Trees (UCT)?

A

From node v, select the child u such that

u = argmax over children w of (q(w)/n(w) + C*sqrt(ln(n(v))/n(w)))

The first term is the average score (exploitation); the second is an exploration bonus that shrinks the more often a child is visited. Make C bigger to prioritise exploration and smaller to prioritise exploitation. C often = 2.

If n(w) = 0, just choose that child, to prevent division by zero.
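A sketch of UCT selection under the formula above, using plain dicts as stand-in nodes (the field names "q" and "n" are assumptions for illustration):

```python
import math

def uct_child(v, children, C=2):
    """Select the child w of v maximising q(w)/n(w) + C*sqrt(ln(n(v))/n(w))."""
    for w in children:
        if w["n"] == 0:         # unvisited child: pick it to avoid dividing by zero
            return w
    return max(children,
               key=lambda w: w["q"] / w["n"]
               + C * math.sqrt(math.log(v["n"]) / w["n"]))

v = {"n": 10}
children = [{"q": 3, "n": 5}, {"q": 1, "n": 1}, {"q": 0, "n": 4}]
best = uct_child(v, children)   # the rarely visited child wins on the bonus term
```

Here the middle child has both the best average (1.0) and the largest exploration bonus, so it is selected.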

4
Q

Can you do the question on slide 60?

A

Yes?

5
Q

Describe the two move selection methods

A

Max child: Select the child w of root v that maximises q(w)/n(w), the average score

Robust child: Select the child w of root v that maximises n(w), i.e. the child explored the most
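The two rules can be written directly from their definitions (the dict fields "q" and "n" are illustrative assumptions), and they can disagree on the same tree:

```python
def max_child(children):
    """Max child: the child with the highest average score q(w)/n(w)."""
    return max(children, key=lambda w: w["q"] / w["n"])

def robust_child(children):
    """Robust child: the child visited the most, i.e. the largest n(w)."""
    return max(children, key=lambda w: w["n"])

children = [{"q": 8, "n": 10}, {"q": 3, "n": 3}]
# max_child picks children[1] (average 1.0 > 0.8);
# robust_child picks children[0] (10 visits > 3)
```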

6
Q

Name some possible MCTS extensions

A
  • Pruning heuristics
  • Non-uniform playouts
  • Replacing random playouts with an evaluation function
  • Game simplification
  • Incorporating deep learning techniques
7
Q

Describe the developments in AI during the 1990s and 2000s

A

Increased computing power had several consequences:
* “Old” methods were finally getting the power they needed (for example, Kasparov vs. Deep Blue)
* Approximate, expensive methods were becoming “good enough”: Monte Carlo methods in general, and heuristic optimisation methods (genetic algorithms, ant colony optimisation, etc.)
* Existing methods were becoming practical enough to embed in applications: “expert systems” are now just one of many functionalities embedded in MS Excel

8
Q

What is the deep learning revolution?

A

Took place in the 2010s.
* GPU-accelerated neural nets enabled practical deep learning
* Specialised neural net topologies proved successful on certain tasks:
  * Convolutional neural nets for images and video
  * LSTM networks for handwriting, speech, and temporal data
* Initially fuelled by the availability of “big data”
* Combined with reinforcement learning, a breakthrough in dynamic settings
