Week 2 - Search with A* & MCTS Flashcards
Intelligent agents
Perceive their environment through sensors and act on it through actuators.
Fully/partially observable
Deterministic/stochastic
Episodic/sequential
Static/dynamic
Discrete/continuous
Static agents vs learning agents
Static agents have all their knowledge at the start.
Learning agents improve their behaviour from experience.
Breadth-first search
Expand all nodes at the current depth before moving on to the next layer.
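A minimal Python sketch of BFS; the `neighbors` callback and the state representation are assumptions for illustration. Popping from the right of the queue instead would turn it into depth-first search.

```python
from collections import deque

def bfs(start, goal, neighbors):
    """Breadth-first search: expand states layer by layer.

    `neighbors(state)` is an assumed callback returning successor states.
    """
    frontier = deque([(start, [start])])   # FIFO queue
    visited = {start}
    while frontier:
        state, path = frontier.popleft()   # using pop() here would give DFS
        if state == goal:
            return path
        for nxt in neighbors(state):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None  # goal unreachable
```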
Uniform cost search
Expand node with lowest path cost.
Depth first search
Always expand the deepest unexpanded node.
Uses little memory, but can commit to a deep path and reduce exploration of the rest of the tree.
Bidirectional Search
Search from both the initial and goal state, hoping that they meet in the middle.
Informed (Heuristic) Search
Use an estimate of the cost from the current state to the goal.
E.g. a navigation heuristic from Loughborough to Edinburgh might be "head north".
A* Search
Combines the cost to reach a node with the estimated cost to the goal.
f(n) = g(n) + h(n)
(g(n) is the cost from the initial state to n)
(h(n) is the estimated cost from n to the goal)
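A minimal sketch of A* with a priority queue ordered by f(n) = g(n) + h(n); the `neighbors` and `h` callbacks are assumptions for illustration. With h(n) = 0 for all n, it behaves like uniform-cost search.

```python
import heapq
import itertools

def a_star(start, goal, neighbors, h):
    """A*: always expand the node with the lowest f(n) = g(n) + h(n).

    `neighbors(state)` is an assumed callback yielding (successor, step_cost)
    pairs; `h(state)` is the assumed heuristic estimate to the goal.
    """
    counter = itertools.count()  # tie-breaker so the heap never compares states
    frontier = [(h(start), next(counter), 0, start, [start])]  # (f, _, g, state, path)
    best_g = {start: 0}
    while frontier:
        f, _, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return g, path
        for nxt, cost in neighbors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):   # found a cheaper route to nxt
                best_g[nxt] = new_g
                heapq.heappush(frontier,
                               (new_g + h(nxt), next(counter), new_g, nxt, path + [nxt]))
    return None  # goal unreachable
```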
A* Search Conditions for optimality
h(n) must be an admissible heuristic (never overestimates the true cost).
Consistency: h(n) ≤ c(n, n′) + h(n′) for every successor n′, so f(n) never decreases along a path.
When is each (tree/graph) version of A* optimal?
Tree - if h(n) admissible
Graph - if h(n) is consistent
A* search always expands the… (cost)
node with the lowest estimated overall cost f(n)
Minimax
MAX tries to maximise the score.
MIN tries to minimise MAX's score.
MAX therefore maximises the worst-case outcome, assuming MIN plays optimally.
Alpha beta pruning
Prunes (skips) branches that cannot affect the final minimax decision.
Returns the same result as plain minimax, faster.
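A minimal sketch of minimax with alpha-beta pruning, covering the two cards above; the `moves` and `evaluate` callbacks are assumed game-specific helpers.

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, evaluate):
    """Minimax with alpha-beta pruning.

    `moves(state)` is an assumed callback returning successor states;
    `evaluate(state)` is an assumed scoring function.
    alpha = best value MAX can guarantee so far on this path,
    beta  = best value MIN can guarantee so far on this path.
    """
    successors = moves(state)
    if depth == 0 or not successors:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in successors:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, moves, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # MIN would never allow this branch: prune it
        return value
    else:
        value = float("inf")
        for child in successors:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, moves, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break  # MAX would never allow this branch: prune it
        return value
```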
MCTS
Monte Carlo Tree Search
MCTS:
Explores the tree using a stochastic process (Monte Carlo, as in the casino) when the tree is too large to search exhaustively, even with pruning.
MCTS: Four main phases
Selection
Expansion
Simulation
Backpropagation
MCTS: Selection Phase
Algorithm chooses a node to expand.
Upper Confidence Bound (UCB1) is used to balance exploration and exploitation (see slides for the exact expression).
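For reference, the standard UCB1 form is UCB1(i) = w_i/n_i + c·sqrt(ln(N)/n_i); the slides may use a slightly different variant. A minimal sketch:

```python
import math

def ucb1(node_wins, node_visits, parent_visits, c=math.sqrt(2)):
    """UCB1: average reward (exploitation) plus an exploration bonus
    that shrinks as the node is visited more often."""
    if node_visits == 0:
        return float("inf")  # always try unvisited children first
    return (node_wins / node_visits
            + c * math.sqrt(math.log(parent_visits) / node_visits))
```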
MCTS: Expansion Phase
The node is expanded by adding child nodes.
Expansion can be guided by a heuristic or done at random.
MCTS: Simulation Phase
Simulate random moves until a terminal state is reached.
This allows the algorithm to reach the end of the game quickly.
MCTS: Backpropagation
The simulation result is propagated “backwards” up the tree to update the statistics of each ancestor node.
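A minimal sketch of the backpropagation step; `node.parent`, `node.visits` and `node.wins` are assumed attributes of a tree-node class, and the result flip assumes a two-player game with outcomes in {0, 1}.

```python
def backpropagate(node, result):
    """Walk from the simulated leaf back to the root, updating the
    statistics that UCB1 reads during the selection phase."""
    while node is not None:
        node.visits += 1
        node.wins += result   # e.g. 1 for a win, 0 for a loss
        result = 1 - result   # flip perspective for the opponent's nodes
        node = node.parent
```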
Supervised Learning
Examples of correct behaviour are provided.
Unsupervised Learning
Data is analysed to extract its properties and structure, without labels.
Self-supervised Learning
The model learns to reconstruct part of the data so that correlations and structures are learnt.
Reinforcement Learning
The system learns from rewards received for its actions.
Best-first search (greedy)
f(n) = h(n), i.e. expand the node that seems closest to the goal.