Week 2 - Search with A* & MCTS Flashcards

1
Q

Intelligent agents

A

Perceive their environment and act on it.

Environments can be:
Fully/partially observable
Deterministic/stochastic
Episodic/sequential
Static/dynamic
Discrete/continuous

2
Q

Static vs learning agents

A

Static agents have all their information at the start.
Learning agents improve their behaviour from experience.

3
Q

Breadth first search

A

Expand all nodes at the current depth before moving on to the next layer.
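
A minimal breadth-first sketch in Python, assuming the graph is an adjacency dict mapping each node to its neighbours (all names here are illustrative, not from the slides):

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: expand nodes layer by layer via a FIFO queue."""
    frontier = deque([start])
    parents = {start: None}          # doubles as the visited set
    while frontier:
        node = frontier.popleft()    # shallowest unexpanded node
        if node == goal:
            path = []                # walk parent links back to the start
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for child in graph.get(node, []):
            if child not in parents:
                parents[child] = node
                frontier.append(child)
    return None                      # goal unreachable
```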

4
Q

Uniform cost search

A

Expand the node with the lowest path cost g(n).
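
A minimal uniform-cost sketch, assuming `graph[node]` yields `(neighbour, step_cost)` pairs (illustrative representation):

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Expand the frontier node with the lowest path cost g(n) first."""
    frontier = [(0, start)]          # priority queue ordered by g(n)
    best_g = {start: 0}
    while frontier:
        g, node = heapq.heappop(frontier)
        if node == goal:
            return g
        if g > best_g.get(node, float("inf")):
            continue                 # stale queue entry, already improved
        for child, step in graph.get(node, []):
            new_g = g + step
            if new_g < best_g.get(child, float("inf")):
                best_g[child] = new_g
                heapq.heappush(frontier, (new_g, child))
    return None
```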

5
Q

Depth first search

A

Always expand the deepest unexpanded node.

Can commit to deep paths and under-explore the rest of the tree, though.
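
A minimal recursive sketch, with the same assumed adjacency-dict graph as the breadth-first example:

```python
def dfs(graph, start, goal, visited=None):
    """Depth-first search: always follow the deepest unexpanded node."""
    if visited is None:
        visited = {start}
    if start == goal:
        return [start]
    for child in graph.get(start, []):
        if child not in visited:
            visited.add(child)
            path = dfs(graph, child, goal, visited)
            if path:
                return [start] + path
    return None                      # dead end on this branch
```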

6
Q

Bidirectional Search

A

Search from both the initial and goal state, hoping that they meet in the middle.
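
A minimal sketch assuming an undirected adjacency dict; to stay short it only reports whether the two frontiers meet, not the path:

```python
from collections import deque

def bidirectional_search(graph, start, goal):
    """Grow one BFS layer at a time from each end until the frontiers meet."""
    if start == goal:
        return True
    seen_fwd, seen_bwd = {start}, {goal}
    frontier_fwd, frontier_bwd = deque([start]), deque([goal])
    while frontier_fwd and frontier_bwd:
        # Always expand the smaller frontier one full layer.
        frontier, seen, other = (
            (frontier_fwd, seen_fwd, seen_bwd)
            if len(frontier_fwd) <= len(frontier_bwd)
            else (frontier_bwd, seen_bwd, seen_fwd)
        )
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for child in graph.get(node, []):
                if child in other:   # the two searches have met in the middle
                    return True
                if child not in seen:
                    seen.add(child)
                    frontier.append(child)
    return False
```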

7
Q

Informed (Heuristic) Search

A

Use an estimate of the cost from the current state to the goal.

E.g. a navigation heuristic from Loughborough to Edinburgh might be "head north".
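
A common concrete heuristic on a grid is Manhattan distance; a minimal sketch assuming states are (x, y) tuples:

```python
def manhattan_distance(state, goal):
    """Grid-world heuristic: never overestimates if each move costs >= 1."""
    (x1, y1), (x2, y2) = state, goal
    return abs(x1 - x2) + abs(y1 - y2)
```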

8
Q

A* Search

A

Combines the cost to get to a node with estimated cost to goal.

f(n) = g(n) + h(n)
(g(n) is the cost from the initial state to n)
(h(n) is the estimated cost from n to the goal)
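
A minimal A* sketch, using the same assumed `(neighbour, step_cost)` graph as the uniform-cost example, with the heuristic `h` passed in as a function:

```python
import heapq

def a_star(graph, start, goal, h):
    """Expand the node with the lowest f(n) = g(n) + h(n)."""
    frontier = [(h(start), 0, start)]    # entries are (f, g, node)
    best_g = {start: 0}
    while frontier:
        f, g, node = heapq.heappop(frontier)
        if node == goal:
            return g                     # optimal cost if h is admissible
        if g > best_g.get(node, float("inf")):
            continue                     # stale queue entry
        for child, step in graph.get(node, []):
            new_g = g + step
            if new_g < best_g.get(child, float("inf")):
                best_g[child] = new_g
                heapq.heappush(frontier, (new_g + h(child), new_g, child))
    return None
```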

9
Q

A* Search Conditions for optimality

A

h(n) must be an admissible heuristic (never overestimate the true cost to the goal).
Consistency: h(n) <= c(n, n') + h(n') for every successor n', so the estimated total cost f(n) never decreases along a path.

10
Q

When is each (tree/graph) version of A* optimal?

A

Tree - if h(n) admissible
Graph - if h(n) is consistent

11
Q

A* search always expands the… (cost)

A

node with the lowest overall estimated cost, f(n)

12
Q

Minimax

A

MAX tries to maximise the score.
MIN tries to minimise MAX's score.

Equivalently: MAX maximises the worst-case outcome, assuming MIN plays optimally.
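
A minimal minimax sketch; `is_terminal`, `utility` and `successors` are hypothetical game helpers, not from the slides:

```python
def minimax(state, maximising):
    """Return the minimax value of `state` for the player to move."""
    if is_terminal(state):                     # hypothetical helper
        return utility(state)                  # hypothetical helper
    values = [minimax(child, not maximising)
              for child in successors(state)]  # hypothetical helper
    return max(values) if maximising else min(values)
```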

13
Q

Alpha beta pruning

A

Prune (skip) branches that cannot affect the final minimax decision.

Gives the same result as full minimax, but faster.
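
The same minimax sketch with alpha-beta cut-offs added (same hypothetical helpers as above):

```python
def alphabeta(state, maximising, alpha=float("-inf"), beta=float("inf")):
    """Minimax value with branches outside the (alpha, beta) window skipped."""
    if is_terminal(state):
        return utility(state)
    if maximising:
        value = float("-inf")
        for child in successors(state):
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)    # best MAX can guarantee so far
            if alpha >= beta:
                break                    # MIN would never let play reach here
        return value
    value = float("inf")
    for child in successors(state):
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)          # best MIN can guarantee so far
        if alpha >= beta:
            break                        # MAX would never let play reach here
    return value
```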

14
Q

MCTS

A

Monte Carlo Tree Search

15
Q

MCTS: when is it used?

A

Explores the tree using a stochastic process (Monte Carlo, as in the casino) when the tree is too large to explore exhaustively, even with pruning.

16
Q

MCTS: Four main phases

A

Selection
Expansion
Simulation
Backpropagation
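
A compact sketch of the loop tying the four phases together; the game API (`legal_moves`, `apply`, `is_terminal`, `result`) is hypothetical, and the reward is scored from a single player's perspective for brevity (a two-player version would flip it at each ply):

```python
import math
import random

def mcts(root_state, iterations=1000, c=1.4):
    """Run MCTS iterations and return the most-visited child of the root."""
    def make_node(state, parent):
        return {"state": state, "parent": parent, "children": [],
                "visits": 0, "reward": 0.0, "untried": legal_moves(state)}

    def uct(node):
        # Upper Confidence Bound: exploitation term + exploration term.
        return (node["reward"] / node["visits"]
                + c * math.sqrt(math.log(node["parent"]["visits"])
                                / node["visits"]))

    root = make_node(root_state, None)
    for _ in range(iterations):
        # 1. Selection: descend by UCT while nodes are fully expanded.
        node = root
        while not node["untried"] and node["children"]:
            node = max(node["children"], key=uct)
        # 2. Expansion: add one child for an untried move.
        if node["untried"]:
            move = node["untried"].pop()
            child = make_node(apply(node["state"], move), node)
            node["children"].append(child)
            node = child
        # 3. Simulation: random playout to a terminal state.
        state = node["state"]
        while not is_terminal(state):
            state = apply(state, random.choice(legal_moves(state)))
        reward = result(state)
        # 4. Backpropagation: update statistics back up to the root.
        while node is not None:
            node["visits"] += 1
            node["reward"] += reward
            node = node["parent"]
    return max(root["children"], key=lambda n: n["visits"])
```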

17
Q

MCTS: Selection Phase

A

The algorithm chooses which node to expand next.

The Upper Confidence Bound (UCB1) is used to balance exploration and exploitation:
UCT(n) = Q(n)/N(n) + c * sqrt(ln N(parent(n)) / N(n))
(Q is the total reward of n, N is the visit count; see slides for the exact expression used in the course.)

18
Q

MCTS: Expansion Phase

A

The node is expanded by adding child nodes.

Can be guided by a heuristic or done at random.

19
Q

MCTS: Simulation Phase

A

Simulate random game playouts until a terminal condition is reached.

This allows the algorithm to reach the end of the game quickly.

20
Q

MCTS: Backpropagation

A

The result of the simulation is used to update the statistics of every node on the path back to the root ("backwards" through the tree).

21
Q

Supervised Learning

A

Examples of correct behaviour are provided.

22
Q

Unsupervised Learning

A

Data is analysed to extract its properties.

23
Q

Self-supervised Learning

A

The model learns to reconstruct part of the data, so that correlations and structures are learnt.

24
Q

Reinforcement Learning

A

The system learns from rewards.

25
Q

Best first search

A

f(n) = h(n)

(Greedy best-first: expand the node with the lowest heuristic estimate to the goal, ignoring the path cost so far.)