State Space Search Flashcards
state space search
represented using a search tree or search graph; it is about finding a sequence of actions that leads from an initial state/position to a goal state without knowing the route in advance
state
representation of the problem at a given point in time
actions
set of all possible moves
transition model
describes the result of applying an action to a state, given the environment (rules)
state-space
math representation of all possible states that a system can be in with actions to go from one to another
completeness
always finding a solution if one exists
optimality
finding the least costly solution
time complexity in terms of big O
measured in nodes generated/expanded
how run time grows as a function of input size
space complexity in terms of big O
max nodes in memory
how the memory required grows as a function of input size
uninformed search
single agent
systematically exploring state space without use of knowledge to guide the agent until finds goal state
inefficient and unintelligent
uninformed search has the following features
initial state
transition model (rules)
set of possible actions
goal test
node in state space search has
represents a state in the search tree, including extra info (the action that led to it, its parent node, depth, costs)
path vs step cost
path cost is the total cost associated with a particular sequence of actions
step cost is the cost incurred moving from one state to another, i.e. taking a single action
how is search strategy determined?
determined by the order in which nodes in the fringe are expanded
if the search algorithm always expands the first node in the fringe, the strategy is determined by
how new nodes are added to the fringe
otherwise, the strategy is determined by
which node is picked to expand
4 types of tree searches
Breadth first
depth first
depth limited
iterative deepening
BFS, what, complete, optimal and space/time
expands nodes in the order they were added to the fringe (FIFO queue)
works across all nodes at one depth before going deeper
complete: yes, will find a solution if one exists
not optimal in general (optimal only if all step costs are equal)
space and time exponential
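The FIFO behaviour above can be sketched in Python (a minimal illustration; the `neighbors` callback and the toy graph used with it are hypothetical):

```python
from collections import deque

def bfs(start, goal, neighbors):
    """Breadth-first search: FIFO fringe, shallowest nodes expanded first."""
    fringe = deque([[start]])          # fringe holds whole paths
    visited = {start}
    while fringe:
        path = fringe.popleft()        # oldest node added = first expanded
        state = path[-1]
        if state == goal:
            return path
        for nxt in neighbors(state):
            if nxt not in visited:
                visited.add(nxt)
                fringe.append(path + [nxt])
    return None                        # finite space exhausted, no solution
```

Because the queue is FIFO, every node at depth d is expanded before any node at depth d+1, which is what makes BFS complete.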
DFS, what, complete, optimal and space/time
explores nodes as far as possible along a branch before backtracking, via a stack (LIFO)
complete on a finite tree; not complete if the depth is infinite
not optimal
space linear
time expo
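The same sketch with a LIFO stack instead of a FIFO queue turns BFS into DFS (minimal illustration; the `neighbors` callback is hypothetical):

```python
def dfs(start, goal, neighbors):
    """Depth-first search via an explicit LIFO stack."""
    stack = [[start]]
    while stack:
        path = stack.pop()             # newest node added = first expanded
        state = path[-1]
        if state == goal:
            return path
        for nxt in neighbors(state):
            if nxt not in path:        # avoid cycles along the current path
                stack.append(path + [nxt])
    return None
```

Space stays linear because only the current branch (plus its siblings) sits on the stack, not a whole depth level.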
DLS what, complete, optimal and space/time
DFS with a depth limit
prevents infinite descent down unbounded branches
complete only if the limit is deep enough to reach a solution
not optimal
space linear
time expo
IDS, what, complete, optimal
combines the space efficiency of DFS with the completeness of BFS by running DLS with increasing limits 0, 1, 2, ...
complete
not optimal in general
space linear
time expo
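DLS and IDS can be sketched together, since IDS is just DLS in a loop (minimal illustration; the `neighbors` callback is hypothetical):

```python
def dls(state, goal, neighbors, limit, path=None):
    """Depth-limited DFS: recurse no deeper than `limit`."""
    path = path or [state]
    if state == goal:
        return path
    if limit == 0:                     # cut off: don't descend further
        return None
    for nxt in neighbors(state):
        if nxt not in path:
            found = dls(nxt, goal, neighbors, limit - 1, path + [nxt])
            if found:
                return found
    return None

def ids(start, goal, neighbors, max_depth=20):
    """Iterative deepening: run DLS with limit 0, 1, 2, ..."""
    for limit in range(max_depth + 1):
        found = dls(start, goal, neighbors, limit)
        if found:
            return found
    return None
```

Re-expanding shallow nodes at every iteration looks wasteful, but the deepest level dominates the exponential total, so the overhead is a constant factor.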
tree vs graph search
tree search explores hierarchical structures with a single root; simple, but doesn't check for repeated states, so it wastes time and space
graph search explores connections accounting for cycles; keeps a closed list of the states expanded so far and only adds a new node if its state is not on the list, but takes additional memory to track visited states
informed search
single agent
systematically exploring state space with use of knowledge to prioritise node expansion on those that are more likely to lead it to the goal state
informed search features
initial state
set of possible actions
transition model (rules)
goal test (evaluation function to evaluate with respect to reaching goal)
informed search picks the node to expand with the
best evaluation function
3 types of best-first search
uniform cost search
greedy search
A*
uniform cost search, what, how, optimal, complete
variant of BFS prioritising nodes based on their path cost
expands nodes in increasing order of path cost (lowest first); only looks back at the cost accumulated so far, g(n)
finds optimal route
complete if every action cost > 0, but still exponential in space and time
explores lots of possibilities and returns the optimal route
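A minimal sketch of this, assuming `neighbors(state)` yields `(next_state, step_cost)` pairs (a hypothetical callback, as is the toy graph used to test it):

```python
import heapq

def ucs(start, goal, neighbors):
    """Uniform-cost search: expand the fringe node with the lowest path cost g(n)."""
    fringe = [(0, start, [start])]     # priority queue ordered by g(n)
    best = {}                          # cheapest g(n) seen per state
    while fringe:
        cost, state, path = heapq.heappop(fringe)
        if state == goal:
            return cost, path
        if state in best and best[state] <= cost:
            continue                   # already expanded more cheaply
        best[state] = cost
        for nxt, step in neighbors(state):
            heapq.heappush(fringe, (cost + step, nxt, path + [nxt]))
    return None
```

Note that the cheap two-step route beats the expensive direct edge, which a plain BFS (counting edges, not cost) would miss.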
greedy search, what, how, optimal, complete
only looks forward, at the estimated cost to the goal state
chooses the node to expand with the lowest heuristic value h(n)
won't give the optimal route like UCS, but usually finds the goal state a lot quicker
not optimal, as it might prioritise nodes that lead to dead ends
heuristic, what and how used
estimates how promising a particular state or action is for reaching the goal
aim to reduce the search space by prioritizing certain paths.
A*, what complete, optimal
uses the actual accumulated cost so far, g(n), plus a heuristic estimate, h(n), of the cost to reach the goal node: f(n) = g(n) + h(n)
complete unless there are infinitely many nodes
optimal with tree search if the heuristic is admissible, and with graph search if it is consistent
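A sketch tying the three best-first variants together: ordering the fringe by f(n) = g(n) + h(n) gives A*; with h = 0 it degenerates to uniform-cost, and ordering by h alone gives greedy. The `neighbors` callback, toy graph, and heuristic table are hypothetical:

```python
import heapq

def astar(start, goal, neighbors, h):
    """A* search: order the fringe by f(n) = g(n) + h(n)."""
    fringe = [(h(start), 0, start, [start])]   # (f, g, state, path)
    seen = {}                                  # cheapest g(n) seen per state
    while fringe:
        f, g, state, path = heapq.heappop(fringe)
        if state == goal:
            return g, path
        if state in seen and seen[state] <= g:
            continue
        seen[state] = g
        for nxt, step in neighbors(state):
            heapq.heappush(fringe,
                           (g + step + h(nxt), g + step, nxt, path + [nxt]))
    return None
```

The heuristic in the test below is admissible (never overestimates the true remaining cost), so the returned path is least-cost.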
adversarial search
multi-agent
finds the optimal strategy for one player while anticipating the opponent's moves
models the consequences of an action by modelling what action the other player may take
essential in zero-sum games (when one player gains, the other loses, so utilities sum to zero)
adversarial search features
initial state
possible actions
transition model
terminal test
utility function (agent’s subjective value of terminal state)
agent chooses move to max its utility
what is the algorithm used in adversial search and how does it work?
minimax
search the tree until terminal states are reached
note the utility of each terminal state (e.g. +1, -1, 0)
back up values of nodes to parents selecting max value on MAX’s move and min value on MIN’s move
continue backwards until get values for children of root node
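The backing-up procedure above can be sketched as a short recursive function (a minimal illustration; the `moves`, `utility`, and `is_terminal` callbacks and the toy tree are hypothetical):

```python
def minimax(state, is_max, moves, utility, is_terminal):
    """Full-depth minimax: back terminal utilities up to the root."""
    if is_terminal(state):
        return utility(state)          # leaf: report its utility
    values = [minimax(s, not is_max, moves, utility, is_terminal)
              for s in moves(state)]   # players alternate every ply
    return max(values) if is_max else min(values)
```

In the test tree, MAX moves at the root and MIN replies, so each subtree backs up its minimum, and the root picks the larger of those minima.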
admissible vs consistent heuristic
admissible never overestimates the cost of the path from the current state to the goal (can underestimate but never overestimate)
consistent: the estimate at a node is always less than or equal to the step cost of reaching any neighbour plus that neighbour's estimate, i.e. h(n) <= c(n, n') + h(n')
every consistent h is ____ but
admissible but not all admissible are consistent
optimality of A* using heuristics TS vs GS
tree search: with an admissible heuristic it will find the least-cost path to the goal
graph search: with a consistent heuristic it additionally avoids redundant re-explorations
2 common heuristics
1) straight-line (Euclidean) distance, which is both admissible and consistent
2) Manhattan distance: in a tile game, the number of moves needed to get from one point to the other without diagonal moves
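Both distances are one-liners on (x, y) grid coordinates (a minimal sketch; the function names are my own):

```python
import math

def straight_line(a, b):
    """Euclidean distance between points (x, y): admissible and consistent."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def manhattan(a, b):
    """Manhattan distance: grid moves needed with no diagonals allowed."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])
```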
depth limited minimax
same, but only search until the depth limit is reached, applying a heuristic evaluation function to the state at each leaf
alpha beta pruning
optimisation technique used in minimax algorithm for eliminating branches that don’t need to be explored
what values ab pruning holds?
Alpha (α): the best score that the maximizing player (Max) can guarantee so far. Initially set to a very low value (negative infinity).
Beta (β): the best score that the minimizing player (Min) can guarantee so far. Initially set to a very high value (positive infinity).
how ab pruning works
start with α = -inf and β = +inf
DFS of the game tree
at each maximiser node, update α with the max among children
at each minimiser node, update β with the min among children
prune during traversal: if at any point α >= β, prune the remaining branches of the subtree, as they can never affect the result
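The steps above can be sketched as minimax with the two bounds threaded through the recursion (a minimal illustration; the callbacks and toy tree are hypothetical):

```python
def alphabeta(state, is_max, moves, utility, is_terminal,
              alpha=float('-inf'), beta=float('inf')):
    """Minimax with alpha-beta pruning: stop expanding once alpha >= beta."""
    if is_terminal(state):
        return utility(state)
    if is_max:
        value = float('-inf')
        for s in moves(state):
            value = max(value, alphabeta(s, False, moves, utility,
                                         is_terminal, alpha, beta))
            alpha = max(alpha, value)          # best MAX can guarantee
            if alpha >= beta:                  # MIN above won't allow this
                break
        return value
    value = float('inf')
    for s in moves(state):
        value = min(value, alphabeta(s, True, moves, utility,
                                     is_terminal, alpha, beta))
        beta = min(beta, value)                # best MIN can guarantee
        if alpha >= beta:
            break
    return value
```

On the test tree below, after the first subtree backs up 3, the second subtree is cut off as soon as a leaf worth 2 appears, yet the root value matches plain minimax.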
why is ab pruning helpful?
increases the efficiency of the algorithm by pruning redundant branches; the savings grow as depth increases
stochastic search
search algorithm that uses randomness/probability for decision making; doesn't care about getting from A to B, it just wants to find the best state according to some criterion
inspired by genetics and natural selection
features of stochastic search
state space
fitness function
stochastic search pros and cons
searches large search space
not guaranteed to find global optimum
genetic algorithm
stochastic search inspired by biological evolution over a state space
GA state space
space of genetic variation in population
GA fitness function
evaluates how close solution is to optimal solution
GA chromosome
representation of a state in a GA, often encoded as a string of numbers (analogous to a DNA sequence) capturing the characteristics of an individual
GA crossover
genetic operation combining 2 parent chromosomes to create one new one
mutation
randomly modify child’s chromosomes
population
set of chromosomes at a given iteration of the algorithm
diversity
variety of solutions within the population
high diversity avoids premature convergence to local optima
elitism
retaining fittest individuals for next generation
7 steps for GA
create a population of solutions
evaluate how fit each solution is
randomly select 2 parents, but in a way that fitter states are more likely to be picked
combine the parents into offspring
mutation: randomly modify the offspring to maintain diversity
replace the old population with the new one
repeat the process until a stopping criterion is met
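The seven steps can be sketched as one loop (a minimal illustration; the `fitness`, `crossover`, and `mutate` callbacks are hypothetical and supplied by the problem):

```python
import random

def genetic_algorithm(pop, fitness, crossover, mutate, generations=40, elite=1):
    """Minimal GA loop: select, cross over, mutate, replace; keep the elite."""
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        new_pop = ranked[:elite]                   # elitism: fittest survive as-is
        weights = [fitness(c) for c in ranked]     # fitness-proportionate selection
        while len(new_pop) < len(pop):
            p1, p2 = random.choices(ranked, weights=weights, k=2)
            new_pop.append(mutate(crossover(p1, p2)))
        pop = new_pop
    return max(pop, key=fitness)
```

Elitism guarantees the best fitness never decreases between generations, which is what the test below checks on a toy maximise-the-ones problem.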
epoch
one entire pass of the training data through the algorithm
2 parent selection options
roulette wheel selection
tournament selection
roulette wheel selection
stochastic selection
each individual is given slice of wheel proportional to its fitness and wheel is spun
tournament selection
select a subgroup of individuals at random for a tournament; the fittest in the group is selected
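Both selection schemes fit in a few lines (a minimal sketch; the function names and toy fitness are my own):

```python
import random

def roulette(pop, fitness):
    """Spin the wheel: each individual gets a slice proportional to its fitness."""
    total = sum(fitness(c) for c in pop)
    pick = random.uniform(0, total)            # where the wheel stops
    acc = 0.0
    for c in pop:
        acc += fitness(c)
        if acc >= pick:
            return c
    return pop[-1]                             # guard against rounding

def tournament(pop, fitness, k=3):
    """Sample k individuals at random and keep the fittest of them."""
    return max(random.sample(pop, k), key=fitness)
```

Raising k in `tournament` increases selection pressure; k = 1 is pure random choice.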
RW vs TS pros/cons
RW may lead to rapid convergence if one individual is significantly fitter than the rest, but is otherwise good for preserving diversity
TS promotes the fittest while still giving others a chance
more greedy; selection pressure grows with tournament size
3 types of point crossovers
single point: all genes before the point come from one parent, the rest from the other
k-point: multiple such cut points, alternating parents between them
uniform: each individual gene is picked at random from either parent
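The three operators on list chromosomes (a minimal sketch; the function names are my own):

```python
import random

def single_point(p1, p2, point):
    """Genes before `point` come from p1, the rest from p2."""
    return p1[:point] + p2[point:]

def k_point(p1, p2, points):
    """Alternate parent segments at each cut point."""
    child, parents, prev = [], (p1, p2), 0
    for i, cut in enumerate(list(points) + [len(p1)]):
        child += parents[i % 2][prev:cut]      # even segments from p1, odd from p2
        prev = cut
    return child

def uniform(p1, p2):
    """Each gene is independently picked from either parent."""
    return [random.choice(pair) for pair in zip(p1, p2)]
```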
when would each of these types of state space searches be viable?
state space is not too large
transition model (rules) known: uninformed / informed / adversarial search
a fitness function that ranks states and a sensible crossover point can be formed: GA