Exam 3 Flashcards
Chapters 4 and 5
What is the point of a local search algorithm?
To find a final state, not the path taken to get there (as in the 8-queens problem); only the final configuration matters
What is a state-space landscape?
The state space visualized as a landscape, where each point (state) has an elevation defined by the value of the objective function at that state
What is hill-climbing search?
It’s a local search algorithm that continually moves to the neighboring state with the highest value until it reaches a peak (a state where no neighbor has a higher value)
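A minimal Python sketch of steepest-ascent hill climbing; the integer objective and neighbor function below are hypothetical examples, not from the text:

```python
def hill_climb(start, neighbors, value):
    """Steepest-ascent hill climbing: move to the best neighbor
    until no neighbor improves on the current state."""
    current = start
    while True:
        best = max(neighbors(current), key=value, default=current)
        if value(best) <= value(current):
            return current  # peak reached (possibly only a local maximum)
        current = best

# Hypothetical example: maximize f(x) = -(x - 3)^2 over the integers.
f = lambda x: -(x - 3) ** 2
result = hill_climb(0, lambda x: [x - 1, x + 1], f)
print(result)  # climbs 0 -> 1 -> 2 -> 3 and stops at the peak
```

Note that the same loop gets stuck on any local maximum, which is why the variants on the following cards exist.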
What is a complete-state formulation?
Every state has all the components of a solution but they might not all be in the right place
What is a greedy local search?
A local search algorithm that grabs a good neighbor state without thinking ahead about where to go next (hill climbing is sometimes called greedy local search)
What is stochastic hill climbing?
Hill climbing that chooses at random from among the uphill moves, with the probability of selection varying with the steepness of the uphill move
What is first-choice hill climbing?
Stochastic hill climbing that generates successors randomly until one is generated that is better than the current state
When would you use first-choice hill climbing?
When a state has many (thousands) of successors
What is random-restart hill climbing?
It conducts a series of hill-climbing searches from randomly generated initial states until a goal is found
Does the success of hill climbing depend on the shape of the state-space landscape?
Yes, a good solution is found quickly depending on the landscape and the algorithm used
What is simulated annealing?
A variant of hill climbing / gradient descent (minimizing cost) that starts at a high “temperature”, accepting some bad moves so it can escape local minima, then gradually lowers the temperature toward zero so that bad moves become less and less likely to be accepted
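A sketch of simulated annealing in Python; the geometric cooling schedule, the starting temperature, and the toy cost function are all illustrative assumptions:

```python
import math
import random

def simulated_annealing(start, neighbor, cost, t0=10.0, cooling=0.95, t_min=1e-3):
    """Minimize `cost`: always accept improving moves, and accept a
    worsening move with probability exp(-delta / T), where the
    temperature T shrinks over time (geometric cooling schedule)."""
    current, t = start, t0
    while t > t_min:
        nxt = neighbor(current)
        delta = cost(nxt) - cost(current)
        if delta < 0 or random.random() < math.exp(-delta / t):
            current = nxt
        t *= cooling
    return current

# Hypothetical example: minimize (x - 5)^2 by random +/-1 steps.
random.seed(0)
x = simulated_annealing(0, lambda x: x + random.choice([-1, 1]),
                        lambda x: (x - 5) ** 2)
print(x)
```

Early on (high T) the walk is nearly random; as T approaches zero it behaves like plain descent and settles near the minimum.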
What is local beam search?
It keeps track of k states rather than just one, and allows all parallel search threads to communicate information
What is stochastic beam search?
Local beam search that instead of choosing the top k successors, it chooses successors with a probability proportional to the successor’s value
What are evolutionary algorithms?
Can be seen as variants of stochastic beam search, inspired by biological natural selection; There is a population of individuals (states) in which the fittest (highest value) individuals produce offspring (successor states) that populate the next generation
What is the process that evolutionary algorithms take called?
Recombination
What is a genetic algorithm?
An evolutionary algorithm in which each individual is a string over a finite alphabet, and offspring are formed by recombining substrings of two parents
What makes up evolutionary algorithms?
Population size, individual representation, mixing number (p), selection process, recombination, and mutation rate
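The components on this card can be sketched as a tiny genetic algorithm over bit strings; the population size, generation count, mutation rate, and the “one-max” fitness function are illustrative choices, not from the text:

```python
import random

def genetic_algorithm(fitness, length, pop_size=20, generations=60,
                      mutation_rate=0.05):
    """Tiny genetic algorithm: fitness-proportional selection,
    single-point crossover (recombination), and per-gene mutation."""
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        weights = [fitness(ind) + 1e-9 for ind in pop]  # selection process
        nxt = []
        for _ in range(pop_size):
            p1, p2 = random.choices(pop, weights=weights, k=2)
            cut = random.randrange(1, length)           # crossover point
            child = p1[:cut] + p2[cut:]                 # recombination
            child = [g ^ 1 if random.random() < mutation_rate else g
                     for g in child]                    # mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Hypothetical example: maximize the number of 1-bits ("one-max").
random.seed(1)
best = genetic_algorithm(sum, length=12)
print(sum(best))
```

Each piece maps onto a flashcard component: `pop_size`, the bit-string representation, the weighted selection, the crossover step, and `mutation_rate`.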
What is constrained optimization?
Solutions having to satisfy some hard constraints on the values of variables
What do we assume when we use search algorithms?
A fully observable, deterministic, known environment
What is a belief state?
A set of physical states that the agent believes are possible, usually used in not fully observable and/or nondeterministic environments
What does a conditional plan do?
Specifies what to do depending on what percepts the agent receives while executing the plan
What does an AND-OR search tree consist of?
OR nodes (the agent chooses this action or that one) and AND nodes (a nondeterministic outcome branches, so the agent needs a plan for two or more resulting states at the same time)
What is a cyclic solution?
A plan that keeps retrying a certain action until it works, used when that action isn’t guaranteed to succeed on any single attempt
What is a sensorless problem?
The agent’s percepts provide no information at all
What is localization?
An agent working out where it is, given a map of its environment and a sequence of percepts and actions
What is an offline search algorithm?
Agents that compute a complete solution before taking their first action
What is an online search algorithm?
Agents that interleave computation and action; first it takes an action, then observes the environment, then computes the next action
What are online search algorithms best for?
In dynamic/semi-dynamic environments where an agent should not compute for too long, as well as nondeterministic domains
What is the mapping problem?
A robot is placed in an unknown building and must explore to build a map that can later be used for getting from A to B, usually an example of online search
What is the competitive ratio?
The ratio of the total cost of the path the agent actually travels to the cost of the path it would take if it knew the search space in advance; smaller is better
What is the adversary argument?
Imagining an adversary that constructs the state space while the agent explores it, placing goals and dead ends wherever it chooses
What makes a state space safely explorable?
Some goal state is reachable from every reachable state
What is a random walk?
The agent simply selects at random one of the available actions from the current state, with preference to unexplored actions being possible
What is optimism under uncertainty?
The assumption that an unexplored state will lead immediately to the goal with the least possible cost
What is a competitive environment?
Two or more agents having conflicting goals, which creates adversarial problems
How does an economy stance work in multi-agent environments?
By considering the aggregate of a very large number of agents: for example, increasing demand will cause prices to rise in general, without our having to predict the actions of any individual agent
How does pruning work?
It “cuts off” (ignores) portions of the search tree that make no difference to the optimal move, saves time
What is an evaluation function?
It estimates who is winning based on features of the state
What is considered a game with imperfect information?
Games where not all possible information is known, such as card hands as in UNO or Poker
What does zero-sum mean?
What is good for one player is just as bad for the other; no win-win outcome
What does a game consist of?
Initial state (s0), To-Move(s), Actions(s), Result(s,a) (the transition model), Is-Terminal(s) (the terminal test), and Utility(s,p)
What is a complete game tree?
A search tree that follows every sequence of moves all the way to a terminal state
What is minimax search?
An algorithm for two-player game trees that alternates between levels of moves by each player (each level is one ply), with each state assigned a minimax value: its utility assuming both players play optimally from there, from MAX’s point of view. It can be thought of as working backwards from the terminal states to the root
What is the minimax decision?
The optimal choice for max that leads to the state with the highest minimax value
Describe the minimax search properties
Time complexity: O(b^m); space complexity: O(bm), where b is the branching factor and m the maximum depth of the tree
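The backing-up of minimax values can be sketched on a game tree given as nested lists; the tree values below are an illustrative two-ply example:

```python
def minimax(node, maximizing=True):
    """Minimax value of a game tree given as nested lists: leaves are
    utilities (for MAX), internal nodes are lists of children.
    MAX and MIN levels alternate as we descend."""
    if not isinstance(node, list):       # terminal state
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Two-ply example: MAX moves, then MIN moves, then the game ends.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))  # MIN values are 3, 2, 2; MAX picks the best -> 3
```

The exhaustive recursion over every child is what gives the O(b^m) time bound on the card above.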
What is alpha-beta pruning?
Similar to minimax, but it prunes branches that cannot affect the optimal decision; it returns the same move as minimax while saving time
What do the alpha and beta values mean in minimax?
Alpha is the value of the best choice found so far along the path for the MAX player; beta is the value of the best choice found so far for the MIN player
What does the effectiveness of alpha-beta pruning depend on?
The order in which the states are examined
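A sketch of alpha-beta pruning on the same nested-list tree representation used for the minimax card; the example tree is illustrative:

```python
def alphabeta(node, maximizing=True, alpha=float('-inf'), beta=float('inf')):
    """Minimax with alpha-beta pruning: alpha tracks the best value found
    so far for MAX, beta the best for MIN; a node stops expanding children
    as soon as its value cannot affect the decision (alpha >= beta)."""
    if not isinstance(node, list):       # terminal state
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                    # cutoff: MIN will avoid this branch
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break                        # cutoff: MAX will avoid this branch
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # same answer as plain minimax (3), fewer leaves visited
```

Examining the strongest moves first makes the cutoffs fire sooner, which is why move ordering determines the pruning’s effectiveness.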
What are transpositions?
Different permutations of the move sequence that end up in the same position
What is the Type A strategy?
Considering all possible moves to a certain depth in the search tree, and then using a heuristic evaluation function to estimate the utility of states at that depth (explores wide but shallow portion)
What is the Type B strategy?
Ignores moves that look bad and follows promising lines “as far as possible” (explores deep but narrow portion)
What is the cutoff test?
A test that returns true for terminal states; otherwise it is free to decide when to cut off the search (for example, at a fixed depth)
How does the evaluation function vary?
For terminal states, Eval(s,p) = Utility(s,p); For nonterminal states, Utility(loss,p) <= Eval(s,p) <= Utility(win,p)
What do most evaluation functions work with?
Features of the states, which are the characteristics of the game
How does a weighted linear function work?
Eval(s) = w1·f1(s) + … + wn·fn(s); each feature value is multiplied by its weight (a measure of how important that feature is) and the products are summed
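The formula on this card is a one-liner in Python; the chess-like state, features, and weights below are hypothetical examples:

```python
def weighted_linear_eval(state, features, weights):
    """Eval(s) = w1*f1(s) + ... + wn*fn(s): each feature value is
    multiplied by its weight and the products are summed."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Hypothetical material-count features for a chess-like game.
state = {"my_pawns": 5, "my_queens": 1}
features = [lambda s: s["my_pawns"], lambda s: s["my_queens"]]
weights = [1.0, 9.0]   # a queen counts for roughly nine pawns
print(weighted_linear_eval(state, features, weights))  # 5*1 + 1*9 = 14.0
```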
What is a quiescent position?
Positions in which there is no pending move that would wildly swing the evaluation
What is the horizon effect?
When an opponent’s move will cause serious damage and is ultimately unavoidable but can be temporarily avoided by using delaying tactics
What are singular extensions?
Moves that are “clearly better” than all other moves in a given position, even when the search would normally be cut off at that point
What is forward pruning?
Pruning that discards moves that appear to be poor but could possibly be good ones (it is a Type B strategy)
What is the late move reduction?
It reduces the depth to which moves that appear late in the (ordered) move list are searched, saving time; if the reduced search returns a value above alpha, the search is re-run at full depth
What is the Monte Carlo tree search?
It estimates the value of a state as the average utility over a number of simulations (playouts) of complete games starting from that state
What is the playout policy?
A policy that biases the moves chosen during playouts toward good moves, rather than choosing uniformly at random
What is the pure Monte Carlo search?
Do N simulations starting from the current state of the game, and track which of the possible moves from the current position has the highest win percentage
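A sketch of pure Monte Carlo search; `play_random_game` is an assumed helper that plays a game out randomly and returns 1 for a win, and the toy “game” with fixed win probabilities is purely illustrative:

```python
import random

def pure_monte_carlo(state, legal_moves, play_random_game, n=600):
    """Pure Monte Carlo search: run n random playouts split across the
    legal moves, then return the move with the highest win percentage."""
    moves = legal_moves(state)
    wins = {m: 0 for m in moves}
    plays = {m: 0 for m in moves}
    for i in range(n):
        m = moves[i % len(moves)]        # spread simulations over moves
        wins[m] += play_random_game(state, m)
        plays[m] += 1
    return max(moves, key=lambda m: wins[m] / plays[m])

# Toy game: each "move" wins with a fixed (hidden) probability.
random.seed(2)
win_prob = {"a": 0.2, "b": 0.7, "c": 0.4}
best = pure_monte_carlo(None, lambda s: ["a", "b", "c"],
                        lambda s, m: int(random.random() < win_prob[m]))
print(best)  # the move with the best empirical win rate
```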
What is a selection policy?
It focuses computational resources on the important parts of the game tree, balancing exploration of states that have had few playouts against exploitation of states that have done well in past playouts
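A common concrete selection policy is the UCB1 formula used in UCT; a sketch, where the exploration constant c = 1.4 (close to sqrt(2)) is a conventional illustrative choice:

```python
import math

def ucb1(node_wins, node_plays, parent_plays, c=1.4):
    """UCB1 value of a node: average utility (exploitation) plus an
    exploration bonus that is larger for rarely tried nodes."""
    if node_plays == 0:
        return float('inf')              # always try unvisited nodes first
    return node_wins / node_plays + c * math.sqrt(
        math.log(parent_plays) / node_plays)

# A node with fewer playouts gets a larger exploration bonus:
print(ucb1(6, 10, 100))   # well-explored node
print(ucb1(2, 3, 100))    # little-explored node, bigger bonus
```

The selection phase descends the tree by repeatedly picking the child with the highest UCB1 value.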
What is early playout termination?
Stopping a playout that is taking too many moves; evaluate it with a heuristic evaluation function or just declare it a draw
What are stochastic games?
Games with a bit of unpredictability caused by a random element (such as throwing dice)
What are chance nodes?
Often shown as circles in game trees, they denote all possible outcomes of an action that depends on chance (for example, all possible dice rolls)
What is the expected value of a position?
The average over all possible outcomes of the chance nodes, weighted by the probability of each outcome
What is the expectiminimax value?
A generalization of the minimax value (defined for deterministic games) to games with chance nodes: MAX and MIN nodes work as in minimax, while a chance node’s value is the probability-weighted average of its children’s values
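A sketch of the expectiminimax recursion on a small tagged tree; the node encoding and the coin-flip example are illustrative assumptions:

```python
def expectiminimax(node):
    """Expectiminimax on a tree of tagged nodes:
    ('max', children), ('min', children), ('chance', [(prob, child), ...]),
    or a bare number for a terminal utility. Chance nodes return the
    probability-weighted average of their outcomes."""
    if not isinstance(node, tuple):
        return node                      # terminal utility
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    return sum(p * expectiminimax(c) for p, c in children)  # chance node

# MAX chooses between a guaranteed 3 and a fair coin flip between 0 and 10:
tree = ('max', [3, ('chance', [(0.5, 0), (0.5, 10)])])
print(expectiminimax(tree))  # 0.5*0 + 0.5*10 = 5.0 beats the sure 3
```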