Panik Flashcards
table-lookup agent
mapping indexed by percept sequences. Each action is just an entry in that table.
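A minimal Python sketch of such an agent (class name and table contents are made up for illustration):

# Table-driven agent: behaviour is a dictionary keyed by the whole percept sequence.
class TableDrivenAgent:
    def __init__(self, table):
        self.table = table          # maps tuple of percepts seen so far -> action
        self.percepts = []

    def act(self, percept):
        self.percepts.append(percept)
        return self.table.get(tuple(self.percepts), "noop")

# Toy vacuum-world table.
agent = TableDrivenAgent({
    (("A", "dirty"),): "suck",
    (("A", "clean"),): "right",
    (("A", "clean"), ("B", "dirty")): "suck",
})
print(agent.act(("A", "clean")))   # -> right
print(agent.act(("B", "dirty")))   # -> suck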
reflex agent
condition-action rules that are pattern-matched against the current percept.
Goal-oriented agents
try to fulfil (binary) goals by imagining the outcome of actions.
Utility-based agents
have a utility function that rates the expected outcomes of different actions by how desirable they are.
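A one-function Python sketch of the selection rule (all argument names are hypothetical): pick the action whose expected outcome utility is maximal.

# Expected-utility action selection (sketch; the caller supplies the models).
def choose_action(actions, outcomes, probability, utility):
    # actions: available actions; outcomes(a): possible result states of a;
    # probability(s, a): chance of ending in state s after a; utility(s): desirability of s.
    def expected_utility(a):
        return sum(probability(s, a) * utility(s) for s in outcomes(a))
    return max(actions, key=expected_utility)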
Learning agents
improve the system by changing and retrieving knowledge from the performance element
accessible
All relevant aspects of the world are available to the sensors.
deterministic
The next state depends completely on the current state and chosen action.
episodic
The choice of an action depends only on the current state (not on the past).
static
The world does not change while deciding on an action.
discrete
There are only finitely many distinct world states within any given range.
successor function S ( x )
returns the set of states reachable by any action from state x.
path
a sequence of actions leading from one state to another.
Single-state problem
World knowledge complete, action knowledge complete.
Multiple-state problem
World knowledge incomplete, action knowledge complete.
Contingency problem
World knowledge to be found at run-time, action knowledge incomplete.
Exploration problem
World knowledge unknown, action knowledge unknown.
Completeness:
Always finds a solution if it exists.
(blind) search
uninformed
(heuristic) search
informed
Iterative deepening
combines the completeness of BFS with the low memory use of DFS. It executes depth-limited search with increasing depth limits until a solution is found.
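A compact Python sketch (names are illustrative, not from the course), using a successor function like the S(x) defined above:

# Depth-limited DFS, re-run with increasing depth limits (iterative deepening).
def depth_limited_search(state, is_goal, successors, limit, path=None):
    path = path or [state]
    if is_goal(state):
        return path
    if limit == 0:
        return None
    for nxt in successors(state):
        result = depth_limited_search(nxt, is_goal, successors, limit - 1, path + [nxt])
        if result is not None:
            return result
    return None

def iterative_deepening_search(start, is_goal, successors, max_depth=50):
    for depth in range(max_depth + 1):
        result = depth_limited_search(start, is_goal, successors, depth)
        if result is not None:
            return result
    return None

# Example: reach 5 from 0 via +1 steps.
print(iterative_deepening_search(0, lambda s: s == 5, lambda s: [s + 1]))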
Non-complete search?
Depth-first search (it can get lost in infinite-depth branches); Iterative Deepening, by contrast, is complete.
The heuristic h used for A* must be admissible, that is
h(n) ≤ h*(n) must hold, where h*(n) is the cost of an optimal path from n to the nearest goal.
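For reference, a minimal A* graph-search sketch in Python (an assumed helper, not from the course; neighbors, cost and h are supplied by the caller, and h must be admissible for the result to be optimal):

import heapq, itertools

def a_star(start, goal, neighbors, cost, h):
    counter = itertools.count()                                  # tie-breaker for the heap
    frontier = [(h(start), next(counter), 0, start, [start])]    # (f, tie, g, node, path)
    best_g = {start: 0}
    while frontier:
        _, _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt in neighbors(node):
            g2 = g + cost(node, nxt)
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), next(counter), g2, nxt, path + [nxt]))
    return None, float("inf")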
To compare the quality of admissible heuristics
we introduce the effective branching factor b*: the branching factor a uniform tree of the solution depth d would need to contain the N + 1 nodes generated by A*, i.e. N + 1 = 1 + b* + (b*)^2 + ... + (b*)^d. Values of b* close to 1 indicate a good heuristic.
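Since this equation has no closed-form solution for b*, it can be solved numerically; a small Python sketch (helper name is made up) using bisection:

# Solve 1 + b + b^2 + ... + b^d = N + 1 for b.
def effective_branching_factor(nodes_generated, depth, tol=1e-6):
    target = nodes_generated + 1
    def tree_size(b):
        return sum(b ** i for i in range(depth + 1))
    lo, hi = 1.0, float(max(nodes_generated, 2))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tree_size(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(effective_branching_factor(52, 5), 2))   # 52 nodes, depth 5 -> about 1.92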
Iterative Deepening A*
a variant of Iterative Deepening search that explores branches up to a given threshold on the value of f(n). If the threshold is exceeded without finding a solution, it is set to the minimal value of f(n) over all generated nodes n that exceeded the old threshold, and the search restarts.
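A recursive Python sketch of this scheme (illustrative only; it assumes finite branches and does no cycle handling):

def ida_star(start, is_goal, successors, cost, h):
    def search(node, g, bound, path):
        f = g + h(node)
        if f > bound:
            return f, None                          # report the overshoot
        if is_goal(node):
            return f, path
        minimum = float("inf")
        for nxt in successors(node):
            t, found = search(nxt, g + cost(node, nxt), bound, path + [nxt])
            if found is not None:
                return t, found
            minimum = min(minimum, t)
        return minimum, None

    bound = h(start)
    while True:
        bound, found = search(start, 0, bound, [start])
        if found is not None:
            return found
        if bound == float("inf"):
            return None                             # no solution reachable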
Simplified Memory-Bounded A*
uses a bounded priority queue. New nodes are added to the queue; each time a node is added, the algorithm checks whether all of its siblings are now in the queue and, if so, removes the parent. If memory is full, the node with the highest cost is removed from the queue and its parent is re-added (if not already present).
Hill Climbing
iteratively expands the highest-valued successor of the current node.
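A short Python sketch (toy example, names made up):

# Move to the best successor until no successor improves the current value.
def hill_climbing(start, successors, value):
    current = start
    while True:
        candidates = successors(current)
        if not candidates:
            return current
        best = max(candidates, key=value)
        if value(best) <= value(current):
            return current                          # local maximum (or plateau)
        current = best

# Maximise -(x - 3)^2 by stepping +/- 1.
print(hill_climbing(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))   # -> 3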
Games are
special cases of search problems. States are usually accessible. Actions are possible moves by a player.
Knowledge Level:
What is known by the knowledge base.
Symbolic Level:
The encoding of the knowledge base in a formal language.
Implementation Level:
Internal representation of sentences, e.g. as lists or strings.
Deductive Inference
a process to compute the logical consequences of a KB, i.e. given a KB and a sentence alpha we want to compute if KB |= alpha.
correct
the computation is sound: whenever the procedure derives KB |- a, it also holds that KB |= a.
complete
for every sentence a such that KB |= a, the procedure can also derive KB |- a.
Skolemisation is not equivalence preserving but
satisfiability preserving.
The Most General Unifier (MGU) can be calculated using a unification algorithm that, in its naive form, runs in
exponential time.
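A compact Python sketch of syntactic unification (the representation is made up: variables are strings starting with "?", compound terms are tuples; the occurs check is omitted, so this is only a partial illustration of an MGU computation):

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def substitute(t, subst):
    while is_var(t) and t in subst:
        t = subst[t]
    if isinstance(t, tuple):
        return tuple(substitute(a, subst) for a in t)
    return t

def unify(s, t, subst=None):
    subst = dict(subst or {})
    s, t = substitute(s, subst), substitute(t, subst)
    if s == t:
        return subst
    if is_var(s):
        subst[s] = t
        return subst
    if is_var(t):
        subst[t] = s
        return subst
    if isinstance(s, tuple) and isinstance(t, tuple) and len(s) == len(t):
        for a, b in zip(s, t):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                                      # clash: not unifiable

# Unify P(x, f(y)) with P(a, f(b)).
print(unify(("P", "?x", ("f", "?y")), ("P", "a", ("f", "b"))))   # {'?x': 'a', '?y': 'b'}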
Planning
Given a set of actions, an initial state and a goal state, find a plan that reaches the goal state from the initial state. Such a plan consists of an arrangement of (possibly only partially) ordered actions.
Planning VS Search
Planning uses more detailed information than search, e.g. action preconditions rather than just edges between states.
complete plan
requires that for every step each precondition is fulfilled by some predecessor, and that no step lying between the fulfilment and the use of a condition undoes it.
consistent plan
requires that no two actions take place at the same time.
solution
a plan that is both complete and consistent
Partially ordered plans
define a preorder on the steps, i.e. a must-happen-before relationship between actions.
d-Separation time
polynomial
d-Separation complete?
no; variables that are not d-separated can still be conditionally independent in a particular distribution.
supervised learning
the learner has access to input and correct output.
Reinforcement learning
only gives feedback in the form of rewards and punishments, not correct answers.
Unsupervised learning
happens without any feedback to the learner. The learner must learn on its own.
Feed-forward networks
directed acyclic graph
Recurrent networks
arbitrarily complex connections, including cycles (feedback loops)
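A tiny NumPy sketch of the feed-forward case (layer sizes are arbitrary): a forward pass is just a chain of matrix multiplications with no feedback connections.

import numpy as np

def forward(x, w1, b1, w2, b2):
    hidden = np.tanh(x @ w1 + b1)        # hidden layer
    return hidden @ w2 + b2              # output layer

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))              # 4 examples, 3 features each
w1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
w2, b2 = rng.normal(size=(5, 2)), np.zeros(2)
print(forward(x, w1, b1, w2, b2).shape)  # (4, 2)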