Artificial Intelligence Flashcards
In terms of predicate calculus what is unification?
- The inference system must be able to determine when two expressions are the same or match
- Two expressions match if, and only if, they are syntactically identical
- unification is an algorithm of determining the substitution list to make two predicate expressions match
- if p and q are logical expressions then unify(p,q) gives a substitution list (S) that makes p and q identical or fail
What are the core elements of a Artificial neural Network (ANN)?
- emulation of biological neural system
- consisting of nodes (artificial neurone) and weights (neural connection)
- each node has weighted connection to several other nodes in adjacent layers
- each node takes input received from the connected nodes and uses the weight together with a simple function to compute output values
- knowledge is stored in the connections between neurone
What are some problems with A*
- Doesnt adapt to changing environments
- doesnt plan around other AI objects
- speed
What dies language consist of in logical programming?
- Syntax = tells us how to build up sentences (^, V etc)
- Semantics = tells us meaning of sentences; assigning a truth value to each sentence;
What is a constant?
name objects or properties in the real world
What are Bayesian belief networks?
- Nodes = random variables
- Directed arcs = direct casual relationships, or conditional dependencies (X has a direct influence on Y)
- If there is no arc between nodes, these variables are assumed to be conditionally independent
- Captures both qualitative (which variables depend on others) and quantitative (probability values) relationships
What is logic programming?
- you present facts and rules to infer new facts just by asking questions
- when you ask a question, the run time system searches through the database of facts and rules to determine the answer
What is needed for an action to occur?
- Action helps us change state
- precondition
- effects
- chosen set of actions need to be both optimal and valid
what is the probability of an event in a sample space
P(E) = |E| / |S|
What is A* from a more technical standpoint?
- f(x) = g(x) + h(x)
- g(x) = cost from start node x, calculated in same way as Dijkstra’s
- h(x) = Heuristic function for node x = some estimate of the min cost to get to the destination from x
- open node with lowest f(x) is selected at each step
How do we train a ANN?
- find correct weight values
- supervised training = show the network the right asnwer
- apply a leaning method that will modify the weights
How do ANN’s work?
Have to train the Neuron -Intro data -Compute An output -compare output to desired output -weights are modified to reduce error Use the NN -intro new data to the network
What is a proof?
- A sequence of sentences where each sentence is either a premise or a sentence derived from earlier sentences in the proof by one of the inference rules
- the last sentence is the goal that we want to prove
How do we stop AI from colliding in games?
- When programming games with multiple AI units, possible to have them colliding through a bottleneck point of the shortest path
- can map onto a 3D grid by marking AI routes as obstacles for another AI, add a wait function, move from layer to layer at each step
How is Baye’s rule often described?
- P(A|B) = 1/c(P(B|A)P(A))
- Before equals sign is the posterior probability
- 1/c is normalisation factor
- P(B|A) is likelihood
- P(A) is prior probability
- A is event and B is evidence
Pros and cons of single layer network?
Pro -simple and easy to implement -fast learning Con -can only learn linearly separable functions
What is Baye’s rule?
Bayes’ rule; P(A|B) = (P(B|A)*P(A))/P(B)
- Expresses the relation between a conditional probability and its inverse
- Can also be written as; P(A|B) = 1/c(P(B|A)P(A))
What do we need for a Dijkstra’s algorithm implementation?
- set of nodes and pair costs to traverse from one to its neighbours
- start and end nodes
- open and closed nodes lists
- link to the neighbour which got us there for that cost
What are some characteristics of MLNN?
- overcome limitations of linear perception
- can learn non linear relationships
- can solve different problems
- stock prediction
- number of layers
- one hidden layer is usually enough
- learning
- more complex learning methods, can be very slow, require lots of training data to learn
How do single layer networks work?
- an adder sums up all the inputs modified by their respective weights (weight at pos x * item at pos x) + (weight at pos x+1 * item at pos x+1) … etc
- referee to as linear combination
- an activation function controls the amplitude of the output of the neuron
- acceptable range of output is 0-1 or -1-1
How do we choose between breadth and depth first search?
- in games we often want to look a fixed number of moves ahead, and look for the best outcomes = breadth first
- if we want to consider all outcomes, depth search usually uses less memory and is slightly easier to implement
- there are multiple ways to approach tree search, and choice often depends on what we already know about the search tree and what we want to achieve
Advantages of a ANN?
- good for real time application as massively parallel processing
- self trainable, learns by itself, needs training data
- excellent when it comes to generalisation and dealing with uncertain info
- works well with noisy data
- can perform tasks that a linear program can not
- does not need to be reprogrammed because it learns
What does an inference procedure consist of in logical programming?
-tells us which sentences are valid inferences from other sentences
What is a function?
used to evaluate statements (eg; plus(2,3) evaulatues to 5)
In predicate calculus what is substitution?
- assigning values to variables
- two terms unify if there is a substitution that makes those two terms identical
- eg; unifying f(x,2) and f(3,y) can be written 3/X, 2/Y
- the notation X/Y indicates that X is substituted for the variable Y in the original expression
what is a predicate?
- name relationship between objects in the world
- they are special functions with true/false as values
- predicates with the same name but different arguments are considered distinct
What are predicate calculus symbols?
- A….Z, a…z
- 0-9
- underscore
- any combination but must start with a letter
What does logic programming consist of?
Language and rules
What is an elementary/atomic event
a happening or occurrence that cannot be made up of other events
What is proof procedure?
-method of proving statements using inference rules
Step 1 = translate above statements into propositional logic
Step 2 = write down the premises and the goal into propositional logic
Step 3 = write a formal proof, a sequence of steps that state premises or apply inference rules to the previous step
What is A*?
- standard for games path finding, shown to be optimal
- builds on Dijkstra by changing it from shortest path first to best first search
- search more likely paths first
- instead of choosing the node with the lowest cost from the start, select those which would be most direct (shortest) first, given what has been encountered so far
- adds a Heuristic function to the open node costs when choosing which node to close next
when are two events independent and when are they dependent?
- two events are independent if the fact that event A occurs does not affect the probability of B occurring
- two events are dependent if the fact that A occurs affects the probability of B occurring
What is a variable?
used to assign general classes or properties of objects in the real world
What is AI planning?
- Define initial state
- define goal state
- define action = define preconditions and effects
- solving a graph search problem
- the sequence of actions that go from initial state to goal state
Explain why Dijkstra’s is a special case of A*?
A* = Dijkstra’s if H = 0
What the mathematical model for an ANN?
- I(n) - inputs
- Wn - weights
- Net - summation
- Theta - bias/threshold, can be implemented as extra input
- f - activation function
- O - output
How do we initialise Dijkstra’s algorithm?
- set total costs of all nodes to infinity (or huge number)
- set total cost of start node to 0
- Doesn’t matter about links
- put all nodes on to the open lists
How do we test a ANN?
- if success and you are satisfied with result you are done
- if failure, then go back to previous steps and try different values of parameters or maybe a different topology
what is an event
a set of elementary events
What are the key ideas for Dijkstra’s algorithm?
- planning = works out complete route in advance
- divide and conquer = decompose problem into iterative process
- eliminate suboptimal search directions early in search process
- explore all possible paths = not just first we find
- might need to explore dead ends
- want it to work on a general graph
- remember best route found so far
What are knowledge based systems?
- Computer systems which generate and utilise knowledge from different sources, data and information
- aid in solving problems by utilising AI
- can make decisions based on the knowledge residing in them and can understand the context of the data that is being processed
What steps are taken when designing a ANN?
Set up the architecture;
-number of I/O
-which of input features to take
-how many samples to take
-number of hidden layers
-number of neurones, too many require more training time
-learning rate, from experience value should be small
Tune/optimise internal parameters by presenting learning data set to ANN
What is probability theory
One way to represent uncertainty, allows;
- The use of varying degrees of belief to represent uncertainty
- to assign beliefs to relations between propositions
- to assign beliefs to inferences
Disadvantages of ANN?
- Requires training
- requires high processing for large neural networks
- do not explain the results they obtained
- processing time can rise quickly as the size of the problem grows
- overtraining can be a problem
What are multilayered neural networks?
-feedforward neural networks with one or more hidden layers
Consists of
-an input layer of source neurones
-at least one middle or hidden layer of computational neurons
-an output layer of computational neurones
what is probabilistic reasoning?
- Probabilistic reasoning formalises the process of accumulating evidence and updating probabilities based on new evidence
- Prior probability of A:P(A) – belief before the new evidence B
- Posterior probability of A: P(A|B) – belief after the new evidence B
How can we make A* faster?
-can pre-calculate paths to make it faster
-Hierarchical (HPA);
>break down problem
>split into sections and find transaction points between sections
>pre-compute intersection routes
>at run time: insert start and end points, connect to local transitions, then apply to A node graph
-partial precomputed, can patch, rebuild section at run time, faster, nearly as optimal
-can also use symmetry
What are time and space complexity measured in terms of?
b: Max branching factor of the graph
d: depth of the least-cost solution
m: max depth of the graph
Explain how, on each iteration of Dijkstra’s algorithm, a node is selected to be closed. How are it and its neighbours then updated?
- The lowest cost node on the open list is selected.
- and is moved to the closed list
- Neighbours are updated as the nodes total cost + pair cost
- If not closed already and traversable and new cost is lower, then the total cost is updated
Suggest suitable pair costs for a grid-based world that has uniform terrain?
- 1 for up-down-left-right
- 1.4 for diagonals (root of 2)
In predicate calculus what can symbols represent?
- Constants (must begin with lowercase letter) = car, tree, blue etc (can’t use true/false)
- Variables (must begin with an uppercase letter) = used to assign general classes of objects or properties
- Functions (must begin with lowercase letter) = father(David), replacing a function with its value is called evaluation
- Predicates (must begin with lowercase letter) = names a relationship between objects in the world. Eg; likes(George, Susan) and likes (George, Susan, James) NOTE: these are considered different completely
Difference between depth and breadth first in terms of structure?
- depth first uses a stack
- breadth uses a queue
- breadth takes more memory
- both have some complexity; O(V+E)
When would we use a ANN?
when the problem is
-data rich
-non linear multidimensional, input/output mapping
When we have enough time to design to final ANN
What is classical AI?
-concerns itself with attempting to explicitly represent human knowledge in a declarative form (facts/rules)
What is connectionist AI?
-computing elements resemble an abstraction of our own neural circuitry
What complicates unification? How do we solve this?
-variables
-we can replace variables with;
> other variables
> Constants
> function expressions
what can belief networks be used for?
- Belief networks can be used to reason
- Forward (top-down) from causes at the top to effects at the bottom – predictive reasoning
- Backward (bottom-up) from effects to causes – diagnostic reasoning
what dimensions are AI algorithms evaluated on?
- Completeness = does it always find a solution if one exists
- time complexity = number of nodes generated/expanded
- Space complexity = max number of nodes in memory
- Optimality = does it always find a least-cost solution
What is an inference system?
determines when two expressions match
What are expert systems?
- Represented first truly large scale commercialisation of AI tech
- computer programs that have an expert knowledge in specialised domains
- good at tackling specific types of problem
What are the characteristics of a biological neurone?
- neurone has a cell body, a branching input structure (dendrite) and a breaching output structure (axon)
- axons connect to dendrites via synapses
- only fires if its input signal exceeded a certain amount (a threshold) in a certain time period
- synapses vary in strength, good connections allow a large signal and vise versa
Some applications of ANN?
- Pattern recognition
- clustering
- optimisation
- control
- medical applications
- business applications
What should games AI be?
- Believable
- fast
- use low resources
Limitations of Dijkstra?
- calculate path first then start moving
- needs to work out entire solution in advance
- doesnt adapt to changes in environment
- doesnt adapt to changes in the target position
- doesnt take into account anything else moving in the environment
What is special about actions?
- They have preconditions and effect
- they can be applicable only if their preconditions are fulfilled
- actions lead to new effects
- preconditions and effects need to be defined clearly in PDDL by predicate logic
What are intermediate nodes?
- remember the best route found so far for intermediate nodes and use it to reject any possible candidate paths which get there in more steps
- might find better path segments as we go along
- when there are no possible better ways to get to that intermediate node, we know that the best solution found so far is the actual best solution
- use that knowledge to refine best routes to its neighbours
what are some types of heuristic functions
Manhattan distance -Total difference in X + total difference in Y -suitable for 4-connect grid Diagonal distance -Max (difference in X, difference in Y) -suitable for 8 connected grid
What are terms?
they are mapped to objects in their domain (constant, variables and functions)
What is AI?
- Thinking humanly; need a way to understand how humans think and get in their minds
- Acting humanly
- Thinking rationally; notation and rules of derivation for thoughts, AI hopes to create intelligent systems using logic programming
- Acting rationally; Doing the right thing, rational agent is one that achieves best outcome
What is probability distribution?
listing of all probabilities from every possible value of a single random variable
what are preconditions?
Conditions that must happen for an action to be applied
eg; A predicate for a flight would be ‘get to the airport ‘
What can a heuristic function be?
- any function which estimates the distance from a node to the target
- we want to also make sure we still get the shortest path
- Admissible; should not overestimate the shortest distance from a node to the destination (should be <= shortest distance)
- if a function is not admissible it is not guaranteed to get the shortest path
what is the sample space
the set of all possible outcomes of an event
What do we do on each iteration of Dijkstra?
- find node n on the open list with the lowest current estimated cost
- move it to closed list
- re-estimate each open neighbour (cost of N + pair cost)
- if new neighbour estimate is lower, update node cost and set link to N
- when N == end node, we are finished
- follow links back to get full path
What is an effect?
new state after action is applied
-eg; arrived at detination
what are uncertainties in the real world
Perception - robot may be confused whether an object is a mop or sheepdog
Actions - don’t always lead to the intended effects, after applying a put-down action an object may not always be on the table
Prior knowledge - We know we can get from A-B but the route may be blocked