Expert Systems Flashcards
Expert System
Interacts with the user to collect facts and helps with a decision process.
an expert system is a computer system that can make decisions that normally only experts can make
subtype of intelligent system
e.g. DENDRAL, which inferred the structure of organic molecules, or MYCIN, which advised physicians on antibacterial therapy (giving a certainty factor with its advice)
usages: medical, crime, …
Can be checked for correctness
Expert system vs Intelligent System vs Decision Support System
Vs IS
An expert system is only software. It’s not
embedded in the real world. So you could
say it’s a subcategory of intelligent systems
Vs Decision Support System
● An expert system combines knowledge with
reasoning, and makes the decision for
you by using the information you give it. It can also explain its decision to the user.
● A decision support system helps you
process data (data analytics dashboard,
etc), and helps you make your own
decision
Components of an Expert System
● The user makes a query, which is sent to an inference engine that interacts with the knowledge base (composed of rules and facts determined in the knowledge engineering process)
● The user receives back advice, and perhaps an explanation of why that particular advice was given
- Interface: for example a chat box or a question list.
- Knowledge Base: built by domain experts together with data scientists / knowledge engineers
The KB and inference engine use rules (if-then statements) and facts (data about specific cases/instances) to infer conclusions. Possible knowledge representations include description logics, ontologies and non-monotonic logics.
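A minimal forward-chaining sketch of how an inference engine could combine if-then rules and facts; the rule contents and fact names here are made-up illustrations, not from a real system.

```python
# Minimal forward-chaining sketch: facts are known statements,
# rules are (premises, conclusion) pairs. All names are hypothetical.
facts = {"fever", "cough"}
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected"}, "advise_rest"),
]

changed = True
while changed:                      # keep firing rules until nothing new is derived
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)   # rule fires: its conclusion becomes a new fact
            changed = True

print(facts)  # now also includes the derived facts 'flu_suspected' and 'advise_rest'
```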
Inference Methods
Inference methods that may be used:
● If-then rules (e.g. MYCIN)
● Bayes’ rule and Naive Bayes
● Graphical models
● Factor graphs
● Markov networks
● Bayesian networks
In general, exact inference is NP-complete. But we can, for example, do marginalizing → determine the node that you want the probability for, then sum the product of the factors that influence that node over all possible values of the other variables.
- variable elimination
- loopy belief propagation
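As an illustration of marginalizing by brute-force summation, a small sketch over a made-up joint distribution (the variables and numbers are assumptions for the example):

```python
# Hypothetical joint distribution P(A, B) over two binary variables, as a table.
joint = {
    (0, 0): 0.3, (0, 1): 0.1,
    (1, 0): 0.2, (1, 1): 0.4,
}

# Marginalize out B to get P(A): for each value of A, sum over every value of B.
p_a = {a: sum(joint[(a, b)] for b in (0, 1)) for a in (0, 1)}
print(p_a)  # approximately {0: 0.4, 1: 0.6}
```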
Knowledge Representations
Description logics, ontologies, non-monotonic logics
Advantages
● Consistency. It will make the same decision
given the same data
● Memory. It won’t forget a rule or a fact; humans forget things all the time.
● Logic. No sentimentality that clouds its
decision making.
● Infinitely reproducible. Doesn’t get tired, can
be copy-pasted to other places
Disadvantages
● Common sense is a problem. It’s hard to
program common sense
● No creativity (it can’t find new solutions to
new problems)
● Hard to maintain. Knowledge base, with
potentially 1000s of facts, needs to be
updated manually (this is the real killer)
Probabilistic Reasoning
Used when you have uncertainty and you want to model that uncertainty as part of the model. Example: the facts are probabilities such as P(A=student) = 0.6 and P(B=heads) = 0.5; reasoning then means computing a quantity such as P(A|B) = x.
Bayes Rule
P(A|B) = P(B|A) ∙ P(A) / P(B). In general, too much data would be needed to estimate a target variable directly from the full joint distribution.
Example: assume some lab test for a disease has 98% chance of giving positive result if the disease is present,
and 97% chance of giving a negative result if the disease is absent. Assume 0.8% of the population has this
disease. Given a positive result, what is the probability that the disease is present?
P(DIS|POS) = P(POS|DIS) ∙ P(DIS) / P(POS)
= 0.98 ∙ 0.008 / (0.98 ∙ 0.008 + 0.03 ∙ 0.992) ≈ 0.21
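A quick sketch that checks this calculation with Bayes’ rule (all numbers come from the example above):

```python
# Bayes' rule for the lab-test example above.
sensitivity = 0.98      # P(POS | DIS)
false_pos   = 0.03      # P(POS | no DIS), i.e. 1 - specificity
prior       = 0.008     # P(DIS): 0.8% of the population

p_pos = sensitivity * prior + false_pos * (1 - prior)   # P(POS), total probability
p_dis_given_pos = sensitivity * prior / p_pos           # Bayes' rule
print(round(p_dis_given_pos, 3))  # ~0.209
```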
Naive Bayes
An approximation of Bayes’ rule. It assumes that, given the value of the class, all the attributes are independent (conditional independence). It is much more feasible than Bayes’ rule, but makes extreme assumptions.
Therefore we have two extremes: either we estimate the joint probability distribution (which would yield the optimal classifier, but is infeasible in practice) or we use Naïve Bayes (which is much more feasible, but makes too-strong assumptions). We want something in between → factor graphs.
Marginalizing vs Maximization
Marginalizing means finding a probability value, P(V = v).
Maximization finds the most probable value, V = argmax_v P(V = v) → this can be used with Bayes’ rule to classify (what is the most probable target/class value?).
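A tiny illustration of the difference, on a made-up distribution over V:

```python
# Made-up distribution over a variable V.
p_v = {"a": 0.2, "b": 0.5, "c": 0.3}

print(p_v["b"])                  # marginal query: a probability value, P(V = 'b')
print(max(p_v, key=p_v.get))     # maximization: argmax_v P(V = v) -> 'b'
```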
Factor Graphs
Graphs that group random variables into conditionally independent ‘cliques’: a bipartite graph representing the factorization of a function.
A factor graph is a type of probabilistic graphical model. A factor graph has two types of nodes:
● Variables, which can be either evidence variables when their value is known, or query variables when their value should be predicted.
● Factors, which define the relationships between variables in the graph. Each factor can be connected to many variables and comes with a factor function that defines the relationship between these variables. For example, if a factor node is connected to two variable nodes A and B, a possible factor function could be imply(A, B), meaning that if the random variable A takes value 1, then so must the random variable B. Each factor function has a weight associated with it, which describes how much influence the factor has on its variables in relative terms.
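A minimal sketch of that imply(A, B) factor as code; the log-linear scoring, the weight and the names are illustrative assumptions, not a specific library’s API:

```python
import math

# Two binary variables A and B, and one weighted factor encoding "A implies B".
def imply(a, b):
    return 0 if (a == 1 and b == 0) else 1   # violated only when A=1, B=0

factors = [{"vars": ("A", "B"), "fn": imply, "weight": 2.0}]

def unnormalized_score(assignment):
    """Product over factors of exp(weight * factor_value) (log-linear form)."""
    score = 1.0
    for f in factors:
        args = [assignment[v] for v in f["vars"]]
        score *= math.exp(f["weight"] * f["fn"](*args))
    return score

print(unnormalized_score({"A": 1, "B": 1}))  # high score: implication satisfied
print(unnormalized_score({"A": 1, "B": 0}))  # low score: implication violated
```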
Markov Models
An undirected model that uses an undirected graph, with factors defined over cliques (fully connected subgraphs).
Bayesian Network
A directed model that uses a directed graph and conditional probability tables (CPTs) of each variable given its parents. It can also be written as a factor graph, with one factor per CPT.
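A tiny Bayesian-network sketch with two nodes; the Rain → WetGrass structure and the CPT numbers are made up for illustration:

```python
# Hypothetical two-node Bayesian network: Rain -> WetGrass.
p_rain = {True: 0.2, False: 0.8}                        # P(Rain)
p_wet_given_rain = {True:  {True: 0.9, False: 0.1},     # P(WetGrass | Rain)
                    False: {True: 0.2, False: 0.8}}

def joint(rain, wet):
    # The joint is the product of each node's CPT entry given its parents.
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# P(Rain=True | WetGrass=True) by enumerating the joint (Bayes' rule).
num = joint(True, True)
den = joint(True, True) + joint(False, True)
print(round(num / den, 3))  # ~0.529
```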
Marginalising through variable elimination
Variable elimination: an exact inference method that determines P(Y|Z) by getting rid of all random variables x in X\Y. For each such x, multiply all factors in which x appears and:
- Fill in the observed value of x if x is in Z
- Sum out over all possible values of x if x is not in Z.
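A compact sketch of variable elimination on a made-up chain A → B → C with binary variables; the factor representation and the numbers are assumptions for the example:

```python
from itertools import product

# Made-up chain A -> B -> C, all binary (0/1).
# A factor is (variables, table): the table maps a tuple of values to a number.
f_a  = (("A",),     {(0,): 0.6, (1,): 0.4})                     # P(A)
f_ba = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3,
                     (1, 0): 0.2, (1, 1): 0.8})                 # P(B | A)
f_cb = (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1,
                     (1, 0): 0.4, (1, 1): 0.6})                 # P(C | B)

def multiply(f, g):
    """Multiply two factors into one over the union of their variables."""
    fv, ft = f
    gv, gt = g
    out_vars = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for vals in product((0, 1), repeat=len(out_vars)):
        assign = dict(zip(out_vars, vals))
        table[vals] = (ft[tuple(assign[v] for v in fv)]
                       * gt[tuple(assign[v] for v in gv)])
    return out_vars, table

def sum_out(f, var):
    """Eliminate var from a factor by summing over its values."""
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for vals, p in ft.items():
        key = tuple(val for v, val in zip(fv, vals) if v != var)
        table[key] = table.get(key, 0.0) + p
    return keep, table

# Eliminate A, then B, to obtain the marginal P(C).
f_b = sum_out(multiply(f_a, f_ba), "A")     # a factor over B only: P(B)
f_c = sum_out(multiply(f_b, f_cb), "B")     # a factor over C only: P(C)
print(f_c[1])  # approximately {(0,): 0.65, (1,): 0.35}
```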