Week 3 Flashcards
Key assumption of Naive Bayes
Each effect depends only on the cause
<=> effects are conditionally independent of each other, given the cause
Why is conditional independence assumed for Naive Bayes
It keeps the probability table linear in the number of effects
Without it, the table grows exponentially as new effects are introduced (counted in the sketch below)
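A quick count of table entries (my own illustration; the single Boolean cause and n Boolean effects are assumptions):

```python
# Illustrative count: one Boolean cause, n Boolean effects.
# Naive Bayes stores P(Cause) plus P(effect_i = true | Cause) for each
# effect; a full joint table needs an entry per assignment (minus one,
# since the entries must sum to 1).
n = 10
naive_bayes_entries = 1 + 2 * n         # linear in n
full_joint_entries = 2 ** (n + 1) - 1   # exponential in n
print(naive_bayes_entries, full_joint_entries)  # 21 vs 2047
```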
A Bayesian network can't
Have any cycles
Graph of Bayesian Network is
Directed Acyclic Graph (DAG)
P of a selection of states of given variables on a Bayesian network
The product of each variable's CPT entry given its parents: P(x1, …, xn) = ∏i P(xi | parents(Xi)) (worked example below)
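A worked example on a toy Rain → WetGrass network (the network and numbers are illustrative, not from the cards):

```python
# Joint probability of one full assignment = product of each node's CPT
# entry given its parents' values.
P_RAIN = 0.2                            # P(Rain = true)
P_WET_GIVEN = {True: 0.9, False: 0.2}   # P(WetGrass = true | Rain)

def joint(rain, wet):
    p_r = P_RAIN if rain else 1 - P_RAIN
    p_w = P_WET_GIVEN[rain] if wet else 1 - P_WET_GIVEN[rain]
    return p_r * p_w

print(joint(True, True))  # P(rain, wet) = 0.2 * 0.9 = 0.18
```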
Local semantics of a node in a Bayesian network
A node X is independent of its non-descendants given its parents
Markov Blanket
A node X is conditionally independent of all others given its Markov Blanket (parents, children, children’s parents)
How to compress conditional distributions (CPTs) further
Deterministic nodes via Boolean functions, e.g. NorthAmerican <=> Canadian ∨ US ∨ Mexican (prior knowledge)
Numerical relationships, e.g. a child defined as a deterministic function of its parents' values
Simple queries
Compute a posterior marginal, e.g. P(Xi | e)
Conjunctive queries
Joint posteriors, e.g. P(Xi, Xj | e) = P(Xi | e) P(Xj | Xi, e)
Sensitivity analysis
Which P values are most critical to the answer
4 ways to compute posterior marginal
Enumeration
Rejection sampling
Likelihood weighting
Gibbs Sampling
Inference by enumeration: pro and con
Pro: deterministic, gives exact answers
Con: inefficient (time grows exponentially with the number of variables)
Variable elimination for enumeration
Evaluate the enumeration tree bottom-up (right-to-left), storing intermediate results (factors) so repeated subexpressions are computed only once
Time and space cost of variable elimination
Dominated by the size of the largest factor constructed: linear in network size for polytrees, exponential in the worst case (see the factor sketch below)
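A minimal sketch of the two factor operations variable elimination relies on, pointwise product and summing a variable out (the representation, variables, and numbers are my own assumptions):

```python
from itertools import product

# A factor maps assignments over its variables to numbers, stored as
# {(val_of_var1, val_of_var2, ...): probability}, Boolean variables only.

def multiply(f1, vars1, f2, vars2):
    """Pointwise product of two factors; returns (factor, variables)."""
    out_vars = vars1 + [v for v in vars2 if v not in vars1]
    out = {}
    for assign in product([True, False], repeat=len(out_vars)):
        env = dict(zip(out_vars, assign))
        v1 = f1[tuple(env[v] for v in vars1)]
        v2 = f2[tuple(env[v] for v in vars2)]
        out[assign] = v1 * v2
    return out, out_vars

def sum_out(f, vars_, var):
    """Eliminate `var` from factor `f` by summing over its values."""
    keep = [v for v in vars_ if v != var]
    out = {}
    for assign, p in f.items():
        env = dict(zip(vars_, assign))
        key = tuple(env[v] for v in keep)
        out[key] = out.get(key, 0.0) + p
    return out, keep

# Chain A -> B with illustrative CPTs; eliminate A to get P(B).
fA = {(True,): 0.3, (False,): 0.7}                     # P(A)
fB = {(True, True): 0.9, (True, False): 0.1,           # P(B | A)
      (False, True): 0.2, (False, False): 0.8}
prod_f, prod_vars = multiply(fA, ["A"], fB, ["A", "B"])
pB, _ = sum_out(prod_f, prod_vars, "A")
print(pB)  # ≈ {(True,): 0.41, (False,): 0.59}
```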
Exact inference is
#P-complete (counting)
NP-hard
NP-hard
Nondeterministic polynomial time hard
At least as hard as the hardest problems in NP (NP-hard is the class of all such problems)
What is “#P”
The class of counting problems: how hard it is to count the number of solutions
Related to NP
NP concerns the time to find a solution; #P concerns counting all solutions, which is at least as hard
LLN
Law of Large Numbers: as the number of samples grows, sample frequencies converge to the true probabilities, which justifies estimating posteriors from samples
Why rejection sampling over prior sampling
Prior sampling has no notion of conditioning, so it can't handle evidence
How does rejection sampling work
We do prior sampling and then reject the samples in which the evidence e doesn't hold
Likelihood weighting
Fix the evidence variables to their observed values, sample only the non-evidence variables, and weight each sample by the likelihood it assigns to the evidence
Summarise Gibbs sampling
The algorithm wanders randomly around the state space, flipping one variable at a time while keeping the evidence variables fixed
Steps for Gibbs sampling
Begin with the query, with evidence variables fixed to their observed values
Randomly initialise the non-evidence variables
With the entire state now set, resample the first non-evidence variable; if its value changes, update the state, and record the sample
Then move to the next non-evidence variable
Repeat until the desired sample size is reached
Gibbs sampling pseudo code
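A hedged Python sketch of that loop on a toy Rain → WetGrass network (all names and numbers are illustrative); with more non-evidence variables you would cycle through them, resampling each from P(var | Markov blanket):

```python
import random

# Gibbs sampling sketch for P(Rain | WetGrass = true).

def joint(rain, wet):
    p_r = 0.2 if rain else 0.8                                 # P(Rain)
    p_w = (0.9 if rain else 0.2) if wet else (0.1 if rain else 0.8)
    return p_r * p_w                                           # * P(Wet | Rain)

def gibbs(n_samples, wet=True):
    rain = random.choice([True, False])   # random init of non-evidence var
    hits = 0
    for _ in range(n_samples):
        # Resample the (only) non-evidence variable, evidence kept fixed:
        # P(Rain | Wet) is proportional to the joint for each Rain value.
        p_t, p_f = joint(True, wet), joint(False, wet)
        rain = random.random() < p_t / (p_t + p_f)
        hits += rain                      # record the sample
    return hits / n_samples

print(gibbs(100_000))  # ≈ 0.18 / (0.18 + 0.16) ≈ 0.529
```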
Chain rule
P(x1, …, xn) = P(x1) P(x2 | x1) ⋯ P(xn | x1, …, xn−1); in a Bayesian network this reduces to ∏i P(xi | parents(Xi))
Locally structured system
Each subcomponent interacts directly with only a bounded number of other components
Leak node
If the listed causes for an effect may be incomplete, a leak node can represent the 'miscellaneous causes'
Nonparametric representation
(For continuous vars) Define conditional distribution implicitly with a collection of instances, each containing specific values (or ranges of) for all vars
Hybrid Bayesian network
Network with both discrete and continuous RVs
Difference between probit and logit
The link function!
For probit it is Φ, the CDF of the standard normal distribution
For logit it is the logistic function: σ(x) = 1 / (1 + e^(−x))
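Both links map a real number to (0, 1); a small sketch of the two (standard formulas, nothing course-specific assumed):

```python
import math

def probit_link(x):
    """Phi(x): CDF of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def logit_link(x):
    """sigma(x) = 1 / (1 + e^(-x)), the logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

print(probit_link(0.0), logit_link(0.0))  # both 0.5; the tails differ
```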
Total set of variables in a Bayesian network, and what are they
X is the query variable
E is the set of evidence variables
Y is the set of hidden (non-query, non-evidence) variables
Enumeration algorithm pseudocode
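A hedged reconstruction modelled on AIMA's ENUMERATION-ASK / ENUMERATE-ALL (the toy network, its ordering, and the numbers are my own):

```python
# Network as var -> (parents, cpt); each cpt maps parent-value tuples to
# P(var = true | parents). Variables listed in topological order.
NET = {
    "Rain":     ([], {(): 0.2}),
    "WetGrass": (["Rain"], {(True,): 0.9, (False,): 0.2}),
}
ORDER = ["Rain", "WetGrass"]

def prob(var, value, env):
    parents, cpt = NET[var]
    p_true = cpt[tuple(env[p] for p in parents)]
    return p_true if value else 1 - p_true

def enumerate_all(variables, env):
    if not variables:
        return 1.0
    first, rest = variables[0], variables[1:]
    if first in env:  # evidence (or query) value already set
        return prob(first, env[first], env) * enumerate_all(rest, env)
    # Hidden variable: sum over both of its values.
    return sum(prob(first, v, env) * enumerate_all(rest, {**env, first: v})
               for v in (True, False))

def enumeration_ask(query, evidence):
    dist = {v: enumerate_all(ORDER, {**evidence, query: v})
            for v in (True, False)}
    alpha = sum(dist.values())            # normalisation constant
    return {v: p / alpha for v, p in dist.items()}

print(enumeration_ask("Rain", {"WetGrass": True}))  # ≈ {True: 0.529, False: 0.471}
```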
Calculating α for a posterior marginal?
α is the normalisation constant: divide each entry of the final vector by the sum of the entries, so the resulting probabilities sum to 1 (snippet below)
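For example, with the unnormalised vector from AIMA's burglary query P(B | j, m):

```python
unnormalised = [0.00059224, 0.0014919]    # the final vector, pre-normalisation
alpha = 1 / sum(unnormalised)             # alpha = 1 / (sum of the entries)
print([alpha * p for p in unnormalised])  # ≈ [0.284, 0.716]
```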
How to interpret a CPT with random numbers
E.g. 0.1 for A = true
Any random number below 0.1 maps to true, which gives a 10% chance of true
How to do prior sampling
With a random number generator
Starting from the top of the network, moving left to right within each row (i.e. in topological order)
If the random number < P(true | sampled parent values), set the variable to true (sketch below)
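A minimal sketch on the same toy Rain → WetGrass network (numbers illustrative):

```python
import random

def prior_sample():
    # Visit variables in topological order; a variable is true when the
    # random number falls below its CPT entry given the sampled parents.
    rain = random.random() < 0.2                      # P(Rain) = 0.2
    wet = random.random() < (0.9 if rain else 0.2)    # P(Wet | Rain)
    return rain, wet

samples = [prior_sample() for _ in range(100_000)]
print(sum(r and w for r, w in samples) / len(samples))  # ≈ 0.18
```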
How to do rejection sampling
Same as prior sampling, BUT
if a sample disagrees with the evidence in the query, discard it. E.g.
for P(a | b, c), if a sample has ¬b and/or ¬c, reject it (sketch below)
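Continuing the toy network, a sketch of the reject step for P(Rain | WetGrass = true):

```python
import random

def prior_sample():
    rain = random.random() < 0.2
    wet = random.random() < (0.9 if rain else 0.2)
    return rain, wet

# Keep only the samples consistent with the evidence WetGrass = true,
# then estimate the query from the surviving samples.
kept = [r for r, w in (prior_sample() for _ in range(200_000)) if w]
print(sum(kept) / len(kept))  # ≈ 0.18 / 0.34 ≈ 0.529
```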
How to do likelihood weighting
For P(T | h, s)
h and s are fixed evidence variables
When sampling, don't sample the evidence variables; instead multiply the sample's weight by the probability of each fixed evidence value given its parents
At the end, each sample has its sampled values and a weight associated with it (sketch below)
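A sketch for the toy query P(Rain | WetGrass = true), with WetGrass as the fixed evidence variable (names and numbers illustrative):

```python
import random

def weighted_sample():
    rain = random.random() < 0.2   # sample the non-evidence variable
    weight = 0.9 if rain else 0.2  # multiply in P(Wet = true | rain)
    return rain, weight

samples = [weighted_sample() for _ in range(100_000)]
numerator = sum(w for r, w in samples if r)     # weight where Rain = true
print(numerator / sum(w for _, w in samples))   # ≈ 0.529, the exact answer
```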