Unit 4: Graphical Models and Bayesian Networks Flashcards
Major problem with decision trees, SVMs and NNs
How to handle uncertainty and unobserved variables, which require probabilistic reasoning.
E.g. how can we deal with uncertainty in our input variables, or quantify the confidence we have in the output?
Undirected graphical models
A.k.a. Markov Random Fields (MRFs) or Markov Networks.
They represent the relationships between variables using undirected edges. The independence properties between the variables can be easily read off the graph through simple graph separation (missing edges / arcs).
Directed graphical models
A.k.a. Bayesian network or Belief network (BNs).
They are directed acyclic graphs (DAGs).
They consist of a set of nodes together with a set of directed edges.
Each directed edge is a link from one node to another, with the direction being important.
The edges must not form any directed cycles.
Moralisation
A Bayesian network can be converted into an MRF by connecting all the parents of each common child with edges and then dropping the directions on all edges.
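The step above can be sketched in a few lines. This is a minimal illustration using a plain dict to map each child to its parents (the v-structure example below is an assumption, not from the cards):

```python
from itertools import combinations

def moralise(parents):
    """Convert a DAG (child -> list of parents) into its moral graph:
    marry all parents of each child, then drop edge directions.
    Returns the undirected edges as a set of frozensets."""
    edges = set()
    for child, ps in parents.items():
        # Dropping directions: each parent-child edge becomes undirected.
        for p in ps:
            edges.add(frozenset((p, child)))
        # "Marrying" the parents: connect every pair of co-parents.
        for p, q in combinations(ps, 2):
            edges.add(frozenset((p, q)))
    return edges

# Classic v-structure A -> C <- B: moralisation adds the edge A - B.
dag = {"C": ["A", "B"], "A": [], "B": []}
moral = moralise(dag)
assert frozenset(("A", "B")) in moral
```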
Maximum a posteriori query
(MAP query)
“What is the most likely explanation for some evidence?”
I.e. what is the value of the node X that maximises the probability that we would have seen this evidence.
“What value of x maximises P(x | e)?”
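A brute-force MAP query can be sketched on a tiny two-node network Rain → WetGrass. The network and its numbers are illustrative assumptions; the point is that since P(x | e) ∝ P(x) P(e | x), we can compare unnormalised scores:

```python
# Illustrative CPTs for Rain -> WetGrass (assumed numbers).
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True:  {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}

def map_rain(wet_evidence):
    """Return the value of Rain maximising P(Rain | WetGrass = wet_evidence).
    P(r | e) is proportional to P(r) * P(e | r), so the normaliser P(e)
    can be ignored when taking the argmax."""
    scores = {r: P_rain[r] * P_wet_given_rain[r][wet_evidence]
              for r in (True, False)}
    return max(scores, key=scores.get)

# Seeing wet grass, rain is the most likely explanation here:
# P(rain)*P(wet|rain) = 0.18 > P(no rain)*P(wet|no rain) = 0.16
assert map_rain(True) is True
```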
Bayesian networks
(a.k.a. Belief networks)
Provide a graphical representation of a domain that gives an intuitive way to model conditional independence relationships and handle uncertainty.
They provide a compact specification of the full joint probability distribution.
3 Components of Bayesian networks
- A finite set of variables.
- A set of directed edges between the nodes, that form an acyclic graph.
- A conditional probability distribution associated with each node P(Xᵢ | Parents( Xᵢ )). This is typically represented in a conditional probability table (CPT) indexed by values of the parent variables.
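The three components can be written down directly for a tiny two-node network Cloudy → Rain. All names and numbers here are illustrative assumptions; each CPT is a dict indexed by a tuple of parent values:

```python
# Component 1: a finite set of variables (nodes).
nodes = ["Cloudy", "Rain"]
# Component 2: directed edges, here recorded as each node's parents.
parents = {"Cloudy": [], "Rain": ["Cloudy"]}
# Component 3: a CPT per node, P(X_i | Parents(X_i)),
# indexed by a tuple of parent values.
cpt = {
    "Cloudy": {(): {True: 0.5, False: 0.5}},
    "Rain":   {(True,):  {True: 0.8, False: 0.2},
               (False,): {True: 0.1, False: 0.9}},
}

# Sanity check: every CPT row is a probability distribution over the
# node's values, so it must sum to 1.
for node in nodes:
    for row in cpt[node].values():
        assert abs(sum(row.values()) - 1.0) < 1e-9
```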
An uninstantiated node in a Bayesian Network
A node whose value is unknown.
Conditional independence
A and B are conditionally independent given C
if and only if
P( A ∩ B | C ) = P( A | C ) ⋅ P( B | C )
Denoted
( A ⟂ B ) | C
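The definition can be verified numerically on a small hand-built joint P(A, B, C), constructed (by assumption) so that A and B are independent given C:

```python
from itertools import product

# Assumed distributions; the joint factorises as P(C) P(A|C) P(B|C),
# which is exactly the condition (A ⟂ B) | C.
P_C = {0: 0.3, 1: 0.7}
P_A_given_C = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}
P_B_given_C = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.5, 1: 0.5}}

joint = {(a, b, c): P_C[c] * P_A_given_C[c][a] * P_B_given_C[c][b]
         for a, b, c in product((0, 1), repeat=3)}

def p(pred):
    """Probability of the event described by pred(a, b, c)."""
    return sum(v for k, v in joint.items() if pred(*k))

# Check P(A=a, B=b | C=c) == P(A=a | C=c) * P(B=b | C=c) everywhere.
for a, b, c in product((0, 1), repeat=3):
    pc = p(lambda A, B, C: C == c)
    lhs = p(lambda A, B, C: (A, B, C) == (a, b, c)) / pc
    rhs = (p(lambda A, B, C: A == a and C == c) / pc) * \
          (p(lambda A, B, C: B == b and C == c) / pc)
    assert abs(lhs - rhs) < 1e-9
```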
D-separation
States that:
If two variables A and B are d-separated given some evidence e, they are conditionally independent given that evidence.
P(A | B, e) = P(A | e)
Chain rule
The joint probability distribution is the product of all the individual probability distributions that are stored in the nodes of the graph.
P( X₁, …, Xₙ ) = ∏ᵢ P( Xᵢ | Parents(Xᵢ) )
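The chain rule can be sketched on a three-node chain A → B → C (structure and numbers are illustrative assumptions): the joint is just the product of each node's CPT entry given its parent.

```python
from itertools import product

# Assumed CPTs for the chain A -> B -> C.
P_A = {True: 0.6, False: 0.4}
P_B_given_A = {True: {True: 0.7, False: 0.3},
               False: {True: 0.2, False: 0.8}}
P_C_given_B = {True: {True: 0.9, False: 0.1},
               False: {True: 0.5, False: 0.5}}

def joint(a, b, c):
    """P(A, B, C) = P(A) * P(B | A) * P(C | B) by the chain rule."""
    return P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]

# The joint must sum to 1 over all assignments.
total = sum(joint(a, b, c) for a, b, c in product((True, False), repeat=3))
assert abs(total - 1.0) < 1e-9
```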
Parents of node Xᵢ
The nodes that have edges into Xᵢ.
Hybrid networks
Networks that contain both discrete and continuous variables.
Clique
A clique of a graph is a subset of its nodes such that every two distinct nodes in the clique are adjacent.
Graph separation in an undirected graph
X_A is conditionally independent of X_C given X_B if there is no path from any a ∈ A to any c ∈ C after removing all variables in B.
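This check is just reachability after deleting B, so it can be sketched with a BFS (the chain graph in the example is an assumption):

```python
from collections import deque

def separated(edges, A, C, B):
    """True if no path connects a node in A to a node in C once all
    nodes in B are removed from the undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = set(A) - set(B)
    seen, queue = set(start), deque(start)
    while queue:
        u = queue.popleft()
        if u in C:
            return False  # found a path from A to C avoiding B
        for v in adj.get(u, set()) - set(B) - seen:
            seen.add(v)
            queue.append(v)
    return True

# Chain a - b - c: b separates a from c, but the empty set does not.
edges = [("a", "b"), ("b", "c")]
assert separated(edges, {"a"}, {"c"}, {"b"})
assert not separated(edges, {"a"}, {"c"}, set())
```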