Lect 21 and 22 Flashcards
Learning a graph structure vs. learning latent variables
Learning a graph structure:
There are multiple graphs
Find which graph works best according to some criterion.
Fit the training set and check the fit on the validation set.
Iteratively search for a graph close to the best one by making small changes (see the sketch below), such as:
removing/adding/flipping edges
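A minimal hill-climbing sketch of this kind of structure search, assuming hypothetical score (a validation criterion, higher is better) and neighbors (graphs differing by one removed/added/flipped edge) helpers that are not from the lecture:

```python
def structure_search(initial_graph, score, neighbors, max_iters=100):
    """Greedy hill-climbing over graph structures.

    score(graph) is an assumed validation criterion (higher is better);
    neighbors(graph) is assumed to yield graphs that differ from `graph`
    by one removed/added/flipped edge. Both are placeholders.
    """
    current = initial_graph
    current_score = score(current)
    for _ in range(max_iters):
        # Score every graph that is one edge-change away from the current one.
        candidates = [(score(g), g) for g in neighbors(current)]
        if not candidates:
            break
        best_score, best_graph = max(candidates, key=lambda t: t[0])
        if best_score <= current_score:
            break  # no single-edge change improves the criterion
        current, current_score = best_graph, best_score
    return current
```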
Learning latent variables:
Use just one graph structure.
There are lots of latent variables.
Each latent variable is connected to the visible (observed) variables as in a dense layer (each one is connected to each of the others).
Learn with gradient descent.
Shallow Restricted Boltzmann Machine
There are latent and observed (visible) variables, connected densely to each other.
Our goal is PROBABILISTIC INFERENCE: if we can estimate the joint probability, then we can perform probabilistic inference.
To do this we use the joint probability distribution, from which we get the conditionals:
P(v|h) and P(h|v)
We want to find them in order to start the process of learning.
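A toy, hedged illustration of that idea: if the full joint table were available, the conditional would just be a normalized slice of it (the numbers below are made up):

```python
import numpy as np

# Toy joint distribution over one binary visible and one binary hidden
# variable, indexed as joint[v, h]; the probabilities are illustrative.
joint = np.array([[0.1, 0.3],
                  [0.4, 0.2]])

v = 1                                      # observed value of v
p_h_given_v = joint[v] / joint[v].sum()    # P(h|v) = P(v, h) / P(v)
print(p_h_given_v)                         # [0.6667, 0.3333]
```

In a real model the joint cannot be tabulated like this, which is why the derivation below rearranges it instead.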
P(v = v, h = h) = (1/Z) exp{-E(v, h)}
where the energy function is:
E(v, h) = -b^T v - c^T h - v^T W h
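A minimal NumPy sketch of this energy function; the sizes and random parameters below are illustrative, not from the lecture:

```python
import numpy as np

def rbm_energy(v, h, b, c, W):
    """E(v, h) = -b^T v - c^T h - v^T W h"""
    return -b @ v - c @ h - v @ W @ h

# Tiny example: 3 visible units, 2 hidden units.
rng = np.random.default_rng(0)
b, c = rng.normal(size=3), rng.normal(size=2)
W = rng.normal(size=(3, 2))
v = np.array([1.0, 0.0, 1.0])
h = np.array([0.0, 1.0])
print(rbm_energy(v, h, b, c, W))
```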
In order to find P(h|v) = P(v, h) / P(v), computing P(v) directly is not possible (it would require summing the joint over every configuration). Writing it out:
P(h|v) = (1/P(v)) * (1/Z) * exp(b^T v + c^T h + v^T W h)
Because the observed variable v is constant, the factor exp(b^T v), together with 1/Z and 1/P(v), can be collected into a single normalizer 1/Z':
P(h|v) = (1/Z') exp( Σ_j (c_j h_j + v^T W_{:,j} h_j) )
       = (1/Z') Π_j exp( c_j h_j + v^T W_{:,j} h_j )
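For binary hidden units, each factor in this product normalizes on its own, which gives the standard sigmoid conditional P(h_j = 1 | v) = sigmoid(c_j + v^T W_{:,j}). A small sketch, reusing the shapes from the energy example above:

```python
import numpy as np

def p_h_given_v(v, c, W):
    """P(h_j = 1 | v) for binary hidden units.

    Each factor exp(c_j h_j + v^T W[:, j] h_j) in the product above
    normalizes independently over h_j in {0, 1}, giving a sigmoid.
    """
    return 1.0 / (1.0 + np.exp(-(c + v @ W)))
```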