Generative models Flashcards
What is density estimation?
Estimate a probability density function from a finite set of discrete samples (the observed data).
What can we do with generative models?
Draw new samples, colorization, super resolution, simulation, creating latent representations, …
What is the main difference between autoencoders and variational autoencoders?
Autoencoders learn a latent representation, while variational autoencoders learn an explicit density.
What is the reparametrization trick in variational autoencoders?
If we sample directly from a Gaussian with the mean/std computed by the encoder, we can’t backpropagate the gradient through the random sampling step to the encoder. Instead we sample epsilon from a standard normal distribution and compute z = mean + std * epsilon, so the sample becomes a deterministic, differentiable function of the encoder outputs.
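A minimal sketch of the trick, assuming a PyTorch-style setup; the names mu and log_var are illustrative stand-ins for whatever the encoder actually outputs:

```python
import torch

def reparameterize(mu, log_var):
    # Sample eps from a standard normal, then shift/scale it.
    # z is a differentiable function of mu and log_var, so gradients
    # can flow back into the encoder.
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)
    return mu + std * eps

# Toy usage: pretend these came out of the encoder.
mu = torch.zeros(4, 8, requires_grad=True)
log_var = torch.zeros(4, 8, requires_grad=True)
z = reparameterize(mu, log_var)
z.sum().backward()   # gradients reach mu and log_var
```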
In a variational autoencoder, how can we view the training data?
We can view the training data as being generated from an underlying distribution: the latent vector z is a sample from the true prior p(z), and x is a sample from the true conditional p(x|z).
Why can’t we directly maximize the data likelihood p_theta(x) to get the parameters in variational autoencoders?
p_theta(x) = integral p_theta(x|z) * p_theta(z) dz is intractable.
So is the posterior p_theta(z|x) = p_theta(x|z) * p_theta(z) / p_theta(x), since its denominator is that same intractable integral.
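A step this card leaves implicit (standard VAE material, not stated above): instead of maximizing p_theta(x) directly, one introduces an approximate posterior q_phi(z|x) and maximizes the evidence lower bound:

```latex
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{z \sim q_\phi(z|x)}\big[\log p_\theta(x|z)\big]
  \;-\; \mathrm{KL}\big(q_\phi(z|x)\,\|\,p_\theta(z)\big)
```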
What kind of loss do we use for GANs?
Binary cross entropy (the loss is written from the discriminator’s perspective). It simplifies to the minimax objective:
min_G max_D V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
In practice we train the generator with min_G -E_z[log D(G(z))] instead, since that gradient is stronger when the generator’s samples are poor.
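A short sketch of both losses, assuming PyTorch and a discriminator that outputs raw logits (the function and tensor names are illustrative):

```python
import torch
import torch.nn.functional as F

def gan_losses(d_real, d_fake):
    """d_real / d_fake: discriminator logits on real and generated batches."""
    ones, zeros = torch.ones_like(d_real), torch.zeros_like(d_fake)
    # Discriminator: maximize log D(x) + log(1 - D(G(z)))
    d_loss = (F.binary_cross_entropy_with_logits(d_real, ones)
              + F.binary_cross_entropy_with_logits(d_fake, zeros))
    # Generator (non-saturating trick): minimize -log D(G(z))
    g_loss = F.binary_cross_entropy_with_logits(d_fake, ones)
    return d_loss, g_loss
```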
What is the goal for the GAN?
To reach the point where the discriminator can’t separate real from fake samples, meaning D(x) = 0.5.
Why do we experience mode collapse in GANs?
Standard maximum likelihood training is equivalent to minimizing the forward KL divergence
KL(p, q) = integral p(x) log[p(x)/q(x)] dx, where p(x) is the true distribution. This “loss” is low when q(x) is high at all modes of p(x), meaning q(x) will cover every mode, possibly at the cost of sample quality.
A GAN whose generator maximizes log(D(G(z))) instead of minimizing log(1 - D(G(z))) effectively minimizes a reverse KL term:
KL(q, p) = integral q(x) log[q(x)/p(x)] dx.
This loss heavily penalizes low-quality samples but barely penalizes missing modes, leading to mode collapse.
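A toy numeric illustration of the asymmetry (my own sketch, assuming a bimodal true p and a single-mode q on a 1-D grid): forward KL blows up when q misses a mode, while reverse KL stays small.

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]

# True distribution p: two well-separated modes.
p = 0.5 * norm.pdf(x, -4, 1) + 0.5 * norm.pdf(x, 4, 1)
# Model q: covers only one of the modes.
q = norm.pdf(x, 4, 1)

eps = 1e-300  # guard against log(0)
kl_forward = np.sum(p * np.log((p + eps) / (q + eps))) * dx  # KL(p, q): large
kl_reverse = np.sum(q * np.log((q + eps) / (p + eps))) * dx  # KL(q, p): ~log 2

print(f"forward KL = {kl_forward:.2f}, reverse KL = {kl_reverse:.2f}")
```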
What is GAN arithmetic?
In GANs we can do arithmetic on the latent z vectors.
For example, take the average z of images of men with glasses, subtract the average z of men without glasses, and add the average z of women without glasses; the resulting z generates women with glasses.
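A sketch of that arithmetic, assuming PyTorch; the z_* tensors below are random placeholders for latent codes you would actually collect by inspecting generated images:

```python
import torch

# Hypothetical batches of latent codes, grouped by attribute of the images
# they generated (random stand-ins here).
z_men_glasses = torch.randn(16, 128)
z_men_plain   = torch.randn(16, 128)
z_women_plain = torch.randn(16, 128)

z_new = (z_men_glasses.mean(0)
         - z_men_plain.mean(0)
         + z_women_plain.mean(0))

# image = generator(z_new.unsqueeze(0))   # expected: a woman with glasses
```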
What is an adversarial autoencoder?
We add a discriminator network to an autoencoder and feed it both z samples drawn from a chosen prior and z codes produced by the encoder. The adversarial loss forces the encoder to produce latent codes that match the prior distribution.
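A minimal sketch of the adversarial part, assuming PyTorch; the network sizes and the N(0, I) prior are illustrative choices, and the usual reconstruction loss is omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim = 8
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, latent_dim))
disc    = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, 1))

x = torch.rand(32, 784)                  # toy batch of inputs
z_enc   = encoder(x)                     # codes from the encoder
z_prior = torch.randn(32, latent_dim)    # samples from the chosen prior

# Discriminator learns to tell prior samples from encoder codes...
d_loss = (F.binary_cross_entropy_with_logits(disc(z_prior), torch.ones(32, 1))
          + F.binary_cross_entropy_with_logits(disc(z_enc), torch.zeros(32, 1)))
# ...while the encoder is additionally trained to fool it.
g_loss = F.binary_cross_entropy_with_logits(disc(z_enc), torch.ones(32, 1))
```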
How do conditional GANs work?
Like a normal GAN, except that we feed a class label into the generator, and the real images shown to the discriminator are drawn from that class. This can help with multi-modal data.
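One common way to condition the generator is to concatenate a one-hot class label with the noise vector; a small sketch assuming PyTorch (sizes and architecture are illustrative):

```python
import torch
import torch.nn as nn

num_classes, z_dim = 10, 100

# Hypothetical conditional generator: label is one-hot encoded and
# concatenated with the noise before being decoded into an image.
gen = nn.Sequential(nn.Linear(z_dim + num_classes, 256), nn.ReLU(),
                    nn.Linear(256, 784), nn.Tanh())

z = torch.randn(32, z_dim)
labels = torch.randint(0, num_classes, (32,))
y = nn.functional.one_hot(labels, num_classes).float()
fake_images = gen(torch.cat([z, y], dim=1))   # images conditioned on the class
```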
What is image-to-image translation?
For example, feeding sketches into the generator and expecting realistic images out. The discriminator then determines whether the sketch/image pair is real or fake.
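A rough sketch of the pair discriminator idea, assuming PyTorch (a pix2pix-style setup; layer sizes are illustrative): the sketch and the image are concatenated along the channel dimension so the discriminator judges the pair, not the image alone.

```python
import torch
import torch.nn as nn

pair_disc = nn.Sequential(
    nn.Conv2d(3 + 3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1),
)

sketch = torch.rand(1, 3, 64, 64)
image  = torch.rand(1, 3, 64, 64)       # real photo or generator output
score_map = pair_disc(torch.cat([sketch, image], dim=1))  # patch-level real/fake scores
```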