VQ-VAE Flashcards
VQ-VAE - What are the 3 main contributions of VQ-VAE compared to the VAE?
1 Quantisation of the latent space
2 Restrict each latent to one of a finite set of learned vectors (the codebook), chosen by nearest-neighbour lookup
3 The prior over the discrete codes is learned rather than fixed
VAE - What are the 4 main principles that the VAE is based on?
1 AE framework with continuous latents
2 A Gaussian sampling procedure
3 The prior of the latent vector is a standard Gaussian distribution (mean mu=0, standard deviation sigma=1)
4 A reconstruction loss and a KL divergence loss
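VQ-VAE - How does it change the AE framework?
There’s a quantisation step in the middle which converts the continuous output of the encoder into a discrete code: each encoder output vector is replaced by its nearest codebook vector
VQ-VAE - How does it change the sampling procedure?
During training the prior is assumed to be uniform over the codebook, and the encoder's posterior over codes is deterministic (one-hot): all of its mass sits on the codeword nearest to the encoder output, so the decoder input is chosen by a nearest-neighbour lookup rather than by random sampling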
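The quantisation step can be sketched as a nearest-neighbour lookup. This is only a minimal illustration in plain Python: the function name `quantise`, the toy 2-D codebook and the example encoder output are all made up for the card, not taken from any particular implementation.

```python
def quantise(z_e, codebook):
    """Return (index, codeword) of the codebook entry nearest to z_e."""
    def sq_dist(a, b):
        # Squared Euclidean distance between two equal-length vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Pick the index of the codeword with the smallest distance to z_e.
    index = min(range(len(codebook)), key=lambda k: sq_dist(z_e, codebook[k]))
    return index, codebook[index]


# Hypothetical toy codebook with K=3 codewords of dimension 2.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]

# Hypothetical continuous encoder output for one position.
index, z_q = quantise([0.9, 1.2], codebook)
```

The decoder then receives `z_q` (a codebook vector) instead of the continuous encoder output.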
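VQ-VAE - What happens to the KL divergence?
With a one-hot posterior and a uniform prior over K codewords it becomes the constant log K, so it is dropped from the training objective
VQ-VAE - What happens to the prior?
It is kept uniform while the VQ-VAE itself is trained; afterwards a separate autoregressive prior over the discrete codes (e.g. a PixelCNN for images) is fit and used for generation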
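A quick numerical check of why the KL term is constant, assuming a one-hot posterior and a uniform prior over K codewords. The helper `kl_divergence`, the value K=8 and the selected index 3 are arbitrary choices for illustration.

```python
import math


def kl_divergence(q, p):
    """KL(q || p) for discrete distributions; terms with q_k = 0 contribute 0."""
    return sum(qk * math.log(qk / pk) for qk, pk in zip(q, p) if qk > 0)


K = 8                          # hypothetical codebook size
prior = [1.0 / K] * K          # uniform prior over the codewords

posterior = [0.0] * K          # one-hot posterior: all mass on one codeword
posterior[3] = 1.0

kl = kl_divergence(posterior, prior)   # equals log K whichever index is chosen
```

Because the value depends only on K and not on the input or on any parameter, it contributes no gradient.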
VQ-VAE - what is the loss?
The reconstruction loss, the codebook loss and the commitment loss
VQ-VAE - What is the codebook loss?
The squared error between the encoder output and its nearest codeword, with a stop-gradient on the encoder output so that only the codeword is updated (it is pulled towards the encoder output)
VQ-VAE - What is the commitment loss?
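The squared distance between the encoder output and the stop-gradient of its nearest codeword, weighted by a coefficient beta, so that only the encoder is updated and stays close to the codeword it selected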
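The codebook and commitment terms have the same forward value up to the weight beta; they differ only in which side receives gradient. The sketch below makes that explicit by writing the gradients out by hand instead of using an autodiff library. The function name `vq_terms`, the example vectors and beta=0.25 are assumptions for illustration, and it uses a sum of squares rather than a mean (the two differ only by a constant factor).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_terms(z_e, e, beta=0.25):
    """Codebook and commitment terms for encoder output z_e and codeword e."""
    # Codebook loss ||sg[z_e] - e||^2: z_e is treated as a constant,
    # so the gradient flows only into the codeword e.
    codebook_loss = sq_dist(z_e, e)
    grad_e = [2 * (ek - zk) for zk, ek in zip(z_e, e)]

    # Commitment loss beta * ||z_e - sg[e]||^2: e is treated as a constant,
    # so the gradient flows only into the encoder output z_e.
    commitment_loss = beta * sq_dist(z_e, e)
    grad_z_e = [2 * beta * (zk - ek) for zk, ek in zip(z_e, e)]

    return codebook_loss, commitment_loss, grad_e, grad_z_e


# Hypothetical encoder output and its nearest codeword.
codebook_loss, commitment_loss, grad_e, grad_z_e = vq_terms([0.5, 2.0], [1.0, 1.0])
```

A gradient-descent step on `grad_e` moves the codeword towards the encoder output, while a step on `grad_z_e` moves the encoder output towards the codeword.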