AutoSDF Flashcards
AutoSDF - What is the problem with an encoder that works on the whole 3D shape rather than on patches?
Each latent vector sees the whole shape, which interferes with shape completion: we want a partial view of a shape to correspond to a subset of the latent codes, as in the sketch below.
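A minimal sketch of the patch-wise idea (the 64³ volume, 8³ patch size, and encoder layers are assumptions for illustration, not the paper's exact architecture): each patch is encoded independently, so a latent code depends only on its local region.

```python
import torch
import torch.nn as nn

# hypothetical patch encoder: maps one 8^3 patch to a single latent vector
encoder = nn.Sequential(
    nn.Conv3d(1, 32, kernel_size=4, stride=2, padding=1),   # 8 -> 4
    nn.ReLU(),
    nn.Conv3d(32, 64, kernel_size=4, stride=2, padding=1),  # 4 -> 2
    nn.ReLU(),
    nn.Conv3d(64, 128, kernel_size=2),                      # 2 -> 1
)

sdf = torch.randn(1, 1, 64, 64, 64)                  # dummy SDF volume
p = 8                                                # assumed patch size
patches = (sdf.unfold(2, p, p).unfold(3, p, p).unfold(4, p, p)
              .reshape(-1, p, p, p).unsqueeze(1))    # (512, 1, 8, 8, 8)
z = encoder(patches).view(8, 8, 8, 128)              # 8^3 grid of local latents
```

Because each latent only sees its own patch, observing part of a shape pins down the corresponding subset of latent codes and leaves the rest free to be completed.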
AutoSDF - What is the problem with training transformers directly on 3D shapes?
The complexity of the attention mechanism grows quadratically with the number of input tokens, and a raw 3D voxel grid yields a very long token sequence.
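A back-of-the-envelope comparison of the quadratic attention cost on raw voxels versus a small latent grid (the 64³ and 8³ sizes are assumptions for illustration):

```python
full_tokens = 64 ** 3                        # one token per voxel: 262,144 tokens
latent_tokens = 8 ** 3                       # one token per latent code: 512 tokens

print(full_tokens ** 2)                      # ~6.9e10 attention entries per layer
print(latent_tokens ** 2)                    # 262,144 attention entries per layer
print((full_tokens // latent_tokens) ** 2)   # ~262,000x fewer entries
```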
AutoSDF - How is the complexity problem dealt with?
They use a VQ-VAE to learn a discrete latent representation of each 3D shape; the transformer is then trained on this much lower-dimensional input.
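A minimal sketch of the core VQ-VAE step, nearest-neighbour quantization (the codebook size K=512 and dimension D=128 are assumptions; the straight-through gradient and commitment loss are omitted):

```python
import torch

def quantize(z_e, codebook):
    """Snap continuous encoder outputs to their nearest codebook vectors.

    z_e:      (N, D) continuous encoder outputs
    codebook: (K, D) learnable embedding vectors
    returns discrete indices (N,) and quantized vectors (N, D)
    """
    dists = torch.cdist(z_e, codebook)   # (N, K) pairwise distances
    idx = dists.argmin(dim=1)            # index of the nearest code
    return idx, codebook[idx]

codebook = torch.randn(512, 128)         # assumed K=512 codes of dim 128
z_e = torch.randn(8 * 8 * 8, 128)        # latents from the 8^3 grid
idx, z_q = quantize(z_e, codebook)       # idx is the discrete shape sequence
```

The discrete indices `idx` are what the transformer models, turning shape generation into sequence modelling over a small vocabulary.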
AutoSDF - What ordering of the latent vectors is typically assumed, and how is it used to factorize the distribution?
A raster-scan ordering, which autoregressive models use to break down the distribution: p(Z) = Π_{i=1..d³} p(z_i | z_{<i}), where the d×d×d latent grid is flattened into a sequence z_1, …, z_{d³}.
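A small sketch of what this factorization means in practice (grid size and vocabulary are assumptions, and the logits are random stand-ins for a causal model's outputs):

```python
import torch
import torch.nn.functional as F

d = 8
grid = torch.randint(0, 512, (d, d, d))   # discrete latent indices (dummy)
seq = grid.reshape(-1)                    # raster-scan flattening, length d^3

# by the chain rule, log p(Z) is the sum of log p(z_i | z_<i);
# logits[i] would come from a causal model conditioned on seq[:i]
logits = torch.randn(d ** 3, 512)
log_probs = F.log_softmax(logits, dim=-1)
log_p_Z = log_probs[torch.arange(d ** 3), seq].sum()
```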
AutoSDF - What problem arises from the raster-scan factorization?
For shape completion we don't want to be restricted to completing only the trailing tokens of a fixed prefix; most of the time the 'seen' tokens sit at arbitrary locations.
AutoSDF - How do they overcome the raster-scan ordering problem?
They assume the joint distribution can be broken down in terms of a randomly chosen observed set of latent variables, rather than a fixed raster-scan prefix; see the sketch below.
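A sketch of how such random observed sets could be drawn during training (the sampling scheme here is an illustrative assumption):

```python
import torch

n = 8 ** 3
perm = torch.randperm(n)                  # a random ordering of latent positions
k = torch.randint(0, n, (1,)).item()      # how many tokens count as observed
observed_pos = perm[:k]                   # conditioning set O (random locations)
target_pos = perm[k:]                     # positions the model must predict

# training over many random (observed_pos, target_pos) splits teaches the
# model p(z_i | z_O) for arbitrary observed sets O, not just raster prefixes
```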
AutoSDF - How do they model the prediction problem?
The distribution of the latent variable at an arbitrary location i is modelled by a transformer, conditioned on all currently observed variables.
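One hedged way to realize this with standard PyTorch modules is masked modelling: mask the unobserved positions and let a non-causal transformer predict a distribution at every masked location. The sizes and the [MASK]-token trick are assumptions, not the paper's exact recipe:

```python
import torch
import torch.nn as nn

K, D, n = 512, 128, 8 ** 3                # assumed vocab, width, sequence length

tok_emb = nn.Embedding(K + 1, D)          # +1 for a [MASK] token at index K
pos_emb = nn.Embedding(n, D)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=8, batch_first=True),
    num_layers=2,
)
head = nn.Linear(D, K)

seq = torch.randint(0, K, (1, n))         # full latent sequence (dummy)
mask = torch.rand(1, n) < 0.5             # randomly chosen unobserved positions
inp = seq.masked_fill(mask, K)            # hide unobserved values behind [MASK]
h = encoder(tok_emb(inp) + pos_emb(torch.arange(n)))
logits = head(h)                          # logits[:, i] ~ p(z_i | observed z's)
```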
AutoSDF - What is the naive decomposition of p(Z|C) (the latent variables given a condition)?
p(Z|C) = Π_i p(z_i | z_{<i}, C)
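A hedged sketch of one way to realize p(z_i | z_{<i}, C): prepend condition embeddings (e.g., pooled image features) as prefix tokens of a causal transformer, so every z_i can attend to all of C. The prefix scheme and all sizes are assumptions; positional embeddings are omitted for brevity:

```python
import torch
import torch.nn as nn

D, K, n, n_cond = 128, 512, 8 ** 3, 4       # assumed width, vocab, prefix length

cond = torch.randn(1, n_cond, D)            # condition tokens C (dummy features)
tok_emb = nn.Embedding(K, D)
seq = torch.randint(0, K, (1, n))           # latent sequence z_1..z_n (dummy)
x = torch.cat([cond, tok_emb(seq)], dim=1)  # C first, so each z_i attends to C

causal = nn.Transformer.generate_square_subsequent_mask(n_cond + n)
layer = nn.TransformerEncoderLayer(d_model=D, nhead=8, batch_first=True)
h = nn.TransformerEncoder(layer, num_layers=2)(x, mask=causal)
logits = nn.Linear(D, K)(h[:, n_cond - 1 : -1])  # position i-1 predicts z_i
```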