Week 5: Modern Computer Vision Applications 2 Flashcards
What is GAN in the context of image generation?
GAN stands for Generative Adversarial Network, a deep learning framework in which two neural networks, a generator and a discriminator, compete in a game-like setup so that the generator learns to produce realistic images.
What does GAN stand for in the context of pre-trained architectures on various datasets?
GAN stands for Generative Adversarial Network, which involves two neural networks, a generator and a discriminator, trained adversarially to produce synthetic data. A pre-trained GAN is one that has already been trained on a large dataset and can generate realistic images resembling that dataset.
What is the goal of generative models?
To generate realistic data, such as images, text, or audio.
What are some examples of generative models?
Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), both of which belong to the broader family of Deep Generative Models (DGMs).
What is the basic structure of a GAN?
It consists of two neural networks: a generator and a discriminator. The generator tries to create realistic data, while the discriminator tries to distinguish between real and fake data.
How are GANs trained?
The generator and discriminator are trained in an adversarial manner. The generator is updated to better fool the discriminator, and the discriminator is updated to better identify fake data.
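The two competing updates can be made concrete by looking at the losses involved. The following is a minimal sketch in plain Python, assuming the discriminator outputs a probability that its input is real and that the generator uses the common non-saturating loss. The function names and the example probabilities are illustrative and not taken from the course material.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(real) toward 1 and D(fake) toward 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss (an assumption): push D(fake) toward 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability that each sample is real).
d_real = [0.9, 0.8]
d_fake_early = [0.1, 0.2]  # discriminator easily spots the fakes
d_fake_later = [0.5, 0.6]  # generator has improved

print("D loss, early fakes:", discriminator_loss(d_real, d_fake_early))
print("G loss, early fakes:", generator_loss(d_fake_early))
print("G loss, later fakes:", generator_loss(d_fake_later))
```

As the fakes become more convincing, the generator loss falls and the discriminator loss rises. In an actual training loop each loss would be backpropagated through its own network in alternating steps while the other network's weights are held fixed.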
What are some applications of image-to-image translation?
Photo editing, style transfer, medical imaging, and data augmentation.
What is pix2pix?
A conditional GAN architecture for paired image-to-image translation. It requires a training dataset of aligned image pairs (e.g., each grayscale image matched with its color version).
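Paired data matters because, with a ground-truth target for every input, the generator can be penalized directly for deviating from that target. The sketch below assumes the objective described in the original pix2pix paper: an adversarial term plus an L1 reconstruction term weighted by 100. Images are flattened to plain lists of floats, `d_fake_prob` stands for the conditional discriminator's probability that the (input, generated) pair is real, and all names are hypothetical.

```python
import math

def l1_distance(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def pix2pix_generator_loss(d_fake_prob, generated, target, l1_weight=100.0):
    """Adversarial term (fool the discriminator) plus weighted L1 to the paired target."""
    adversarial = -math.log(d_fake_prob)
    return adversarial + l1_weight * l1_distance(generated, target)

# Toy pixel values; a real image would be a tensor of shape (channels, height, width).
target = [0.2, 0.4, 0.6]   # ground-truth color pixels for one input
close = [0.2, 0.4, 0.7]    # generated output near the target
far = [0.9, 0.1, 0.0]      # generated output far from the target

print("loss, close output:", pix2pix_generator_loss(0.5, close, target))
print("loss, far output:  ", pix2pix_generator_loss(0.5, far, target))
```

With the discriminator score held equal, the output closer to the paired target receives the lower loss, which is only possible because the target is known.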
What is CycleGAN?
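A GAN architecture for unpaired image-to-image translation. It does not require paired images. Instead, it trains two generators, one for each direction, and uses a cycle consistency loss to ensure that translating an image to the other domain and back recovers the original image, which preserves its content. The adversarial losses are what push the translated images to look realistic.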
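The cycle consistency idea can be sketched as a round-trip reconstruction error. The example below is a toy illustration in plain Python, assuming an L1 distance and two mappings `g` (domain X to Y) and `f` (domain Y to X). Simple arithmetic functions stand in for the generator networks, and all names and values are hypothetical.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized lists."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def cycle_consistency_loss(x, y, g, f):
    """Round-trip error: x -> g -> f should return x, and y -> f -> g should return y."""
    forward = l1_distance(f(g(x)), x)
    backward = l1_distance(g(f(y)), y)
    return forward + backward

# Toy stand-ins for the two generators.
def g(x):
    return [2.0 * v + 1.0 for v in x]     # X -> Y

def f_good(y):
    return [(v - 1.0) / 2.0 for v in y]   # Y -> X, exact inverse of g

def f_bad(y):
    return [v / 2.0 for v in y]           # Y -> X, not an inverse of g

x = [0.0, 0.5, 1.0]   # sample from domain X
y = [1.0, 2.0, 3.0]   # unpaired sample from domain Y

print(cycle_consistency_loss(x, y, g, f_good))  # 0.0
print(cycle_consistency_loss(x, y, g, f_bad))   # 1.5
```

Note that `x` and `y` never need to correspond to each other: each sample is only compared with its own round-trip reconstruction, which is why unpaired datasets suffice.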
What are some challenges with training GANs?
Unstable training, mode collapse, and difficulty in evaluating the quality of the generated data.