Generative Adversarial Network (GAN) is a term you will encounter often in artificial intelligence, machine learning, and deep learning. What a GAN is and what it does can be confusing when you first come across it. In this blog, we will demystify Generative Adversarial Networks: we’ll explain what they are and why they matter to you and your business. Let’s start with what a GAN is.
What is a Generative Adversarial Network?
GANs are a class of machine learning algorithms that generate new data resembling a given training dataset. They were introduced by Ian Goodfellow et al. in 2014, although the idea of adversarial training dates back to Jürgen Schmidhuber in 1992. Later in this post we also introduce conditional GANs, which can be used to generate new data instances of a specific class or label.
The basic idea of GANs is to train two neural networks simultaneously: a generator network and a discriminator network. Both generative and discriminative models have been used widely across computer vision tasks, and generative models have recently received increasing research interest due to their potential applications. However, generative models such as GANs are usually challenging to train. One well-known problem is mode collapse, where the generator produces only a narrow range of outputs, so the generated samples lack diversity. The two networks are trained at the same time, and they iteratively compete against each other by updating their parameters as follows:
1) Generator generates new examples using randomly sampled noise vectors $z$ as input, where $z\sim p(z)$, typically a normal distribution $\mathcal{N}(0, I)$.
2) Discriminator evaluates how real the generated examples look to it and assigns probabilities to them accordingly.
3) Generator updates its parameters so that the discriminator makes mistakes more frequently when evaluating the example generated by it.
4) Discriminator updates its parameters so that it becomes better at distinguishing between real and fake examples.
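The four steps above can be sketched on a deliberately tiny problem. The example below is only an illustration under our own assumptions, not the setup from the original paper: the "real" data is one-dimensional and drawn from a normal distribution centered at 4, the generator is the linear map `a * z + b`, the discriminator is a single logistic unit `sigmoid(w * x + c)`, and the generator uses the common non-saturating objective (maximize log D(G(z))). All names and hyperparameters here are hypothetical.

```python
import math
import random


def sigmoid(t):
    # Clamp the input so math.exp cannot overflow for extreme values.
    t = max(-60.0, min(60.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def train_toy_gan(steps=500, batch=32, lr=0.05, seed=0):
    """Train a 1-D toy GAN and return ((a, b), (w, c))."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator G(z) = a * z + b (assumed toy model)
    w, c = 0.1, 0.0  # discriminator D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        # Step 1: the generator turns noise z ~ N(0, 1) into fake samples.
        z = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * zi + b for zi in z]
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]  # assumed real data

        # Step 2: the discriminator assigns a "looks real" probability.
        d_real = [sigmoid(w * x + c) for x in real]
        d_fake = [sigmoid(w * x + c) for x in fake]

        # Step 3: generator gradient of log D(G(z)).
        # d/dG log D(G) = (1 - D(G)) * w, then chain rule through a and b.
        grad_a = sum((1.0 - d) * w * zi for d, zi in zip(d_fake, z)) / batch
        grad_b = sum((1.0 - d) * w for d in d_fake) / batch

        # Step 4: discriminator gradient of log D(real) + log(1 - D(fake)).
        grad_w = (sum((1.0 - d) * x for d, x in zip(d_real, real))
                  - sum(d * x for d, x in zip(d_fake, fake))) / batch
        grad_c = (sum(1.0 - d for d in d_real) - sum(d_fake)) / batch

        # Both players take a gradient-ascent step on their own objective.
        a += lr * grad_a
        b += lr * grad_b
        w += lr * grad_w
        c += lr * grad_c
    return (a, b), (w, c)


(gen_a, gen_b), (disc_w, disc_c) = train_toy_gan()
print(f"generator: a={gen_a:.3f}, b={gen_b:.3f}")
print(f"discriminator: w={disc_w:.3f}, c={disc_c:.3f}")
```

Because both players keep adapting to each other, the parameters of such a toy model may oscillate rather than settle, which is a small-scale view of why GAN training is considered hard.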
The original GAN framework was proposed by Goodfellow et al.
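That framework is usually written as a two-player minimax game. In the formula below, $G$ is the generator, $D$ is the discriminator, and $p_{\text{data}}$ is the distribution of real examples; these symbols are our notation and are not defined elsewhere in this post, while $z$ and $p(z)$ are the noise vector and noise distribution from step 1.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p(z)}\left[\log\left(1 - D(G(z))\right)\right]
$$

The discriminator tries to make $V$ large by giving real examples a high probability and generated examples a low one, and the generator tries to make $V$ small by producing examples that the discriminator scores as real.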
Self-Attention Generative Adversarial Networks
Self-Attention Generative Adversarial Networks (SAGAN) is a deep learning approach for image generation. Researchers from Rutgers University and Google Brain introduced it in May 2018, and it drew wide attention after it was published, for a good reason: at the time, it reported state-of-the-art results for class-conditional image generation on ImageNet.
What is Self-Attention?
Self-attention is a mechanism that a model can use to pay attention to certain parts of its memory or input space instead of relying only on one fixed set of local filters (like CNNs) or on recurrence (like RNNs). This allows the model to understand its input space better, as long as it can figure out which parts of its input are relevant to each other.
SAGAN uses self-attention layers to help the generator attend to different, possibly distant, parts of the image while generating it. This allows the generator to develop an understanding of how the part of the image it is working on relates to other parts of the image.
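To make the idea concrete, here is a minimal sketch of a self-attention layer in plain Python. It is a simplified illustration and not the exact SAGAN layer: the image is assumed to be flattened into `N` positions with `C` channels each, the projection matrices `w_q`, `w_k`, and `w_v` are hypothetical names for learned weights, and the scale `gamma` with a residual connection is our reading of how such a layer is typically added to a generator.

```python
import math


def matmul(a, b):
    # Multiply two matrices given as lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v, gamma=0.0):
    """x is N positions by C channels; returns (output, attention_map)."""
    q = matmul(x, w_q)  # queries, N x d
    k = matmul(x, w_k)  # keys, N x d
    v = matmul(x, w_v)  # values, N x C
    k_t = [list(col) for col in zip(*k)]  # transpose of k, d x N
    scores = matmul(q, k_t)  # scores[i][j]: how much position i looks at j
    attn = [softmax(r) for r in scores]  # each row sums to 1
    attended = matmul(attn, v)  # weighted mix of values from all positions
    # Residual connection: with gamma = 0 the layer passes x through unchanged.
    out = [[gamma * o + xi for o, xi in zip(o_row, x_row)]
           for o_row, x_row in zip(attended, x)]
    return out, attn


# Three positions with two channels each, identity projections for simplicity.
identity = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, attention = self_attention(features, identity, identity, identity,
                                   gamma=1.0)
print(attention[0])  # how strongly position 0 attends to each position
```

Every position gets a weight for every other position, so the layer can relate regions that a small convolution filter would never see together.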
Conditional Generative Adversarial Networks
Conditional Generative Adversarial Networks, or cGANs, extend the GAN architecture we’ve covered in this tutorial. Instead of training a model to generate data completely from noise, as is done in vanilla GANs, cGANs allow you to take some control over the data that your model generates. This is accomplished by training your GAN on pairs of images and labels and conditioning both the generator and discriminator on this additional information.
Let’s apply cGANs to the MNIST dataset to demonstrate how this works. In MNIST, each image is a handwritten digit between 0 and 9. So we’ll condition our models on these digits and see if they can learn to generate images that correspond to them.
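One common way to condition the two networks is to append a one-hot encoding of the digit to each network's input. The sketch below shows only that input-building step, not a full model; the helper names and the noise size of 100 are our own assumptions, while 784 is simply the number of pixels in a 28x28 MNIST image.

```python
import random

NUM_CLASSES = 10  # MNIST digits 0 through 9
NOISE_DIM = 100   # assumed size of the noise vector
IMAGE_PIXELS = 28 * 28  # a flattened MNIST image


def one_hot(label, num_classes=NUM_CLASSES):
    # Encode a digit as a vector with a single 1.0 at the digit's index.
    if not 0 <= label < num_classes:
        raise ValueError(f"label must be in [0, {num_classes}), got {label}")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(noise, label):
    # The generator sees the noise plus the digit it is asked to draw.
    return list(noise) + one_hot(label)


def discriminator_input(image_pixels, label):
    # The discriminator sees the image plus the digit it supposedly shows.
    return list(image_pixels) + one_hot(label)


rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
g_in = generator_input(z, label=7)
d_in = discriminator_input([0.0] * IMAGE_PIXELS, label=7)
print(len(g_in), len(d_in))
```

Because the label is part of the discriminator's input too, a realistic-looking "3" presented with the label 7 can be rejected, which is what pushes the generator to respect the requested digit.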
Generative Adversarial Networks vs. Convolutional Neural Networks
Generative adversarial networks (GANs) are a class of deep learning models. A GAN can be used to generate new data that resembles the data it was trained on. This post discusses the intuition behind the framework and some specific variants of GANs.
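Convolutional neural networks (CNNs) are deep artificial neural networks that have been applied successfully to analyzing visual imagery. Their design is inspired by the organization of the animal visual cortex, whose individual neurons are arranged so that they respond to overlapping regions tiling the visual field. The two are not competing alternatives: a CNN is a network architecture, while a GAN is a training framework, and many GANs use convolutional networks as their generator and discriminator.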
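The "overlapping regions" idea corresponds to sliding a small filter over the image. As a rough illustration only, the hypothetical helper below applies one filter to a tiny grayscale image with no padding and a stride of 1; a real CNN layer learns many such filters and adds biases and non-linearities.

```python
def conv2d_valid(image, kernel):
    """Slide kernel over image (lists of rows), no padding, stride 1."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [
        [
            # Each output value depends only on one small image patch.
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h) for dj in range(k_w))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, 1],
]
# Neighboring outputs come from patches that overlap in the input.
print(conv2d_valid(image, kernel))  # [[6, 8], [12, 14]]
```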
Usage of GANs
GANs learn to produce new data with the same statistics as the training data. We can imagine the generator as a counterfeiter trying to produce fake currency and the discriminator as a detective trying to detect it. The counterfeiter is trying to get better at making fake bills, while the detective is simultaneously trying to get better at spotting them. Eventually, they reach a point where the detective can no longer tell the fake bills from the real ones.
GANs have applications in many fields:
- Image generation – Generating images of celebrity faces, anime characters, landscapes, etc.
- Video game design – Generating game content, such as levels or textures, for games that humans play
- Semantic image manipulation – Creating photo-realistic edits on existing images (like adding a smile)
- Image super-resolution – Increasing the resolution of an image by generating missing details from lower-resolution versions
- Text-to-image synthesis – Generating an image from a given text description
Advantages of GANs
GANs have many advantages:
- They can learn to create data that looks similar to data that we have. This allows us to generate new images, songs, or texts.
- They can learn to create images of high resolution.
- GANs enable unsupervised learning. In contrast with supervised deep learning approaches, GANs don’t require a human annotator to label the training data.
Disadvantages of GANs
A GAN is primarily a generative model, which means that it generates new data from scratch. This comes with several disadvantages:
- The main disadvantage is that, in a vanilla GAN, you can’t control what kind of data is generated. This may result in some weird and nonsensical outputs by the generator.
- The other major disadvantage is that training is unstable: the generator and discriminator may fail to converge, and the generator can suffer from mode collapse.
- It requires a lot of training data for the discriminator and generator to learn and produce results.
- Training GANs takes a lot more time than training other models because of this instability in the learning process.
So what is new about generative adversarial networks (GANs) is the twist: one neural network (the generator) tries to create images that have never been seen before, and in doing so tries to fool another neural network (the discriminator). This adversarial setup sets GANs apart from their predecessors, and it opens up a lot of interesting possibilities for machine learning.