DCGAN - Deep Convolutional GAN
INTRODUCTION:
In recent research, we have seen many CNN-based classification, detection, and segmentation architectures such as the Inception family, the YOLO family, FCN, U-Net, and KSAC.
All high-accuracy models require a large amount of labelled data. In the real world, data is usually unlabelled, so gathering and labelling it is time-consuming and error-prone.
DCGAN can help learn intermediate feature representations from unlabelled data, and those representations can then be leveraged for all kinds of supervised tasks.
As one might know, a GAN consists of two networks:
- Discriminator
- Generator
- The discriminator can be used for image classification tasks.
- The generator can be used to manipulate the semantic qualities of generated images.
DETAILS OF DCGAN:
Approach:
- Use LAPGAN:
- It helps to upscale low-resolution generated images.
- Replace max pooling with strided convolutions:
- Allows the network to learn its own spatial downsampling.
- Replace fully connected layers with global average pooling:
- Helps to increase stability but reduces the convergence speed.
- Use batch normalization:
- Stabilizes learning.
- Helps with poor weight initialization and improves gradient flow in deeper models (a discriminator built with these choices is sketched after this list).
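To make these choices concrete, here is a minimal PyTorch sketch of a DCGAN-style discriminator for 64x64 RGB images. The layer widths and kernel sizes are illustrative assumptions, not the exact configuration from the paper.

```python
import torch.nn as nn

# Minimal DCGAN-style discriminator sketch (assumed 64x64 RGB input).
# Strided convolutions replace max pooling, batch norm stabilizes training,
# and there are no fully connected layers at the end.
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),    # 64x64 -> 32x32
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),  # 32x32 -> 16x16
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1), # 16x16 -> 8x8
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(256, 512, kernel_size=4, stride=2, padding=1), # 8x8 -> 4x4
    nn.BatchNorm2d(512),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=0),   # 4x4 -> 1x1 real/fake score
    nn.Sigmoid(),
)
```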
Architecture:
Block Diagram:
Discriminator:
Generator:
The generator uses the Laplacian GAN (LAPGAN) architecture to generate high-quality samples of natural images.
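As a rough illustration, a DCGAN-style generator can be sketched in PyTorch as below; it maps a 100-dimensional noise vector to a 64x64 RGB image using fractionally-strided (transposed) convolutions. The channel sizes are assumptions for illustration.

```python
import torch.nn as nn

# Minimal DCGAN-style generator sketch: 100-d noise -> 64x64 RGB image.
# Transposed (fractionally-strided) convolutions learn the upsampling,
# batch norm is applied everywhere except the output layer, and Tanh
# maps the output to [-1, 1].
generator = nn.Sequential(
    nn.ConvTranspose2d(100, 512, kernel_size=4, stride=1, padding=0),  # 1x1 -> 4x4
    nn.BatchNorm2d(512),
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1),  # 4x4 -> 8x8
    nn.BatchNorm2d(256),
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # 8x8 -> 16x16
    nn.BatchNorm2d(128),
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # 16x16 -> 32x32
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),     # 32x32 -> 64x64
    nn.Tanh(),
)
```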
Training:
- mini-batch stochastic gradient descent.
- batch size = 128
- weight initialization: zero-centred normal distribution with a standard deviation of 0.02
- slope of Leaky ReLU = 0.2
- Adam optimizer
- learning rate = 0.0002
- β1 = 0.5 (these settings are collected in the sketch below)
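The hyperparameters above can be wired together roughly as follows. This is a sketch that assumes the `generator` and `discriminator` modules from the earlier snippets and uses Adam's default second-moment decay of 0.999.

```python
import torch
import torch.nn as nn

# Training hyperparameters listed above
batch_size = 128
lr = 0.0002
beta1 = 0.5
leaky_slope = 0.2  # used inside the discriminator's LeakyReLU layers

def init_weights(module):
    # Zero-centred normal initialization with standard deviation 0.02
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(module.weight, mean=0.0, std=0.02)

generator.apply(init_weights)
discriminator.apply(init_weights)

# Adam optimizers with beta1 = 0.5
opt_g = torch.optim.Adam(generator.parameters(), lr=lr, betas=(beta1, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr, betas=(beta1, 0.999))
```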
Adversarial Loss:
Every GAN model needs an adversarial loss. The adversarial loss pushes the distribution of generated images towards the distribution of the dataset.
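The objective referred to here is the standard GAN minimax game:

$$\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$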
Here, D tries to maximize the probability of classifying real images correctly via log D(x), while G tries to fool the discriminator by minimizing log(1 − D(G(z))).
Adversarial loss is therefore used in every GAN network, and it is applied to the output of the discriminator.
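In practice this objective is usually implemented with binary cross-entropy on the discriminator's output. The sketch below assumes the `generator`, `discriminator`, `opt_g`, and `opt_d` defined in the earlier snippets, plus a batch of real images.

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()  # binary cross-entropy on the discriminator's sigmoid output

def train_step(real_images):
    batch = real_images.size(0)
    real_targets = torch.ones(batch, 1)
    fake_targets = torch.zeros(batch, 1)

    # Discriminator step: maximize log D(x) + log(1 - D(G(z)))
    opt_d.zero_grad()
    noise = torch.randn(batch, 100, 1, 1)
    fake_images = generator(noise)
    d_loss = (criterion(discriminator(real_images).view(batch, 1), real_targets)
              + criterion(discriminator(fake_images.detach()).view(batch, 1), fake_targets))
    d_loss.backward()
    opt_d.step()

    # Generator step: fool the discriminator, i.e. push D(G(z)) towards 1
    opt_g.zero_grad()
    g_loss = criterion(discriminator(fake_images).view(batch, 1), real_targets)
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```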
Label Smoothing:
Label smoothing is a regularization technique that introduces noise into the labels, which helps stabilize DCGAN training.
This accounts for the fact that datasets may contain mistakes, so directly maximizing the likelihood log P(y|x) can be harmful.
Assume that, for a small constant ϵ, a training-set label y is correct with probability 1−ϵ and incorrect otherwise. Label smoothing regularizes a binary classifier by replacing the hard 0 and 1 classification targets with targets of ϵ and 1−ϵ respectively.
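A minimal way to apply this in the training step above is to soften the hard 0/1 targets. The value ϵ = 0.1 here is an illustrative choice, not one prescribed by the source.

```python
import torch

eps = 0.1  # illustrative smoothing constant

# Hard targets 1 and 0 become 1 - eps and eps respectively
real_targets = torch.full((batch_size, 1), 1.0 - eps)  # e.g. 0.9 for real images
fake_targets = torch.full((batch_size, 1), eps)        # e.g. 0.1 for fake images
```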
CONCLUSION:
This blog gave a basic understanding of DCGAN and the architectures of its discriminator and generator. The main focus is a stable network that can be trained for image generation on the specified classes. It also covered the architectural changes and training techniques involved, such as label smoothing and the adversarial loss function, and listed basic training hyperparameters to start with.

