We now turn to two of the most influential and widely used GAN variants:
🌀 DCGAN (Deep Convolutional GAN) and 🎨 StyleGAN
Both architectures were breakthroughs in image generation.
🔶 1. DCGAN (Deep Convolutional GAN)
📌 Introduction:
DCGAN is a convolution-based GAN architecture proposed in 2015 by Radford, Metz, and Chintala.
🎯 “It was one of the first GAN architectures to train stably at scale and generate good-quality images.”
🧠 Key Features:
| Feature | Description |
|---|---|
| 📦 Conv Layers | Convolutional layers in both the Generator and the Discriminator |
| 🧹 No Pooling | Strided convolutions (Discriminator) and transposed convolutions (Generator) replace pooling |
| 🔍 BatchNorm | Improves training stability and convergence |
| 📈 ReLU & LeakyReLU | ReLU in the Generator (Tanh at the output), LeakyReLU in the Discriminator |
| 🔥 Simplicity | Simple architecture with strong results |
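The "No Pooling" row can be made concrete with the standard output-size formulas for strided and transposed convolutions. The following is a minimal sketch, assuming square inputs, no dilation, and a kernel 4 / stride 2 / padding 1 setting (an illustrative choice, not something stated above). The helper names `conv_out` and `conv_transpose_out` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a strided convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding, output_padding=0):
    """Output size of a transposed convolution (no dilation)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Discriminator side: each stride-2 convolution halves the resolution,
# doing the job a pooling layer would otherwise do.
sizes_down = [64]
for _ in range(4):
    sizes_down.append(conv_out(sizes_down[-1], kernel=4, stride=2, padding=1))

# Generator side: each stride-2 transposed convolution doubles the resolution.
sizes_up = [4]
for _ in range(4):
    sizes_up.append(conv_transpose_out(sizes_up[-1], kernel=4, stride=2, padding=1))

print(sizes_down)  # [64, 32, 16, 8, 4]
print(sizes_up)    # [4, 8, 16, 32, 64]
```

Because the resampling is done by learned convolutions, the network learns its own downsampling and upsampling instead of relying on a fixed pooling rule.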
🧱 DCGAN Architecture:
Generator:
Input: Random noise vector (z)
→ Fully connected layer (projects z and reshapes it into a small feature map)
→ Transposed Conv + BatchNorm + ReLU
→ Transposed Conv + BatchNorm + ReLU
→ Transposed Conv + Tanh
→ Output: Fake image
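The Generator pipeline above could be sketched in PyTorch roughly as follows. This is an illustrative sketch, not a reference implementation: the latent size (100), channel widths, 4×4 starting feature map, and 3×32×32 output are all assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn


class Generator(nn.Module):
    """DCGAN-style generator sketch: z -> image with values in [-1, 1]."""

    def __init__(self, z_dim=100, base=64, out_channels=3):
        super().__init__()
        self.base = base
        # Fully connected layer: project z to a (base*4) x 4 x 4 feature map.
        self.project = nn.Linear(z_dim, base * 4 * 4 * 4)
        self.net = nn.Sequential(
            # 4x4 -> 8x8
            nn.ConvTranspose2d(base * 4, base * 2, kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(base * 2),
            nn.ReLU(inplace=True),
            # 8x8 -> 16x16
            nn.ConvTranspose2d(base * 2, base, kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(base),
            nn.ReLU(inplace=True),
            # 16x16 -> 32x32; no BatchNorm on the output layer, Tanh squashes to [-1, 1]
            nn.ConvTranspose2d(base, out_channels, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, z):
        x = self.project(z).view(z.size(0), self.base * 4, 4, 4)
        return self.net(x)


z = torch.randn(8, 100)   # batch of 8 random noise vectors
fake = Generator()(z)
print(fake.shape)         # torch.Size([8, 3, 32, 32])
```

Since the output passes through Tanh, real training images would need to be scaled to the same [-1, 1] range before being shown to the Discriminator.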
Discriminator:
Input: Image (real/fake)
→ Conv + LeakyReLU (the DCGAN paper omits BatchNorm on the Discriminator's input layer)
→ Conv + BatchNorm + LeakyReLU
→ Flatten
→ Fully connected layer + Sigmoid
→ Output: Real/Fake Probability
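A matching Discriminator sketch in PyTorch is shown below. The 3×32×32 input size and channel widths are assumptions for illustration. The LeakyReLU slope of 0.2 and the absence of BatchNorm on the first convolution follow the DCGAN paper's guidelines, so the first block is Conv + LeakyReLU only.

```python
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    """DCGAN-style discriminator sketch: image -> probability of being real."""

    def __init__(self, in_channels=3, base=64, image_size=32):
        super().__init__()
        self.features = nn.Sequential(
            # 32x32 -> 16x16; strided conv instead of pooling, no BatchNorm here
            nn.Conv2d(in_channels, base, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            # 16x16 -> 8x8
            nn.Conv2d(base, base * 2, kernel_size=4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(base * 2),
            nn.LeakyReLU(0.2, inplace=True),
        )
        feat = image_size // 4  # two stride-2 convs shrink each side by 4x
        self.classify = nn.Sequential(
            nn.Flatten(),
            nn.Linear(base * 2 * feat * feat, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.classify(self.features(x))


images = torch.randn(8, 3, 32, 32)  # stand-in for a batch of real or fake images
prob = Discriminator()(images)
print(prob.shape)                   # torch.Size([8, 1])
```

Each row of `prob` is a single value between 0 and 1, which pairs naturally with a binary cross-entropy loss during training.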
🔧 PyTorch Library Support:
nn.ConvTranspose2d # For upsampling (Generator)
nn.Conv2d # For downsampling (Discriminator)
nn.BatchNorm2d # For stability
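nn.ReLU / nn.Tanh # Generator hidden / output activations
nn.LeakyReLU / nn.Sigmoid # Discriminator hidden / output activations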
🧪 Applications:
- Handwritten digits (MNIST)
- Anime faces, bedrooms, shoes
- Prototype generation for design
🔷 2. StyleGAN (Style-based GAN)
📌 Introduction:
StyleGAN was introduced by NVIDIA in 2018 (Karras et al.).
At its release it set the state of the art for realistic face generation.
🎯 Websites such as “This Person Does Not Exist” are based on StyleGAN.
🧠 Key Features:
| Feature | Description |
|---|---|
| 🎨 Style-based architecture | Maps the noise vector to style vectors |
| 🧬 Progressive Growing | Training proceeds gradually from low to high resolution (inherited from Progressive GAN) |
| 🧠 AdaIN | Adaptive Instance Normalization for style control |
| 🌈 Latent Space Control | Tuning facial attributes (e.g., smile, age) |
| 📸 High-Res | Photo-quality image generation up to 1024×1024 |
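The Progressive Growing row can be illustrated with a small sketch of the resolution schedule and the fade-in blend used when a new, higher-resolution block is added. The function names `resolution_schedule` and `fade_in` are hypothetical, and scalars stand in for image tensors to keep the example dependency-free.

```python
def resolution_schedule(start=4, final=1024):
    """Resolutions visited during training, doubling at each stage."""
    schedule = [start]
    while schedule[-1] < final:
        schedule.append(schedule[-1] * 2)
    return schedule


def fade_in(alpha, upsampled_old, new):
    """Blend the upsampled output of the old block with the new block.

    alpha ramps from 0 to 1 while the new block is being introduced,
    so the new layers take over gradually instead of abruptly.
    """
    return (1.0 - alpha) * upsampled_old + alpha * new


print(resolution_schedule())   # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
print(fade_in(0.25, 0.0, 1.0)) # 0.25
```

In a real model `upsampled_old` and `new` would be image tensors of the same shape, and `alpha` would be increased over many training iterations.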
🧱 StyleGAN Architecture (Simplified)
Input: Random vector z
→ Mapping Network → Style vector w
→ Synthesis Network
→ Starts from a learned constant input (a 4×4 feature map) instead of z
→ Multiple conv blocks at increasing resolution
→ AdaIN modulation after each conv, driven by w
→ Output: High-quality image
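The mapping network and AdaIN steps above can be sketched in PyTorch. AdaIN normalizes each feature map per sample and then rescales it with style-derived values: AdaIN(x, y) = y_s · (x − mean(x)) / std(x) + y_b. The sketch below is simplified and rests on assumptions: the input normalization of z and the per-layer noise inputs are left out, the 64-channel 8×8 feature map is a random stand-in, and the class names are illustrative rather than taken from NVIDIA's code.

```python
import torch
import torch.nn as nn


class MappingNetwork(nn.Module):
    """MLP mapping latent z to the intermediate style vector w (8 layers, 512-d)."""

    def __init__(self, z_dim=512, w_dim=512, num_layers=8):
        super().__init__()
        layers, dim = [], z_dim
        for _ in range(num_layers):
            layers += [nn.Linear(dim, w_dim), nn.LeakyReLU(0.2)]
            dim = w_dim
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        return self.net(z)


class AdaIN(nn.Module):
    """Adaptive Instance Normalization: per-channel scale and bias come from w."""

    def __init__(self, w_dim, channels):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.to_style = nn.Linear(w_dim, channels * 2)  # learned affine map: w -> (y_s, y_b)

    def forward(self, x, w):
        scale, bias = self.to_style(w).chunk(2, dim=1)
        # Reshape (N, C) -> (N, C, 1, 1) so the style broadcasts over H and W.
        return scale[:, :, None, None] * self.norm(x) + bias[:, :, None, None]


torch.manual_seed(0)
mapping = MappingNetwork()
adain = AdaIN(w_dim=512, channels=64)

z = torch.randn(4, 512)        # random latent vectors
x = torch.randn(4, 64, 8, 8)   # stand-in for conv feature maps
w = mapping(z)
styled = adain(x, w)
print(w.shape, styled.shape)   # torch.Size([4, 512]) torch.Size([4, 64, 8, 8])
```

Because the instance normalization removes each feature map's own mean and variance, the statistics of the output are set by `w` alone, which is what gives each layer independent style control.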
🔁 Style Mixing:
StyleGAN applies different styles at different layers,
which enables face blending and fine-grained feature control.
| Layer | Controls |
|---|---|
| Early (coarse) layers, 4×4 to 8×8 | Pose, face shape, layout |
| Mid layers, 16×16 to 32×32 | Facial features |
| Late (fine) layers, 64×64 to 1024×1024 | Skin tone, hair texture, color |
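Style mixing can be sketched as choosing, for each synthesis layer, which of two style vectors to use. The sketch assumes a 1024×1024 generator with 18 style inputs (two per resolution from 4×4 to 1024×1024). The function name `mix_styles` is hypothetical, and the strings "A" and "B" stand in for two style vectors w.

```python
NUM_LAYERS = 18  # 2 style inputs per resolution, 4x4 up to 1024x1024


def mix_styles(w_a, w_b, crossover):
    """Use w_a for layers before `crossover` and w_b from that layer onward."""
    if not 0 <= crossover <= NUM_LAYERS:
        raise ValueError("crossover must be between 0 and NUM_LAYERS")
    return [w_a if layer < crossover else w_b for layer in range(NUM_LAYERS)]


# Coarse styles (4x4 and 8x8, the first 4 layers) from A, everything else from B:
# the result would take pose and layout from A, and features and colors from B.
per_layer = mix_styles("A", "B", crossover=4)
print(per_layer[:6])  # ['A', 'A', 'A', 'A', 'B', 'B']
```

Moving the crossover point later (for example to layer 8) would hand the mid-level facial features to A as well, leaving only fine details such as color to B.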
🧪 Applications:
| Domain | Use |
|---|---|
| 🎭 Face Generation | Hyper-realistic faces |
| 🖼️ Art & Design | Style morphing |
| 🎮 Game Dev | Character creation |
| 🎥 Movie FX | Virtual avatars |
| 🔬 Biology | Synthetic cell generation |
🔍 DCGAN vs StyleGAN
| Feature | DCGAN | StyleGAN |
|---|---|---|
| Year | 2015 | 2018 |
| Architecture | CNN-based | Style-based |
| Output Quality | Good | Ultra-Realistic |
| Control | Limited (latent vector arithmetic only) | High (Style mixing) |
| Latent Vector | Directly used | Transformed via Mapping |
| Applications | Simple image gen | Human faces, art, avatars |
| Training | Relatively stable, lightweight | Complex, resource-heavy |
📝 Practice Questions:
- How do the Generator and Discriminator work in DCGAN?
- How is style control achieved in StyleGAN?
- What are the differences between DCGAN and StyleGAN?
- What is AdaIN, and why is it important?
- What is Progressive Growing?
📌 Summary Table
| Model | Use | Key Feature |
|---|---|---|
| DCGAN | Simple image generation | Convolutional Generator and Discriminator |
| StyleGAN | High-resolution faces, art | Style control, AdaIN, mixing |