DCGAN & StyleGAN

July 12, 2025 by Anand Singh

We now turn to two of the most powerful and popular variants in the world of GANs:
🌀 DCGAN (Deep Convolutional GAN) and 🎨 StyleGAN

Both of these GAN architectures proved to be breakthroughs in image generation.

🔶 1. DCGAN (Deep Convolutional GAN)

📌 Introduction:

DCGAN is a convolution-based GAN architecture proposed in 2015 by Radford, Metz, and Chintala.

🎯 “It was among the first GAN architectures to train stably at scale and generate high-quality images.”

🧠 Key Features:

Feature	Description
📦 Conv Layers	Convolutional layers in both the Generator and the Discriminator
🧹 No Pooling	Strided convolutions (Discriminator) and transposed convolutions (Generator) replace pooling
🔍 BatchNorm	For training stability and convergence
📈 ReLU & LeakyReLU	ReLU in the Generator (Tanh at the output), LeakyReLU in the Discriminator
🔥 Simplicity	Simple architecture with strong results

🧱 DCGAN Architecture:

Generator:

Input: Random noise vector (z)
→ Fully connected layer  
→ Transposed Conv + BatchNorm + ReLU  
→ Transposed Conv + BatchNorm + ReLU  
→ Transposed Conv + Tanh  
→ Output: Fake image
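
The Generator layout above can be sketched in PyTorch roughly as follows. This is a minimal sketch, not the reference implementation: the noise size (100), the channel widths, the 4×4 starting feature map, and the 32×32 single-channel output are all assumptions chosen so that three stride-2 transposed convolutions fit neatly.

```python
import torch
import torch.nn as nn


class Generator(nn.Module):
    """DCGAN-style generator sketch: noise vector z -> 32x32 image in [-1, 1]."""

    def __init__(self, z_dim: int = 100, base_channels: int = 64, img_channels: int = 1):
        super().__init__()
        self.start_channels = base_channels * 4
        # Fully connected layer: project z, later reshaped to a 4x4 feature map.
        self.project = nn.Linear(z_dim, self.start_channels * 4 * 4)
        self.net = nn.Sequential(
            # Transposed Conv + BatchNorm + ReLU: 4x4 -> 8x8
            nn.ConvTranspose2d(self.start_channels, base_channels * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base_channels * 2),
            nn.ReLU(inplace=True),
            # Transposed Conv + BatchNorm + ReLU: 8x8 -> 16x16
            nn.ConvTranspose2d(base_channels * 2, base_channels, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base_channels),
            nn.ReLU(inplace=True),
            # Transposed Conv + Tanh: 16x16 -> 32x32, no BatchNorm on the output
            nn.ConvTranspose2d(base_channels, img_channels, 4, 2, 1),
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        x = self.project(z).view(-1, self.start_channels, 4, 4)
        return self.net(x)


if __name__ == "__main__":
    fake = Generator()(torch.randn(5, 100))
    print(fake.shape)
```

Because the output passes through Tanh, real training images would need to be scaled to the same [-1, 1] range before being shown to the Discriminator.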

Discriminator:

Input: Image (real/fake)
→ Conv + BatchNorm + LeakyReLU  
→ Conv + BatchNorm + LeakyReLU  
→ Flatten  
→ Fully connected layer + Sigmoid  
→ Output: Real/Fake Probability
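
A matching Discriminator can be sketched the same way. The 32×32 single-channel input, the channel widths, and the LeakyReLU slope of 0.2 are assumptions for illustration. Note that the original DCGAN paper leaves BatchNorm off the Discriminator's first layer; this sketch keeps the simplified layout listed above.

```python
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    """DCGAN-style discriminator sketch: image -> probability of being real."""

    def __init__(self, img_channels: int = 1, base_channels: int = 64, img_size: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            # Conv + BatchNorm + LeakyReLU: img_size -> img_size / 2
            nn.Conv2d(img_channels, base_channels, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base_channels),
            nn.LeakyReLU(0.2, inplace=True),
            # Conv + BatchNorm + LeakyReLU: img_size / 2 -> img_size / 4
            nn.Conv2d(base_channels, base_channels * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base_channels * 2),
            nn.LeakyReLU(0.2, inplace=True),
        )
        feat = img_size // 4  # spatial size after two stride-2 convolutions
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(base_channels * 2 * feat * feat, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


if __name__ == "__main__":
    probs = Discriminator()(torch.randn(5, 1, 32, 32))
    print(probs.shape)
```

The strided convolutions do the downsampling that pooling layers would otherwise do, which is the "No Pooling" idea from the feature table.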

🔧 PyTorch Library Support:

nn.ConvTranspose2d   # For upsampling (Generator)
nn.Conv2d            # For downsampling (Discriminator)
nn.BatchNorm2d       # For stability
nn.Tanh              # Generator output activation
nn.LeakyReLU         # Discriminator activations
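
To see how these layers change spatial size, a quick shape check may help. The helper function names below are hypothetical; the formulas are the standard output-size rules for these layers with dilation 1 and no output padding, and the kernel/stride/padding values (4, 2, 1) are a common choice rather than a requirement.

```python
import torch
import torch.nn as nn


def conv_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of nn.Conv2d (dilation=1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of nn.ConvTranspose2d (dilation=1, output_padding=0)."""
    return (size - 1) * stride - 2 * padding + kernel


x = torch.randn(1, 8, 16, 16)  # (batch, channels, height, width)

up = nn.ConvTranspose2d(8, 4, kernel_size=4, stride=2, padding=1)
down = nn.Conv2d(8, 16, kernel_size=4, stride=2, padding=1)

up_shape = tuple(up(x).shape)      # (1, 4, 32, 32): spatial size doubled
down_shape = tuple(down(x).shape)  # (1, 16, 8, 8): spatial size halved
```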

🧪 Applications:

Handwritten digits (MNIST)
Anime faces, bedrooms, shoes
Prototype generation for design

🔷 2. StyleGAN (Style-based GAN)

📌 Introduction:

StyleGAN was introduced by NVIDIA in 2018 (Karras et al.).
At the time of its release it was widely regarded as the most realistic face generator available.

🎯 Websites such as “This Person Does Not Exist” are based on StyleGAN.

🧠 Key Features:

Feature	Description
🎨 Style-based architecture	Transforms the noise vector into style vectors
🧬 Progressive Growing	Training proceeds gradually from low to high resolution
🧠 AdaIN	Adaptive Instance Normalization for style control
🌈 Latent Space Control	Tuning facial features (e.g., smile, age)
📸 High-Res	Photo-quality image generation up to 1024×1024
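
Progressive growing is usually described with a fade-in step: when a new, higher-resolution block is added, its output is blended with an upsampled copy of the previous resolution's output while a weight alpha ramps from 0 to 1. The function below is a minimal sketch of that blend only; the name `fade_in` and the nearest-neighbour upsampling are assumptions.

```python
import torch
import torch.nn.functional as F


def fade_in(alpha: float, low_res_img: torch.Tensor, high_res_img: torch.Tensor) -> torch.Tensor:
    """Blend the old (upsampled) output with the new high-resolution output.

    alpha = 0.0 -> only the old low-resolution path is used
    alpha = 1.0 -> only the new high-resolution block is used
    """
    upsampled = F.interpolate(low_res_img, scale_factor=2, mode="nearest")
    return alpha * high_res_img + (1.0 - alpha) * upsampled


if __name__ == "__main__":
    low = torch.randn(1, 3, 4, 4)   # image from the 4x4 stage
    high = torch.randn(1, 3, 8, 8)  # image from the newly added 8x8 block
    print(fade_in(0.5, low, high).shape)
```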

🧱 StyleGAN Architecture (Simplified)

Input: Random vector z  
→ Mapping Network → Style vector w  
→ Synthesis Network  
    → Starts from a learned constant tensor  
    → Multiple conv blocks  
    → AdaIN modulation  
→ Output: High-quality image
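
The two pieces that distinguish this pipeline, the mapping network and AdaIN, can be sketched as below. This is an illustrative simplification, not NVIDIA's implementation: the layer sizes are assumptions, input normalization of z and per-layer noise injection are omitted, and adding 1 to the predicted scale is an implementation choice assumed here so that the modulation starts close to identity.

```python
import torch
import torch.nn as nn


class MappingNetwork(nn.Module):
    """MLP that maps a random vector z to a style vector w."""

    def __init__(self, z_dim: int = 512, w_dim: int = 512, num_layers: int = 8):
        super().__init__()
        layers, in_dim = [], z_dim
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, w_dim), nn.LeakyReLU(0.2)]
            in_dim = w_dim
        self.net = nn.Sequential(*layers)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)


class AdaIN(nn.Module):
    """Adaptive Instance Normalization: normalize x, then scale/shift from w."""

    def __init__(self, w_dim: int, channels: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels)  # per-sample, per-channel normalization
        self.to_scale = nn.Linear(w_dim, channels)
        self.to_bias = nn.Linear(w_dim, channels)

    def forward(self, x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        scale = self.to_scale(w)[:, :, None, None]  # (batch, channels, 1, 1)
        bias = self.to_bias(w)[:, :, None, None]
        return (1.0 + scale) * self.norm(x) + bias


if __name__ == "__main__":
    mapping = MappingNetwork(z_dim=64, w_dim=64)
    adain = AdaIN(w_dim=64, channels=32)
    const = nn.Parameter(torch.ones(1, 32, 4, 4))  # learned constant input
    w = mapping(torch.randn(2, 64))
    print(adain(const.expand(2, -1, -1, -1), w).shape)
```

Because the style enters only through the scale and bias of each normalized feature map, changing w changes the "style" of the features without the synthesis network ever seeing z directly.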

🔁 Style Mixing:

StyleGAN applies different styles at different layers,
which lets it blend faces and control individual features.

Layer	Controls
Early layers	Pose, layout
Mid layers	Facial features
Late layers	Skin tone, hair texture, color
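
Style mixing can be sketched as choosing, per layer, which of two style vectors to feed in. The helper `mix_styles` and its `crossover` parameter are hypothetical names; the 18 layers in the usage example assume a 1024×1024 generator with two style inputs per resolution.

```python
import torch


def mix_styles(w_a: torch.Tensor, w_b: torch.Tensor, num_layers: int, crossover: int) -> torch.Tensor:
    """Build per-layer styles: layers [0, crossover) use w_a, the rest use w_b.

    w_a, w_b: (batch, w_dim) style vectors from the mapping network.
    Returns:  (batch, num_layers, w_dim) tensor, one style per synthesis layer.
    """
    ws_a = w_a.unsqueeze(1).repeat(1, num_layers, 1)
    ws_b = w_b.unsqueeze(1).repeat(1, num_layers, 1)
    layer_idx = torch.arange(num_layers).view(1, -1, 1)
    return torch.where(layer_idx < crossover, ws_a, ws_b)


if __name__ == "__main__":
    w_a, w_b = torch.randn(2, 512), torch.randn(2, 512)
    # Early layers (pose, layout) from A; mid and late layers from B.
    mixed = mix_styles(w_a, w_b, num_layers=18, crossover=4)
    print(mixed.shape)
```

Moving the crossover point later hands more of the image (facial features, then texture and color) to the first source.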

🧪 Applications:

Domain	Use
🎭 Face Generation	Hyper-realistic faces
🖼️ Art & Design	Style morphing
🎮 Game Dev	Character creation
🎥 Movie FX	Virtual avatars
🔬 Biology	Synthetic cell generation

🔍 DCGAN vs StyleGAN

Feature	DCGAN	StyleGAN
Year	2015	2018
Architecture	Plain CNN	Style-based CNN (mapping + synthesis networks)
Output Quality	Good	Ultra-Realistic
Control	Limited (latent vector arithmetic)	High (Style mixing)
Latent Vector	Directly used	Transformed via Mapping
Applications	Simple image gen	Human faces, art, avatars
Training	Stable	Complex, resource-heavy

📝 Practice Questions:

How do the Generator and Discriminator work in DCGAN?
How is style control achieved in StyleGAN?
What are the differences between DCGAN and StyleGAN?
What is AdaIN, and why is it needed?
What is Progressive Growing?

📌 Summary Table

Model	Use	Key Feature
DCGAN	Simple image generation	CNN-based Generator
StyleGAN	High-resolution faces, art	Style control, AdaIN, mixing