
Applications — Denoising & Dimensionality Reduction

July 11, 2025 by Anand Singh

Now let's look in detail at two practical applications of autoencoders:
🔹 Denoising
🔹 Dimensionality Reduction

Both are very useful in real-world problems and show the power of deep learning well.

🔶 1. Application 1: Denoising Autoencoder

❓ What is it?

A Denoising Autoencoder (DAE) is an autoencoder that learns to turn a noisy input into a clean output.

🎯 “The input is deliberately corrupted, and the model is trained to reconstruct the clean version.”

📦 Working:

   Original Image (x)
         ↓ Add Noise
     Noisy Input (x̃)
         ↓
     [Encoder + Decoder]
         ↓
   Clean Output (x̂)

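Model learns to minimize the reconstruction error between the clean original and the output, for example the squared error:

$$L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2, \quad \hat{x} = \text{Decoder}(\text{Encoder}(\tilde{x}))$$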

🔧 Example in PyTorch:

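import torch

def add_noise(imgs, noise_factor=0.3):
    # Add Gaussian noise, then clamp so pixel values stay in [0, 1]
    noisy_imgs = imgs + noise_factor * torch.randn_like(imgs)
    return torch.clip(noisy_imgs, 0., 1.)

You then train the autoencoder on (noisy_img, original_img) pairs: the noisy image is the input and the original image is the target.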
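The training procedure described above can be sketched roughly as follows. This is a minimal illustration, not a tuned recipe: the small fully connected model, the random placeholder batch `clean`, and the hyperparameters are all assumptions made for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def add_noise(imgs, noise_factor=0.3):
    """Corrupt images with Gaussian noise and keep pixels in [0, 1]."""
    noisy = imgs + noise_factor * torch.randn_like(imgs)
    return torch.clamp(noisy, 0.0, 1.0)

# Hypothetical small autoencoder for flattened 28x28 images.
model = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),     # encoder
    nn.Linear(64, 784), nn.Sigmoid(),  # decoder
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# Placeholder data: random "images" stand in for a real dataset.
clean = torch.rand(32, 784)

for step in range(5):
    noisy = add_noise(clean)        # corrupt the input
    recon = model(noisy)            # reconstruct from the noisy version
    loss = criterion(recon, clean)  # compare against the CLEAN target
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key design point is that the loss compares the output with the clean image, not with the noisy input the model actually saw.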

🧠 Use Cases:

Use	Description
🖼️ Image Denoising	Remove noise from pictures
📄 Document Cleanup	Clean scanned papers
📢 Audio Denoising	Remove background noise
🧠 Medical	Remove sensor noise in ECG, MRI, etc.

🔶 2. Application 2: Dimensionality Reduction

❓ What is it?

Autoencoder compresses high-dimensional data into a low-dimensional latent representation, similar to PCA (Principal Component Analysis) — but with non-linear capabilities.

🎯 “Autoencoder = Non-linear, trainable PCA”

📦 Example:

🔧 PyTorch Sketch:

# Encoder output is just 2D
self.encoder = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 2)  # 2D Latent Space
)
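For context, the encoder fragment above could sit inside a complete model along these lines. The class name `Autoencoder2D`, the mirrored decoder, and the random input batch are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class Autoencoder2D(nn.Module):
    """Autoencoder with a 2D bottleneck (hypothetical example class)."""

    def __init__(self, input_dim=784, hidden_dim=128, latent_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),  # 2D latent space
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder2D()
x = torch.rand(100, 784)  # placeholder batch of flattened images

with torch.no_grad():
    codes = model.encoder(x)  # one 2D point per input
    recon = model(x)          # reconstruction from the 2D code
```

After training, each row of `codes` can be plotted directly as an (x, y) point.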

🧠 Use Cases:

Use	Description
📊 Data Visualization	Compress to 2D for scatter plots (similar in use to t-SNE)
🔍 Clustering	Group similar inputs (e.g., digits, faces)
⚡ Fast Inference	Work on lower-dimensional features
📈 Feature Extraction	Use compressed codes for ML models
🎮 Game AI	Compress game states

📝 Practice Questions:

What is a Denoising Autoencoder and how does it work?
How is an autoencoder trained to remove noise?
What is the difference between an autoencoder and PCA for dimensionality reduction?
What is the role of the latent space?
In which real-world problems is a low-dimensional representation useful?

📌 Summary

Application	Input	Output	Benefit
Denoising	Noisy image	Clean image	Noise removal
Dimensionality Reduction	High-dim data	Low-dim features	Visualization, clustering, compression

Variational Autoencoders (VAE)

July 11, 2025 by Anand Singh

Now let's look at a powerful, probabilistic form of autoencoders:
🔮 Variational Autoencoders (VAE)
which are regarded as a foundation in the world of deep generative models.

🔶 1. What is a Variational Autoencoder?

A VAE is a kind of autoencoder that not only compresses the input
but also encodes it as a probability distribution.

🎯 “A VAE compresses the input into a distribution, which lets us generate new data.”

🧠 2. Traditional Autoencoder vs VAE

Feature	Autoencoder	Variational Autoencoder
Output	Reconstruct input	Reconstruct + Sample new
Latent Vector	Fixed values	Probability distribution
Generation	No	Yes (Generative model)
Learning	Deterministic	Probabilistic
Regularization	None	KL Divergence

🔬 3. VAE Structure

Diagram:

        Input x
           ↓
       [Encoder Network]
           ↓
  Latent Distribution (μ, σ)
           ↓
     Sampling (z ~ N(μ, σ²))
           ↓
       [Decoder Network]
           ↓
     Reconstructed x̂

🧮 4. Latent Distribution

VAE encoder predicts:

Mean vector μ
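Log-variance $\log \sigma^2$

From these, we sample the latent variable:

$$z = \mu + \sigma \odot \epsilon, \quad \epsilon \sim \mathcal{N}(0, I), \quad \sigma = \exp\left(\tfrac{1}{2} \log \sigma^2\right)$$

👉 This is called the reparameterization trick:
the randomness is moved into ε, so gradients can be backpropagated through μ and σ.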
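A small sketch of why this formulation helps: when z is written as a deterministic function of μ, log σ², and an independent noise sample ε, autograd can compute gradients with respect to both encoder outputs. The tensor sizes and the use of `z.sum()` as a stand-in loss are assumptions for the demonstration.

```python
import torch

torch.manual_seed(0)

# Pretend these came out of the encoder for one input.
mu = torch.zeros(3, requires_grad=True)
log_var = torch.zeros(3, requires_grad=True)

# Noise is sampled independently of the parameters.
eps = torch.randn(3)

# Reparameterized sample: z = mu + sigma * eps
std = torch.exp(0.5 * log_var)
z = mu + std * eps

# Stand-in loss; any differentiable function of z would do.
z.sum().backward()

# dz/dmu = 1 and dz/dlog_var = 0.5 * std * eps
grad_mu = mu.grad
grad_log_var = log_var.grad
```

Sampling z directly from N(μ, σ²) without this rewrite would leave no differentiable path from the loss back to μ and log σ².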

🧮 5. VAE Loss Function

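The total VAE loss is made up of two parts:

$$L = \text{Reconstruction Loss} + \text{KL Divergence Loss}$$

✅ Reconstruction Loss: measures how well the decoder reproduces the input from z. A common choice is binary cross-entropy or mean squared error between x and x̂:

$$L_{\text{recon}} = -\mathbb{E}_{q(z \mid x)}\left[\log p(x \mid z)\right]$$

✅ KL Divergence Loss: pulls the encoder's distribution toward the prior, here assumed to be the standard normal N(0, I). For a diagonal Gaussian encoder it has the closed form:

$$L_{\text{KL}} = -\frac{1}{2} \sum_{j} \left(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\right)$$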
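One common way to write the two-part VAE loss in PyTorch is sketched below. It assumes inputs scaled to [0, 1] (so binary cross-entropy is a reasonable reconstruction term) and a standard normal prior. The function name `vae_loss` and the dummy tensors are illustrative, not taken from the original post.

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, log_var):
    """Return (total, reconstruction, kl) for a batch."""
    # Reconstruction term: BCE summed over all pixels in the batch.
    recon = F.binary_cross_entropy(recon_x, x, reduction="sum")
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over the batch.
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl, recon, kl

torch.manual_seed(0)
x = torch.rand(4, 784)                         # placeholder inputs in [0, 1]
recon_x = torch.sigmoid(torch.randn(4, 784))   # placeholder decoder outputs
mu = torch.zeros(4, 16)
log_var = torch.zeros(4, 16)

total, recon, kl = vae_loss(recon_x, x, mu, log_var)
```

With μ = 0 and log σ² = 0 the encoder distribution equals the prior, so the KL term is zero and the total reduces to the reconstruction term.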

🔧 6. PyTorch Sketch

import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super(VAE, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.log_var = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
            nn.Sigmoid()
        )

    def encode(self, x):
        h = F.relu(self.fc1(x))
        return self.mu(h), self.log_var(h)

    def reparameterize(self, mu, log_var):
        std = torch.exp(0.5 * log_var)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        mu, log_var = self.encode(x)
        z = self.reparameterize(mu, log_var)
        return self.decoder(z), mu, log_var

📊 7. Applications of VAE

Application	Description
✅ Image Generation	Create new samples
✅ Data Imputation	Fill in missing values
✅ Representation Learning	Compressed features
✅ Anomaly Detection	Catch rare patterns
✅ Drug Design	Molecule generation

🎨 8. Generated Sample Example (MNIST)

Train VAE on MNIST, then sample new digits by:

z = torch.randn(1, latent_dim)
generated = model.decoder(z)

👉 The output will look like a real handwritten digit, despite not being in the original dataset.

📝 Practice Questions:

What is a VAE and how does it differ from an autoencoder?
Why is reparameterization needed for the latent vector?
What parts is the VAE loss function split into?
What is the purpose of the KL divergence?
How is synthetic data generated from a VAE?

🎯 Summary

Feature	Description
VAE	Probabilistic autoencoder
Encoder Output	μ and σ (mean and std; σ is usually predicted as log-variance)
Sampling	z = μ + σ * ε
Loss	Reconstruction + KL divergence
Use Cases	Generation, anomaly detection, compression

Encoder-Decoder Structure

July 11, 2025 by Anand Singh

Now let's look in detail at one of the most powerful and widely used structures in deep learning:
🔁 Encoder-Decoder Structure
which is used extensively in NLP, image captioning, machine translation, autoencoders, and more.

🔶 1. What is the Encoder-Decoder Architecture?

Encoder-Decoder is a framework in which the model is split into two main parts:

Encoder: Converts the input data into a compact and meaningful representation (a context vector or latent vector).
Decoder: Generates a new output sequence or data from that compact representation.

🎯 “The encoder compresses, the decoder expands.”

🧱 2. Structural Flow

 Input Sequence / Data
                  ↓
              [Encoder]
                  ↓
           Latent Representation
                  ↓
              [Decoder]
                  ↓
            Output Sequence / Data

🔄 3. Encoder-Decoder is a General Pattern

Use Case	Input	Output	Encoder-Decoder
Translation	English sentence	French sentence	✅
Image Captioning	Image features	Text sentence	✅
Autoencoder	Image	Reconstructed image	✅
Chatbot	User query	Response	✅
Speech-to-text	Audio	Text	✅

🔧 4. Components of Encoder-Decoder

🔹 Encoder:

Sequence of layers (CNNs, RNNs, Transformers, etc.)
Learns to encode features from input
Outputs context/latent vector: h=f(x)

🔹 Decoder:

Takes the latent vector as input
Generates output step-by-step (esp. in sequence models)
Uses:

🧠 5. Why Use Encoder-Decoder?

Advantage	Description
✅ Generalizable	Works for images, text, audio
✅ Flexible	Input/output length may differ
✅ Modular	Encoder & Decoder can be designed separately
✅ Reusability	Encoder can be shared across tasks

🧪 6. Variants of Encoder-Decoder

Type	Example	Domain
CNN-CNN	Autoencoders	Vision
CNN-RNN	Image Captioning	Vision + NLP
RNN-RNN	Machine Translation	NLP
Transformer-Transformer	T5, BART	NLP
ViT-GPT	BLIP, Flamingo	Vision+Language

🔧 PyTorch Skeleton Example

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.linear = nn.Linear(input_dim, hidden_dim)

    def forward(self, x):
        return self.linear(x)

class Decoder(nn.Module):
    def __init__(self, hidden_dim, output_dim):
        super().__init__()
        self.linear = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        return self.linear(x)

# Sample use
encoder = Encoder(784, 128)
decoder = Decoder(128, 784)

x = torch.randn(1, 784)
latent = encoder(x)
output = decoder(latent)
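For sequence tasks, where the output is generated step by step and its length may differ from the input, the same pattern might look like the sketch below. It assumes GRU layers, greedy decoding, and a start-of-sequence token id of 0. The class name `Seq2Seq` and all sizes are hypothetical.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal RNN encoder-decoder (illustrative only)."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src, max_len, sos_id=0):
        # Encode the whole source; the final hidden state is the context vector.
        _, h = self.encoder(self.src_emb(src))
        # Start every sequence in the batch with the start-of-sequence token.
        token = torch.full((src.size(0), 1), sos_id, dtype=torch.long)
        logits = []
        for _ in range(max_len):
            dec_out, h = self.decoder(self.tgt_emb(token), h)
            step_logits = self.out(dec_out)     # (batch, 1, tgt_vocab)
            logits.append(step_logits)
            token = step_logits.argmax(dim=-1)  # greedy: feed prediction back in
        return torch.cat(logits, dim=1)         # (batch, max_len, tgt_vocab)

model = Seq2Seq(src_vocab=100, tgt_vocab=120)
src = torch.randint(0, 100, (2, 7))  # batch of 2 source sequences, length 7
out = model(src, max_len=5)          # 5 output steps, independent of input length
```

Here the source has 7 tokens and the output has 5 steps, which illustrates that input and output lengths are decoupled by the context vector.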

📝 Practice Questions:

What is the Encoder-Decoder structure?
In which areas is it used?
What is the difference between the roles of the encoder and the decoder?
How does this structure apply to autoencoders and sequence-to-sequence models?
What is the latent representation in an Encoder-Decoder?

📌 Summary

Component	Function
Encoder	Converts the input into a compact form
Latent Vector	The encoded meaning of the input
Decoder	Generates the output from the latent vector
Uses	Translation, Captioning, Chatbots, etc.

What is an Autoencoder?

July 11, 2025 by Anand Singh

Now let's look at another important and interesting deep learning technique:
📦 Autoencoder
which is used in representation learning, dimensionality reduction, and unsupervised learning.

🔶 1. Definition (परिभाषा):

An autoencoder is a neural network trained to reproduce its input as its output; in the process it learns compressed, meaningful features of the data.

🎯 “An autoencoder learns features by itself, without any labels.”

🧠 2. Structure of Autoencoder

An autoencoder has three main parts:

Encoder: Compresses the input (into the latent space)
Latent Space: The compressed representation of the data
Decoder: Reconstructs the data from the compressed representation

Diagram:

      Input
        ↓
     [Encoder]
        ↓
  Compressed Code (Latent Vector)
        ↓
     [Decoder]
        ↓
     Reconstructed Output

🧮 3. Objective (Loss Function)

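The training objective of an autoencoder is to minimize the reconstruction loss:

$$L(x, \hat{x}) = \lVert x - \hat{x} \rVert^2, \quad \hat{x} = \text{Decoder}(\text{Encoder}(x))$$

where:

x: Original input
x̂: Reconstructed output
L: Mean Squared Error (or any other loss)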
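As a rough illustration of this objective, the sketch below computes the mean squared error between a batch and its reconstruction and takes one optimizer step. The tiny model, the random placeholder batch, and the learning rate are assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical minimal autoencoder: 784 -> 32 -> 784.
autoencoder = nn.Sequential(
    nn.Linear(784, 32), nn.ReLU(),     # encoder
    nn.Linear(32, 784), nn.Sigmoid(),  # decoder
)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

x = torch.rand(16, 784)      # placeholder batch
x_hat = autoencoder(x)       # reconstruction

# The input serves as its own target, so no labels are needed.
loss = F.mse_loss(x_hat, x)

# Same quantity written out by hand: mean of squared differences.
manual = ((x_hat - x) ** 2).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Note that `F.mse_loss` averages over all elements by default, so it is the squared error above scaled by a constant.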

🧱 4. Types of Autoencoders

Type	Description
🔹 Vanilla Autoencoder	Basic encoder-decoder structure
🔹 Sparse Autoencoder	Regularization to learn sparse codes
🔹 Denoising Autoencoder	Noisy input → Clean reconstruction
🔹 Variational Autoencoder (VAE)	Probabilistic latent space
🔹 Convolutional Autoencoder	CNN-based encoder-decoder for images

🔧 5. PyTorch Example (Basic Autoencoder)

import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128),
            nn.ReLU(),
            nn.Linear(128, 32),
        )
        self.decoder = nn.Sequential(
            nn.Linear(32, 128),
            nn.ReLU(),
            nn.Linear(128, 784),
            nn.Sigmoid()
        )
        
    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

📊 6. Applications of Autoencoders

Application	Description
✅ Dimensionality Reduction	Like PCA, but can capture non-linear structure
✅ Image Denoising	Remove noise
✅ Anomaly Detection	Catch outlier data
✅ Data Compression	Latent representation
✅ Image Colorization	Black & white → Color
✅ Feature Extraction	For downstream tasks

🔄 7. Difference from PCA

PCA	Autoencoder
Linear	Non-linear
No iterative training (closed-form solution)	Trained via gradient descent
Limited capacity	Learn complex structure
Fixed basis	Flexible learned features
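To make the PCA side of the comparison concrete, here is a minimal sketch of PCA as a linear "encoder" and "decoder" computed in closed form with an SVD, with no gradient descent involved. The helper names `pca_fit`, `pca_encode`, and `pca_decode` and the random data are assumptions for illustration.

```python
import torch

torch.manual_seed(0)
X = torch.randn(200, 10)  # placeholder data: 200 samples, 10 features

def pca_fit(X, k):
    """Return the data mean and the top-k principal directions."""
    mean = X.mean(dim=0, keepdim=True)
    # Rows of Vh are principal directions, ordered by singular value.
    _, _, Vh = torch.linalg.svd(X - mean, full_matrices=False)
    return mean, Vh[:k]

def pca_encode(X, mean, components):
    # Linear projection onto the principal directions.
    return (X - mean) @ components.T

def pca_decode(Z, mean, components):
    # Linear reconstruction from the low-dimensional code.
    return Z @ components + mean

mean, comps = pca_fit(X, k=2)
Z = pca_encode(X, mean, comps)      # low-dimensional codes
X_hat = pca_decode(Z, mean, comps)  # linear reconstruction
```

An autoencoder replaces these two fixed linear maps with trained, possibly non-linear networks.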

📝 Practice Questions:

What is an autoencoder?
What is the importance of the latent space in an autoencoder?
What is the difference between a vanilla and a denoising autoencoder?
What is the loss function of an autoencoder?
Where can autoencoders be used?

🎯 Summary

Feature	Description
Input = Output	Learns to reconstruct data
Unsupervised	No labels needed
Encoder + Decoder	Compresses and reconstructs
Applications	Compression, denoising, anomaly detection

Pretrained Models (VGG, ResNet, Inception, BERT)

July 11, 2025 by Anand Singh

Now let's talk about pretrained models in deep learning, which are the backbone of transfer learning.
These models have already been trained on very large datasets and can be reused for a variety of tasks.

🔶 1. What are Pretrained Models?

Pretrained models are deep learning architectures that have already been trained on a large dataset (such as ImageNet or Wikipedia).
By reusing them you can:

Do feature extraction
Do fine-tuning
Even perform zero-shot tasks (with some models)
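The difference between feature extraction and fine-tuning can be sketched as follows. To keep the example self-contained, a small `nn.Sequential` stands in for a real pretrained backbone; that stand-in, the layer sizes, and the dummy data are assumptions. With a torchvision ResNet the same pattern would apply to the model's final `fc` layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for a pretrained backbone (hypothetical; normally loaded with weights).
backbone = nn.Sequential(nn.Linear(512, 256), nn.ReLU())
head = nn.Linear(256, 10)  # new task-specific classifier

# Feature extraction: freeze the backbone, train only the new head.
for p in backbone.parameters():
    p.requires_grad = False

model = nn.Sequential(backbone, head)
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)

x = torch.randn(8, 512)         # placeholder inputs
y = torch.randint(0, 10, (8,))  # placeholder labels
loss = F.cross_entropy(model(x), y)
loss.backward()                 # gradients reach only the head
optimizer.step()

# Fine-tuning: unfreeze the backbone so later steps update it as well.
for p in backbone.parameters():
    p.requires_grad = True
```

In practice fine-tuning usually uses a smaller learning rate for the backbone than for the new head, so the pretrained features are not destroyed early in training.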

🎯 Why do they matter?

✅ Save time and computation
✅ Better performance, especially on small datasets
✅ Standardize common architectures
✅ Serve as the basis for foundation models

🔷 A. Pretrained Models in Computer Vision

1. VGGNet

🧠 Developed by: Visual Geometry Group, Oxford
📆 Year: 2014
📐 Architecture: Simple CNNs with 3×3 convolutions
🧱 Versions: VGG-16, VGG-19
⚠️ Downside: Large number of parameters, slow

from torchvision import models
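# torchvision >= 0.13 uses the `weights` argument; older versions used pretrained=True
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)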

2. ResNet (Residual Network)

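# torchvision >= 0.13 uses the `weights` argument; older versions used pretrained=True
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)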

3. Inception (GoogLeNet)

🧠 By: Google
📆 Year: 2014
🔄 Inception Module: Multiple filter sizes in parallel
🧠 Deep but Efficient
📊 Versions: Inception-v1, v2, v3, v4

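# torchvision >= 0.13 uses the `weights` argument; older versions used pretrained=True
inception = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)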

🔷 B. Pretrained Models in Natural Language Processing (NLP)

4. BERT (Bidirectional Encoder Representations from Transformers)

🧠 By: Google AI
📆 Year: 2018
🔍 Key Idea: Bidirectional context + Masked Language Modeling
🌍 Trained On: Wikipedia + BookCorpus
✅ Used for: Text classification, Q&A, NER, etc.
🔁 Fine-tuned for specific downstream tasks

from transformers import BertModel, BertTokenizer

model = BertModel.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

📊 Comparison Table

Model	Domain	Strengths	Weakness
VGG	Vision	Simplicity	Too many parameters
ResNet	Vision	Deep + residual connections	Slightly complex
Inception	Vision	Multi-scale processing	Harder to modify
BERT	NLP	Powerful language understanding	Large memory usage

🧠 Use Cases of Pretrained Models

Task	Model
Image Classification	ResNet, VGG
Object Detection	Faster R-CNN with ResNet
Semantic Segmentation	DeepLab, U-Net
Sentiment Analysis	BERT
Machine Translation	mBART, T5
Question Answering	BERT, RoBERTa

📝 Practice Questions

What is a pretrained model?
What is the difference between VGG and ResNet?
What is the purpose of the Inception module?
How does BERT understand context?
Which pretrained models are common in vision and NLP?

🧠 Summary

Feature	Vision	NLP
Basic CNN	VGG	–
Deep Network	ResNet	BERT
Advanced Structure	Inception	Transformer variants
Library	`torchvision`	`transformers` (HuggingFace)