Deep Neural Networks (DNN)

🔶 1. What is a Deep Neural Network?

📌 Definition:

A Deep Neural Network (DNN) is an artificial neural network that has more than one hidden layer.

👉 This sets it apart from a shallow network (such as a simple MLP with 1 hidden layer): a DNN has “depth”, meaning several layers that progressively abstract the data as it moves from input to output.

🧠 Structure of a DNN:

Input Layer → Hidden Layer 1 → Hidden Layer 2 → ... → Hidden Layer N → Output Layer

Each layer is a group of neurons.
Each neuron applies:

z = w⋅x + b, a = f(z)

where w is the weight vector, x the input, b the bias, and f an activation function.
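
To make the neuron formula concrete, here is a minimal sketch in plain Python. It assumes ReLU as the activation f; the function names and the numeric values are made up for illustration only.

```python
# Minimal sketch of one neuron: z = w·x + b, a = f(z).
# The weights, bias, and inputs below are made-up illustrative values.

def relu(z):
    """ReLU activation: keeps positive values, clamps negatives to 0."""
    return max(0.0, z)

def neuron(x, w, b, f=relu):
    """Compute a = f(w·x + b) for a single neuron."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear part
    return f(z)                                   # non-linear part

x = [1.0, 2.0, 3.0]      # input vector
w = [0.5, -0.25, 0.25]   # one weight per input
b = 0.5                  # bias

a = neuron(x, w, b)
print(a)  # 1.25
```

A full layer simply runs many such neurons on the same input, each with its own weights and bias.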

📊 Example:

Consider a DNN with:

Input Layer: 784 nodes (28×28 image pixels)
Hidden Layer 1: 512 neurons
Hidden Layer 2: 256 neurons
Output Layer: 10 neurons (digits 0–9 classification)
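
A quick way to get a feel for the size of this example network is to count its learnable parameters. The sketch below assumes every layer is fully connected and has a bias term; `count_parameters` is a hypothetical helper written for this illustration.

```python
# Layer sizes taken from the example: 784 -> 512 -> 256 -> 10.
layer_sizes = [784, 512, 256, 10]

def count_parameters(sizes):
    """Sum weights (n_in * n_out) and biases (n_out) over all dense layers."""
    total = 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        total += n_in * n_out + n_out
    return total

total_params = count_parameters(layer_sizes)
print(total_params)  # 535818
```

Most of the parameters sit in the first layer (784 × 512 + 512 = 401,920), because that is where the input is widest.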

🔷 2. Why Use Deep Networks?

❓ Why are shallow networks not enough?

Shallow networks work well for simple problems.
But in complex tasks (such as image recognition, NLP, audio classification) the input-output relationship is highly nonlinear.

✅ Deep networks:

Can extract high-level features automatically
Capture abstractions as a hierarchy

🧠 Hierarchical Feature Learning:

Layer	Learns
Layer 1	Edges, curves
Layer 2	Shapes, textures
Layer 3	Objects, faces

🔶 What is the Architecture of a DNN?

Architecture refers to how many layers the DNN has, how many neurons each layer has, which activation functions are used, and how data flows from input to output.

📊 High-Level Structure:

Input Layer → Hidden Layer 1 → Hidden Layer 2 → ... → Output Layer

Each layer does two things:

Linear Transformation z=W⋅x+b
Activation Function a=f(z)

🔷 2. Components of a DNN Architecture

Component	Description
Input Layer	Raw input data (e.g., image pixels, features)
Hidden Layers	Intermediate processing layers (more = more depth)
Output Layer	Final predictions (e.g., class scores)
Weights & Biases	Parameters learned during training
Activation Functions	Add non-linearity (ReLU, Sigmoid, etc.)
Loss Function	Measures prediction error
Optimizer	Updates weights using gradients (SGD, Adam)

🧠 Typical Architecture Example (MNIST Digits):

Layer Type	Shape	Notes
Input	(784,)	28×28 image flattened
Dense 1	(784 → 512)	Hidden Layer 1 + ReLU
Dense 2	(512 → 256)	Hidden Layer 2 + ReLU
Output	(256 → 10)	Digit prediction + Softmax

🧮 3. Mathematical View
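
One way to write the forward pass of the MNIST example, using the per-layer rule z = W⋅x + b, a = f(z) from above. The superscript layer index (l), the name a^(0) for the input, and ŷ for the output are notational assumptions of this sketch; f is taken to be ReLU in the hidden layers, as in the architecture table.

```latex
\begin{aligned}
a^{(0)} &= x \in \mathbb{R}^{784} \\
z^{(l)} &= W^{(l)} a^{(l-1)} + b^{(l)}, \qquad a^{(l)} = f\big(z^{(l)}\big), \qquad l = 1, 2 \\
z^{(3)} &= W^{(3)} a^{(2)} + b^{(3)}, \qquad \hat{y} = \operatorname{softmax}\big(z^{(3)}\big) \in \mathbb{R}^{10} \\
W^{(1)} &\in \mathbb{R}^{512 \times 784}, \quad W^{(2)} \in \mathbb{R}^{256 \times 512}, \quad W^{(3)} \in \mathbb{R}^{10 \times 256}
\end{aligned}
```

Each bias b^(l) has as many entries as its layer has neurons (512, 256, and 10).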

🔧 4. PyTorch Code: Custom DNN Architecture

import torch.nn as nn

class DNN(nn.Module):
    """Fully connected network for flattened 28×28 inputs: 784 → 512 → 256 → 10."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(784, 512),     # Input to Hidden 1
            nn.ReLU(),
            nn.Linear(512, 256),     # Hidden 1 to Hidden 2
            nn.ReLU(),
            nn.Linear(256, 10)       # Output Layer (raw class scores, no Softmax here)
        )

    def forward(self, x):
        # x: tensor of shape (batch_size, 784)
        return self.net(x)
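
A short usage sketch for a network like the DNN class above. The layer stack is rebuilt with `nn.Sequential` so the snippet runs on its own, and the input batch is random data standing in for real images. The architecture table lists Softmax on the output layer, while the class returns raw scores; in PyTorch this is common because `nn.CrossEntropyLoss` applies log-softmax internally, so Softmax is applied explicitly only when probabilities are needed.

```python
import torch
import torch.nn as nn

# Same layer stack as the DNN class, rebuilt here so the snippet is self-contained.
model = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

# A made-up batch of 4 flattened 28x28 "images" (random values, not real MNIST data).
x = torch.randn(4, 784)

logits = model(x)                      # raw class scores, shape (4, 10)
probs = torch.softmax(logits, dim=1)   # probabilities, each row sums to 1
predicted = probs.argmax(dim=1)        # index of the most likely class per sample

print(logits.shape)
print(probs.sum(dim=1))
print(predicted)
```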

📈 Visualization of Architecture

[Input Layer: 784]
         ↓
[Dense Layer: 512 + ReLU]
         ↓
[Dense Layer: 256 + ReLU]
         ↓
[Output Layer: 10 (classes)]

🔍 Key Architecture Design Questions

How many hidden layers should there be?
How many neurons in each layer?
Which activation function should be chosen?
Are dropout and batch norm needed?
Which loss function is used?
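
These design choices can be exposed as arguments of a small builder function, which makes it easy to try different depths and widths. The sketch below is one possible arrangement, not a recommended recipe: `build_mlp` and its parameters are hypothetical, and the ordering Linear → BatchNorm → ReLU → Dropout is just one common choice.

```python
import torch
import torch.nn as nn

def build_mlp(sizes, dropout=0.0, batch_norm=False):
    """Build a fully connected network from a list of layer sizes.

    Hypothetical helper for illustration: sizes = [input, hidden..., output].
    """
    layers = []
    # Hidden layers: Linear -> (BatchNorm) -> ReLU -> (Dropout)
    for n_in, n_out in zip(sizes[:-2], sizes[1:-1]):
        layers.append(nn.Linear(n_in, n_out))
        if batch_norm:
            layers.append(nn.BatchNorm1d(n_out))
        layers.append(nn.ReLU())
        if dropout > 0:
            layers.append(nn.Dropout(p=dropout))
    # Output layer: raw class scores, no activation
    layers.append(nn.Linear(sizes[-2], sizes[-1]))
    return nn.Sequential(*layers)

model = build_mlp([784, 512, 256, 10], dropout=0.5, batch_norm=True)
model.eval()  # inference mode: dropout off, BatchNorm uses running statistics
out = model(torch.randn(8, 784))
print(out.shape)
```

Changing the `sizes` list changes depth (number of entries) and width (value of each entry) without touching the rest of the code.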

🎯 Summary:

Element	Role
Layers	Input → Hidden(s) → Output
Activation	Introduces non-linearity
Depth	Number of layers
Width	Neurons per layer
Optimizer	Updates weights using gradients

📝 Practice Questions:

What are the parts of a DNN architecture?
How many hidden layers should there be, and what difference does it make?
What is the role of the activation function in the architecture?
How is overfitting prevented in a DNN architecture?
How is architecture tuning done?

🔶 Training a DNN

💡 Standard Process:

Forward Pass: Generate predictions
Loss Calculation: Compare predictions with ground truth
Backward Pass: Compute gradients
Optimizer Step: Update weights
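
The four steps map onto a PyTorch training loop roughly as follows. This is a minimal sketch under several assumptions: the data is a single random mini-batch rather than a real dataset, and the choice of SGD, the learning rate, and the number of steps are arbitrary illustrative values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random data and initial weights repeatable

model = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
loss_fn = nn.CrossEntropyLoss()   # expects raw logits and integer class labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Synthetic stand-in for one mini-batch (random inputs and labels, not real data).
x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))

losses = []
for step in range(50):
    logits = model(x)            # 1. forward pass
    loss = loss_fn(logits, y)    # 2. loss calculation
    optimizer.zero_grad()        # clear gradients left over from the previous step
    loss.backward()              # 3. backward pass
    optimizer.step()             # 4. optimizer step
    losses.append(loss.item())

print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

In real training the loop would iterate over many mini-batches from a data loader, and the loss would be tracked on held-out data as well.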

🚧 Challenges in Training Deep Networks:

Challenge	Solution
Vanishing Gradients	ReLU, BatchNorm, Residual connections
Overfitting	Dropout, Data Augmentation
Computational Cost	GPU acceleration, Mini-batch training
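
The vanishing-gradient row can be illustrated with a back-of-the-envelope calculation. During backpropagation the gradient is multiplied by one activation derivative per layer; the sketch below looks only at that factor and deliberately ignores the weight matrices, so it is a simplified picture rather than a full analysis.

```python
import math

def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum value is 0.25 (at z = 0)."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

depth = 10  # number of layers the gradient passes through (illustrative value)

# Best case for sigmoid: every pre-activation sits exactly at z = 0.
sigmoid_factor = sigmoid_grad(0.0) ** depth
# ReLU with positive pre-activations passes the gradient through unchanged.
relu_factor = relu_grad(1.0) ** depth

print(sigmoid_factor)  # about 9.5e-07
print(relu_factor)     # 1.0
```

Even in the sigmoid's best case the factor shrinks below one millionth after 10 layers, which is one reason ReLU appears in the solution column.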

🔧 4. PyTorch Code: Simple DNN for Classification

import torch.nn as nn

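class SimpleDNN(nn.Module):
    """Classifier for flattened 28×28 inputs with 10 output classes."""

    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 512),     # Input to Hidden 1
            nn.ReLU(),
            nn.Linear(512, 256),     # Hidden 1 to Hidden 2
            nn.ReLU(),
            nn.Linear(256, 10)       # Output Layer (raw class scores)
        )

    def forward(self, x):
        # x: tensor of shape (batch_size, 784)
        return self.model(x)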

🔬 5. Applications of DNNs

Domain	Use Case
Computer Vision	Image classification, Object detection
NLP	Text classification, Sentiment analysis
Healthcare	Disease prediction from X-rays
Finance	Credit scoring, Fraud detection
Robotics	Sensor fusion, control systems

📈 Summary:

Term	Meaning
DNN	Neural network with 2+ hidden layers
Depth	Refers to number of layers
Power	Learns complex mappings from data
Challenges	Vanishing gradients, Overfitting, Compute cost

📝 Practice Questions:

What is the difference between a DNN and a shallow network?
What are the steps in training a DNN?
What is the vanishing gradient problem and how is it solved?
Explain how to implement a DNN in PyTorch.
In which domains are DNNs used?