CNN Layers (Convolution, Pooling, Flatten, Dense)

(How do the layers in a CNN work?)

🔶 1. Overview of CNN Architecture

The goal of a CNN is to automatically extract important features from raw images and use them for classification, detection, or segmentation.

A CNN is typically organized into the following layers:

[Input Image]
   ↓
Convolution Layer
   ↓
Activation (ReLU)
   ↓
Pooling Layer
   ↓
(Repeat Conv + Pool) ...
   ↓
Flatten Layer
   ↓
Dense Layer(s)
   ↓
Output (e.g. Softmax)

🧩 2. Detailed Explanation of Each Layer

✅ A. Convolution Layer

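Extracts features from the input image
Applies multiple filters (kernels), each sliding across the image
Output = Feature Maps (one per filter)

Math: y = w ∗ x + b   (w = filter weights, x = input patch, b = bias, ∗ = convolution)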
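To make the formula concrete, here is a minimal pure-Python sketch of one filter sliding over one single-channel image. The function name `conv2d_single` and the sample values are illustrative assumptions, not from the original notes. It assumes no padding and stride 1, and, like deep learning frameworks generally do, it does not flip the kernel (strictly speaking this is cross-correlation).

```python
def conv2d_single(image, kernel, bias=0.0):
    """Slide one 2D kernel over one 2D image (no padding, stride 1)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = bias
            # Multiply the kernel with the patch under it and sum the products
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x3 image and 2x2 kernel, chosen only for illustration
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_single(image, kernel)
print(feature_map)  # [[-4.0, -4.0], [-4.0, -4.0]]
```

A real convolution layer repeats this for every input channel and every filter, and learns the kernel values during training.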

PyTorch:

nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
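The `kernel_size`, `stride`, and `padding` arguments together decide the spatial size of the output. As a rough sketch (the helper name `conv_output_size` is an assumption, and dilation is taken to be 1), the standard formula can be checked like this:

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output height or width of a convolution or pooling layer (dilation = 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# The settings from the snippet above keep a 32-pixel side at 32
same_size = conv_output_size(32, kernel_size=3, stride=1, padding=1)
# Without padding, a 3x3 kernel trims one pixel from each border
no_padding = conv_output_size(32, kernel_size=3)
print(same_size, no_padding)  # 32 30
```

With `kernel_size=3` and `padding=1`, the output keeps the input's height and width, which is why the example architecture later in this section only shrinks at the pooling layers.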

✅ B. Activation Function (ReLU)

Introduces non-linearity
Sets negative values to 0 and leaves positive values unchanged

f(x) = max(0, x)

PyTorch:

nn.ReLU()
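The effect of ReLU is easy to see on a handful of numbers. This is a plain-Python sketch, with the helper name `relu` and the sample values assumed for illustration:

```python
def relu(values):
    """Apply f(x) = max(0, x) to each element."""
    return [max(0.0, v) for v in values]


activations = relu([-2.0, -0.5, 0.0, 1.5, 3.0])
print(activations)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```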

✅ C. Pooling Layer (MaxPooling or AvgPooling)

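Compresses (downsamples) the feature map
Retains the most important features
Can help reduce overfitting, since later layers see fewer values and small shifts in the input matter less

Types:

Max Pooling: selects the maximum value in each window
Average Pooling: takes the average of each window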

PyTorch:

nn.MaxPool2d(kernel_size=2, stride=2)
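The difference between the two pooling types shows up clearly on a small example. Below is a minimal sketch in plain Python; the function `pool2d` and the 4x4 feature map are hypothetical, and the window is 2x2 with stride 2 as in the snippet above.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Pool a 2D feature map with a square window; mode is "max" or "avg"."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            # Collect the values covered by the current window
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="avg")
print(max_pooled)  # [[6, 4], [8, 9]]
print(avg_pooled)  # [[3.75, 2.25], [4.5, 4.0]]
```

Both variants halve the height and width; max pooling keeps the strongest response in each window, while average pooling smooths over it.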

✅ D. Flatten Layer

Converts each sample's stack of 2D feature maps into a single 1D vector
so that the Dense layer can process it

Example:

Shape: (Batch, Channels, Height, Width) → (Batch, Features)

PyTorch:

x = x.view(x.size(0), -1)
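For a single sample, flattening just reads the values channel by channel, row by row. The sketch below uses nested Python lists in place of a tensor; the name `flatten_sample` and the tiny 2-channel input are assumptions for illustration.

```python
def flatten_sample(feature_maps):
    """Flatten one sample of shape (Channels, Height, Width) into a 1D list."""
    return [value for channel in feature_maps for row in channel for value in row]


# Hypothetical sample: 2 channels, each 2x2
sample = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
flat = flatten_sample(sample)
print(flat)       # [1, 2, 3, 4, 5, 6, 7, 8]
print(len(flat))  # 8 = 2 * 2 * 2 (Channels * Height * Width)
```

The length of the flattened vector, Channels × Height × Width, is the value the first Dense layer needs as its input size.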

✅ E. Fully Connected (Dense) Layer

The traditional neural-network layer: every input is connected to every output
Performs the final classification or regression

PyTorch:

nn.Linear(in_features=512, out_features=10)
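Under the hood, a dense layer computes a weighted sum of all inputs plus a bias for each output. Here is a minimal plain-Python sketch; the function `dense` and the weight values are made up for illustration, whereas a real layer learns its weights during training.

```python
def dense(inputs, weights, biases):
    """Compute one output per weight row: sum(w_i * x_i) + b."""
    return [
        sum(w * x for w, x in zip(weight_row, inputs)) + bias
        for weight_row, bias in zip(weights, biases)
    ]


# Hypothetical layer with 3 inputs and 2 outputs
inputs = [1.0, 2.0, 3.0]
weights = [
    [1.0, 0.0, -1.0],  # weights for output 0
    [0.5, 0.5, 0.5],   # weights for output 1
]
biases = [0.0, 1.0]
outputs = dense(inputs, weights, biases)
print(outputs)  # [-2.0, 4.0]
```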

📊 CNN Architecture Diagram (Example)

Input: 32x32x3 (RGB Image)
   ↓
Conv (3x3, 16 filters) → 32x32x16
   ↓
ReLU
   ↓
MaxPool (2x2) → 16x16x16
   ↓
Conv (3x3, 32 filters) → 16x16x32
   ↓
ReLU
   ↓
MaxPool (2x2) → 8x8x32
   ↓
Flatten → 2048
   ↓
Dense → 128
   ↓
Output → 10 classes
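The shapes in the diagram can be re-derived with a few lines of arithmetic. This sketch assumes, as in the diagram, 3x3 convolutions with stride 1 and padding 1, and 2x2 max pooling with stride 2; the function name `trace_shapes` is hypothetical.

```python
def trace_shapes(height=32, width=32, channels=3):
    """Return (layer name, shape) pairs for the example architecture."""
    shapes = [("input", (height, width, channels))]
    for index, filters in enumerate((16, 32), start=1):
        # 3x3 conv, stride 1, padding 1: height and width are unchanged
        height = (height + 2 * 1 - 3) // 1 + 1
        width = (width + 2 * 1 - 3) // 1 + 1
        channels = filters
        shapes.append((f"conv{index}", (height, width, channels)))
        # 2x2 max pool, stride 2: height and width are halved
        height, width = height // 2, width // 2
        shapes.append((f"pool{index}", (height, width, channels)))
    shapes.append(("flatten", (height * width * channels,)))
    return shapes


shapes = trace_shapes()
for name, shape in shapes:
    print(name, shape)
# input (32, 32, 3)
# conv1 (32, 32, 16)
# pool1 (16, 16, 16)
# conv2 (16, 16, 32)
# pool2 (8, 8, 32)
# flatten (2048,)
```

ReLU is left out of the trace because it does not change the shape.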

🔧 3. PyTorch Code (Simple CNN Model)

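import torch.nn as nn

class SimpleCNN(nn.Module):
    """Small CNN for 3-channel 32x32 images and 10 output classes."""

    def __init__(self):
        super().__init__()
        # Feature extractor: two Conv -> ReLU -> MaxPool stages
        self.conv_layers = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1),   # Conv1: 3x32x32 -> 16x32x32
            nn.ReLU(),
            nn.MaxPool2d(2, 2),               # Pool1: 16x32x32 -> 16x16x16
            nn.Conv2d(16, 32, 3, padding=1),  # Conv2: 16x16x16 -> 32x16x16
            nn.ReLU(),
            nn.MaxPool2d(2, 2)                # Pool2: 32x16x16 -> 32x8x8
        )
        # Classifier: flatten the feature maps, then two dense layers
        self.fc_layers = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, 128),       # Flatten to Dense: 2048 -> 128
            nn.ReLU(),
            nn.Linear(128, 10)                # Output: one score per class
        )

    def forward(self, x):
        # x has shape (Batch, 3, 32, 32)
        x = self.conv_layers(x)
        x = self.fc_layers(x)
        return x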
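To get a feel for where this model's capacity sits, the number of learnable parameters per layer can be worked out by hand. The sketch below is plain arithmetic with hypothetical helper names (`conv_params`, `linear_params`); it assumes every layer has a bias term, which is the default for `nn.Conv2d` and `nn.Linear`.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights (in * out * k * k) plus one bias per output channel."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def linear_params(in_features, out_features):
    """Weights (in * out) plus one bias per output feature."""
    return in_features * out_features + out_features


layer_params = {
    "conv1": conv_params(3, 16, 3),          # 448
    "conv2": conv_params(16, 32, 3),         # 4640
    "fc1": linear_params(32 * 8 * 8, 128),   # 262272
    "fc2": linear_params(128, 10),           # 1290
}
total_params = sum(layer_params.values())
print(layer_params)
print(total_params)  # 268650
```

Almost all of the parameters belong to the first dense layer, which is one reason pooling before the Flatten step matters: halving the height and width cuts that layer's input size by a factor of four.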

📈 Summary Table:

Layer	Purpose	Output Shape
Conv	Extract features	(H, W, Filters) when padding preserves size
ReLU	Add non-linearity	Same as input
Pooling	Downsample	(H/2, W/2, Filters) for a 2x2 window with stride 2
Flatten	Build a 1D vector	(H × W × Filters)
Dense	Predict classes	(Number of output classes)

📝 Practice Questions:

What does a filter do in a convolution layer?
What is the difference between MaxPooling and AveragePooling?
Why is Flatten necessary in a CNN?
What is the role of the Dense layer in a CNN?
Explain how to define a CNN model in PyTorch.