Learning Rate, Epochs, and Batches

🔶 1. Learning Rate (the speed of learning)

📌 Definition:

The learning rate (η) is a hyperparameter that controls how large a step the weights take on each update during training.

It appears in the gradient descent update rule:

$$w \leftarrow w - \eta \, \nabla_w L(w)$$

Here w denotes the weights and L the loss.

🎯 Role of the Learning Rate:

Value	Effect
Too small (e.g. < 0.0001)	Slow learning; training can stall on plateaus or in poor local minima
Too large (e.g. > 1.0)	Overshooting; unstable or diverging training
Well-chosen moderate value	Smooth convergence toward a low loss

📈 Visual Explanation:

Low LR: creeps slowly down into the valley
High LR: keeps jumping back and forth and misses the valley
Ideal LR: heads straight into the valley
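
The three cases above can be reproduced on a toy problem. The sketch below is only an illustration: it assumes the loss f(w) = w², whose gradient is 2w, and the function name `gradient_descent` and the specific learning rates are chosen for this example, not taken from any library. The exact thresholds for "too small" and "too large" depend on the loss being minimized.

```python
def gradient_descent(lr, steps=20, w0=1.0):
    """Minimize f(w) = w**2 with plain gradient descent; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2 * w        # derivative of w**2
        w = w - lr * grad   # update rule: w <- w - lr * gradient
    return w

small = gradient_descent(lr=0.001)  # low LR: barely moves away from w0 = 1.0
good = gradient_descent(lr=0.3)     # moderate LR: shrinks quickly toward the minimum at 0
large = gradient_descent(lr=1.1)    # high LR: overshoots, so |w| grows on every step

print(small, good, large)
```

On this particular function each step multiplies w by (1 − 2·lr), which is why a rate above 1.0 makes the iterate grow instead of shrink.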

📘 Learning Rate in PyTorch:

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

🔶 2. Epochs (Full Passes over the Dataset)

📌 Definition:

An epoch is one cycle in which the entire training dataset is passed through the neural network once, covering both the forward and the backward pass.

If you have 1000 images and run 10 epochs, the model has seen the dataset 10 times.

🎯 What More Epochs Mean:

The model gets more chances to learn
But the risk of overfitting increases
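
One common way to spot this trade-off is to track the loss on held-out validation data after every epoch. The sketch below is a hypothetical illustration: the loss values are made up, and the helper `best_epoch` is a name introduced here. It shows the typical pattern in which training loss keeps falling while validation loss starts to rise once the model begins to overfit.

```python
# Hypothetical per-epoch losses, made up purely for illustration.
train_loss = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22, 0.18, 0.15]
val_loss = [0.95, 0.70, 0.58, 0.52, 0.50, 0.53, 0.57, 0.62]

def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i]) + 1

stop_at = best_epoch(val_loss)
print(f"Validation loss is lowest at epoch {stop_at}")
```

Training past that epoch keeps improving the fit to the training data, but in this made-up run it no longer improves performance on unseen data.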

🔶 3. Batches and Batch Size

📌 Batch:

Splitting the dataset into small chunks and training on one chunk at a time is called batch training (more precisely, mini-batch training).

A forward and a backward pass are performed on each batch.

Batch Size: the number of samples processed together
Common sizes: 8, 16, 32, 64, 128
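
As a minimal sketch of the idea, the plain-Python helper below splits a list of samples into consecutive batches. The name `make_batches` is hypothetical and chosen for this example; in practice a data loader does this work (and usually shuffles as well).

```python
def make_batches(samples, batch_size):
    """Split a list of samples into consecutive chunks of at most batch_size."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

samples = list(range(10))
batches = make_batches(samples, batch_size=4)
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Note that the last batch is smaller when the dataset size is not a multiple of the batch size.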

🎯 Why Use Batches?

Advantage	Explanation
Memory Efficient	No need to load the entire dataset into memory at once
Faster Computation	The GPU processes the samples of a batch in a vectorized way
Noise helps generalization	The noise in stochastic updates can help protect the model from overfitting

🔁 Relationship Between All Three:

Concept	Definition
Epoch	One full pass over the entire dataset
Batch Size	Number of samples processed at once
Iteration	One update step = One batch

Example:

Dataset size = 1000
Batch size = 100
Then, 1 epoch = 10 iterations
If we train for 10 epochs → total 100 iterations
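
The same arithmetic can be written as a small helper. This is a sketch under one assumption: a final, smaller batch is still used for an update when the dataset size is not divisible by the batch size, hence the `math.ceil`. The function names are hypothetical.

```python
import math

def iterations_per_epoch(dataset_size, batch_size):
    """Number of update steps in one epoch, counting a smaller final batch."""
    return math.ceil(dataset_size / batch_size)

def total_iterations(dataset_size, batch_size, epochs):
    """Number of update steps over the whole training run."""
    return iterations_per_epoch(dataset_size, batch_size) * epochs

print(iterations_per_epoch(1000, 100))   # 10
print(total_iterations(1000, 100, 10))   # 100
print(iterations_per_epoch(1000, 64))    # 16 (15 full batches plus one batch of 40)
```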

🔧 PyTorch Code:

from torch.utils.data import DataLoader, TensorDataset
import torch

# Dummy data
X = torch.randn(1000, 10)
y = torch.randint(0, 2, (1000, 1)).float()
dataset = TensorDataset(X, y)

# DataLoader with batch size
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)

# Training loop
for epoch in range(5):  # 5 epochs
    for batch_x, batch_y in dataloader:
        # Forward pass, loss calculation, backward, step
        ...
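
For reference, here is one way the placeholder in the loop might be filled in. It is a sketch, not the only valid setup: the single linear layer, the `BCEWithLogitsLoss` criterion, and the SGD optimizer with `lr=0.01` are assumptions made for this example and fit the dummy binary labels above.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Same dummy data as above: 1000 samples, 10 features, binary labels
X = torch.randn(1000, 10)
y = torch.randint(0, 2, (1000, 1)).float()
dataloader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Linear(10, 1)             # assumed model: one linear layer producing a logit
criterion = nn.BCEWithLogitsLoss()   # assumed loss for binary labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

iterations = 0
for epoch in range(5):                        # 5 epochs
    epoch_loss = 0.0
    for batch_x, batch_y in dataloader:       # one iteration per batch
        optimizer.zero_grad()                 # clear gradients from the previous step
        loss = criterion(model(batch_x), batch_y)  # forward pass and loss
        loss.backward()                       # backward pass
        optimizer.step()                      # weight update scaled by the learning rate
        epoch_loss += loss.item()
        iterations += 1
    print(f"epoch {epoch + 1}: mean batch loss = {epoch_loss / len(dataloader):.4f}")
```

With 1000 samples and a batch size of 64, each epoch has 16 iterations (the last batch holds 40 samples), so 5 epochs give 80 weight updates in total.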

📝 Summary Table:

Term	Meaning	Typical Value
Learning Rate	Step size for weight updates	0.001 – 0.01
Epoch	One full pass over dataset	10 – 100
Batch Size	Samples per update	32, 64, 128
Iteration	One weight update step	ceil(dataset_size / batch_size) per epoch

🎯 Objectives Recap:

Learning Rate = how far the weights move on each update
Epoch = how many times the dataset is passed through the model
Batch Size = how many samples are processed at once
Tuning all three is critical for model performance

📝 Practice Questions:

What is the learning rate, and what does it do?
How are batch size and iterations related?
When is the risk of overfitting higher: with fewer epochs or with more epochs?
What is the role of DataLoader in PyTorch?
Why is batch training necessary?