Model Training, Saving, and Loading in PyTorch

Let's now walk through the full process of training a model in PyTorch and then saving and loading it,
a core part of any Deep Learning project.


🔷 1. 🔁 Model Training in PyTorch

🧱 Training Steps Overview:

  1. Build the model (nn.Module)
  2. Choose a loss function (nn.CrossEntropyLoss, etc.)
  3. Set up the optimizer (torch.optim)
  4. Run the forward pass
  5. Calculate the loss
  6. Run the backward pass (loss.backward())
  7. Take an optimizer step (optimizer.step())

🧪 Full Example (Classifier):

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Sample data
X = torch.tensor([[0.,0.],[0.,1.],[1.,0.],[1.,1.]])
y = torch.tensor([[0.],[1.],[1.],[0.]])

dataset = TensorDataset(X, y)
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

# Step 1: Model
class XORNet(nn.Module):
    def __init__(self):
        super(XORNet, self).__init__()
        self.fc1 = nn.Linear(2, 4)
        self.fc2 = nn.Linear(4, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return torch.sigmoid(self.fc2(x))

model = XORNet()

# Step 2: Loss and Optimizer
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.1)

# Step 3: Training loop
for epoch in range(500):
    for xb, yb in dataloader:
        y_pred = model(xb)
        loss = criterion(y_pred, yb)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if epoch % 100 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item():.4f}")

🔷 2. 💾 Saving a PyTorch Model

There are two ways to save a model in PyTorch:

✅ Option 1: State Dict Only (Recommended)

torch.save(model.state_dict(), "xor_model.pth")

This saves only the model's weights (its parameters), not the architecture.
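
To see exactly what gets saved, you can inspect the state dict. A minimal sketch, assuming the XORNet model defined above:

# Sketch: inspect the contents of the state dict (parameter name -> tensor)
state = model.state_dict()
for name, tensor in state.items():
    print(name, tuple(tensor.shape))
# Expected entries: fc1.weight (4, 2), fc1.bias (4,), fc2.weight (1, 4), fc2.bias (1,)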


✅ Option 2: Complete Model (Not Recommended)

torch.save(model, "xor_model_full.pth")

This saves the entire model including its structure, but it can run into version-compatibility issues because the whole class is pickled.
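
Loading a fully saved model then looks like this. A sketch; note that the XORNet class definition must still be importable at load time:

# Sketch: load the fully pickled model (the XORNet class must still be defined/importable)
loaded_model = torch.load("xor_model_full.pth")
# On newer PyTorch releases torch.load defaults to weights_only=True, so loading a
# pickled module may require torch.load("xor_model_full.pth", weights_only=False)
loaded_model.eval()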


🔷 3. 📂 Loading a Model

🔁 Load from State Dict (Best Practice):

model = XORNet()  # build the architecture first
model.load_state_dict(torch.load("xor_model.pth"))
model.eval()  # switching to evaluation mode is essential for inference

🔍 .eval() switches layers such as Dropout and BatchNorm into inference mode (Dropout is disabled and BatchNorm uses its running statistics).
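
To see the effect of the two modes, compare a Dropout layer in .train() versus .eval(). This is a small illustrative sketch; the layer and input are made up for the demonstration:

# Illustration: Dropout behaves differently in train vs eval mode
drop = nn.Dropout(p=0.5)
x = torch.ones(1, 4)

drop.train()
print(drop(x))  # roughly half the values are zeroed, the rest scaled by 1/(1-p) = 2

drop.eval()
print(drop(x))  # identity: the input passes through unchanged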


🧠 Bonus: GPU Compatibility (Saving/Loading)

✅ Save on GPU, Load on CPU:

# Save (from GPU)
torch.save(model.state_dict(), "model_gpu.pth")

# Load on CPU
device = torch.device("cpu")
model.load_state_dict(torch.load("model_gpu.pth", map_location=device))
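
The reverse direction works the same way through map_location. A sketch, assuming a CUDA device is available:

# Sketch: load the weights directly onto the GPU (assumes CUDA is available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = XORNet()
model.load_state_dict(torch.load("model_gpu.pth", map_location=device))
model.to(device)  # move parameters and buffers onto the chosen device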

🧪 Example: Inference After Loading

model.eval()
with torch.no_grad():
    test = torch.tensor([[1., 1.]])
    output = model(test)
    print("Predicted:", output.item())

📦 Advanced Tip: Save Optimizer State Too

torch.save({
    'model_state': model.state_dict(),
    'optimizer_state': optimizer.state_dict(),
}, "checkpoint.pth")

Load Later:

checkpoint = torch.load("checkpoint.pth")
model.load_state_dict(checkpoint['model_state'])
optimizer.load_state_dict(checkpoint['optimizer_state'])
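
In practice a checkpoint often also carries bookkeeping such as the epoch number and last loss so training can resume where it stopped. A sketch; the 'epoch' and 'loss' keys below are illustrative assumptions, not something PyTorch requires:

# Sketch: a richer checkpoint for resuming training ('epoch' and 'loss' keys are illustrative)
torch.save({
    'model_state': model.state_dict(),
    'optimizer_state': optimizer.state_dict(),
    'epoch': epoch,
    'loss': loss.item(),
}, "checkpoint.pth")

checkpoint = torch.load("checkpoint.pth")
model.load_state_dict(checkpoint['model_state'])
optimizer.load_state_dict(checkpoint['optimizer_state'])
start_epoch = checkpoint['epoch'] + 1  # resume training from the next epoch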

🧠 Evaluation Tips

  • Always call model.eval() before running inference
  • Run predictions inside torch.no_grad() to save memory and skip gradient tracking (see the sketch after this list)
  • Use model checkpoints for large models and long training runs
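
Putting the first two tips together, a simple accuracy check on the XOR data could look like this. A minimal sketch that reuses the X and y tensors defined in the training example:

# Sketch: evaluate accuracy on the training data (X, y from the example above)
model.eval()
with torch.no_grad():
    preds = (model(X) > 0.5).float()        # threshold the sigmoid outputs at 0.5
    accuracy = (preds == y).float().mean()  # fraction of correct predictions
    print(f"Accuracy: {accuracy.item():.2f}")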

📝 Practice Questions:

  1. What are the main steps for training a model in PyTorch?
  2. What is the difference between saving the state dict and saving the full model?
  3. Why should the optimizer state be saved?
  4. What is the difference between the .eval() and .train() modes?
  5. Why use torch.no_grad() during inference?

🧠 Summary Table

Task                     | Method
🔧 Train                 | Forward → Loss → Backward → Optimizer step
💾 Save Weights          | torch.save(model.state_dict(), path)
📂 Load Weights          | model.load_state_dict(torch.load(path))
📌 Inference             | model.eval() + with torch.no_grad()
🔁 Save with Optimizer   | Use checkpoint = {'model': ..., 'opt': ...}