A simple Multi-Layer Perceptron (MLP) in PyTorch to learn the XOR function

import torch
import torch.nn as nn
import torch.optim as optim

X = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
Y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32)

class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.hidden = nn.Linear(2, 2)   # 2 neurons in hidden layer
        self.output = nn.Linear(2, 1)   # 1 neuron in output layer
        self.sigmoid = nn.Sigmoid()     # Activation function

    def forward(self, x):
        x = self.sigmoid(self.hidden(x))  # Hidden layer
        x = self.sigmoid(self.output(x))  # Output layer
        return x

model = MLP()
criterion = nn.MSELoss() # Mean Squared Error Loss
optimizer = optim.SGD(model.parameters(), lr=0.1) # Stochastic Gradient Descent

epochs = 1000
for epoch in range(epochs):
    optimizer.zero_grad()        # Clear gradients
    output = model(X)            # Forward pass
    loss = criterion(output, Y)  # Compute loss
    loss.backward()              # Backpropagation
    optimizer.step()             # Update weights

    if (epoch + 1) % 100 == 0:
        print(f"Epoch {epoch+1}/{epochs}, Loss: {loss.item():.4f}")

print("\nFinal Predictions After Training:")
print(model(X).detach().numpy()) # Convert to NumPy for readability

This is a simple Multi-Layer Perceptron (MLP) in PyTorch that learns the XOR function. Let's break it down line by line and explain the purpose of each keyword and expression in a beginner-friendly way.

import torch
import torch.nn as nn
import torch.optim as optim

import torch: Loads the PyTorch library, which lets you work with tensors (array-like data structures) and build models.

import torch.nn as nn: Imports the neural network module, which includes layers, activation functions, etc.

import torch.optim as optim: Imports optimization algorithms (e.g., SGD, Adam) that update model weights during training.

X = torch.tensor([[0,0],[0,1],[1,0],[1,1]], dtype=torch.float32)
Y = torch.tensor([[0],[1],[1],[0]], dtype=torch.float32)

torch.tensor(...): Converts lists into PyTorch tensors.

dtype=torch.float32: Ensures inputs are in float format (required for neural networks).

X is the input (2-bit values for XOR), and Y is the target (output of XOR).
So:

0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
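A quick way to confirm the data is shaped the way the network expects (this check is not part of the original script, just a sanity aid):

print(X.shape, X.dtype)  # torch.Size([4, 2]) torch.float32: 4 samples, 2 features each
print(Y.shape)           # torch.Size([4, 1]): one target per sample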

class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.hidden = nn.Linear(2, 2)
        self.output = nn.Linear(2, 1)
        self.sigmoid = nn.Sigmoid()

class MLP(nn.Module): Defines a custom neural network called MLP, subclassing nn.Module (base class for all models).

def __init__(self): Constructor; defines the layers and activations.

super(MLP, self).__init__(): Initializes the parent class (nn.Module) to use its features.

self.hidden = nn.Linear(2, 2): First layer (input to hidden); 2 inputs → 2 hidden units.

self.output = nn.Linear(2, 1): Second layer (hidden to output); 2 inputs → 1 output.

self.sigmoid = nn.Sigmoid(): Sigmoid activation to introduce non-linearity, necessary for learning XOR.
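To see what nn.Linear(2, 2) actually creates, you can inspect a fresh layer's parameters (a quick check, not part of the original script):

layer = nn.Linear(2, 2)
print(layer.weight.shape)  # torch.Size([2, 2]): one row of weights per output unit
print(layer.bias.shape)    # torch.Size([2]): one bias per output unit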

def forward(self, x):
    x = self.sigmoid(self.hidden(x))
    x = self.sigmoid(self.output(x))
    return x

forward(): This is automatically called when you do model(X). It defines the flow of data in the network.

self.hidden(x): Applies the hidden layer (matrix multiplication + bias).

self.sigmoid(...): Applies sigmoid to each layer’s output.

return x: Gives the final output.
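To make "matrix multiplication + bias" concrete, here is a hand-computed forward pass that should match model(X) exactly. This is a sanity sketch, assuming the model instance created in the full listing above; note that nn.Linear stores its weight with shape (out_features, in_features), hence the transpose:

h = torch.sigmoid(X @ model.hidden.weight.T + model.hidden.bias)    # hidden layer by hand
out = torch.sigmoid(h @ model.output.weight.T + model.output.bias)  # output layer by hand
print(torch.allclose(out, model(X)))  # True: same result as calling the model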

model = MLP()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

model = MLP(): Instantiates the model.

criterion = nn.MSELoss(): Mean Squared Error loss function. MSE is most common in regression; for a binary target like XOR, nn.BCELoss is the more typical choice, but MSE works fine for this tiny demo.

optimizer = optim.SGD(model.parameters(), lr=0.1):

  • SGD is Stochastic Gradient Descent (used to update weights).
  • model.parameters(): Gets the weights and biases of the model.
  • lr=0.1: Learning rate controls how fast the model learns.
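Neither MSELoss nor SGD is magic; both can be written out in a few lines. A minimal sketch of what they compute internally, using the model, data, and criterion defined above:

output = model(X)
manual_mse = ((output - Y) ** 2).mean()                  # exactly what nn.MSELoss() computes
print(torch.allclose(manual_mse, criterion(output, Y)))  # True

manual_mse.backward()          # fills in .grad for every parameter
with torch.no_grad():
    for p in model.parameters():
        p -= 0.1 * p.grad      # one plain-SGD update: param = param - lr * grad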

epochs = 1000
for epoch in range(epochs):
optimizer.zero_grad()
output = model(X)
loss = criterion(output, Y)
loss.backward()
optimizer.step()

Explanation:

  • epochs = 1000: Number of training iterations.
  • for epoch in range(epochs): Loop over the training process.
  • optimizer.zero_grad(): Clears gradients from the previous step (important! see the short demo after this list).
  • output = model(X): Runs a forward pass through the model.
  • loss = criterion(output, Y): Calculates loss between predicted output and true output.
  • loss.backward(): Backpropagates the loss (computes gradients).
  • optimizer.step(): Updates the model parameters using gradients.
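Here is the short demo promised above of why zero_grad() matters: PyTorch accumulates gradients across backward() calls, so stale gradients must be cleared each iteration. A toy example with a single hypothetical parameter:

w = torch.tensor(1.0, requires_grad=True)
(w * 2).backward()
print(w.grad)        # tensor(2.)
(w * 2).backward()   # without zeroing, gradients add up
print(w.grad)        # tensor(4.) instead of 2.
w.grad.zero_()       # what optimizer.zero_grad() does for each parameter
print(w.grad)        # tensor(0.)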

if (epoch + 1) % 100 == 0:
    print(f"Epoch {epoch+1}/{epochs}, Loss: {loss.item():.4f}")

Every 100 epochs, prints the loss.

loss.item() gets the Python number from the tensor for display.

:.4f formats it to 4 decimal places.
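For example, with a hypothetical loss value:

loss = torch.tensor(0.123456)
print(f"Loss: {loss.item():.4f}")  # prints: Loss: 0.1235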

print("\nFinal Predictions After Training:")
print(model(X).detach().numpy())

model(X): Feeds the input to the trained model.

.detach(): Detaches the output from the computation graph. We use this because we don’t need gradients during inference.

.numpy(): Converts the tensor into a NumPy array for easy reading/printing.
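The raw predictions are probabilities near 0 or 1, not exact integers. If you want hard XOR outputs, you can threshold them at 0.5 (a small extra step, not in the original script):

preds = model(X).detach()
print((preds > 0.5).int())  # rounds each probability to a hard 0/1 XOR answer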

✅ WHY USE THESE KEYWORDS?

Keyword       Why It's Used
super()       Initializes the base class (nn.Module)
Sigmoid()     Provides the non-linearity needed to solve XOR
zero_grad()   Clears old gradients to avoid accumulation
backward()    Computes gradients using backpropagation
step()        Updates weights using the gradients
detach()      Stops PyTorch from tracking computations
numpy()       Converts a tensor to a NumPy array for viewing


What is AutoML?

AutoML, or Automated Machine Learning, refers to the process of automating the end-to-end tasks of applying machine learning to real-world problems. It aims to make machine learning accessible to non-experts and improve the efficiency of experts by automating the complex and time-consuming tasks involved in creating machine learning models.

Key Components of AutoML:

  1. Data Preprocessing: AutoML systems automate the process of cleaning and preparing raw data, which can include tasks like handling missing values, normalizing data, encoding categorical variables, and feature selection.
  2. Feature Engineering: AutoML can automatically create new features from the raw data that might be more informative for the machine learning model. This step is crucial as it can significantly impact the performance of the model.
  3. Model Selection: Instead of manually selecting a machine learning algorithm, AutoML systems can automatically choose the best algorithm for a given task. This is done by evaluating multiple algorithms and selecting the one that performs best according to specific criteria, such as accuracy or efficiency.
  4. Hyperparameter Optimization: AutoML systems automatically tune the hyperparameters of machine learning models. Hyperparameters are the settings that control the behavior of the learning algorithm and can have a significant impact on model performance. AutoML uses techniques like grid search, random search, or more advanced methods like Bayesian optimization to find the best hyperparameter values (a minimal random-search sketch follows this list).
  5. Neural Architecture Search (NAS): In deep learning, AutoML can be used to automatically design the architecture of neural networks. This involves searching for the best network structure, such as the number of layers, types of layers, and connections between layers, to optimize performance.
  6. Model Evaluation: AutoML systems typically include automated methods for evaluating model performance. This can involve cross-validation, testing on holdout datasets, or other techniques to ensure that the model generalizes well to new data.
  7. Model Deployment: Some AutoML tools also automate the deployment of models into production environments, making it easier to integrate machine learning into applications.
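As promised in point 4, here is a minimal sketch of one AutoML building block, random-search hyperparameter optimization, applied to the XOR task from the first post. The search space, trial count, and helper names are illustrative choices, not any specific AutoML library's API:

import random
import torch
import torch.nn as nn
import torch.optim as optim

X = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
Y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32)

def train_once(hidden_size, lr, epochs=1000):
    """Train a small MLP on XOR and return the final loss."""
    model = nn.Sequential(
        nn.Linear(2, hidden_size), nn.Sigmoid(),
        nn.Linear(hidden_size, 1), nn.Sigmoid(),
    )
    criterion = nn.MSELoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = criterion(model(X), Y)
        loss.backward()
        optimizer.step()
    return loss.item()

# Random search: sample configurations at random and keep the best one.
best = None
for trial in range(10):
    config = {"hidden_size": random.choice([2, 4, 8]),
              "lr": 10 ** random.uniform(-2, 0)}  # lr between 0.01 and 1
    score = train_once(**config)
    if best is None or score < best[0]:
        best = (score, config)
    print(f"Trial {trial}: {config} -> loss {score:.4f}")

print("Best:", best)

Real AutoML systems replace the random sampler with smarter strategies (e.g., Bayesian optimization) and search over models and preprocessing steps as well, but the keep-the-best-trial loop is the same idea.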

Benefits of AutoML:

  • Accessibility: AutoML lowers the barrier to entry for those who are not experts in machine learning, allowing more people to leverage AI in their work.
  • Efficiency: Automating the machine learning process can save time and resources, allowing data scientists to focus on higher-level tasks and problem-solving.
  • Optimization: AutoML often results in better-performing models because it can explore a larger space of possible models and configurations than a human could manually.

Applications of AutoML:

AutoML is used in various domains such as:

  • Image Processing: For tasks like image classification, object detection, and segmentation.
  • Natural Language Processing (NLP): For text classification, sentiment analysis, and translation.
  • Predictive Modeling: In finance, healthcare, and marketing for predicting outcomes like stock prices, patient diagnoses, or customer churn.
  • Recommender Systems: Automatically generating recommendations for users in e-commerce, streaming services, etc.

In summary, AutoML democratizes machine learning by automating many of the complex steps involved in creating and deploying models, making it easier for non-experts to build powerful AI systems while also enhancing the productivity of experienced data scientists.
