Wednesday, 18 March 2026
PyTorch Lightning Hyperparameter Optimization with Optuna
Image by Author | Ideogram
PyTorch Lightning is a high-level framework built on top of PyTorch that simplifies training, validating, and deploying deep learning models. For hyperparameter optimization, that is, the search for the set of hyperparameters that maximizes performance on a given task, Optuna pairs well with PyTorch Lightning: it integrates seamlessly and provides efficient search algorithms for exploring a large space of candidate configurations.
This article shows how to jointly use PyTorch Lightning and Optuna to guide the hyperparameter optimization process for a deep learning model. It is recommended to have a basic knowledge of practical construction and training of neural networks, ideally with PyTorch.
Step-by-Step Process
The process starts by installing and importing a series of necessary libraries and modules, including PyTorch Lightning and Optuna. The initial installation process may take some time to complete.
pip install pytorch_lightning
pip install optuna
pip install optuna-integration[pytorch_lightning]
Now, the many imports:
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning.loggers import TensorBoardLogger
import optuna
from optuna.integration import PyTorchLightningPruningCallback
When building neural network models with PyTorch Lightning, it is a common practice to set a random seed for reproducibility. You can do this by adding pl.seed_everything(42) at the start of your code, right after the imports.
Next, we define our neural network model architecture by creating a class that inherits from pl.LightningModule, Lightning's counterpart to PyTorch's Module class.
We named our class MNISTClassifier because it is a simple feed-forward neural network classifier we will train on the MNIST dataset for low-resolution image classification. It consists of a small number of linear layers preceded by an input layer that flattens the two-dimensional image data, and ReLU activation function in between. The class also defines both the forward() method for forward passes, and other methods that emulate training, test, and validation steps.
Outside the newly defined class, the following function will also be handy to calculate the mean accuracy across a set of many predictions:
def accuracy(preds, y):
    return (preds == y).float().mean()
The following function establishes a data preparation pipeline that we will later apply to the MNIST dataset. It converts the dataset to tensors, normalizes it based on a priori known statistics of this dataset, downloads and splits the training data into training and validation, and creates one DataLoader object for each of the three data subsets.
def prepare_data():
    transform = transforms.Compose([transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))])  # Mean and stdev of the MNIST dataset
    full_train = datasets.MNIST(".", train=True, download=True, transform=transform)
    test_set = datasets.MNIST(".", train=False, download=True, transform=transform)
    train_set, val_set = random_split(full_train, [55000, 5000])  # illustrative 55k/5k split
    return (DataLoader(train_set, batch_size=64, shuffle=True),
            DataLoader(val_set, batch_size=64), DataLoader(test_set, batch_size=64))
The objective() function is the core element Optuna provides for defining a hyperparameter optimization framework. We define it by first defining a search space: hyperparameters and possible values for them to try. There can be architectural hyperparameters related to the layers of the neural network, or hyperparameters related to the training algorithm, like the learning rate and dropout rate, for fighting issues like overfitting.
Inside this function, we initialize and train a model for each sampled hyperparameter setting, with an early-stopping callback that halts training once the monitored validation metric stops improving. As in any Lightning workflow, a Trainer object manages the training process.
Another key function governing the optimization process is run_optimization, where we indicate the number of trials to run against the objective function defined above.
Once the hyperparameter optimization process has been completed and the best model configuration is identified, another function is needed to take those results and evaluate that model’s performance on a test set for final validation.
def test_best_model(study):
    # Getting the best hyperparameters
    best_params = study.best_trial.params
    # Creating the model with the best hyperparameters and retraining it
    model = MNISTClassifier(**best_params)
    train_loader, val_loader, test_loader = prepare_data()
    trainer = pl.Trainer(max_epochs=5, enable_progress_bar=False)
    trainer.fit(model, train_loader, val_loader)
    results = trainer.test(model, test_loader)
    print(f"Test results with best hyperparameters: {results}")
Here is a TL;DR breakdown of the workflow:
Run a study (set of Optuna experiments) specifying the number of trials
Visualize the optimization process alongside the training runs
Once the best configuration is found, evaluate the resulting model on the held-out test data
Wrapping Up
This article illustrates how to use PyTorch Lightning and Optuna together to perform efficient and effective hyperparameter optimization for neural network models. Optuna provides efficient search algorithms for model tuning, while PyTorch Lightning builds on top of PyTorch to simplify neural network modeling at a higher level of abstraction.