Popcornn
Path Optimization with a Continuous Representation Neural Network (Popcornn) optimizes molecular reaction paths on analytic and machine-learning interatomic potentials (MLIPs).
Installation and Dependencies
We recommend using a conda environment to install the dependencies of this library. Install (or load) conda and then proceed with the following commands:
conda create --name popcornn python=3.12
conda activate popcornn
Now, you can install Popcornn in the conda environment by cloning this repository:
git clone https://github.com/khegazy/popcornn.git
pip install -e ./popcornn
Several machine learning potentials have been interfaced with Popcornn, such as CHGNet, EScAIP, LEFTNet, MACE, NewtonNet, Orb, and UMA. Please refer to the respective repositories for installation instructions. To run the latest UMA model, you will also need to set it up through Hugging Face.
Quick Start
You can find several run files inside the examples directory that rely on the implemented modules in the Popcornn library. We provide a simple run script, which needs to be accompanied by a yaml config file. You can run an example optimization script with the following command in the examples directory:
cd popcornn/examples
python run.py --config configs/rxn0003.yaml
All Popcornn parameters are specified in the config file. This example should complete in under an hour. Please note that we are still developing the convergence criteria, so you may adjust the number of optimization iterations to balance accuracy and speed.
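To give a feel for what such a config file contains, here is an illustrative sketch assembled from the parameter snippets shown throughout this README; the values are placeholders, not the contents of the shipped rxn0003.yaml:

```yaml
# Illustrative config sketch; keys mirror the examples in this README,
# values are placeholders rather than the shipped rxn0003.yaml.
images: configs/rxn0003.xyz
path_params:
  name: mlp
  n_embed: 1
  depth: 2
optimization_params:
  - potential_params:
      potential: repel
    integrator_params:
      path_ode_names: geodesic
    optimizer_params:
      optimizer:
        name: adam
        lr: 1.0e-1
    num_optimizer_iterations: 1000
```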
For fast development and experimentation with Popcornn, we offer two examples based on the Wolfe potential:
python run.py --config configs/wolfe.yaml
will run the fast Wolfe example and
python run.py --config configs/loss_example.yaml
will run the fast Wolfe example with more advanced loss capabilities. We note that the values in the loss_example.yaml file are chosen to demonstrate the capabilities of Popcornn and are not optimal for either the Wolfe potential or other systems.
Set up your own Popcornn
The config file is read in the run script as a dictionary, so you can also specify the configs directly in your own Python script, giving you finer control over the inputs and outputs.
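For instance, you can build the same settings a YAML file would hold as plain Python dictionaries; this is a minimal sketch (the keys mirror the YAML examples in this README, the values are illustrative) showing the kind of programmatic control a static config file cannot give you:

```python
# Build the same settings a YAML config would hold, as plain Python dicts.
base = {
    "path_params": {"name": "mlp", "n_embed": 1, "depth": 2},
}

# Programmatic control over the inputs, e.g. a sweep over the path depth:
configs = [
    {**base, "path_params": {**base["path_params"], "depth": depth}}
    for depth in (2, 3, 4)
]
```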
Initialize the path
The first step is to specify the endpoints of the reaction you are working on:
from ase.io import read
images = read('configs/rxn0003.xyz', index=':')
for image in images:
    image.info = {"charge": 0, "spin": 1}  # if required by the MLIP, set the total charge and spin multiplicity
The images argument can be a list of ASE Atoms objects, or a string path to an xyz or traj file, which Popcornn will read. If more than two frames are provided, the path is first fitted to pass through the intermediate frames, but they are not fixed. Note that the reactant and product should be index-mapped, rotationally and translationally aligned, and ideally unwrapped if periodic. By default, when periodic boundary conditions are applied, we unwrap the product with respect to the reactant according to the minimum image convention; however, if the cell is small and some atoms are expected to move more than half a cell, you should unwrap the frames manually and disable unwrap_positions in path_params (see below).
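If you do need to unwrap manually, the minimum image convention amounts to shifting each product atom by whole lattice vectors so that its displacement from the matching reactant atom is minimal. A minimal NumPy sketch for an orthorhombic cell (the function name and the orthorhombic restriction are our simplification, not Popcornn's API):

```python
import numpy as np

def unwrap_product(reactant, product, cell_lengths):
    """Shift each product atom by whole cells so its displacement from
    the matching reactant atom obeys the minimum image convention.
    Orthorhombic cells only; positions are (n_atoms, 3) arrays."""
    disp = product - reactant
    shift = np.round(disp / cell_lengths) * cell_lengths
    return product - shift
```

For a 10 Å cell, an atom at 9.9 Å whose reactant partner sits at 0.1 Å is moved to -0.1 Å, so the displacement is 0.2 Å rather than 9.8 Å.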
Next, you can set up the path using the images:
from popcornn import Popcornn
path = Popcornn(images=images, path_params={'name': 'mlp', 'n_embed': 1, 'depth': 2})
Optional initialization parameters for Popcornn include num_record_points (the number of frames recorded after optimization), output_dir (for optional debug outputs), device, dtype, and seed. For simpler reactions, a depth of 2 helps limit the complexity of the path, while more complicated reactions may require a deeper path neural network.
Optimize the path
Machine learning potentials are vulnerable to unphysical, out-of-distribution configurations, so it is important to resolve atomic clashes in an interpolation step. Luckily, you can do both the interpolation and the optimization with Popcornn. The state-of-the-art interpolation approach is to optimize the path on a monotonic, repulsive potential with respect to the geodesic loss, so-called geodesic interpolation. In general, therefore, you run multiple optimizations by providing multiple optimization_params, each with its own potential, integral loss, and optimizer:
final_images, ts_image = path.optimize_path(
    {
        'potential_params': {'potential': 'repel'},
        'integrator_params': {'path_ode_names': 'geodesic'},
        'optimizer_params': {'optimizer': {'name': 'adam', 'lr': 1.0e-1}},
        'num_optimizer_iterations': 1000,
    },
    {
        'potential_params': {'potential': 'uma', 'model_name': 'uma-s-1', 'task_name': 'omol'},
        'integrator_params': {'path_ode_names': 'projected_variational_reaction_energy', 'rtol': 1.0e-5, 'atol': 1.0e-7},
        'optimizer_params': {'optimizer': {'name': 'adam', 'lr': 1.0e-3}},
        'num_optimizer_iterations': 1000,
    },
)
Finally, after optimization, you can save the optimized path as a list of Atoms for visualization and further optimization:
from ase.io import write
write('popcornn.xyz', final_images)
write('popcornn_ts.xyz', ts_image)
In this example, you should get a barrier of ~3.6 eV. To be fully rigorous, we suggest a subsequent saddle point optimization of the Popcornn transition state, followed by forward and reverse intrinsic reaction coordinate (IRC) calculations, since Popcornn does not return a minimum energy path but targets the transition state directly. Both saddle point optimization and IRC calculations are supported by Sella as ASE optimizers.
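The barrier quoted above is simply the energy difference between the highest image along the path and the reactant; a trivial helper (hypothetical, not part of the Popcornn API) makes that explicit:

```python
def barrier(energies):
    """Barrier height: highest point along the path minus the reactant energy."""
    return max(energies) - energies[0]

# e.g. hypothetical per-image energies (eV) along an optimized path
print(barrier([0.0, 1.2, 3.6, 0.5]))  # prints 3.6
```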
Managing memory and handling potential OOM errors
Popcornn uses torchpathdiffeq to numerically calculate the path integral used in the loss. When calculating the path integral, torchpathdiffeq employs an adaptive evaluation mesh, which results in varying batch sizes within a single evaluation. Because the adaptive batch size may grow, it can lead to out-of-memory (OOM) errors. To avoid these, torchpathdiffeq adaptively limits the batch size based on the cached and free GPU memory. While infrequent, OOM errors can still occur for two reasons, each with its own solution; apply the solutions in the order below.
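The idea behind that adaptive limit can be sketched in a few lines; this is an illustration of the strategy, not torchpathdiffeq's actual code, and all names here are ours:

```python
def cap_batch_size(requested, free_bytes, bytes_per_sample, total_mem_usage=0.9):
    """Limit the next batch so it fits in a fraction of the free GPU memory."""
    max_samples = int(free_bytes * total_mem_usage // bytes_per_sample)
    return max(1, min(requested, max_samples))

# With 8 GiB free and ~4 MiB per path point, a requested batch of 4096
# is capped to what 90% of the free memory can hold.
capped = cap_batch_size(4096, free_bytes=8 * 2**30, bytes_per_sample=4 * 2**20)
```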
- PyTorch memory management is optimized to run a single batch size many times; however, the adaptive path integration scheme in torchpathdiffeq varies the batch size. To allow PyTorch to resize the memory used for batch evaluations, set the following environment variable
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
or in Python, place
import os
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'
at the top of the first Python file being called. This will solve most OOM errors.
- If the OOM error persists, then torchpathdiffeq is struggling to calculate the memory footprint for your problem. To solve this, limit the GPU memory torchpathdiffeq is allowed to use via the total_mem_usage flag. total_mem_usage is a ratio between 0 and 1 that determines how much of the available GPU memory may be allocated for the next batch evaluation. It is set in the run config file, inside integrator_params, with a default value of 0.9:
integrator_params:
path_ode_names: projected_variational_reaction_energy
rtol: 1.0e-5
atol: 1.0e-7
total_mem_usage: 0.75
Support
Popcornn is still under active development, and we welcome any feedback or contributions. Please open a GitHub issue if any problems are encountered!
File details
Details for the file popcornn-0.1.0.tar.gz.

File metadata
- Download URL: popcornn-0.1.0.tar.gz
- Upload date:
- Size: 57.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f64eb27ad20b0bf4c7add1994c93404967e748db7c9d74979f1d30d1204cda2b |
| MD5 | a52c166d7a155ec123adf62f23270f65 |
| BLAKE2b-256 | d7ca991cb3665dc8a83b8dd5b2435037bd26e80df741dad5d97e6770a16ee656 |
Provenance
The following attestation bundles were made for popcornn-0.1.0.tar.gz:

Publisher: release.yaml on khegazy/popcornn

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: popcornn-0.1.0.tar.gz
- Subject digest: f64eb27ad20b0bf4c7add1994c93404967e748db7c9d74979f1d30d1204cda2b
- Sigstore transparency entry: 476450112
- Sigstore integration time:
- Permalink: khegazy/popcornn@458aef5616f455825a59e4dc509029eecefdc8df
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/khegazy
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@458aef5616f455825a59e4dc509029eecefdc8df
- Trigger Event: release