Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks
Project description
CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks
CatTSunami is a framework for high-throughput enumeration of nudged elastic band (NEB) frame sets. It was built for use with machine-learned (ML) models trained on OC20, which were demonstrated to be performant on this auxiliary task. To train your own model or obtain pre-trained checkpoints, please see fairchem-core.
This repository contains the validation dataset, the framework for enumeration, and accompanying code to run ML-accelerated NEBs and validate new models. For more information, please read the manuscript.
Getting started
Configured for use:
- Install fairchem-core and fairchem-data-oc (see the installation instructions)
- Install this package:
pip install fairchem-applications-cattsunami
- Check out the tutorial notebook
Configured for local development:
- Clone the fairchem repo
- Install fairchem-data-oc and fairchem-core (see the installation instructions)
- Install this repository in editable mode:
pip install -e packages/fairchem-applications-cattsunami
- Check out the tutorial notebook
Validation Dataset
The validation dataset comprises 932 converged DFT NEB calculations, which can be used to assess model performance on this important task. Three reaction classes are considered: desorptions, dissociations, and transfers. In total, 2827 DFT NEBs were performed, including those that failed to converge. The unconverged systems are included in the ASE All Trajectories split below. For more information about the converged dataset, see the dataset markdown file.
Splits | Size of compressed version | Size of uncompressed version | MD5 checksum (download link)
---|---|---|---
ASE Converged Trajectories | 1.5G | 6.3G | 52af34a93758c82fae951e52af445089
ASE All Trajectories | 6.7G | 30G | f5829eeaf7219c5cd3cfb499b8d951da
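If you download one of these splits, you can check it against the MD5 checksum above and unpack it before use. A minimal sketch, assuming the compressed split was saved locally under the illustrative filename used below:
import hashlib
import tarfile

# Illustrative local filename for the downloaded "ASE Converged Trajectories" split
fname = "ase_converged_trajectories.tar.gz"
expected_md5 = "52af34a93758c82fae951e52af445089"  # from the table above

# Verify the download against the published MD5 checksum
md5 = hashlib.md5()
with open(fname, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        md5.update(chunk)
assert md5.hexdigest() == expected_md5, "Checksum mismatch - re-download the file"

# Unpack; the tar contains dissociations/, desorptions/, and transfers/ subdirectories
with tarfile.open(fname) as tar:
    tar.extractall("cattsunami_validation")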
Citing this work
If you use this codebase in your work, please consider citing:
@article{wander2024cattsunami,
title={CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks},
author={Wander, Brook and Shuaibi, Muhammed and Kitchin, John R and Ulissi, Zachary W and Zitnick, C Lawrence},
journal={arXiv preprint arXiv:2405.02078},
year={2024}
}
File Structure and Contents
The tar file contains 3 subdirectories: dissociations, desorptions, and transfers. As the names imply, these directories contain the converged DFT trajectories for each of the reaction classes. Within these directories, the trajectories are named to identify the contents of the file. Here is an example and the anatomy of the name:
desorption_id_83_2409_9_111-4_neb1.0.traj
- desorption indicates the reaction type (dissociation and transfer are the other possibilities)
- id identifies that the material belongs to the in-domain validation split (ood, out of domain, is the other possibility)
- 83 is the task id. This does not provide relevant information
- 2409 is the bulk index of the bulk used in the ocdata bulk pickle file
- 9 is the reaction index. For each reaction type there is a reaction pickle file in the repository. In this case it is the 9th entry in that pickle file
- 111-4: the first 3 numbers are the Miller indices (i.e. the (1,1,1) surface), and the last number corresponds to the shift value. In this case the 4th shift enumerated was the one used
- neb1.0: the number here indicates the k value used. For the full dataset, 1.0 was used, so this does not distinguish any of the trajectories from one another
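If it is helpful, a file name can also be split into its parts programmatically. Below is a minimal parsing sketch; the helper parse_neb_filename and its field names are illustrative (not part of the package) and assume the simple field layout shown in the example above:
def parse_neb_filename(name: str) -> dict:
    """Split a CatTSunami trajectory filename into its labeled parts (illustrative helper)."""
    stem = name.removesuffix(".traj")
    reaction_type, split, task_id, bulk_idx, rxn_idx, surface, neb_k = stem.split("_")
    miller, shift_idx = surface.split("-")
    return {
        "reaction_type": reaction_type,  # desorption, dissociation, or transfer
        "split": split,                  # id (in domain) or ood (out of domain)
        "task_id": int(task_id),
        "bulk_index": int(bulk_idx),     # index into the ocdata bulk pickle file
        "reaction_index": int(rxn_idx),  # index into the reaction pickle file for this reaction type
        "miller_indices": tuple(int(c) for c in miller),  # e.g. (1, 1, 1)
        "shift_index": int(shift_idx),
        "k": float(neb_k.removeprefix("neb")),  # spring constant; 1.0 for the full dataset
    }

print(parse_neb_filename("desorption_id_83_2409_9_111-4_neb1.0.traj"))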
Each trajectory file contains the repeating frame sets. Although the initial and final frames are not optimized during the NEB, they are saved for every iteration of the trajectory. For this dataset, 10 frames were used per band, 8 of which were optimized during the NEB, so the length of the trajectory is the number of iterations (N) times 10. If you wanted to look at the frame set prior to optimization and the optimized frame set, you could get them like this:
from ase.io import read

# Read every frame of the trajectory (N iterations x 10 frames per iteration)
traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
unrelaxed_frames = traj[0:10]  # frame set prior to optimization
relaxed_frames = traj[-10:]    # frame set from the final iteration
Use
One more note: we have not prepared an LMDB for this dataset because NEB calculations are not directly supported in ocp. You must use the ASE-native OCP class along with the ASE infrastructure to run NEB calculations. Here is an example of its use:
from ase.io import read
from ase.optimize import BFGS
from ocpneb.core.ocpneb import OCPNEB

# Use the 10 frames of the first (unrelaxed) iteration as the starting band
traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
neb_frames = traj[0:10]

neb = OCPNEB(
    neb_frames,
    checkpoint_path=YOUR_CHECKPOINT_PATH,  # path to a pre-trained fairchem checkpoint
    k=1.0,                                 # NEB spring constant (1.0 was used for the dataset)
    batch_size=8,
)
optimizer = BFGS(
    neb,
    trajectory="test_neb.traj",
)
# Loosely converge the band first, then turn on the climbing image and tighten fmax
conv = optimizer.run(fmax=0.45, steps=200)
if conv:
    neb.climb = True
    conv = optimizer.run(fmax=0.05, steps=300)
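Once the band is converged, the predicted barrier and reaction energy can be read off with ASE's NEBTools. A minimal sketch, assuming the optimizer trajectory name from the example above and that the endpoint energies are stored in the trajectory (on older ASE versions the import is from ase.neb instead of ase.mep):
from ase.io import read
from ase.mep import NEBTools  # older ASE: from ase.neb import NEBTools

# The last 10 frames of the optimizer trajectory are the final band
images = read("test_neb.traj", ":")[-10:]
barrier, delta_e = NEBTools(images).get_barrier(fit=False)
print(f"Ea = {barrier:.3f} eV, dE = {delta_e:.3f} eV")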
File details
Details for the file fairchem_applications_cattsunami-0.2.0.tar.gz.
File metadata
- Download URL: fairchem_applications_cattsunami-0.2.0.tar.gz
- Upload date:
- Size: 927.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 293531fc19a7d89986aec5466bf4b7fd47a4880fe7ee91bd663a44d15481cf73
MD5 | 454f1a48d363a2ac371c37d51b47638b
BLAKE2b-256 | 7c3c4dc5fc74c599ecb493741bccf24565bacb8b80a0f9c1b8785f75b956ecae
Provenance
The following attestation bundles were made for fairchem_applications_cattsunami-0.2.0.tar.gz:
Publisher: release.yml on FAIR-Chem/fairchem
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fairchem_applications_cattsunami-0.2.0.tar.gz
- Subject digest: 293531fc19a7d89986aec5466bf4b7fd47a4880fe7ee91bd663a44d15481cf73
- Sigstore transparency entry: 148228884
- Sigstore integration time:
File details
Details for the file fairchem_applications_cattsunami-0.2.0-py2.py3-none-any.whl.
File metadata
- Download URL: fairchem_applications_cattsunami-0.2.0-py2.py3-none-any.whl
- Upload date:
- Size: 930.6 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | d1d95b51b9c8ba656a964e9b4d12f4e5857ce4e089715c81ac71d892985915c8
MD5 | 22fc94697d9ef241e92f23d14c83ae49
BLAKE2b-256 | 2853c7d943e4b02ef3b3e3353d06906dd25d6c8e815d696836d4186519f25704
Provenance
The following attestation bundles were made for fairchem_applications_cattsunami-0.2.0-py2.py3-none-any.whl:
Publisher: release.yml on FAIR-Chem/fairchem
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fairchem_applications_cattsunami-0.2.0-py2.py3-none-any.whl
- Subject digest: d1d95b51b9c8ba656a964e9b4d12f4e5857ce4e089715c81ac71d892985915c8
- Sigstore transparency entry: 148228885
- Sigstore integration time: