Project description

CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks

Summary

CatTSunami is a framework for high-throughput enumeration of nudged elastic band (NEB) frame sets. It was built for use with machine-learned (ML) models trained on OC20, which have been demonstrated to perform well on this auxiliary task. To train your own model or obtain pre-trained checkpoints, please see fairchem-core.

This repository contains the validation dataset, the enumeration framework, and accompanying code to run ML-accelerated NEBs and validate new models. For more information, please read the manuscript.

Getting started

Configured for use:

  1. Install fairchem-core and fairchem-data-oc (instructions)
  2. Install this package: pip install fairchem-applications-cattsunami
  3. Check out the tutorial notebook

Configured for local development:

  1. Clone the fairchem repo
  2. Install fairchem-data-oc and fairchem-core (instructions)
  3. Install this repository: pip install -e packages/fairchem-applications-cattsunami
  4. Check out the tutorial notebook

Validation Dataset

The validation dataset comprises 932 converged DFT NEB calculations and is intended to assess model performance on this important task. Three reaction classes are considered: desorptions, dissociations, and transfers. In total, 2827 DFT NEBs were performed, including those that failed to converge. Unconverged systems are also included in ASE All Trajectories below. For more information about the converged dataset, see the dataset markdown file.

Splits | Size of compressed version | Size of uncompressed version | MD5 checksum (download link)
--- | --- | --- | ---
ASE Converged Trajectories | 1.5 GB | 6.3 GB | 52af34a93758c82fae951e52af445089
ASE All Trajectories | 6.7 GB | 30 GB | f5829eeaf7219c5cd3cfb499b8d951da
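
The MD5 checksums above can be used to confirm that a download completed intact. A minimal sketch, assuming the archive was saved locally as ase_converged_trajectories.tar (an illustrative filename, not the actual download name):

import hashlib

def md5sum(path, chunk_size=1 << 20):
    # Stream the file in chunks so large archives are not loaded into memory
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Expected value taken from the table above
assert md5sum("ase_converged_trajectories.tar") == "52af34a93758c82fae951e52af445089"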

Citing this work

If you use this codebase in your work, please consider citing:

@article{wander2024cattsunami,
  title={CatTSunami: Accelerating Transition State Energy Calculations with Pre-trained Graph Neural Networks},
  author={Wander, Brook and Shuaibi, Muhammed and Kitchin, John R and Ulissi, Zachary W and Zitnick, C Lawrence},
  journal={arXiv preprint arXiv:2405.02078},
  year={2024}
}

File Structure and Contents

The tar file contains 3 subdirectories: dissociations, desorptions, and transfers. As the names imply, these directories contain the converged DFT trajectories for each reaction class. Within these directories, the trajectories are named to identify their contents. Here is an example and the anatomy of the name (a short parsing sketch follows the list):

desorption_id_83_2409_9_111-4_neb1.0.traj

  1. desorption indicates the reaction type (dissociation and transfer are the other possibilities)
  2. id identifies that the material belongs to the in-domain validation split (ood, out of domain, is the other possibility)
  3. 83 is the task id. This does not provide relevant information
  4. 2409 is the index of the bulk in the ocdata bulk pickle file
  5. 9 is the reaction index. For each reaction type there is a reaction pickle file in the repository; in this case, the reaction is the 9th entry in that pickle file
  6. 111-4: the first 3 numbers are the Miller indices (i.e. the (1,1,1) surface), and the last number corresponds to the shift value. In this case, the 4th enumerated shift was used
  7. neb1.0: the number indicates the spring constant (k) value used. For the full dataset, 1.0 was used, so this does not distinguish any of the trajectories from one another
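
The naming scheme above can also be unpacked programmatically. The helper below is a hypothetical sketch that mirrors the anatomy described in the list; it is not part of the CatTSunami package, and it assumes single-digit, non-negative Miller indices as in the example:

def parse_neb_filename(name):
    # Strip the extension and split on underscores, per the anatomy above
    stem = name.removesuffix(".traj")
    reaction, split, task_id, bulk_idx, rxn_idx, miller_shift, neb = stem.split("_")
    miller, shift = miller_shift.split("-")
    return {
        "reaction_type": reaction,            # desorption / dissociation / transfer
        "split": split,                       # id (in domain) or ood (out of domain)
        "task_id": int(task_id),              # carries no relevant information
        "bulk_index": int(bulk_idx),          # index into the ocdata bulk pickle file
        "reaction_index": int(rxn_idx),       # entry in the reaction pickle file
        "miller_indices": tuple(int(c) for c in miller),  # e.g. (1, 1, 1)
        "shift_index": int(shift),            # which enumerated shift was used
        "k": float(neb.removeprefix("neb")),  # NEB spring constant (always 1.0 here)
    }

parse_neb_filename("desorption_id_83_2409_9_111-4_neb1.0.traj")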

These trajectory files contain repeating frame sets. Although the initial and final frames are not optimized during the NEB, they are saved for every iteration in the trajectory. For this dataset, 10 frames were used, 8 of which were optimized during the NEB, so the length of the trajectory is the number of iterations (N) * 10. To look at the frame set prior to optimization and the optimized frame set, you could do the following:

from ase.io import read

# ":" reads every frame written over the course of the NEB optimization
traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
unrelaxed_frames = traj[0:10]  # frame set prior to optimization
relaxed_frames = traj[-10:]    # frame set after the final iteration
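
Because every iteration writes the full set of 10 frames, intermediate frame sets can be sliced the same way. A small sketch continuing the snippet above (the choice of iteration is arbitrary):

n_iterations = len(traj) // 10  # each iteration appends all 10 frames
i = n_iterations // 2           # any 0-indexed iteration of interest
frames_at_i = traj[10 * i : 10 * (i + 1)]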

Use

One more note: we have not prepared an LMDB for this dataset, because NEB calculations are not supported directly in ocp. You must use the ASE-native OCP class along with ASE infrastructure to run NEB calculations. Here is an example:

from ase.io import read
from ase.mep import DyNEB
from ase.optimize import BFGS
from fairchem.core import pretrained_mlip, FAIRChemCalculator

# Use the unrelaxed frame set as the starting band
traj = read("desorption_id_83_2409_9_111-4_neb1.0.traj", ":")
images = traj[0:10]

# Load a pre-trained model and attach a calculator to every image
predictor = pretrained_mlip.get_predict_unit("uma-s-1")
neb = DyNEB(images, k=1)
for image in images:
    image.calc = FAIRChemCalculator(predictor, task_name="oc20")

optimizer = BFGS(neb, trajectory="test_neb.traj")

# Converge loosely first, then switch on the climbing image and tighten fmax
conv = optimizer.run(fmax=0.45, steps=200)
if conv:
    neb.climb = True
    conv = optimizer.run(fmax=0.05, steps=300)
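
If the climbing-image run converges, the transition state energy can be read directly off the band as the highest-energy image. A minimal sketch, not part of the CatTSunami API, assuming the calculators are still attached to the images:

if conv:
    # The climbing image relaxes toward the saddle point, so the transition
    # state is the highest-energy image along the converged band
    energies = [image.get_potential_energy() for image in images]
    e_ts = max(energies)
    barrier = e_ts - energies[0]  # barrier measured from the initial frame
    print(f"E_TS = {e_ts:.3f} eV, barrier = {barrier:.3f} eV")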

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fairchem_applications_cattsunami-1.1.0.tar.gz (920.7 kB)

Uploaded: Source

Built Distribution

fairchem_applications_cattsunami-1.1.0-py2.py3-none-any.whl (922.5 kB)

Uploaded: Python 2, Python 3

File details

Details for the file fairchem_applications_cattsunami-1.1.0.tar.gz.

File metadata

File hashes

Hashes for fairchem_applications_cattsunami-1.1.0.tar.gz
Algorithm | Hash digest
--- | ---
SHA256 | d09f21db9580ddca64faf472e4f65e6f3889f21f156b655f0c1a699bc654687e
MD5 | f16ac15eef033c9b9a236d0e234a9722
BLAKE2b-256 | 4d290d6e34a39b7ae3dabc6ccaf16a07f65994a1a1b23b211ff0da32968d8662

See more details on using hashes here.

Provenance

The following attestation bundles were made for fairchem_applications_cattsunami-1.1.0.tar.gz:

Publisher: release.yml on facebookresearch/fairchem

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file fairchem_applications_cattsunami-1.1.0-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for fairchem_applications_cattsunami-1.1.0-py2.py3-none-any.whl
Algorithm | Hash digest
--- | ---
SHA256 | 527c60633107994f99cfcfe495e0ee4551559945fec311ab0c4db80d0b0de6fc
MD5 | 650aa435f108dfee612cbc4ff807db41
BLAKE2b-256 | 3b31fc92e805facb4cfa21d2c2abaf731291c8f1d60d0848c1c6c977c00a7a24

See more details on using hashes here.

Provenance

The following attestation bundles were made for fairchem_applications_cattsunami-1.1.0-py2.py3-none-any.whl:

Publisher: release.yml on facebookresearch/fairchem

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
