
mbGDML

Create, use, and analyze machine learning potentials within the many-body expansion framework.


Motivation · Approach · Features · Installation · License

Motivation

Machine learning potentials (i.e., force fields) often rely on local descriptors for size transferability. These descriptors partition total properties into atomic contributions; however, they inherently neglect long-range interactions by enforcing atomic radial cutoffs. Global descriptors encode the entire structure with no cutoffs and can capture interactions at all scales. However, they are restricted to systems with the same number of atoms as their training data.

Gradient-domain machine learning (GDML) is one example of an ML potential with a global descriptor. GDML is unique in that it trains directly on forces and recovers the total energy through analytical integration. Forces provide substantially more information about the potential energy surface (PES) and allow for better interpolation between training data. As a result, GDML typically needs only around 1,000 structures to accurately learn energies and forces.
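
A compact way to state this (notation introduced here for clarity, not taken from the package) is that the force model is constructed as the negative gradient of a single latent energy model, so the energy is recovered by integration up to an additive constant:

\hat{\mathbf{F}}(\mathbf{x}) = -\nabla \hat{E}(\mathbf{x}),
\qquad
\hat{E}(\mathbf{x}) = -\int \hat{\mathbf{F}}(\mathbf{x}) \cdot \mathrm{d}\mathbf{x} + c.

Training on forces (3N values per structure instead of a single energy) is what supplies the extra information about the PES mentioned above.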

To date, GDML has been limited to the exact system it was trained on. This makes simulations of arbitrarily sized systems, such as solvents, impractical.

Approach

Many-body expansions (MBEs) rigorously decompose total (i.e., supersystem) energies into fundamental n-body interactions. The expansion is formally exact when all N-body interactions are accounted for; in practice, however, it is typically truncated at third order. One can then model any system by summing up its 1-, 2-, and 3-body contributions.
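
For a supersystem composed of monomers indexed by i, j, k, the expansion can be written (standard MBE notation, introduced here for clarity) as

E = \sum_i E_i + \sum_{i<j} \Delta E_{ij} + \sum_{i<j<k} \Delta E_{ijk} + \cdots,
\qquad
\Delta E_{ij} = E_{ij} - E_i - E_j,

where \Delta E_{ijk} is defined analogously by subtracting all 1- and 2-body terms from the trimer energy E_{ijk}. Truncating after the third sum gives the 1-, 2-, and 3-body model described above.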

Driving MBEs with GDML potentials trained on n-body interactions is therefore a promising route to size-transferable potentials. Furthermore, GDML's remarkable data efficiency enables training on data from highly accurate quantum chemical methods.
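
As an illustration of the workflow (a conceptual sketch, not the mbGDML API), a third-order MBE prediction simply sums model outputs over all monomers, dimers, and trimers. Here predict_1b, predict_2b, and predict_3b are hypothetical callables that return total cluster energies, so lower-order terms are subtracted explicitly; models trained directly on n-body interaction energies would skip that subtraction.

from itertools import combinations

def mbe_energy(fragments, predict_1b, predict_2b, predict_3b):
    """Truncated (third-order) many-body expansion of the total energy.

    fragments: list of monomer structures (whatever the predictors accept).
    predict_*b: hypothetical callables returning total cluster energies.
    """
    n = len(fragments)
    e1 = {i: predict_1b(fragments[i]) for i in range(n)}
    e2 = {}
    for i, j in combinations(range(n), 2):
        # 2-body interaction: dimer energy minus its monomers
        e2[(i, j)] = predict_2b(fragments[i], fragments[j]) - e1[i] - e1[j]
    e_total = sum(e1.values()) + sum(e2.values())
    for i, j, k in combinations(range(n), 3):
        # 3-body interaction: trimer energy minus all lower-order terms
        e_total += (predict_3b(fragments[i], fragments[j], fragments[k])
                    - e2[(i, j)] - e2[(i, k)] - e2[(j, k)]
                    - e1[i] - e1[j] - e1[k])
    return e_total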

Features

Train

  • Train GDML models using grid searches, Bayesian optimization, or both, on CPUs.
  • Custom loss functions (a sketch of one such loss follows this list).
  • Iterative training procedure for automated curation of optimal training sets.
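
For instance, a custom loss might weight force errors more heavily than energy errors. The function below is only a hedged sketch of such a loss; the exact callable signature mbGDML expects is not reproduced here.

import numpy as np

def weighted_force_energy_loss(force_errors, energy_errors, rho=0.99):
    """Hypothetical loss: rho-weighted mean squared force error plus
    (1 - rho)-weighted mean squared energy error."""
    f_mse = np.mean(np.square(force_errors))
    e_mse = np.mean(np.square(energy_errors))
    return rho * f_mse + (1.0 - rho) * e_mse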

Predict

  • Many-body predictions with GDML, SchNet, and GAP potentials.
  • Parallel GDML predictions with Ray, from a laptop to multiple nodes (see the sketch after this list).
  • Periodic structures with the minimum-image convention.
  • Alchemical predictions by tuning out 2- or 3-body contributions of specific entities.
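
The Ray-based parallelism follows Ray's usual task pattern of scattering structures across workers, whether those workers live on one laptop or on several nodes. In the sketch below, predict_structure is a hypothetical placeholder for an mbGDML prediction call, not the package's actual interface.

import numpy as np
import ray

ray.init()  # on a cluster, ray.init(address="auto") joins existing nodes

@ray.remote
def predict_structure(R):
    # Placeholder worker; a real task would evaluate the many-body models
    # for one structure and return its energy and forces.
    return {"E": 0.0, "F": np.zeros_like(R)}

coordinates = [np.random.rand(9, 3) for _ in range(100)]  # 100 example structures
futures = [predict_structure.remote(R) for R in coordinates]
results = ray.get(futures)  # blocks until every worker finishes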

Analysis

  • Prediction sets that store decomposed predictions for further analysis.
  • Radial distribution functions.
  • Cluster and identify problematic (i.e., high-error) structures using scikit-learn (a rough sketch follows this list).
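
A rough sketch of the clustering idea, assuming you already have one descriptor vector and one prediction error per structure (the arrays here are random placeholders, not mbGDML objects): group structures by descriptor and inspect which clusters carry the largest errors.

import numpy as np
from sklearn.cluster import KMeans

descriptors = np.random.rand(500, 12)   # placeholder structural descriptors
abs_errors = np.random.rand(500)        # placeholder |prediction - reference|

labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(descriptors)
for k in range(5):
    mask = labels == k
    print(f"cluster {k}: {mask.sum():3d} structures, "
          f"mean abs. error = {abs_errors[mask].mean():.3f}")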

Interfaces

Installation

You can install mbGDML from PyPI with pip install mbGDML. Alternatively, the latest development version can be installed directly from the GitHub repository or from TestPyPI:

git clone https://github.com/keithgroup/mbGDML
cd mbGDML
pip install .
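
To try a pre-release from TestPyPI instead, the standard pip index override should work (dependencies still resolve from the main index):

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ mbGDML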

Citing this work

If you find this code helpful in your research or project, please consider citing the following paper:

Maldonado, A. M.; Poltavsky, I.; Vassilev-Galindo, V.; Tkatchenko, A.; Keith, J. A. Modeling molecular ensembles with gradient-domain machine learning force fields. Digital Discovery 2023, 2 (3), 871-880. DOI: 10.1039/D3DD00011G.

@article{maldonado2023modeling,
  title={Modeling molecular ensembles with gradient-domain machine learning force fields},
  author={Maldonado, Alex M and Poltavsky, Igor and Vassilev-Galindo, Valentin and Tkatchenko, Alexandre and Keith, John A},
  journal={Digital Discovery},
  volume={2},
  number={3},
  pages={871--880},
  year={2023},
  publisher={Royal Society of Chemistry},
  doi={10.1039/D3DD00011G}
}

Citing the paper helps acknowledge the effort put into developing and maintaining this codebase, and it provides a way to support further research and development. Thank you for your support!

License

Distributed under the MIT License. See LICENSE for more information.

Download files


Source Distribution

mbGDML-0.1.1.tar.gz (143.4 kB, source)

Built Distribution

mbGDML-0.1.1-py3-none-any.whl (157.3 kB, Python 3)

File details

Details for the file mbGDML-0.1.1.tar.gz.

File metadata

  • Download URL: mbGDML-0.1.1.tar.gz
  • Upload date:
  • Size: 143.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.17

File hashes

Hashes for mbGDML-0.1.1.tar.gz

  • SHA256: d496d938c47582297c30481ff3409b0e08ea9ff4bdb6a38171575d82519b0645
  • MD5: d67973af719cb7cd3b865e1a5aec1722
  • BLAKE2b-256: 3ca7c78c3725e01bb5b05c46dd2494e6dbdb6aed59d01dca3fcc995ea7b31bd1


File details

Details for the file mbGDML-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: mbGDML-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 157.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.17

File hashes

Hashes for mbGDML-0.1.1-py3-none-any.whl

  • SHA256: 5f17e673ad49d40315a0a43fd3fcac1e259dbef6624fd4d29b2b2e8f44444092
  • MD5: 06d64bce5a190b82b55629250bfd7718
  • BLAKE2b-256: 15a2d774c4cf2ef0f7a8ec7f07013fedb3bdd254c489b69b60037f1c0fc58899

