Tools for working with MatPES.
Introduction
Welcome to MatPES, a foundational potential energy surface (PES) dataset for materials. This is a collaboration between the Materials Virtual Lab and the Materials Project.
Machine learning interatomic potentials (MLIPs) have revolutionized the field of computational materials science. MLIPs use ML to reproduce the PES (energies, forces, and stresses) of a collection of atoms, typically computed using an ab initio method such as density functional theory (DFT). This enables the simulation of materials at much larger length and longer time scales at near-ab initio accuracy.
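To make the relationship between energies and forces on a PES concrete, here is a minimal sketch using a Lennard-Jones dimer as a toy PES. It is not part of MatPES or any MLIP; the function names and the parameters `epsilon` and `sigma` are illustrative assumptions. It shows that the force is the negative derivative of the energy with respect to atomic position, which is the consistency an MLIP must reproduce when trained on DFT energies and forces.

```python
# Toy PES: Lennard-Jones energy of a dimer as a function of separation r.
# epsilon and sigma are illustrative values, not fitted to any material.

def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Potential energy of two atoms separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_force(r, epsilon=1.0, sigma=1.0):
    """Analytic force along the bond, F = -dE/dr."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


def numerical_force(energy_fn, r, h=1e-6):
    """Central finite-difference estimate of -dE/dr."""
    return -(energy_fn(r + h) - energy_fn(r - h)) / (2.0 * h)


r_min = 2.0 ** (1.0 / 6.0)  # separation at the energy minimum (sigma = 1)
for r in (0.95, r_min, 1.5):
    print(
        f"r={r:.3f}  E={lj_energy(r):+.4f}  "
        f"F_analytic={lj_force(r):+.4f}  "
        f"F_numeric={numerical_force(lj_energy, r):+.4f}"
    )
```

At the energy minimum the force vanishes, while compressed and stretched geometries give repulsive and attractive forces respectively.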
One of the most exciting developments in the past few years is the emergence of universal MLIPs (uMLIPs, also known as materials foundation models), with near-complete coverage of the periodic table of elements. Examples include M3GNet,[^1] CHGNet,[^2] and MACE,[^3] to name a few. uMLIPs have broad applications, including materials discovery and the prediction of PES-derived properties such as elastic constants and phonon dispersions.
However, most current uMLIPs were trained on DFT relaxation calculations from the Materials Project.
This dataset, referred to as MPF or MPTraj in the literature, suffers from several issues:
- The energies, forces, and stresses are not converged to the accuracies necessary to train a high quality MLIP.
- Most of the structures are near-equilibrium, with very little coverage of non-equilibrium local environments.
- The calculations were performed using the common Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation (GGA) functional, even though improved functionals with better performance across diverse chemistries and bonding, such as the strongly constrained and appropriately normed (SCAN) meta-GGA functional, already exist.
MatPES is a continuing effort to address these limitations comprehensively.
Goals
The aims of MatPES are three-fold:
- Accuracy. The data in MatPES was computed using static DFT calculations with stringent convergence criteria. Please refer to the MatPESStaticSet in pymatgen for details.
- Diversity. The structures in MatPES are robustly sampled from 300 K MD simulations using the original M3GNet uMLIP.[^1] MatPES uses a modified version of DImensionality-Reduced Encoded Clusters with sTratified (DIRECT) sampling to ensure comprehensive coverage of structures and local environments.[^4]
- Quality. MatPES contains not only data computed using the PBE functional, but also the revised regularized SCAN (r2SCAN) meta-GGA functional. The r2SCAN functional recovers all 17 exact constraints presently known for meta-GGA functionals and has shown good transferable accuracy across diverse bonding and chemistries.
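The idea behind stratified sampling over clustered encodings can be illustrated with a toy sketch. This is not the DIRECT implementation used for MatPES: the 2D points stand in for dimensionality-reduced structure encodings, a uniform grid is a crude substitute for a real clustering step, and the function name and all data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(points, cell_size=1.0, per_cell=1, seed=0):
    """Pick up to `per_cell` points from every occupied grid cell.

    `points` is a list of (x, y) tuples standing in for low-dimensional
    structure encodings. Grid cells are a crude stand-in for clusters.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append(idx)

    selected = []
    for key in sorted(cells):
        members = cells[key]
        selected.extend(rng.sample(members, min(per_cell, len(members))))
    return sorted(selected)


# Hypothetical data: a dense blob of similar environments plus a few
# rare environments far away from it.
data_rng = random.Random(42)
dense = [(data_rng.uniform(0, 1), data_rng.uniform(0, 1)) for _ in range(1000)]
sparse = [(5.5, 5.5), (-3.2, 4.1), (7.9, -2.4)]
points = dense + sparse

picked = stratified_sample(points, cell_size=1.0)
print(len(picked), "of", len(points), "points kept")
print("rare points kept:", [i for i in picked if i >= len(dense)])
```

Uniform random sampling would mostly return points from the dense blob. Sampling per cell keeps every rare environment while thinning out the redundant ones, which is the kind of coverage that stratified sampling aims for.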
How to use MatPES
Download
You can download the entire MatPES dataset at MatPES.ai. (Note: links are not functional until publication.)
We have also provided a simple tool to extract subsets of the data, e.g., by elements or chemical system, via the matpes package.
pip install matpes
Example code
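The original example is not reproduced here, so the following is only a sketch of what extracting a subset by chemical system or by elements means. It deliberately does not use the matpes API, which is not shown in this text. The record layout (a list of dicts with `id` and `formula` keys), the helper names, and the entries themselves are all hypothetical placeholders.

```python
import re


def elements_in(formula):
    """Return the set of element symbols appearing in a formula string."""
    return set(re.findall(r"[A-Z][a-z]?", formula))


def filter_by_chemsys(records, chemsys):
    """Keep records whose elements exactly match a system such as 'Fe-O'."""
    wanted = set(chemsys.split("-"))
    return [rec for rec in records if elements_in(rec["formula"]) == wanted]


def filter_by_elements(records, elements):
    """Keep records that contain all given elements (others are allowed)."""
    required = set(elements)
    return [rec for rec in records if required <= elements_in(rec["formula"])]


# Hypothetical placeholder records; a real dataset entry would also carry
# the structure, energy, forces, and stresses.
records = [
    {"id": "entry-0", "formula": "Fe2O3"},
    {"id": "entry-1", "formula": "LiFePO4"},
    {"id": "entry-2", "formula": "FeO"},
    {"id": "entry-3", "formula": "NaCl"},
]

print([r["formula"] for r in filter_by_chemsys(records, "Fe-O")])
print([r["formula"] for r in filter_by_elements(records, ["Fe", "O"])])
```

A chemical-system filter requires an exact match of the element set, whereas an element filter only requires that the listed elements be present.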
Exploring
The MatPES Explorer provides a statistical visualization of the dataset.
Pre-trained uMLIPs
Together with MatPES, we have released a set of MatPES uMLIPs in various architectures (M3GNet, CHGNet, TensorNet) in the MatGL package. This is by far the easiest way to get started with using MatPES.
Notebooks
We have provided a series of Jupyter notebooks demonstrating how to load the MatPES dataset, train a model, and perform fine-tuning.
Citing MatPES
If you use MatPES, please cite the following work:
Aaron Kaplan, Runze Liu, Ji Qi, Tsz Wai Ko, Bowen Deng, Gerbrand Ceder, Kristin A. Persson, Shyue Ping Ong. A foundational potential energy surface dataset for materials. Submitted.
[^1]: Chen, C.; Ong, S. P. A Universal Graph Deep Learning Interatomic Potential for the Periodic Table. Nat Comput Sci 2022, 2 (11), 718-728. DOI: 10.1038/s43588-022-00349-3.
[^2]: Deng, B.; Zhong, P.; Jun, K.; Riebesell, J.; Han, K.; Bartel, C. J.; Ceder, G. CHGNet as a Pretrained Universal Neural Network Potential for Charge-Informed Atomistic Modelling. Nat Mach Intell 2023, 5 (9), 1031-1041. DOI: 10.1038/s42256-023-00716-3.
[^3]: Batatia, I.; Kovacs, D. P.; Simm, G.; Ortner, C.; Csanyi, G. MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields. Advances in Neural Information Processing Systems 2022, 35, 11423-11436.
[^4]: Qi, J.; Ko, T. W.; Wood, B. C.; Pham, T. A.; Ong, S. P. Robust Training of Machine Learning Interatomic Potentials with Dimensionality Reduction and Stratified Sampling. npj Computational Materials 2024, 10 (43), 1-11. DOI: 10.1038/s41524-024-01227-4.