GPR

These details have not been verified by PyPI

Project description

On-the-Fly Atomistic Calculator

This is an on-the-fly atomistic calculator based on Gaussian Process Regression (GPR), designed as an add-on calculator for the Atomic Simulation Environment (ASE). It employs a hybrid approach, combining:

Base calculator: A high-fidelity ab initio calculator (e.g., DFT) that serves as the reference ("gold standard") for accurate but computationally expensive calculations.
Surrogate GPR calculator: A computationally inexpensive model trained on-the-fly to approximate the base calculator, significantly accelerating simulations.

Motivation

Many atomistic simulations, such as geometry optimizations, barrier calculations (e.g., NEB), molecular dynamics (MD), and equation-of-state (EOS) simulations, require sampling a large number of atomic configurations within a relatively small phase space. Density Functional Theory (DFT) is frequently used for these calculations, offering a balance between accuracy and computational cost. However, DFT calculations can still be prohibitively expensive, especially for large systems or long simulation times. The GPR calculator aims to alleviate this computational bottleneck. At the moment, we focus on the NEB simulation only.

How It Works?

The GPR calculator operates through an iterative learning and prediction process:

Initial Data Collection: The simulation begins by using the base calculator (e.g., DFT) to calculate energies and forces for a small set of initial atomic configurations. These data serve as the initial training set for the GPR model.
Surrogate Model Training: A Gaussian Process Regression (GPR) model is trained using the data obtained from the base calculator. This model learns to predict the energy and forces of new atomic configurations based on the existing training data.
On-the-Fly Prediction and Uncertainty Quantification: During the simulation, for each new atomic configuration, the GPR model predicts the energy and forces, along with an estimate of the uncertainty in these predictions.
Choice of Calculator: The GPR calculator adaptively decides whether to use the GPR model's prediction or to invoke the base calculator based on the predicted uncertainty.
- If the uncertainty is below a user-defined threshold, the GPR model's prediction is used, saving computational cost.
- If the uncertainty is above the threshold, the base calculator is invoked to obtain a more accurate calculation. The new data point is then added to the training set, and the GPR model is retrained.
Iterative Refinement: Steps 3 and 4 are repeated throughout the simulation. As the GPR model accumulates more data, its accuracy improves, leading to a greater reliance on the surrogate model and a reduction in the number of calls to the base calculator.

This adaptive strategy allows the GPR calculator to achieve a balance between accuracy and computational efficiency, by using the computationally expensive base calculator only when necessary.

Installation

pip install .

A quick example

Below illustrates an example to run NEB calculation with the hybrid calculator

from ase.calculators.emt import EMT
from gpr_calc.gaussianprocess import GP
from gpr_calc.calculator import GPR
from gpr_calc.NEB import neb_calc, init_images, neb_plot_path

# Set parameters
init = 'examples/database/initial.traj'
final = 'examples/database/final.traj'
num_images = 5
fmax = 0.05

# Run NEB with EMT calculator
images = init_images(init, final, num_images)
images, engs, steps = neb_calc(images, EMT(), fmax=fmax)
data = [(images, engs, f'EMT ({steps*(len(images)-2)+2})')]

# Run NEB with gpr calculators
for etol in [0.02, 0.1, 0.2]:
    images = init_images(init, final, num_images)

    # initialize GPR model
    gp_model = GP.set_GPR(images, EMT(),
                          noise_e=etol/len(images[0]),
                          noise_f=0.1)
    # Set GPR calculator
    calc = GPR(base=EMT(), ff=gp_model, save=False)

    # Run NEB calculation
    images, engs, _ = neb_calc(images, calc, fmax=fmax)
    N_calls = gp_model.count_use_base
    data.append((images, engs, f'GPR-{etol:.2f} ({N_calls})'))
    print(gp_model, '\n\n')

neb_plot_path(data, figname='NEB-test.png')

This example demonstrates NEB calculations using:

A pure EMT calculator (for rapid testing).
A GPR calculator with EMT as the base and an uncertainty threshold of 0.02 eV per structure.
A GPR calculator with EMT as the base and an uncertainty threshold of 0.10 eV per structure.
A GPR calculator with EMT as the base and an uncertainty threshold of 0.20 eV per structure.

The output below illustrates the NEB calculation process. Initially, the base calculator (EMT) is used to generate reference data. These data points are then added to the GPR model. As the model learns and becomes more accurate, predictions from the surrogate model are used more frequently, reducing the computational cost.

Calculate E/F for image 0: 3.314754
Calculate E/F for image 1: 3.727147
Calculate E/F for image 2: 4.219952
Calculate E/F for image 3: 3.724974
Calculate E/F for image 4: 3.316117
------Gaussian Process Regression (0/2)------
Kernel: 1.00000**2 *RBF(length=0.10000) 5 energy (0.00769) 15 forces (0.10000)

Update GP model => 20
Loss:       -2.916  1.000  0.100
Loss:     1821.480  0.010 10.000
Loss:       48.811  0.527  4.827
Loss:      -51.328  0.835  1.750
Loss:      -52.140  0.898  1.717
Loss:      -53.035  1.025  1.634
Loss:      -53.163  1.078  1.589
From Surrogate ,  E: 0.054/0.100/3.729, F: 0.102/0.120/1.660
From Surrogate ,  E: 0.066/0.100/4.215, F: 0.155/0.120/3.489
From Surrogate ,  E: 0.054/0.100/3.725, F: 0.102/0.120/1.651
      Step     Time          Energy          fmax
BFGS:    0 20:02:22        4.214882        3.489110
From Surrogate ,  E: 0.054/0.100/3.647, F: 0.103/0.120/1.284
From Surrogate ,  E: 0.066/0.100/3.918, F: 0.154/0.120/2.609
From Surrogate ,  E: 0.053/0.100/3.644, F: 0.103/0.120/1.278
BFGS:    1 20:02:24        3.917948        2.608887
From Base model , E: 0.053/3.500/3.546, F: 0.182/0.350/0.400
From Base model , E: 0.100/3.512/3.738, F: 0.315/0.423/0.434
From Base model , E: 0.053/3.499/3.545, F: 0.183/0.349/0.399
BFGS:    2 20:02:29        3.737970        0.488517
...
...
...
From Surrogate ,  E: 0.095/0.200/3.694, F: 0.062/0.120/0.039
From Surrogate ,  E: 0.081/0.200/3.526, F: 0.097/0.120/0.391
BFGS:   41 20:08:35        3.693717        0.039118
From Surrogate ,  E: 0.082/0.200/3.344, F: 0.069/0.120/0.041
From Surrogate ,  E: 0.082/0.200/3.346, F: 0.067/0.120/0.048
------Gaussian Process Regression (0/2)------
Kernel: 2.80314**2 *RBF(length=1.52921) 7 energy (0.01538) 55 forces (0.10000)
Total base/surrogate/gpt_fit calls: 22/106/4

The trained GPR model is saved as:

A json file containing the model parameters.
A db file storing the training structures, energies, and forces.

The following figure illustrates a typical NEB path calculated using the GPR calculator:

NEB Path

The results demonstrate that the GPR calculator, even with a limited number of base calculator calls, can achieve reasonable accuracy compared to the reference calculation. A smaller energy uncertainty threshold (etol) generally leads to more accurate results. In scenarios where each base calculator call is computationally expensive, the GPR calculator offers a significant speedup without sacrificing accuracy. This approach is particularly beneficial for handling computationally demanding tasks such as large-scale NEB calculations for surface diffusion and reaction studies.

In addition, the trained model can be reused as follows if you want to restart the previously unconverged calculation.

from gpr_calc.gaussianprocess import GPR

gpr = GPR.load('test-RBF-gpr.json')
print(gpr)
gpr.validate_data(show=True)
gpr.fit(opt)

For more productive examples using VASP as the base calculator, please check the Examples.

Project details

These details have not been verified by PyPI

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history Release notifications | RSS feed

0.0.7

Apr 9, 2025

0.0.6

Apr 9, 2025

This version

0.0.4

Apr 9, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gpr_calc-0.0.4.tar.gz (53.4 kB view details)

Uploaded Apr 9, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

gpr_calc-0.0.4-py3-none-any.whl (56.2 kB view details)

Uploaded Apr 9, 2025 Python 3

File details

Details for the file gpr_calc-0.0.4.tar.gz.

File metadata

Download URL: gpr_calc-0.0.4.tar.gz
Upload date: Apr 9, 2025
Size: 53.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.7

File hashes

Hashes for gpr_calc-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`561f9faaf40fe20f6f9bb3cb03052feb2386f1743de1be4552d6e1108849f82d`
MD5	`d2785e430b0cca3b78fdab811cc296e6`
BLAKE2b-256	`3b495ad2dcb521bdebaccc4de1f8e21f361f748daa3fb605db1b25860970f1f1`

See more details on using hashes here.

File details

Details for the file gpr_calc-0.0.4-py3-none-any.whl.

File metadata

Download URL: gpr_calc-0.0.4-py3-none-any.whl
Upload date: Apr 9, 2025
Size: 56.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.7

File hashes

Hashes for gpr_calc-0.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6c4eca7adabacca80220efe203978311d398c8925a0834d8875cbe40fe8fc5c2`
MD5	`e2fb4725bc4e3796b958d20efa6f3146`
BLAKE2b-256	`1ef0a6d1412a7f5a22aa31fb9503ae3663bd96b2f4164d535be994cb3feeff0f`

See more details on using hashes here.

gpr-calc 0.0.4

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

On-the-Fly Atomistic Calculator

Motivation

How It Works?

Installation

A quick example

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes