Python bindings and utilities for PACE and the fitting code "pacemaker"

pyace

pyace is the Python implementation of the Atomic Cluster Expansion (ACE). It provides the basic functionality for analysis, potential conversion, and fitting.

!!! THIS IS A LIMITED-FUNCTIONALITY VERSION OF pyace !!!

Please contact us by email (yury.lysogorskiy@rub.de) if you want the fully functional version.
Installation
```
pip install pyace-lite
```
(optional) Installation of tensorpotential

If you want to use the TensorFlow implementation of the atomic cluster expansion (made by Dr. Anton Bochkarev), contact us by email.
(!) Known issues

If you encounter segmentation fault errors, try upgrading the numpy package:

```
pip install --upgrade --force-reinstall numpy
```
Directory structure

- `lib/`: contains the extra libraries for pyace
- `src/pyace/`: Python bindings
Utilities
Potential conversion
There are two basic formats for ACE potentials:

- B-basis set in YAML format, e.g. `Al.pbe.yaml`. This is the complete internal format used by developers.
- Ctilde-basis set in plain-text format, e.g. `Al.pbe.ace`. This format is irreversibly converted from the B-basis set for public distribution of potentials and is used by LAMMPS.

To convert a potential you can use the following utility, which is installed together with the pyace package into your executable path:

- YAML to ace: `pace_yaml2ace`. Usage:

```
pace_yaml2ace [-h] [-o OUTPUT] input
```
Pacemaker
pacemaker is a utility for fitting atomic cluster expansion potentials. Usage:

```
pacemaker [-h] [-o OUTPUT] [-p POTENTIAL] [-ip INITIAL_POTENTIAL]
          [-b BACKEND] [-d DATA] [--query-data] [--prepare-data]
          [-l LOG]
          input

Fitting utility for atomic cluster expansion potentials

positional arguments:
  input                 input YAML file

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        output B-basis YAML file name, default:
                        output_potential.yaml
  -p POTENTIAL, --potential POTENTIAL
                        input potential YAML file name, will override input
                        file 'potential' section
  -ip INITIAL_POTENTIAL, --initial-potential INITIAL_POTENTIAL
                        initial potential YAML file name, will override input
                        file 'potential::initial_potential' section
  -b BACKEND, --backend BACKEND
                        backend evaluator, will override section
                        'backend::evaluator' from input file
  -d DATA, --data DATA  data file, will override section 'YAML:fit:filename'
                        from input file
  --query-data          query the training data from database, prepare and
                        save them
  --prepare-data        prepare and save training data only
  -l LOG, --log LOG     log filename (default: log.txt)
```
The required settings are provided by the input YAML file. The main sections are:

1. Cutoff and (optional) metadata

- The global cutoff for the fitting is set up as:

```yaml
cutoff: 10.0
```

- Metadata (optional)

These are arbitrary string key-value pairs that will be added to the potential YAML file:

```yaml
metadata:
  info: some info
  comment: some comment
  purpose: some purpose
```

In addition, `starttime` and `user` fields will be added automatically.
2. Dataset specification section

The fitting dataset can be queried automatically from structdb (if the corresponding structdborm package is installed and the connection to the database is configured; see the structdb.ini file in your home folder). Alternatively, the dataset can be saved into a file as a pickled pandas dataframe with special column names: #TODO: add columns names
Example:

```yaml
data: # dataset specification section
  # data configuration section
  config:
    element: Al                     # element name
    calculator: FHI-aims/PBE/tight  # calculator type from `structdb`
    # ref_energy: -1.234            # single-atom reference energy;
                                    # if not specified, it will be queried from the database
    # seed: 42                      # random seed for shuffling the data
    # query_limit: 1000             # limit the number of entries to query from `structdb`;
                                    # ignored if reading from cache
    # parallel: 3                   # number of parallel workers to preprocess the data, `pandarallel` package required;
                                    # if not specified, serial mode is used
    # cache_ref_df: True            # whether to store the queried or modified dataset into a file, default - True
  # filename: some.pckl.gzip        # force reading the reference pickled dataframe from the given file
  # ignore_weights: False           # whether to ignore energy and force weighting columns in the dataframe
  # datapath: ../data               # path to the folder with cache files of pickled dataframes
```
Alternatively, instead of the data::config section, one can specify just the cache file with a pickled dataframe as data::filename:

```yaml
data:
  filename: small_df_tf_atoms.pckl
  datapath: ../tests/
```
An example of creating a subselection of the fitting dataframe and saving it is given in notebooks/data_preprocess.ipynb. An example of generating custom energy/force weights is given in notebooks/data_custom_weights.ipynb.
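The cache files mentioned above are ordinary pickled pandas dataframes. As a minimal illustrative sketch, such a `.pckl.gzip` file can be written and read with pandas alone. Note that the column names used here (`energy`, `forces`) are placeholders: the exact columns required by pacemaker are not listed in this document.

```python
import pandas as pd

# Hypothetical toy dataset: the actual column names required by pacemaker
# are not specified here, so "energy" and "forces" are only placeholders.
df = pd.DataFrame({
    "energy": [-3.36, -6.71],                        # total energies (eV)
    "forces": [[[0.0, 0.0, 0.0]],                    # per-atom force vectors
               [[0.1, 0.0, 0.0], [-0.1, 0.0, 0.0]]],
})

# Write the dataframe as a gzip-compressed pickle, matching the
# .pckl.gzip naming convention used in the examples above
df.to_pickle("toy_dataset.pckl.gzip", compression="gzip")

# Reading the cached dataframe back
df2 = pd.read_pickle("toy_dataset.pckl.gzip", compression="gzip")
print(df2["energy"].tolist())
```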
Querying data

You can query and preprocess the data without running the potential fitting. Here is a minimal input YAML:

```yaml
# input.yaml file
cutoff: 10.0  # use a larger cutoff to have an excess neighbour list
data: # dataset specification section
  config:
    element: Al                     # element name
    calculator: FHI-aims/PBE/tight  # calculator type from `structdb`
    seed: 42
    parallel: 3                     # parallel data processing. WARNING! higher memory usage is possible
  datapath: ../data                 # path to the directory with cache files
  # query_limit: 100                # number of entries to query
```

Then execute `pacemaker --query-data input.yaml` to query and build the dataset with pyace neighbour lists. For building both pyace and tensorpot neighbour lists, use `pacemaker --query-data input.yaml -b tensorpot`.
Preparing the data / constructing neighbour lists

You can use an existing .pckl.gzip dataset and generate all necessary columns for it, including neighbour lists. Here is a minimal input YAML:

```yaml
# input.yaml file
cutoff: 10.
data:
  filename: my_dataset.pckl.gzip
backend:
  evaluator: tensorpot # pyace, tensorpot
```

Then execute `pacemaker --prepare-data input.yaml`.
Generation of my_dataset.pckl.gzip from, for example, pyiron is shown in notebooks/convert-pyiron-to-pacemaker.ipynb.
3. Interatomic potential (or B-basis) configuration

One can define the initial interatomic potential configuration as:

```yaml
potential:
  deltaSplineBins: 0.001
  element: Al
  fs_parameters: [1, 1, 1, 0.5]
  npot: FinnisSinclair
  NameOfCutoffFunction: cos
  rankmax: 3

  nradmax: [4, 3, 3]  # per-rank values of nradmax
  lmax: [0, 1, 1]     # per-rank values of lmax, lmax=0 for first rank always!

  ndensity: 2
  rcut: 8.7
  dcut: 0.01
  radparameters: [5.25]
  radbase: ChebExpCos

  ## hard-core repulsion (optional)
  # core-repulsion: [500, 10]
  # rho_core_cut: 50
  # drho_core_cut: 20

  # basisdf: /some/path/to/pyace_bbasisfunc_df.pckl  # path to a dataframe with a "white list" of basis functions to use in the fit
  # initial_potential: whatever.yaml                 # in the "ladder" fitting scheme, the potential from which to start the fit
```

If you want to continue fitting an existing potential in a potential.yaml file, then specify:

```yaml
potential: potential.yaml
```

Alternatively, one can use the `pacemaker ... -p potential.yaml` option.
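The input file is plain YAML, so it can be assembled or inspected programmatically. Below is a minimal sketch (using PyYAML) that loads an input file and checks that the main sections described in this document are present; the check itself is only an illustration, not pacemaker's own validation.

```python
import yaml  # PyYAML

# A minimal example input assembled from the sections described above
INPUT_YAML = """
cutoff: 10.0
metadata:
  purpose: demo fit
data:
  filename: small_df_tf_atoms.pckl
potential:
  element: Al
  rankmax: 3
fit:
  optimizer: BFGS
  maxiter: 1000
backend:
  evaluator: pyace
"""

config = yaml.safe_load(INPUT_YAML)

# Illustrative sanity check over the main sections of the input file
for section in ("cutoff", "data", "potential", "fit", "backend"):
    assert section in config, f"missing section: {section}"

print(config["backend"]["evaluator"])  # -> pyace
```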
4. Fitting settings

An example of the fit section:

```yaml
fit:
  loss: { kappa: 0, L1_coeffs: 0, L2_coeffs: 0, w1_coeffs: 0, w2_coeffs: 0,
          w0_rad: 0, w1_rad: 0, w2_rad: 0 }

  weighting:
    type: EnergyBasedWeightingPolicy
    nfit: 10000
    cutoff: 10
    DElow: 1.0
    DEup: 10.0
    DE: 1.0
    DF: 1.0
    wlow: 0.75
    seed: 42

  optimizer: BFGS # L-BFGS-B # Nelder-Mead
  maxiter: 1000

  # fit_cycles: 2               # (optional) number of consecutive runs of the fitting algorithm,
  #                             # which helps convergence
  # noise_relative_sigma: 1e-2  # apply Gaussian noise with the specified relative sigma/mean ratio to all optimizable potential coefficients
  # noise_absolute_sigma: 1e-3  # apply Gaussian noise with the specified absolute sigma to all optimizable potential coefficients

  # ladder_step: [10, 0.02]     # possible values:
  #                             #  - integer >= 1 - number of basis functions to add in the ladder scheme
  #                             #  - float between 0 and 1 - relative ladder step size wrt. the current basis size
  #                             #  - list of both values - select the maximum of the two on each iteration
  #                             # see "Ladder scheme fitting" for more info

  # ladder_type: body_order     # default
  #                             # possible values:
  #                             #  body_order - new basis functions are added according to the body order, i.e. a function
  #                             #               with a higher body order will not be added until the list of functions of
  #                             #               the previous body order is exhausted
  #                             #  power_order - the order of adding new basis functions is defined by the "power rank" p
  #                             #               of a function: p = len(ns) + sum(ns) + sum(ls).
  #                             #               Functions with the smallest p are added first
```

If not specified, then uniform weights, an energy-only fit (kappa=0), fit_cycles=1 and noise_relative_sigma=0 are used.
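The `power_order` ladder type can be illustrated with a short sketch using the power-rank formula quoted above, `p = len(ns) + sum(ns) + sum(ls)`. The basis functions here are represented simply as hypothetical tuples of their `ns` and `ls` indices, which is a simplification for illustration only:

```python
def power_rank(ns, ls):
    # "power rank" p of a basis function: p = len(ns) + sum(ns) + sum(ls)
    return len(ns) + sum(ns) + sum(ls)

# Hypothetical basis functions, each given as (ns, ls) index tuples
basis = [
    ((1,), (0,)),            # rank 1: p = 1 + 1 + 0 = 2
    ((2, 1), (1, 1)),        # rank 2: p = 2 + 3 + 2 = 7
    ((1, 1), (0, 0)),        # rank 2: p = 2 + 2 + 0 = 4
    ((1, 1, 1), (0, 0, 0)),  # rank 3: p = 3 + 3 + 0 = 6
]

# In the power_order scheme, functions with the smallest p are added first;
# note that the rank-3 function (p=6) precedes one of the rank-2 functions
# (p=7), which is exactly how power_order differs from body_order.
ordered = sorted(basis, key=lambda f: power_rank(*f))
print([power_rank(*f) for f in ordered])  # -> [2, 4, 6, 7]
```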
5. Backend specification

```yaml
backend:
  evaluator: pyace # pyace, tensorpot

  ## for the `pyace` evaluator, the following options are available:
  # parallel_mode: process # process, serial - parallelization mode for the `pyace` evaluator
  # n_workers: 4           # number of parallel workers for the `process` parallelization mode

  ## for the `tensorpot` evaluator, the following options are available:
  # batch_size: 10         # batch size for loss function evaluation, default is 10
  # display_step: 20       # frequency of detailed metric calculation and printing
```

Alternatively, the backend can be selected with `pacemaker ... -b tensorpot`.
Ladder scheme fitting

In the ladder scheme, potential fitting happens by adding new portions of basis functions step by step, forming a "ladder" from the initial potential to the final potential. The following settings should be added to the input YAML file:

- Specify the final potential shape by providing the potential section:

```yaml
potential:
  deltaSplineBins: 0.001
  element: Al
  fs_parameters: [1, 1, 1, 0.5]
  npot: FinnisSinclair
  NameOfCutoffFunction: cos
  rankmax: 3
  nradmax: [4, 1, 1]
  lmax: [0, 1, 1]
  ndensity: 2
  rcut: 8.7
  dcut: 0.01
  radparameters: [5.25]
  radbase: ChebExpCos
```

- Specify the initial or intermediate potential by providing the initial_potential option in the potential section:

```yaml
potential:
  ...
  initial_potential: some_start_or_interim_potential.yaml # potential to start the fit from
```

If the initial or intermediate potential is not specified, the fit starts from an empty potential. Alternatively, you can specify it with the command-line option `pacemaker ... -ip some_start_or_interim_potential.yaml`.

- Specify ladder_step in the fit section:

```yaml
fit:
  ...
  ladder_step: [10, 0.02] # possible values:
                          #  - integer >= 1 - number of basis functions to add in the ladder scheme
                          #  - float between 0 and 1 - relative ladder step size wrt. the current basis size
                          #  - list of both values - select the maximum of the two on each iteration
```
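The ladder_step rule described in the comments above can be sketched as follows. This is only an illustration of the stated rule (integer = absolute number of functions, float = fraction of the current basis size, list = maximum of the two), not pacemaker's actual implementation, and the function name `ladder_increment` is hypothetical:

```python
def ladder_increment(ladder_step, current_size):
    # Normalize to a list so a single value and a list are handled uniformly
    steps = ladder_step if isinstance(ladder_step, list) else [ladder_step]
    candidates = []
    for s in steps:
        if isinstance(s, int) and s >= 1:
            # absolute number of basis functions to add
            candidates.append(s)
        elif isinstance(s, float) and 0 < s < 1:
            # relative step size wrt. the current basis size
            candidates.append(int(round(s * current_size)))
        else:
            raise ValueError(f"invalid ladder_step entry: {s}")
    # with a list of values, the maximum of the possibilities is selected
    return max(candidates)

print(ladder_increment([10, 0.02], 2000))  # 0.02 * 2000 = 40 > 10 -> 40
print(ladder_increment([10, 0.02], 100))   # 0.02 * 100  = 2 < 10 -> 10
```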