Python bindings and utilities for PACE and the fitting code "pacemaker"

Project description

pyace

pyace is the Python implementation of the Atomic Cluster Expansion (ACE). It provides basic functionality for analysis, potential conversion and fitting.

!!! THIS IS A LIMITED-FUNCTIONALITY VERSION OF pyace !!!

Please contact us by email (yury.lysogorskiy@rub.de) if you want the fully functional version.
Installation

```
pip install pyace-lite
```
(optional) Installation of tensorpotential

If you want to use the TensorFlow implementation of the atomic cluster expansion (developed by Dr. Anton Bochkarev), please contact us by email.
(!) Known issues

If you encounter segmentation fault errors, try upgrading the numpy package with the command:

```
pip install --upgrade numpy --force
```
Directory structure

- lib/: contains the extra libraries for pyace
- src/pyace/: Python bindings
Utilities

Potential conversion

There are two basic formats for ACE potentials:

- B-basis set in YAML format, e.g. 'Al.pbe.yaml'. This is the complete, internal format used by developers.
- Ctilde-basis set in plain-text format, e.g. 'Al.pbe.ace'. This format is irreversibly converted from the B-basis set for public distribution of potentials and is used by LAMMPS.

To convert a potential you can use the following utility, which is installed together with the pyace package into your executable path:

YAML to ace: pace_yaml2ace. Usage:

```
pace_yaml2ace [-h] [-o OUTPUT] input
```
Pacemaker

pacemaker is a utility for fitting atomic cluster expansion potentials. Usage:

```
pacemaker [-h] [-o OUTPUT] [-p POTENTIAL] [-ip INITIAL_POTENTIAL]
          [-b BACKEND] [-d DATA] [--query-data] [--prepare-data]
          [-l LOG]
          input

Fitting utility for atomic cluster expansion potentials

positional arguments:
  input                 input YAML file

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        output B-basis YAML file name, default:
                        output_potential.yaml
  -p POTENTIAL, --potential POTENTIAL
                        input potential YAML file name, will override input
                        file 'potential' section
  -ip INITIAL_POTENTIAL, --initial-potential INITIAL_POTENTIAL
                        initial potential YAML file name, will override input
                        file 'potential::initial_potential' section
  -b BACKEND, --backend BACKEND
                        backend evaluator, will override section
                        'backend::evaluator' from input file
  -d DATA, --data DATA  data file, will override section 'YAML:fit:filename'
                        from input file
  --query-data          query the training data from database, prepare and
                        save them
  --prepare-data        prepare and save training data only
  -l LOG, --log LOG     log filename (default: log.txt)
```
The required settings are provided by the input YAML file. The main sections are:

1. Cutoff and (optional) metadata

- The global cutoff for the fitting is set up as:

```yaml
cutoff: 10.0
```

- Metadata (optional)

These are arbitrary key (string)-value (string) pairs that will be added to the potential YAML file:

```yaml
metadata:
  info: some info
  comment: some comment
  purpose: some purpose
```

Moreover, starttime and user fields will be added automatically.
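To make the automatic fields concrete, here is a hedged sketch of how such a metadata block could be assembled before writing. The helper name `build_metadata` and the exact timestamp format are illustrative assumptions, not pacemaker's actual code; only the fact that starttime and user are added automatically comes from the text above.

```python
import getpass
from datetime import datetime

def build_metadata(user_metadata: dict) -> dict:
    """Merge user-supplied key/value pairs with automatically
    generated 'starttime' and 'user' fields (sketch only)."""
    # all keys and values in the potential YAML metadata are strings
    metadata = {str(k): str(v) for k, v in user_metadata.items()}
    metadata["starttime"] = datetime.now().isoformat()  # timestamp format is an assumption
    metadata["user"] = getpass.getuser()
    return metadata

meta = build_metadata({"info": "some info", "comment": "some comment"})
print(sorted(meta))  # ['comment', 'info', 'starttime', 'user']
```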
2. Dataset specification section

The fitting dataset can be queried automatically from structdb (if the corresponding structdborm package is installed and the connection to the database is configured; see the structdb.ini file in your home folder). Alternatively, the dataset can be saved into a file as a pickled pandas dataframe with special names for columns: #TODO: add columns names

Example:

```yaml
data: # dataset specification section
  # data configuration section
  config:
    element: Al                     # element name
    calculator: FHI-aims/PBE/tight  # calculator type from `structdb`
    # ref_energy: -1.234  # single-atom reference energy;
                          # if not specified, it will be queried from the database
    # seed: 42            # random seed for shuffling the data
    # query_limit: 1000   # limit on the number of entries to query from `structdb`;
                          # ignored if reading from cache
    # parallel: 3         # number of parallel workers to preprocess the data (`pandarallel` package required);
                          # if not specified, serial mode is used
    # cache_ref_df: True  # whether to store the queried or modified dataset into a file, default: True
  # filename: some.pckl.gzip  # force reading the reference pickled dataframe from the given file
  # ignore_weights: False     # whether to ignore energy and force weighting columns in the dataframe
  # datapath: ../data         # path to the folder with cache files (pickled dataframes)
```
Alternatively, instead of the data::config section, one can specify just the cache file with the pickled dataframe as data::filename:

```yaml
data:
  filename: small_df_tf_atoms.pckl
  datapath: ../tests/
```

An example of creating a subselection of the fitting dataframe and saving it is given in notebooks/data_preprocess.ipynb. An example of generating custom energy/forces weights is given in notebooks/data_custom_weights.ipynb.
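A cached dataset is a pickled pandas dataframe, with .pckl.gzip files being gzip-compressed pickles. A minimal loader sketch under that file-layout assumption (the helper name is illustrative and not part of pyace; the demo uses a plain dict standing in for a dataframe to stay self-contained):

```python
import gzip
import os
import pickle
import tempfile

def load_pickled_dataframe(path: str):
    """Load a (possibly gzip-compressed) pickled object such as a
    pacemaker dataset cache. Assumption: '.pckl.gzip' files are
    gzip-compressed pickles, plain '.pckl' files are not."""
    opener = gzip.open if path.endswith(".gzip") else open
    with opener(path, "rb") as f:
        return pickle.load(f)

# round-trip demo with a plain dict standing in for a dataframe
tmp = os.path.join(tempfile.mkdtemp(), "demo.pckl.gzip")
with gzip.open(tmp, "wb") as f:
    pickle.dump({"energy": [1.0, 2.0]}, f)
print(load_pickled_dataframe(tmp))  # {'energy': [1.0, 2.0]}
```

For real dataframes, pandas' own pd.read_pickle (with compression='gzip') offers the same round trip.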
Querying data

You can just query and preprocess data, without running the potential fitting. Here is a minimalistic input YAML:

```yaml
# input.yaml file
cutoff: 10.0  # use a larger cutoff to have an excess neighbour list
data:         # dataset specification section
  config:
    element: Al                     # element name
    calculator: FHI-aims/PBE/tight  # calculator type from `structdb`
    seed: 42
    parallel: 3       # parallel data processing. WARNING! higher memory usage is possible
    # query_limit: 100 # number of entries to query
  datapath: ../data   # path to the directory with cache files
```

Then execute pacemaker --query-data input.yaml to query and build the dataset with pyace neighbour lists. To build both pyace and tensorpot neighbour lists, use pacemaker --query-data input.yaml -b tensorpot.
Preparing the data / constructing neighbour lists

You can use an existing .pckl.gzip dataset and generate all necessary columns for it, including neighbour lists. Here is a minimalistic input YAML:

```yaml
# input.yaml file
cutoff: 10.
data:
  filename: my_dataset.pckl.gzip
backend:
  evaluator: tensorpot  # pyace, tensorpot
```

Then execute pacemaker --prepare-data input.yaml. Generation of my_dataset.pckl.gzip from, for example, pyiron is shown in notebooks/convert-pyiron-to-pacemaker.ipynb.
3. Interatomic potential (or B-basis) configuration

One can define the initial interatomic potential configuration as:

```yaml
potential:
  deltaSplineBins: 0.001
  element: Al
  fs_parameters: [1, 1, 1, 0.5]
  npot: FinnisSinclair
  NameOfCutoffFunction: cos
  rankmax: 3

  nradmax: [4, 3, 3]  # per-rank values of nradmax
  lmax: [0, 1, 1]     # per-rank values of lmax; lmax=0 for the first rank always!

  ndensity: 2
  rcut: 8.7
  dcut: 0.01
  radparameters: [5.25]
  radbase: ChebExpCos

  ## hard-core repulsion (optional)
  # core-repulsion: [500, 10]
  # rho_core_cut: 50
  # drho_core_cut: 20

  # basisdf: /some/path/to/pyace_bbasisfunc_df.pckl  # path to the dataframe with a "white list" of basis functions to use in the fit
  # initial_potential: whatever.yaml                 # in the "ladder" fitting scheme, the potential from which to start the fit
```

If you want to continue fitting an existing potential stored in a potential.yaml file, then specify:

```yaml
potential: potential.yaml
```

Alternatively, one can use the pacemaker ... -p potential.yaml option.
4. Fitting settings

An example of the fit section:

```yaml
fit:
  loss: { kappa: 0, L1_coeffs: 0, L2_coeffs: 0, w1_coeffs: 0, w2_coeffs: 0,
          w0_rad: 0, w1_rad: 0, w2_rad: 0 }

  weighting:
    type: EnergyBasedWeightingPolicy
    nfit: 10000
    cutoff: 10
    DElow: 1.0
    DEup: 10.0
    DE: 1.0
    DF: 1.0
    wlow: 0.75
    seed: 42

  optimizer: BFGS  # L-BFGS-B # Nelder-Mead
  maxiter: 1000

  # fit_cycles: 2               # (optional) number of consecutive runs of the fitting algorithm;
                                # this helps convergence
  # noise_relative_sigma: 1e-2  # apply Gaussian noise with the specified relative sigma/mean ratio to all optimizable potential coefficients
  # noise_absolute_sigma: 1e-3  # apply Gaussian noise with the specified absolute sigma to all optimizable potential coefficients

  # ladder_step: [10, 0.02]     # Possible values:
                                #  - integer >= 1: number of basis functions to add in the ladder scheme
                                #  - float between 0 and 1: relative ladder step size w.r.t. the current basis size
                                #  - list of both of the above: select the maximum of the two on each iteration
                                # see "Ladder scheme fitting" for more info

  # ladder_type: body_order     # default
                                # Possible values:
                                #  body_order  - new basis functions are added according to the body order, i.e. a function with a higher body order
                                #                will not be added until the list of functions of the previous body order is exhausted
                                #  power_order - the order of adding new basis functions is defined by the "power rank" p of a function:
                                #                p = len(ns) + sum(ns) + sum(ls). Functions with the smallest p are added first
```
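The power_order rule can be illustrated with a short sketch. Only the formula p = len(ns) + sum(ns) + sum(ls) and the "smallest p first" ordering come from the description above; the representation of basis functions as (ns, ls) index tuples is an illustrative assumption.

```python
def power_rank(ns, ls):
    """'Power rank' p of a basis function: p = len(ns) + sum(ns) + sum(ls)."""
    return len(ns) + sum(ns) + sum(ls)

# hypothetical basis functions given as (ns, ls) index tuples
funcs = [
    ((1,), (0,)),      # p = 1 + 1 + 0 = 2
    ((2, 1), (1, 1)),  # p = 2 + 3 + 2 = 7
    ((1, 1), (0, 0)),  # p = 2 + 2 + 0 = 4
]

# power_order: functions with the smallest p are added first
ordered = sorted(funcs, key=lambda f: power_rank(*f))
print([power_rank(*f) for f in ordered])  # [2, 4, 7]
```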
If not specified, uniform weights, an energy-only fit (kappa=0), fit_cycles=1 and noise_relative_sigma=0 will be used.
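To make the role of kappa concrete, here is a hedged sketch of an energy/force trade-off loss. The exact functional form (and the regularization terms of the loss section) is an assumption for illustration; only the fact that kappa=0 yields an energy-only fit follows from the text above.

```python
def loss(energy_residuals, force_residuals, kappa=0.0):
    """Sketch of an energy/force trade-off: kappa=0 gives an
    energy-only fit. The (1-kappa)/kappa weighting is an
    assumption, not pacemaker's exact formula."""
    e_term = sum(r * r for r in energy_residuals)
    f_term = sum(r * r for r in force_residuals)
    return (1.0 - kappa) * e_term + kappa * f_term

# with kappa=0, force residuals do not contribute at all
print(loss([1.0], [5.0], kappa=0.0))  # 1.0
```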
5. Backend specification

```yaml
backend:
  evaluator: pyace  # pyace, tensorpot

  ## for the `pyace` evaluator, the following options are available:
  # parallel_mode: process  # process, serial - parallelization mode for the `pyace` evaluator
  # n_workers: 4            # number of parallel workers for the `process` parallelization mode

  ## for the `tensorpot` evaluator, the following options are available:
  # batch_size: 10          # batch size for the loss function evaluation, default is 10
  # display_step: 20        # frequency of detailed metrics calculation and printing
```

Alternatively, the backend can be selected with pacemaker ... -b tensorpot.
Ladder scheme fitting

In the ladder scheme, potential fitting happens by adding new portions of basis functions step by step, forming a "ladder" from the initial potential to the final potential. The following settings should be added to the input YAML file:

- Specify the final potential shape by providing the potential section:

```yaml
potential:
  deltaSplineBins: 0.001
  element: Al
  fs_parameters: [1, 1, 1, 0.5]
  npot: FinnisSinclair
  NameOfCutoffFunction: cos
  rankmax: 3
  nradmax: [4, 1, 1]
  lmax: [0, 1, 1]
  ndensity: 2
  rcut: 8.7
  dcut: 0.01
  radparameters: [5.25]
  radbase: ChebExpCos
```

- Specify the initial or intermediate potential by providing the initial_potential option in the potential section:

```yaml
potential:
  ...
  initial_potential: some_start_or_interim_potential.yaml  # potential to start the fit from
```

If the initial or intermediate potential is not specified, the fit will start from an empty potential. Alternatively, you can specify it with the command-line option pacemaker ... -ip some_start_or_interim_potential.yaml

- Specify ladder_step in the fit section:

```yaml
fit:
  ...
  ladder_step: [10, 0.02]  # Possible values:
                           #  - integer >= 1: number of basis functions to add in the ladder scheme
                           #  - float between 0 and 1: relative ladder step size w.r.t. the current basis size
                           #  - list of both of the above: select the maximum of the two on each iteration
```
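The ladder_step selection rule can be sketched as follows. The "maximum of the two" rule for a list comes from the description above; the helper name and the rounding of the fractional step are illustrative assumptions.

```python
def next_ladder_step(ladder_step, current_basis_size):
    """Number of basis functions to add on the next ladder iteration.

    ladder_step may be:
      - an int >= 1: a fixed number of functions to add,
      - a float in (0, 1): a fraction of the current basis size,
      - a list of both: take the maximum of the two rules.
    """
    def one(step):
        if isinstance(step, int):
            return step
        # fractional step relative to the current basis size
        # (rounding is an assumption for illustration)
        return max(1, round(step * current_basis_size))

    if isinstance(ladder_step, (list, tuple)):
        return max(one(s) for s in ladder_step)
    return one(ladder_step)

print(next_ladder_step([10, 0.02], 100))   # 10 (fixed step wins)
print(next_ladder_step([10, 0.02], 1000))  # 20 (2% of 1000 wins)
```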