A Python package providing a workflow for descriptor embedding and clustering for atomistic environments.
Descriptor Embedding and Clustering for Atomistic-environment Framework (DECAF)
https://gitlab.mpcdf.mpg.de/klai/decaf.git
Tutorials
For tutorials with examples, please visit the GitLab Pages site: https://klai.pages.mpcdf.de/decaf/
Description
DECAF is a Python package that provides a workflow for clustering the local atomic environments in a dataset of structures.
For details, please refer to the methodology paper "A Fuzzy Classification Framework to Identify Equivalent Atoms in Complex Materials and Molecules"[1].
It provides mainly the following functions:
- Computing SOAP descriptors from an input atomic structure given as an ASE Atoms object.
- Applying classical multidimensional scaling (MDS) to a dataset of SOAP descriptors.
- Differentiating atomic environments in the embedded dataset using mean shift clustering (MSC).
- Embedding and classifying environments outside the MDS-MSC dataset.
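The embedding and clustering steps listed above can be sketched with plain NumPy. This is a minimal illustration and not DECAF's implementation: the helper names `classical_mds` and `mean_shift`, the random vectors standing in for SOAP descriptors, the Euclidean distance, the flat kernel, and the bandwidth value are all assumptions made for the example.

```python
import numpy as np


def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS from a square matrix of pairwise distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    evals, evecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]
    evals, evecs = evals[order], evecs[:, order]
    # Clip tiny negative eigenvalues that can arise from round-off.
    return evecs * np.sqrt(np.clip(evals, 0.0, None))


def mean_shift(X, bandwidth, n_iter=100, tol=1e-6):
    """Flat-kernel mean shift: move each mode to the mean of the data within `bandwidth`."""
    modes = X.copy()
    for _ in range(n_iter):
        # Distances from every current mode to every data point.
        d = np.linalg.norm(modes[:, None, :] - X[None, :, :], axis=-1)
        w = (d <= bandwidth).astype(float)
        new_modes = (w @ X) / w.sum(axis=1, keepdims=True)
        shift = np.linalg.norm(new_modes - modes, axis=1).max()
        modes = new_modes
        if shift < tol:
            break
    # Merge modes that converged to (nearly) the same location.
    labels = -np.ones(len(X), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)


# Synthetic stand-in for descriptor vectors: two well-separated groups.
rng = np.random.default_rng(0)
descriptors = np.vstack([
    rng.normal(0.0, 0.05, size=(20, 10)),
    rng.normal(1.0, 0.05, size=(20, 10)),
])

# Pairwise Euclidean distances (an assumption; other metrics are possible).
D = np.linalg.norm(descriptors[:, None] - descriptors[None, :], axis=-1)

embedding = classical_mds(D, n_components=2)
labels, centers = mean_shift(embedding, bandwidth=1.0)
print("number of clusters:", len(centers))
```

In this sketch the bandwidth plays the role of the resolution at which environments are considered distinct, so it would need tuning for real descriptor data.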
Optional functions are also provided:
- Applying kernel principal component analysis (kPCA) / principal component analysis (PCA) / Sketch-Map[2] for embedding
- Applying HDBSCAN[3] for clustering.
References linking the journal article[1] and the code
Here we provide the locations in the code implementing the corresponding methods in the article.[1]
For details about how to use each function, please refer to decaf/examples/sample_code.ipynb or comments in decaf/src/decaf.py.
Methodology involved in the main text[1]:
double-SOAP (Sec. 2A[1]): decaf/src/decaf.py : function get_SOAP
classical MDS (Sec. 2B[1]): decaf/src/decaf.py : function get_cMDS
embedding any SOAP vector with the obtained model: decaf/src/decaf.py : function embed_cMDS
MSC (Sec. 2C[1]): decaf/src/decaf.py : function get_MeanShift
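To illustrate what embedding a new vector with an already obtained classical MDS model can look like, here is a hedged NumPy sketch based on the standard out-of-sample (Gower) formula. It does not reproduce the signature or internals of `embed_cMDS`; the names `fit_cmds` and `embed_new`, the model dictionary, and the random data are hypothetical, and the sketch assumes the retained eigenvalues are strictly positive.

```python
import numpy as np


def fit_cmds(D, n_components=2):
    """Fit classical MDS and keep what is needed to embed new points later."""
    n = D.shape[0]
    D2 = D ** 2
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:n_components]
    evals, evecs = evals[order], evecs[:, order]
    model = {
        "evals": evals,                  # assumed strictly positive
        "evecs": evecs,
        "row_mean": D2.mean(axis=1),     # mean squared distance per training point
        "total_mean": D2.mean(),         # overall mean squared distance
    }
    return evecs * np.sqrt(evals), model


def embed_new(d_new, model):
    """Embed new points from their distances to the training points.

    d_new has shape (n_train,) or (n_new, n_train).
    """
    d2 = np.atleast_2d(d_new) ** 2
    # Double-center the new squared distances against the training statistics.
    b = -0.5 * (
        d2
        - d2.mean(axis=1, keepdims=True)
        - model["row_mean"][None, :]
        + model["total_mean"]
    )
    # Project onto the retained eigenvectors and rescale by 1/sqrt(eigenvalue).
    return b @ model["evecs"] / np.sqrt(model["evals"])


rng = np.random.default_rng(1)
train = rng.normal(size=(30, 3))
D = np.linalg.norm(train[:, None] - train[None, :], axis=-1)
train_emb, model = fit_cmds(D, n_components=2)

# Sanity check: a training point embedded "out of sample" lands on itself.
same_point = embed_new(D[3], model)

# A genuinely new point, described only by its distances to the training set.
y = rng.normal(size=3)
d_y = np.linalg.norm(train - y, axis=1)
y_emb = embed_new(d_y, model)
print(y_emb.shape)
```

Once a new environment has coordinates in the existing embedding, it can be assigned to the nearest cluster found on the training data, which is the idea behind classifying environments outside the original dataset.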
Demonstrations in the main text[1]:
PAH examples (Sec. 3A[1]): decaf/examples/sample_code.ipynb : block PAH Example
Pd surfaces examples (Sec. 3B[1]): decaf/examples/sample_code.ipynb : block Pd Surfaces Demonstration
Out-of-sample classification of Pd nanoparticle (Sec. 3C[1]): decaf/examples/sample_code.ipynb : block Classification Demonstration
Demonstrations in the supplementary information (SI)[1]:
kPCA (Sec. S1A[1]): decaf/examples/sample_code.ipynb : block kPCA Embedding
SketchMap (Sec. S1B[1]): decaf/examples/sample_code.ipynb : block Sketch Map Embedding
HDBSCAN (Sec. S2A[1]): decaf/examples/sample_code.ipynb : block HDBSCAN Clustering
The demonstrations in Sec. S3-4 are reproducible by changing the (hyper)parameters according to the SI, using the functions in: decaf/examples/sample_code.ipynb : block Pd Surfaces Demonstration
Demonstration in Sec. S5: the MD settings and analysis are given in the main text and are reproducible, so they are omitted from the examples here.
Installation
You can install the package from PyPI with the following command
pip install fhi-decaf
For development from a local clone, use an editable install
pip install -e ".[test,docs]"
Then import the package with the following in Python
import decaf
Dependencies:
NumPy, ASE, DScribe, scikit-learn, SciPy
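As a quick check after installation, the listed dependencies can be probed from Python without importing them. The snippet below is only a convenience sketch; the helper name `missing_dependencies` is hypothetical, and the module names are the usual import names of these libraries (for example, scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Display name -> import name for each listed dependency.
REQUIRED = {
    "NumPy": "numpy",
    "ASE": "ase",
    "DScribe": "dscribe",
    "scikit-learn": "sklearn",
    "SciPy": "scipy",
}


def missing_dependencies(modules=REQUIRED):
    """Return the display names of modules that cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_dependencies()
print("missing dependencies:", ", ".join(missing) if missing else "none")
```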
Repository Structure:
decaf
├── examples                  # Folder containing examples of applying DECAF
│   ├── Compiled_SketchMap    # Folder containing compiled SketchMap if needed
│   ├── sample_code.ipynb     # Sample code of DECAF applied to the demonstration cases
│   └── Structures            # Folder containing atomic structures for the demonstration cases
│       └── **.con
├── pyproject.toml            # Setup code for installing DECAF
├── README.md                 # The readme you are reading now
└── src                       # Folder containing the source code of DECAF
    └── decaf.py              # Source code of DECAF
References
[1] K. C. Lai, S. Matera, C. Scheurer, and K. Reuter, "A Fuzzy Classification Framework to Identify Equivalent Atoms in Complex Materials and Molecules," J. Chem. Phys. 159(2) (2023). DOI: 10.1063/5.0160369.
[2] M. Ceriotti, G. A. Tribello, and M. Parrinello, "Simplifying the representation of complex free-energy landscapes using sketch-map," Proc. Natl. Acad. Sci. U.S.A. 108, 13023–13028 (2011).
[3] L. McInnes, J. Healy, and S. Astels, "hdbscan: Hierarchical density based clustering," J. Open Source Softw. 2, 205 (2017).
Authors and Affiliation
Authors:
King Chun Lai, Sebastian Matera, Christoph Scheurer, Karsten Reuter
Affiliation:
Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
Support
King Chun Lai : lai@fhi-berlin.mpg.de
License
Descriptor Embedding and Clustering for Atomistic-environment Framework by King Chun Lai, Sebastian Matera, Christoph Scheurer, and Karsten Reuter is licensed under CC BY 4.0.