Chemeleon framework for De Novo Generation (DNG) and Crystal Structure Prediction (CSP) tasks
Chemeleon-DNG: Chemeleon for De Novo Generation
While the Chemeleon GitHub repository focuses on text-guided crystal structure generation, this repository provides a framework for two tasks: De Novo Generation (DNG) and Crystal Structure Prediction (CSP).
- CSP (Crystal Structure Prediction): Predicts stable crystal structures from given atom types
- DNG (De Novo Generation): Generates new crystal structures from scratch
Installation
Prerequisites
- Python 3.11+
- PyTorch >= 2.1.0
- CUDA (optional, for GPU acceleration)
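The prerequisites above can be checked from Python before installing. This is a minimal sketch that assumes only the version bounds listed here; `check_prerequisites` is a hypothetical helper, not part of chemeleon-dng, and it compares only the major and minor PyTorch version.

```python
import sys


def check_prerequisites():
    """Report whether the interpreter and PyTorch meet the listed minimums."""
    report = {
        "python_ok": sys.version_info >= (3, 11),
        "torch_ok": False,
        "cuda": False,
    }
    try:
        import torch  # may not be installed yet
    except ImportError:
        return report
    # torch.__version__ looks like "2.1.0+cu121"; compare major and minor only.
    major, minor = (int(part) for part in torch.__version__.split(".")[:2])
    report["torch_ok"] = (major, minor) >= (2, 1)
    report["cuda"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    print(check_prerequisites())
```

If `cuda` comes back `False`, the examples in this document can still be run with `device="cpu"`.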
Install via pip
pip install chemeleon-dng
Install from Source
If you don't have uv installed:
curl -LsSf https://astral.sh/uv/install.sh | sh
Then install the package:
git clone https://github.com/hspark1212/chemeleon-dng.git
cd chemeleon-dng
uv sync
Quick Start
Crystal Structure Prediction (CSP)
Generate crystal structures for given chemical formulas:
from chemeleon_dng.sample import sample
sample(
    task="csp",
    formulas=["NaCl", "LiMnO2"],
    num_samples=10,
    output_dir="results",
    device="cpu",
)
Tip: Invoke help(sample) to explore all available parameters and usage examples.
Command-Line Interface
After installing via pip, you can use the chemeleon-dng command directly:
chemeleon-dng --task=csp --formulas="NaCl,LiMnO2" --num_samples=10 --output_dir="results" --device=cpu
This command runs the CSP task on the CPU, generates 10 crystal structures for the given formulas, and saves them as CIF files in the results/ directory.
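When the formulas come from a Python list (for example, the output of a screening step), the same command line can be assembled programmatically. The sketch below uses only the flags shown in the command above; `build_csp_command` is a hypothetical helper written for illustration, not part of the package.

```python
import shlex


def build_csp_command(formulas, num_samples=10, output_dir="results", device="cpu"):
    """Assemble the chemeleon-dng CSP invocation as an argument list."""
    return [
        "chemeleon-dng",
        "--task=csp",
        "--formulas=" + ",".join(formulas),  # the CLI takes a comma-separated string
        f"--num_samples={num_samples}",
        f"--output_dir={output_dir}",
        f"--device={device}",
    ]


cmd = build_csp_command(["NaCl", "LiMnO2"])
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if an output path contains spaces.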
De Novo Generation (DNG)
Generate novel crystal structures without predefined compositions:
from chemeleon_dng.sample import sample
sample(
    task="dng",
    num_samples=200,
    batch_size=100,
    output_dir="results",
    device="cuda",
)
For the command line interface:
chemeleon-dng --task=dng --num_samples=200 --batch_size=100 --output_dir="results" --device=cuda
This command runs the DNG task on the GPU, generates 200 random crystal structures in two batches of 100, and saves them in the results/ directory.
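The relationship between num_samples and batch_size is plain integer arithmetic, sketched below. `batch_sizes` is a hypothetical helper for planning memory use. How chemeleon-dng itself handles a remainder when num_samples is not a multiple of batch_size is not stated here, so the final partial batch is an assumption.

```python
def batch_sizes(num_samples, batch_size):
    """Split num_samples into full batches plus an optional smaller final batch."""
    if num_samples < 0 or batch_size <= 0:
        raise ValueError("num_samples must be >= 0 and batch_size must be > 0")
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        sizes.append(remainder)  # assumed behaviour for a non-divisible total
    return sizes


print(batch_sizes(200, 100))  # [100, 100]
```

A smaller batch_size lowers peak GPU memory at the cost of more batches.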
Tutorial
For a comprehensive step-by-step guide to using Chemeleon-DNG for crystal structure discovery, see our interactive tutorial, which covers:
- Composition screening with SMACT
- Crystal structure generation with Chemeleon-DNG
- Geometry optimization with MACE force fields and TorchSim
- Stability analysis using Materials Project phase diagrams via mp-api
Pretrained Models
When you run the sample script, the pretrained models are downloaded automatically from Figshare and saved in the ckpts/ directory if they are not already present. The models were trained on the mp-20 and alex_mp_20 datasets.
The ckpts/ directory holds the following pretrained checkpoints:
- chemeleon_csp_alex_mp_20_v0.0.2.ckpt
- chemeleon_dng_alex_mp_20_v0.0.2.ckpt
- chemeleon_csp_mp_20_v0.0.2.ckpt
- chemeleon_dng_mp_20_v0.0.2.ckpt
Troubleshooting Checkpoint Downloads
If automatic download fails (timeout, connection error, or firewall restrictions), manually download the checkpoints:
# Download from Figshare
wget https://ndownloader.figshare.com/files/54966305 -O checkpoints.tar.gz
# Extract to project root
tar -xzf checkpoints.tar.gz
# Verify
ls ckpts/
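The verification step can also be scripted. The sketch below checks a directory for the four checkpoint file names given in this section; `missing_checkpoints` is a hypothetical helper, and it assumes the archive extracts directly into ckpts/.

```python
from pathlib import Path

# File names as listed in the Pretrained Models section.
EXPECTED_CHECKPOINTS = [
    "chemeleon_csp_alex_mp_20_v0.0.2.ckpt",
    "chemeleon_dng_alex_mp_20_v0.0.2.ckpt",
    "chemeleon_csp_mp_20_v0.0.2.ckpt",
    "chemeleon_dng_mp_20_v0.0.2.ckpt",
]


def missing_checkpoints(ckpt_dir="ckpts"):
    """Return the expected checkpoint names that are not present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in EXPECTED_CHECKPOINTS if not (root / name).is_file()]


missing = missing_checkpoints()
print("All checkpoints present" if not missing else f"Missing: {missing}")
```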
For checkpoints in custom locations, use the model_path parameter:
sample(task="csp", formulas=["NaCl"], model_path="/path/to/checkpoint.ckpt")
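The checkpoint file names in this section appear to follow a chemeleon_<task>_<dataset>_v0.0.2.ckpt pattern, so a path for model_path can be built from the task and dataset. This is a sketch under that assumption; `checkpoint_path` is a hypothetical helper, and the pattern may not hold for other releases.

```python
from pathlib import Path

TASKS = ("csp", "dng")
DATASETS = ("mp_20", "alex_mp_20")


def checkpoint_path(task, dataset, ckpt_dir="ckpts", version="v0.0.2"):
    """Build the path of a pretrained checkpoint from its task and dataset."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {DATASETS}")
    return Path(ckpt_dir) / f"chemeleon_{task}_{dataset}_{version}.ckpt"


path = checkpoint_path("csp", "mp_20")
print(path)
# The result can then be passed on, for example:
# sample(task="csp", formulas=["NaCl"], model_path=str(path))
```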
Benchmarks
For benchmarking purposes, the benchmarks/ directory provides 10,000 structures sampled with the DNG models trained on the mp-20 and alex_mp_20 datasets. The sampled structures are saved in CIF format and compressed JSON format.
Citation
If you find our work helpful, please cite the following publication:
"Exploration of crystal chemical space using text-guided generative artificial intelligence" Nature Communications (2025)
DOI: 10.1038/s41467-025-59636-y
@article{park2025exploration,
  title={Exploration of crystal chemical space using text-guided generative artificial intelligence},
  author={Park, Hyunsoo and Onwuli, Anthony and Walsh, Aron},
  journal={Nature Communications},
  volume={16},
  number={1},
  pages={1--14},
  year={2025},
  publisher={Nature Publishing Group}
}
License
This project was developed by Hyunsoo Park as part of the Materials Design Group at Imperial College London and is licensed under the MIT License.
See the LICENSE file for more details.