Chemeleon framework for De Novo Generation (DNG) and Crystal Structure Prediction (CSP) tasks

Chemeleon-DNG: Chemeleon for De Novo Generation

While the Chemeleon GitHub repository focuses on text-guided crystal structure generation, this repository provides a framework for De Novo Generation (DNG) and Crystal Structure Prediction (CSP) tasks.

  • CSP (Crystal Structure Prediction): Predicts stable crystal structures from given atom types
  • DNG (De Novo Generation): Generates new crystal structures from scratch

Installation

Prerequisites

  • Python 3.11+
  • PyTorch >= 2.1.0
  • CUDA (optional, for GPU acceleration)
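
Because CUDA is optional, the device argument used in the examples below can be chosen at runtime. The following is a minimal sketch, assuming only that PyTorch exposes torch.cuda.is_available(); the helper name pick_device is hypothetical and is not part of chemeleon-dng.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    # Fall back to CPU if PyTorch itself is not installed.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

The returned string could then be passed as the device value, for example device=pick_device().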

Install via pip

pip install chemeleon-dng

Install from Source

If you don't have uv installed:

curl -LsSf https://astral.sh/uv/install.sh | sh

Then install the package:

git clone https://github.com/hspark1212/chemeleon-dng.git
cd chemeleon-dng
uv sync

Quick Start

Crystal Structure Prediction (CSP)

Generate crystal structures for given chemical formulas:

from chemeleon_dng.sample import sample

sample(
    task="csp",
    formulas=["NaCl", "LiMnO2"],
    num_samples=10,
    output_dir="results",
    device="cpu"
)

Tip: Invoke help(sample) to explore all available parameters and usage examples.

For the command line interface, you can use the following command:

python -m chemeleon_dng.sample --task=csp --formulas="NaCl,LiMnO2" --num_samples=10 --output_dir="results" --device=cpu

This command runs the CSP task on the CPU, generating 10 crystal structures for the given formulas and saving them as CIF files in the results/ directory.
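
To check what a run produced, the output directory can be scanned for CIF files. This sketch assumes the structures are written with a .cif extension somewhere under output_dir; the exact file naming and directory layout are not specified here, and list_cifs is a hypothetical helper rather than part of the package.

```python
from pathlib import Path


def list_cifs(output_dir: str) -> list[Path]:
    """Return all .cif files found under output_dir, sorted by path."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    # rglob searches recursively, in case files are grouped in subfolders.
    return sorted(root.rglob("*.cif"))


cifs = list_cifs("results")
print(f"Found {len(cifs)} CIF file(s)")
```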

De Novo Generation (DNG)

Generate novel crystal structures without predefined compositions:

from chemeleon_dng.sample import sample

sample(
    task="dng",
    num_samples=200,
    batch_size=100,
    output_dir="results",
    device="cuda"
)

For the command line interface, you can use the following command:

python -m chemeleon_dng.sample --task=dng --num_samples=200 --batch_size=100 --output_dir="results" --device=cuda

This command runs the DNG task on the GPU, generating 200 random crystal structures in two batches of 100 and saving them in the results/ directory.
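
The batch count follows from num_samples and batch_size: 200 samples at 100 per batch gives two batches. The sketch below illustrates the arithmetic only. How chemeleon-dng handles a final partial batch is an assumption here (leftover samples form a smaller last batch), and batch_sizes is a hypothetical helper.

```python
def batch_sizes(num_samples: int, batch_size: int) -> list[int]:
    """Split num_samples into consecutive batches of at most batch_size."""
    if num_samples < 0 or batch_size <= 0:
        raise ValueError("num_samples must be >= 0 and batch_size must be > 0")
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        # Assumed behavior: leftover samples go into a smaller final batch.
        sizes.append(remainder)
    return sizes


print(batch_sizes(200, 100))  # [100, 100]
```

A larger batch_size generally means fewer passes but more memory per pass, so it is worth lowering if the GPU runs out of memory.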

Pretrained Models

When you run the sample script, it automatically downloads any pretrained models that are not already present from the figshare repository and saves them in the ckpts/ directory. The pretrained models were trained on the mp-20 and alex_mp_20 datasets.

The following pretrained checkpoints are stored in the ckpts/ directory:

  • chemeleon_csp_alex_mp_20_v0.0.2.ckpt
  • chemeleon_dng_alex_mp_20_v0.0.2.ckpt
  • chemeleon_csp_mp_20_v0.0.2.ckpt
  • chemeleon_dng_mp_20_v0.0.2.ckpt
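
The four file names follow a chemeleon_<task>_<dataset>_v0.0.2.ckpt pattern. As an illustration, this sketch rebuilds the names and reports which ones are missing locally. checkpoint_name and missing_checkpoints are hypothetical helpers, not part of the package API, and the version string is simply copied from the list above.

```python
from pathlib import Path

TASKS = ("csp", "dng")
DATASETS = ("alex_mp_20", "mp_20")
VERSION = "v0.0.2"  # copied from the checkpoint names listed above


def checkpoint_name(task: str, dataset: str) -> str:
    """Build a checkpoint file name from a task and a dataset label."""
    return f"chemeleon_{task}_{dataset}_{VERSION}.ckpt"


def missing_checkpoints(ckpt_dir: str = "ckpts") -> list[str]:
    """Return the expected checkpoint names that are not in ckpt_dir."""
    root = Path(ckpt_dir)
    names = [checkpoint_name(t, d) for d in DATASETS for t in TASKS]
    return [name for name in names if not (root / name).exists()]


print(missing_checkpoints())
```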

Benchmarks

For benchmarking purposes, the benchmarks/ directory provides 10,000 structures sampled from the DNG models trained on the mp-20 and alex_mp_20 datasets. The sampled structures are saved in both CIF format and compressed JSON format.

Citation

If you find our work helpful, please cite the following publication:

"Exploration of crystal chemical space using text-guided generative artificial intelligence" Nature Communications (2025)
DOI: 10.1038/s41467-025-59636-y

@article{park2025exploration,
  title={Exploration of crystal chemical space using text-guided generative artificial intelligence},
  author={Park, Hyunsoo and Onwuli, Anthony and Walsh, Aron},
  journal={Nature Communications},
  volume={16},
  number={1},
  pages={1--14},
  year={2025},
  publisher={Nature Publishing Group}
}

License

This project is licensed under the MIT License. It was developed by Hyunsoo Park as part of the Materials Design Group at Imperial College London.
See the LICENSE file for more details.
