Bayesian flow network framework for Chemistry

ChemBFN: Bayesian Flow Network for Chemistry

This is the repository of the PyTorch implementation of the ChemBFN model.

Features

ChemBFN provides state-of-the-art functionality for

SMILES or SELFIES-based de novo molecule generation
Protein sequence de novo generation
Template optimisation (mol2mol)
Classifier-free guidance conditional generation (single or multi-objective optimisation)
Context-guided conditional generation (inpaint)
Outstanding out-of-distribution chemical space sampling
Fast sampling via ODE solver
Molecular property and activity prediction finetuning
Reaction yield prediction finetuning

in an all-in-one-model style.

News

[26/12/2025] We were invited to submit a short report about ChemBFN to the CICSJ Bulletin.
[09/10/2025] A web app chembfn_webui for hosting ChemBFN models is available on PyPI.
[30/01/2025] The package bayesianflow_for_chem is available on PyPI.
[21/01/2025] Our first paper has been accepted by JCIM.
[17/12/2024] The second paper, on out-of-distribution generation, is available on arxiv.org.
[31/07/2024] The paper is available on arxiv.org.
[21/07/2024] The paper was submitted to arXiv.

Install

$ pip install -U bayesianflow_for_chem
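
To confirm from Python that the install succeeded, a minimal check with the standard library's importlib.metadata could look like the sketch below. The helper name installed_version is ours for illustration and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution name as used in the pip command above.
    print(installed_version("bayesianflow_for_chem"))
```

A result of None means pip did not install the package into the interpreter you are running.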

Usage

You can find example scripts in the 📁example folder.

Pre-trained Model

You can find pretrained models (linked to pretraining datasets) on our 🤗Hugging Face model page.

Dataset Handling

We provide a Python class, CSVData, for handling data stored in CSV or a similar format with a header row that names each column. The following is a quick start.

Download your dataset file (e.g., ESOL from MoleculeNet) and split the file:

>>> from bayesianflow_for_chem.tool import split_data

>>> split_data("delaney-processed.csv", method="scaffold")
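
For intuition, a scaffold split keeps molecules that share a core scaffold in the same subset, so the held-out subsets contain scaffolds the model has not seen. The sketch below illustrates only that grouping idea in plain Python and is not the implementation of split_data. The function group_split, its parameters, and the 80/10/10 fractions are hypothetical, and scaffold_of stands in for a real scaffold function (for example one built on RDKit's MurckoScaffold module).

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple


def group_split(
    smiles: Sequence[str],
    scaffold_of: Callable[[str], str],
    frac_train: float = 0.8,  # assumed fraction, for illustration only
    frac_valid: float = 0.1,  # assumed fraction, for illustration only
) -> Tuple[List[int], List[int], List[int]]:
    """Assign whole scaffold groups to train/valid/test as index lists."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for idx, smi in enumerate(smiles):
        groups[scaffold_of(smi)].append(idx)

    # Largest groups first, so big scaffold families land in the training set.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(smiles)
    train: List[int] = []
    valid: List[int] = []
    test: List[int] = []
    for group in ordered:
        if len(train) + len(group) <= frac_train * n:
            train += group
        elif len(valid) + len(group) <= frac_valid * n:
            valid += group
        else:
            test += group
    return train, valid, test
```

Because groups are never divided, no scaffold key appears in more than one subset, which is what makes this kind of split harder than a random one.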

Load the split data:

>>> from bayesianflow_for_chem.data import smiles2token, collate, CSVData

>>> dataset = CSVData("delaney-processed_train.csv")
>>> dataset[0]
{'Compound ID': ['Thiophene'], 
'ESOL predicted log solubility in mols per litre': ['-2.2319999999999998'], 
'Minimum Degree': ['2'], 
'Molecular Weight': ['84.14299999999999'], 
'Number of H-Bond Donors': ['0'], 
'Number of Rings': ['1'], 
'Number of Rotatable Bonds': ['0'], 
'Polar Surface Area': ['0.0'], 
'measured log solubility in mols per litre': ['-1.33'], 
'smiles': ['c1ccsc1']}
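
The record above has one key per CSV header and keeps every cell as a string wrapped in a list. A rough standard-library equivalent, shown only to make that layout concrete, could be written as follows. It is not how CSVData is implemented, read_rows is a hypothetical helper, and the sample text is a trimmed two-column stand-in for the real file.

```python
import csv
import io
from typing import Dict, List


def read_rows(text: str) -> List[Dict[str, List[str]]]:
    """Parse CSV text into one {header: [cell]} dict per data row."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: [cell] for key, cell in row.items()} for row in reader]


sample = "Compound ID,smiles\nThiophene,c1ccsc1\n"
rows = read_rows(sample)
print(rows[0])  # {'Compound ID': ['Thiophene'], 'smiles': ['c1ccsc1']}
```

Since every cell arrives as a string, numeric columns have to be converted explicitly before use.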

Create a mapping function to tokenise the dataset and select values:

>>> import torch

>>> def encode(x):
...   smiles = x["smiles"][0]
...   value = [float(i) for i in x["measured log solubility in mols per litre"]]
...   return {"token": smiles2token(smiles), "value": torch.tensor(value)}

>>> dataset.map(encode)
>>> dataset[0]
{'token': tensor([  1, 151,  23, 151, 151, 154, 151,  23,   2]), 
'value': tensor([-1.3300])}
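
In the output above, the token sequence begins with 1 and ends with 2, which look like start and end markers around one id per SMILES symbol. The sketch below shows that general idea with a regular-expression tokeniser and a toy vocabulary. The names tokenize, encode_tokens, and toy_vocab and all ids are hypothetical; they are not the vocabulary or the implementation of smiles2token.

```python
import re
from typing import Dict, List

# Bracket atoms, two-letter halogens, %NN ring closures, then any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into chemically meaningful tokens."""
    return SMILES_TOKEN.findall(smiles)


def encode_tokens(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map tokens to ids and add start/end markers (toy ids, not the package's)."""
    return [vocab["<start>"]] + [vocab[t] for t in tokens] + [vocab["<end>"]]


toy_vocab = {"<start>": 1, "<end>": 2, "c": 3, "s": 4, "1": 5}
print(encode_tokens(tokenize("c1ccsc1"), toy_vocab))  # [1, 3, 5, 3, 3, 4, 3, 5, 2]
```

Matching multi-character tokens such as Cl or [C@H] before single characters keeps them from being split into unrelated symbols.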

Wrap the dataset in torch.utils.data.DataLoader:

>>> dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, collate_fn=collate)
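
Molecules tokenise to different lengths, so the samples in a batch have to be brought to a common length before they can be stacked; the collate function passed to the DataLoader is the hook where that kind of merging happens. As a rough illustration of the padding idea only, the sketch below uses plain lists instead of tensors, a hypothetical pad_batch helper, and an assumed padding id of 0. It does not describe what the package's collate actually does.

```python
from typing import Dict, List


def pad_batch(
    samples: List[Dict[str, List[int]]], pad_id: int = 0  # pad_id is an assumption
) -> Dict[str, List[List[int]]]:
    """Right-pad each sample's "token" list to the longest length in the batch."""
    longest = max(len(s["token"]) for s in samples)
    tokens = [s["token"] + [pad_id] * (longest - len(s["token"])) for s in samples]
    return {"token": tokens}


batch = pad_batch([{"token": [1, 151, 2]}, {"token": [1, 151, 23, 151, 2]}])
print(batch["token"])  # [[1, 151, 2, 0, 0], [1, 151, 23, 151, 2]]
```

With tensors, torch.nn.utils.rnn.pad_sequence performs the same right-padding.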

Cite This Work

@article{2025chembfn,
    title={Bayesian Flow Network Framework for Chemistry Tasks},
    author={Tao, Nianze and Abe, Minori},
    journal={Journal of Chemical Information and Modeling},
    volume={65},
    number={3},
    pages={1178-1187},
    year={2025},
    doi={10.1021/acs.jcim.4c01792},
}

@article{2025chembfn_report,
    title={Molecular Structure Design via Bayesian Flow Network},
    author={Tao, Nianze and Nagai, Touma and Abe, Minori},
    journal={CICSJ Bulletin},
    volume={43},
    number={1},
    pages={10-14},
    year={2025},
    doi={10.11546/cicsj.43.10},
}

Out-of-distribution generation and fast sampling:

@misc{2024chembfn_ood,
    title={Bayesian Flow Is All You Need to Sample Out-of-Distribution Chemical Spaces}, 
    author={Nianze Tao},
    year={2024},
    eprint={2412.11439},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/2412.11439}, 
}

Release History

[Apr 13, 2026] 3.0.0
[Feb 6, 2026] 2.4.4
[Jan 20, 2026] 2.4.3
[Jan 13, 2026] 2.4.2
[Jan 13, 2026] 2.4.1
[Dec 29, 2025] 2.4.0 (this version)
[Dec 26, 2025] 2.3.5
[Dec 17, 2025] 2.3.4
[Dec 11, 2025] 2.3.3
[Nov 24, 2025] 2.3.2
[Nov 19, 2025] 2.3.1
[Nov 18, 2025] 2.3.0
[Nov 16, 2025] 2.2.6
[Nov 8, 2025] 2.2.5
[Oct 27, 2025] 2.2.4
[Oct 20, 2025] 2.2.3
[Oct 17, 2025] 2.2.2
[Oct 16, 2025] 2.2.1
[Oct 14, 2025] 2.1.0
[Oct 13, 2025] 2.0.5
[Sep 29, 2025] 2.0.4
[Sep 28, 2025] 2.0.3
[Sep 27, 2025] 2.0.2
[Sep 23, 2025] 2.0.1
[Sep 23, 2025] 2.0.0
[Aug 22, 2025] 1.4.3
[Aug 9, 2025] 1.4.2
[Jul 17, 2025] 1.4.1
[Jul 11, 2025] 1.4.0
[Jun 26, 2025] 1.3.0
[May 8, 2025] 1.2.7
[Feb 16, 2025] 1.2.6
[Feb 10, 2025] 1.2.1
[Jan 30, 2025] 1.2.0
