Reaction fingerprint
SynRFP — Graph-based, mapping-free reaction fingerprints
SynRFP (Synthesis Reaction FingerPrint) is a mapping-free, permutation-invariant framework for representing chemical transformations as fixed-length fingerprints.
It explicitly factorises the fingerprint pipeline into three modular operators:
- Ψ (extractor) — isomorphism-invariant subgraph/token extraction from each reaction side;
- Φ (combination) — algebraic reparameterisation producing the signed net change Δ and total counts U per token;
- 𝒮 (sketcher) — randomized compression of (Δ,U) into a fixed-dimensional fingerprint f ∈ 𝓕.
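The Φ and 𝒮 stages can be illustrated with a small standalone sketch. This is not SynRFP's implementation: `combine` and `toy_parity_sketch` are hypothetical helpers, the token lists stand in for whatever Ψ would extract, and the sign convention (product count minus reactant count) and the hashing scheme are assumptions made only for illustration.

```python
from collections import Counter
import hashlib


def combine(tokens_r, tokens_p):
    """Toy Φ: per-token signed net change and total count."""
    count_r, count_p = Counter(tokens_r), Counter(tokens_p)
    keys = set(count_r) | set(count_p)
    # Assumed sign convention: product count minus reactant count.
    delta = {t: count_p[t] - count_r[t] for t in keys}
    total = {t: count_p[t] + count_r[t] for t in keys}
    return delta, total


def toy_parity_sketch(delta, bits=16, seed=0):
    """Toy 𝒮: flip one hashed bit for every token whose net change is odd."""
    out = [0] * bits
    for token, d in delta.items():
        if d % 2:  # true for odd values, including negative ones
            digest = hashlib.blake2b(f"{seed}:{token}".encode(), digest_size=8).digest()
            out[int.from_bytes(digest, "big") % bits] ^= 1
    return out


# Hypothetical token multisets standing in for the output of Ψ.
delta, total = combine(["a", "a", "b"], ["a", "c"])
fingerprint = toy_parity_sketch(delta, bits=16, seed=42)
```

Because both steps depend only on token counts, reordering the tokens on either side leaves the result unchanged, which is the permutation invariance described above.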
⚙️ Installation
# 1) Clone the repository
git clone https://github.com/TieuLongPhan/synrfp.git
cd synrfp
# 2) Install the package (with optional extras)
pip install . # core functionality
pip install ".[all]" # with datasketch and pynauty support (quoted so the shell does not expand the brackets)
Alternatively, install the released package from PyPI:
pip install synrfp
🔧 Quick Start
1. Single‐reaction fingerprint
from synrfp.graph.reaction import Reaction
from synrfp import SynRFP
from synrfp.tokenizers.wl import WLTokenizer
from synrfp.sketchers.parity_fold import ParityFold
# Parse a reaction SMILES (RSMI) string into reactant and product GraphData
reactant_G, product_G = Reaction.from_rsmi("CCO>>C=C.O")
# Build engine: WL at radius 1 + 1024-bit parity-fold
fp_engine = SynRFP(
    tokenizer=WLTokenizer(),
    radius=1,
    sketch=ParityFold(bits=1024, seed=42),
)
# Compute fingerprint
res = fp_engine.fingerprint(reactant_G, product_G)
print(res) # SynRFPResult(tokens_R=3 tokens, tokens_P=3 tokens, support=0, sketch_type=bytearray)
bits = res.to_binary() # [0,1,0,0, …]
2. One‐line wrapper
from synrfp import synrfp
# Generate a 1024-bit binary fingerprint in one call
bits = synrfp(
    "CCO>>C=C.O",
    tokenizer="wl",
    radius=1,
    sketch="parity",
    bits=1024,
    seed=42,
)
print(len(bits), bits[:16]) # e.g. 1024 [0, 1, 0, 0, …]
3. Batch encoding
from synrfp import BatchEncoder
rxn_smiles = [
    "CO.O[C@@H]1CCNC1.[C-]#[N+]CC(=O)OC>>[C-]#[N+]CC(=O)N1CC[C@@H](O)C1",
    "CCOC(=O)C(CC)c1cccnc1.Cl.O>>CCC(C(=O)O)c1cccnc1",
]
# Encode two reactions into a 2×1024 array of bits
fps = BatchEncoder.encode(
    rxn_smiles,
    tokenizer="wl",
    radius=1,
    sketch="parity",
    bits=1024,
    seed=42,
    batch_size=2,
)
print(fps.shape) # (2, 1024)
print(fps[0][:16]) # first 16 bits of the first fingerprint
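Binary fingerprints like these are commonly compared with Tanimoto (Jaccard) similarity. The README does not state a recommended metric, so the following is only a minimal sketch: `tanimoto` is a hypothetical helper, not part of SynRFP, and the two short bit lists are made-up stand-ins for rows of `fps`.

```python
from typing import Sequence


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto similarity of two equal-length 0/1 sequences."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    # Two all-zero fingerprints are treated as identical by convention here.
    return both / either if either else 1.0


# Made-up 8-bit fingerprints; real ones would be rows such as fps[0] and fps[1].
fp_a = [0, 1, 1, 0, 1, 0, 0, 1]
fp_b = [0, 1, 0, 0, 1, 1, 0, 1]
similarity = tanimoto(fp_a, fp_b)
print(similarity)
```

With the encoder output above, the same helper could be called as `tanimoto(fps[0], fps[1])`, assuming each row is a sequence of 0/1 values.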
Contributing
License
This project is licensed under the MIT License; see the License file for details.
Acknowledgments
This project has received funding from the European Union's Horizon Europe Doctoral Network programme under the Marie Skłodowska-Curie grant agreement No 101072930 (TACsy -- Training Alliance for Computational).