
Reaction fingerprint


SynRFP — Graph-based, mapping-free reaction fingerprints


SynRFP (Synthesis Reaction FingerPrint) is a mapping-free, permutation-invariant framework for representing chemical transformations as fixed-length fingerprints.
It explicitly factorises the fingerprint pipeline into three modular operators:

  • Ψ (extractor) — isomorphism-invariant subgraph/token extraction from each reaction side;
  • Φ (combination) — algebraic reparameterisation producing the signed net change Δ and total counts U per token;
  • 𝒮 (sketcher) — randomized compression of (Δ,U) into a fixed-dimensional fingerprint f ∈ 𝓕.
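
To make the roles of Φ and 𝒮 concrete, here is a minimal, self-contained sketch in plain Python. It is not SynRFP's implementation: the token strings, the function names `combine` and `toy_sketch`, the sign convention Δ = products minus reactants, and the XOR-folding rule are all assumptions chosen for illustration only.

```python
from collections import Counter
import hashlib


def combine(tokens_r, tokens_p):
    """Toy Phi: per-token signed net change (delta) and total count (total).

    Assumption: delta is counted as products minus reactants.
    """
    count_r, count_p = Counter(tokens_r), Counter(tokens_p)
    keys = set(count_r) | set(count_p)
    delta = {t: count_p[t] - count_r[t] for t in keys}
    total = {t: count_p[t] + count_r[t] for t in keys}
    return delta, total


def toy_sketch(delta, bits=64, seed=0):
    """Toy S: XOR-fold every token with an odd net change into `bits` positions."""
    fp = [0] * bits
    for token, d in delta.items():
        if d % 2 != 0:  # Python's % keeps this correct for negative d
            digest = hashlib.blake2b(
                f"{seed}:{token}".encode(), digest_size=8
            ).digest()
            fp[int.from_bytes(digest, "big") % bits] ^= 1
    return fp


# Hypothetical tokens standing in for the extractor's output on each side.
tokens_r = ["CH3", "CH2", "OH"]
tokens_p = ["CH2=", "CH2=", "H2O"]

delta, total = combine(tokens_r, tokens_p)
print(delta["CH2="], total["CH2="])  # 2 2
print(delta["CH3"], total["CH3"])    # -1 1

fp = toy_sketch(delta, bits=64, seed=42)
print(len(fp))  # 64
```

Because the combination step only sees token counts, reordering the tokens on either side leaves (Δ, U) unchanged, which is the permutation invariance the description refers to.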

Figure: SynRFP workflow


⚙️ Installation

# 1) Clone the repository
git clone https://github.com/TieuLongPhan/synrfp.git
cd synrfp

# 2) Install the package (with optional extras)
pip install .                  # core functionality
pip install ".[all]"           # with datasketch and pynauty support (quoted so the shell does not expand the brackets)

Alternatively, install the released package from PyPI:

pip install synrfp
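
After installing, it can be useful to confirm which version is present and whether the optional backends are importable. The sketch below uses only the standard library; the helper names `installed_version` and `optional_backends` are hypothetical and not part of SynRFP, and the import names `datasketch` and `pynauty` are assumed to match the extras mentioned above.

```python
from importlib import metadata, util


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def optional_backends(names=("datasketch", "pynauty")):
    """Map each optional module name to True if it can be imported."""
    return {name: util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("synrfp version:", installed_version("synrfp"))
    print("optional backends:", optional_backends())
```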

🔧 Quick Start

1. Single‐reaction fingerprint

from synrfp.graph.reaction import Reaction
from synrfp import SynRFP
from synrfp.tokenizers.wl import WLTokenizer
from synrfp.sketchers.parity_fold import ParityFold

# Parse RSMI into GraphData
reactant_G, product_G = Reaction.from_rsmi("CCO>>C=C.O")

# Build engine: WL at radius 1 + 1024-bit parity-fold
fp_engine = SynRFP(
    tokenizer=WLTokenizer(),
    radius=1,
    sketch=ParityFold(bits=1024, seed=42),
)

# Compute fingerprint
res = fp_engine.fingerprint(reactant_G, product_G)
print(res)               # SynRFPResult(tokens_R=3 tokens, tokens_P=3 tokens, support=0, sketch_type=bytearray)
bits = res.to_binary()   # [0,1,0,0, …]
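
A common next step is to compare two binary fingerprints. The following is a minimal sketch of Tanimoto (Jaccard) similarity over 0/1 sequences such as the one `to_binary()` is shown returning above. The function name `tanimoto` is a hypothetical helper, not a SynRFP API, and treating two all-zero fingerprints as identical (similarity 1.0) is a convention chosen here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two equal-length 0/1 sequences."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(bits_a, bits_b) if a and b)
    either = sum(1 for a, b in zip(bits_a, bits_b) if a or b)
    # Convention: two empty fingerprints are treated as identical.
    return both / either if either else 1.0


# Toy fingerprints standing in for two `to_binary()` results.
fp1 = [1, 1, 0, 0]
fp2 = [1, 0, 1, 0]
print(tanimoto(fp1, fp1))  # 1.0
print(tanimoto(fp1, fp2))  # one shared bit out of three set bits
```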

2. One‐line wrapper

from synrfp import synrfp

# Generate a 1024-bit binary fingerprint in one call
bits = synrfp(
    "CCO>>C=C.O",
    tokenizer="wl",
    radius=1,
    sketch="parity",
    bits=1024,
    seed=42,
)
print(len(bits), bits[:16])  # e.g. 1024 [0, 1, 0, 0, …]
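
If the bit list needs to be stored or transmitted, packing it into bytes is more compact than keeping a Python list of integers. This is a hedged sketch: `pack_bits` and `unpack_bits` are hypothetical helpers, not SynRFP functions, and they assume the fingerprint length is a multiple of 8 (true for 1024 bits) with the most significant bit first.

```python
def pack_bits(bits):
    """Pack a 0/1 sequence into bytes, most significant bit first."""
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | (1 if b else 0)
        out.append(byte)
    return bytes(out)


def unpack_bits(data):
    """Inverse of pack_bits: expand bytes back into a list of 0/1 integers."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


toy_bits = [0, 1, 0, 0, 0, 0, 0, 1]
packed = pack_bits(toy_bits)
print(packed.hex())                     # 41
print(unpack_bits(packed) == toy_bits)  # True
```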

3. Batch encoding

from synrfp import BatchEncoder

rxn_smiles = [
    "CO.O[C@@H]1CCNC1.[C-]#[N+]CC(=O)OC>>[C-]#[N+]CC(=O)N1CC[C@@H](O)C1",
    "CCOC(=O)C(CC)c1cccnc1.Cl.O>>CCC(C(=O)O)c1cccnc1",
]

# Encode two reactions into a 2×1024 array of bits
fps = BatchEncoder.encode(
    rxn_smiles,
    tokenizer="wl",
    radius=1,
    sketch="parity",
    bits=1024,
    seed=42,
    batch_size=2
)

print(fps.shape)    # (2, 1024)
print(fps[0][:16])  # first 16 bits of the first fingerprint
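
Once a batch of fingerprints is available, a simple use is nearest-neighbour lookup. The sketch below ranks the rows of a batch by Hamming distance to a query fingerprint. It works on any sequence of equal-length 0/1 rows (nested lists, or rows of an array like the `fps` shown above, assuming its rows are iterable); `rank_by_hamming` is a hypothetical helper, not part of SynRFP.

```python
def hamming(bits_a, bits_b):
    """Number of positions at which two equal-length 0/1 sequences differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    return sum(1 for a, b in zip(bits_a, bits_b) if int(a) != int(b))


def rank_by_hamming(query, fingerprints):
    """Return (row_index, distance) pairs sorted from nearest to farthest.

    Ties are broken by row index so the ordering is deterministic.
    """
    scored = [(i, hamming(query, row)) for i, row in enumerate(fingerprints)]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Toy batch standing in for encoded reactions.
toy_fps = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
print(rank_by_hamming([1, 1, 0, 1], toy_fps))  # [(1, 1), (2, 1), (0, 3)]
```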

Contributing

License

This project is licensed under the MIT License; see the License file for details.

Acknowledgments

This project has received funding from the European Union's Horizon Europe Doctoral Network programme under the Marie Skłodowska-Curie grant agreement No 101072930 (TACsy -- Training Alliance for Computational)


