Hybrid-strict hierarchical reaction classification (LLM-derived taxonomy + deterministic template matching).
ReactionClassifier
Hierarchical reaction classification. Given a reaction SMILES, the classifier predicts a class in an LLM-derived reaction taxonomy and confirms it symbolically. An MLP gate over the Morgan difference–product (MDP) fingerprint proposes a class; the exact retrosynthetic templates in that class's tier-3 subtree are then applied to the reaction; and a label is returned only if a template reproduces the recorded product.
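The propose-then-confirm control flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: `confirm`, `tier3_prefix`, and the `apply_template` callable are hypothetical names, the "tier-3 subtree" is assumed to mean every class whose code shares the first three dotted components of the proposed code, and the sketch assumes the returned label is the class that owns the first template to fire.

```python
from typing import Callable, Iterable, Mapping, Optional


def tier3_prefix(code: str) -> str:
    """First three dotted components of a class code, e.g. '2.1.2.1' -> '2.1.2'."""
    return ".".join(code.split(".")[:3])


def confirm(
    proposed_code: str,
    reactants: str,
    product: str,
    class_to_templates: Mapping[str, Iterable[str]],
    apply_template: Callable[[str, str], Iterable[str]],
) -> Optional[str]:
    """Return a class code only if one of its templates reproduces `product`.

    `apply_template(template, reactants)` is a hypothetical callable that yields
    candidate product SMILES; how the template is actually run is left to it.
    """
    prefix = tier3_prefix(proposed_code)
    for code, templates in class_to_templates.items():
        # Assumption: the subtree is the prefix itself plus all of its descendants.
        if code != prefix and not code.startswith(prefix + "."):
            continue
        for template in templates:
            if product in apply_template(template, reactants):
                return code  # symbolically confirmed
    return None  # nothing fired: no verified label
```

Returning `None` when no template fires mirrors the documented behaviour that a label is withheld unless the recorded product is reproduced.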
Install
pip install reactionclassifier # requires rdkit, torch, numpy
Quickstart
from reactionclassifier import ReactionClassifier
clf = ReactionClassifier() # loads the bundled gate + templates + taxonomy
r = clf.classify("CC(=O)O.NCc1ccccc1>>CC(=O)NCc1ccccc1")
r.reaction_code # '2.1.2.1' (deterministically confirmed)
r.reaction_name # 'Amidation using Carboxylic Acids | Primary Amine + Carboxylic Acid to Secondary Amide'
r.tier_path # ['2.1', '2.1.2', '2.1.2.1']
r.confidence # top-1 softmax probability from the neural gate
classify() returns a ClassificationResult:
| field | meaning |
|---|---|
| reaction_code | the deterministically confirmed class code: a template fired and reproduced the product. None if unconfirmed. |
| reaction_name | pipe-separated level 3/4/5 names of reaction_code (tiers 1-2 omitted). |
| neural_code / neural_name | the neural-gate prediction (always available); use as a fallback when reaction_code is None. Same name format. |
| confidence | neural-gate softmax confidence |
| tier_path | ancestor codes of reaction_code, ending with reaction_code itself |
So: if reaction_code is populated you have a verified label; otherwise
neural_code/neural_name give the model's best (unverified) guess.
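The fallback rule above can be wrapped in a small helper. This is a minimal sketch: `best_label` is a hypothetical function, not part of the package, and it relies only on the field names listed in the table. The `SimpleNamespace` object is a stand-in for a real ClassificationResult so the snippet runs without the package installed.

```python
from types import SimpleNamespace


def best_label(result):
    """Return (code, name, verified) for a classification result.

    Prefers the template-confirmed label; otherwise falls back to the
    neural-gate prediction and flags it as unverified.
    """
    if result.reaction_code is not None:
        return result.reaction_code, result.reaction_name, True
    return result.neural_code, result.neural_name, False


# Stand-in for an unconfirmed result (illustrative values only).
unconfirmed = SimpleNamespace(
    reaction_code=None,
    reaction_name=None,
    neural_code="2.1.2.1",
    neural_name="Amidation using Carboxylic Acids | Primary Amine + Carboxylic Acid to Secondary Amide",
)

code, name, verified = best_label(unconfirmed)
print(code, verified)
```

Carrying the `verified` flag alongside the label keeps downstream code from silently treating a neural guess as a confirmed class.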
Taxonomy and granularity examples
from reactionclassifier import load_taxonomy, name_for, full_class_name, tier_path, load_granularity
load_taxonomy()["1.3.6"] # -> single class name
name_for("2.1.2.1") # -> single-level name
full_class_name("2.1.2.1") # -> pipe-joined L3|L4|L5 names, e.g.
# 'Amidation using Carboxylic Acids | Primary Amine + Carboxylic Acid to Secondary Amide'
tier_path("1.3.6.2") # -> ['1.3', '1.3.6', '1.3.6.2']
load_granularity() # the two granularity-comparison tables
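The tier_path outputs shown above are consistent with taking successive dotted prefixes of the code, starting at two components. The sketch below reproduces those outputs; `tier_path_sketch` is a hypothetical re-implementation for illustration, and the package's own tier_path may handle edge cases differently.

```python
def tier_path_sketch(code: str) -> list[str]:
    """Dotted prefixes of `code` from depth 2 down to the full code."""
    parts = code.split(".")
    return [".".join(parts[:depth]) for depth in range(2, len(parts) + 1)]


print(tier_path_sketch("1.3.6.2"))  # ['1.3', '1.3.6', '1.3.6.2']
print(tier_path_sketch("2.1.2.1"))  # ['2.1', '2.1.2', '2.1.2.1']
```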
What's included
| Component | File |
|---|---|
| MDP-gate MLP (full-data, 6,962 classes) | data/gate/ |
| Exact rr0rp1_ring0 template library | data/class_to_templates.json |
| Full taxonomy (14,060 code→name) | data/taxonomy.json |
| Granularity examples (+ a small illustrative SMIRKS subset) | data/granularity_examples*.json |
Full reaction database
The full labelled reaction database (≈666k reactions) is hosted externally (Zenodo) rather than shipped in the wheel:
from reactionclassifier.database import download_database
path = download_database() # downloads + caches the parquet, returns its path
The released database excludes NameRXN-derived columns (NAME, CLASS);
NameRXN is proprietary and its labels are not redistributed.
Detail
- The MDP fingerprint is a Morgan difference (reactant⊕product bit-unions) concatenated with the product fingerprint (RDKit, radius 2, 2048 bits each; 4096-dim).
- The full generalised-SMIRKS library is not released; only the small subset embedded in the granularity examples is included.
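The combining step of the MDP fingerprint can be illustrated without RDKit by working on sets of on-bit indices. This sketch rests on an assumption about the description above: it reads "reactant⊕product bit-unions" as the XOR between the union of all reactant bits and the union of all product bits. `union_bits` and `mdp_vector` are hypothetical names, and the per-molecule Morgan bits (radius 2, 2048 bits) are assumed to be computed elsewhere.

```python
def union_bits(fingerprints):
    """OR together several fingerprints given as iterables of on-bit indices."""
    out = set()
    for fp in fingerprints:
        out |= set(fp)
    return out


def mdp_vector(reactant_fps, product_fps, n_bits=2048):
    """Build a 2 * n_bits vector: [difference block | product block].

    Assumption: the difference block is the XOR of the reactant-side and
    product-side bit unions, i.e. the bits present on exactly one side.
    """
    reactant_bits = union_bits(reactant_fps)
    product_bits = union_bits(product_fps)
    difference = reactant_bits ^ product_bits

    vec = [0] * (2 * n_bits)
    for bit in difference:
        vec[bit] = 1               # first half: difference bits
    for bit in product_bits:
        vec[n_bits + bit] = 1      # second half: product bits
    return vec


# Toy example with 8-bit fingerprints instead of 2048.
toy = mdp_vector([{1, 2}, {3}], [{2, 3, 5}], n_bits=8)
print(len(toy), sum(toy))
```

With the default n_bits=2048 the result has 4096 entries, matching the dimensionality stated above.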
License
MIT (code and bundled data). See LICENSE.