Reaction Database for benchmarking
SynRXN
SynRXN is a curated, provenance-tracked collection of reaction datasets and evaluation manifests designed for reproducible benchmarking of reaction-informatics tasks (rebalancing, atom-atom mapping, reaction classification, property prediction, and synthesis/retrosynthesis). It provides standardized splits, manifest files (RNG seeds and split indices), and lightweight utilities for loading and inspecting datasets, so that models can be compared fairly and reproducibly.
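To illustrate why storing RNG seeds and split indices makes a comparison reproducible, the sketch below builds a small manifest and re-derives a split from it. The manifest layout (`seed`, `train_idx`, `test_idx`) and the helper names are assumptions made for illustration; they are not SynRXN's actual manifest format.

```python
import json
import random


def make_manifest(n_rows: int, test_fraction: float, seed: int) -> dict:
    """Build a hypothetical split manifest: the seed plus explicit indices."""
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    return {
        "seed": seed,
        "test_idx": sorted(indices[:n_test]),
        "train_idx": sorted(indices[n_test:]),
    }


def apply_manifest(rows: list, manifest: dict) -> tuple[list, list]:
    """Select train/test rows using only the stored indices."""
    train = [rows[i] for i in manifest["train_idx"]]
    test = [rows[i] for i in manifest["test_idx"]]
    return train, test


manifest = make_manifest(n_rows=10, test_fraction=0.2, seed=42)
# Round-trip through JSON, as a manifest file stored on disk would be.
restored = json.loads(json.dumps(manifest))
rows = [f"rxn_{i}" for i in range(10)]
train_rows, test_rows = apply_manifest(rows, restored)
print(len(train_rows), len(test_rows))
```

Because the indices are stored explicitly, two users obtain the same split even if their shuffling code differs; the seed is kept only as provenance.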
Installation
- Python Installation: Ensure that Python 3.11 or later is installed on your system. You can download it from python.org.
- Creating a Virtual Environment (Optional but Recommended): A virtual environment avoids conflicts with other projects or system-wide packages. Use the following commands to create and activate one:
python -m venv synrxn-env
source synrxn-env/bin/activate
Or, with Conda:
conda create --name synrxn-env python=3.11
conda activate synrxn-env
- Install from PyPI: The easiest way to use SynRXN is to install the PyPI package synrxn.
pip install synrxn
Optionally, install the full version with all extras:
pip install "synrxn[all]"
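After installing, it can be useful to confirm that the interpreter meets the stated Python 3.11 minimum and that the package is visible to it. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are illustrative and not part of SynRXN.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 11)  # minimum stated in the installation notes


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the interpreter version meets the stated minimum."""
    return tuple(version_info[:2]) >= minimum


def installed_version(package: str = "synrxn"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", python_is_supported())
print("synrxn version:", installed_version())
```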
Example
from pathlib import Path
from synrxn.data import DataLoader

# 1) Zenodo (stable release)
dl = DataLoader(
    task="classification",
    source="zenodo",
    version="0.0.6",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
)
print(dl.available_names())  # list available datasets
df = dl.load("schneider_b")
print(len(df), df.columns.tolist())
# 2) GitHub release tag
from pathlib import Path
from synrxn.data import DataLoader
dl = DataLoader(
    task="classification",
    source="github",
    version="v0.0.6",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(len(df))
# 3) GitHub commit (pin to SHA)
from pathlib import Path
from synrxn.data import DataLoader
dl = DataLoader(
    task="classification",
    source="commit",
    version="3e1612e2199e8b0e369fce3ed9aff3dda68e4c32",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(df.head(2))
# 4) GitHub latest
from pathlib import Path
from synrxn.data import DataLoader
dl = DataLoader(
    task="classification",
    source="github",
    version="latest",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(df.shape)
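The four loaders above differ only in `source` and `version`: a plain release number for Zenodo, a `v`-prefixed tag or `latest` for GitHub, and a full 40-character SHA for a commit. As a rough aid for telling these apart before passing them on, the sketch below classifies a version string by its shape. `describe_version` is a hypothetical helper written for this illustration; it is not part of SynRXN, and the library may accept other forms.

```python
import re

_SHA_RE = re.compile(r"[0-9a-f]{40}")       # full-length Git commit SHA
_VERSION_RE = re.compile(r"v?\d+(\.\d+)*")  # e.g. 0.0.6 or v0.0.6


def describe_version(version: str) -> str:
    """Classify a version string by its shape (illustrative helper only)."""
    if version == "latest":
        return "latest"
    if _SHA_RE.fullmatch(version):
        return "commit-sha"
    if _VERSION_RE.fullmatch(version):
        return "tag" if version.startswith("v") else "release"
    return "unknown"


for v in ("0.0.6", "v0.0.6", "3e1612e2199e8b0e369fce3ed9aff3dda68e4c32", "latest"):
    print(v, "->", describe_version(v))
```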
# Simple splitting example (property dataset)
from pathlib import Path
from synrxn.data import DataLoader
from synrxn.split.repeated_kfold import RepeatedKFoldsSplitter

dl = DataLoader(
    task="property",
    source="github",
    version="latest",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
df = dl.load("b97xd3")

# 5 folds repeated twice, with a fixed seed for reproducibility
splitter = RepeatedKFoldsSplitter(
    n_splits=5, n_repeats=2, ratio=(8, 1, 1), shuffle=True, random_state=1
)
splitter.prepare_splits(df, stratify=None)

# First repeat, first fold, returned as DataFrames
train_df, val_df, test_df = splitter.get_split(0, 0, as_frame=True)
print(len(train_df), len(val_df), len(test_df))
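For readers unfamiliar with repeated K-fold splitting, the general idea can be sketched with the standard library alone. This is not `RepeatedKFoldsSplitter`'s implementation: the source does not state how `ratio` interacts with `n_splits`, so the sketch takes a separate, hypothetical `val_fraction` parameter and simply carves the validation set out of the non-test rows.

```python
import random


def repeated_kfold_indices(n_rows, n_splits=5, n_repeats=2, val_fraction=0.1, seed=1):
    """Yield (repeat, fold, train, val, test) index lists.

    Each repeat reshuffles the rows; each fold serves once as the test set,
    and a validation set is carved from the remaining rows.
    """
    n_val = int(round(n_rows * val_fraction))
    for repeat in range(n_repeats):
        rng = random.Random(seed + repeat)  # distinct, reproducible shuffle per repeat
        order = list(range(n_rows))
        rng.shuffle(order)
        folds = [order[k::n_splits] for k in range(n_splits)]
        for fold, test in enumerate(folds):
            rest = [i for k, f in enumerate(folds) if k != fold for i in f]
            yield repeat, fold, rest[n_val:], rest[:n_val], test


splits = list(repeated_kfold_indices(n_rows=100))
repeat, fold, train, val, test = splits[0]
print(len(splits), len(train), len(val), len(test))
```

With 100 rows, 5 folds and 2 repeats this yields 10 splits, each covering every row exactly once across train, validation and test.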
Contributing
Publication
SynRXN: A Benchmarking Framework and Open Data Repository for Computer-Aided Synthesis Planning
License
This project is licensed under the MIT License; see the LICENSE file for details.
Acknowledgments
This project has received funding from the European Union's Horizon Europe Doctoral Network programme under the Marie Skłodowska-Curie grant agreement No 101072930 (TACsy -- Training Alliance for Computational).
Download files
File details
Details for the file synrxn-0.0.7.tar.gz.
File metadata
- Download URL: synrxn-0.0.7.tar.gz
- Upload date:
- Size: 63.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8953cd7a542617ab9485934b41b0ccdaff8c612b6cc774e7724c26ce6b1fda28 |
| MD5 | 88aa4e06eac7f29318bf3ca984bc5967 |
| BLAKE2b-256 | d6c45116b822688abc5b0598ba79bb3e46523f17bbe1a267fb585b3be58f0028 |
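A downloaded archive can be checked against the SHA256 digest listed above using Python's standard `hashlib` module. The sketch below is one way to do this; the local file path is an assumption about where the archive was saved, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


EXPECTED_SHA256 = "8953cd7a542617ab9485934b41b0ccdaff8c612b6cc774e7724c26ce6b1fda28"
archive = Path("synrxn-0.0.7.tar.gz")  # assumed download location
if archive.exists():
    print("digest matches:", matches_expected(archive, EXPECTED_SHA256))
else:
    print("archive not found; download it first")
```

Reading in chunks keeps memory use constant regardless of archive size.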
Provenance
The following attestation bundles were made for synrxn-0.0.7.tar.gz:
Publisher: publish-package.yml on TieuLongPhan/SynRXN
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: synrxn-0.0.7.tar.gz
- Subject digest: 8953cd7a542617ab9485934b41b0ccdaff8c612b6cc774e7724c26ce6b1fda28
- Sigstore transparency entry: 685723967
- Sigstore integration time:
- Permalink: TieuLongPhan/SynRXN@2e430d09d24e3597c61d7808ea97efb208a41d2e
- Branch / Tag: refs/tags/v0.0.7
- Owner: https://github.com/TieuLongPhan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-package.yml@2e430d09d24e3597c61d7808ea97efb208a41d2e
- Trigger Event: release
File details
Details for the file synrxn-0.0.7-py3-none-any.whl.
File metadata
- Download URL: synrxn-0.0.7-py3-none-any.whl
- Upload date:
- Size: 77.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b225434ef89356e2e498f0731d37edee8c4e45b1e66ab04322f178401963751a |
| MD5 | 1a1b807f32cd85cdd53e2d860b62ed91 |
| BLAKE2b-256 | d80461c4922982deb6d598e0e9eae512b4b2050051a06e22b9ad3a584b30acf8 |
Provenance
The following attestation bundles were made for synrxn-0.0.7-py3-none-any.whl:
Publisher: publish-package.yml on TieuLongPhan/SynRXN
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: synrxn-0.0.7-py3-none-any.whl
- Subject digest: b225434ef89356e2e498f0731d37edee8c4e45b1e66ab04322f178401963751a
- Sigstore transparency entry: 685723969
- Sigstore integration time:
- Permalink: TieuLongPhan/SynRXN@2e430d09d24e3597c61d7808ea97efb208a41d2e
- Branch / Tag: refs/tags/v0.0.7
- Owner: https://github.com/TieuLongPhan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-package.yml@2e430d09d24e3597c61d7808ea97efb208a41d2e
- Trigger Event: release