Reaction Database for benchmarking

SynRXN

SynRXN is a curated, provenance-tracked collection of reaction datasets and evaluation manifests designed for reproducible benchmarking of reaction-informatics tasks (rebalancing, atom-atom mapping, reaction classification, property prediction, and synthesis/retrosynthesis). It provides standardized splits, manifest files (RNG seeds and split indices), and lightweight utilities to load and inspect datasets for fair, reproducible model comparison.
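To illustrate why storing RNG seeds and split indices makes a benchmark reproducible, here is a minimal sketch using only the Python standard library. The manifest layout and the function name `make_split_manifest` are hypothetical, chosen for illustration; they are not SynRXN's actual manifest format or API.

```python
import json
import random


def make_split_manifest(n_rows, seed, ratio=(8, 1, 1)):
    """Build a hypothetical manifest: the seed plus train/val/test row indices."""
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    total = sum(ratio)
    n_train = n_rows * ratio[0] // total
    n_val = n_rows * ratio[1] // total
    return {
        "seed": seed,
        "n_rows": n_rows,
        "train": sorted(indices[:n_train]),
        "val": sorted(indices[n_train:n_train + n_val]),
        "test": sorted(indices[n_train + n_val:]),
    }


manifest = make_split_manifest(n_rows=20, seed=1)
# Serialising the manifest lets another user reload exactly the same split
# without having to re-run the shuffle.
restored = json.loads(json.dumps(manifest))
print(len(restored["train"]), len(restored["val"]), len(restored["test"]))
```

Because the indices themselves are stored, two users comparing models evaluate on identical rows even if their random number generators behave differently.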

Installation

  1. Python Installation: Ensure that Python 3.11 or later is installed on your system. You can download it from python.org.

  2. Creating a Virtual Environment (Optional but Recommended): It's recommended to use a virtual environment to avoid conflicts with other projects or system-wide packages. Use the following commands to create and activate a virtual environment:

python -m venv synrxn-env
source synrxn-env/bin/activate  

Or, with Conda:

conda create --name synrxn-env python=3.11
conda activate synrxn-env

  3. Install from PyPI: The easiest way to use SynRXN is to install the PyPI package synrxn:

pip install synrxn

Optionally, install the full version with all extras (the quotes keep shells such as zsh from interpreting the brackets):

pip install "synrxn[all]"
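After installing, it can be useful to confirm that the interpreter meets the Python 3.11 requirement and that the package is visible in the active environment. The sketch below uses only the standard library; the helper names `python_is_supported` and `installed_version` are illustrative and are not part of SynRXN.

```python
import sys
from importlib import metadata


def python_is_supported(minimum=(3, 11)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", python_is_supported())
print("synrxn version:", installed_version("synrxn"))
```

If the second line prints `None`, the package is not installed in the environment that is currently active.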

Example

from synrxn.data import DataLoader
from synrxn.split.repeated_kfold import RepeatedKFoldsSplitter

# 1) Zenodo (stable release)
from pathlib import Path
from synrxn.data import DataLoader

dl = DataLoader(
    task="classification",
    source="zenodo",
    version="0.0.6",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
)
print(dl.available_names())   # list available datasets
df = dl.load("schneider_b")
print(len(df), df.columns.tolist())

# 2) GitHub release tag
from pathlib import Path
from synrxn.data import DataLoader

dl = DataLoader(
    task="classification",
    source="github",
    version="v0.0.6",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(len(df))

# 3) GitHub commit (pin to SHA)
from pathlib import Path
from synrxn.data import DataLoader

dl = DataLoader(
    task="classification",
    source="commit",
    version="3e1612e2199e8b0e369fce3ed9aff3dda68e4c32",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(df.head(2))

# 4) GitHub latest
from pathlib import Path
from synrxn.data import DataLoader

dl = DataLoader(
    task="classification",
    source="github",
    version="latest",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(df.shape)

# Simple splitting example (property dataset)
from synrxn.data import DataLoader
from synrxn.split.repeated_kfold import RepeatedKFoldsSplitter
from pathlib import Path

dl = DataLoader(
    task="property",
    source="github",  # "latest" is paired with source="github" in example 4; "commit" expects a SHA
    version="latest",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
df = dl.load("b97xd3")

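# 5 folds x 2 repeats; ratio=(8, 1, 1) appears to set the train/validation/test proportions
splitter = RepeatedKFoldsSplitter(
    n_splits=5, n_repeats=2, ratio=(8, 1, 1), shuffle=True, random_state=1
)

# Prepare the splits without stratification, then retrieve the split at indices (0, 0) as DataFrames
splitter.prepare_splits(df, stratify=None)
train_df, val_df, test_df = splitter.get_split(0, 0, as_frame=True)
print(len(train_df), len(val_df), len(test_df))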
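The numbered loader examples above pair each source with a particular style of version string: a bare release number for Zenodo, a `v`-prefixed tag or `latest` for GitHub, and a full SHA for a commit. The following sketch checks a pair against those conventions before any download is attempted. The patterns are inferred from the examples only, and `looks_consistent` is a hypothetical helper; the library itself may accept other combinations.

```python
import re

# Conventions inferred from the examples above; not taken from the library itself.
_PATTERNS = {
    "zenodo": re.compile(r"\d+\.\d+\.\d+"),         # e.g. "0.0.6"
    "github": re.compile(r"latest|v\d+\.\d+\.\d+"),  # e.g. "v0.0.6" or "latest"
    "commit": re.compile(r"[0-9a-f]{40}"),           # full 40-character SHA
}


def looks_consistent(source, version):
    """Return True if the version string matches the style used for this source."""
    pattern = _PATTERNS.get(source)
    return pattern is not None and pattern.fullmatch(version) is not None


for source, version in [
    ("zenodo", "0.0.6"),
    ("github", "v0.0.6"),
    ("github", "latest"),
    ("zenodo", "v0.0.6"),  # mismatched on purpose: a tag-style string for Zenodo
]:
    print(source, version, looks_consistent(source, version))
```

Pinning to a Zenodo release or a commit SHA gives the most stable reference for published benchmarks, since `latest` can resolve to different data over time.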

Contributing

Publication

SynRXN: A Benchmarking Framework and Open Data Repository for Computer-Aided Synthesis Planning

License

This project is licensed under the MIT License; see the License file for details.

Acknowledgments

This project has received funding from the European Union's Horizon Europe Doctoral Network programme under the Marie Skłodowska-Curie grant agreement No 101072930 (TACsy -- Training Alliance for Computational)

Download files

Source Distribution

synrxn-0.0.7.tar.gz (63.0 kB)

Uploaded Source

Built Distribution

synrxn-0.0.7-py3-none-any.whl (77.1 kB)

Uploaded Python 3

File details

Details for the file synrxn-0.0.7.tar.gz.

File metadata

  • Download URL: synrxn-0.0.7.tar.gz
  • Upload date:
  • Size: 63.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for synrxn-0.0.7.tar.gz
Algorithm Hash digest
SHA256 8953cd7a542617ab9485934b41b0ccdaff8c612b6cc774e7724c26ce6b1fda28
MD5 88aa4e06eac7f29318bf3ca984bc5967
BLAKE2b-256 d6c45116b822688abc5b0598ba79bb3e46523f17bbe1a267fb585b3be58f0028
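
A downloaded file can be checked against the SHA256 digest listed above. The sketch below uses the standard library's `hashlib`; the helper names `sha256_of` and `matches` are illustrative and not part of any SynRXN or PyPI tooling.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the expected hex string, ignoring case."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

For example, after downloading the source distribution into the current directory, `matches("synrxn-0.0.7.tar.gz", "8953cd7a542617ab9485934b41b0ccdaff8c612b6cc774e7724c26ce6b1fda28")` should return `True` if the file is intact.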

Provenance

The following attestation bundles were made for synrxn-0.0.7.tar.gz:

Publisher: publish-package.yml on TieuLongPhan/SynRXN

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file synrxn-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: synrxn-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 77.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for synrxn-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 b225434ef89356e2e498f0731d37edee8c4e45b1e66ab04322f178401963751a
MD5 1a1b807f32cd85cdd53e2d860b62ed91
BLAKE2b-256 d80461c4922982deb6d598e0e9eae512b4b2050051a06e22b9ad3a584b30acf8

Provenance

The following attestation bundles were made for synrxn-0.0.7-py3-none-any.whl:

Publisher: publish-package.yml on TieuLongPhan/SynRXN

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
