
Reaction Database for benchmarking


SynRXN


SynRXN is a curated, provenance-tracked collection of reaction datasets and evaluation manifests designed for reproducible benchmarking of reaction-informatics tasks (rebalancing, atom-atom mapping, reaction classification, property prediction, and synthesis/retrosynthesis). It provides standardized splits, manifest files (RNG seeds and split indices), and lightweight utilities to load and inspect datasets for fair, reproducible model comparison.
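
To make the idea of a split manifest concrete, here is a minimal sketch of how an RNG seed and the resulting split indices could be recorded together. The function name `make_split_manifest` and the manifest layout are hypothetical and chosen for illustration only; they are not SynRXN's actual manifest format.

```python
import json
import random


def make_split_manifest(n_rows, seed, ratio=(8, 1, 1)):
    """Build a dict holding the seed and the split indices it produces.

    Hypothetical layout for illustration; not SynRXN's manifest format.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded, so the order is repeatable
    total = sum(ratio)
    n_train = n_rows * ratio[0] // total
    n_val = n_rows * ratio[1] // total
    return {
        "seed": seed,
        "ratio": list(ratio),
        "train": sorted(indices[:n_train]),
        "val": sorted(indices[n_train:n_train + n_val]),
        "test": sorted(indices[n_train + n_val:]),  # remainder goes to test
    }


manifest = make_split_manifest(n_rows=100, seed=1)
# Round-trip through JSON, as the manifest would be if stored on disk.
restored = json.loads(json.dumps(manifest))
print(len(restored["train"]), len(restored["val"]), len(restored["test"]))
```

Storing the indices alongside the seed means a split can be reused exactly even if the shuffling code changes later, while the seed documents how the indices were generated.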

Installation

  1. Python Installation: Ensure that Python 3.11 or later is installed on your system. You can download it from python.org.

  2. Creating a Virtual Environment (Optional but Recommended): It's recommended to use a virtual environment to avoid conflicts with other projects or system-wide packages. Use the following commands to create and activate a virtual environment:

python -m venv synrxn-env
source synrxn-env/bin/activate  

Or, with Conda:

conda create --name synrxn-env python=3.11
conda activate synrxn-env

  3. Install from PyPI: The easiest way to use SynRXN is to install the PyPI package synrxn.

pip install synrxn

Optionally, install the full version with all extras (the quotes keep shells such as zsh from interpreting the brackets):

pip install "synrxn[all]"
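
After installing, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Reports the version if synrxn is installed in this environment.
print("synrxn:", installed_version("synrxn") or "not installed")
```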

Example

from synrxn.data import DataLoader
from synrxn.split.repeated_kfold import RepeatedKFoldsSplitter

# 1) Zenodo (stable release)
from pathlib import Path
from synrxn.data import DataLoader

dl = DataLoader(
    task="classification",
    source="zenodo",
    version="0.0.6",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
)
print(dl.available_names())   # list available datasets
df = dl.load("schneider_b")
print(len(df), df.columns.tolist())

# 2) GitHub release tag
from pathlib import Path
from synrxn.data import DataLoader

dl = DataLoader(
    task="classification",
    source="github",
    version="v0.0.6",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(len(df))

# 3) GitHub commit (pin to SHA)
from pathlib import Path
from synrxn.data import DataLoader

dl = DataLoader(
    task="classification",
    source="commit",
    version="3e1612e2199e8b0e369fce3ed9aff3dda68e4c32",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(df.head(2))

# 4) GitHub latest
from pathlib import Path
from synrxn.data import DataLoader

dl = DataLoader(
    task="classification",
    source="github",
    version="latest",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
print(dl.available_names())
df = dl.load("schneider_b")
print(df.shape)

# Simple splitting example (property dataset)
from synrxn.data import DataLoader
from synrxn.split.repeated_kfold import RepeatedKFoldsSplitter
from pathlib import Path

dl = DataLoader(
    task="property",
    source="github",
    version="latest",
    cache_dir=Path("~/.cache/synrxn").expanduser(),
    gh_enable=True,
)
df = dl.load("b97xd3")

splitter = RepeatedKFoldsSplitter(
    n_splits=5, n_repeats=2, ratio=(8, 1, 1), shuffle=True, random_state=1
)

splitter.prepare_splits(df, stratify=None)
train_df, val_df, test_df = splitter.get_split(0, 0, as_frame=True)
print(len(train_df), len(val_df), len(test_df))
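
When comparing models on fixed splits, a quick sanity check is that the train, validation, and test parts do not share rows and together cover the dataset. The helper below is a minimal sketch of such a check; `check_partition` is a hypothetical name and not part of SynRXN.

```python
def check_partition(train, val, test, n_total=None):
    """Raise ValueError if the index collections overlap or miss rows.

    Hypothetical helper for illustration; not part of the SynRXN API.
    """
    parts = {"train": set(train), "val": set(val), "test": set(test)}
    names = list(parts)
    # Compare every pair of splits for shared row identifiers.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = parts[a] & parts[b]
            if shared:
                raise ValueError(f"{a} and {b} share {len(shared)} row(s)")
    covered = sum(len(p) for p in parts.values())
    if n_total is not None and covered != n_total:
        raise ValueError(f"splits cover {covered} of {n_total} rows")
    return {name: len(p) for name, p in parts.items()}


sizes = check_partition(range(0, 8), [8], [9], n_total=10)
print(sizes)  # {'train': 8, 'val': 1, 'test': 1}
```

With the frames from the splitting example, one could pass `train_df.index`, `val_df.index`, and `test_df.index`, assuming the splitter preserves the original row index of the loaded DataFrame (an assumption not confirmed here).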

Contributing

Publication

SynRXN: A Benchmarking Framework and Open Data Repository for Computer-Aided Synthesis Planning

License

This project is licensed under the MIT License; see the License file for details.

Acknowledgments

This project has received funding from the European Union's Horizon Europe Doctoral Network programme under the Marie Skłodowska-Curie grant agreement No 101072930 (TACsy -- Training Alliance for Computational).
