Reaction Database for benchmarking

SynRXN

Reaction Database for Benchmarking

SynRXN is a curated, provenance-tracked collection of reaction datasets and evaluation manifests designed for reproducible benchmarking of reaction-informatics tasks (rebalancing, atom-atom mapping, reaction classification, property prediction, and synthesis/retrosynthesis). It provides standardized splits, manifest files (RNG seeds and split indices), and lightweight utilities to load and inspect datasets for fair, reproducible model comparison.
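To make the idea of a manifest concrete, here is a minimal sketch of how a seed and explicit split indices can pin down a split so that anyone can reproduce it. The function name `make_split_manifest` and the dictionary keys are hypothetical and chosen for illustration; they are not SynRXN's actual manifest schema.

```python
import json
import random


def make_split_manifest(n_rows, seed, ratio=(8, 1, 1)):
    """Build a hypothetical manifest: the seed plus explicit split indices."""
    rng = random.Random(seed)  # seeded RNG, so the shuffle is repeatable
    indices = list(range(n_rows))
    rng.shuffle(indices)

    total = sum(ratio)
    n_train = n_rows * ratio[0] // total
    n_val = n_rows * ratio[1] // total
    return {
        "seed": seed,
        "ratio": list(ratio),
        "train": sorted(indices[:n_train]),
        "val": sorted(indices[n_train:n_train + n_val]),
        "test": sorted(indices[n_train + n_val:]),  # remainder goes to test
    }


manifest = make_split_manifest(n_rows=100, seed=42)

# Round-trip through JSON text, as a manifest file on disk would be.
restored = json.loads(json.dumps(manifest))
print(len(restored["train"]), len(restored["val"]), len(restored["test"]))
```

Storing the indices alongside the seed means a comparison does not depend on every user's RNG implementation producing the same shuffle.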

Installation

  1. Python Installation: Ensure that Python 3.11 or later is installed on your system. You can download it from python.org.

  2. Creating a Virtual Environment (Optional but Recommended): It's recommended to use a virtual environment to avoid conflicts with other projects or system-wide packages. Use the following commands to create and activate a virtual environment:

python -m venv synrxn-env
source synrxn-env/bin/activate  

Or, with Conda:

conda create --name synrxn-env python=3.11
conda activate synrxn-env

  3. Install from PyPI: The easiest way to use SynRXN is to install the PyPI package synrxn.

pip install synrxn

Optionally, install the full version with all extras (the quotes keep shells such as zsh from interpreting the brackets):

pip install "synrxn[all]"
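After installing, a quick way to confirm that the package is visible to the active environment is to query its installed version. This sketch uses only the standard library; the helper name `installed_version` is illustrative and not part of SynRXN.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


synrxn_version = installed_version("synrxn")
print("synrxn:", synrxn_version or "not installed")
```

If this prints "not installed", the most likely cause is that pip ran in a different environment from the one currently activated.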

Example

from synrxn.data import DataLoader
from synrxn.split.repeated_kfold import RepeatedKFoldsSplitter

# create a loader for the 'property' task (GitHub backend; resolve metadata on init)
dl = DataLoader(task="property", source="github", gh_enable=True, resolve_on_init=True)

# list available dataset names (returns a list[str])
print(dl.available_names())
# expected stdout (example):
# ['b97xd3', 'lograte', 'rgd1', 'cycloadd', 'phosphatase', 'sn2',
#  'e2', 'rad6re', 'snar', 'e2sn2', ...]

# load the 'b97xd3' dataset (returns a pandas.DataFrame)
df = dl.load("b97xd3")

splitter = RepeatedKFoldsSplitter(
    n_splits=5,
    n_repeats=5,
    ratio=(8, 1, 1),
    shuffle=True,
    random_state=42,
)

# compute splits (no stratification in this example)
splitter.split(df, stratify_col=None)

# retrieve one specific split (repeat 0, fold 0) as pandas DataFrames
train_df, val_df, test_df = splitter.get_split(repeat=0, fold=0, as_frame=True)

# quick checks
print(type(train_df), type(val_df), type(test_df))     # <class 'pandas.core.frame.DataFrame'> ...
print(len(train_df), len(val_df), len(test_df))        # e.g.  (N_train, N_val, N_test)
print(train_df.columns.tolist())                       # list of column names          
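Beyond the quick checks above, it can be useful to confirm that the three splits are disjoint and together cover the dataset in roughly the requested 8:1:1 proportions. The sketch below is a standalone helper (`check_partition` is a hypothetical name, not a SynRXN function). It assumes the returned frames keep the row labels of the original DataFrame and that every row lands in exactly one split, which this README does not state; if either assumption does not hold, the check will raise.

```python
def check_partition(train_idx, val_idx, test_idx, n_total):
    """Return per-split fractions after checking the splits are disjoint and complete."""
    train, val, test = set(train_idx), set(val_idx), set(test_idx)
    if train & val or train & test or val & test:
        raise ValueError("splits overlap")
    covered = len(train) + len(val) + len(test)
    if covered != n_total:
        raise ValueError(f"splits cover {covered} rows, expected {n_total}")
    return tuple(len(s) / n_total for s in (train, val, test))


# Toy indices standing in for train_df.index, val_df.index, test_df.index.
fractions = check_partition(range(0, 80), range(80, 90), range(90, 100), n_total=100)
print(fractions)  # (0.8, 0.1, 0.1)
```

With the frames from the example, the call would be `check_partition(train_df.index, val_df.index, test_df.index, len(df))`, under the assumptions stated above.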

Contributing

Publication

SynRXN: A Benchmarking Framework and Open Data Repository for Computer-Aided Synthesis Planning

License

This project is licensed under the MIT License; see the LICENSE file for details.

Acknowledgments

This project has received funding from the European Union's Horizon Europe Doctoral Network programme under the Marie Skłodowska-Curie grant agreement No 101072930 (TACsy -- Training Alliance for Computational)
