Read the full contents of CTAB .rdf files in Python. Captures RXN and MOL records using RDKit and reads additional data fields (including solvents, catalysts, and agents).

RDF READER

User Guide

Installation

pip install rdfreader

Basic Usage

from rdfreader import RDFParser

rdf_file_name = "reactions.rdf"

with open(rdf_file_name, "r") as rdf_file:

    # create an RDFParser object; it is a generator that yields Reaction objects
    rdfreader = RDFParser(
        rdf_file,
        except_on_invalid_molecule=False,  # will return None instead of raising an exception if a molecule is invalid
        except_on_invalid_reaction=False,  # will return None instead of raising an exception if a reaction is invalid
    )

    for rxn in rdfreader:
        if rxn is None:
            continue # the parser failed to read the reaction, go to the next one

        # rxn is a Reaction object; it has several attributes, including:
        print(rxn.smiles) # reaction SMILES string
        print(rxn.properties) # a dictionary of properties extracted from the RXN record

        reactants = rxn.reactants # a list of Molecule objects
        products = rxn.products
        solvents = rxn.solvents
        catalysts = rxn.catalysts

        # Molecule objects have several attributes, including:
        print(reactants[0].smiles)
        print(reactants[0].properties) # a dictionary of properties extracted from the MOL record (often empty)
        reactants[0].rd_mol # an RDKit molecule object
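
The loop above can be wrapped in a small helper that collects the reaction SMILES and counts the records the parser could not read. This is a minimal sketch that relies only on the `smiles` attribute shown in the example and on the parser yielding `None` for unreadable records. `collect_smiles` is a hypothetical helper name, not part of rdfreader.

```python
from typing import Iterable, List, Optional, Tuple


def collect_smiles(reactions: Iterable[Optional[object]]) -> Tuple[List[str], int]:
    """Return (list of reaction SMILES, number of records that failed to parse).

    `reactions` is any iterable that yields Reaction-like objects exposing a
    `.smiles` attribute, or None for records the parser could not read.
    """
    smiles: List[str] = []
    failed = 0
    for rxn in reactions:
        if rxn is None:
            failed += 1  # the parser returned None for this record
            continue
        smiles.append(rxn.smiles)
    return smiles, failed
```

With the parser from the example, this would be called as `smiles, failed = collect_smiles(rdfreader)` inside the `with` block, while the file is still open.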

Example Data

You can find example data in the test/resources directory. spresi-100.rdf contains 100 example records from SPRESI.

Important Note Regarding File Formats

If your files were saved with doubled Windows-style carriage returns (^M^M, or \r\r), you may encounter issues when running this package.

To correct this, use the following sed command in a Linux terminal to convert double carriage returns to single ones in the affected files:

sed -i 's/\r\r/\r/g' reactions.rdf
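
Where sed is not available, the same substitution can be sketched in Python. The function names below are hypothetical and not part of rdfreader. The file is handled as bytes so that Python's newline translation does not alter the carriage returns.

```python
from pathlib import Path


def collapse_double_cr(data: bytes) -> bytes:
    r"""Replace each doubled carriage return (\r\r) with a single one."""
    return data.replace(b"\r\r", b"\r")


def fix_file(path: str) -> None:
    """Rewrite the file in place, similar to `sed -i`."""
    p = Path(path)
    p.write_bytes(collapse_double_cr(p.read_bytes()))
```

Calling `fix_file("reactions.rdf")` overwrites the file, so keep a copy if the original is needed.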

Developer Guide

The project is managed and packaged using Poetry.

Installation

git clone https://github.com/ChemAILtd/rdfreader.git
cd rdfreader
poetry install  # create a virtual environment and install the project dependencies
pre-commit install  # install pre-commit hooks, these mostly manage code style

Contributions

Contributions are welcome via the fork and pull request model.

Before you commit changes, ensure they pass the hooks installed by pre-commit. The hooks run automatically on each commit if you have run pre-commit install, but you can also run them manually from the terminal with pre-commit run.

Releases

Releases are managed through GitHub releases and a GitHub workflow. The version number in pyproject.toml should ideally match the current release, but the release workflow ignores it.

To release a new version:

  • Update the pyproject.toml version number.
  • Push the changes to GitHub and merge to main via a pull request.
  • Use the GitHub website to create a release. Tag the commit to be released with a version number, e.g. v1.2.3. The tag should have the form v*.*.* and match the version number in the pyproject.toml file.
  • When the release is published, a GitHub workflow runs, builds a wheel, and publishes it to PyPI.
