
Read the full contents of CTAB .rdf files in Python. Captures RXN and MOL records using RDKit and reads additional data fields (including solvents, catalysts and agents).


RDF READER


User Guide

Installation

pip install rdfreader

Basic Usage

from rdfreader import RDFParser

rdf_file_name = "reactions.rdf"

with open(rdf_file_name, "r") as rdf_file:
    # create an RDFParser object; this is a generator that yields Reaction objects
    rdfreader = RDFParser(
        rdf_file,
        except_on_invalid_molecule=False,  # will return None instead of raising an exception if a molecule is invalid
        except_on_invalid_reaction=False,  # will return None instead of raising an exception if a reaction is invalid 
    )

    for rxn in rdfreader:
        if rxn is None:
            continue  # the parser failed to read the reaction, go to the next one

        # rxn is a Reaction object; it has several attributes, including:
        print(rxn.smiles) # reaction SMILES string
        print(rxn.properties) # a dictionary of properties extracted from the RXN record
        
        reactants = rxn.reactants # a list of Molecule objects
        products = rxn.products
        solvents = rxn.solvents 
        catalysts = rxn.catalysts 
 
        # Molecule objects have several attributes, including:
        print(reactants[0].smiles)
        print(reactants[0].properties) # a dictionary of properties extracted from the MOL record (often empty)
        rd_mol = reactants[0].rd_mol  # an RDKit molecule object
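
Because the parser yields None for records it cannot read when the except_on_invalid_* flags are False, it can be useful to count skipped records while collecting results. The following is a minimal sketch, not part of rdfreader: summarize_reactions is a hypothetical helper name, and it only assumes that each yielded item is either None or an object with a smiles attribute, as described above.

```python
from typing import Any, Dict, Iterable, Optional


def summarize_reactions(reactions: Iterable[Optional[Any]]) -> Dict[str, Any]:
    """Collect reaction SMILES and count the records that could not be read.

    `reactions` is any iterable yielding Reaction-like objects (anything with
    a `smiles` attribute) or None for records the parser failed to read.
    This helper is hypothetical and not part of the rdfreader package.
    """
    smiles = []
    skipped = 0
    for rxn in reactions:
        if rxn is None:
            skipped += 1  # the parser returned None for this record
            continue
        smiles.append(rxn.smiles)
    return {"parsed": len(smiles), "skipped": skipped, "smiles": smiles}
```

Inside the with block from the example above, this could be called as summarize_reactions(rdfreader), since the RDFParser object is itself iterable.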

Developer Guide

The project is managed and packaged using poetry.

Installation

git clone https://github.com/deepmatterltd/rdfreader
cd rdfreader
poetry install  # create a virtual environment and install the project dependencies
pre-commit install  # install pre-commit hooks, these mostly manage codestyle

Contributions

Contributions are welcome via the fork and pull request model.

Before you commit, make sure your changes pass the hooks installed by pre-commit. The hooks run automatically on each commit once you have run pre-commit install, and can also be run manually from the terminal with pre-commit run.

Releases

Releases are managed through GitHub releases and a GitHub workflow. The version number in the pyproject.toml file should ideally match the current release, but the release workflow ignores it.

To release a new version:

  • Update the pyproject.toml version number.
  • Push the changes to GitHub and merge to main via a pull request.
  • Use the GitHub website to create a release. Tag the commit to be released with a version number, e.g. v1.2.3. The tag should be in the form v*.*.* and match the version number in the pyproject.toml file.
  • When the release is published, a GitHub workflow will run, build a wheel and publish it to PyPI.
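
Since the release workflow ignores the pyproject.toml version, a mismatch between tag and version is easy to miss. The sketch below shows one way to check it before publishing. Both function names are hypothetical and not part of this project. It assumes tags are "v" followed by three dot-separated integers (as in v1.2.3) and that the version lives under the [tool.poetry] table, which is where Poetry conventionally keeps it.

```python
import re

# Assumed tag shape: "v" followed by three dot-separated integers, e.g. v1.2.3.
TAG_PATTERN = re.compile(r"v(\d+\.\d+\.\d+)")


def tag_matches_version(tag: str, version: str) -> bool:
    """Return True if `tag` looks like v<major>.<minor>.<patch> and equals `version`."""
    match = TAG_PATTERN.fullmatch(tag)
    return match is not None and match.group(1) == version


def read_poetry_version(pyproject_text: str) -> str:
    """Return the version from the [tool.poetry] table of pyproject.toml text.

    Assumes the Poetry layout; requires Python 3.11+ for tomllib.
    """
    import tomllib  # standard library from Python 3.11 onwards

    return tomllib.loads(pyproject_text)["tool"]["poetry"]["version"]
```

For example, tag_matches_version("v1.2.3", "1.2.3") holds, while a tag without the leading "v" or with a different patch number does not.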

Example Data

You can find example data in the test/resources directory. spresi-100.rdf contains 100 example records from SPRESI.


