Read the full contents of CTAB .rdf files in Python. Captures RXN and MOL records using RDKit and reads additional data fields (including solvents, catalysts, and agents).
RDF READER
User Guide
Installation
pip install rdfreader
Basic Usage
from rdfreader import RDFParser

rdf_file_name = "reactions.rdf"

with open(rdf_file_name, "r") as rdf_file:
    # create an RDFParser object; this is a generator that yields Reaction objects
    rdfreader = RDFParser(
        rdf_file,
        except_on_invalid_molecule=False,  # will return None instead of raising an exception if a molecule is invalid
        except_on_invalid_reaction=False,  # will return None instead of raising an exception if a reaction is invalid
    )
    for rxn in rdfreader:
        if rxn is None:
            continue  # the parser failed to read the reaction, go to the next one

        # rxn is a Reaction object; it has several attributes, including:
        print(rxn.smiles)  # reaction SMILES string
        print(rxn.properties)  # a dictionary of properties extracted from the RXN record
        reactants = rxn.reactants  # a list of Molecule objects
        products = rxn.products
        solvents = rxn.solvents
        catalysts = rxn.catalysts

        # Molecule objects have several attributes, including:
        print(reactants[0].smiles)
        print(reactants[0].properties)  # a dictionary of properties extracted from the MOL record (often empty)
        reactants[0].rd_mol  # an RDKit molecule object
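Because the parser yields None for records it cannot read, it can be useful to count failures while collecting results. The following is a minimal sketch, not part of rdfreader: summarize_reactions is a hypothetical helper that accepts any iterable of Reaction-like objects (anything with a smiles attribute) or None values.

```python
from typing import Iterable, Optional


def summarize_reactions(reactions: Iterable[Optional[object]]) -> dict:
    """Tally parsed vs. failed records and collect reaction SMILES.

    `reactions` is any iterable yielding objects with a `.smiles`
    attribute, or None for records the parser could not read.
    This helper is hypothetical and not provided by rdfreader.
    """
    smiles = []
    failed = 0
    for rxn in reactions:
        if rxn is None:
            failed += 1  # the parser returned None for this record
            continue
        smiles.append(rxn.smiles)
    return {"parsed": len(smiles), "failed": failed, "smiles": smiles}


# Intended use with the parser shown above (requires rdfreader and an .rdf file):
#
#     with open("reactions.rdf", "r") as rdf_file:
#         parser = RDFParser(
#             rdf_file,
#             except_on_invalid_molecule=False,
#             except_on_invalid_reaction=False,
#         )
#         summary = summarize_reactions(parser)
```

The summary must be built inside the with block, since the parser reads lazily from the open file.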
Example Data
You can find example data in the test/resources directory. spresi-100.rdf contains 100 example records from SPRESI.
Important Note Regarding File Formats
If your files were saved with Windows-style carriage returns (^M^M, or \r\r), you may encounter issues when running this package.
To correct this, use the following sed command in a Linux terminal to convert double carriage returns to single ones in the affected files:
sed -i 's/\r\r/\r/g' reactions.rdf
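Where sed is not available, the same substitution can be sketched in Python. The function names below are hypothetical and not part of rdfreader; the file is handled as bytes so that Python does not translate line endings on read or write.

```python
from pathlib import Path


def collapse_double_cr(data: bytes) -> bytes:
    """Replace each pair of carriage returns with a single one.

    Mirrors the sed substitution: non-overlapping pairs, left to right.
    """
    return data.replace(b"\r\r", b"\r")


def fix_file(path: str) -> None:
    """Rewrite the file in place with double carriage returns collapsed."""
    p = Path(path)
    p.write_bytes(collapse_double_cr(p.read_bytes()))
```

Calling fix_file("reactions.rdf") overwrites the file, so keep a copy of the original if in doubt.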
Developer Guide
The project is managed and packaged using poetry.
Installation
git clone https://github.com/ChemAILtd/rdfreader.git
cd rdfreader
poetry install # create a virtual environment and install the project dependencies
pre-commit install # install pre-commit hooks, these mostly manage code style
Contributions
Contributions are welcome via the fork and pull request model.
Before you commit changes, ensure they pass the hooks installed by pre-commit. The hooks run automatically on each commit if you have run pre-commit install, but they can also be run manually from the terminal with pre-commit run.
Releases
Releases are managed by GitHub releases and a GitHub workflow. The version number in the pyproject file should ideally be kept in sync with the current release, but it is ignored by the release workflow.
To release a new version:
- Update the pyproject.toml version number.
- Push the changes to GitHub and merge to main via a pull request.
- Use the GitHub website to create a release. Tag the commit to be released with a version number, e.g. v1.2.3. The tag should have the form v*.*.* and match the version number in the pyproject.toml file.
- When the release is published, a GitHub workflow will run, build a wheel and publish it to PyPI.
File details
Details for the file rdfreader-1.0.5.tar.gz.
File metadata
- Download URL: rdfreader-1.0.5.tar.gz
- Upload date:
- Size: 14.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.2.1 CPython/3.12.3 Linux/6.11.0-1018-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 65ed17bc029ddd9322e7a11cddc35c31f15fe9f220d9650abd871456a900fd57 |
| MD5 | b7dcfb6f4856cd2aa54884758623bb40 |
| BLAKE2b-256 | 11ac0f026793b8eca0b12b8452daf12e240d19a9eae9235978415096937201b8 |
File details
Details for the file rdfreader-1.0.5-py3-none-any.whl.
File metadata
- Download URL: rdfreader-1.0.5-py3-none-any.whl
- Upload date:
- Size: 16.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.2.1 CPython/3.12.3 Linux/6.11.0-1018-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 94562162fa995667628f45b81b0bd3afc3eb1e6b34ef6c6dff6d5cb37936475a |
| MD5 | 9647c8a44cf054a47973bdc3f160d3a9 |
| BLAKE2b-256 | 4e71b6c3916b5fbaea6d6828bc52de32025fd7b337612127c6ac1e6ec0cc8804 |
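A downloaded file can be compared against the SHA256 digests listed above using Python's hashlib. This is a minimal sketch; the function names are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For instance, matches_published_hash("rdfreader-1.0.5.tar.gz", "65ed17bc029ddd9322e7a11cddc35c31f15fe9f220d9650abd871456a900fd57") should return True for an intact copy of the source distribution.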