py2opsin

Simple Python interface to OPSIN: Open Parser for Systematic IUPAC nomenclature

Installation

py2opsin can be installed with pip install py2opsin. It has zero Python dependencies (OPSIN v2.8.0 is included in the PyPI package) and should work inside any environment running modern Python. Java 8+ is required to run OPSIN.
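Since Java is the one external requirement, it can be worth confirming that a `java` executable is visible before calling py2opsin. The following is a minimal sketch using only the standard library; the helper names `java_available` and `java_version_banner` are hypothetical and not part of py2opsin, and the check only confirms that Java is on the PATH, not that it is version 8 or newer.

```python
import shutil
import subprocess
from typing import Callable, Optional


def java_available(which: Callable[[str], Optional[str]] = shutil.which) -> bool:
    """Return True if a `java` executable can be found on the PATH."""
    return which("java") is not None


def java_version_banner() -> Optional[str]:
    """Return the first line printed by `java -version`, or None if Java is missing."""
    if not java_available():
        return None
    # `java -version` usually writes its banner to stderr rather than stdout.
    completed = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=False
    )
    lines = (completed.stderr or completed.stdout).splitlines()
    return lines[0] if lines else None
```

Calling `java_version_banner()` and printing the result gives a quick way to see which Java installation OPSIN would run on.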

Try a demo of py2opsin live in your browser (no installation required!): Open In Colab

Usage

Command-line arguments available in OPSIN can be passed through to py2opsin:

from py2opsin import py2opsin

>> smiles_string = py2opsin(
    chemical_name = "ethane",
    output_format = "SMILES",
)
>> print(smiles_string)
CC

>> py2opsin(
    chemical_name: str or list of strings,
    output_format: str = "SMILES",
    allow_acid = False,
    allow_radicals = False,
    allow_bad_stereo = False,
    wildcard_radicals = False,
    jar_fpath = "/path/to/opsin.jar",
    tmp_fpath = "py2opsin_temp_input.txt",
)

The result is returned as a Python string, or False if an unexpected error occurs when calling OPSIN. If a list of IUPAC names is provided, a list is returned. Passing a list is strongly recommended whenever you need to resolve more than a couple of names: running OPSIN from Python one name at a time carries a significant performance cost (~5 seconds/molecule individually, milliseconds per molecule in a batch).
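A minimal sketch of the batch pattern described above: all names are resolved in a single call and each name is paired with its result. The helper `names_to_smiles` is hypothetical, and the translator is passed in as an argument so the helper is not tied to any one function. Treating an empty string as "name could not be parsed" is an assumption made for illustration, not documented behavior.

```python
from typing import Callable, Dict, List, Optional


def names_to_smiles(names: List[str], translate: Callable) -> Dict[str, Optional[str]]:
    """Resolve every name with ONE call to `translate` and map name -> result."""
    results = translate(names)
    if results is False:
        # False is the documented return value when calling OPSIN fails.
        return {name: None for name in names}
    # Assumption: an empty entry means that particular name was not resolved.
    return {name: (smiles or None) for name, smiles in zip(names, results)}
```

With py2opsin installed, this would be used as `names_to_smiles(["ethane", "propane"], py2opsin)` after `from py2opsin import py2opsin`.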

Arguments:

  • chemical_name (str or list of str): IUPAC name of the chemical as a Python string, or a list of such strings.
  • output_format (str, optional): One of "SMILES", "CML", "InChI", "StdInChI", or "StdInChIKey". Defaults to "SMILES".
  • allow_acid (bool, optional): Allow interpretation of acids. Defaults to False.
  • allow_radicals (bool, optional): Enable radical interpretation. Defaults to False.
  • allow_bad_stereo (bool, optional): Allow OPSIN to ignore uninterpretable stereochemistry. Defaults to False.
  • wildcard_radicals (bool, optional): Output radicals as wildcards. Defaults to False.
  • jar_fpath (str, optional): Filepath to OPSIN jar file. Defaults to "opsin-cli.jar" which is distributed with py2opsin.
  • tmp_fpath (str, optional): Name for the temporary file used for calling OPSIN. Defaults to "py2opsin_temp_input.txt". When multiprocessing, set this to a unique name for each process.

Tip: OPSIN already parallelizes itself by creating multiple threads! Be wary when combining py2opsin with multiprocessing to avoid spawning too many processes.
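If you do use multiprocessing, each worker needs its own temporary file, as noted for tmp_fpath. The sketch below shows one way to build a unique file name per call; `unique_tmp_fpath` and `translate_chunk` are hypothetical helpers, and combining the process ID with a random suffix is just one reasonable naming scheme.

```python
import os
import uuid


def unique_tmp_fpath(prefix: str = "py2opsin_temp_input") -> str:
    """Build a temp-file name that differs between processes and between calls."""
    return f"{prefix}_{os.getpid()}_{uuid.uuid4().hex}.txt"


def translate_chunk(names):
    """Worker function: translate one chunk of names using its own temp file."""
    # Imported inside the worker so this module can be loaded without py2opsin.
    from py2opsin import py2opsin

    return py2opsin(chemical_name=names, tmp_fpath=unique_tmp_fpath())
```

A function like `translate_chunk` could then be handed to `multiprocessing.Pool.map` with a small number of workers, keeping in mind that OPSIN already uses several threads per call.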

Massive speedup over pubchempy for batch translations

py2opsin runs locally and has a narrower scope than pubchempy, which makes it dramatically faster at resolving identifiers. In the code block below, the call to py2opsin will execute faster than the equivalent calls to pubchempy:

import time

from pubchempy import PubChemHTTPError, get_compounds
from py2opsin import py2opsin

compound_list = [
    "pyridine, 2-amino-",
    "pyridine, 3-iodo-",
...
    "aniline, 2,4,6-trinitro-",
]

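# pubchempy: one HTTP request to PubChem per name
for compound in compound_list:
    result = get_compounds(compound, "name")

# py2opsin: a single local call resolves the whole list
smiles_strings = py2opsin(compound_list)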
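To measure the difference on your own machine, a small timing wrapper is enough. This is a sketch only: `timed` is a hypothetical helper, and no timings are reported here because they depend on your network connection and hardware.

```python
import time
from typing import Any, Callable, Tuple


def timed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call `func` and return (its result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, `smiles_strings, seconds = timed(py2opsin, compound_list)` times the batch call; the pubchempy loop can be wrapped in a function and timed the same way.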

Examples

  • Jeremy Monat's (@bertiewooster) fantastic blog post using py2opsin to help explore the Wiener Index by enabling translation from IUPAC names into molecules directly from the original paper.

Online Documentation

Click here to read the documentation

Citation

Please check the OPSIN repository for the latest citation information, which as of October 2023 suggests citing this paper:

Chemical Name to Structure: OPSIN, an Open Source Solution
Daniel M. Lowe, Peter T. Corbett, Peter Murray-Rust, Robert C. Glen
Journal of Chemical Information and Modeling 2011 51 (3), 739-753

You may also see fit to mention that you used py2opsin to run OPSIN, but py2opsin itself isn't a significant scholarly effort and thus does not have a DOI. Providing a link to this GitHub repository along with the version of py2opsin used is sufficient.
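To record which version of py2opsin you used, the installed version can be read from the package metadata. A minimal sketch using the standard library (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "py2opsin") -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```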

Contributing & Developer Notes

Pull Requests, Bug Reports, and all Contributions are welcome! Please use the appropriate issue or pull request template when making a contribution.

When submitting a PR, please mark it with the "PR Ready for Review" label when you are finished making changes so that the GitHub Actions bots can work their magic!

Developer Install

To contribute to the py2opsin source code, start by cloning the repository (i.e. git clone git@github.com:JacksonBurns/py2opsin.git) and then run pip install -e .[dev] inside the repository. This installs all the dependencies required to run py2opsin and to conform to our formatting standards (black and isort), which you can configure to run automatically in VS Code.

Unit and performance tests can then be executed with pytest.

Note for Windows PowerShell or macOS Catalina or newer: on these systems the command line will complain about the square brackets, so you will need to wrap the install target in double quotes (i.e. pip install -e ".[dev]").

License

OPSIN and py2opsin are both distributed under the MIT license.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
