
A standalone module to help generate molecular descriptors from various cheminformatics software


chemdescriptor - Molecular descriptor generator

A generic wrapper around various software packages that simplifies generating molecular descriptors

To install

Type:

pip install chemdescriptor

Requirements

  1. Pandas
  2. Working copy of ChemAxon cxcalc

Usage

Currently only ChemAxon cxcalc is supported, though the module can be expanded to cover other generators. Example input files can be found in the examples/ folder of this repo and in the pip-installed package.

Important! The code requires the environment variable CXCALC_PATH to be set to the folder where cxcalc is installed!
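
A minimal sketch of a pre-flight check, assuming only what the note above states: CXCALC_PATH must be set and must point to a folder. The helper name check_cxcalc_path is hypothetical and is not part of chemdescriptor.

```python
import os


def check_cxcalc_path(environ=os.environ):
    """Return the CXCALC_PATH folder, or raise if it is unset or not a folder.

    Hypothetical helper for illustration; not part of chemdescriptor.
    """
    path = environ.get('CXCALC_PATH')
    if not path:
        raise EnvironmentError('CXCALC_PATH is not set')
    if not os.path.isdir(path):
        raise EnvironmentError('CXCALC_PATH is not a folder: ' + path)
    return path
```

Calling check_cxcalc_path() before creating a generator gives a clearer error than a failure deep inside a descriptor run.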

Command Line

chemdescriptor-cx -m /path/to/SMILES/file -d /path/to/descriptor/whitelist/json -p 6.8 7.0 7.2 -o output.csv
usage: chemdescriptor-cx [-h] -m MOLECULE -d DESCRIPTORS -p PH [PH ...]
                         [-c COMMANDS] [-pc PHCOMMANDS] -o OUTPUT

optional arguments:
  -h, --help            show this help message and exit
  -m MOLECULE, --molecule MOLECULE
                        Path to input SMILES file
  -d DESCRIPTORS, --descriptors DESCRIPTORS
                        Path to descriptor white list json file
  -p PH [PH ...], --pH PH [PH ...]
                        List of pH values at which to calculate descriptors
  -c COMMANDS, --commands COMMANDS
                        Optional command stems for descriptors in json format
  -pc PHCOMMANDS, --phcommands PHCOMMANDS
                        Optional command stems for pH dependent descriptors in
                        json format
  -o OUTPUT, --output OUTPUT
                        Path to output file
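
The same command line can be assembled from Python. The following is a sketch that uses only the required flags listed in the usage text above; build_cli_args is a hypothetical helper name, and the paths are the placeholders from the example command.

```python
import subprocess


def build_cli_args(molecule, descriptors, ph_values, output):
    """Assemble a chemdescriptor-cx argument list from the required flags."""
    args = ['chemdescriptor-cx', '-m', molecule, '-d', descriptors, '-p']
    # -p accepts one or more pH values, each passed as a separate argument
    args += [str(ph) for ph in ph_values]
    args += ['-o', output]
    return args


args = build_cli_args('/path/to/SMILES/file',
                      '/path/to/descriptor/whitelist/json',
                      [6.8, 7.0, 7.2],
                      'output.csv')
print(' '.join(args))

# To actually run it (requires chemdescriptor and cxcalc to be installed):
# subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.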

In code

Set CXCALC_PATH

import os
os.environ['CXCALC_PATH'] = '/path/to/cxcalc'

Import the generator class

from chemdescriptor import ChemAxonDescriptorGenerator

Instantiate a generator

cag = ChemAxonDescriptorGenerator('/path/to/SMILES/file',
                                  '/path/to/descriptor/whitelist/json',
                                  ph_values=[6, 7, 8],
                                  command_stems=None,
                                  ph_command_stems=None)

Generate csv output

cag.generate('output.csv')

Notes:

The input SMILES file has one SMILES code per line.
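
A short sketch of producing such a file from a Python list. The helper name write_smiles_file, the file name molecules.smi, and the three example molecules are illustrative assumptions; the only requirement taken from the note above is one SMILES code per line.

```python
import os
import tempfile


def write_smiles_file(path, smiles_list):
    """Write one SMILES code per line to the given path."""
    with open(path, 'w') as handle:
        for code in smiles_list:
            handle.write(code + '\n')


smiles = ['CCO', 'c1ccccc1', 'CC(=O)O']  # ethanol, benzene, acetic acid

# Write to a temporary folder so the example does not touch the working directory
smiles_path = os.path.join(tempfile.mkdtemp(), 'molecules.smi')
write_smiles_file(smiles_path, smiles)
```

The resulting path can then be given as the first argument to ChemAxonDescriptorGenerator or to the -m flag.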

Descriptor whitelist is a json file of the form:

{
    "descriptors": [
        "refractivity",
        "maximalprojectionarea",
        "maximalprojectionradius",
        "maximalprojectionsize",
        "minimalprojectionarea",
        "minimalprojectionradius",
        "minimalprojectionsize"
    ],
    "ph_descriptors": [
        "avgpol",
        "molpol",
        "vanderwaals",
        "asa",
        "asa+",
        "asa-",
        "asa_hydrophobic",
        "asa_polar",
        "hbda_acc",
        "hbda_don",
        "polar_surface_area"
    ]
}

chemdescriptor expects two keys: "descriptors" lists the generic descriptors and "ph_descriptors" lists the pH-dependent descriptors.
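
A whitelist in this shape can also be built and checked from Python before it is written to disk. This is a sketch under the assumption that both keys must be present; validate_whitelist is a hypothetical helper, not a chemdescriptor function.

```python
import json

REQUIRED_KEYS = ('descriptors', 'ph_descriptors')


def validate_whitelist(data):
    """Raise KeyError if either of the two expected keys is missing."""
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise KeyError('whitelist is missing keys: ' + ', '.join(missing))
    return data


whitelist = {
    'descriptors': ['refractivity', 'maximalprojectionarea'],
    'ph_descriptors': ['avgpol', 'molpol'],
}

# Serialize with the same 4-space indentation as the example above
whitelist_json = json.dumps(validate_whitelist(whitelist), indent=4)
```

Writing whitelist_json to a file gives a path suitable for the second constructor argument or the -d flag.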

Two optional dictionaries, "command_stems" and "ph_command_stems", can be passed to ChemAxonDescriptorGenerator. These dictionaries "translate" the above descriptors into commands that ChemAxon cxcalc can understand.

For example, if no value is passed for ph_command_stems, the following dictionary is used:

_default_ph_command_stems = {
    'avgpol': 'avgpol',
    'molpol': 'molpol',
    'vanderwaals': 'vdwsa',
    'asa': ['molecularsurfacearea', '-t', 'ASA'],
    'asa+': ['molecularsurfacearea', '-t', 'ASA+'],
    'asa-': ['molecularsurfacearea', '-t', 'ASA-'],
    'asa_hydrophobic': ['molecularsurfacearea', '-t', 'ASA_H'],
    'asa_polar': ['molecularsurfacearea', '-t', 'ASA_P'],
    'hbda_acc': 'acceptorcount',
    'hbda_don': 'donorcount',
    'polar_surface_area': 'polarsurfacearea',
}

Note that commands with multiple words are entries in a list. For example, the command

molecularsurfacearea -t ASA

is represented in the dictionary as ['molecularsurfacearea', '-t', 'ASA']
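
Because a stem may be either a single string or a list of words, code that consumes these dictionaries has to treat both forms uniformly. The sketch below shows one way to do that; stem_to_args is a hypothetical helper, and how chemdescriptor handles this internally is not shown here.

```python
def stem_to_args(stem):
    """Return a command stem as a list of command-line words."""
    if isinstance(stem, str):
        # Single-word command such as 'avgpol'
        return [stem]
    # Multi-word command such as ['molecularsurfacearea', '-t', 'ASA']
    return list(stem)


# Two entries taken from the default dictionary shown above
custom_ph_command_stems = {
    'avgpol': 'avgpol',
    'asa': ['molecularsurfacearea', '-t', 'ASA'],
}

for name, stem in custom_ph_command_stems.items():
    print(name, stem_to_args(stem))
```

A dictionary like custom_ph_command_stems can be passed as the ph_command_stems argument when creating the generator.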

To Do

[ ] Test on different machines

[ ] Get feedback on what needs to be changed/improved

[ ] Expand to cover other descriptor generators
