A standalone module to help generate molecular descriptors from various cheminformatics software
chemdescriptor - Molecular descriptor generator
A generic wrapper around cheminformatics software packages that simplifies generating molecular descriptors.
To install
Type:
pip install chemdescriptor
Requirements
- Pandas
- Working copy of ChemAxon cxcalc
Usage
Currently, only ChemAxon cxcalc is supported, though the module can be extended to cover other generators. Example input files can be found in the examples/ folder of this repo and in the pip-installed package.
Important! The code requires the environment variable CXCALC_PATH to be set to the folder where cxcalc is installed.
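As a quick sanity check before running anything, the sketch below verifies that CXCALC_PATH is set and points to an existing folder. The helper name check_cxcalc_path is hypothetical and not part of chemdescriptor; it only inspects the environment variable described above.

```python
import os


def check_cxcalc_path(environ=os.environ):
    """Return the CXCALC_PATH folder, or raise if it is unset or missing.

    Hypothetical helper for illustration; not part of chemdescriptor.
    """
    path = environ.get("CXCALC_PATH")
    if not path:
        raise EnvironmentError("CXCALC_PATH is not set")
    if not os.path.isdir(path):
        raise EnvironmentError(f"CXCALC_PATH is not a folder: {path}")
    return path
```

Calling check_cxcalc_path() with no arguments reads the real environment; passing a plain dictionary makes the check easy to try out without touching the shell.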
Command Line
chemdescriptor-cx -m /path/to/SMILES/file -d /path/to/descriptor/whitelist/json -p 6.8 7.0 7.2 -o output.csv
usage: chemdescriptor-cx [-h] -m MOLECULE -d DESCRIPTORS -p PH [PH ...]
                         [-c COMMANDS] [-pc PHCOMMANDS] -o OUTPUT

optional arguments:
  -h, --help            show this help message and exit
  -m MOLECULE, --molecule MOLECULE
                        Path to input SMILES file
  -d DESCRIPTORS, --descriptors DESCRIPTORS
                        Path to descriptor white list json file
  -p PH [PH ...], --pH PH [PH ...]
                        List of pH values at which to calculate descriptors
  -c COMMANDS, --commands COMMANDS
                        Optional command stems for descriptors in json format
  -pc PHCOMMANDS, --phcommands PHCOMMANDS
                        Optional command stems for pH dependent descriptors in
                        json format
  -o OUTPUT, --output OUTPUT
                        Path to output file
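When the command line tool is driven from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch that uses only the flags listed in the help text above; the helper build_cli_args and the file names molecules.smi and descriptors.json are assumptions made for illustration.

```python
import shlex


def build_cli_args(molecule, descriptors, ph_values, output,
                   commands=None, phcommands=None):
    """Assemble a chemdescriptor-cx argument list (hypothetical helper)."""
    args = ["chemdescriptor-cx", "-m", molecule, "-d", descriptors, "-p"]
    # Each pH value becomes its own argument after -p.
    args += [str(ph) for ph in ph_values]
    # The command stem files are optional, so add them only when given.
    if commands is not None:
        args += ["-c", commands]
    if phcommands is not None:
        args += ["-pc", phcommands]
    args += ["-o", output]
    return args


args = build_cli_args("molecules.smi", "descriptors.json",
                      [6.8, 7.0, 7.2], "output.csv")
print(shlex.join(args))
```

The resulting list can be handed to subprocess.run(args, check=True) once chemdescriptor is installed and CXCALC_PATH is set.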
In code
Set CXCALC_PATH
import os
os.environ['CXCALC_PATH'] = '/path/to/cxcalc'
Import the generator class
from chemdescriptor import ChemAxonDescriptorGenerator
Instantiate a generator
cag = ChemAxonDescriptorGenerator('/path/to/SMILES/file',
                                  '/path/to/descriptor/whitelist/json',
                                  ph_values=[6, 7, 8],
                                  command_stems=None,
                                  ph_command_stems=None)
Generate csv output
cag.generate('output.csv')
Notes:
The input SMILES file contains one SMILES string per line.
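A minimal sketch of producing such a file from a Python list follows. The file name molecules.smi and the helper write_smiles_file are assumptions for illustration; the three molecules are ordinary SMILES strings for ethanol, benzene, and acetic acid.

```python
def write_smiles_file(path, smiles):
    """Write one SMILES string per line (hypothetical helper)."""
    with open(path, "w") as handle:
        for entry in smiles:
            handle.write(entry + "\n")


# Ethanol, benzene, acetic acid
write_smiles_file("molecules.smi", ["CCO", "c1ccccc1", "CC(=O)O"])
```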
Descriptor whitelist is a json file of the form:
{
    "descriptors": [
        "refractivity",
        "maximalprojectionarea",
        "maximalprojectionradius",
        "maximalprojectionsize",
        "minimalprojectionarea",
        "minimalprojectionradius",
        "minimalprojectionsize"
    ],
    "ph_descriptors": [
        "avgpol",
        "molpol",
        "vanderwaals",
        "asa",
        "asa+",
        "asa-",
        "asa_hydrophobic",
        "asa_polar",
        "hbda_acc",
        "hbda_don",
        "polar_surface_area"
    ]
}
chemdescriptor expects two keys: "descriptors" lists the generic descriptors, and "ph_descriptors" lists the pH-dependent descriptors.
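The whitelist can also be written and checked from Python. The sketch below writes a small whitelist with the two keys and reads it back, raising an error if either key is missing. The file name descriptors.json and the helper load_whitelist are assumptions for illustration and are not part of chemdescriptor.

```python
import json

# A shortened whitelist using descriptor names from the example above.
whitelist = {
    "descriptors": ["refractivity", "minimalprojectionarea"],
    "ph_descriptors": ["avgpol", "polar_surface_area"],
}

with open("descriptors.json", "w") as handle:
    json.dump(whitelist, handle, indent=4)


def load_whitelist(path):
    """Load a whitelist and confirm both keys are present (hypothetical helper)."""
    with open(path) as handle:
        data = json.load(handle)
    missing = {"descriptors", "ph_descriptors"} - set(data)
    if missing:
        raise KeyError(f"whitelist is missing keys: {sorted(missing)}")
    return data


loaded = load_whitelist("descriptors.json")
```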
Two optional dictionaries, "command_stems" and "ph_command_stems", can be passed to the ChemAxonDescriptorGenerator. These dictionaries "translate" the above descriptors into commands that ChemAxon cxcalc can understand.
For example, if no value is passed for ph_command_stems, the following dictionary is used:
_default_ph_command_stems = {
    'avgpol': 'avgpol',
    'molpol': 'molpol',
    'vanderwaals': 'vdwsa',
    'asa': ['molecularsurfacearea', '-t', 'ASA'],
    'asa+': ['molecularsurfacearea', '-t', 'ASA+'],
    'asa-': ['molecularsurfacearea', '-t', 'ASA-'],
    'asa_hydrophobic': ['molecularsurfacearea', '-t', 'ASA_H'],
    'asa_polar': ['molecularsurfacearea', '-t', 'ASA_P'],
    'hbda_acc': 'acceptorcount',
    'hbda_don': 'donorcount',
    'polar_surface_area': 'polarsurfacearea',
}
Note that commands with multiple words are entries in a list. For example, the command
molecularsurfacearea -t ASA
is represented in the dictionary as ['molecularsurfacearea', '-t', 'ASA']
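To illustrate the string-versus-list convention, here is a minimal sketch of how a stem in either form could be turned into a flat list of command arguments. The helper stem_to_args is hypothetical; how chemdescriptor handles stems internally is not shown here and may differ.

```python
def stem_to_args(stem):
    """Return a command stem as a list of arguments (hypothetical helper).

    A single-word stem given as a string becomes a one-element list;
    a multi-word stem given as a list is copied unchanged.
    """
    if isinstance(stem, str):
        return [stem]
    return list(stem)


# Two entries taken from the default dictionary above.
ph_command_stems = {
    "vanderwaals": "vdwsa",
    "asa": ["molecularsurfacearea", "-t", "ASA"],
}

for name, stem in ph_command_stems.items():
    print(name, "->", stem_to_args(stem))
```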
To Do
[ ] Test on different machines
[ ] Get feedback on what needs to be changed/improved
[ ] Expand to cover other descriptor generators