Molecular Complexity Calculations
Implementing a variety of complementary metrics for molecular complexity and synthetic accessibility.
A collaboration with the Sarpong group to understand the complexity of molecules.
Requirements
- numpy, pandas
- rdkit
- openbabel
- mordred
- SYBA (conda install -c lich syba)
Set up conda environment directly using the yml file:
To install the required packages through Conda, use the env.yml file as follows and then activate the environment:
conda env create -f env.yml
conda activate mc
This will set up the environment with molcomplex installed.
For installation by cloning the GitHub folder, perform the following steps:
- Download the zipped folder or clone using:
git clone https://github.com/patonlab/molcomplex.git
- Navigate to the downloaded folder and run:
python setup.py install
This will install molcomplex in the currently active environment.
- Install the necessary dependencies using the following:
conda install -c lich syba
conda install -c conda-forge rdkit
conda install -c conda-forge openbabel
Recommended installation and update guide (work in progress)
In a nutshell, molcomplex and its dependencies are installed/updated as follows:
- Install using conda-forge:
conda install -c conda-forge molcomplex
- Update to the latest version:
pip install molcomplex --upgrade
Usage
To display the available options, type:
python -m molcomplex -h
The molcomplex package can be utilised as follows to obtain a csv with complexity scores:
python -m molcomplex -f examples/test.txt
To write the results to a CSV file, add the --csv flag:
python -m molcomplex -f examples/test.txt --csv
To perform a retro analysis, which breaks bonds in the input SMILES to obtain complexity scores for the precursors, add the --retro option:
python -m molcomplex -f examples/test.txt --csv --retro
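When the command-line calls above need to be driven from a Python script, the same invocations can be assembled programmatically. The following is a minimal sketch and not part of molcomplex: the helper names `build_molcomplex_command` and `run_molcomplex` are hypothetical, and only the `-f`, `--csv` and `--retro` options shown above are assumed to exist.

```python
import subprocess
import sys


def build_molcomplex_command(input_file, csv=False, retro=False):
    """Assemble the `python -m molcomplex` call as an argument list.

    Hypothetical helper: it only uses the -f, --csv and --retro options
    documented in the Usage section.
    """
    command = [sys.executable, "-m", "molcomplex", "-f", input_file]
    if csv:
        command.append("--csv")
    if retro:
        command.append("--retro")
    return command


def run_molcomplex(input_file, csv=False, retro=False):
    """Run molcomplex in a subprocess; raises CalledProcessError on failure."""
    command = build_molcomplex_command(input_file, csv=csv, retro=retro)
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where molcomplex is not installed.
    print(" ".join(build_molcomplex_command("examples/test.txt", csv=True, retro=True)))
```

Using `sys.executable` keeps the call inside the interpreter of the active environment (for example the `mc` conda environment), which is where molcomplex would be installed.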
Web app usage
To run the web app, perform the following steps:
- Navigate to the webapp folder:
cd mcwebapp
- Run the app as follows:
python molcomplexapp.py
- Copy and paste http://127.0.0.1:8050/ (or a similar address) into a web browser to use the app.
Metrics implemented
- Bertz Complexity (CT) Score (JACS 1981, 103, 3241-3243)
- Balaban J Score (Chem. Phys. Lett. 1982, 89, 399-404)
- Coley SCScore (J. Chem. Inf. Model. 2018, 58, 2, 252)
- IPC: Bonchev & Trinajstic's information content of the coefficients of the characteristic polynomial of the adjacency matrix of a hydrogen-suppressed graph of a molecule (J. Chem. Phys. 1977, 67, 4517-4533)
- Ertl SA_Score (J. Cheminform. 2009, 1, 8)
- Boettcher Score (J. Chem. Inf. Model. 2016, 56, 3, 462–470)
- Rücker's total walk count (twc) index: Rücker, G.; Rücker, C. Counts of All Walks as Atomic and Molecular Descriptors. (J. Chem. Inf. Comput. Sci. 1993, 33, 683-695)
- Proudfoot's Cm index based on atom environments: Proudfoot, J. R. A path based approach to assessing molecular complexity. Bioorganic Med. Chem. Lett. 27, 2014–2017 (2017)
- Kappa Shape Indices 1, 2 & 3 (Quant. Struct. Act. Relat. 1986, 5, 1-7)
- McGowan Volume (Chromatographia, 1987, 23, 243-246)
- Labute Approximate Surface Area (Methods Mol Biol 2004, 275, 261-78)
- Van der Waals Volume Atomic and Bond Contributions (J. Org. Chem. 2003, 68, 7368-7373).
- Zagreb Index
- MOE Type Descriptors (Labute ASA, PEOE VSA, SMR VSA, SLogP VSA)
- SYBA Score (J. Cheminformatics 2020, 12, 35)
- Multiple additional 2D metrics.
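Several of the metrics above are topological indices computed on the hydrogen-suppressed molecular graph. As an illustration of the idea, here is a minimal sketch of the first and second Zagreb indices using their textbook definitions (the sum of squared vertex degrees, and the sum over bonds of the product of the two end-atom degrees). It is not molcomplex's own implementation, which may rely on rdkit or mordred; the function name `zagreb_indices` and the edge-list input are assumptions made for this example.

```python
from collections import Counter


def zagreb_indices(bonds):
    """Return (M1, M2) for a hydrogen-suppressed graph.

    `bonds` is a list of (atom_i, atom_j) index pairs, one per bond.
    M1 = sum of squared vertex degrees.
    M2 = sum over bonds of deg(i) * deg(j).
    """
    degree = Counter()
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    m1 = sum(d * d for d in degree.values())
    m2 = sum(degree[i] * degree[j] for i, j in bonds)
    return m1, m2


# Carbon skeletons written by hand as edge lists (no toolkit required).
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (1, 2), (1, 3)]

print(zagreb_indices(n_butane))   # (10, 8)
print(zagreb_indices(isobutane))  # (12, 9)
```

Both indices come out higher for the branched isomer, which shows how degree-based indices of this kind respond to branching.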
To do list
- compare against human metric
- comment out descriptors that are highly correlated (say > 0.9)
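The correlation-based pruning in the second item could be approached roughly as follows. This is only a sketch of one possible greedy strategy, not code from molcomplex: the function names are hypothetical, the descriptor columns are made-up toy numbers rather than real scores, and the 0.9 cut-off is the tentative value from the list above. With pandas available, `DataFrame.corr` would give the same pairwise correlations in a single call.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep descriptors whose |r| with every kept descriptor is <= threshold.

    `columns` maps descriptor name -> list of values (one per molecule).
    Descriptors are visited in insertion order, so earlier ones win ties.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[other])) <= threshold for other in kept):
            kept.append(name)
    return kept


# Toy, made-up values for four hypothetical molecules (not real scores).
toy = {
    "descriptor_a": [1.0, 2.0, 3.0, 4.0],
    "descriptor_b": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with descriptor_a
    "descriptor_c": [4.0, 1.0, 3.0, 2.0],  # r = -0.4 against descriptor_a
}

print(drop_correlated(toy))  # ['descriptor_a', 'descriptor_c']
```

Because the filter is greedy, the result depends on column order; placing the preferred descriptors first is one simple way to control which member of a correlated pair survives.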
Currently broken
- Kier's alpha-modified shape indices
Metrics to implement:
- Bertz’s Ns and Nt index: Bertz, S. H. & Sommer, T. J. Rigorous mathematical approaches to strategic bonds and synthetic analysis based on conceptually simple new complexity indices. Chem. Commun. 16, 2409–2410 (1997).
- Randić's zeta index: Randić, M. & Plavšić, D. Characterization of molecular complexity. Int. J. Quantum Chem. 91, 20–31 (2002).
Two noteworthy substructure-based methods are:
- Barone, R. & Chanon, M. A new and simple approach to chemical complexity. Application to the synthesis of natural products. J. Chem. Inf. Comput. Sci. 41, 269–272 (2001).
- Whitlock, H. W. On the structure of total synthesis of complex natural products. J. Org. Chem. 63, 7982–7989 (1998).