Python wrapper for CDK molecular descriptors and fingerprints

License: MIT

CDK Python wrapper

Python wrapper to ease the calculation of CDK molecular descriptors and fingerprints.

Installation

From source:

git clone https://github.com/OlivierBeq/CDK_pywrapper.git
pip install ./CDK_pywrapper

With pip:

pip install CDK-pywrapper

Get started

from CDK_pywrapper import CDK
from rdkit import Chem

smiles_list = [
  # erlotinib
  "n1cnc(c2cc(c(cc12)OCCOC)OCCOC)Nc1cc(ccc1)C#C",
  # midecamycin
  "CCC(=O)O[C@@H]1CC(=O)O[C@@H](C/C=C/C=C/[C@@H]([C@@H](C[C@@H]([C@@H]([C@H]1OC)O[C@H]2[C@@H]([C@H]([C@@H]([C@H](O2)C)O[C@H]3C[C@@]([C@H]([C@@H](O3)C)OC(=O)CC)(C)O)N(C)C)O)CC=O)C)O)C",
  # selenofolate
  "C1=CC(=CC=C1C(=O)NC(CCC(=O)OCC[Se]C#N)C(=O)O)NCC2=CN=C3C(=N2)C(=O)NC(=N3)N",
  # cisplatin
  "N.N.Cl[Pt]Cl"
]
mols = [Chem.AddHs(Chem.MolFromSmiles(smiles)) for smiles in smiles_list]

cdk = CDK()
print(cdk.calculate(mols))

The above calculates 222 molecular descriptors (23 1D and 200 2D).

The additional 65 three-dimensional (3D) descriptors require molecules to have conformers. They may be obtained with the following:

from rdkit.Chem import AllChem

for mol in mols:
    _ = AllChem.EmbedMolecule(mol)

cdk = CDK(ignore_3D=False)
print(cdk.calculate(mols))
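
Embedding can fail for some structures; RDKit's `EmbedMolecule` then returns -1 and the molecule is left without a conformer. As a minimal sketch of a pre-check before requesting 3D descriptors, the helper below separates molecules by whether they carry a conformer. The name `split_by_conformer` is hypothetical and not part of CDK_pywrapper; it relies only on RDKit's `Mol.GetNumConformers()` method.

```python
def split_by_conformer(mols):
    """Split molecules into those with and without at least one conformer.

    Hypothetical helper (not part of CDK_pywrapper). It only assumes that each
    object exposes RDKit's Mol.GetNumConformers() method.
    """
    with_conf, without_conf = [], []
    for mol in mols:
        if mol.GetNumConformers() > 0:
            with_conf.append(mol)
        else:
            without_conf.append(mol)
    return with_conf, without_conf
```

After the embedding loop above, `ready, failed = split_by_conformer(mols)` would let you pass only `ready` to `cdk.calculate` and inspect `failed` separately.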

To obtain molecular fingerprints, one can use the following:

from CDK_pywrapper import CDK, FPType
cdk = CDK(fingerprint=FPType.PubchemFP)
print(cdk.calculate(mols))

The following fingerprints can be calculated:

| FPType | Fingerprint name |
|-----------|------------------|
| FP | CDK fingerprint |
| ExtFP | Extended CDK fingerprint (includes 25 bits for ring features and isotopic masses) |
| EStateFP | Electrotopological state fingerprint (79 bits) |
| GraphFP | CDK fingerprinter ignoring bond orders |
| MACCSFP | Public MACCS fingerprint |
| PubchemFP | PubChem substructure fingerprint |
| SubFP | Fingerprint describing 307 substructures |
| KRFP | Klekota-Roth fingerprint |
| AP2DFP | Atom pair 2D fingerprint as implemented in PaDEL |
| HybridFP | CDK fingerprint ignoring aromaticity |
| LingoFP | LINGO fingerprint |
| SPFP | Fingerprint based on the shortest paths between two atoms |
| SigFP | Signature fingerprint |
| CircFP | Circular fingerprint |

Documentation

class CDK(ignore_3D=True, fingerprint=None, nbits=1024, depth=6):

Constructor of a CDK calculator for molecular descriptors or fingerprints

Parameters:

  • ignore_3D : bool
    Whether to skip the calculation of 3D molecular descriptors (default: True). Ignored if a fingerprint is set.
  • fingerprint : FPType
    Type of fingerprint to calculate (default: None). If None, calculate descriptors.
  • nbits : int
    Number of bits in the fingerprint.
  • depth : int
    Depth of the fingerprint.


def calculate(mols, show_banner=True, njobs=1, chunksize=1000):

Default method to calculate CDK molecular descriptors and fingerprints.

Parameters:

  • mols : Iterable[Chem.Mol]
    RDKit molecule objects for which to obtain CDK descriptors.
  • show_banner : bool
    Displays default notice about CDK.
  • njobs : int
    Maximum number of simultaneous processes.
  • chunksize : int
    Maximum number of molecules each process is in charge of.
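
To make the meaning of `chunksize` concrete, the sketch below splits an iterable into consecutive chunks of at most `chunksize` items, the kind of batches that could then be handed to worker processes. This is an illustration under that assumption, not CDK_pywrapper's actual implementation; the function name `iter_chunks` is hypothetical.

```python
from itertools import islice


def iter_chunks(items, chunksize=1000):
    """Yield consecutive lists of at most `chunksize` items from `items`."""
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


# 2500 placeholder "molecules" with chunksize=1000 give chunks of 1000, 1000 and 500.
sizes = [len(chunk) for chunk in iter_chunks(range(2500), chunksize=1000)]
print(sizes)
```

In this picture, `njobs=2` would mean that at most two such chunks are processed at the same time.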
