ChEMBL Structure Pipeline

License: MIT

The ChEMBL protocols used to standardise molecules and strip salts. They were first used in ChEMBL 26.

Check the wiki and paper[1] for a detailed description of the different processes.

Installation

From source:

git clone https://github.com/chembl/ChEMBL_Structure_Pipeline.git
pip install ./ChEMBL_Structure_Pipeline

With pip:

pip install chembl_structure_pipeline

With conda:

conda install -c conda-forge chembl_structure_pipeline
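
After installing by any of the routes above, it can be useful to confirm which version ended up in the environment. The sketch below uses only the standard library's importlib.metadata. The helper name installed_version is hypothetical, and the distribution name is assumed to match the pip package name shown above.

```python
from importlib import metadata


def installed_version(dist_name="chembl_structure_pipeline"):
    """Return the installed version string for dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version())
```

A result of None suggests the install went into a different environment than the one running the interpreter.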

Usage

Standardise a compound

from chembl_structure_pipeline import standardizer

o_molblock = """
  Mrv1810 07121910172D          

  4  3  0  0  0  0            999 V2000
   -2.5038    0.4060    0.0000 C   0  0  3  0  0  0  0  0  0  0  0  0
   -2.5038    1.2310    0.0000 O   0  5  0  0  0  0  0  0  0  0  0  0
   -3.2182   -0.0065    0.0000 N   0  3  0  0  0  0  0  0  0  0  0  0
   -1.7893   -0.0065    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  1  3  1  0  0  0  0
  1  4  1  4  0  0  0
M  CHG  2   2  -1   3   1
M  END
"""

std_molblock = standardizer.standardize_molblock(o_molblock)
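
One way to see what standardisation changed is to compare the formal charges recorded in the input and output molblocks. The following is a minimal sketch that reads the "M  CHG" property lines of a V2000 molblock using only the standard library. The helper name charges_from_molblock is hypothetical, and splitting on whitespace is a simplification of the fixed-width format that is assumed to be adequate for well-formed input such as the example above.

```python
def charges_from_molblock(molblock):
    """Return {atom_index: formal_charge} read from 'M  CHG' lines."""
    charges = {}
    for line in molblock.splitlines():
        if not line.startswith("M  CHG"):
            continue
        fields = line.split()
        count = int(fields[2])  # number of (atom, charge) pairs on this line
        pairs = fields[3:3 + 2 * count]
        for atom, charge in zip(pairs[0::2], pairs[1::2]):
            charges[int(atom)] = int(charge)
    return charges


# The charge line from the input molblock above.
input_charges = charges_from_molblock("M  CHG  2   2  -1   3   1")
print(input_charges)
```

Calling charges_from_molblock on both o_molblock and std_molblock and comparing the two dictionaries shows which charged atoms, if any, were altered.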

Get the parent compound

from chembl_structure_pipeline import standardizer

o_molblock = """
  Mrv1810 07121910262D          

  3  1  0  0  0  0            999 V2000
   -5.2331    1.1053    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.5186    1.5178    0.0000 N   0  3  0  0  0  0  0  0  0  0  0  0
   -2.8647    1.5789    0.0000 Cl  0  5  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
M  CHG  2   2   1   3  -1
M  END
"""

parent_molblock, _ = standardizer.get_parent_molblock(o_molblock)
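
A quick sanity check on the result is to compare atom and bond counts before and after the call. The sketch below reads the counts line of a V2000 molblock (the fourth line, with the counts in fixed-width three-character fields) using only the standard library. The helper name atom_bond_counts is hypothetical, and the embedded molblock is a copy of the input above.

```python
def atom_bond_counts(molblock):
    """Return (n_atoms, n_bonds) from the counts line of a V2000 molblock."""
    lines = molblock.splitlines()
    counts_line = lines[3]  # three header lines precede the counts line
    return int(counts_line[0:3]), int(counts_line[3:6])


example = """
  Mrv1810 07121910262D

  3  1  0  0  0  0            999 V2000
   -5.2331    1.1053    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -4.5186    1.5178    0.0000 N   0  3  0  0  0  0  0  0  0  0  0  0
   -2.8647    1.5789    0.0000 Cl  0  5  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
M  CHG  2   2   1   3  -1
M  END
"""

print(atom_bond_counts(example))
```

Applying the same helper to parent_molblock and comparing the two results indicates whether any fragment was removed.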

Check a compound

The checker assesses the quality of a structure. It highlights specific features or issues in the structure that may need to be revised. Together with a description of each issue, the checker returns a penalty score between 0 and 9 that reflects the seriousness of the issue: the higher the score, the more critical the issue.

from chembl_structure_pipeline import checker

o_molblock = """ 
  Mrv1810 02151908462D           
 
  4  3  0  0  0  0            999 V2000 
    2.2321    4.4196    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0 
    3.0023    4.7153    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0 
    1.4117    4.5059    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0 
    1.9568    3.6420    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0 
  1  2  1  1  0  0  0 
  1  3  1  0  0  0  0 
  1  4  1  0  0  0  0 
M  END 
"""

issues = checker.check_molblock(o_molblock)
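
The returned issues can then be ranked by penalty score. The sketch below assumes that issues is a sequence of (score, description) pairs, which matches the description above but is not spelled out here, so treat the shape as an assumption. The helper name summarise_issues and the sample data are hypothetical and are not real checker output.

```python
def summarise_issues(issues):
    """Return (highest_score, descriptions ordered from most to least severe)."""
    ordered = sorted(issues, key=lambda item: item[0], reverse=True)
    worst = ordered[0][0] if ordered else 0  # 0 when nothing was flagged
    return worst, [description for _, description in ordered]


# Hypothetical sample in the assumed (score, description) shape.
sample_issues = ((2, "example minor issue"), (6, "example serious issue"))
worst, descriptions = summarise_issues(sample_issues)
print(worst, descriptions)
```

In practice, summarise_issues(issues) could be used to flag structures whose highest score exceeds a threshold chosen for the curation task at hand.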

References

[1] Bento, A.P., Hersey, A., Félix, E. et al. An open source chemical structure curation pipeline using RDKit. J Cheminform 12, 51 (2020). https://doi.org/10.1186/s13321-020-00456-1
