
Generate Chemical Checker signatures from molecule SMILES.


Signaturizer


Bioactivity signatures are multi-dimensional vectors that capture biological traits of a molecule (for example, its target profile) in a numerical format akin to the structural descriptors or fingerprints used in chemoinformatics.

Our signaturizers relate to bioactivities of 25 different types (including target profiles, cellular response and clinical outcomes) and can be used as drop-in replacements for chemical descriptors in day-to-day chemoinformatics tasks.

For an overview of the different bioactivity descriptors available, please check the original Chemical Checker paper or website.

Installation

The only strong dependency for this resource is RDKit, which can be installed in a local conda environment.

Conda environment

conda create --no-default-packages -n sign -y python
conda activate sign
conda install -c conda-forge -y rdkit

from PyPI

pip install signaturizer

from Git repository

pip install git+http://gitlabsbnb.irbbarcelona.org/packages/signaturizer.git
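After installing, it can be useful to confirm that the packages are visible to the interpreter in the active environment. The snippet below is a minimal sketch, not part of the signaturizer package: `check_dependencies` is a hypothetical helper, and it assumes the import names are `rdkit` and `signaturizer`. It only checks that the modules can be located; it does not import them or load any model.

```python
import importlib.util


def check_dependencies(names=("rdkit", "signaturizer")):
    """Return a dict mapping each module name to True if it can be located.

    Hypothetical helper for illustration; the default names are the
    assumed import names of RDKit and signaturizer.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report which of the expected packages are visible in this environment.
status = check_dependencies()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If either package is reported as missing, check that the `sign` conda environment is activated before running pip or Python.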

Intro

Generating Bioactivity Signatures

from signaturizer import Signaturizer
# load the predictor for B1 space (representing the Mode of Action)
sign = Signaturizer('B1')
# prepare a list of SMILES strings
smiles = ['C', 'CCC']
# run prediction
results = sign.predict(smiles)
print(results.signature)
# [[-0.05777782  0.09858645 -0.09854423 ... -0.04505355  0.09859559
#    0.09859559]
#  [ 0.03842233  0.10035036 -0.10023173 ... -0.07104399  0.10035563
#    0.10035574]]
print(results.signature.shape)
# (2, 128)
# or save results as H5 file if you have many molecules
results = sign.predict(smiles, 'destination.h5')
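Because signatures are plain numerical vectors, they can be compared in the same way as structural fingerprints. The following is a minimal sketch of one common choice, cosine similarity, written in pure Python. The `cosine_similarity` function and the short vectors are illustrative placeholders, not part of the signaturizer API and not real signature values; the choice of cosine similarity is an assumption, and other distance measures may suit a given task better.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder 4-dimensional vectors standing in for two rows of
# results.signature (real signatures above have 128 dimensions).
sig_a = [0.1, -0.2, 0.05, 0.3]
sig_b = [0.1, -0.1, 0.0, 0.25]

similarity = cosine_similarity(sig_a, sig_b)
print(f"cosine similarity: {similarity:.3f}")
```

With real predictions, the two rows of the array printed above (`results.signature[0]` and `results.signature[1]`) could be passed in place of the placeholder vectors.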

Usage

For an example application, please check the IPython notebook in the notebook directory (you can download it and run it on Google Colab).

Citing

If you use this resource in the course of your research, please consider citing this paper:

Duran-Frigola M, et al. "Extending the small-molecule similarity principle to all levels of biology with the Chemical Checker." Nature Biotechnology (2020)

You can use this bibtex entry:

@Article{Duran-Frigola2020,
    author={Duran-Frigola, Miquel and Pauls, Eduardo and Guitart-Pla, Oriol and Bertoni, Martino and Alcalde, V{\'i}ctor and Amat, David and Juan-Blanco, Teresa and Aloy, Patrick},
    title={Extending the small-molecule similarity principle to all levels of biology with the Chemical Checker},
    journal={Nature Biotechnology},
    year={2020},
    month={May},
    day={18},
    abstract={Small molecules are usually compared by their chemical structure, but there is no unified analytic framework for representing and comparing their biological activity. We present the Chemical Checker (CC), which provides processed, harmonized and integrated bioactivity data on {\textasciitilde}800,000 small molecules. The CC divides data into five levels of increasing complexity, from the chemical properties of compounds to their clinical outcomes. In between, it includes targets, off-targets, networks and cell-level information, such as omics data, growth inhibition and morphology. Bioactivity data are expressed in a vector format, extending the concept of chemical similarity to similarity between bioactivity signatures. We show how CC signatures can aid drug discovery tasks, including target identification and library characterization. We also demonstrate the discovery of compounds that reverse and mimic biological signatures of disease models and genetic perturbations in cases that could not be addressed using chemical information alone. Overall, the CC signatures facilitate the conversion of bioactivity data to a format that is readily amenable to machine learning methods.},
    issn={1546-1696},
    doi={10.1038/s41587-020-0502-7},
    url={https://doi.org/10.1038/s41587-020-0502-7}
}

