A Python library that automatically annotates metabolomics data with proteomics/metagenomics metadata using existing databases.

ChemMap

ChemMap is a Python library that tries to bridge the gap from metabolomics to proteomics using existing databases.

Table of Contents
ChemMap in a Nutshell
How to Download
How to Use

ChemMap in a Nutshell

A sketch of the main method of ChemMap is shown in the following diagram.

[Figure: Schema showing the workflow of ChemMap (app_schema.png)]

The main entry point of ChemMap, the method map_smiles_to_proteins, accepts a SMILES string or a list of them. In the first phase, it tries to retrieve the PubChem and ChEBI chemical identifiers of each molecule through the PUG REST API. If search_method is set to "expand_all" or "expand_pubchem", ChemMap then looks for structurally similar molecules using the PUG REST fastsimilarity_2d endpoint, which uses Tanimoto similarity scores. Note that ChEBI identifiers can only be extracted at this stage if they are reported on PubChem, which might not be the case for newly reported ChEBI substances.
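The two PUG REST requests described above can be sketched as plain URL builders. This is a minimal illustration and not ChemMap's internal code: the helper names (cid_lookup_url, similarity_url, tanimoto) and the default threshold are assumptions made for this example, and no request is actually sent. The tanimoto helper shows how the score is computed on toy sets of fingerprint bits.

```python
from urllib.parse import quote

# Base URL of PubChem's PUG REST service.
PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(smiles: str) -> str:
    """URL that asks PubChem for the CIDs matching a SMILES string."""
    # Percent-encode characters such as '(', ')' and '=' so they survive in a URL path.
    return f"{PUG_BASE}/compound/smiles/{quote(smiles, safe='')}/cids/JSON"


def similarity_url(smiles: str, threshold: int = 90, max_records: int = 10) -> str:
    """URL for a 2D similarity search; threshold is the minimum Tanimoto score in percent."""
    return (
        f"{PUG_BASE}/compound/fastsimilarity_2d/smiles/{quote(smiles, safe='')}/cids/JSON"
        f"?Threshold={threshold}&MaxRecords={max_records}"
    )


def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto score of two fingerprints given as sets of 'on' bit positions."""
    union = bits_a | bits_b
    return len(bits_a & bits_b) / len(union) if union else 0.0


aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
print(cid_lookup_url(aspirin))
print(similarity_url(aspirin, threshold=95))
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 2 shared bits out of 4 distinct bits
```

SMILES strings that contain "/" or "\" (stereo bonds) clash with URL path syntax even when encoded, so in practice they may need to be sent in the request body instead.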

In the second phase, if search_method is set to "expand_all" or "expand_chebi", the workflow uses libChEBIpy to find substances related to the ones already found through one of the following relationships: is_conjugate_base_of, is_conjugate_acid_of, is_a, is_tautomer_of or is_enantiomer_of.
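The relationship filter of this phase can be illustrated without any database access. The sketch below works on in-memory (source, relation, target) triples with placeholder identifiers; the function name expand_by_relations, the one-hop expansion and the choice to follow only outgoing relations are assumptions of this example, not a description of how ChemMap calls libChEBIpy.

```python
# Relationship types named in the text above.
ALLOWED_RELATIONS = frozenset({
    "is_conjugate_base_of",
    "is_conjugate_acid_of",
    "is_a",
    "is_tautomer_of",
    "is_enantiomer_of",
})


def expand_by_relations(seed_ids, relations):
    """Return the seed IDs plus every ID one allowed relation away from a seed."""
    seeds = set(seed_ids)
    expanded = set(seeds)
    for source, relation_type, target in relations:
        # Skip relation types outside the allowed list (e.g. has_role).
        if relation_type not in ALLOWED_RELATIONS:
            continue
        # Only follow relations that start at one of the seed compounds.
        if source in seeds:
            expanded.add(target)
    return sorted(expanded)


# Toy triples with placeholder identifiers (not real ChEBI IDs).
toy_relations = [
    ("CHEBI:A", "is_conjugate_acid_of", "CHEBI:B"),
    ("CHEBI:A", "has_role", "CHEBI:C"),
    ("CHEBI:A", "is_a", "CHEBI:D"),
    ("CHEBI:X", "is_tautomer_of", "CHEBI:Y"),
]

print(expand_by_relations(["CHEBI:A"], toy_relations))
# ['CHEBI:A', 'CHEBI:B', 'CHEBI:D']
```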

In the last step, the ChEBI identifiers are used to search for Rhea reactions in which the compound appears as a substrate. If one is found, the EC number and the UniProt protein identifier are retrieved, if available. In the background, this relies on the UniProt SPARQL endpoint and on the fact that Rhea and UniProt are synchronized on every UniProt release.
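A query of this kind might look roughly like the sketch below, which only builds the SPARQL text and does not contact any server. The function name build_rhea_query is hypothetical, and the predicates used (rh:side, rh:contains, rh:compound, rh:chebi, up:catalyzedReaction, up:enzymeClass) are written from the publicly documented Rhea and UniProt RDF schemas as an assumption; they should be checked against the current endpoint documentation. ChemMap's actual query is not shown in this description and may differ.

```python
# Public UniProt SPARQL endpoint the query would be sent to.
UNIPROT_SPARQL_ENDPOINT = "https://sparql.uniprot.org/sparql"


def build_rhea_query(chebi_id: str, limit: int = 50) -> str:
    """Build a SPARQL query linking a ChEBI ID to Rhea reactions, EC numbers and proteins."""
    number = chebi_id.split(":")[-1]
    if not number.isdigit():
        raise ValueError(f"Expected an ID like 'CHEBI:12345', got {chebi_id!r}")
    return f"""PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rh: <http://rdf.rhea-db.org/>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>
SELECT DISTINCT ?rhea ?ec ?protein
WHERE {{
  # Reactions in Rhea that have the compound on one of their sides.
  SERVICE <https://sparql.rhea-db.org/sparql> {{
    ?rhea rdfs:subClassOf rh:Reaction ;
          rh:side/rh:contains/rh:compound/rh:chebi CHEBI:{number} .
  }}
  # UniProt entries annotated with that reaction, plus the EC number if present.
  ?protein up:annotation/up:catalyticActivity ?activity .
  ?activity up:catalyzedReaction ?rhea .
  OPTIONAL {{ ?activity up:enzymeClass ?ec . }}
}}
LIMIT {limit}"""


print(build_rhea_query("CHEBI:15365"))
```

As written, the pattern matches a compound on either side of a reaction, so it does not by itself distinguish substrates from products.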

The output of this process is three dataframes containing, respectively, compound data (from the first and second phases), reaction data (from the last step) and reaction data of similar structures. If the to_tsv parameter is passed to the method, the data is saved to a folder named after the current date and time, up to the second.
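The idea of a per-run output folder named after the timestamp can be sketched with the standard library alone. The exact folder name format used by ChemMap is not stated here, so the "%Y-%m-%d_%H-%M-%S" pattern, the helper names and the file name below are assumptions chosen for illustration.

```python
import csv
from datetime import datetime
from pathlib import Path


def timestamped_dir(root, now=None):
    """Create and return a folder under root named after a date and time, to the second."""
    now = now or datetime.now()
    folder = Path(root) / now.strftime("%Y-%m-%d_%H-%M-%S")
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def write_tsv(path, header, rows):
    """Write a header and rows as a tab-separated file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)


if __name__ == "__main__":
    out_dir = timestamped_dir("chemmap_output")
    # Hypothetical file name and columns; the real tables come from ChemMap.
    write_tsv(out_dir / "compound_data.tsv", ["smiles", "cid"], [])
    print(out_dir)
```

With a pandas dataframe, the equivalent write would be df.to_csv(path, sep="\t", index=False).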

How to Download

This library can be installed through pip:

pip install chem-map

or by direct clone using

git clone git@github.com:anguera5/ChemMap.git

Then create a Python 3.10 environment, with Conda for example:

conda create --name <my-env> python=3.10

Activate it and install the local requirements:

conda activate <my-env>
cd <path_to_ChemMap>/ChemMap
pip install -r requirements.txt

How to Use

A minimal use case looks as follows. Suppose we want to know all the chemical identifiers of Aspirin and the reactions it takes part in. A quick Google search shows that the SMILES for Aspirin is CC(=O)OC1=CC=CC=C1C(=O)O

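from ChemMap.chem_map import ChemMap

# SMILES string for Aspirin
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"
# "expand_all" enables both the PubChem similarity search and the ChEBI relationship expansion
search_method = "expand_all"
cm = ChemMap()
cm.map_smiles_to_proteins(smiles, search_method=search_method)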
