A Python library that automatically annotates metabolomics data with proteomics/metagenomics metadata using existing databases.
ChemMap
ChemMap is a Python library that aims to bridge the gap between metabolomics and proteomics using existing databases.
| Table of Contents |
|---|
| ChemMap in a Nutshell |
| How to Download |
| How to Use |
ChemMap in a Nutshell
A sketch of the main method of ChemMap is shown in the following diagram.
| Schema showing the workflow of ChemMap |
The main entry point of ChemMap, the function map_smiles_to_proteins, accepts a single SMILES string or a list of
them. In the first phase, it tries to retrieve the PubChem and ChEBI chemical identifiers of each molecule through the
PUG REST API. If you pass "expand_all" or "expand_pubchem" as the search_method parameter, ChemMap also looks for
structurally similar molecules through the PUG REST fastsimilarity_2d endpoint, which uses Tanimoto similarity scores.
Note that ChEBI identifiers are only found at this stage if they are reported on PubChem, which might not be the case
for newly reported ChEBI substances.
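To make the first phase more concrete, here is a minimal sketch of the two kinds of PUG REST requests described above, plus the Tanimoto score on which the similarity search is based. It is not ChemMap's own code: the function names, the default threshold and the record limit are assumptions chosen for illustration, and the functions only build the request URLs without sending them.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(smiles: str) -> str:
    """Build the PUG REST URL that returns the CIDs matching a SMILES."""
    # Percent-encode the SMILES so characters such as '(' and '=' are URL-safe.
    return f"{PUG_REST}/compound/smiles/{quote(smiles, safe='')}/cids/JSON"


def similarity_url(smiles: str, threshold: int = 90, max_records: int = 10) -> str:
    """Build the fastsimilarity_2d URL for structurally similar compounds.

    threshold is the minimum Tanimoto score in percent (assumed default: 90).
    """
    return (
        f"{PUG_REST}/compound/fastsimilarity_2d/smiles/{quote(smiles, safe='')}"
        f"/cids/JSON?Threshold={threshold}&MaxRecords={max_records}"
    )


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto score of two fingerprints given as sets of 'on' bit positions."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
print(cid_lookup_url(aspirin))
print(similarity_url(aspirin))
print(tanimoto({1, 2, 3}, {2, 3, 4}))
```

The Tanimoto function only illustrates the score itself, computed on toy bit sets. PubChem computes it server-side on its own 2D fingerprints.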
In the second phase, if either "expand_all" or "expand_chebi" was selected as the search_method parameter, the
workflow uses libChEBIpy to find substances that are related to the ones already found through one of the following
relationships: is_conjugate_base_of, is_conjugate_acid_of, is_a, is_tautomer_of or is_enantiomer_of.
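The expansion rule of this phase can be sketched as a simple filter over relationship records. The snippet below is an illustration under stated assumptions, not ChemMap's implementation: the function name, the (relationship type, target identifier) pair format and the ChEBI identifiers are all hypothetical, and in practice the pairs would come from libChEBIpy rather than a hard-coded list.

```python
# Relationship types listed in the description of the second phase.
EXPANSION_TYPES = {
    "is_conjugate_base_of",
    "is_conjugate_acid_of",
    "is_a",
    "is_tautomer_of",
    "is_enantiomer_of",
}


def expand_chebi_ids(seed_id, relations):
    """Return the seed plus every target reached through an accepted relationship.

    relations is an iterable of (relationship_type, target_chebi_id) pairs.
    Order is preserved and duplicates are dropped.
    """
    expanded = [seed_id]
    for rel_type, target in relations:
        if rel_type in EXPANSION_TYPES and target not in expanded:
            expanded.append(target)
    return expanded


# Hypothetical relation records, made up for illustration only.
example_relations = [
    ("is_conjugate_acid_of", "CHEBI:0000002"),
    ("has_role", "CHEBI:0000003"),  # ignored: not an expansion relationship
    ("is_a", "CHEBI:0000004"),
    ("is_a", "CHEBI:0000004"),  # duplicate, kept only once
]
print(expand_chebi_ids("CHEBI:0000001", example_relations))
```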
In the last step, the ChEBI identifiers are used to search for Rhea reactions in which the compound appears as a substrate. If one is found, we retrieve the EC number and the UniProt protein identifier, if available. In the background, we use the UniProt SPARQL endpoint and rely on the fact that Rhea and UniProt are synchronized on every UniProt release.
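As a rough illustration of this last step, the sketch below builds a SPARQL query that links a ChEBI identifier to Rhea reactions, EC numbers and UniProt proteins. It is not the query ChemMap sends: the function name is hypothetical, the predicates are taken from the public Rhea and UniProt RDF vocabularies as an assumption, and the query matches the compound on either side of a reaction rather than as a substrate specifically.

```python
# Assumed query shape; ChemMap's actual query may differ.
QUERY_TEMPLATE = """\
PREFIX rh: <http://rdf.rhea-db.org/>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>

SELECT DISTINCT ?rhea ?ec ?protein
WHERE {{
  SERVICE <https://sparql.rhea-db.org/sparql> {{
    ?rhea rh:side/rh:contains/rh:compound/rh:chebi CHEBI:{chebi_number} .
    OPTIONAL {{ ?rhea rh:ec ?ec . }}
  }}
  OPTIONAL {{
    ?protein up:annotation/up:catalyticActivity/up:catalyzedReaction ?rhea .
  }}
}}
"""


def build_reaction_query(chebi_id: str) -> str:
    """Fill the template for an identifier such as 'CHEBI:15365' or '15365'."""
    number = chebi_id.removeprefix("CHEBI:")
    if not number.isdigit():
        raise ValueError(f"not a ChEBI identifier: {chebi_id!r}")
    return QUERY_TEMPLATE.format(chebi_number=number)


print(build_reaction_query("CHEBI:15365"))
```

The resulting string could then be submitted to the UniProt SPARQL endpoint with any SPARQL client. The EC number and protein are wrapped in OPTIONAL because, as noted above, they are not always available.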
The output of this process is three dataframes containing, respectively, compound data (from the first and second
phases), reaction data (from the last step) and reaction data of similar structures. If the to_tsv parameter is passed
to the method, the data is also saved in a folder named after the current date and time, down to the second.
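The folder-per-run layout described above can be sketched with the standard library alone. This is an assumption-laden illustration rather than ChemMap's code: the function names, the file names and the exact timestamp format are hypothetical, and each table is represented as a list of dictionaries instead of a dataframe to keep the example dependency-free.

```python
import csv
from datetime import datetime
from pathlib import Path


def timestamped_dir(base: Path, now: datetime) -> Path:
    """Folder named after the date and time, with one-second resolution."""
    # The exact format is an assumption; any unambiguous layout would do.
    return base / now.strftime("%Y-%m-%d_%H-%M-%S")


def save_tables(base, tables, now=None):
    """Write each table (a list of row dicts) to <folder>/<name>.tsv."""
    out_dir = timestamped_dir(Path(base), now or datetime.now())
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, rows in tables.items():
        # Column names come from the first row; an empty table gets no columns.
        fieldnames = list(rows[0]) if rows else []
        with open(out_dir / f"{name}.tsv", "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter="\t")
            writer.writeheader()
            writer.writerows(rows)
    return out_dir
```

Because the folder name includes the seconds, successive runs normally end up in separate folders and do not overwrite each other.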
How to Download
This library can be installed through pip:
pip install chemmap
or by cloning the repository directly:
git clone git@github.com:anguera5/ChemMap.git
If you cloned the repository, create a Python 3.10 environment, for example with Conda:
conda create --name <my-env> python=3.10
Then activate it and install the local requirements:
conda activate <my-env>
cd <path_to_ChemMap>/ChemMap
pip install -r requirements.txt
How to Use
A minimal use case looks as follows. Suppose we want to know all the chemical identifiers of Aspirin and the reactions
it takes part in. A quick Google search shows that the SMILES for Aspirin is CC(=O)OC1=CC=CC=C1C(=O)O:
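from ChemMap.chem_map import ChemMap
# SMILES string for Aspirin
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"
# Expand the search through both PubChem similarity and ChEBI relationships
search_method = "expand_all"
cm = ChemMap()
cm.map_smiles_to_proteins(smiles, search_method=search_method)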