Reconstruct metabolic bipartite graph using KEGG
kegg2bipartitegraph
kegg2bipartitegraph is a Python package to create KEGG graphs. The main idea of this package is to create metabolic graphs from the KEGG database, following the approach used in the article by Weber Zendrera et al. (2021). In this article, the authors create metabolic networks from the organisms of KEGG (accessible in this GitHub repository). Using annotations (from EsMeCaTa, eggnog-mapper or KofamKOALA) or a KEGG organism ID, kegg2bipartitegraph maps the EC numbers to KEGG reactions and reconstructs the metabolic networks associated with the organism, following the proposal of this article.
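The EC mapping step described above can be illustrated with a minimal sketch. It is not the package's implementation: the table `EC_TO_REACTIONS`, the function `map_ecs_to_reactions` and the reaction identifiers (`RXN_A`, etc.) are all hypothetical placeholders, and the real mapping is read from the reference data shipped with the package.

```python
# Hypothetical EC number -> reaction table. The reaction identifiers are
# placeholders for illustration, not real KEGG reaction IDs.
EC_TO_REACTIONS = {
    "1.1.1.1": {"RXN_A", "RXN_B"},
    "2.7.1.1": {"RXN_C"},
}


def map_ecs_to_reactions(annotated_ecs, ec_to_reactions):
    """Collect the reactions linked to the annotated EC numbers.

    Returns the set of matched reactions and the list of EC numbers
    that have no entry in the mapping table.
    """
    reactions = set()
    unmapped = []
    for ec in annotated_ecs:
        hits = ec_to_reactions.get(ec)
        if hits:
            reactions.update(hits)
        else:
            unmapped.append(ec)
    return reactions, unmapped


# One annotated EC number is known, the other one is absent from the table.
reactions, unmapped = map_ecs_to_reactions(["1.1.1.1", "9.9.9.9"], EC_TO_REACTIONS)
print(sorted(reactions), unmapped)
```

The reactions collected this way would then be the ones kept from the universal graph to form the draft network of the organism.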
Installation
This package can be installed from a local clone of the repository (in editable mode) with:
pip install -e .
If it becomes mature enough, a pip package could be created.
Usage
The package is divided into several parts:
- kegg2bipartitegraph reference: an optional subcommand that creates the reference data (especially the universal reference metabolic graphs). By default, these data are precomputed and available within the package, in kegg2bipartitegraph/data/kegg_model.
- subcommands to reconstruct metabolic graphs from different inputs:
  - kegg2bipartitegraph reconstruct_from_esmecata: takes as input the annotation output folder from EsMeCaTa and reconstructs the metabolic networks associated with each taxon.
  - kegg2bipartitegraph reconstruct_from_eggnog: takes as input the annotation output file from eggnog-mapper to map the EC numbers to KEGG reactions.
  - kegg2bipartitegraph reconstruct_from_kofamkoala: takes as input the result from KofamKOALA.
  - kegg2bipartitegraph reconstruct_from_organism: takes as input an organism ID from KEGG (such as hsa for human or eco for Escherichia coli). You can find the list of the accessible organisms on the KEGG website.
Online / Offline requirements
Multiple subcommands can be used to reconstruct draft networks. Some of them require an internet connection to work; the following table shows which ones:
Subcommands | Online | Offline |
---|---|---|
reconstruct_from_esmecata | (Mapping of KOs) | X (without mapping KOs) |
reconstruct_from_eggnog | X | |
reconstruct_from_kofamkoala | X | |
reconstruct_from_organism | X | |
reference | X | |
Reference model
The kegg2bipartitegraph reference subcommand is to be used only if you want to update the KEGG reference data. First, delete the data contained in kegg2bipartitegraph/data/kegg_model, then use this command to download all the required data. This step takes a long time, so it is advised not to run it unless you need to.
It will create 4 files:
- kegg_model.sbml: a universal graph containing most of the reactions of the KEGG database. As in the graph made by Weber Zendrera et al. (2021), 14 cofactors have been removed (H2O, ATP, ADP, NAD+, NADH, NADP+, NADPH, CO2, ammonia, sulfate, thioredoxin, phosphate, pyrophosphate (PPi), and H+). The stoichiometry is also simplified, as these metabolic networks are created for topological analysis. They are therefore not meant to be used with other methods (such as Constraint-Based Modelling).
- several mapping files to go from annotation (especially EC numbers) to KEGG reactions: kegg_compound_name.tsv, kegg_mapping.tsv and kegg_pathways.tsv.
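As an illustration of the cofactor removal and simplified stoichiometry described above, here is a minimal sketch in plain Python. The cofactor labels are the names given in the text, not KEGG compound IDs. The toy reactions, the function `bipartite_edges` and the choice of directed edges (substrate to reaction, reaction to product) are assumptions made for this example, not the package's actual code.

```python
# The 14 cofactors listed above, written with the names used in the text
# (the package itself presumably works with KEGG compound identifiers).
COFACTORS = {
    "H2O", "ATP", "ADP", "NAD+", "NADH", "NADP+", "NADPH", "CO2",
    "ammonia", "sulfate", "thioredoxin", "phosphate", "pyrophosphate", "H+",
}

# Hypothetical toy reactions: name -> (substrates, products).
# Stoichiometric coefficients are deliberately left out.
TOY_REACTIONS = {
    "rxn_1": (["glucose", "ATP"], ["glucose-6-phosphate", "ADP"]),
    "rxn_2": (["glucose-6-phosphate"], ["fructose-6-phosphate"]),
}


def bipartite_edges(reactions, cofactors):
    """Build the edge list of a compound/reaction bipartite graph.

    Every edge links a compound node and a reaction node; compounds
    listed as cofactors are skipped.
    """
    edges = []
    for rxn, (substrates, products) in reactions.items():
        for compound in substrates:
            if compound not in cofactors:
                edges.append((compound, rxn))
        for compound in products:
            if compound not in cofactors:
                edges.append((rxn, compound))
    return edges


edges = bipartite_edges(TOY_REACTIONS, COFACTORS)
for source, target in edges:
    print(source, "->", target)
```

Removing highly connected cofactors such as ATP avoids linking otherwise unrelated reactions through them, which matters for topological analysis.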
Output files of the other subcommands
The other subcommands reconstruct draft metabolic networks by mapping the annotation to the metabolic graphs contained in kegg2bipartitegraph.
They then create multiple files:
- an SBML file containing the metabolic network, which can be used with topological analysis methods (such as MeneTools, MiSCoTo or Metage2Metabo).
- a graphml file containing the metabolic network as a bipartite graph. At this moment, it is not used, but I am currently adapting the scope method of Weber Zendrera et al. (2021) to automate its use with this package.
- tsv files indicating the pathways/modules contained in the metabolic networks, their completeness ratio and the associated reactions.
- a tsv file showing KO information if the option has been used.
- several statistics/metadata/log files.
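The completeness ratio reported in the pathway/module tsv files can be pictured with the sketch below. It assumes that the ratio is the number of pathway reactions found in the network divided by the total number of reactions of the pathway. This definition, the function `pathway_completeness` and the reaction identifiers are assumptions made for illustration; the package may compute the ratio differently.

```python
def pathway_completeness(network_reactions, pathway_reactions):
    """Return (ratio, found) for one pathway.

    ratio is the fraction of the pathway's reactions present in the
    network, found is the sorted list of those reactions.
    """
    pathway = set(pathway_reactions)
    if not pathway:
        # An empty pathway has no reaction to find.
        return 0.0, []
    found = sorted(pathway & set(network_reactions))
    return len(found) / len(pathway), found


# Hypothetical identifiers: the network holds 2 of the 4 pathway reactions.
ratio, found = pathway_completeness(
    {"RXN_A", "RXN_B", "RXN_X"},
    ["RXN_A", "RXN_B", "RXN_C", "RXN_D"],
)
print(ratio, found)
```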
Citation
At this moment, there is no article for kegg2bipartitegraph. If you use it and want to cite it, you can cite this GitHub repository.
Also, please cite the following articles:
- the article by Adèle Weber Zendrera et al. (2021) that proposed this method:
  Weber Zendrera, A., Sokolovska, N. & Soula, H.A. Functional prediction of environmental variables using metabolic networks. Scientific Reports 11, 12192 (2021). https://doi.org/10.1038/s41598-021-91486-8
- the KEGG database:
  Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M., Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes, Nucleic Acids Research, Volume 51, Issue D1, Pages D587–D592 (2023). https://doi.org/10.1093/nar/gkac963
  Kanehisa, M., Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes, Nucleic Acids Research, Volume 28, Issue 1, Pages 27–30 (2000). https://doi.org/10.1093/nar/28.1.27
  Kanehisa, M. Toward understanding the origin and evolution of cellular organisms, Protein Science, Volume 28, Pages 1947–1951 (2019). https://doi.org/10.1002/pro.3715
- bioservices, for the queries on KEGG:
  Cokelaer, T., Pultz, D., Harder, L. M., Serra-Musach, J., Saez-Rodriguez, J. BioServices: a common Python package to access biological Web Services programmatically, Bioinformatics, Volume 29, Issue 24, Pages 3241–3242 (2013). https://doi.org/10.1093/bioinformatics/btt547
- libsbml, for the handling of the SBML:
  Bornstein, B. J., Keating, S. M., Jouraku, A., Hucka, M. LibSBML: an API Library for SBML, Bioinformatics, Volume 24, Issue 6, Pages 880–881 (2008). https://doi.org/10.1093/bioinformatics/btn051
- networkx, for the creation of the graphml:
  Hagberg, A. A., Schult, D. A., Swart, P. J. Exploring Network Structure, Dynamics, and Function using NetworkX, in: Varoquaux, G., Vaught, T., Millman, J. (Eds.), Proceedings of the Python in Science Conference (SciPy 2008), Pages 11–15 (2008). http://conference.scipy.org/proceedings/SciPy2008/paper_2/