The ChemRecon library for integration and exploration of interconnected biochemical databases.

ChemRecon

v. 0.1.5

ChemRecon is a Python library and consolidated meta-database designed to simplify the integration and exploration of biochemical data from a range of sources. It is built from full-database downloads of compounds, reactions, enzymes, molecular structures, and atom-to-atom maps from the following source databases: BiGG, BRENDA, ChEBI, ECMDB, M-CSA, MetaMDB, MetaNetX, and PubChem.

Heterogeneous data formats were standardized, and relationships within and between these databases were reconstructed in a consistent format. The resulting meta-database is freely accessible online and is complemented by a Python library that integrates easily into existing workflows. Together they enable unified querying of entries from all the source databases, as well as discovery and visualization of the relationships between these entries.

[Figure: example entry graph]

ChemRecon was developed at the Algorithmic Cheminformatics Group, Department of Mathematics and Computer Science, University of Southern Denmark.

Paper

If ChemRecon proves useful to your research, you may want to cite the following paper.

  • ChemRecon: a Consolidated Meta-Database Platform for Biochemical Data Integration

    C. A. Eriksen, J. L. Andersen, R. Fagerberg, D. Merkle

    arXiv preprint, submitted to Bioinformatics.

    https://doi.org/10.48550/arXiv.2602.12310

Availability and Installation

ChemRecon is available from the Python Package Index (PyPI) under the name chemrecon. It can be installed using pip:

pip install chemrecon
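
To confirm that the installation succeeded, the installed version can be read from the package metadata. The following is a minimal sketch using only the Python standard library; the helper name installed_version is an illustrative assumption and is not part of ChemRecon.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package itself.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("chemrecon")
if found is None:
    print("chemrecon is not installed in this environment")
else:
    print(f"chemrecon {found} is installed")
```

If the reported version differs from the one you expect, check that pip and the Python interpreter belong to the same environment.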

Visualizing entry graphs requires GraphViz to be installed, with the dot executable (which renders the graphs) available on your system's PATH. See the GraphViz Python package for instructions.
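
Whether the dot executable is reachable can be checked from Python before attempting to draw a graph. This is a small sketch using only the standard library; the function names find_dot and dot_version are illustrative and not part of ChemRecon or the GraphViz Python package.

```python
import shutil
import subprocess
from typing import Optional


def find_dot() -> Optional[str]:
    """Return the full path of the GraphViz `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


def dot_version(dot_path: str) -> str:
    """Return the version banner printed by `dot -V`."""
    # `dot -V` typically writes its banner to stderr, so fall back to stdout only if stderr is empty.
    result = subprocess.run([dot_path, "-V"], capture_output=True, text=True, check=False)
    return (result.stderr or result.stdout).strip()


dot_path = find_dot()
if dot_path is None:
    print("dot was not found on PATH; graph rendering will likely fail")
else:
    print(f"found dot at {dot_path}: {dot_version(dot_path)}")
```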


Documentation

The documentation, including usage instructions, tutorials, and a complete description of the supported entry and relation types, is available on the ChemRecon homepage.

Usage

The following is an example of a typical ChemRecon workflow, producing the graph seen above. For more detailed examples, see the tutorial section of the documentation.

from chemrecon import *

connect_public()

# Perform a database query to find the 'citrate' entry in BiGG.
citrate_entry = find_entry(id_type = C_BIGG, source_id = 'M_cit')

# Define a protocol to find related entries and molecular structures (protocols like this are included)
compound_structure_protocol = ExplorationProtocol(
    relation_types = {CompoundReference, CompoundHasMolStructure, MolStructureStandardization}
)

# Create and expand an entry graph, according to this protocol, by traversing the database.
eg = EntryGraph(initial_entries = {citrate_entry})
explore(eg, compound_structure_protocol, steps = 5)

# Score the molecular structures in the graph according to their 'connectedness'
scorer = Scorer(score_entry_type = MolStructure)
scores = scorer(citrate_entry)  # Result is an OrderedDict

# Draw the graph with these scores, producing the image seen on this page
eg.show(scores = scores)
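
To illustrate the idea behind exploring with a protocol and a step limit, here is a toy sketch in plain Python. It is not ChemRecon's implementation and does not use the ChemRecon API: it assumes that relations are directed (source, type, target) triples, that each step follows one hop along the allowed relation types, and that all entry identifiers shown are placeholders. The relation type ParticipatesInReaction is hypothetical and serves only as an example of a type excluded by the protocol.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# (source entry, relation type, target entry); all identifiers below are placeholders.
Relation = Tuple[str, str, str]


def explore_sketch(
    start: Set[str],
    relations: Iterable[Relation],
    allowed_types: Set[str],
    steps: int,
) -> Set[str]:
    """Breadth-first expansion from `start`, following only allowed relation types."""
    # Index the allowed relations by their source entry.
    outgoing: Dict[str, List[str]] = defaultdict(list)
    for source, rel_type, target in relations:
        if rel_type in allowed_types:
            outgoing[source].append(target)

    visited = set(start)
    frontier = set(start)
    for _ in range(steps):
        next_frontier: Set[str] = set()
        for entry in frontier:
            for target in outgoing.get(entry, []):
                if target not in visited:
                    visited.add(target)
                    next_frontier.add(target)
        if not next_frontier:
            break  # nothing new was reached, so further steps cannot add entries
        frontier = next_frontier
    return visited


toy_relations = [
    ("compound:A", "CompoundReference", "compound:B"),
    ("compound:B", "CompoundHasMolStructure", "structure:S1"),
    ("structure:S1", "MolStructureStandardization", "structure:S2"),
    ("compound:A", "ParticipatesInReaction", "reaction:R1"),  # hypothetical type, not in the protocol
]
protocol_types = {"CompoundReference", "CompoundHasMolStructure", "MolStructureStandardization"}

reached = explore_sketch({"compound:A"}, toy_relations, protocol_types, steps=5)
print(sorted(reached))
```

In this toy graph the reaction entry is never reached, because its relation type is not part of the protocol, and lowering steps limits how far from the initial entry the expansion goes.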

Database

ChemRecon must be connected to a database to function. The easiest option is to connect to the public database, hosted by SDU:

connect_public()

Alternatively, a local instance of the database can be hosted via Docker; instructions are given in the documentation. A local instance has lower latency, which makes queries and entry graph construction faster, and it allows custom data sources to be added.

Source Databases

ChemRecon contains compound, molecular structure, reaction, atom-to-atom map, and enzyme entries from the following databases.

Source    Compound  Structure  Reaction  AAM   Enzyme  Version
BiGG      20428     -          33942     -     5705    1.6
BRENDA    -         -          61129     -     8697    2025_1
ChEBI     224485    330207     -         -     -       2024-05
ECMDB     3760      7517       -         -     -       2.0
M-CSA     -         -          1003      342   1003    2024-11
MetaMDB   80815     4392       74520     1003  -       2025-02
MetaNetX  2601834   2297518    143880    -     48175   4.4
PubChem   9031498   5000000    -         -     -       2024-09

In addition to the source databases, ChemRecon can make use of a larger number of auxiliary databases, including MetaCyc and KEGG. Data from these sources is not directly included because it is proprietary or difficult to access. However, the source databases contain references to the auxiliary databases, so entries are created for them that contain only the identifier and no additional information. This allows ChemRecon workflows to start from identifiers in a large number of databases, not just the source databases.
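
The identifier-only entries described above can be pictured with a small data-structure sketch. This is an assumption-laden illustration, not ChemRecon's data model: the Entry class, the build_index function, and the identifier PLACEHOLDER_ID are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (database name, identifier)


@dataclass
class Entry:
    """Hypothetical entry record; not the ChemRecon class of any similar name."""
    database: str
    identifier: str
    name: Optional[str] = None          # stays None for identifier-only entries
    references: Tuple[Key, ...] = ()    # cross-references to other databases

    @property
    def is_stub(self) -> bool:
        return self.name is None


def build_index(source_entries: List[Entry]) -> Dict[Key, Entry]:
    """Index full entries, then add an identifier-only stub for each unresolved reference."""
    index: Dict[Key, Entry] = {(e.database, e.identifier): e for e in source_entries}
    for entry in source_entries:
        for database, identifier in entry.references:
            # setdefault keeps an existing full entry and only fills in missing ones.
            index.setdefault((database, identifier), Entry(database, identifier))
    return index


# A source entry that references an auxiliary database by identifier only.
citrate = Entry("BiGG", "M_cit", name="citrate", references=(("KEGG", "PLACEHOLDER_ID"),))
index = build_index([citrate])

stub = index[("KEGG", "PLACEHOLDER_ID")]
print(stub.is_stub)

# Starting from the auxiliary identifier, find the source entries that refer to it.
referrers = [e for e in index.values() if ("KEGG", "PLACEHOLDER_ID") in e.references]
print([(e.database, e.identifier) for e in referrers])
```

The stub carries no data of its own, but it gives a workflow an entry point: a user holding only an auxiliary identifier can still reach the source entries that reference it.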
