Tools for building Escher-compatible metabolic maps from KEGG and model annotations

BioEMMA

BioEMMA is an early-stage Python library for building Escher-compatible metabolic maps from KEGG pathway layouts and genome-scale metabolic models.

The current main workflow is:

1. Parse a KEGG KGML/XML pathway with KeggMap.
2. Convert KEGG compounds and reactions to BiGG/SEED identifiers using bundled MetaNetX-derived mapping tables.
3. Build an Escher JSON map with EscherMapper.
4. Optionally save a reproducible workflow output directory with the Escher map, the reconstructed KEGG map, flux data, summaries, and merged maps.
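
To make the first step concrete, here is a minimal sketch of reading compound positions and reaction links from KGML using only the standard library. It is not KeggMap's implementation: the KGML fragment is hand-written (coordinates and reversibility are made up), and parse_kgml and the dictionary keys are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written KGML-like fragment, not a real KEGG download.
KGML_SAMPLE = """
<pathway name="path:rn00010" org="rn" number="00010">
  <entry id="1" name="cpd:C00031" type="compound">
    <graphics name="C00031" type="circle" x="100" y="200" width="8" height="8"/>
  </entry>
  <entry id="2" name="cpd:C00092" type="compound">
    <graphics name="C00092" type="circle" x="100" y="300" width="8" height="8"/>
  </entry>
  <reaction id="3" name="rn:R00299" type="irreversible">
    <substrate id="1" name="cpd:C00031"/>
    <product id="2" name="cpd:C00092"/>
  </reaction>
</pathway>
""".strip()


def parse_kgml(text):
    """Return (compounds, reactions) extracted from a KGML string."""
    root = ET.fromstring(text)

    compounds = {}
    for entry in root.findall("entry"):
        if entry.get("type") != "compound":
            continue
        graphics = entry.find("graphics")
        compounds[entry.get("id")] = {
            # "cpd:C00031" -> "C00031"
            "kegg_id": entry.get("name").split(":", 1)[1],
            "x": float(graphics.get("x")),
            "y": float(graphics.get("y")),
        }

    reactions = []
    for reaction in root.findall("reaction"):
        reactions.append({
            "kegg_id": reaction.get("name").split(":", 1)[1],
            "reversible": reaction.get("type") == "reversible",
            # Substrates and products point at entry ids, not KEGG ids.
            "substrates": [s.get("id") for s in reaction.findall("substrate")],
            "products": [p.get("id") for p in reaction.findall("product")],
        })
    return compounds, reactions


compounds, reactions = parse_kgml(KGML_SAMPLE)
```

The entry ids tie reactions to layout coordinates, which is the information a layout-preserving converter needs before any identifier mapping happens.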

The project is currently in alpha. The public API may still change while the package structure is being prepared for PyPI.

Installation

For local development:

pip install -e .

Runtime dependencies can also be installed from:

pip install -r requirements.txt

Basic Usage

The workflow API is the recommended user-facing entry point. It accepts a COBRA model path (or an in-memory cobra.Model) and either a KEGG pathway identifier or a local KGML file.

from bioemma.workflow import build_outputs


result = build_outputs(
    model="path/to/model.xml",
    pathway="rn00010",
    output_dir="out",
    database="BIGG",
    run_fba=True,
)

escher_map = result.escher_map
kegg_reconstruction = result.kegg_reconstruction

escher_map is a Python object that follows the Escher JSON map structure, and kegg_reconstruction is a normalized representation of the KEGG layout together with the mapped identifiers, intended for analysis. When save_kegg_map=True, BioEMMA also writes kegg_escher_map.json, an Escher map of the pure KEGG layout as it stands before model filtering and before secondary metabolites are added.
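
For readers unfamiliar with the Escher JSON map structure mentioned above, the sketch below builds a minimal map by hand: a two-element list holding a header and a body with reactions, nodes, text_labels, and canvas. The helper names empty_escher_map and add_metabolite_node are hypothetical and are not part of BioEMMA; whether BioEMMA's escher_map object is exactly this list is an assumption.

```python
import json


def empty_escher_map(name, width, height):
    """Return a minimal Escher map: [header, body]."""
    header = {
        "map_name": name,
        "map_id": "",
        "map_description": "",
        "homepage": "https://escher.github.io",
        "schema": "https://escher.github.io/escher/jsonschema/1-0-0#",
    }
    body = {
        "reactions": {},
        "nodes": {},
        "text_labels": {},
        "canvas": {"x": 0, "y": 0, "width": width, "height": height},
    }
    return [header, body]


def add_metabolite_node(escher_map, bigg_id, name, x, y):
    """Add a primary metabolite node and return its string id."""
    nodes = escher_map[1]["nodes"]
    node_id = str(len(nodes))  # simple sequential ids for this sketch
    nodes[node_id] = {
        "node_type": "metabolite",
        "bigg_id": bigg_id,
        "name": name,
        "x": x,
        "y": y,
        "label_x": x + 10,
        "label_y": y + 10,
        "node_is_primary": True,
    }
    return node_id


demo_map = empty_escher_map("demo", width=800, height=600)
glucose_id = add_metabolite_node(demo_map, "glc__D_c", "D-Glucose", 160, 160)

# The structure is plain JSON, so it round-trips through json.dumps/loads.
round_tripped = json.loads(json.dumps(demo_map))
```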

With output_dir, BioEMMA writes:

out/rn00010/
  escher_map.json
  kegg_escher_map.json     # when save_kegg_map=True or --save-kegg-map
  kegg_source_reconstruction.json
  summary.json
  fluxes.json              # when fluxes are provided or run_fba=True
  escher_map.html          # when save_html=True
  escher_map_with_fluxes.html  # when flux data and HTML output are requested
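
A small check like the following can confirm which of the listed files a run produced. This is only a sketch: describe_outputs is a hypothetical helper, and treating the files listed without a condition as always written is an assumption read from the listing above.

```python
from pathlib import Path

# File names taken from the listing above.
EXPECTED_FILES = (
    "escher_map.json",
    "kegg_source_reconstruction.json",
    "summary.json",
)
OPTIONAL_FILES = (
    "kegg_escher_map.json",
    "fluxes.json",
    "escher_map.html",
    "escher_map_with_fluxes.html",
)


def describe_outputs(pathway_dir):
    """Report missing expected files and present optional files."""
    pathway_dir = Path(pathway_dir)
    return {
        "missing": [n for n in EXPECTED_FILES if not (pathway_dir / n).is_file()],
        "optional_present": [n for n in OPTIONAL_FILES if (pathway_dir / n).is_file()],
    }
```

For the example above it would be called as describe_outputs("out/rn00010").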

HTML output requires the escher package. BioEMMA does not export PNG files directly; open the HTML output in Escher and use Escher's built-in PNG export when a raster image is needed.

Visualization layout settings can be tuned with VisualizationOptions:

from bioemma.workflow import build_outputs
from bioemma.visualization import VisualizationOptions


result = build_outputs(
    model="path/to/model.xml",
    pathway="rn00010",
    output_dir="out",
    visualization_options=VisualizationOptions(
        scaling_factor=4,
        axis_epsilon=2,
        markers_dist=10,
        metabolite_label_shift=(10, 10),
        reaction_label_shift=(10, 10),
        canvas_margin_x=160,
        canvas_margin_y=160,
        axis_offset=20,
    ),
)

The defaults are conservative starting values for KEGG layouts: coordinates are scaled up for Escher readability, aligned reaction lanes keep a small tolerance, and secondary metabolites get enough spacing after scaling.
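
As an illustration of why scaling and canvas margins matter, the sketch below scales a few KEGG-style coordinates and derives a canvas that leaves a margin on every side. It shows one plausible reading of scaling_factor, canvas_margin_x, and canvas_margin_y; it is not BioEMMA's layout code, and scale_points is a hypothetical function.

```python
def scale_points(points, scaling_factor=4, margin_x=160, margin_y=160):
    """Scale (x, y) points and shift them so the layout starts at the margin.

    Returns the transformed points and an Escher-style canvas dict.
    """
    scaled = [(x * scaling_factor, y * scaling_factor) for x, y in points]

    # Shift so the smallest x and y land exactly on the margin.
    min_x = min(x for x, _ in scaled)
    min_y = min(y for _, y in scaled)
    shifted = [(x - min_x + margin_x, y - min_y + margin_y) for x, y in scaled]

    # Leave the same margin on the right and bottom edges.
    canvas = {
        "x": 0,
        "y": 0,
        "width": max(x for x, _ in shifted) + margin_x,
        "height": max(y for _, y in shifted) + margin_y,
    }
    return shifted, canvas


points, canvas = scale_points([(100, 200), (150, 200)])
```

With the default values, two compounds 50 KEGG units apart end up 200 units apart, which leaves room for Escher's larger node and label sizes.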

Command Line Usage

Build one map from a KEGG pathway identifier:

bioemma build --model path/to/model.xml --pathway rn00010 --output-dir out

Build one map from a local KGML file:

bioemma build --model path/to/model.xml --kgml path/to/rn00010.xml --output-dir out

Build multiple maps and merge them:

bioemma build --model path/to/model.xml --pathway rn00010 rn00020 --output-dir out

The same works with local KGML files:

bioemma build --model path/to/model.xml --kgml path/to/rn00010.xml path/to/rn00020.xml --output-dir out

For multiple inputs, BioEMMA writes each individual map into its own subfolder and writes a merged Escher map at:

out/merged_escher_map.json

Use --no-merge to skip the merged map.
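
Merging Escher maps requires renumbering ids so that segments keep pointing at the right nodes. The sketch below is a naive illustration of that idea only: it places the second map to the right of the first and does not deduplicate shared metabolites. It is not EscherMerger's algorithm, and merge_maps is a hypothetical function.

```python
import copy
from itertools import count


def merge_maps(first, second, dx=0.0):
    """Naively combine two Escher maps, shifting the second one by dx along x."""
    new_ids = count()  # fresh ids shared by nodes, reactions, and segments
    merged = {"reactions": {}, "nodes": {}, "text_labels": {}}

    for (_, body), shift in ((first, 0.0), (second, dx)):
        node_ids = {}  # old node id -> new node id, per input map
        for old_id, node in body["nodes"].items():
            node = copy.deepcopy(node)
            node["x"] += shift
            if "label_x" in node:
                node["label_x"] += shift
            node_ids[old_id] = str(next(new_ids))
            merged["nodes"][node_ids[old_id]] = node

        for reaction in body["reactions"].values():
            reaction = copy.deepcopy(reaction)
            reaction["label_x"] += shift
            segments = {}
            for segment in reaction["segments"].values():
                # Re-point the segment at the renumbered nodes.
                segment["from_node_id"] = node_ids[segment["from_node_id"]]
                segment["to_node_id"] = node_ids[segment["to_node_id"]]
                for bezier in ("b1", "b2"):
                    if segment.get(bezier):
                        segment[bezier]["x"] += shift
                segments[str(next(new_ids))] = segment
            reaction["segments"] = segments
            merged["reactions"][str(next(new_ids))] = reaction

        for label in body.get("text_labels", {}).values():
            label = copy.deepcopy(label)
            label["x"] += shift
            merged["text_labels"][str(next(new_ids))] = label

    # Canvas: bounding box of both input canvases, the second one shifted.
    a, b = first[1]["canvas"], second[1]["canvas"]
    left = min(a["x"], b["x"] + dx)
    top = min(a["y"], b["y"])
    right = max(a["x"] + a["width"], b["x"] + dx + b["width"])
    bottom = max(a["y"] + a["height"], b["y"] + b["height"])
    merged["canvas"] = {
        "x": left, "y": top, "width": right - left, "height": bottom - top,
    }
    return [dict(first[0]), merged]
```

A production merger would usually also reconcile metabolites that appear in both pathways; this sketch deliberately leaves that out.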

The legacy single-file JSON output is still available:

bioemma build --model path/to/model.xml --kgml path/to/rn00010.xml --output escher_map.json

summary.json includes map_stats, a stage-by-stage count of total elements, nodes, reactions, and segments added or removed while the map is built. To print the same reduction statistics in the CLI, add --map-stats:

bioemma build --model path/to/model.xml --kgml path/to/rn00010.xml --output-dir out --map-stats

To save the unfiltered KEGG Escher map next to the normal model-derived map, add --save-kegg-map:

bioemma build --model path/to/model.xml --kgml path/to/rn00010.xml --output-dir out --save-kegg-map

The same visualization settings are available in the CLI, for example:

bioemma build --model path/to/model.xml --kgml path/to/rn00010.xml --output-dir out --scaling-factor 4 --canvas-margin-x 160 --canvas-margin-y 160

If cobrapy cannot access its default cache directory on Windows, set a local cache directory before running tests or CLI commands:

set BIOEMMA_COBRA_CACHE_DIR=%CD%\.cobra-cache
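
The same variable can be set from Python in a platform-independent way. This sketch assumes BioEMMA reads BIOEMMA_COBRA_CACHE_DIR from the process environment after this code has run, so calling it before importing bioemma is the safest order; use_local_cobra_cache is a hypothetical helper.

```python
import os
from pathlib import Path


def use_local_cobra_cache(base_dir="."):
    """Create a .cobra-cache directory under base_dir and point BioEMMA at it."""
    cache_dir = Path(base_dir).resolve() / ".cobra-cache"
    cache_dir.mkdir(parents=True, exist_ok=True)
    os.environ["BIOEMMA_COBRA_CACHE_DIR"] = str(cache_dir)
    return cache_dir
```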

Included Mapping Data

BioEMMA currently bundles two compact runtime mapping files:

metabolite_mapping.tsv
reaction_mapping.tsv

These files are derived from MetaNetX cross-reference tables and are used to map KEGG identifiers to BiGG and SEED identifiers. The large raw MetaNetX download cache is not intended to be included in the Python package.
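
The lookup these tables support can be pictured as a dictionary from KEGG identifiers to target-database identifiers. The sketch below uses an in-memory sample; the column names kegg_id, bigg_id, and seed_id are assumptions and may not match the bundled files, and load_mapping is a hypothetical helper rather than MetaNetXMapper's API.

```python
import csv
import io

# Hypothetical column layout; the bundled TSV files may differ.
SAMPLE_TSV = (
    "kegg_id\tbigg_id\tseed_id\n"
    "C00031\tglc__D\tcpd00027\n"
    "C00022\tpyr\tcpd00020\n"
)


def load_mapping(handle, database):
    """Build {KEGG id: target id} for database "BIGG" or "SEED"."""
    column = {"BIGG": "bigg_id", "SEED": "seed_id"}[database.upper()]
    reader = csv.DictReader(handle, delimiter="\t")
    # Skip rows that have no identifier in the requested database.
    return {row["kegg_id"]: row[column] for row in reader if row[column]}


kegg_to_bigg = load_mapping(io.StringIO(SAMPLE_TSV), "BIGG")
kegg_to_seed = load_mapping(io.StringIO(SAMPLE_TSV), "SEED")
```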

See NOTICE.md for third-party data attribution and usage notes.

License

BioEMMA's source code is distributed under the MIT License. Bundled mapping data are derived from third-party database resources and may be subject to their own license terms. See LICENSE and NOTICE.md.

By default, the workflow keeps the KEGG reactions and compounds that can be matched to the COBRA model. To preserve KEGG-only elements that are not present in the model, pass include_kegg_only=True in Python or use --include-kegg-only in the CLI.

Development Notes

The package code lives in:

src/bioemma/

The current core modules are:

bioemma.maps.KeggMap
bioemma.mapper_base.EscherMapper
bioemma.metanetx_mapper.MetaNetXMapper
bioemma.merger.EscherMerger
bioemma.workflow.build_outputs
bioemma.workflow.build_many_outputs

The script for regenerating mapping tables is kept separately in:

scripts/prepare_db_mapping.py

Run the test suite from a source checkout with the following commands (Windows Command Prompt syntax):

set PYTHONPATH=%CD%\src
set BIOEMMA_COBRA_CACHE_DIR=%CD%\.pytest-cobra-cache
python -m pytest -q

Publishing

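Before publishing, bump the version in pyproject.toml, run the tests, and build fresh distribution artifacts. The commands below use Windows Command Prompt syntax; rmdir /s /q dist removes any stale dist directory: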

python -m pip install --upgrade build twine
rmdir /s /q dist
python -m build
python -m twine check dist/*

Upload to TestPyPI first:

python -m twine upload --repository testpypi dist/*

Install from TestPyPI in a clean environment and smoke-test the CLI. Then upload the same checked artifacts to PyPI:

python -m twine upload dist/*

Status

BioEMMA is not yet a stable release. Before publishing to PyPI, the package still needs a final check of bundled data, license compatibility, and user-facing visualization dependencies.

Release history

0.2.0 (this version), May 19, 2026
0.1.1, May 13, 2026

Download files

Source Distribution

bioemma-0.2.0.tar.gz (1.2 MB)

Uploaded May 19, 2026 Source

Built Distribution

bioemma-0.2.0-py3-none-any.whl (1.2 MB)

Uploaded May 19, 2026 Python 3

File details

Details for the file bioemma-0.2.0.tar.gz.

File metadata

Download URL: bioemma-0.2.0.tar.gz
Upload date: May 19, 2026
Size: 1.2 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for bioemma-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`94293bebdf962e52614f2daee44bd31943824e472f376b131d2fcb40ba939a9f`
MD5	`cde41781daf29cd41466f0a9f2a02076`
BLAKE2b-256	`5c22abc81d6814746e1793ec00a83f35723021fd13981a4c0efca77adc1972fd`
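
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: sha256_of is a hypothetical helper, and the commented usage assumes the file was saved under its published name in the current directory.

```python
import hashlib

# SHA256 digest listed above for bioemma-0.2.0.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "94293bebdf962e52614f2daee44bd31943824e472f376b131d2fcb40ba939a9f"
)


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage, once the file has been downloaded:
# assert sha256_of("bioemma-0.2.0.tar.gz") == EXPECTED_SDIST_SHA256
```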

File details

Details for the file bioemma-0.2.0-py3-none-any.whl.

File metadata

Download URL: bioemma-0.2.0-py3-none-any.whl
Upload date: May 19, 2026
Size: 1.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for bioemma-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3cb4c9cf29e8440cf5e723f718d231a5c52ebcf4bb23bd27eeebe7f06fca5558`
MD5	`ab0fdc7c601499d0b40f7addb0c32eb6`
BLAKE2b-256	`a3b377ba767135f817ed05877ad4452e24cb44ceca116c27478716fb25153678`
