
Processing and integrating data with genome-scale metabolic models (GEM)


PipeGEM v0.2.0

License: GPL v3


PipeGEM is a Python package for analyzing and visualizing multiple genome-scale metabolic models (GEMs). It supports integrating transcriptomic and proteomic data and medium composition into GEMs, as well as evaluating metabolic tasks. Flux analysis is powered by cobrapy.

Documentation: pipegem.readthedocs.io


Installation

pip

pip install pipegem

uv

uv add pipegem

uv (development)

git clone https://github.com/qwerty239qwe/pipeGEM.git
cd pipeGEM
uv sync

Documentation build

uv run --locked --extra doc mkdocs build --strict -d ./docs

Python API

Single model

import pipeGEM as pg
from pipeGEM.utils import load_model

model = load_model("your_model_path")  # returns a cobra.Model
pmodel = pg.Model(name_tag="model_name", model=model)

print(pmodel)

flux_analysis = pmodel.do_flux_analysis("pFBA")
flux_analysis.plot(
    rxn_ids=['rxn_a', 'rxn_b'],
    file_name='pfba_flux.png'  # pass None to skip saving
)
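
For readers unfamiliar with the method name passed to do_flux_analysis, pFBA (parsimonious FBA) is usually stated as a two-step optimization. The form below is the standard textbook formulation, not a description of PipeGEM's internals; the symbols S (stoichiometric matrix), v (flux vector), c (objective coefficients), lb and ub (flux bounds), and Z* (optimal objective value) are notation introduced here.

```latex
\begin{aligned}
\text{Step 1 (FBA):}\quad & Z^{*} = \max_{v}\; c^{\top} v
  \quad \text{s.t.}\quad S v = 0,\;\; lb \le v \le ub \\
\text{Step 2 (pFBA):}\quad & \min_{v}\; \sum_{i} |v_i|
  \quad \text{s.t.}\quad S v = 0,\;\; lb \le v \le ub,\;\; c^{\top} v = Z^{*}
\end{aligned}
```

Step 2 picks, among all flux distributions that reach the optimal objective, one with the smallest total absolute flux.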

Multiple models

import pipeGEM as pg
from pipeGEM.utils import load_model

group = pg.Group(
    {
        "group_a": {
            "model_a_dmso": load_model("path_1"),
            "model_a_metformin": load_model("path_2"),
        },
        "group_b": {
            "model_b_dmso": load_model("path_3"),
            "model_b_metformin": load_model("path_4"),
        },
    },
    name_tag="my_group",
    treatments={
        "model_a_dmso": "DMSO",
        "model_b_dmso": "DMSO",
        "model_a_metformin": "metformin",
        "model_b_metformin": "metformin",
    },
)

flux_analysis = group.do_flux_analysis("pFBA")
flux_analysis.plot(rxn_ids=['rxn_a', 'rxn_b'])
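
With many models, writing the treatments mapping by hand gets tedious. The sketch below derives it from the model names, assuming the names end in a treatment suffix as they do in the example above. infer_treatments is a hypothetical helper, not part of PipeGEM.

```python
def infer_treatments(model_names, labels=None):
    """Map each model name to a treatment label taken from its last '_' suffix.

    labels optionally renames a suffix (e.g. "dmso" -> "DMSO").
    """
    labels = labels or {}
    treatments = {}
    for name in model_names:
        suffix = name.rsplit("_", 1)[-1]
        treatments[name] = labels.get(suffix, suffix)
    return treatments


names = ["model_a_dmso", "model_a_metformin", "model_b_dmso", "model_b_metformin"]
treatments = infer_treatments(names, labels={"dmso": "DMSO"})
print(treatments)
```

The resulting dictionary has the same shape as the treatments argument shown above and could be passed in its place.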

Context-specific models from omic data

PipeGEM can reconstruct context-specific GEMs by integrating gene expression data. The example below uses GIMME; the other supported algorithms are listed after it.

import numpy as np
import pipeGEM as pg
from pipeGEM.utils import load_model
from pipeGEM.data import GeneData, synthesis

mod = pg.Model(name_tag="model_name", model=load_model("your_model_path"))

# Generate synthetic transcriptomic data for demonstration
dummy_data = synthesis.get_syn_gene_data(mod, n_sample=3)

gene_data = GeneData(
    data=dummy_data["sample_0"],
    data_transform=lambda x: np.log2(x),
    absent_expression=-np.inf,
)
mod.add_gene_data(
    name_or_prefix="sample_0",
    data=gene_data,
    or_operation="nanmax",  # alternative: "nansum"
    threshold=-np.inf,
    absent_value=-np.inf,
)

gimme_result = mod.integrate_gene_data(
    data_name="sample_0",
    integrator="GIMME",
    high_exp=5 * np.log10(2),
)
context_specific_gem = gimme_result.result_model

Supported integrators: GIMME, iMAT, FASTCORE, SWIFTCORE, MBA, mCADRE, CORDA, ftINIT, RIPTiDe, E-Flux, SPOT, rFASTCORMICS.
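
The or_operation argument ("nanmax" or "nansum") suggests how expression values of alternative genes are combined when gene data is mapped onto reactions. A common convention is to take the minimum over genes joined by AND (an enzyme complex) and the maximum or the sum over genes joined by OR (isozymes). GIMME-style methods then penalize flux through reactions whose score falls below a threshold, which is the role high_exp appears to play above. The sketch below illustrates both ideas in plain Python. It is not PipeGEM's implementation: the nested-tuple rule format, reaction_score, and gimme_penalty are made up for illustration, and treating reactions without data as unpenalized is an assumption.

```python
def reaction_score(rule, expr, or_op=max):
    """Evaluate a nested gene-reaction rule against gene expression values.

    rule is a gene ID (str) or a tuple ("and" | "or", sub_rule, ...).
    Genes missing from expr are skipped, similar to NaN-aware reductions.
    Returns None when no gene in the rule has data.
    """
    if isinstance(rule, str):
        return expr.get(rule)
    op, *parts = rule
    values = [s for s in (reaction_score(p, expr, or_op) for p in parts)
              if s is not None]
    if not values:
        return None
    return min(values) if op == "and" else or_op(values)


def gimme_penalty(score, high_exp):
    """GIMME-style weight: zero at or above the threshold, linear below it."""
    if score is None:  # assumption: no data means no penalty
        return 0.0
    return max(0.0, high_exp - score)


expr = {"g1": 2.0, "g2": 6.0, "g3": 4.0}
rule = ("or", ("and", "g1", "g2"), "g3")  # (g1 and g2) or g3

score_max = reaction_score(rule, expr, or_op=max)  # OR as maximum
score_sum = reaction_score(rule, expr, or_op=sum)  # OR as sum
print(score_max, gimme_penalty(score_max, high_exp=5.0))
print(score_sum, gimme_penalty(score_sum, high_exp=5.0))
```

In this toy case the choice of OR operation changes the outcome: with the maximum the reaction scores 4.0 and is penalized, while with the sum it scores 6.0 and is not.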

Enzyme-constrained models (GECKO)

After attaching enzyme kinetic data, call integrate_enzyme_data to produce an enzyme-constrained model.

# mod is the pg.Model from the previous example;
# enzyme_data is an EnzymeData object prepared beforehand
mod.add_enzyme_data(enzyme_data)

ec_result = mod.integrate_enzyme_data(method="GECKOLight")  # or "GECKOFull"
ec_model = ec_result.result_model
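
As background, GECKO-style methods limit fluxes by enzyme capacity. One common way to write the idea, for reactions with non-negative flux, is shown below. The symbols (flux v_j, turnover number k_cat,j, enzyme amount e_j, molecular weight MW_j, and total available protein P_total) are introduced here for illustration. The formula describes the general principle and is not a statement of the exact constraints that the GECKOLight or GECKOFull methods build.

```latex
v_j \le k_{\mathrm{cat},j}\, e_j
\quad\text{and}\quad
\sum_{j} \mathit{MW}_j\, e_j \le P_{\mathrm{total}}
\quad\Longrightarrow\quad
\sum_{j} \frac{\mathit{MW}_j}{k_{\mathrm{cat},j}}\, v_j \le P_{\mathrm{total}}
```

The right-hand form needs only one pooled constraint, whereas keeping the per-enzyme variables e_j allows individual enzyme measurements to be imposed.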

Logging

PipeGEM is silent by default. To enable progress output, adjust the log level before running analyses:

import logging
import pipeGEM as pg

pg.set_log_level(logging.INFO)  # show progress messages
pg.enable_verbose()             # enable DEBUG output with a StreamHandler to stderr
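
If PipeGEM's loggers are ordinary loggers under the pipeGEM namespace, as the release notes for 0.2.0 state, they can presumably also be configured with the standard logging module, for example to route messages to a custom handler. The sketch below uses only the standard library and does not import PipeGEM. The child logger name "pipeGEM.demo" is made up, and an in-memory stream stands in for a file or stderr.

```python
import io
import logging

# Assumption: library loggers are children of the "pipeGEM" logger.
lib_logger = logging.getLogger("pipeGEM")
child = logging.getLogger("pipeGEM.demo")  # hypothetical child logger

stream = io.StringIO()  # stands in for sys.stderr or an open log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
lib_logger.addHandler(handler)
lib_logger.setLevel(logging.INFO)

child.info("model loaded")     # emitted: INFO meets the INFO threshold
child.debug("solver details")  # dropped: DEBUG is below the threshold

print(stream.getvalue(), end="")
```

Child loggers inherit the level set on the namespace logger and propagate records to its handlers, so one handler on the parent covers every module beneath it.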

CLI

PipeGEM provides a command-line interface organized around subcommands. To check the installed version and list the available subcommands and options:

pipeGEM --version
pipeGEM --help

Step 1 — Generate template config files

pipeGEM template -p integration -o ./configs

This creates a configs/ directory containing TOML templates for each required config file.

Step 2 — Edit the configs

Fill in your model paths, data paths, and algorithm parameters in the generated TOML files.
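
After editing, a quick script can catch mistyped file paths before a pipeline starts. The following is a minimal sketch, assuming path entries are string values whose key ends in "_path". That naming convention and the keys in the example config are hypothetical, since the real templates define their own keys. missing_paths is not part of PipeGEM.

```python
import tempfile
from pathlib import Path

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:  # older interpreters: pip install tomli
    import tomli as tomllib


def missing_paths(config, suffix="_path"):
    """Return (dotted key, value) pairs for path-like entries not found on disk."""
    missing = []

    def walk(table, prefix=""):
        for key, value in table.items():
            dotted = f"{prefix}{key}"
            if isinstance(value, dict):
                walk(value, dotted + ".")
            elif (isinstance(value, str) and key.endswith(suffix)
                  and not Path(value).exists()):
                missing.append((dotted, value))

    walk(config)
    return missing


# Hypothetical config content, written to a temporary file for the demo.
example = """
[model]
model_path = "does/not/exist.xml"
name = "demo"
"""
with tempfile.TemporaryDirectory() as tmp:
    conf_file = Path(tmp) / "model_conf.toml"
    conf_file.write_text(example)
    with conf_file.open("rb") as fh:  # tomllib requires binary mode
        config = tomllib.load(fh)

result = missing_paths(config)
print(result)
```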

Step 3 — Run a pipeline

Add --dry-run to any command to validate the configs and preview the planned actions without executing them.

# Process a model
pipeGEM process -t configs/model_conf.toml

# Find expression thresholds
pipeGEM threshold -g configs/gene_data_conf.toml -r configs/threshold_conf.toml

# Full context-specific model reconstruction
pipeGEM integrate \
    -g configs/gene_data_conf.toml \
    -t configs/model_conf.toml \
    -r configs/threshold_conf.toml \
    -m configs/mapping_conf.toml \
    -i configs/integration_conf.toml

# Flux analysis
pipeGEM flux -f configs/flux_conf.toml -t configs/model_conf.toml

# Compare models across conditions
pipeGEM compare -c configs/comparison_conf.toml
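
When the CLI is driven from a script, the dry run can serve as a gate before the real run. The sketch below builds the argument lists for one subcommand and, when execute=True, runs the dry run first. It assumes that a failed --dry-run exits with a non-zero status, which is not confirmed here. build_command and run_checked are hypothetical helpers, and nothing is executed by default.

```python
import shlex
import subprocess


def build_command(subcommand, flags, dry_run=False):
    """Build an argv list for one pipeGEM subcommand."""
    argv = ["pipeGEM", subcommand]
    for flag, path in flags.items():
        argv += [flag, path]
    if dry_run:
        argv.append("--dry-run")
    return argv


def run_checked(subcommand, flags, execute=False):
    """Return (dry, real) argv lists; with execute=True, run the dry run first."""
    dry = build_command(subcommand, flags, dry_run=True)
    real = build_command(subcommand, flags)
    if execute:
        subprocess.run(dry, check=True)   # raises CalledProcessError on non-zero exit
        subprocess.run(real, check=True)  # reached only if the dry run succeeded
    return dry, real


dry, real = run_checked(
    "flux",
    {"-f": "configs/flux_conf.toml", "-t": "configs/model_conf.toml"},
)
print(shlex.join(dry))
```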

Note: The legacy -n <pipeline> style is still accepted for backward compatibility but is deprecated. Please migrate to the subcommand style shown above.


What's new in 0.2.0

  • CLI subcommands — the flat -n <pipeline> interface has been replaced with proper subcommands (integrate, process, threshold, flux, compare, template). The old style still works but emits a deprecation warning.
  • --dry-run flag — available on all subcommands; validates configs and prints the planned actions without running the pipeline.
  • integrate_enzyme_data() — now fully implemented. Accepts method="GECKOLight" (default) or "GECKOFull".
  • Silent by default — all internal print calls have been replaced with structured loggers under the pipeGEM namespace. Use pg.set_log_level or pg.enable_verbose to opt in to output.
  • Bug fixes:
    • Model.rename() silently swallowed a TypeError when given a non-string argument — it now raises correctly.
    • PairwiseTester always selected non-parametric methods regardless of the normality test result.
    • data.preprocessing: column drops were incorrectly targeting rows; a row-wise apply was missing axis=1; na_action="" was invalid and replaced with na_action=None.
    • fetch_HPA_data updated to use the current biodbs API (hpa_search).

