
IsoGraph

IsoGraph is a Python research software package for discovering isoform-switch and splicing modules from bulk RNA-seq. It combines gene-local compositional modeling with network inference so researchers can move from transcript-level counts to gene-module structure, trait associations, and reproducible benchmark artifacts.

Status

Development stages 0 through 7 are complete:

  • Stage 0: package, CLI, config validation, fixtures, and reproducibility infrastructure
  • Stage 1: deterministic baseline network backend
  • Stage 2: latent probabilistic backend (sklearn FA + partial correlation) with stability selection
  • Stage 3: graph-aware backend
  • Stage 4: VAE backend (default production backend)
  • Stage 5: WGCNA comparison benchmark on simulated data
  • Stage 6: large-scale fixtures (6k–12k genes) and VAE architecture scaling
  • Stage 7: GPU-accelerated FA backend (Woodbury identity + BIC component selection)

Core Capabilities

  • Generate and benchmark against the permanent core_v1 fixture suite and the large-scale scale_v1 suite (6k–12k genes, 25:1–50:1 genes-to-samples ratios).
  • Freeze the bundled real_caudate_aa_v1 real-data fixture from local BrainSeq inputs.
  • Fit the deterministic baseline backend from the command line on a prepared dataset bundle.
  • Run baseline, latent, graph, vae, wgcna, or gpu_latent backends programmatically or through the benchmark runner.
  • Export reproducible artifacts, benchmark reports, calibration summaries, and snapshot comparisons.

Installation

The repository ships with a conda environment that installs IsoGraph in editable mode:

conda env create -f environment.yml
conda activate isograph
isograph --help

If conda is not initialized in the current shell, run eval "$(conda shell.bash hook)" first or initialize conda for your shell.

The core package supports Python 3.11 through 3.14. The bundled environment uses Python 3.11 as the canonical local development runtime.

Quickstart

Run a minimal benchmark on the bundled toy fixture (VAE is the default backend):

conda activate isograph
isograph benchmark -- \
  fixture_filter=toy_v1 \
  stage_name=readme_smoke

This writes benchmark artifacts under artifacts/benchmarks/readme_smoke/toy_v1/ and JSON reports under artifacts/reports/.
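The JSON reports can be inspected with standard tooling. A minimal stdlib-only sketch; note that the report filename and every key shown here are illustrative assumptions, not the actual IsoGraph report schema, so substitute a real file from artifacts/reports/ and adapt the keys:

```python
import json
from pathlib import Path

# Fabricate a stand-in report so the snippet is self-contained.
# The schema (stage_name, fixture, metrics) is hypothetical.
report_path = Path("example_report.json")
report_path.write_text(json.dumps({
    "stage_name": "readme_smoke",
    "fixture": "toy_v1",
    "metrics": {"module_recovery": 0.91, "runtime_seconds": 42.0},
}))

# Load and summarize the report.
report = json.loads(report_path.read_text())
print(report["stage_name"], report["metrics"]["module_recovery"])
```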

Using Your Own Data

IsoGraph expects a prepared dataset bundle containing a manifest.json, aligned sample metadata, feature tables, and dense count matrices. The current command-line path for custom data is:

isograph fit \
  --dataset-path path/to/my_dataset_bundle \
  --output-dir artifacts/fits/my_dataset

At present, fit runs the deterministic baseline backend. For latent, graph, or VAE backends on your own bundle, use the Python API directly. The detailed walkthroughs live in the Wiki, and the formal data model is documented in the RTD source tree.
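As a rough orientation, a bundle is just a directory of aligned files described by a manifest. The sketch below assembles one with the stdlib; everything in it other than the manifest.json filename (the manifest fields, the CSV filenames and columns) is a hypothetical placeholder, so consult the data-model documentation in the RTD source tree for the real schema:

```python
import csv
import json
from pathlib import Path

bundle = Path("my_dataset_bundle")
bundle.mkdir(exist_ok=True)

# Hypothetical manifest fields -- not the actual IsoGraph schema.
(bundle / "manifest.json").write_text(json.dumps({
    "name": "my_dataset",
    "files": {"samples": "samples.csv", "counts": "counts.csv"},
}, indent=2))

# Hypothetical sample-metadata table aligned to the count matrix.
with open(bundle / "samples.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["sample_id", "condition"])
    writer.writerow(["S1", "control"])
    writer.writerow(["S2", "case"])

manifest = json.loads((bundle / "manifest.json").read_text())
print(sorted(manifest["files"]))
```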

Documentation

  • Reference documentation, published on Read the Docs, is maintained in the docs directory.
  • Step-by-step tutorials for installation, data preparation, and own-data workflows live in the GitHub Wiki.
  • Project planning and staged development history remain in docs/staged-roadmap.md.

Citation

If you use IsoGraph in research, cite the software repository using the metadata in CITATION.cff. If a manuscript or preprint becomes available later, that can be added as a preferred citation target without changing the software citation path.

Acknowledgements

IsoGraph is supported by the National Institute on Minority Health and Health Disparities award R00 MD0169640 and the Alzheimer's Association award 25AARG-1413315.

Reproducibility and Data Provenance

  • The benchmark suite is fixture-driven and designed to preserve regression targets across development stages.
  • The bundled real-data workflow freezes a reproducible real_caudate_aa_v1 dataset from local BrainSeq-derived inputs and caches intermediate selections under benchmarks/cache/real_data/.
  • Benchmark, calibration, runtime, and snapshot artifacts are written into versioned directories under artifacts/ and snapshots/.
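A lightweight way to confirm that two runs produced byte-identical artifacts is to compare content hashes. A stdlib-only sketch; the artifact paths mentioned in the comment are illustrative:

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo on two throwaway files; in practice, point these at the same
# artifact emitted by two runs, e.g. something under
# artifacts/benchmarks/<stage>/<fixture>/ (path layout illustrative).
a, b = Path("run_a.bin"), Path("run_b.bin")
a.write_bytes(b"identical artifact contents")
b.write_bytes(b"identical artifact contents")
print(sha256_of(a) == sha256_of(b))  # True when the runs are byte-identical
```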

Limitations

  • The benchmark CLI is optimized for the bundled fixture suite rather than arbitrary user-defined suites.
  • The fit CLI currently exposes only the baseline backend for custom datasets.
  • The VAE and GPU-latent backends require a separate PyTorch installation.
  • The WGCNA backend requires R with the WGCNA package installed.
  • The bundled freeze-real workflow depends on local BrainSeq-style source files and is not a generic data-ingestion command for arbitrary cohorts.
