A Python framework for discovering isoform-switch and splicing modules from bulk RNA-seq by combining gene-local compositional modeling with splice-graph-aware latent network inference.
IsoGraph
IsoGraph is a Python research software package for discovering isoform-switch and splicing modules from bulk RNA-seq. It combines gene-local compositional modeling with network inference so researchers can move from transcript-level counts to gene-module structure, trait associations, and reproducible benchmark artifacts.
Core Capabilities
- Generate and benchmark against the permanent core_v1 fixture suite and the large-scale scale_v1 suite (6k–12k genes, 25:1–50:1 genes-to-samples ratios).
- Freeze the bundled real_caudate_aa_v1 real-data fixture from local BrainSeq inputs.
- Fit any backend on a prepared dataset bundle via isograph fit (VAE default).
- Run baseline, latent, graph, vae, or wgcna backends programmatically or through the benchmark runner.
- Export reproducible artifacts, benchmark reports, calibration summaries, and snapshot comparisons.
- Explain discovered modules at transcript-feature resolution using isograph explain-module: gene drivers, transcript polarity, high-vs-low contrasts, publication-ready plots, optional VAE decoder attribution, and Captum Integrated Gradients encoder attribution.
- Annotate transcript switch pairs with GTF-derived structural labels (exon changes, CDS/UTR shifts, biotype switches) using isograph annotate-structure.
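The genes-to-samples ratios quoted for scale_v1 translate directly into cohort sizes. The short sketch below works through that arithmetic, assuming "6k" and "12k" mean 6,000 and 12,000 genes and that any gene count may pair with any ratio; the README does not state which combinations the suite actually contains, and the helper name implied_samples is illustrative only.

```python
def implied_samples(n_genes: int, genes_per_sample: float) -> float:
    """Number of samples implied by a gene count and a genes-to-samples ratio."""
    return n_genes / genes_per_sample


if __name__ == "__main__":
    # Endpoints quoted in the README; the pairings are an assumption.
    for n_genes in (6_000, 12_000):
        for ratio in (25, 50):
            samples = implied_samples(n_genes, ratio)
            print(f"{n_genes} genes at {ratio}:1 -> {samples:.0f} samples")
```

Under these assumptions the suite sits in the regime of a few hundred samples at most, far fewer samples than genes.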
Installation
Install the core package from PyPI:
pip install isograph
The core package supports Python 3.11 through 3.14.
Optional backends
IsoGraph installs mpmath, which is required by modern SymPy releases. The
vae backend also requires PyTorch, but PyTorch is intentionally not installed
by IsoGraph because CPU/GPU/CUDA builds are platform-specific. Install the build
that matches your system before using it:
pip install torch
See the PyTorch installation guide for GPU/CUDA builds.
The wgcna backend requires R with the WGCNA package and Rscript on PATH.
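Before choosing a backend, it can help to confirm which prerequisites are present. The following is a minimal sketch using only the Python standard library; the function name check_optional_backends is hypothetical and not part of IsoGraph, and the check does not verify that the R WGCNA package itself is installed.

```python
import importlib.util
import shutil
import sys


def check_optional_backends() -> dict[str, bool]:
    """Report which IsoGraph prerequisites appear to be available."""
    return {
        # The core package supports Python 3.11 through 3.14.
        "python_supported": (3, 11) <= sys.version_info[:2] <= (3, 14),
        # The vae backend needs a separately installed PyTorch build.
        "torch_importable": importlib.util.find_spec("torch") is not None,
        # The wgcna backend needs Rscript on PATH.
        "rscript_on_path": shutil.which("Rscript") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_optional_backends().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```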
Quickstart
Run a minimal benchmark on the bundled toy fixture (VAE is the default backend):
isograph benchmark -- \
fixture_filter=toy_v1 \
stage_name=readme_smoke
This writes benchmark artifacts under artifacts/benchmarks/readme_smoke/toy_v1/ and
JSON reports under artifacts/reports/.
Using Your Own Data
IsoGraph expects a prepared dataset bundle containing a manifest.json, aligned sample
metadata, feature tables, and dense count matrices. The fit command supports all
backends; VAE is the default:
isograph fit \
--dataset-path path/to/my_dataset_bundle \
--output-dir artifacts/fits/my_dataset
To switch backends or pass Hydra overrides:
isograph fit \
--dataset-path path/to/my_dataset_bundle \
--backend baseline \
--output-dir artifacts/fits/my_dataset_baseline
isograph fit \
--dataset-path path/to/my_dataset_bundle \
--backend vae \
--output-dir artifacts/fits/my_dataset_vae \
-- vae.hidden_dim=256 vae.n_epochs=400
After fitting, explain one or more modules:
isograph explain-module \
--artifact-dir artifacts/fits/my_dataset \
--feature-table features.parquet \
--feature-meta feature_metadata.parquet \
--module-ids M000 M001 \
--plot \
--output-dir artifacts/explain/my_dataset
Detailed walkthroughs live in the Wiki, and the formal data model is documented in the Read the Docs (RTD) source tree.
Documentation
- Reference documentation, intended for publication on Read the Docs, lives in docs.
- Step-by-step tutorials for installation, data preparation, and own-data workflows live in the GitHub Wiki.
- Project planning and staged development history remain in docs/staged-roadmap.md.
Citation
If you use IsoGraph in research, cite the software repository using the metadata in CITATION.cff. If a manuscript or preprint becomes available later, it can be added as the preferred citation target without changing the software citation path.
Acknowledgements
IsoGraph is supported by the National Institute on Minority Health and Health Disparities
award R00 MD0169640 and the Alzheimer's Association award 25AARG-1413315.
Reproducibility and Data Provenance
- The benchmark suite is fixture-driven and designed to preserve regression targets across development stages.
- The bundled real-data workflow freezes a reproducible real_caudate_aa_v1 dataset from local BrainSeq-derived inputs and caches intermediate selections under benchmarks/cache/real_data/.
- Benchmark, calibration, runtime, and snapshot artifacts are written into versioned directories under artifacts/ and snapshots/.
Limitations
- The benchmark CLI is optimized for the bundled fixture suite rather than arbitrary user-defined suites.
- The VAE backend requires a separate PyTorch installation.
- The WGCNA backend requires R with the WGCNA package installed.
- The freeze-real workflow depends on local BrainSeq-style source files and is not a generic data-ingestion command for arbitrary cohorts.
- VAE decoder attribution (--vae-attribution) and Captum Integrated Gradients (--integrated-gradients) require a VAE checkpoint in the fit artifact directory and, for Integrated Gradients, pip install isograph[torch-explain].
File details
Details for the file isograph-0.1.3.tar.gz.
File metadata
- Download URL: isograph-0.1.3.tar.gz
- Upload date:
- Size: 81.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.3 CPython/3.14.4 Linux/6.19.10-arch1-1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fcb894e8e6bfd7118f311f923799ab170f81ad7a5163c5514c968d71dc38d30c |
| MD5 | fd77eccb9572f2883bd21a47a635f6dd |
| BLAKE2b-256 | 03eb44a7190ce0d0d09082b94c94645735499e2768e0b12dd771a60c08b0fd33 |
File details
Details for the file isograph-0.1.3-py3-none-any.whl.
File metadata
- Download URL: isograph-0.1.3-py3-none-any.whl
- Upload date:
- Size: 100.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.3 CPython/3.14.4 Linux/6.19.10-arch1-1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c470e5544ca1d089a4d9eea76ed40ca7581e209b63a7d28abe5d23eb3f35c154 |
| MD5 | 0da0901ee933180ef77534c068cd6831 |
| BLAKE2b-256 | 9c90a74103b5e0ec91172891f561bb22be4e04e6a229b2197b6f27402530ec16 |