Complete pathway visualization: KEGG + SBGN + highlighting + splines
Pathview-plus: Complete Pathway Visualization
Full-featured Python implementation of R pathview + SBGNview with support for KEGG, Reactome, MetaCyc, and more.
Features
Core Capabilities
- KEGG Pathways: Download and visualize any KEGG pathway
- SBGN Pathways: Support for Reactome, MetaCyc, PANTHER, SMPDB
- Multiple Formats: PNG (native overlay), SVG (vector), PDF (graph layout)
- Gene & Metabolite Data: Overlay expression and abundance data
- Multi-Condition: Visualize multiple experiments side-by-side
- ID Conversion: Automatic mapping between Entrez, Symbol, UniProt, and Ensembl
- Highlighting: Post-hoc emphasis of specific nodes/edges/paths
- Spline Curves: Smooth Bezier edge routing
- Custom Colors: Configurable diverging color scales
New in v2.0
- Full SBGN-ML support: Parse and render SBGN Process Description files
- Database integration: Direct download from Reactome, MetaCyc
- SVG vector output: Scalable graphics for web and publication
- Highlighting system: ggplot2-style composable modifications
- Spline rendering: Cubic Bezier and Catmull-Rom curves
Installation
Quick install
pip install pathview-plus
Custom install
# Clone repository
git clone https://github.com/raw-lab/pathview-plus
cd pathview-plus
# Install dependencies
pip install -r requirements.txt
pip install .
# Or install specific packages
pip install polars numpy matplotlib seaborn Pillow networkx requests
Dependencies:
- Python ≥ 3.10
- polars ≥ 0.19.0
- matplotlib ≥ 3.7.0
- seaborn ≥ 0.12.0
- numpy ≥ 1.24.0
- Pillow ≥ 10.0.0
- networkx ≥ 3.1
- requests ≥ 2.31.0
Quick Start
1. Basic KEGG Pathway
import polars as pl
from pathview import pathview
# Load your data
gene_data = pl.read_csv("gene_expr.tsv", separator="\t")
# Visualize on KEGG pathway
result = pathview(
pathway_id="04110", # Cell cycle
gene_data=gene_data,
species="hsa",
output_format="png"
)
2. Reactome SBGN Pathway
from pathview import download_reactome, parse_sbgn, sbgn_to_df, pathview
# Download Reactome pathway
path = download_reactome("R-HSA-109582") # Hemostasis
# Parse and visualize
pathway = parse_sbgn(path)
node_df = sbgn_to_df(pathway)
# Overlay data
result = pathview(
pathway_id="R-HSA-109582",
gene_data=gene_data,
output_format="svg" # Vector graphics
)
3. Multi-Condition Comparison
# Three experimental conditions
gene_data = pl.DataFrame({
"entrez": ["1956", "2099", "5594", "207"],
"Control": [0.5, -0.3, 1.2, -0.8],
"Treatment_A": [2.1, -1.5, 0.4, 1.3],
"Treatment_B": [1.8, -0.9, 2.3, 0.7],
})
result = pathview(
pathway_id="04010", # MAPK signaling
gene_data=gene_data,
species="hsa",
limit={"gene": 2.5, "cpd": 1.5},
)
# Each node shows 3 color bands (one per condition)
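One way to picture the banding is to divide each node's rectangle into equal-width strips, one per data column. The following is a minimal sketch under that assumption; `split_node_bands` is a hypothetical helper written for illustration, not part of the pathview API, and the real renderer may lay out bands differently.

```python
def split_node_bands(x, y, width, height, n_conditions):
    """Split a node box (top-left corner x, y) into equal-width vertical bands."""
    if n_conditions < 1:
        raise ValueError("n_conditions must be at least 1")
    band_width = width / n_conditions
    # Each band is (x, y, width, height); bands run left to right.
    return [
        (x + i * band_width, y, band_width, height)
        for i in range(n_conditions)
    ]


# Three conditions (Control, Treatment_A, Treatment_B) give three bands.
bands = split_node_bands(x=100.0, y=50.0, width=46.0, height=17.0, n_conditions=3)
```

Each band would then be filled with the color computed for that condition's value.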
4. Custom Color Schemes
result = pathview(
pathway_id="04151",
gene_data=gene_data,
species="hsa",
low={"gene": "#2166AC", "cpd": "#4575B4"}, # Blue
mid={"gene": "#F7F7F7", "cpd": "#F7F7F7"}, # White
high={"gene": "#D6604D", "cpd": "#B2182B"}, # Red
)
Complete Examples
Example 1: Gene Symbol IDs
gene_data = pl.DataFrame({
"symbol": ["TP53", "EGFR", "KRAS", "PIK3CA", "AKT1"],
"log2fc": [-1.8, 2.4, 1.1, 1.5, 0.9],
})
result = pathview(
pathway_id="04151",
gene_data=gene_data,
species="hsa",
gene_idtype="SYMBOL", # Automatic conversion to Entrez
)
Example 2: Combined Gene + Metabolite
from pathview import sim_mol_data
gene_data = sim_mol_data(mol_type="gene", species="hsa", n_mol=80)
cpd_data = sim_mol_data(mol_type="cpd", n_mol=30)
result = pathview(
pathway_id="00010", # Glycolysis
gene_data=gene_data,
cpd_data=cpd_data,
species="hsa",
low={"gene": "green", "cpd": "blue"},
high={"gene": "red", "cpd": "yellow"},
)
Example 3: SVG Vector Output
result = pathview(
pathway_id="04110",
gene_data=gene_data,
species="hsa",
output_format="svg", # Scalable vector graphics
)
# Output: hsa04110.pathview.svg
# - Scalable without quality loss
# - Smaller file size
# - Editable in Inkscape/Illustrator
Example 4: Graph Layout (No PNG Background)
result = pathview(
pathway_id="04010",
gene_data=gene_data,
species="hsa",
kegg_native=False, # Use NetworkX layout
output_format="pdf",
)
# Output: hsa04010.pathview.pdf
Example 5: Highlighting (API Preview)
from pathview import highlight_nodes, highlight_path
result = pathview("04010", gene_data=gene_data)
# Composable modifications (ggplot2-style)
highlighted = (result
+ highlight_nodes(["1956", "2099"], color="red", width=4)
+ highlight_path(["1956", "2099", "5594"], color="orange"))
highlighted.save("highlighted.png")
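The `+` composition in this preview can be modeled with Python's `__add__` hook. The sketch below is an illustration of the pattern only: `Result` and `Highlight` are hypothetical classes, not the types pathview-plus returns, and no image is modified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Highlight:
    """One pending modification: which nodes to emphasize and how."""
    node_ids: tuple
    color: str = "red"
    width: int = 2


@dataclass(frozen=True)
class Result:
    """Hypothetical render result that accumulates highlights."""
    pathway_id: str
    highlights: tuple = ()

    def __add__(self, other):
        if not isinstance(other, Highlight):
            return NotImplemented
        # Return a new object so the original result is left untouched.
        return Result(self.pathway_id, self.highlights + (other,))


base = Result("04010")
styled = (base
          + Highlight(("1956", "2099"), color="red", width=4)
          + Highlight(("5594",), color="orange"))
```

Returning a new object from `__add__` keeps the base result reusable, which is what makes the modifications composable in the ggplot2 sense.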
Example 6: Spline Curves
from pathview import cubic_bezier, catmull_rom_spline
import matplotlib.pyplot as plt
# Smooth Bezier curve
curve = cubic_bezier((0,0), (1,2), (3,2), (4,0), n_points=100)
plt.plot(curve[:, 0], curve[:, 1], linewidth=2)
plt.title("Bezier Curve Edge Routing")
plt.savefig("bezier_example.png")
Example 7: Batch Processing
pathways = ["04110", "04010", "04151", "00010"]
for pw_id in pathways:
    try:
        result = pathview(
            pathway_id=pw_id,
            gene_data=gene_data,
            species="hsa",
            out_suffix=f"batch_{pw_id}",
        )
        print(f"Completed {pw_id}")
    except Exception as e:
        print(f"Failed {pw_id}: {e}")
Command Line Interface
# Basic usage
python pathview_cli.py --pathway-id 04110 --gene-data expr.tsv
# Specify species and ID type
python pathview_cli.py \
--pathway-id 04110 \
--species hsa \
--gene-data expr.tsv \
--gene-idtype SYMBOL
# Custom colors
python pathview_cli.py \
--pathway-id 04010 \
--gene-data expr.tsv \
--low-gene '#2166AC' \
--high-gene '#D6604D' \
--output-format svg
# Simulate data (for testing)
python pathview_cli.py \
--pathway-id 04110 \
--simulate \
--n-sim 200
# Display KEGG legend
python pathview_cli.py --legend
CLI Arguments:
Pathway:
--pathway-id ID KEGG pathway number (e.g., '04110')
Input data:
--gene-data TSV Gene expression file (TSV)
--cpd-data TSV Compound abundance file (TSV)
--gene-idtype TYPE Gene ID type: ENTREZ, SYMBOL, UNIPROT, ENSEMBL
--cpd-idtype TYPE Compound ID type: KEGG, PUBCHEM, CHEBI
Species & paths:
--species CODE KEGG species code (default: hsa)
--kegg-dir DIR Directory for files (default: .)
--out-suffix SUFFIX Output filename suffix (default: pathview)
Rendering:
--kegg-native Use KEGG PNG background (default: True)
--output-format FORMAT Output format: png, pdf, svg (default: png)
--map-symbol Replace Entrez with symbols (default: True)
--node-sum METHOD Aggregation: sum, mean, median, max
--no-signature Suppress watermark
--no-col-key Suppress color legend
Color scale:
--limit-gene FLOAT Color scale limit (default: 1.0)
--bins-gene INT Color bins (default: 10)
--low-gene COLOR Low-end color (default: green)
--mid-gene COLOR Mid-point color (default: gray)
--high-gene COLOR High-end color (default: red)
--low-cpd COLOR Low compound color (default: blue)
--high-cpd COLOR High compound color (default: yellow)
Utilities:
--legend Display KEGG legend and exit
--simulate Generate simulated data
--n-sim INT Number of simulated molecules (default: 200)
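The `--node-sum` methods listed above can be illustrated in isolation. This sketch assumes the option behaves like `node.sum` in R pathview, where several measured genes mapping to one pathway node are collapsed into a single value; `aggregate_node` is a hypothetical helper, not a function exported by the package.

```python
from statistics import mean, median

# Method names mirror the CLI choices: sum, mean, median, max.
AGGREGATORS = {"sum": sum, "mean": mean, "median": median, "max": max}


def aggregate_node(values, method="sum"):
    """Collapse the values of all genes mapped to one node into one number."""
    if method not in AGGREGATORS:
        raise ValueError(f"unknown method: {method!r}")
    if not values:
        return None  # no measured gene maps to this node
    return AGGREGATORS[method](values)


# Three genes share one node in this made-up example.
node_values = [2.0, -1.0, 0.5]
total = aggregate_node(node_values, "sum")
```

Note that `sum` lets up- and down-regulated genes cancel out, while `max` reports only the largest value on the node.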
Input File Formats
Gene Data (TSV)
First column = gene IDs, remaining columns = numeric expression values.
entrez Control Treatment_A Treatment_B
1956 2.31 0.45 1.82
2099 -1.14 -0.88 0.33
5594 0.72 1.33 -0.51
207 -0.88 1.21 0.94
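To make the layout concrete, here is a small standard-library sketch that reads a file of this shape: the first column is kept as a string ID and the remaining columns are parsed as floats. `read_gene_tsv` is a hypothetical helper for illustration; the examples in this README load data with `pl.read_csv(..., separator="\t")` instead.

```python
import csv
import io


def read_gene_tsv(text):
    """Parse TSV text into (condition_names, {gene_id: [values...]})."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    conditions = header[1:]  # first column holds the gene IDs
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        # IDs stay strings so that leading zeros and prefixes survive.
        table[row[0]] = [float(v) for v in row[1:]]
    return conditions, table


sample = (
    "entrez\tControl\tTreatment_A\tTreatment_B\n"
    "1956\t2.31\t0.45\t1.82\n"
    "2099\t-1.14\t-0.88\t0.33\n"
)
conditions, table = read_gene_tsv(sample)
```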
Gene Symbols
gene_symbol log2fc p_value
TP53 -1.8 0.001
EGFR 2.4 0.0001
KRAS 1.1 0.01
Compound Data (TSV)
kegg abundance
C00031 1.45
C00118 -0.83
C00022 2.11
Color Scale Configuration
Three-Point Diverging Scale
pathview(
pathway_id="04110",
gene_data=data,
limit={"gene": 2.0, "cpd": 1.5}, # ±2.0 for genes, ±1.5 for compounds
bins={"gene": 20, "cpd": 10}, # Color resolution
low={"gene": "blue", "cpd": "green"},
mid={"gene": "white", "cpd": "gray"},
high={"gene": "red", "cpd": "yellow"},
)
The scale maps:
- low value → low color (default: green/blue)
- 0 → mid color (default: gray)
- high value → high color (default: red/yellow)
One-Directional Scale
both_dirs={"gene": False, "cpd": False}
# Maps: 0 (mid) → max (high)
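The mapping described in this section can be sketched as clamping followed by linear interpolation. This is an illustration under stated assumptions, not the package's implementation: `value_to_hex` is a hypothetical function, the hex codes are stand-ins for the named default colors, and the `bins` discretization is left out for brevity.

```python
def hex_to_rgb(color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples, t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def value_to_hex(value, limit=1.0, low="#00FF00", mid="#BEBEBE",
                 high="#FF0000", both_dirs=True):
    """Map a value onto a low-mid-high scale, clamping at the limit."""
    # With both_dirs=False the scale starts at 0, so negatives clamp to mid.
    lo_bound = -limit if both_dirs else 0.0
    v = max(lo_bound, min(limit, value))
    if v >= 0:
        rgb = lerp_color(hex_to_rgb(mid), hex_to_rgb(high), v / limit)
    else:
        rgb = lerp_color(hex_to_rgb(mid), hex_to_rgb(low), -v / limit)
    return "#{:02X}{:02X}{:02X}".format(*rgb)


saturated = value_to_hex(5.0, limit=2.0)  # beyond the limit, so fully "high"
```

Values outside ±`limit` saturate at the end colors, which is why choosing `limit` close to the range of the data matters for contrast.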
Supported ID Types
Gene IDs
| Type | Value | Example |
|---|---|---|
| Entrez | ENTREZ | 1956 |
| Symbol | SYMBOL | EGFR |
| UniProt | UNIPROT | P00533 |
| Ensembl | ENSEMBL | ENSG00000146648 |
| KEGG | KEGG | hsa:1956 |
Compound IDs
| Type | Value | Example |
|---|---|---|
| KEGG | KEGG | C00031 |
| PubChem | PUBCHEM | 5793 |
| ChEBI | CHEBI | 4167 |
Supported Databases
KEGG
- Format: KGML (XML)
- Species: 500+ organisms
- Download: Automatic via KEGG REST API
- Example: pathway_id="hsa04110"
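For orientation, the public KEGG REST API serves a pathway's KGML and PNG under `/get/<id>/kgml` and `/get/<id>/image`. The sketch below only builds those URLs; whether pathview-plus requests exactly these endpoints is an assumption, and `kegg_urls` is a hypothetical helper rather than part of the package.

```python
KEGG_REST = "https://rest.kegg.jp"


def kegg_urls(pathway_id, species="hsa"):
    """Build KGML and PNG download URLs for a KEGG pathway."""
    # Accept either "04110" or an already prefixed "hsa04110".
    if pathway_id[:1].isdigit():
        full_id = f"{species}{pathway_id}"
    else:
        full_id = pathway_id
    return {
        "kgml": f"{KEGG_REST}/get/{full_id}/kgml",
        "image": f"{KEGG_REST}/get/{full_id}/image",
    }


urls = kegg_urls("04110")
```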
Reactome
- Format: SBGN-ML
- Species: Human, mouse, rat, and more
- Download: download_reactome("R-HSA-109582")
- Example: Hemostasis, Immune System, Signaling
MetaCyc
- Format: SBGN-ML
- Coverage: 2,800+ metabolic pathways
- Download: download_metacyc("PWY-7210")
- Example: Pyrimidine biosynthesis
PANTHER
- Format: SBGN-ML
- Coverage: 177 signaling and metabolic pathways
- Note: Manual download required
SMPDB
- Format: SBGN-ML
- Coverage: Small molecule pathways
- Note: Manual download from website
Architecture
pathview/
├── __init__.py        # Public API exports
├── constants.py       # Type definitions
├── utils.py           # String/numeric utilities
│
├── id_mapping.py      # Gene/compound ID conversion
├── mol_data.py        # Data aggregation, simulation
│
├── kegg_api.py        # KEGG REST API
├── databases.py       # Reactome, MetaCyc downloaders
│
├── kgml_parser.py     # KEGG KGML (XML) parser
├── sbgn_parser.py     # SBGN-ML (XML) parser
│
├── color_mapping.py   # Colormaps, node coloring
├── node_mapping.py    # Map data onto nodes
│
├── rendering.py       # PNG/PDF renderers
├── svg_rendering.py   # SVG vector renderer
├── highlighting.py    # Post-hoc modifications
├── splines.py         # Bezier curve math
│
└── pathview.py        # Core orchestrator
pathview_cli.py # Command-line interface
requirements.txt # Dependencies
README.md # This file
Module Statistics:
- 15 modules | 3,506 lines of code
- Functional programming style
- Full type hints
- Comprehensive docstrings
API Reference
Core Function
pathview(
pathway_id: str,
gene_data: Optional[pl.DataFrame] = None,
cpd_data: Optional[pl.DataFrame] = None,
species: str = "hsa",
kegg_dir: Path = ".",
kegg_native: bool = True,
output_format: str = "png", # "png", "pdf", "svg"
gene_idtype: str = "ENTREZ",
cpd_idtype: str = "KEGG",
out_suffix: str = "pathview",
node_sum: str = "sum",
map_symbol: bool = True,
map_null: bool = True,
min_nnodes: int = 3,
new_signature: bool = True,
plot_col_key: bool = True,
# Color scale parameters
limit: dict = {"gene": 1.0, "cpd": 1.0},
bins: dict = {"gene": 10, "cpd": 10},
both_dirs: dict = {"gene": True, "cpd": True},
low: dict = {"gene": "green", "cpd": "blue"},
mid: dict = {"gene": "gray", "cpd": "gray"},
high: dict = {"gene": "red", "cpd": "yellow"},
na_col: str = "transparent",
) -> dict
Data Functions
sim_mol_data(mol_type="gene", species="hsa", n_mol=100, n_exp=1) -> pl.DataFrame
mol_sum(mol_data, id_map, sum_method="sum") -> pl.DataFrame
ID Mapping
id2eg(ids, category, org="Hs") -> pl.DataFrame
eg2id(eg_ids, category="SYMBOL", org="Hs") -> pl.DataFrame
cpd_id_map(in_ids, in_type, out_type="KEGG") -> pl.DataFrame
Parsing
# KEGG
parse_kgml(filepath) -> KGMLPathway
node_info(pathway) -> pl.DataFrame
# SBGN
parse_sbgn(filepath) -> SBGNPathway
sbgn_to_df(pathway) -> pl.DataFrame
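To show what a KGML parser has to extract, here is a standard-library sketch that reads `<entry>` elements and their `<graphics>` children into flat records. The XML sample is hand-written and heavily simplified, and `kgml_entries` is a hypothetical function; it does not reproduce what `parse_kgml` or `node_info` return.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written KGML fragment for illustration only.
KGML_SAMPLE = """<pathway name="path:hsa04110" org="hsa" number="04110">
  <entry id="1" name="hsa:1956" type="gene">
    <graphics name="EGFR" type="rectangle" x="100" y="50" width="46" height="17"/>
  </entry>
  <entry id="2" name="cpd:C00031" type="compound">
    <graphics name="C00031" type="circle" x="200" y="80" width="8" height="8"/>
  </entry>
</pathway>"""


def kgml_entries(xml_text):
    """Return one dict per <entry>, with its graphics position and size."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter("entry"):
        graphics = entry.find("graphics")
        if graphics is None:
            continue  # entries without graphics cannot be drawn
        rows.append({
            "id": entry.get("id"),
            "name": entry.get("name"),
            "type": entry.get("type"),
            "x": float(graphics.get("x")),
            "y": float(graphics.get("y")),
            "width": float(graphics.get("width")),
            "height": float(graphics.get("height")),
        })
    return rows


entries = kgml_entries(KGML_SAMPLE)
```

A list of records like this converts directly into a table with one row per node, which is the shape needed for joining expression data onto nodes.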
Database Downloads
download_kegg(pathway_id, species="hsa", kegg_dir=".") -> dict
download_reactome(pathway_id, output_dir=".") -> Path
download_metacyc(pathway_id, output_dir=".") -> Path
list_reactome_pathways(species="Homo sapiens") -> list[dict]
detect_database(pathway_id) -> str
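Database detection from an ID can be approximated with pattern matching on the ID formats used in this README (`R-HSA-109582`, `hsa04110`, `PWY-7210`). The sketch below is a guess at the idea, not the logic of `detect_database`; `guess_database` and its return labels are hypothetical, and the MetaCyc rule is deliberately incomplete because MetaCyc IDs do not all start with `PWY`.

```python
import re


def guess_database(pathway_id):
    """Guess the source database from the shape of a pathway ID."""
    if re.fullmatch(r"R-[A-Z]{3}-\d+", pathway_id):
        return "reactome"   # e.g. R-HSA-109582
    if re.fullmatch(r"([a-z]{2,4})?\d{5}", pathway_id):
        return "kegg"       # e.g. 04110 or hsa04110
    if pathway_id.startswith("PWY"):
        return "metacyc"    # e.g. PWY-7210
    return "unknown"


label = guess_database("R-HSA-109582")
```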
Highlighting
# API design (full implementation in progress)
result = pathview(...)
highlighted = result + highlight_nodes(["1956", "2099"], color="red")
highlighted.save("output.png")
Splines
cubic_bezier(p0, p1, p2, p3, n_points=50) -> np.ndarray
quadratic_bezier(p0, p1, p2, n_points=50) -> np.ndarray
catmull_rom_spline(points, n_points=50, alpha=0.5) -> np.ndarray
route_edge_spline(source, target, obstacles, mode="orthogonal") -> np.ndarray
bezier_to_svg_path(curve, close=False) -> str
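The math behind a cubic Bezier curve is short enough to show directly: each point is a weighted sum of the four control points using the Bernstein basis. The sketch below is a pure-Python illustration that returns a list of tuples; `cubic_bezier_points` and `points_to_svg_path` are hypothetical names and do not reproduce the NumPy-based functions listed above.

```python
def cubic_bezier_points(p0, p1, p2, p3, n_points=50):
    """Sample a cubic Bezier curve at n_points evenly spaced t values."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    controls = (p0, p1, p2, p3)
    points = []
    for i in range(n_points):
        t = i / (n_points - 1)
        u = 1.0 - t
        # Bernstein basis weights for the four control points.
        w = (u**3, 3 * u**2 * t, 3 * u * t**2, t**3)
        x = sum(wi * p[0] for wi, p in zip(w, controls))
        y = sum(wi * p[1] for wi, p in zip(w, controls))
        points.append((x, y))
    return points


def points_to_svg_path(points):
    """Join sampled points into an SVG path string of line segments."""
    head, *tail = points
    parts = [f"M {head[0]:.2f} {head[1]:.2f}"]
    parts += [f"L {x:.2f} {y:.2f}" for x, y in tail]
    return " ".join(parts)


curve = cubic_bezier_points((0, 0), (1, 2), (3, 2), (4, 0), n_points=5)
path = points_to_svg_path(curve)
```

The curve starts at `p0` and ends at `p3`; the inner control points `p1` and `p2` only pull the curve toward them, which is what produces the smooth edge routing.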
Performance
- KEGG pathways: ~2-5 seconds (download + render)
- SBGN pathways: ~3-8 seconds (more complex)
- Multi-condition: Runtime scales linearly with the number of conditions
- Batch processing: Pathways can be processed in parallel
Optimization tips:
- Cache downloaded files (automatic)
- Use output_format="svg" for faster rendering
- Disable color key for batch jobs: plot_col_key=False
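Parallel batch processing can be sketched with `concurrent.futures` from the standard library. This is an illustration only: `run_batch` and `fake_render` are hypothetical, and it is an assumption that the real `pathview()` call is safe to run from several threads. If rendering turns out not to be thread-safe, a process pool may be the better fit.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_batch(render, pathway_ids, max_workers=4):
    """Call render(pathway_id) for each ID in parallel; collect outcomes."""
    done, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(render, pw): pw for pw in pathway_ids}
        for future in as_completed(futures):
            pw = futures[future]
            try:
                done[pw] = future.result()
            except Exception as exc:  # keep going if one pathway fails
                failed[pw] = str(exc)
    return done, failed


def fake_render(pathway_id):
    """Stand-in for a real render call; fails on one ID for demonstration."""
    if pathway_id == "99999":
        raise ValueError("pathway not found")
    return f"hsa{pathway_id}.pathview.png"


done, failed = run_batch(fake_render, ["04110", "04010", "99999"])
```

Collecting failures in a separate dict mirrors the try/except pattern of the sequential batch example, so one bad pathway ID does not stop the run.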
Contributing
Contributions welcome! Areas for improvement:
- SBGN rendering: Improve glyph shape variety
- Edge routing: Implement A* pathfinding for splines
- Database integration: Add PANTHER, SMPDB auto-download
- Highlighting: Wire up image modification backend
- Performance: Parallel pathway processing
License
Creative Commons Attribution-NonCommercial (CC BY-NC 4.0). See the LICENSE file.
Citations:
If you are publishing results obtained using Pathview-Plus, please cite:
- Pre-Print Pathview-Plus: Figueroa III JL, Brouwer CR, White III RA. 2026. Pathview-plus: unlocking the metabolic pathways from cells to ecosystems. bioRxiv.
If you are using the R version, please cite:
- Original Pathview R: Luo, W., & Brouwer, C. 2013. Pathview: an R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics, 29(14), 1830-1831.
- Original SBGNview R: Shashikant, T., et al. 2022. SBGNview: Data analysis, integration and visualization on all pathways using SBGN. Bioinformatics, 38(11), 3006-3008.
Contributing to Pathview-plus
We welcome contributions from other experts that expand the features of Pathview-plus, in both the R and Python versions. Please contact us via support.
Support
- Issues: open an issue.
- Email: Dr. Richard Allen White III
Made with ❤️ for the pathway visualization community