Biological pathway and gene set annotation toolkit
PathwayDB
A lightweight Python library for querying and storing biological pathway and gene set annotations from major databases.
Perfect for:
- 🧬 Gene set enrichment analysis (GSEA)
- 🔬 Pathway annotation and analysis
- 📊 Functional genomics workflows
- 🧪 Bioinformatics pipelines
- 📈 Integration with pandas/R for downstream analysis
Why PathwayDB?
- No Dependencies Hassle: Pure Python stdlib - no compilation, no conflicts, works everywhere
- Offline-First: Download once, query forever - perfect for HPC clusters without internet
- Fast: Millisecond queries on local SQLite databases
- DataFrame-Friendly: Export directly to pandas format for analysis (like clusterProfiler in R)
- Simple API: Intuitive methods that feel natural for bioinformaticians
- Well-Documented: Clear examples and comprehensive documentation
Table of Contents
- What's New
- Features
- Installation
- Quick Start
- DataFrame Export
- Database Information
- Advanced Usage
- Documentation
What's New in v0.2.0
🎉 Major update with game-changing features!
- 🔍 Search by Description: Filter pathways/terms by name instead of remembering IDs
  # KEGG
  cancer = kegg.filter(pathway_name='cancer')
  # GO
  dna_repair = go.filter(term_name='DNA repair')
  # MSigDB
  apoptosis = msigdb.filter(gene_set_name='apoptosis')
- ⚡ Instant GO Term Names: ~1.5 MB mapping bundled with package - no downloads needed!
  go.download_annotations(species='human')  # Term names included automatically!
- 💾 Centralized Caching: Download once, use across all projects
  go = GO.from_cache(species='human')  # Loads from shared cache
- 📊 Complete DataFrame Export: All databases support pandas-compatible export
  df_data = kegg.to_dataframe()    # GeneID, PATH, Annot
  df_data = go.to_dataframe()      # GeneID, TERM, Aspect, Evidence
  df_data = msigdb.to_dataframe()  # GeneID, GeneSet, Collection, Description
See CHANGELOG.md for complete details.
Features
- ✅ Multiple Database Support: KEGG, Gene Ontology (GO), and MSigDB
- ✅ Zero External Dependencies: Uses only Python standard library
- ✅ Description-Based Filtering: Search by pathway/term names, not just IDs
- ✅ Bundled GO Term Names: ~1.5 MB mapping included for instant term name access
- ✅ Local SQLite Storage: Download once, query offline forever
- ✅ DataFrame Export: Export to pandas-compatible format (like clusterProfiler)
- ✅ Smart Caching: HTTP response caching and centralized annotation cache
- ✅ Rate Limiting: Built-in rate limiting for respectful API usage
- ✅ Gene ID Conversion: Convert between Entrez, Symbol, Ensembl, and UniProt IDs
- ✅ Fast Queries: Millisecond-level queries on local databases
Installation
From Source
git clone https://github.com/guokai8/pathwaydb.git
cd pathwaydb
pip install -e .
From PyPI
pip install pathwaydb
Quick Start
KEGG Pathways
from pathwaydb import KEGG
# Initialize KEGG client with local storage
kegg = KEGG(species='hsa', storage_path='kegg_human.db')
# Download all pathway annotations (first time only)
# Automatically includes pathway hierarchy (Level1, Level2, Level3)!
kegg.download_annotations()
# Output: Downloaded 8,000+ pathway-gene annotations
# Downloading KEGG pathway hierarchy...
# ✓ Updated 354 pathways with hierarchy information
kegg.convert_ids_to_symbols()
# Query pathways for a specific gene
results = kegg.query_by_gene('TP53')
print(f"TP53 is in {len(results)} pathways")
# Output: TP53 is in 73 pathways
for pathway in results[:3]:
    print(f"  {pathway.pathway_id}: {pathway.pathway_name}")
# Output:
# hsa05200: Pathways in cancer
# hsa04115: p53 signaling pathway
# hsa04110: Cell cycle
# Filter by pathway name (case-insensitive substring match)
cancer_pathways = kegg.filter(pathway_name='cancer')
print(f"Found {len(cancer_pathways)} cancer-related annotations")
# Output: Found 2,389 cancer-related annotations
# Combine filters: specific gene + pathway name
tp53_cancer = kegg.filter(gene_symbols=['TP53'], pathway_name='cancer')
print(f"TP53 in {len(tp53_cancer)} cancer pathway annotations")
# Output: TP53 in 15 cancer pathway annotations
# Get database statistics
stats = kegg.stats()
print(stats)
# Output: {'total_annotations': 8234, 'unique_genes': 7894, 'unique_pathways': 354}
# Export to DataFrame format (includes hierarchy!)
df_data = kegg.to_dataframe()
# Returns: [{'GeneID': 'TP53', 'PATH': 'hsa05200', 'Annot': 'Pathways in cancer',
# 'Level1': 'Human Diseases', 'Level2': 'Cancer: overview', 'Level3': 'Pathways in cancer'}, ...]
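The case-insensitive substring filter used above maps naturally onto SQLite's `LIKE` operator. The following is a minimal sketch of that idea, not PathwayDB's actual implementation: the `annotations` table, its column names, and the `filter_annotations` helper are all hypothetical and exist only for illustration.

```python
import sqlite3

def filter_annotations(conn, pathway_name=None, gene_symbols=None):
    """Return (gene, pathway_id, pathway_name) rows matching every given filter."""
    clauses, params = [], []
    if pathway_name is not None:
        # SQLite's LIKE is case-insensitive for ASCII text by default.
        clauses.append("pathway_name LIKE ?")
        params.append(f"%{pathway_name}%")
    if gene_symbols:
        placeholders = ",".join("?" for _ in gene_symbols)
        clauses.append(f"gene_symbol IN ({placeholders})")
        params.extend(gene_symbols)
    where = ("WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = (
        "SELECT gene_symbol, pathway_id, pathway_name "
        f"FROM annotations {where} ORDER BY pathway_id, gene_symbol"
    )
    return conn.execute(sql, params).fetchall()

# Hypothetical schema and toy rows, held in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotations (gene_symbol TEXT, pathway_id TEXT, pathway_name TEXT)"
)
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?)",
    [
        ("TP53", "hsa05200", "Pathways in cancer"),
        ("TP53", "hsa04115", "p53 signaling pathway"),
        ("EGFR", "hsa05200", "Pathways in cancer"),
        ("EGFR", "hsa05212", "Pancreatic cancer"),
    ],
)

cancer_rows = filter_annotations(conn, pathway_name="CANCER")
tp53_cancer_rows = filter_annotations(conn, pathway_name="cancer", gene_symbols=["TP53"])
print(len(cancer_rows), len(tp53_cancer_rows))
```

Passing values as bound parameters rather than formatting them into the SQL string keeps the query safe when filter text comes from user input.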
KEGG Pathway Hierarchy
Pathway annotations include hierarchical classification from KEGG BRITE:
| Level | Description | Example |
|---|---|---|
| Level1 | Top-level category | Metabolism, Human Diseases, Cellular Processes |
| Level2 | Sub-category | Carbohydrate metabolism, Cancer, Cell growth |
| Level3 | Pathway name | Glycolysis, Pathways in cancer, Cell cycle |
# Access hierarchy in DataFrame export
import pandas as pd
df = pd.DataFrame(kegg.to_dataframe())
print(df[['GeneID', 'PATH', 'Level1', 'Level2']].head())
#   GeneID      PATH              Level1                 Level2
# 0   TP53  hsa05200      Human Diseases       Cancer: overview
# 1   TP53  hsa04115  Cellular Processes  Cell growth and death
# 2   TP53  hsa04110  Cellular Processes  Cell growth and death
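The hierarchy columns can also be summarized without pandas, since the export is described as a list of dictionaries. The sketch below assumes records shaped like the `to_dataframe()` output shown earlier; the sample rows and the `genes_by_level` helper are made up for illustration.

```python
from collections import defaultdict

# Toy records in the documented export shape (values are illustrative).
records = [
    {"GeneID": "TP53", "PATH": "hsa05200", "Level1": "Human Diseases",
     "Level2": "Cancer: overview"},
    {"GeneID": "TP53", "PATH": "hsa04110", "Level1": "Cellular Processes",
     "Level2": "Cell growth and death"},
    {"GeneID": "CDK1", "PATH": "hsa04110", "Level1": "Cellular Processes",
     "Level2": "Cell growth and death"},
]

def genes_by_level(records, level="Level1"):
    """Group unique gene symbols under each category of the chosen hierarchy level."""
    groups = defaultdict(set)
    for rec in records:
        groups[rec[level]].add(rec["GeneID"])
    return {category: sorted(genes) for category, genes in groups.items()}

level1_groups = genes_by_level(records, level="Level1")
print(level1_groups)
```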
Gene Ontology (GO)
PathwayDB offers multiple ways to build and use GO annotations:
Method 1: Download Fresh (Recommended for Production)
from pathwaydb import GO
# Initialize GO client with local storage
go = GO(storage_path='go_human.db')
# Download GO annotations (first time only)
# Term names are automatically populated!
go.download_annotations(species='human')
# Output: Downloading GO annotations for human...
# Populating names for 18,000+ GO terms...
# ✓ All GO term names populated successfully!
# Check database statistics
print(go.stats())
# {'total_annotations': 500000+, 'unique_genes': 20000+, 'unique_terms': 18000+}
Method 2: Use Centralized Cache (Share Across Projects)
from pathwaydb import GO
# Load from cache - downloads automatically if not cached
go = GO.from_cache(species='human')
# Uses ~/.pathwaydb_cache/go_annotations/go_human_cached.db
# Or manually download to cache first
from pathwaydb import download_to_cache, load_from_cache
download_to_cache(species='human') # Download once
db = load_from_cache(species='human') # Reuse in any project
Method 3: Auto-Detect Best Source
from pathwaydb import GO
# Automatically uses best available source:
# 1. Bundled package data (instant, if available)
# 2. User cache (~/.pathwaydb_cache/)
# 3. Download fresh (if nothing found)
go = GO.load(species='human')
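The lookup order described in Method 3 (bundled data, then user cache, then a fresh download) can be pictured as a simple chain of file checks. This is only a sketch of that ordering under stated assumptions: `resolve_source` and the file paths are hypothetical and do not reflect how `GO.load` is written.

```python
import tempfile
from pathlib import Path

def resolve_source(bundled_path, cache_path):
    """Pick the first available source: 'bundled', then 'cache', else 'download'."""
    if bundled_path is not None and Path(bundled_path).is_file():
        return "bundled"
    if Path(cache_path).is_file():
        return "cache"
    return "download"

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "go_human_cached.db"
    # Nothing exists yet, so a download would be needed.
    before = resolve_source(None, cache_file)
    # Simulate a previous download having populated the cache.
    cache_file.write_bytes(b"")
    after = resolve_source(None, cache_file)

print(before, after)
```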
Querying GO Annotations
# Query GO terms for a specific gene
annotations = go.query_by_gene('BRCA1')
print(f"BRCA1 has {len(annotations)} GO annotations")
# Output: BRCA1 has 156 GO annotations
# Term names are already available!
for ann in annotations[:3]:
    print(f"  {ann.go_id}: {ann.term_name} [{ann.evidence_code}]")
# Output:
# GO:0006281: DNA repair [IBA]
# GO:0006355: regulation of transcription, DNA-templated [TAS]
# GO:0005515: protein binding [IPI]
# Filter by term name (case-insensitive substring match)
dna_repair = go.filter(term_name='DNA repair')
apoptosis = go.filter(term_name='apoptosis')
print(f"Found {len(dna_repair)} DNA repair annotations")
# Filter by namespace (biological_process, molecular_function, cellular_component)
bp_terms = go.filter(namespace='biological_process')
print(f"Biological Process annotations: {len(bp_terms)}")
# Filter by evidence codes (experimental evidence only)
exp_annotations = go.filter(evidence_codes=['EXP', 'IDA', 'IPI', 'IMP'])
print(f"Experimental evidence: {len(exp_annotations)}")
# Combine filters: TP53 + term name + experimental evidence
tp53_dna_exp = go.filter(
    gene_symbols=['TP53'],
    term_name='DNA',
    evidence_codes=['EXP', 'IDA']
)
print(f"TP53 DNA-related (experimental): {len(tp53_dna_exp)}")
# Export to DataFrame format
df_data = go.to_dataframe()
# Returns: [{'GeneID': 'BRCA1', 'TERM': 'GO:0006281', 'Aspect': 'P', 'Evidence': 'IBA'}, ...]
Term Name Sources
GO term names are populated automatically from bundled package data (instant, no download needed!).
The package includes ~1.5 MB of pre-compiled GO term name mappings, so term names are available immediately after downloading annotations.
# Default behavior: uses bundled data (instant!)
go.download_annotations(species='human') # Term names included automatically
# Skip term names if you don't need them
go.download_annotations(species='human', fetch_term_names=False)
# Manually populate term names with different sources
go.populate_term_names(source='bundled') # Use bundled data only (default, instant)
go.populate_term_names(source='obo') # Download GO OBO file (~35MB)
go.populate_term_names(source='auto') # Try bundled > OBO > QuickGO API
go.populate_term_names(source='quickgo') # Use QuickGO API only (slow)
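For readers curious what the `obo` source involves, an OBO file is a plain-text format in which each term is a `[Term]` stanza with `id:` and `name:` lines. The sketch below extracts an ID-to-name mapping from such text. It is an illustration only: `parse_obo_term_names` is a hypothetical helper, not the parser PathwayDB uses, and the sample text is a tiny hand-written excerpt.

```python
def parse_obo_term_names(obo_text):
    """Map term IDs to names for every [Term] stanza in OBO-formatted text."""
    names = {}
    current_id = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas describe GO terms.
            in_term = line == "[Term]"
            current_id = None
        elif in_term and line.startswith("id: "):
            current_id = line[len("id: "):]
        elif in_term and line.startswith("name: ") and current_id:
            names[current_id] = line[len("name: "):]
    return names

sample = """format-version: 1.2

[Term]
id: GO:0006281
name: DNA repair
namespace: biological_process

[Term]
id: GO:0005515
name: protein binding
namespace: molecular_function

[Typedef]
id: part_of
name: part of
"""

term_names = parse_obo_term_names(sample)
print(term_names)
```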
MSigDB Gene Sets
from pathwaydb import MSigDB
# Initialize MSigDB client
msigdb = MSigDB(storage_path='msigdb.db')
# Download specific collections
msigdb.download_collection('H') # Hallmark gene sets
msigdb.download_collection('C2') # Curated gene sets (KEGG, Reactome, etc.)
# NEW: Filter by gene set name (case-insensitive substring match)
apoptosis_sets = msigdb.filter(gene_set_name='apoptosis')
print(f"Found {len(apoptosis_sets)} apoptosis gene sets")
# Output: Found 15 apoptosis gene sets
# Filter by description
immune_sets = msigdb.filter(description='immune')
print(f"Found {len(immune_sets)} immune-related gene sets")
# Filter by collection
hallmark_sets = msigdb.filter(collection='H')
print(f"Found {len(hallmark_sets)} Hallmark gene sets")
# Query gene sets containing specific genes
tp53_sets = msigdb.filter(gene_symbols=['TP53'])
print(f"TP53 in {len(tp53_sets)} gene sets")
# Combine filters
hallmark_interferon = msigdb.filter(
    collection='H',
    gene_set_name='interferon'
)
print(f"Hallmark interferon sets: {len(hallmark_interferon)}")
# Export to DataFrame format
df_data = msigdb.to_dataframe(collection='H')
# Returns: [{'GeneID': 'TP53', 'GeneSet': 'HALLMARK_APOPTOSIS', 'Collection': 'H', 'Description': '...'}, ...]
Gene ID Conversion
from pathwaydb import IDConverter
# Initialize converter
converter = IDConverter(species='human')
# Convert single ID
symbol = converter.entrez_to_symbol('7157') # Returns 'TP53'
# Batch conversion
entrez_ids = ['7157', '675', '4609']
symbols = converter.batch_convert(entrez_ids, from_type='entrez', to_type='symbol')
# Multiple ID types supported
ensembl_id = converter.symbol_to_ensembl('TP53')
uniprot_id = converter.symbol_to_uniprot('TP53')
Database Information
KEGG (Kyoto Encyclopedia of Genes and Genomes)
- Coverage: Thousands of organisms, with a few hundred pathways per well-annotated species (354 for human in the example above)
- Content: Metabolic, signaling, disease pathways
- Update: Manually curated, regularly updated
- Species codes: 'hsa' (human), 'mmu' (mouse), 'rno' (rat), etc.
GO (Gene Ontology)
- Coverage: Thousands of species
- Content: Biological processes, molecular functions, cellular components
- Update: Continuously updated by consortium
- Hierarchy: DAG structure with parent-child relationships
MSigDB (Molecular Signatures Database)
- Collections:
  - H: Hallmark gene sets (50 sets)
  - C1: Positional gene sets
  - C2: Curated gene sets (KEGG, Reactome, BioCarta, etc.)
  - C3: Regulatory target gene sets
  - C4: Computational gene sets
  - C5: Gene Ontology gene sets
  - C6: Oncogenic signatures
  - C7: Immunologic signatures
  - C8: Cell type signatures
Advanced Usage
Working with Local Databases
from pathwaydb.storage import KEGGAnnotationDB
# Load existing database
db = KEGGAnnotationDB('kegg_human.db')
# Query with filters - search by pathway name (case-insensitive substring match)
results = db.filter(pathway_name='cancer')
print(f"Found {len(results)} annotations in cancer-related pathways")
# Output: Found 2389 annotations in cancer-related pathways
# Combine multiple filters
cancer_tp53 = db.filter(pathway_name='cancer', gene_symbols=['TP53'])
print(f"TP53 in {len(cancer_tp53)} cancer pathways")
# Output: TP53 in 15 cancer pathways
# Other filter options
metabolism = db.filter(pathway_name='metabolism')
specific_genes = db.filter(gene_symbols=['TP53', 'BRCA1', 'EGFR'])
specific_pathways = db.filter(pathway_ids=['hsa04110', 'hsa04115'])
# Export to different formats
records = db.to_records() # List of dicts
gene_sets = db.to_gene_sets() # For enrichment tools
# Database statistics
stats = db.stats()
print(f"Total annotations: {stats['total_annotations']}")
print(f"Unique pathways: {stats['unique_pathways']}")
print(f"Unique genes: {stats['unique_genes']}")
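Many enrichment tools read gene sets in the tab-separated GMT format (set name, description, then member genes). The return shape of `to_gene_sets()` is not documented here, so the sketch below instead starts from records shaped like the KEGG DataFrame export. `records_to_gmt` is a hypothetical helper and the sample rows are illustrative.

```python
from collections import defaultdict

def records_to_gmt(records, id_key="PATH", name_key="Annot", gene_key="GeneID"):
    """Turn per-annotation records into GMT lines: id, description, genes."""
    members = defaultdict(set)
    names = {}
    for rec in records:
        set_id = rec[id_key]
        names.setdefault(set_id, rec.get(name_key, ""))
        members[set_id].add(rec[gene_key])
    return [
        "\t".join([set_id, names[set_id], *sorted(genes)])
        for set_id, genes in sorted(members.items())
    ]

records = [
    {"GeneID": "TP53", "PATH": "hsa04115", "Annot": "p53 signaling pathway"},
    {"GeneID": "CDKN1A", "PATH": "hsa04115", "Annot": "p53 signaling pathway"},
    {"GeneID": "TP53", "PATH": "hsa04110", "Annot": "Cell cycle"},
]

gmt_lines = records_to_gmt(records)
print("\n".join(gmt_lines))
```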
Centralized GO Caching (NEW in v0.2.0)
Download GO annotations once and reuse across all your projects:
from pathwaydb import GO
# Option 1: Load from cache (auto-downloads if missing)
go = GO.from_cache(species='human') # Uses ~/.pathwaydb_cache/go_annotations/
# Option 2: Smart load - auto-detects best source
# Tries: bundled package data > cache > download
go = GO.load(species='human')
# Option 3: Manually download to cache first
from pathwaydb.storage.go_db import download_to_cache
download_to_cache(species='human') # Download once
go = GO.from_cache(species='human') # Reuse in any project
See GO_CACHE_GUIDE.md for complete caching documentation.
Custom HTTP Caching
from pathwaydb import KEGG
# Use custom HTTP cache directory
kegg = KEGG(
    species='hsa',
    cache_dir='/path/to/custom/cache',
    storage_path='kegg.db'
)
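To illustrate what an HTTP response cache in a directory like `cache_dir` might do, here is a minimal sketch that stores each response body in a file named after a hash of the URL. Everything in it is an assumption for illustration: `cached_fetch`, the file layout, and the example URL are hypothetical, and the network call is replaced by an injected function so the example runs offline.

```python
import hashlib
import tempfile
from pathlib import Path

def cached_fetch(url, cache_dir, fetch):
    """Return the body for url, calling fetch(url) only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.txt"
    if path.exists():
        return path.read_text(encoding="utf-8")
    body = fetch(url)
    path.write_text(body, encoding="utf-8")
    return body

calls = []

def fake_fetch(url):
    """Stand-in for a real HTTP request; records how often it is invoked."""
    calls.append(url)
    return f"response for {url}"

with tempfile.TemporaryDirectory() as tmp:
    url = "https://example.org/list/pathway/hsa"
    first = cached_fetch(url, tmp, fake_fetch)
    second = cached_fetch(url, tmp, fake_fetch)

print(len(calls), first == second)
```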
Batch Operations
from pathwaydb import KEGG
kegg = KEGG(species='hsa', storage_path='kegg.db')
# Download and convert IDs in one step
kegg.download_annotations()
kegg.convert_ids_to_symbols() # Convert Entrez IDs to gene symbols
# Query multiple genes
genes = ['TP53', 'BRCA1', 'EGFR']
for gene in genes:
    pathways = kegg.query_by_gene(gene)
    print(f"{gene}: {len(pathways)} pathways")
DataFrame Export for Enrichment Analysis
NEW FEATURE: Export annotations in tabular format compatible with pandas DataFrame and enrichment tools (similar to clusterProfiler in R).
Direct Export from Connectors
from pathwaydb import KEGG, GO
import pandas as pd
# KEGG - Export to DataFrame format
kegg = KEGG(species='hsa', storage_path='kegg_human.db')
df_data = kegg.to_dataframe() # Get all annotations
# Convert to pandas DataFrame
df = pd.DataFrame(df_data)
print(df.head())
Output:
GeneID PATH Annot
0 A2M hsa04610 Complement and coagulation cascades
1 NAT1 hsa00232 Caffeine metabolism
2 NAT1 hsa00983 Drug metabolism - other enzymes
3 NAT1 hsa01100 Metabolic pathways
4 NAT2 hsa00232 Caffeine metabolism
DataFrame Format Specifications
KEGG DataFrame columns:
- GeneID: Gene symbol (e.g., 'TP53')
- PATH: Pathway ID (e.g., 'hsa04110')
- Annot: Pathway name/description
GO DataFrame columns:
- GeneID: Gene symbol (e.g., 'BRCA1')
- TERM: GO term ID (e.g., 'GO:0006281')
- Aspect: P (biological_process), F (molecular_function), C (cellular_component)
- Evidence: Evidence code (e.g., 'EXP', 'IDA', 'IEA')
MSigDB DataFrame columns:
- GeneID: Gene symbol (e.g., 'TP53')
- GeneSet: Gene set name (e.g., 'HALLMARK_APOPTOSIS')
- Collection: Collection code (e.g., 'H', 'C2')
- Description: Gene set description
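Because the export is described as a list of dictionaries with the columns above, it can be written to CSV with the standard library alone when pandas is not installed. The sketch below assumes that record shape; `records_to_csv` is a hypothetical helper and the two GO rows are illustrative.

```python
import csv
import io

# Toy records using the GO column names listed above.
records = [
    {"GeneID": "BRCA1", "TERM": "GO:0006281", "Aspect": "P", "Evidence": "IBA"},
    {"GeneID": "BRCA1", "TERM": "GO:0005515", "Aspect": "F", "Evidence": "IPI"},
]

def records_to_csv(records, handle):
    """Write records to a text handle as CSV and return the number of rows written."""
    if not records:
        return 0
    writer = csv.DictWriter(handle, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return len(records)

buffer = io.StringIO()
written = records_to_csv(records, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, the same function accepts a file opened with `open(path, "w", newline="")`.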
Analysis Examples with pandas
import pandas as pd
from pathwaydb import KEGG, GO, MSigDB
# Get KEGG annotations
kegg = KEGG(species='hsa', storage_path='kegg_human.db')
df = pd.DataFrame(kegg.to_dataframe())
# Save to CSV
df.to_csv('kegg_annotations.csv', index=False)
# Filter for specific gene
tp53_pathways = df[df['GeneID'] == 'TP53']
print(f"TP53 pathways: {len(tp53_pathways)}")
# Find all genes in cancer-related pathways
cancer_df = df[df['Annot'].str.contains('cancer', case=False)]
cancer_genes = cancer_df['GeneID'].unique()
print(f"Genes in cancer pathways: {len(cancer_genes)}")
# Get pathway sizes
pathway_sizes = df.groupby('PATH')['GeneID'].count()
print(pathway_sizes.head())
# GO annotations
go = GO(storage_path='go_human.db')
df_go = pd.DataFrame(go.to_dataframe())
# Filter biological processes only
bp_df = df_go[df_go['Aspect'] == 'P']
# Get genes with experimental evidence
exp_df = df_go[df_go['Evidence'].isin(['EXP', 'IDA', 'IPI', 'IMP'])]
print(f"Annotations with experimental evidence: {len(exp_df)}")
# Create gene-to-term mapping
gene_to_terms = df_go.groupby('GeneID')['TERM'].apply(list).to_dict()
# MSigDB gene sets
msigdb = MSigDB(storage_path='msigdb.db')
df_msigdb = pd.DataFrame(msigdb.to_dataframe(collection='H'))
# Find genes in specific gene sets
apoptosis_genes = df_msigdb[df_msigdb['GeneSet'].str.contains('APOPTOSIS', case=False)]
print(f"Genes in apoptosis gene sets: {len(apoptosis_genes)}")
# Get all gene sets for a specific gene
tp53_sets = df_msigdb[df_msigdb['GeneID'] == 'TP53']['GeneSet'].unique()
print(f"TP53 is in {len(tp53_sets)} gene sets")
Use with Enrichment Analysis Tools
# Prepare background gene set
all_genes = df['GeneID'].unique()
# Prepare pathway gene sets for enrichment
pathway_dict = df.groupby('PATH').apply(
    lambda x: {
        'genes': x['GeneID'].tolist(),
        'name': x['Annot'].iloc[0]
    }
).to_dict()
# Your gene list of interest
my_genes = ['TP53', 'BRCA1', 'EGFR', 'MYC', 'KRAS']
# Find enriched pathways (simple overlap example)
for pathway_id, info in pathway_dict.items():
    overlap = set(my_genes) & set(info['genes'])
    if overlap:
        print(f"{pathway_id}: {info['name']} - {len(overlap)} genes")
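The loop above only counts overlapping genes. A common next step is to ask how surprising that overlap is, using a one-sided hypergeometric test (over-representation analysis). The sketch below computes that p-value with the standard library; `hypergeom_pvalue` is a hypothetical helper, not part of PathwayDB, and the numbers are a toy example rather than real data.

```python
from math import comb

def hypergeom_pvalue(overlap, pathway_size, query_size, background_size):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a background of background_size genes, pathway_size of which
    belong to the pathway."""
    total = comb(background_size, query_size)
    upper = min(pathway_size, query_size)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 10 background genes, 4 in the pathway, 3 query genes, 2 overlapping.
p = hypergeom_pvalue(overlap=2, pathway_size=4, query_size=3, background_size=10)
print(round(p, 4))
```

When many pathways are tested at once, the resulting p-values should be corrected for multiple testing before interpretation.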
Architecture
PathwayDB follows a clean 3-layer architecture:
- Connectors Layer (pathwaydb/connectors/): API clients for external databases
- Storage Layer (pathwaydb/storage/): SQLite-backed local storage with query interfaces
- HTTP Layer (pathwaydb/http/): Centralized HTTP client with caching and rate limiting
Key Design Principles
- No external dependencies: Easier deployment, fewer conflicts
- Caching by default: Respectful of API servers, faster repeat queries
- Separation of concerns: Connectors and storage are independent
- Extensible: Easy to add new databases following existing patterns
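As an illustration of the rate-limiting idea mentioned for the HTTP layer, the sketch below enforces a minimum interval between consecutive requests. It is one simple strategy under stated assumptions, not a description of PathwayDB's internals: `MinIntervalLimiter` and `FakeClock` are hypothetical names, and the clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time

class MinIntervalLimiter:
    """Block so that successive wait() calls are at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now

class FakeClock:
    """Deterministic clock whose sleep() simply advances the time."""

    def __init__(self):
        self.now = 0.0
        self.slept = []

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.slept.append(seconds)
        self.now += seconds

fake = FakeClock()
limiter = MinIntervalLimiter(0.5, clock=fake.time, sleep=fake.sleep)
limiter.wait()   # first request: nothing to wait for
limiter.wait()   # immediate second request: sleeps for the full interval
fake.now += 2.0  # plenty of time passes
limiter.wait()   # third request: no sleep needed
print(fake.slept)
```

In real use the defaults (`time.monotonic` and `time.sleep`) apply, and `wait()` would be called just before each outgoing request.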
Performance
- Initial download: 1-5 minutes depending on database size
- Subsequent queries: Milliseconds (SQLite local queries)
- Memory footprint: Low (streaming downloads, efficient storage)
- Storage size:
- KEGG (human): ~8 MB
- MSigDB (all collections): ~77 MB
- GO (human): ~50 MB
Species Support
KEGG
Use organism codes: hsa (human), mmu (mouse), rno (rat), dme (fly), cel (worm), sce (yeast), etc.
GO (Gene Ontology)
Supported model organisms:
| Category | Species | Name |
|---|---|---|
| Mammals | human | Homo sapiens |
| | mouse | Mus musculus |
| | rat | Rattus norvegicus |
| | pig | Sus scrofa |
| | cow | Bos taurus |
| | dog | Canis familiaris |
| Birds | chicken | Gallus gallus |
| Fish | zebrafish | Danio rerio |
| Invertebrates | fly | Drosophila melanogaster |
| | worm | Caenorhabditis elegans |
| Plants | arabidopsis | Arabidopsis thaliana |
| Fungi | yeast | Saccharomyces cerevisiae |
from pathwaydb import GO, get_supported_species
# List all supported species
print(get_supported_species())
# ['arabidopsis', 'chicken', 'cow', 'dog', 'fly', 'human', 'mouse', 'pig', 'rat', 'worm', 'yeast', 'zebrafish']
# Download for any supported species
go = GO(storage_path='go_fly.db')
go.download_annotations(species='fly')
MSigDB
Use common names: human, mouse.
Development
Running Tests
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
# With coverage
pytest --cov=pathwaydb tests/
Code Formatting
# Format with black
black pathwaydb/
# Lint with flake8
flake8 pathwaydb/
# Type checking
mypy pathwaydb/
Documentation
Guides
Feature Guides:
- DATABASE_FILTERING_GUIDE.md - Complete filtering guide for all databases
- GO_TERM_NAME_GUIDE.md - GO term name filtering
- GO_CACHE_GUIDE.md - Centralized caching system
- GO_TERM_NAMES_PACKAGING.md - How bundled term names work
Developer Guides:
- CLAUDE.md - Architecture and development guidelines
- PACKAGING_GUIDE.md - Building and packaging instructions
- CONTRIBUTING.md - Contribution guidelines
API Reference
Main Classes:
- KEGG(species, storage_path, cache_dir) - KEGG pathway database client
- GO(storage_path, cache_dir) - Gene Ontology client
  - GO.from_cache(species) - Load from centralized cache
  - GO.load(species) - Auto-detect best source (bundled > cache > download)
- MSigDB(storage_path, cache_dir) - MSigDB gene sets client
- IDConverter(species, cache_path) - Gene ID converter
Key Methods:
- download_annotations() - Download and store annotations (auto-populates term names for GO)
- query_by_gene(gene) - Query annotations for a specific gene
- to_dataframe(limit) - Export to pandas-compatible format
- filter(**criteria) - Filter annotations by various criteria
  - KEGG: pathway_name, gene_symbols, pathway_ids, organism
  - GO: term_name, gene_symbols, go_ids, namespace, evidence_codes
  - MSigDB: gene_set_name, description, gene_symbols, collection
- stats() - Get database statistics
- populate_term_names() - Manually populate GO term names (uses bundled data)
Storage Classes:
- KEGGAnnotationDB(db_path) - Direct access to KEGG storage
- GOAnnotationDB(db_path) - Direct access to GO storage
Package Data Functions:
- load_go_term_names() - Load bundled GO term name mapping
- download_to_cache(species) - Download GO annotations to centralized cache
- load_from_cache(species) - Load GO annotations from cache
For detailed architecture and development guidelines, see CLAUDE.md.
Examples
See the examples/ directory for comprehensive usage examples:
- examples/quickstart.py - Basic usage for all databases
- examples/dataframe_export.py - DataFrame export and analysis
- examples/go_filter_examples.py - GO filtering examples
- test_go_cache.py - Centralized caching examples
- test_msigdb_filter.py - MSigDB filtering examples
Contributing
Contributions are welcome! Here are some ways to contribute:
- Add new database connectors (WikiPathways, STRING, DisGeNET, etc.)
- Improve documentation
- Add tests
- Report bugs
- Suggest features
See CONTRIBUTING.md for contribution guidelines and CLAUDE.md for detailed development guidelines.
License
MIT License - see LICENSE file for details
Citation
If you use PathwayDB in your research, please cite:
@software{pathwaydb,
title = {PathwayDB: A Lightweight Pathway Annotation Toolkit},
author = {Guo, Kai},
year = {2026},
url = {https://github.com/guokai8/pathwaydb}
}
Acknowledgments
- KEGG: Kanehisa, M. et al. (2023) KEGG for taxonomy-based analysis
- GO: Gene Ontology Consortium (2023) The Gene Ontology knowledgebase
- MSigDB: Liberzon, A. et al. (2015) The Molecular Signatures Database
- MyGene.info: Used for gene ID conversion
Support
- Issues: GitHub Issues
- Documentation: CLAUDE.md
- Email: guokai8@gmail.com
Roadmap
Version 0.2.0 (Released):
- ✅ Description-based filtering for KEGG, GO, and MSigDB
- ✅ Bundled GO term name mapping (~1.5 MB)
- ✅ Automatic term name population
- ✅ Centralized caching system
- ✅ Enhanced DataFrame export for all databases
- ✅ Unified filtering API across databases
Version 0.3.0 (Planned):
- WikiPathways connector
- Batch download utilities
- Comprehensive test suite
- Performance optimizations
Future Considerations (based on user feedback):
- STRING protein-protein interactions
- DisGeNET disease-gene associations
- Human Phenotype Ontology (HPO)
- Integration helpers for GSEA/enrichR
- REST API server mode
- Command-line interface (CLI)
Want to contribute? See CONTRIBUTING.md for how to add new database connectors!
Related Projects
- mygene - Gene annotation queries
- bioservices - Comprehensive bio web services
- gprofiler - Functional enrichment analysis
- gseapy - GSEA in Python
Made with ❤️ for the bioinformatics community
Details for the file pathwaydb-0.2.0.tar.gz.
File metadata
- Download URL: pathwaydb-0.2.0.tar.gz
- Upload date:
- Size: 455.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 34dd9d90a7b71890040a3101b2c60feb2b7ce9f2bde0888bb87ac61bb5466a3c |
| MD5 | 164af760646c2b6fbdd4c256f5c84e0c |
| BLAKE2b-256 | e79ef8e5c7c917410b896302076ddcfed73344f264fd9db45a6e96b3de615e4c |
File details
Details for the file pathwaydb-0.2.0-py3-none-any.whl.
File metadata
- Download URL: pathwaydb-0.2.0-py3-none-any.whl
- Upload date:
- Size: 450.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fe496c6c1fda80d139f1262a946435def0cd54d289698d6e2c2c7069d1763d03 |
| MD5 | 04851b654f89e3f8f1611eb00f414026 |
| BLAKE2b-256 | d266e18d2c9da5ad9e22c37f0bc1117113878bfba1346719da27a6096cd0f529 |