A comprehensive tool for differential gene expression pathway enrichment analysis
DEG Pathway Enrichment Tool
A Python package for pathway enrichment analysis of differentially expressed genes (DEGs). It takes a table of differential expression results (gene names, log fold changes, and adjusted p-values) and identifies enriched biological pathways.
Features
- 🧬 Comprehensive Pathway Analysis: Supports multiple databases (KEGG, GO Biological Process, Reactome, MSigDB Hallmark)
- 📊 Rich Visualizations: Static and interactive plots with customizable DPI
- 🔍 Gene Family Analysis: Specialized analysis for keratin, claudin, and other gene families
- 📈 Multiple Plot Types: Bar plots, dot plots, volcano plots, and comprehensive summaries
- 📝 Detailed Reports: Markdown reports with analysis summaries
- 🎯 Flexible Thresholds: Customizable log fold change and p-value cutoffs
- 💻 Command-line Interface: Easy-to-use CLI for batch processing
Installation
From PyPI (Recommended)
pip install deg-pathway-enrichment-tool
From Source
git clone https://github.com/yourusername/deg-pathway-enrichment-tool.git
cd deg-pathway-enrichment-tool
pip install -e .
Quick Start
Command Line Usage
# Basic analysis
deg-pathway-analysis your_deg_data.csv
# Custom output directory and thresholds
deg-pathway-analysis your_deg_data.csv -o results/ --logfc-threshold 2.0 --pval-threshold 0.001
# High-resolution figures
deg-pathway-analysis your_deg_data.csv --dpi 1200
Python API Usage
from deg_pathway_enrichment_tool import DEGPathwayAnalyzer
# Initialize analyzer
analyzer = DEGPathwayAnalyzer(
    deg_file="your_deg_data.csv",
    output_dir="./results",
    logfc_threshold=1.5,
    pval_threshold=0.01,
    dpi=600,
)
# Run complete analysis
analyzer.run_complete_analysis()
Input Data Format
Your CSV file must contain the following columns:
| Column Name | Description | Example |
|---|---|---|
| `names` | Gene names/symbols | GAPDH, TP53, MYC |
| `logfoldchanges` | Log fold change values | 2.5, -1.8, 3.2 |
| `pvals_adj` | Adjusted p-values | 0.001, 0.05, 1e-10 |
Example CSV format:
names,logfoldchanges,pvals_adj
KRT7,8.72,1e-300
CLDN10,8.52,3.35e-197
TP53,-2.1,0.001
GAPDH,1.2,0.05
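Before running the tool, it can help to confirm that a file has the three required columns and that the numeric fields parse. The following is a minimal sketch using only the Python standard library; `validate_deg_csv` is a hypothetical helper written for illustration and is not part of the package's API.

```python
import csv
import io

# Column names taken from the table above.
REQUIRED_COLUMNS = ("names", "logfoldchanges", "pvals_adj")


def validate_deg_csv(handle):
    """Parse a DEG CSV, raising ValueError on missing columns or bad numbers.

    Hypothetical helper for illustration; not part of the package.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows = []
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        try:
            rows.append({
                "names": row["names"],
                "logfoldchanges": float(row["logfoldchanges"]),
                "pvals_adj": float(row["pvals_adj"]),
            })
        except (TypeError, ValueError) as exc:
            raise ValueError(f"line {line_no}: non-numeric value") from exc
    return rows


# The example file from this section, held in memory.
example = io.StringIO(
    "names,logfoldchanges,pvals_adj\n"
    "KRT7,8.72,1e-300\n"
    "CLDN10,8.52,3.35e-197\n"
    "TP53,-2.1,0.001\n"
    "GAPDH,1.2,0.05\n"
)
rows = validate_deg_csv(example)
print(len(rows), "rows parsed; first gene:", rows[0]["names"])
```

To check a file on disk, pass an open file object instead, for example `validate_deg_csv(open("your_deg_data.csv", newline=""))`.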
Output Files
The tool generates comprehensive results including:
Visualizations
- pathway_barplot.png - Bar plot of top enriched pathways
- pathway_dotplot.png - Dot plot showing pathway significance vs effect size
- interactive_pathway_barplot.html - Interactive pathway visualization
- keratin_expression.png - Keratin gene family analysis
- claudin_expression.png - Claudin gene family analysis
- comprehensive_summary.png - Multi-panel summary figure
Data Files
- pathway_enrichment_results.csv - Complete pathway enrichment results
- keratin_genes.csv - Keratin gene analysis results
- claudin_genes.csv - Claudin gene analysis results
- analysis_report.md - Comprehensive analysis report
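In batch runs it may be useful to check that every expected file was produced. The sketch below lists which of the files named in this section are absent from an output directory. It assumes the files are written directly into the output directory rather than into subfolders, which this README does not state, and `missing_outputs` is a hypothetical helper, not part of the package.

```python
import tempfile
from pathlib import Path

# File names as listed in this section.
EXPECTED_OUTPUTS = [
    "pathway_barplot.png",
    "pathway_dotplot.png",
    "interactive_pathway_barplot.html",
    "keratin_expression.png",
    "claudin_expression.png",
    "comprehensive_summary.png",
    "pathway_enrichment_results.csv",
    "keratin_genes.csv",
    "claudin_genes.csv",
    "analysis_report.md",
]


def missing_outputs(output_dir):
    """Return the expected file names that are not present in output_dir.

    Assumption: outputs sit at the top level of output_dir.
    """
    root = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]


# Demonstration on a temporary directory containing a single output file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pathway_barplot.png").touch()
    still_missing = missing_outputs(tmp)

print(f"{len(still_missing)} of {len(EXPECTED_OUTPUTS)} expected files are missing")
```

In practice, call `missing_outputs("./deg_analysis_results")` (or whatever directory was passed to `-o`) after a run.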
Command Line Options
deg-pathway-analysis --help
| Option | Description | Default |
|---|---|---|
| `input_file` | Path to CSV file containing DEG results | Required |
| `-o, --output-dir` | Output directory for results | ./deg_analysis_results |
| `--logfc-threshold` | Log fold change threshold for significance | 1.5 |
| `--pval-threshold` | Adjusted p-value threshold for significance | 0.01 |
| `--dpi` | DPI for saved figures | 600 |
| `--databases` | Pathway databases to use | KEGG, GO-BP, Reactome, MSigDB |
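To make the two threshold options concrete, here is a minimal sketch of how such cutoffs are commonly applied to a DEG table. It assumes a gene counts as significant when its absolute log fold change is at least `--logfc-threshold` and its adjusted p-value is strictly below `--pval-threshold`; the tool's exact comparison operators are not documented here, so treat the boundary behavior as an assumption. `split_significant` is a hypothetical function, not part of the package.

```python
def split_significant(rows, logfc_threshold=1.5, pval_threshold=0.01):
    """Split (name, logfc, padj) tuples into up- and down-regulated gene lists.

    Assumed rule (not confirmed by the tool's documentation):
    significant if abs(logfc) >= logfc_threshold and padj < pval_threshold.
    """
    up, down = [], []
    for name, logfc, padj in rows:
        if padj >= pval_threshold or abs(logfc) < logfc_threshold:
            continue  # fails at least one cutoff
        (up if logfc > 0 else down).append(name)
    return up, down


# Rows from the example CSV in the Input Data Format section.
rows = [
    ("KRT7", 8.72, 1e-300),
    ("CLDN10", 8.52, 3.35e-197),
    ("TP53", -2.1, 0.001),
    ("GAPDH", 1.2, 0.05),
]

# Defaults match the table above: logFC 1.5, adjusted p-value 0.01.
up, down = split_significant(rows)
print("up-regulated:", up)
print("down-regulated:", down)
```

With the default cutoffs, GAPDH is excluded on both criteria, while KRT7 and CLDN10 are retained as up-regulated and TP53 as down-regulated.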
Supported Pathway Databases
- KEGG_2021_Human: KEGG pathway database
- GO_Biological_Process_2021: Gene Ontology Biological Process
- Reactome_2022: Reactome pathway database
- MSigDB_Hallmark_2020: MSigDB Hallmark gene sets
Advanced Usage
Custom Database Selection
deg-pathway-analysis input.csv --databases KEGG_2021_Human GO_Biological_Process_2021
Python API - Step by Step
from deg_pathway_enrichment_tool import DEGPathwayAnalyzer
# Initialize
analyzer = DEGPathwayAnalyzer("data.csv", output_dir="results")
# Run individual steps
pathway_results = analyzer.run_pathway_enrichment()
family_results = analyzer.analyze_gene_families()
analyzer.create_pathway_plots()
analyzer.create_comprehensive_summary()
analyzer.generate_report()
Requirements
- Python ≥ 3.8
- pandas ≥ 2.0.0
- numpy ≥ 1.24.0
- matplotlib ≥ 3.7.0
- seaborn ≥ 0.12.0
- plotly ≥ 5.15.0
- gseapy ≥ 1.0.4
- scipy ≥ 1.10.0
Publishing to PyPI
1. Prepare Your Package
Ensure your package structure is correct and all files are in place.
2. Build the Package
cd deg-pathway-enrichment-tool
pip install build twine
python -m build
3. Upload to PyPI
# Test PyPI first (recommended)
python -m twine upload --repository testpypi dist/*
# Production PyPI
python -m twine upload dist/*
4. GitHub Integration
- Create a GitHub repository
- Push your code:
git init
git add .
git commit -m "Initial commit"
git branch -M main
git remote add origin https://github.com/yourusername/deg-pathway-enrichment-tool.git
git push -u origin main
- Set up GitHub Actions for automated PyPI publishing (optional):
  - Create .github/workflows/publish.yml
  - Add PyPI API token to GitHub secrets
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Citation
If you use this tool in your research, please cite:
DEG Pathway Enrichment Tool (2025).
Available at: https://github.com/yourusername/deg-pathway-enrichment-tool
Support
- 📖 Documentation: GitHub Wiki
- 🐛 Bug Reports: GitHub Issues
- 💬 Discussions: GitHub Discussions
Changelog
v1.0.0 (2025-07-30)
- Initial release
- Comprehensive pathway enrichment analysis
- Multiple visualization options
- Gene family analysis
- Command-line interface
- Python API