Immuno-Oncology Biological Research tools in Python
IOBRpy
IOBRpy is a command-line toolkit for bulk RNA-seq tumor microenvironment (TME) analysis. It wires together FASTQ QC, quantification (Salmon or STAR), matrix assembly, signature scoring, immune deconvolution, clustering, and ligand–receptor scoring.
Documentation
Complete documentation for IOBRpy is available at https://iobr.github.io/IOBRpy/.
Agent Bootstrap
If you want coding agents to discover iobrpy-cli without repeating it in every prompt, install the packaged agent integrations once:
# Codex: install the bundled global skill plus MCP registration
iobrpy-cli agent install --client codex
# Claude Code: install the managed global memory plus MCP registration
iobrpy-cli agent install --client claude-code
# Configure every supported client in one pass
iobrpy-cli agent install --client all
# Inspect what is already installed without changing anything
iobrpy-cli agent status
By default these commands print a short human-readable summary. Add --json when you want the full machine-readable payload for automation or another agent.
For path-driven agent work, start with:
iobrpy-cli map --path /path/to/data --json
This stage map tells the agent whether the directory is still raw FASTQ input, partially processed, or already ready for downstream TPM/TME analysis, so it can ask whether you want to continue, rerun the current stage, or rerun the full pipeline. The JSON output also includes a scenario card and roadmap position summary so an agent can explain, in plain language, what has already been done and what the next sensible choices are.
What gets installed:
- Codex: a global iobrpy-fastpath skill under ~/.codex/skills/ plus an MCP server entry in ~/.codex/config.toml.
- Claude Code: a managed global memory import in ~/.claude/CLAUDE.md, the managed memory file at ~/.claude/iobrpy/CLAUDE.md, and a user-scoped MCP server via claude mcp add ....
The registered server launches iobrpy-cli-mcp through the current Python environment, so agents can call native iobrpy workflows as tools instead of guessing from source files.
Installation
Quick install
# Method 1 : PyPI
pip install iobrpy
# Method 2 : Conda (from bioconda, with conda-forge also enabled)
conda install -c conda-forge -c bioconda iobrpy=0.1.8
# Method 3 : Docker
docker pull hhn123123/iobrpy:latest
PyPI
# Creating a virtual environment is recommended
conda create -n iobrpy python=3.11 -y
conda activate iobrpy
# Update pip
python -m pip install --upgrade pip
# Install iobrpy
pip install iobrpy
# Install the external tools fastp, salmon, STAR, and TRUST4
# Recommended: use mamba for faster solves (if available)
mamba install -y -c conda-forge -c bioconda \
fastp \
salmon \
star \
trust4
# If you don't have mamba, use conda instead
conda install -y -c conda-forge -c bioconda \
fastp \
salmon \
star \
trust4
Conda
Prerequisite (Conda): Please install Miniconda or Anaconda first. We recommend Miniconda.
# Creating a virtual environment is recommended
conda create -n iobrpy python=3.11 -y
conda activate iobrpy
# Install iobrpy 0.1.8 from bioconda (with conda-forge also enabled)
# Recommended: use mamba for faster solves (if available)
mamba install -y -c conda-forge -c bioconda iobrpy=0.1.8
# If you don't have mamba, use conda instead
conda install -y -c conda-forge -c bioconda iobrpy=0.1.8
Docker
Docker Hub website: Docker Hub
# Option 1: Pull the latest image from Docker Hub
docker pull hhn123123/iobrpy:latest
# Option 2: Offline install (from GitHub Release)
# 1) Download iobrpy.tar.gz from https://github.com/IOBR/IOBRpy/releases/tag/v1.0.0
# 2) Change to the directory where the archive is saved and load the image
cd /path/to/directory/containing/archive
docker load -i iobrpy.tar.gz
Features
End-to-End Pipeline Runner
- runall: A single command that wires the full Salmon or STAR pipeline end to end and writes the standardized layout. The pipeline creates the following directories, in order: 01-qc/, 02-salmon/ or 02-star/, 03-tpm/, 04-signatures/, 05-tme/, and 06-LR_cal/.
All-in-one TME profiling
- tme_profile: A single command that takes a TPM matrix (genes × samples), performs signature scoring, runs six immune deconvolution methods, merges their outputs, and computes ligand–receptor scores, using the functions calculate_sig_score, cibersort, IPS, estimate, mcpcounter, quantiseq, epic, and LR_cal.
Preprocessing
- fastq_qc: Parallel FASTQ QC and trimming via fastp, with per-sample HTML/JSON reports and an optional MultiQC summary report under 01-qc/multiqc_report/. Resume-friendly; prints output paths first.
Salmon submodule (quantification, merge, and TPM)
- batch_salmon: Batch salmon quant on paired-end FASTQs; safe R1/R2 inference; per-sample quant.sf; progress and preflight checks (salmon version, index meta).
- merge_salmon: Recursively collect per-sample quant.sf files and produce two matrices: TPM and NumReads.
- prepare_salmon: Clean up Salmon outputs into a TPM matrix; strip version suffixes; keep symbol/ENSG/ENST identifiers.
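The version-suffix stripping mentioned for prepare_salmon can be pictured with a minimal sketch. This is an illustration only, not IOBRpy's own code: the function name strip_version, the regular expression, and the example identifiers are assumptions chosen for the example.

```python
import re

# Assumed format: an Ensembl-style ID followed by ".<digits>" as its version.
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(feature_id: str) -> str:
    """Return the ID without a trailing ".<digits>" suffix, if one is present."""
    return _VERSION_SUFFIX.sub("", feature_id)


# Hypothetical identifiers used purely for illustration.
ids = ["ENSG00000141510.17", "ENST00000269305.9", "TP53"]
cleaned = [strip_version(i) for i in ids]
print(cleaned)
```

Identifiers without a numeric version suffix, such as gene symbols, pass through unchanged.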
STAR submodule (alignment, counts, and TPM)
- batch_star_count: Batch STAR alignment with --quantMode GeneCounts, sorted BAM + _ReadsPerGene.out.tab; resume-friendly summary.
- merge_star_count: Merge multiple _ReadsPerGene.out.tab files into one wide count matrix.
- count2tpm: Convert counts to TPM (supports Ensembl/Entrez/Symbol/MGI; optional effective length CSV).
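For readers unfamiliar with the counts-to-TPM conversion, the standard formula divides each count by the gene's effective length and rescales the result so that each sample sums to one million. The sketch below applies that textbook definition to a single sample; it is not taken from count2tpm itself, and the gene names and lengths are made up for the example.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert one sample's gene counts to TPM.

    counts: dict gene -> read count
    lengths_bp: dict gene -> effective length in base pairs (assumed known)
    """
    # Length-normalised rate per gene: reads per base pair.
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        # No reads at all: return zeros rather than dividing by zero.
        return {gene: 0.0 for gene in counts}
    # Rescale so the values of the sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical counts and lengths for three genes in one sample.
sample_counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 0}
gene_lengths = {"GENE_A": 1000, "GENE_B": 3000, "GENE_C": 500}
tpm = counts_to_tpm(sample_counts, gene_lengths)
print(tpm)
```

GENE_A and GENE_B end up with equal TPM because their counts are proportional to their lengths, which is the effect length normalisation is meant to have.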
Expression Annotation & Mouse to Human Mapping & log2(x+1) (Optional)
- anno_eset: Harmonize/annotate an expression matrix (choose symbol/probe columns; deduplicate; aggregation method).
- mouse2human_eset: Convert mouse gene symbols to human gene symbols. Supports two modes: matrix mode (rows = genes) or table mode (input contains a symbol column).
- log2_eset: Apply log2(x+1) to a genes × samples expression matrix.
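The log2(x+1) transform named above is simple enough to show directly. The following sketch uses plain dictionaries as a stand-in for a genes × samples matrix; the data structure and the function name log2_plus_one are assumptions for illustration and say nothing about how log2_eset is implemented.

```python
import math


def log2_plus_one(matrix):
    """Apply log2(x + 1) to every value of a genes x samples matrix.

    matrix: dict gene -> dict sample -> non-negative expression value
    """
    return {
        gene: {sample: math.log2(value + 1.0) for sample, value in row.items()}
        for gene, row in matrix.items()
    }


# Hypothetical TPM values for two genes and two samples.
tpm = {
    "GENE_A": {"S1": 0.0, "S2": 3.0},
    "GENE_B": {"S1": 7.0, "S2": 1023.0},
}
logged = log2_plus_one(tpm)
print(logged)
```

Adding 1 before taking the logarithm keeps zero expression values at zero instead of producing negative infinity.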
Pathway / signature scoring
- calculate_sig_score: Sample-level signature scores via pca, zscore, ssgsea, or integration. Supports the following signature groups (space- or comma-separated), or all to merge them:
  - go_bp, go_cc, go_mf
  - signature_collection, signature_tme, signature_sc, signature_tumor, signature_metabolism
  - kegg, hallmark, reactome
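To give a feel for what a z-score signature score can look like, here is a minimal sketch of one common definition: standardise each gene across samples, then average the z-scores of the signature's genes per sample. Whether the zscore method of calculate_sig_score matches this definition exactly is not confirmed by this document; the function name, gene names, and values are all hypothetical.

```python
import statistics


def zscore_signature(expr, signature_genes):
    """Mean per-gene z-score over the signature genes, one value per sample.

    expr: dict gene -> list of expression values (same sample order for all genes)
    signature_genes: gene names; genes missing from expr are skipped
    """
    genes = [g for g in signature_genes if g in expr]
    if not genes:
        raise ValueError("no signature genes found in the expression matrix")
    n_samples = len(expr[genes[0]])
    totals = [0.0] * n_samples
    used = 0
    for gene in genes:
        values = expr[gene]
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        if sd == 0:
            # A constant gene cannot be standardised; leave it out.
            continue
        used += 1
        for i, value in enumerate(values):
            totals[i] += (value - mean) / sd
    if used == 0:
        raise ValueError("all signature genes are constant across samples")
    return [total / used for total in totals]


# Hypothetical expression values for three samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],  # constant, skipped
}
scores = zscore_signature(expr, ["GENE_A", "GENE_B", "GENE_C", "GENE_X"])
print(scores)
```

Skipping constant and missing genes is a design choice of this sketch; a real implementation may instead warn or fail.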
Immune deconvolution and scoring
- cibersort: CIBERSORT wrapper/implementation with permutations, quantile normalization, absolute mode.
- quantiseq: quanTIseq deconvolution with lsei or robust norms (hampel, huber, bisquare); tumor-gene filtering; mRNA scaling.
- epic: EPIC cell fractions using TRef/BRef references.
- estimate: ESTIMATE immune/stromal/tumor purity scores.
- mcpcounter: MCPcounter infiltration scores.
- IPS: Immunophenoscore (AZ/SC/CP/EC + total).
- deside: Deep learning-based deconvolution (requires pre-downloaded model; supports pathway-masked mode via KEGG/Reactome GMTs).
Clustering / decomposition
- tme_cluster: k-means with automatic k via KL index (Hartigan–Wong), feature selection and standardization.
- nmf: NMF-based clustering (auto-selects k; excludes k=2) with PCA plot and top features.
Ligand–receptor
- LR_cal: Ligand–receptor interaction scoring using cancer-type-specific networks.
Input Requirements
- FASTQ layout: paired-end by default. Filenames end with *_1.fastq.gz / *_2.fastq.gz (configurable via --suffix1).
- Expression matrix orientation: genes × samples by default.
- Output file delimiters: automatically inferred from the file extension; .csv and .tsv/.txt are recommended.
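The paired-end naming convention above can be illustrated with a short sketch that groups filenames into R1/R2 pairs by suffix. It is not IOBRpy's pairing logic; the function name pair_fastqs is hypothetical, and the suffix2 parameter is an assumption, since the text names only --suffix1.

```python
def pair_fastqs(filenames, suffix1="_1.fastq.gz", suffix2="_2.fastq.gz"):
    """Group filenames into {sample: (r1, r2)} and list the unpaired reads.

    suffix2 is a hypothetical parameter; the README documents only --suffix1.
    """
    r1 = {f[: -len(suffix1)]: f for f in filenames if f.endswith(suffix1)}
    r2 = {f[: -len(suffix2)]: f for f in filenames if f.endswith(suffix2)}
    # A sample is usable only when both mates are present.
    pairs = {s: (r1[s], r2[s]) for s in sorted(r1.keys() & r2.keys())}
    unpaired = sorted(
        [r1[s] for s in r1.keys() - r2.keys()]
        + [r2[s] for s in r2.keys() - r1.keys()]
    )
    return pairs, unpaired


# Hypothetical directory listing.
files = [
    "tumor01_1.fastq.gz",
    "tumor01_2.fastq.gz",
    "tumor02_1.fastq.gz",  # mate is missing
    "notes.txt",           # ignored: matches neither suffix
]
pairs, unpaired = pair_fastqs(files)
print(pairs)
print(unpaired)
```

Reporting unpaired files separately makes a missing mate visible before any quantification starts.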
Command‑line usage
From FASTQ to TME - runall
How runall passes options
runall defines a small set of top-level options (e.g., --mode/--outdir/--fastq/--threads/--batch_size). Any unrecognized options are forwarded to the corresponding sub-steps. This keeps runall flexible as sub-commands evolve.
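The forwarding behaviour described above resembles what Python's argparse offers through parse_known_args, which returns unrecognized arguments instead of rejecting them. The sketch below assumes that mechanism; the source does not state how runall implements forwarding, and the parser name, defaults, and helper function are hypothetical.

```python
import argparse


def split_runall_args(argv):
    """Split argv into top-level options and options left for sub-steps."""
    # allow_abbrev=False keeps argparse from treating an unknown flag as an
    # abbreviation of a known one.
    parser = argparse.ArgumentParser(prog="runall-sketch", allow_abbrev=False)
    parser.add_argument("--mode", choices=["salmon", "star"], required=True)
    parser.add_argument("--outdir", required=True)
    parser.add_argument("--fastq", required=True)
    parser.add_argument("--threads", type=int, default=1)     # assumed default
    parser.add_argument("--batch_size", type=int, default=1)  # assumed default
    # Unknown flags and their values are returned in the second element.
    top_level, forwarded = parser.parse_known_args(argv)
    return top_level, forwarded


top, extra = split_runall_args([
    "--mode", "salmon",
    "--outdir", "out",
    "--fastq", "fq",
    "--threads", "8",
    "--index", "/path/to/salmon/index",
    "--project", "MyProj",
])
print(top.mode, top.threads)
print(extra)
```

In this sketch --index and --project are not declared at the top level, so they land in the forwarded list, where a sub-step parser could consume them.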
Below are two fully wired workflows handled by iobrpy runall.
Salmon mode
iobrpy runall \
--mode salmon \
--outdir "/path/to/outdir" \
--fastq "/path/to/fastq" \
--threads 8 \
--batch_size 1 \
--index "/path/to/salmon/index" \
--project MyProj
STAR mode
iobrpy runall \
--mode star \
--outdir "/path/to/outdir" \
--fastq "/path/to/fastq" \
--threads 8 \
--batch_size 1 \
--index "/path/to/star/index" \
--project MyProj
Option legend for the runall examples
Common options
| Flag | Purpose |
|---|---|
| --mode {salmon / star} | Select backend (Salmon quant vs. STAR align+count) |
| --outdir <DIR> | Root output directory (creates the standardized layout) |
| --fastq <DIR> | Raw FASTQ directory |
| --index <DIR> | Salmon mode: path to the Salmon index; STAR mode: path to the STAR index |
| --project <STR> | Prefix for merged outputs |
| --threads <INT> / --batch_size <INT> | Global concurrency/batching |
Expected layout
# Salmon mode:
/path/to/outdir
|-- 01-qc
| |-- <sample>_1.fastq.gz
| |-- <sample>_2.fastq.gz
| |-- <sample>_fastp.html
| |-- <sample>_fastp.json
| |-- <sample>.task.complete
| `-- multiqc_report
| `-- multiqc_fastp_report.html
|-- 02-salmon
| |-- <sample>
| | `-- quant.sf
| |-- MyProj_salmon_count.tsv.gz
| `-- MyProj_salmon_tpm.tsv.gz
|-- 03-tpm
| |-- prepare_salmon.csv
| `-- tpm_matrix.csv
|-- 04-signatures
| `-- calculate_sig_score.csv
|-- 05-tme
| |-- cibersort_results.csv
| |-- epic_results.csv
| |-- quantiseq_results.csv
| |-- IPS_results.csv
| |-- estimate_results.csv
| |-- mcpcounter_results.csv
| `-- deconvo_merged.csv
`-- 06-LR_cal
`-- lr_cal.csv
# STAR mode:
/path/to/outdir
|-- 01-qc
| |-- <sample>_1.fastq.gz
| |-- <sample>_2.fastq.gz
| |-- <sample>_fastp.html
| |-- <sample>_fastp.json
| |-- <sample>.task.complete
| `-- multiqc_report
| `-- multiqc_fastp_report.html
|-- 02-star
| |-- <sample>/
| |-- <sample>__STARgenome/
| |-- <sample>__STARpass1/
| |-- <sample>_STARtmp/
| |-- <sample>_Aligned.sortedByCoord.out.bam
| |-- <sample>_Log.final.out
| |-- <sample>_Log.out
| |-- <sample>_Log.progress.out
| |-- <sample>_ReadsPerGene.out.tab
| |-- <sample>_SJ.out.tab
| |-- <sample>.task.complete
| |-- .batch_star_count.done
| |-- .merge_star_count.done
| `-- MyProj.STAR.count.tsv.gz
|-- 03-tpm
| |-- count2tpm.csv
| `-- tpm_matrix.csv
|-- 04-signatures
| `-- calculate_sig_score.csv
|-- 05-tme
| |-- cibersort_results.csv
| |-- epic_results.csv
| |-- quantiseq_results.csv
| |-- IPS_results.csv
| |-- estimate_results.csv
| |-- mcpcounter_results.csv
| `-- deconvo_merged.csv
`-- 06-LR_cal
`-- lr_cal.csv
Output Reference
Standard layout (produced by iobrpy runall)
- 01-qc/: fastp outputs; a resume flag .fastq_qc.done is written when the step completes.
- 02-salmon/ or 02-star/: quantification/alignment + merged matrices; resume flags like .batch_salmon.done, .merge_salmon.done, or .merge_star_count.done.
- 03-tpm/: unified TPM matrix tpm_matrix.csv. For Salmon mode it comes from prepare_salmon; for STAR mode it comes from count2tpm.
- 04-signatures/: signature scoring results (file: calculate_sig_score.csv).
- 05-tme/: deconvolution outputs from multiple methods + deconvo_merged.csv.
- 06-LR_cal/: ligand–receptor results lr_cal.csv.
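Because the layout is standardized, a script can get a rough idea of how far a run has progressed by checking which stage directories exist. The following is a minimal sketch under that assumption; it is unrelated to the iobrpy-cli map command, the helper name present_stages is hypothetical, and placing .fastq_qc.done inside 01-qc/ is an assumption made for the demonstration.

```python
import tempfile
from pathlib import Path

# Directory names taken from the standardized layout described above.
STAGE_DIRS = [
    "01-qc", "02-salmon", "02-star", "03-tpm",
    "04-signatures", "05-tme", "06-LR_cal",
]


def present_stages(outdir):
    """Return the standardized stage directories that exist under outdir."""
    root = Path(outdir)
    return [name for name in STAGE_DIRS if (root / name).is_dir()]


# Build a throwaway directory that mimics a partially completed Salmon run.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("01-qc", "02-salmon", "03-tpm"):
        (Path(tmp) / name).mkdir()
    # Assumed location of the QC resume flag.
    (Path(tmp) / "01-qc" / ".fastq_qc.done").touch()
    stages = present_stages(tmp)
    qc_done = (Path(tmp) / "01-qc" / ".fastq_qc.done").exists()

print(stages, qc_done)
```

A directory existing does not prove the step finished, which is why checking the resume flag as well gives a more reliable picture.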
Salmon mode (02-salmon/)
- Per-sample Salmon folders containing quant.sf (from batch_salmon). A .batch_salmon.done flag is written after completion.
- Merged matrices (from merge_salmon); a .merge_salmon.done flag is written after completion:
  - <PROJECT>_salmon_tpm.tsv[.gz]
  - <PROJECT>_salmon_count.tsv[.gz]
- 03-tpm/prepare_salmon.csv: cleaned genes × samples TPM matrix produced by prepare_salmon (default --return_feature symbol unless overridden).
- 03-tpm/tpm_matrix.csv: log2(x+1) matrix produced by log2_eset from prepare_salmon.csv.
STAR mode (02-star/)
- Per-sample STAR outputs (BAM, logs, *_ReadsPerGene.out.tab, etc.).
- Merged counts (from merge_star_count): <PROJECT>.STAR.count.tsv.gz. A .merge_star_count.done flag is written after completion.
- 03-tpm/count2tpm.csv: TPM matrix produced by count2tpm from the merged STAR ReadsPerGene count matrix.
- 03-tpm/tpm_matrix.csv: log2(x+1) matrix produced by log2_eset from count2tpm.csv.
Signatures (04-signatures/)
- calculate_sig_score.csv: per-sample pathway/signature scores. Columns correspond to the selected signature set and method (integration, pca, zscore, or ssgsea).
Deconvolution (05-tme/)
Each method writes a single table named <method>_results.csv:
- cibersort_results.csv: columns suffixed with _CIBERSORT. Note whether --perm and --QN were used.
- quantiseq_results.csv: quanTIseq fractions. Document the chosen --method {lsei|hampel|huber|bisquare} and flags like --arrays, --tumor, --scale_mrna, --signame.
- epic_results.csv: EPIC fractions; record the reference profile used (--reference {TRef|BRef|both}).
- estimate_results.csv: ESTIMATE immune/stromal/purity scores; columns suffixed _estimate.
- mcpcounter_results.csv: MCPcounter scores; columns suffixed _MCPcounter.
- IPS_results.csv: IPS sub-scores and total score.
Merged table
- deconvo_merged.csv: produced by runall after all deconvolution methods finish; normalizes the sample index to a column named ID and outer-joins by sample ID across methods.
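The outer join by sample ID can be sketched with plain dictionaries. This is a conceptual illustration, not the code that writes deconvo_merged.csv: the function name, the input structure, and the column names (which merely reuse the documented suffixes) are assumptions.

```python
def outer_join_by_id(tables):
    """Outer-join per-method result tables on sample ID.

    tables: dict method -> dict sample_id -> dict column -> value
    Returns a list of row dicts with an "ID" column; gaps are filled with None.
    """
    all_ids = sorted({sid for table in tables.values() for sid in table})
    # Collect every column once, in first-seen order.
    columns = []
    for table in tables.values():
        for row in table.values():
            for column in row:
                if column not in columns:
                    columns.append(column)
    merged = []
    for sid in all_ids:
        row = {"ID": sid, **{column: None for column in columns}}
        for table in tables.values():
            row.update(table.get(sid, {}))
        merged.append(row)
    return merged


# Hypothetical results: S1 is missing from ESTIMATE, S3 from CIBERSORT.
tables = {
    "cibersort": {
        "S1": {"B_cells_CIBERSORT": 0.10},
        "S2": {"B_cells_CIBERSORT": 0.20},
    },
    "estimate": {
        "S2": {"ImmuneScore_estimate": 150.0},
        "S3": {"ImmuneScore_estimate": -20.0},
    },
}
merged = outer_join_by_id(tables)
for row in merged:
    print(row)
```

An outer join keeps every sample that appears in any method's output, so a sample that failed in one method still shows up with empty cells rather than disappearing.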
Ligand–receptor (06-LR_cal/)
- lr_cal.csv: ligand–receptor scoring table from LR_cal. Record the --data_type {count|tpm} and the --id_type you used.
Contact / Support
- Issues: https://github.com/IOBR/IOBRpy/issues
- Maintainers: [ Haonan Huang ] (email = 2905611068@qq.com); [ Dongqiang Zeng ] (email = interlaken@smu.edu.cn)