Mitochondrial ribosome profiling analysis pipeline.
MitoRiboPy
MitoRiboPy is a package for mitochondrial ribosome profiling analysis. It runs a package-native pipeline from BED inputs through offset selection, translation-profile analysis, codon usage, coverage-profile plotting, and optional downstream modules such as structure-density export, codon correlation, and RNA-seq integration.
Highlights
- Standalone package CLI with no runtime dependency on legacy pipeline scripts
- Built-in human and yeast reference data loaded from packaged CSV and JSON files
- End-specific offset selection with separate 5' and 3' bounds
- P-site and A-site workflows with explicit offset-picking behavior
- In-memory BED filtering with no duplicated filtered BED output files
- Persistent per-run logging in <output>/mitoribopy.log
- Custom organism support through user-supplied annotation CSV and codon-table JSON files
- Bicistronic transcript handling for ATP8/ATP6 and ND4L/ND4 with configurable baseline sequence IDs
Installation
From the repository root:
python -m pip install -e .
For development and tests:
python -m pip install -e ".[dev]"
Then confirm the CLI is available:
mitoribopy --help
If you prefer not to install the package yet:
PYTHONPATH=src python -m mitoribopy --help
Quick Start
Minimal example:
mitoribopy \
-s h \
-f <reference.fa> \
--directory <ribo_bed_dir> \
-rpf 29 34 \
--output <results_dir>
Compatibility wrapper:
python main.py --help
Built-In References
MitoRiboPy ships with packaged reference data for:
- Human mitochondrial translation using the vertebrate_mitochondrial codon table
- Yeast mitochondrial translation using the yeast_mitochondrial codon table
Built-in annotation tables are stored as CSV and built-in codon tables are stored as JSON under src/mitoribopy/data.
For bicistronic transcript regions:
- Titles stay consistent as ATP8/ATP6 and ND4L/ND4
- The default sequence baselines are ATP6 and ND4
- You can switch them with --atp8_atp6_baseline ATP8|ATP6 and --nd4l_nd4_baseline ND4L|ND4
Legacy FASTA/BED identifiers such as ATP86 and ND4L4 are still recognized through built-in aliases.
Custom Organisms
Custom organisms are supported through:
- --annotation_file
- --codon_tables_file
- --codon_table_name
- --start_codons
For --strain custom, provide an explicit RPF range as well:
mitoribopy \
-s custom \
-f <reference.fa> \
--directory <ribo_bed_dir> \
-rpf 28 34 \
--annotation_file examples/custom_reference/annotation_template.csv \
--codon_tables_file examples/custom_reference/codon_tables_template.json \
--codon_table_name custom_example \
--start_codons ATG GTG \
--output <results_dir>
Example templates are included here:
- examples/custom_reference/annotation_template.csv
- examples/custom_reference/codon_tables_template.json
- examples/custom_reference/README.md
CLI Parameters
Required parameters
-f, --fasta: reference FASTA
Usually required for a normal run
These are not all technically mandatory in the parser, but they are the recommended minimum for a reproducible run:
- -s, --strain
- --directory
- -rpf <min> <max>
- --output
Additional required parameters for --strain custom
- --annotation_file
- --codon_tables_file or --codon_table_name
- -rpf <min> <max>
Common optional parameters
- --align start|stop
- --offset_type 5|3
- --offset_site p|a
- --offset_pick_reference p_site|selected_site
- --min_5_offset, --max_5_offset
- --min_3_offset, --max_3_offset
- --offset_mask_nt
- --read_counts_file
- --read_counts_sample_col
- --read_counts_reference_col
- --read_counts_reads_col
- --rpm_norm_mode total|mt_mrna
- --plot_format png|pdf|svg
- -m, --merge_density
- --structure_density
- --cor_plot
- --use_rna_seq
Example Usage
Human or yeast with default-style analysis
mitoribopy \
-s h \
-f <reference.fa> \
--directory <ribo_bed_dir> \
-rpf 29 34 \
--align stop \
--offset_type 5 \
--offset_site p \
--offset_pick_reference p_site \
--offset_mask_nt 5 \
--min_5_offset 10 \
--max_5_offset 22 \
--min_3_offset 10 \
--max_3_offset 22 \
--plot_format svg \
--output <results_dir> \
-m
Run with read-count normalization
mitoribopy \
-s h \
-f <reference.fa> \
--directory <ribo_bed_dir> \
-rpf 29 34 \
--read_counts_file <read_counts.csv> \
--read_counts_sample_col sample \
--read_counts_reads_col reads \
--read_counts_reference_col reference \
--rpm_norm_mode mt_mrna \
--mrna_ref_patterns mt_genome \
--output <results_dir>
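The README does not spell out how the two --rpm_norm_mode values differ. The sketch below shows one plausible reading, in which total divides by all reads of a sample and mt_mrna divides only by reads whose reference matches one of the --mrna_ref_patterns values. The function name rpm_scale_factor, the substring matching, and the example counts are assumptions for illustration, not MitoRiboPy's actual code.

```python
def rpm_scale_factor(read_counts, sample, mode="total", mrna_ref_patterns=()):
    """Return the factor that converts raw counts to reads per million.

    read_counts is an iterable of (sample, reference, reads) tuples.
    Hypothetical helper: substring matching of patterns is an assumption.
    """
    if mode not in ("total", "mt_mrna"):
        raise ValueError(f"unknown mode: {mode!r}")
    denominator = 0
    for row_sample, reference, reads in read_counts:
        if row_sample != sample:
            continue
        # In mt_mrna mode, keep only references matching a pattern.
        if mode == "mt_mrna" and not any(p in reference for p in mrna_ref_patterns):
            continue
        denominator += reads
    if denominator == 0:
        raise ValueError(f"no reads available to normalize sample {sample!r}")
    return 1_000_000 / denominator


# Made-up counts for illustration only.
counts = [
    ("sample_1", "mt_genome", 200_000),
    ("sample_1", "rRNA", 800_000),
]
print(rpm_scale_factor(counts, "sample_1", mode="total"))
print(rpm_scale_factor(counts, "sample_1", mode="mt_mrna", mrna_ref_patterns=["mt_genome"]))
```

Under this reading, the mt_mrna factor is larger than the total factor whenever a sample contains reads that map outside the matched references.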
Run optional downstream modules
mitoribopy \
-s h \
-f <reference.fa> \
--directory <ribo_bed_dir> \
-rpf 29 34 \
--structure_density \
--cor_plot \
--base_sample <sample_name> \
--output <results_dir>
Run a custom organism
mitoribopy \
-s custom \
-f <reference.fa> \
--directory <ribo_bed_dir> \
-rpf 28 34 \
--annotation_file <annotation.csv> \
--codon_tables_file <codon_tables.json> \
--codon_table_name <table_name> \
--start_codons ATG GTG \
--output <results_dir>
Input Files
BED
Expected columns:
- chrom
- start
- end
Additional BED columns are tolerated. Coordinates are treated as standard 0-based, end-exclusive intervals.
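As a minimal sketch of what the 0-based, end-exclusive convention implies, the snippet below parses the first three BED fields and derives the read length. The names BedInterval, parse_bed_line, and read_length are hypothetical and are not part of the MitoRiboPy API.

```python
from typing import NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def parse_bed_line(line: str) -> BedInterval:
    """Parse the first three tab-separated BED fields; extra columns are ignored."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 BED columns, got {len(fields)}")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start < 0 or end < start:
        raise ValueError(f"invalid interval: {start}-{end}")
    return BedInterval(chrom, start, end)


def read_length(interval: BedInterval) -> int:
    """With an end-exclusive interval, the length is end - start."""
    return interval.end - interval.start


interval = parse_bed_line("ND1\t100\t131\tread_1\t0\t+")
print(interval, read_length(interval))
```

In this example the read covers 0-based positions 100 through 130, so its length is 31 nt, which would fall inside an -rpf 29 34 range.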
FASTA
FASTA headers should match the annotation sequence_name or one of its sequence_aliases.
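The matching rule can be sketched as a lookup from every known name to its canonical sequence_name, using the semicolon-separated sequence_aliases column described under Annotation CSV. The helper names, the example rows, and the choice to compare only the first whitespace-delimited token of a header are assumptions made for this illustration.

```python
def build_alias_index(rows):
    """Map each sequence_name and each alias to the canonical sequence_name."""
    index = {}
    for row in rows:
        name = row["sequence_name"]
        index[name] = name
        aliases = row.get("sequence_aliases") or ""
        for alias in aliases.split(";"):
            alias = alias.strip()
            if alias:
                index[alias] = name
    return index


def resolve_header(header, index):
    """Return the canonical name for a FASTA header, or None if it is unknown."""
    # Assumption: only the first token after '>' identifies the sequence.
    tokens = header.lstrip(">").split()
    return index.get(tokens[0]) if tokens else None


# Hypothetical annotation rows, not taken from the packaged reference data.
rows = [
    {"sequence_name": "GENE_A", "sequence_aliases": "geneA; GA1"},
    {"sequence_name": "GENE_B", "sequence_aliases": ""},
]
index = build_alias_index(rows)
print(resolve_header(">geneA some description", index))
print(resolve_header(">unknown_seq", index))
```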
Annotation CSV
Required columns:
- transcript
- l_tr
- l_utr5
- l_utr3
Optional columns:
- l_cds
- sequence_name
- sequence_aliases
- display_name
Meaning:
- transcript is the logical CDS name used in frame and codon outputs
- sequence_name is the FASTA/BED sequence ID that the row maps onto
- sequence_aliases contains alternate FASTA/BED names separated by semicolons
- display_name controls plot titles and grouped transcript labels
If l_cds is omitted, it is computed as l_tr - l_utr5 - l_utr3.
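The fallback can be illustrated with a short loader that fills in l_cds only when the column is missing or empty. This is a sketch using the standard csv module; load_annotation and the example rows are hypothetical, and the package's own loader may behave differently in edge cases.

```python
import csv
import io


def load_annotation(handle):
    """Read annotation rows and derive l_cds when it is not supplied."""
    rows = []
    for row in csv.DictReader(handle):
        l_tr = int(row["l_tr"])
        l_utr5 = int(row["l_utr5"])
        l_utr3 = int(row["l_utr3"])
        raw_cds = (row.get("l_cds") or "").strip()
        # Fallback from the README: l_cds = l_tr - l_utr5 - l_utr3
        l_cds = int(raw_cds) if raw_cds else l_tr - l_utr5 - l_utr3
        rows.append({
            "transcript": row["transcript"],
            "l_tr": l_tr,
            "l_utr5": l_utr5,
            "l_utr3": l_utr3,
            "l_cds": l_cds,
        })
    return rows


# Made-up lengths for illustration only.
example_csv = "transcript,l_tr,l_utr5,l_utr3\nGENE_A,1000,10,30\n"
annotation = load_annotation(io.StringIO(example_csv))
print(annotation[0]["l_cds"])
```

For the example row, the derived CDS length is 1000 - 10 - 30 = 960.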
Codon-Table JSON
Two formats are supported:
- One flat 64-codon mapping
- A dictionary of named 64-codon mappings
When multiple named tables are present, choose one with --codon_table_name.
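One way to tell the two layouts apart is to look at the value types: a flat table maps codons to strings, while a named collection maps table names to nested mappings. The sketch below follows that idea. The function select_codon_table, the assumption that amino acids are stored as strings, and the choice to auto-select a lone named table are all illustrative and not confirmed by the package.

```python
import itertools
import json


def select_codon_table(data, table_name=None):
    """Return one 64-codon mapping from either supported JSON layout."""
    if not isinstance(data, dict) or not data:
        raise ValueError("codon-table JSON must be a non-empty object")
    if all(isinstance(value, str) for value in data.values()):
        table = data  # flat layout: codon -> amino acid
    elif all(isinstance(value, dict) for value in data.values()):
        if table_name is None:
            if len(data) > 1:
                raise ValueError("several tables found; pass a table name")
            table_name = next(iter(data))  # assumption: a lone table is implied
        if table_name not in data:
            raise KeyError(table_name)
        table = data[table_name]
    else:
        raise ValueError("mixed flat and named entries are not supported here")
    if len(table) != 64:
        raise ValueError(f"expected 64 codons, found {len(table)}")
    return table


# Placeholder table mapping every codon to "X"; this is not a real genetic code.
placeholder = {"".join(c): "X" for c in itertools.product("ACGT", repeat=3)}
flat = json.loads(json.dumps(placeholder))
named = json.loads(json.dumps({"table_a": placeholder, "table_b": placeholder}))

print(len(select_codon_table(flat)))
print(len(select_codon_table(named, "table_b")))
```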
Read-Count Table
.csv, .tsv, and .txt are supported. Column matching is flexible and case-insensitive, with fallback to positional matching:
- first column: sample
- second column: reference
- third column: read count
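The matching behavior described above might look like the following sketch: each role is first looked up by name, ignoring case, and falls back to its default position when no header matches. The function resolve_columns, its default names (borrowed from the read-count example command), and the duplicate check are assumptions rather than the package's real implementation.

```python
def resolve_columns(header, sample_col="sample", reference_col="reference",
                    reads_col="reads"):
    """Return {role: column index} using name matching, then position."""
    lowered = [name.strip().lower() for name in header]
    # Order matters: the position of each role is its positional fallback.
    wanted = {"sample": sample_col, "reference": reference_col, "reads": reads_col}
    resolved = {}
    for position, (role, name) in enumerate(wanted.items()):
        key = name.strip().lower()
        if key in lowered:
            resolved[role] = lowered.index(key)   # case-insensitive name match
        elif position < len(header):
            resolved[role] = position             # positional fallback
        else:
            raise ValueError(f"cannot locate a column for {role!r}")
    if len(set(resolved.values())) != len(resolved):
        raise ValueError("two roles resolved to the same column")
    return resolved


print(resolve_columns(["Sample", "Reference", "Reads"]))
print(resolve_columns(["READS", "sample", "Reference"]))
print(resolve_columns(["col_a", "col_b", "col_c"]))
```

The duplicate check guards against a table where a name match and a positional fallback land on the same column, which would otherwise go unnoticed.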
Output Overview
Typical output structure:
<output>/
mitoribopy.log
plots_and_csv/
<sample>/
footprint_density/
translating_frame/
codon_usage/
debug_csv/
coverage_profile_plots/
structure_density/ # if --structure_density
codon_correlation/ # if --cor_plot
rna_seq_results/ # if --use_rna_seq
Key outputs include:
- offset enrichment CSVs and plots
- selected offset tables by read length
- footprint-density CSVs for P-site, A-site, and E-site
- frame-usage summaries
- transcript-level and total codon-usage summaries
- RPM and raw coverage-profile plots
- optional structure-density exports from footprint-density tables
Important Runtime Notes
- --offset_type 5|3: downstream site placement from the read 5' or 3' end
- --offset_site p|a: whether reported offsets represent P-site or A-site positions
- --offset_pick_reference p_site|selected_site: how the best offset is chosen
- --min_5_offset, --max_5_offset, --min_3_offset, --max_3_offset: recommended end-specific selection bounds
- --offset_mask_nt: mask near-anchor bins from enrichment summaries and plots
- --rpm_norm_mode total|mt_mrna: read-count normalization mode
- --structure_density: export log2 and scaled density values from footprint-density tables
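To make the role of the end-specific bounds concrete, the sketch below picks the most enriched offset that falls inside a [min, max] window for one read length. The README does not describe the package's scoring, so the function pick_offset, the highest-count rule, the tie-break, and the example counts are assumptions chosen only to illustrate bounded selection.

```python
def pick_offset(enrichment, min_offset, max_offset):
    """Pick the offset with the highest count inside the inclusive bounds.

    enrichment maps an offset in nucleotides to a read count.
    Hypothetical helper; MitoRiboPy's own selection logic may differ.
    """
    candidates = {
        offset: count
        for offset, count in enrichment.items()
        if min_offset <= offset <= max_offset
    }
    if not candidates:
        raise ValueError("no offsets fall inside the requested bounds")
    # Highest count wins; ties go to the smaller offset.
    return min(candidates, key=lambda offset: (-candidates[offset], offset))


# Made-up enrichment profile for a single read length.
profile = {9: 50, 12: 120, 13: 300, 25: 900}
print(pick_offset(profile, min_offset=10, max_offset=22))
```

In this example the large peak at offset 25 is ignored because it lies outside the 10 to 22 window, which is the kind of artifact the bounds are meant to exclude.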
For the full interface, run:
mitoribopy --help
Development
Run the test suite with:
PYTHONPATH=src pytest
This repository also includes package migration notes and release materials under docs/README.md.
License
MIT. See LICENSE.