Pythonic API for R/Bioconductor statistical methods — calls validated R code, returns pandas DataFrames.

🪨 rosetta

Python interface to R/Bioconductor — pandas in, pandas out, .report() when you're done.

pip install rosetta-bioc

30-second demo

import rosetta as rb

# DESeq2 differential expression — one call, pandas out
results = rb.deseq2(counts_df, metadata_df, design="~ condition")
results.report()

Output:

DESeq2 Results Summary
──────────────────────────────
Total genes tested:      12,000
Significant (padj<0.05): 843 (7.0%)
  ↑ Upregulated:         428
  ↓ Downregulated:       415
LFC range:               [-4.71, 3.50]

That's it. No R code. No rpy2 boilerplate. No type conversion. Just results.
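
The demo assumes `counts_df` and `metadata_df` already exist. As a minimal sketch of what such inputs might look like, the block below builds toy tables with pandas. The orientation (genes as rows, samples as columns, one metadata row per sample) follows the usual DESeq2 convention and is an assumption here, not something this README specifies; the gene and sample names are made up.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical gene and sample labels, for illustration only.
genes = [f"gene_{i}" for i in range(100)]
samples = ["ctrl_1", "ctrl_2", "ctrl_3", "trt_1", "trt_2", "trt_3"]

# Raw integer counts: genes as rows, samples as columns (assumed orientation).
counts_df = pd.DataFrame(
    rng.negative_binomial(n=5, p=0.05, size=(len(genes), len(samples))),
    index=genes,
    columns=samples,
)

# One row per sample; the "condition" column is what design="~ condition" names.
metadata_df = pd.DataFrame(
    {"condition": ["control"] * 3 + ["treated"] * 3},
    index=samples,
)

# Sample labels should line up between the two tables.
assert list(counts_df.columns) == list(metadata_df.index)
```

Random counts like these carry no real signal, so they are only useful for checking that the plumbing works.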

What it wraps

R Package       | Python                            | What it does
----------------|-----------------------------------|---------------------------------------------
DESeq2          | rb.deseq2()                       | Differential expression (negative binomial)
edgeR           | rb.edger()                        | Quasi-likelihood differential expression
limma           | rb.limma_voom()                   | Linear models + TREAT significance
clusterProfiler | rb.enrich_go(), rb.enrich_kegg()  | GO/KEGG/Reactome pathway enrichment
phyloseq        | rb.phyloseq()                     | Microbiome diversity analysis
Seurat          | rb.seurat()                       | Single-cell RNA-seq

All functions return a RosettaDataFrame (pandas DataFrame subclass) with a .report() method.
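
To illustrate the idea of a pandas DataFrame subclass that carries a `.report()` method, here is a minimal sketch. `ReportingFrame` is a hypothetical stand-in, not rosetta's actual `RosettaDataFrame`, and the column names `padj` and `log2FoldChange` are assumed from DESeq2's usual output.

```python
import pandas as pd


class ReportingFrame(pd.DataFrame):
    """Hypothetical stand-in for a DataFrame subclass with a report() method."""

    @property
    def _constructor(self):
        # Keeps the subclass when pandas builds a new frame (e.g. after filtering).
        return ReportingFrame

    def report(self, alpha: float = 0.05) -> str:
        # Genes with a missing adjusted p-value are not counted as tested.
        tested = int(self["padj"].notna().sum())
        sig = self[self["padj"] < alpha]
        up = int((sig["log2FoldChange"] > 0).sum())
        down = int((sig["log2FoldChange"] < 0).sum())
        pct = 100 * len(sig) / tested if tested else 0.0
        lines = [
            f"Total genes tested:      {tested:,}",
            f"Significant (padj<{alpha}): {len(sig):,} ({pct:.1f}%)",
            f"  Upregulated:           {up:,}",
            f"  Downregulated:         {down:,}",
        ]
        text = "\n".join(lines)
        print(text)
        return text


# Toy values, for illustration only.
frame = ReportingFrame(
    {"log2FoldChange": [2.0, -1.5, 0.3, -0.1], "padj": [0.01, 0.03, 0.50, 0.90]},
    index=["geneA", "geneB", "geneC", "geneD"],
)
summary = frame.report()
```

Because the class is still a DataFrame, ordinary pandas operations such as boolean filtering, `sort_values`, and `to_csv` keep working on it.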

Modular DESeq2 API

For more control, use the step-by-step interface:

from rosetta.wrappers.deseq2 import run_deseq2, get_results, lfc_shrink

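# Fit the model once and reuse it for the steps below
dds = run_deseq2(counts_df, metadata_df, design="~ condition")
# Extract treated vs. control results at alpha = 0.05
res = get_results(dds, contrast=["condition", "treated", "control"], alpha=0.05)
# Shrink log2 fold changes; type="apeglm" needs the apeglm R package installed
shrunk = lfc_shrink(dds, coef="condition_treated_vs_control", type="apeglm")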

res.report()
shrunk.report()
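
Note that `get_results` takes a three-part `contrast` while `lfc_shrink` takes a `coef` string. In DESeq2, coefficient names normally follow the pattern `factor_level_vs_reference`, so the two can be kept in sync with a small helper. `coef_name` below is a hypothetical helper, not part of rosetta, and the name it builds only matches a fitted coefficient when the denominator is the reference level of the factor.

```python
from typing import Sequence


def coef_name(contrast: Sequence[str]) -> str:
    """Build 'factor_level_vs_reference' from [factor, numerator, denominator]."""
    if len(contrast) != 3:
        raise ValueError("contrast must be [factor, numerator, denominator]")
    factor, numerator, denominator = contrast
    return f"{factor}_{numerator}_vs_{denominator}"


contrast = ["condition", "treated", "control"]
coef = coef_name(contrast)
print(coef)
```

Level names containing spaces or other special characters may be rewritten on the R side, so treat the result as a starting point and check it against the coefficient names the fitted model actually reports.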

Enrichment analysis

import rosetta as rb

# Over-representation analysis
go_results = rb.enrich_go(gene_list, org_db="org.Hs.eg.db", ont="BP")
go_results.report()

# KEGG pathways
kegg = rb.enrich_kegg(gene_list, organism="hsa")
kegg.report()
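
Since results come back as pandas objects, they can be post-processed with ordinary pandas code. The sketch below assumes the result keeps clusterProfiler's usual column names (`ID`, `Description`, `GeneRatio`, `p.adjust`), which this README does not confirm, and uses a toy table in place of a real result. It converts the `GeneRatio` strings to numbers so terms can be ranked.

```python
import pandas as pd


def ratio_to_float(ratio: str) -> float:
    """Convert a 'k/n' string such as '12/300' into a float."""
    hits, total = ratio.split("/")
    return int(hits) / int(total)


# Toy table with clusterProfiler-style columns (assumed); the numbers are made up.
enrich_df = pd.DataFrame(
    {
        "ID": ["GO:0006955", "GO:0008150", "GO:0006954"],
        "Description": ["immune response", "biological_process", "inflammatory response"],
        "GeneRatio": ["12/300", "50/300", "9/300"],
        "p.adjust": [0.001, 0.20, 0.004],
    }
)

# Numeric ratio makes sorting and plotting straightforward.
enrich_df["gene_ratio"] = enrich_df["GeneRatio"].map(ratio_to_float)

# Keep significant terms, largest gene ratio first.
top = enrich_df[enrich_df["p.adjust"] < 0.05].sort_values("gene_ratio", ascending=False)
print(top[["ID", "Description", "gene_ratio"]])
```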

Setup

Python side:

pip install rosetta-bioc

R side (one-time):

Rscript install.R

Or manually:

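if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("DESeq2", "edgeR", "limma", "clusterProfiler"))
# The examples above may also need apeglm (for type="apeglm") and org.Hs.eg.db (for enrich_go)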

Posit Cloud: See docs/posit-cloud.md for zero-config setup.

Requirements

  • Python 3.9+
  • R 4.0+ with Bioconductor
  • rpy2 ≥ 3.5
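
A quick way to see whether these prerequisites are visible from Python is a small check script. This is a sketch using only the standard library; `check_environment` is a hypothetical helper, not something rosetta ships, and it does not verify the R or rpy2 version numbers.

```python
import importlib.util
import shutil
import sys


def check_environment() -> dict:
    """Report which prerequisites are visible from this Python process."""
    return {
        "python_ok": sys.version_info >= (3, 9),
        "rpy2_installed": importlib.util.find_spec("rpy2") is not None,
        "rscript_on_path": shutil.which("Rscript") is not None,
    }


status = check_environment()
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```

If `rscript_on_path` reports missing, R is either not installed or not on the `PATH` seen by this interpreter.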

Philosophy

  1. Rosetta calls R — it doesn't reimplement it. All statistics run in the original, validated R packages.
  2. Pandas in, pandas out. No R objects leak into your Python workflow.
  3. Fail early, fail clearly. Input validation happens in Python before crossing the R boundary.
  4. .report() everything. Results should be immediately interpretable without manual inspection.

Contributing

See CONTRIBUTING.md. Good first issues are labeled — start with Issue #1: report() enhancements.

Acknowledgments

Built on rpy2 and the extraordinary R/Bioconductor ecosystem. All credit for the statistical methods goes to the original R package authors.

GSoC 2026 · MIT License · Nodes Bio

