Pythonic API for R/Bioconductor statistical methods — calls validated R code, returns pandas DataFrames.

🪨 rosetta

Python interface to R/Bioconductor — pandas in, pandas out, .report() when you're done.

pip install rosetta-bioc

30-second demo

import rosetta as rb

# DESeq2 differential expression — one call, pandas out
results = rb.deseq2(counts_df, metadata_df, design="~ condition")
results.report()
DESeq2 Results Summary
──────────────────────────────
Total genes tested:      12,000
Significant (padj<0.05): 843 (7.0%)
  ↑ Upregulated:         428
  ↓ Downregulated:       415
LFC range:               [-4.71, 3.50]

That's it. No R code. No rpy2 boilerplate. No type conversion. Just results.
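
To make the summary above concrete, here is a minimal sketch of how such counts could be derived from a results table. It assumes the standard DESeq2 column names `padj` and `log2FoldChange` are preserved in the returned DataFrame; `summarize_de` is a hypothetical helper written for illustration, not part of rosetta, and the toy numbers are made up.

```python
import pandas as pd


def summarize_de(results: pd.DataFrame, alpha: float = 0.05) -> dict:
    """Count significant, up- and down-regulated genes in a DE results table."""
    # Rows with a missing padj compare as False, so they are counted in the
    # total but never as significant.
    sig = results[results["padj"] < alpha]
    return {
        "tested": len(results),
        "significant": len(sig),
        "up": int((sig["log2FoldChange"] > 0).sum()),
        "down": int((sig["log2FoldChange"] < 0).sum()),
        "lfc_range": (
            float(results["log2FoldChange"].min()),
            float(results["log2FoldChange"].max()),
        ),
    }


# Toy results table (hypothetical values, for illustration only)
toy = pd.DataFrame(
    {
        "log2FoldChange": [2.1, -1.8, 0.3, -0.1],
        "padj": [0.001, 0.02, 0.40, None],
    },
    index=["geneA", "geneB", "geneC", "geneD"],
)
summary = summarize_de(toy)
print(summary)
```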

What it wraps

| R package | Python function | What it does |
|---|---|---|
| DESeq2 | rb.deseq2() | Differential expression (negative binomial) |
| edgeR | rb.edger() | Quasi-likelihood differential expression |
| limma | rb.limma_voom() | Linear models + TREAT significance |
| clusterProfiler | rb.enrich_go() | GO/KEGG/Reactome pathway enrichment |
| phyloseq | rb.phyloseq() | Microbiome diversity analysis |
| Seurat | rb.seurat() | Single-cell RNA-seq |

All functions return a RosettaDataFrame (pandas DataFrame subclass) with a .report() method.
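
For readers unfamiliar with the pattern, the sketch below shows one common way to build a pandas DataFrame subclass that keeps its extra method through ordinary pandas operations. `ReportingFrame` is a hypothetical stand-in; the actual RosettaDataFrame may be implemented differently.

```python
import pandas as pd


class ReportingFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass that carries a report() method."""

    @property
    def _constructor(self):
        # Tell pandas to build this subclass (not a plain DataFrame) when an
        # operation such as filtering returns a new frame.
        return ReportingFrame

    def report(self) -> str:
        """Build, print, and return a short plain-text summary."""
        text = f"{len(self)} rows x {len(self.columns)} columns"
        print(text)
        return text


frame = ReportingFrame(
    {"padj": [0.01, 0.20, 0.03], "log2FoldChange": [1.5, 0.2, -2.0]}
)
subset = frame[frame["padj"] < 0.05]  # filtering keeps the subclass
subset.report()
```

Overriding `_constructor` is what lets `.report()` survive slicing and filtering; without it, most pandas operations would hand back a plain DataFrame.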

Modular DESeq2 API

For more control, use the step-by-step interface:

from rosetta.wrappers.deseq2 import run_deseq2, get_results, lfc_shrink

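# Fit the model once, then reuse the fitted object for each extraction step
dds = run_deseq2(counts_df, metadata_df, design="~ condition")
# Results table for treated vs. control at the chosen significance level
res = get_results(dds, contrast=["condition", "treated", "control"], alpha=0.05)
# Shrunken log fold changes for the same comparison, selected by coefficient name
shrunk = lfc_shrink(dds, coef="condition_treated_vs_control", type="apeglm")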

res.report()
shrunk.report()
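
The `coef` string above mirrors the `contrast` list because DESeq2 names the coefficients of a simple factor as `<factor>_<level>_vs_<reference>`. A small sketch of that mapping follows; `coef_name` is a hypothetical helper, not a rosetta function, and the resulting name only exists in the fitted model when the denominator is the factor's reference level.

```python
def coef_name(contrast):
    """Turn [factor, numerator, denominator] into a DESeq2-style coefficient name."""
    if len(contrast) != 3:
        raise ValueError("contrast must be [factor, numerator, denominator]")
    factor, numerator, denominator = contrast
    return f"{factor}_{numerator}_vs_{denominator}"


contrast = ["condition", "treated", "control"]
name = coef_name(contrast)
print(name)  # condition_treated_vs_control
```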

Enrichment analysis

import rosetta as rb

# Over-representation analysis
go_results = rb.enrich_go(gene_list, org_db="org.Hs.eg.db", ont="BP")
go_results.report()

# KEGG pathways
kegg = rb.enrich_kegg(gene_list, organism="hsa")
kegg.report()
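
The examples above take `gene_list` as given. One plausible way to derive it from a differential expression table is sketched below, assuming the DESeq2 column names `padj` and `log2FoldChange` and gene identifiers in the index. `significant_genes` and the thresholds are hypothetical, and which identifier type the enrichment functions expect (symbols, Entrez IDs) is not specified here.

```python
import pandas as pd


def significant_genes(results, alpha=0.05, min_abs_lfc=1.0):
    """Return IDs of genes passing both the padj and |log2FC| thresholds."""
    keep = (results["padj"] < alpha) & (
        results["log2FoldChange"].abs() >= min_abs_lfc
    )
    return results.loc[keep].index.tolist()


# Toy results table (hypothetical values, for illustration only)
toy = pd.DataFrame(
    {
        "log2FoldChange": [2.1, -1.8, 0.3, 3.0],
        "padj": [0.001, 0.02, 0.01, 0.30],
    },
    index=["TP53", "MYC", "GAPDH", "EGFR"],
)
gene_list = significant_genes(toy)
print(gene_list)  # ['TP53', 'MYC']
```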

Setup

Python side:

pip install rosetta-bioc

R side (one-time):

Rscript install.R

Or manually:

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("DESeq2", "edgeR", "limma", "clusterProfiler"))

Posit Cloud: See docs/posit-cloud.md for zero-config setup.

Requirements

  • Python 3.9+
  • R 4.0+ with Bioconductor
  • rpy2 ≥ 3.5
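
A quick way to check these requirements from Python is sketched below, using only the standard library. `check_requirements` is a hypothetical helper, not part of rosetta. It only confirms that an `R` executable is on PATH; it does not verify the R version or whether Bioconductor is installed.

```python
import shutil
import sys
from importlib import metadata


def check_requirements() -> dict:
    """Report whether each stated requirement appears to be met."""
    status = {"python": sys.version_info >= (3, 9)}

    # R: only checks that an `R` executable is on PATH, not its version.
    status["r_on_path"] = shutil.which("R") is not None

    # rpy2: compare the installed major.minor version against 3.5.
    try:
        major, minor = metadata.version("rpy2").split(".")[:2]
        status["rpy2"] = (int(major), int(minor)) >= (3, 5)
    except (metadata.PackageNotFoundError, ValueError):
        # Not installed, or a version string this sketch cannot parse.
        status["rpy2"] = False
    return status


status = check_requirements()
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing or too old'}")
```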

Philosophy

  1. Rosetta calls R — it doesn't reimplement it. All statistics run in the original, validated R packages.
  2. Pandas in, pandas out. No R objects leak into your Python workflow.
  3. Fail early, fail clearly. Input validation happens in Python before crossing the R boundary.
  4. .report() everything. Results should be immediately interpretable without manual inspection.
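
As an illustration of the third principle, the sketch below shows the kind of checks that can run in Python before any data is handed to R. It assumes the DESeq2 layout (genes as rows, samples as columns, metadata indexed by sample name); `validate_inputs` is a hypothetical function, and rosetta's real validation may differ.

```python
import re

import pandas as pd


def validate_inputs(counts: pd.DataFrame, metadata: pd.DataFrame, design: str) -> None:
    """Raise ValueError with a clear message if the inputs look inconsistent."""
    # Counts must be non-negative whole numbers.
    if (counts < 0).any().any():
        raise ValueError("counts contains negative values")
    if not (counts % 1 == 0).all().all():
        raise ValueError("counts must be whole numbers (raw, unnormalized counts)")

    # Every sample column needs a metadata row, in the same order.
    if list(counts.columns) != list(metadata.index):
        raise ValueError("counts columns and metadata index must match, in order")

    # Every variable named in the design formula must be a metadata column.
    variables = re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", design)
    missing = [v for v in variables if v not in metadata.columns]
    if missing:
        raise ValueError(f"design refers to unknown metadata columns: {missing}")


counts = pd.DataFrame({"s1": [10, 0], "s2": [3, 7]}, index=["geneA", "geneB"])
metadata = pd.DataFrame({"condition": ["control", "treated"]}, index=["s1", "s2"])

validate_inputs(counts, metadata, "~ condition")  # passes silently

try:
    validate_inputs(counts, metadata, "~ batch + condition")
except ValueError as err:
    print(err)  # design refers to unknown metadata columns: ['batch']
```

Catching these problems in Python produces a readable error message instead of an R traceback surfaced through rpy2.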

Contributing

See CONTRIBUTING.md. Good first issues are labeled — start with Issue #1: report() enhancements.

Acknowledgments

Built on rpy2 and the extraordinary R/Bioconductor ecosystem. All credit for the statistical methods goes to the original R package authors.

GSoC 2026 · MIT License · Nodes Bio
