Skip to main content
Join the official 2019 Python Developers SurveyStart the survey!

wrappers for docker+python+R+bioconductor

Project description

mbf_anysnake

Documentation https://mbf_anysnake.readthedocs.io/en/latest/

Version controlled, per project Python, Python packages, R, Bioconductor and CRAN, all inside a Docker container.

Python:

  • any version of Python
  • versioned package installation from pypi
  • editable installation of libraries under development (git/mercurial)
  • two virtual environments at once, a local one and a global one for development tools

R (optional):

  • any version of R
  • any bioconductor version (R version matched to bioconductor)
  • bioconductor version oriented CRAN version control
  • all of CRAN

Rpy2

  • always matching the Python and R version used

All this from simple, toml based configuration file.

This solves the problem of providing a reproducible yet flexible environment for bioinformatic and data science projects.

Motivations in brief:

  • any python -> independent of os packaging -> quick updates solved via python-build (pyenv)
  • Bioconductor -> R version -> screen scrapping -> matching rpy2 to python versions -> docker volume mounts instead of inside-docker install
  • relocatable projects -> docker volume mounts abstract paths, allow relocatable venvs
  • dual venv - install development tools system wide, so you have always the latest jupyter without having to rebuild docker containers (fragile)
  • bioconductor dependes on CRAN packages of undefined version -> use microsoft cran snapshot mirror and packages as of release date of bioconductor (that's an approximation that may not always hold up in practice. Sometimes bioconductor uses package verisons that were not released when the bioconductor release happen. But it's a useful approximation for automatisation)
  • some bioconductor packages require annotation/experimental data: filter these packages by default, but allow user to request them (this is a space saving measure. Full bioconductor software + full cran: 50 gb, Bioconductor minus annotation: 21 GB, Bioconductor minus and minimal cran: 5GB),
  • cran package annotation is somewhat messy (duplicate entries, non existing versions, missing dependencies) -> hotfix.
  • install as much of cran/bioconductor as possible: Don't fight with the stuff in your daily workflow -> ton of apt-packages in the docker, see _inside_dockfill_bioconductor.py for list of things that are filted
  • R install.packages (and derivates) suffers from neglect: it can't cache the downloaded files, it ignores the dependencies when installing from files and whenever multi-cores are used. It is 2019, installing on a single core is unacceptable. We use a pypipegraph to work around this.

mbf_anysnake follows semver

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for mbf-anysnake, version 0.2
Filename, size File type Python version Upload date Hashes
Filename, size mbf_anysnake-0.2.tar.gz (59.0 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page