pybda

Analysis of big biological data sets for distributed HPC clusters.

Project description

A command-line tool for the analysis of big biological data sets on distributed HPC clusters.

About

PyBDA is a Python library and command-line tool for big data analytics and machine learning that scales to terabyte-sized data sets.

To scale to big data sets, PyBDA uses Apache Spark’s DataFrame API, which automatically distributes data across the nodes of a high-performance cluster and runs expensive machine learning tasks in parallel. For scheduling, PyBDA uses Snakemake to execute pipelines of jobs automatically. PyBDA first builds a DAG of the methods (jobs) you want to execute in succession, for example dimensionality reduction followed by clustering, and then computes each method by traversing the DAG. When a job succeeds, PyBDA writes results and plots and creates statistics. When a job fails, PyBDA reports which method failed and where (owing to Snakemake’s scheduling), so the same pipeline can be resumed from the point where it last failed.
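
The DAG traversal and resume-after-failure behaviour described above can be illustrated with a minimal sketch in plain Python. This is not PyBDA's or Snakemake's code or API: `run_pipeline`, the job names, and the simulated failure are all hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for the real scheduler.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(dag, jobs, done=None):
    """Run jobs in dependency order and stop at the first failure.

    dag  : dict mapping each job name to the set of jobs it depends on
    jobs : dict mapping each job name to a zero-argument callable
    done : names of jobs that already succeeded in an earlier run
    Returns (done, failed); failed is None if every job succeeded.
    """
    done = set(done or ())
    for name in TopologicalSorter(dag).static_order():
        if name in done:
            continue  # finished in a previous run, so skip it
        try:
            jobs[name]()
        except Exception as err:
            print(f"job '{name}' failed: {err}")
            return done, name
        done.add(name)
    return done, None


# Hypothetical two-step pipeline: dimensionality reduction, then clustering.
calls = []
state = {"clustering_ok": False}


def dimension_reduction():
    calls.append("dimension_reduction")


def clustering():
    if not state["clustering_ok"]:
        raise RuntimeError("simulated failure")
    calls.append("clustering")


dag = {"dimension_reduction": set(), "clustering": {"dimension_reduction"}}
jobs = {"dimension_reduction": dimension_reduction, "clustering": clustering}

# First run: the clustering job fails and is reported.
done, first_failed = run_pipeline(dag, jobs)

# Second run: only the failed job is executed again.
state["clustering_ok"] = True
done, failed = run_pipeline(dag, jobs, done)
```

Passing the set of finished jobs back into the second call is what lets the sketch continue from the failed step instead of recomputing the dimensionality reduction.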

Documentation

Check out the documentation here. The documentation will walk you through

the installation process,
setting up Apache Spark,
using pybda.

Author

Simon Dirmeier (simon.dirmeier at bsse.ethz.ch)

Project details

Release history

0.1.0 (Aug 14, 2019, this version)
0.0.6 (Mar 28, 2019)
0.0.5 (Feb 23, 2019)
0.0.4 (Feb 18, 2019)
0.0.3 (Feb 11, 2019)
0.0.2 (Feb 11, 2019)

Download files

Source Distribution

pybda-0.1.0.tar.gz (55.0 kB)

Uploaded Aug 14, 2019 Source

Built Distribution

pybda-0.1.0-py2.py3-none-any.whl (127.2 kB)

Uploaded Aug 14, 2019 Python 2, Python 3

File details

Details for the file pybda-0.1.0.tar.gz.

File metadata

Download URL: pybda-0.1.0.tar.gz
Upload date: Aug 14, 2019
Size: 55.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for pybda-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`5549d969b88f31201f4e03e8cb5864f0f8c80442369b23241e72973956f09880`
MD5	`df5d0d81ba850b0d962e6e19c9a89113`
BLAKE2b-256	`80189ca71b566948e42e938a0d5ddf5d5cefc27c262ee7e94f161794db2d1eaa`

File details

Details for the file pybda-0.1.0-py2.py3-none-any.whl.

File metadata

Download URL: pybda-0.1.0-py2.py3-none-any.whl
Upload date: Aug 14, 2019
Size: 127.2 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.7

File hashes

Hashes for pybda-0.1.0-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`de58bf9bbdbf452af77079a37385de7410a4b66f077c4df5f36e0190907419b9`
MD5	`b830f5e3b7968626d772429bd9c0b331`
BLAKE2b-256	`9b28c1d9647212d0cc0cc5272077d5456a6d95b52c8b159c18b5c633902cc285`
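
The SHA256 digests listed above can be compared against a downloaded file. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of` and `matches` are hypothetical, and the file is assumed to have been downloaded to a local path.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, with the source distribution saved in the current directory, `matches("pybda-0.1.0.tar.gz", "5549d969b88f31201f4e03e8cb5864f0f8c80442369b23241e72973956f09880")` should return `True` for an intact download.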

