A fast, scalable genome analysis system

Project description

ADAM is a library and command-line tool that uses Apache Spark to parallelize genomic data analysis across cluster and cloud computing environments. ADAM uses a set of schemas to describe genomic sequences, reads, variants/genotypes, and features. It can be used with data in legacy genomic file formats such as SAM/BAM/CRAM, BED/GFF3/GTF, and VCF, as well as with data stored in the columnar Apache Parquet format. On a single node, ADAM performs competitively with optimized multi-threaded tools, while also scaling out to clusters with more than a thousand cores. ADAM’s APIs can be used from Scala, Java, Python, R, and SQL.

Documentation

ADAM’s documentation is hosted on Read the Docs.

Python Requirements

ADAM requires PySpark to be installed.
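
Since PySpark is a prerequisite, it can be useful to confirm that it is importable before trying to use ADAM from Python. The following is a minimal sketch using only the Python standard library. The helper names `has_module` and `missing_requirements` are hypothetical and are not part of ADAM or PySpark; the only assumption about PySpark is that its top-level module is named `pyspark`.

```python
import importlib.util


def has_module(name):
    """Return True if the module `name` can be located on the import path.

    Hypothetical helper for illustration; not part of ADAM or PySpark.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise if a parent package is missing or if an
        # already-imported module has no usable spec.
        return False


def missing_requirements(names=("pyspark",)):
    """Return the subset of `names` that cannot be found, in order."""
    return [name for name in names if not has_module(name)]


# Report whether the PySpark prerequisite appears to be satisfied.
missing = missing_requirements()
if missing:
    print("Missing required modules: " + ", ".join(missing))
else:
    print("PySpark found.")
```

This only checks that the module can be located; it does not verify that the installed PySpark version is compatible with a given ADAM release.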

Download files

Files for bdgenomics.adam, version 0.28.0

Filename                            Size      File type   Python version
bdgenomics.adam-0.28.0-py2.7.egg    39.3 MB   Egg         2.7
bdgenomics.adam-0.28.0.tar.gz       39.2 MB   Source      None
