tools for comparing DNA sequences with MinHash sketches

sourmash

Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.

Usage:

sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
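To make the compute and compare steps more concrete, here is a minimal pure-Python sketch of the bottom-k MinHash idea. It is an illustration only, not sourmash's implementation: the function names (kmers, hash_kmer, sketch, jaccard_estimate), the use of MD5 as a stand-in hash, and the small ksize and num defaults are all assumptions chosen for readability, and it skips details such as reverse-complement handling.

```python
import hashlib


def kmers(sequence, ksize):
    """Yield every overlapping substring of length ksize."""
    sequence = sequence.upper()
    for i in range(len(sequence) - ksize + 1):
        yield sequence[i:i + ksize]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (MD5 is a stand-in hash for this sketch)."""
    digest = hashlib.md5(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def sketch(sequence, ksize=5, num=20):
    """Bottom-k sketch: keep the num smallest distinct k-mer hashes."""
    hashes = {hash_kmer(k) for k in kmers(sequence, ksize)}
    return sorted(hashes)[:num]


def jaccard_estimate(sketch_a, sketch_b, num=20):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    # Take the num smallest hashes of the union, then count how many
    # of those are present in both sketches.
    merged = sorted(set(sketch_a) | set(sketch_b))[:num]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)
```

Under these assumptions, two sketches built from the same sequence give an estimate of 1.0, and sketches with no k-mers in common give 0.0; everything in between is an approximation whose accuracy depends on num.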

sourmash 1.0 is published in JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).


The name is a riff off of Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Primary authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).

sourmash is a product of the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine.

Installation

We recommend using bioconda to install sourmash:

conda install -c conda-forge -c bioconda sourmash

This will install the latest stable version of sourmash 2.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under both Python 2.7.x and Python 3.5+. The base requirements are screed and ijson, together with a C++ development environment and the CPython development headers and libraries (for the C++ extension).
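As a small illustration of the interpreter requirement, the check below tests a version tuple against the stated range (Python 2.7.x or Python 3.5+). The helper name is_supported_python is hypothetical and is not part of sourmash.

```python
import sys


def is_supported_python(version_info=None):
    """Return True for Python 2.7.x or Python 3.5 and later 3.x releases."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    return major == 3 and minor >= 5


if __name__ == "__main__":
    print("supported interpreter:", is_supported_python())
```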

The comparison code (sourmash compare) uses numpy, and the plotting code uses matplotlib and scipy, but most of the code is usable without these.

For search and gather you also need khmer version 2.1+.
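One way to see which of these optional pieces are available in an environment is to probe for the modules without importing them. The sketch below uses importlib.util.find_spec from the Python 3 standard library; the feature labels and the helper names (missing_modules, feature_report) are assumptions made for this example, and it does not verify the khmer 2.1+ version requirement.

```python
import importlib.util

# Feature labels are illustrative; module names are the ones listed above.
OPTIONAL_MODULES = {
    "compare": ["numpy"],
    "plot": ["matplotlib", "scipy"],
    "search/gather": ["khmer"],
}


def missing_modules(modules):
    """Return the module names that cannot be found on the import path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def feature_report(optional=OPTIONAL_MODULES):
    """Map each feature label to the list of modules it is missing."""
    return {feature: missing_modules(mods) for feature, mods in optional.items()}


if __name__ == "__main__":
    for feature, missing in feature_report().items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(feature + ": " + status)
```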

Installation with conda

Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge -c bioconda sourmash python=3.7
$ source activate sourmash_env
$ sourmash compute -h

These commands will install the latest alpha release.

Support

Please ask questions and file issues on GitHub.

Development

Development happens on GitHub at dib-lab/sourmash.

After installation, sourmash is the main command-line entry point. You can also run it with python -m sourmash. For a developer install in a virtual environment, run pip install -e /path/to/repo.
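When scripting against a developer install, it can help to call the module form with the interpreter that is currently running, so the command resolves to the same virtual environment. The following is a minimal sketch; sourmash_command and run_sourmash are hypothetical helper names, and run_sourmash only works if sourmash is installed in that environment.

```python
import subprocess
import sys


def sourmash_command(*args):
    """Build the argument list for running sourmash as a module."""
    return [sys.executable, "-m", "sourmash"] + list(args)


def run_sourmash(*args):
    """Run sourmash with the current interpreter; raises if the command fails."""
    return subprocess.run(sourmash_command(*args), check=True)


if __name__ == "__main__":
    # Equivalent to: python -m sourmash compute -h
    run_sourmash("compute", "-h")
```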

The sourmash/ directory contains the library code.

Tests require py.test and can be run with make test.

Please see the developer notes for more information.


CTB Dec 2018


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sourmash-2.3.0.tar.gz (7.3 MB): Source

Built Distributions

sourmash-2.3.0-cp38-cp38-manylinux2010_x86_64.whl (847.4 kB): CPython 3.8, manylinux: glibc 2.12+ x86-64
sourmash-2.3.0-cp38-cp38-manylinux1_x86_64.whl (847.4 kB): CPython 3.8
sourmash-2.3.0-cp38-cp38-macosx_10_9_x86_64.whl (184.2 kB): CPython 3.8, macOS 10.9+ x86-64
sourmash-2.3.0-cp37-cp37m-manylinux2010_x86_64.whl (786.1 kB): CPython 3.7m, manylinux: glibc 2.12+ x86-64
sourmash-2.3.0-cp37-cp37m-manylinux1_x86_64.whl (786.1 kB): CPython 3.7m
sourmash-2.3.0-cp37-cp37m-macosx_10_6_intel.whl (183.2 kB): CPython 3.7m, macOS 10.6+ intel
sourmash-2.3.0-cp36-cp36m-manylinux2010_x86_64.whl (788.3 kB): CPython 3.6m, manylinux: glibc 2.12+ x86-64
sourmash-2.3.0-cp36-cp36m-manylinux1_x86_64.whl (788.3 kB): CPython 3.6m
sourmash-2.3.0-cp36-cp36m-macosx_10_6_intel.whl (187.6 kB): CPython 3.6m, macOS 10.6+ intel
sourmash-2.3.0-cp35-cp35m-manylinux2010_x86_64.whl (779.7 kB): CPython 3.5m, manylinux: glibc 2.12+ x86-64
sourmash-2.3.0-cp35-cp35m-manylinux1_x86_64.whl (779.7 kB): CPython 3.5m
sourmash-2.3.0-cp35-cp35m-macosx_10_6_intel.whl (183.7 kB): CPython 3.5m, macOS 10.6+ intel
sourmash-2.3.0-cp27-cp27mu-manylinux2010_x86_64.whl (729.1 kB): CPython 2.7mu, manylinux: glibc 2.12+ x86-64
sourmash-2.3.0-cp27-cp27mu-manylinux1_x86_64.whl (729.1 kB): CPython 2.7mu
sourmash-2.3.0-cp27-cp27m-manylinux2010_x86_64.whl (729.1 kB): CPython 2.7m, manylinux: glibc 2.12+ x86-64
sourmash-2.3.0-cp27-cp27m-manylinux1_x86_64.whl (729.1 kB): CPython 2.7m
sourmash-2.3.0-cp27-cp27m-macosx_10_6_intel.whl (185.2 kB): CPython 2.7m, macOS 10.6+ intel
