tools for comparing DNA sequences with MinHash sketches

# sourmash


Compute MinHash signatures for DNA sequences.


    sourmash compute *.fq.gz
    sourmash compare *.sig -o distances
    sourmash plot distances
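
To make the idea behind a MinHash signature concrete, here is a minimal sketch in Python 3. It is not sourmash's implementation: the function names, the use of MD5 as the hash, and the default `k` and `num_hashes` values are all assumptions chosen for illustration. It keeps the smallest k-mer hashes of each sequence (a "bottom sketch") and estimates the Jaccard similarity of two sequences from those sketches alone.

```python
import hashlib


def kmers(sequence, k):
    """Yield every overlapping k-mer of `sequence`, uppercased."""
    sequence = sequence.upper()
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (MD5 is an arbitrary choice here)."""
    digest = hashlib.md5(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_sketch(sequence, k=21, num_hashes=500):
    """Return the `num_hashes` smallest distinct k-mer hashes, sorted."""
    hashes = {hash_kmer(kmer) for kmer in kmers(sequence, k)}
    return sorted(hashes)[:num_hashes]


def jaccard_estimate(sketch_a, sketch_b, num_hashes=500):
    """Estimate the Jaccard similarity of two k-mer sets from their sketches."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    # The smallest hashes of the merged sketches form a sketch of the union.
    union_sketch = sorted(set_a | set_b)[:num_hashes]
    if not union_sketch:
        return 0.0
    # Count how many of those hashes appear in both input sketches.
    shared = sum(1 for h in union_sketch if h in set_a and h in set_b)
    return shared / len(union_sketch)
```

Comparing a sequence with itself, for example `jaccard_estimate(minhash_sketch(s), minhash_sketch(s))`, gives 1.0. When a sequence has fewer distinct k-mers than `num_hashes`, the sketch holds every hash and the estimate equals the exact Jaccard similarity.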

We have demo notebooks on binder that you can interact with.



The name is a riff on Mash, combined with @ctb's love of whiskey.
(Sour mash is used in making whiskey.)

Authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr.

sourmash is a product of the Lab for Data-Intensive Biology at the
UC Davis School of Veterinary Medicine.

## Installation

Install with pip:

    pip install sourmash

sourmash runs under both Python 2.7.x and Python 3.5. The base
requirements are screed and PyYAML, together with a C++ development
environment and the CPython development headers and libraries (for the
C++ extension).

The comparison code (`sourmash compare`) uses numpy, and the plotting
code uses matplotlib and scipy, but most of the code is usable without
them.
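
If you are unsure whether these optional packages are installed, a small check like the following may help. This is a hypothetical helper, not part of sourmash: the `FEATURE_REQUIREMENTS` mapping simply restates the dependencies listed above, and the function names are made up for this example. It requires Python 3.

```python
import importlib.util


def available(module_name):
    """Return True if `module_name` can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Optional dependencies per feature, as described in the README text
# (an illustrative mapping, not something sourmash itself exposes).
FEATURE_REQUIREMENTS = {
    "compare": ["numpy"],
    "plot": ["matplotlib", "scipy"],
}


def missing_requirements(feature):
    """List the modules a feature needs that are not installed."""
    return [name for name in FEATURE_REQUIREMENTS[feature] if not available(name)]


if __name__ == "__main__":
    for feature in FEATURE_REQUIREMENTS:
        missing = missing_requirements(feature)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print("sourmash {}: {}".format(feature, status))
```

Any package reported as missing can be installed with pip in the same way as sourmash itself.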

## Support

Please ask questions and file issues on GitHub. The developers
sometimes hang out on gitter.

## Development

Development happens on GitHub.

`sourmash` is the main command-line entry point; run it for help.

`sourmash_lib/` contains the library code.

Tests require py.test and can be run with `make test`.




