tools for comparing biological sequences with k-mer sketches

Project description

sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.

Project status: Active. The project has reached a stable, usable state and is being actively developed.

License: 3-Clause BSD.

Supported Python versions: 3.10, 3.11, 3.12.

Usage:

sourmash sketch dna *.fq.gz
sourmash compare *.sig -o distances.cmp -k 31
sourmash plot distances.cmp

sourmash 1.0 is published on JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).

The latest major release is sourmash v4, which has several command-line and Python incompatibilities with previous versions. Please visit our migration guide to upgrade!


sourmash is a k-mer analysis multitool, and we aim to provide stable, robust programmatic and command-line APIs for a variety of sequence comparisons. Some of our special sauce includes:

  • FracMinHash sketching, which enables accurate comparisons (including ANI) between data sets of different sizes
  • sourmash gather, a combinatorial k-mer approach for more accurate metagenomic profiling
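As a rough illustration of the FracMinHash idea, the sketch below keeps only the k-mer hashes that fall below a fixed fraction (1/scaled) of the hash space, so two sketches of very different sizes remain directly comparable. This is a teaching sketch under stated assumptions, not sourmash's implementation: the function names are hypothetical, BLAKE2b stands in for the hash function, and reverse-complement (canonical) k-mers are ignored.

```python
import hashlib
import random

MAX_HASH = 2**64  # size of the 64-bit hash space used in this sketch


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (BLAKE2b is a stand-in hash here)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq, k=21, scaled=10):
    """Keep hashes below MAX_HASH / scaled, i.e. roughly 1/scaled of all k-mers."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def jaccard(a, b):
    """Shared hashes divided by all hashes in either sketch."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Fraction of sketch a that is also present in sketch b."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical data: a random "genome" and a fragment taken from its start.
rng = random.Random(42)
genome = "".join(rng.choice("ACGT") for _ in range(20000))
fragment = genome[:5000]

genome_sketch = frac_minhash(genome)
fragment_sketch = frac_minhash(fragment)

print("jaccard:    ", jaccard(fragment_sketch, genome_sketch))
print("containment:", containment(fragment_sketch, genome_sketch))
```

Because the same threshold is applied to both inputs, every hash retained for the fragment is also retained for the genome. Containment of the fragment in the genome is therefore complete, while the Jaccard similarity is pulled down by the size difference.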

Please see the sourmash publications for details.
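To give a feel for the combinatorial approach behind sourmash gather, the following is a minimal sketch of a greedy set cover over hash sets: at each step it picks the reference that explains the most still-unassigned query hashes. It assumes that this greedy formulation captures the core idea; the function name, the toy integer "hashes", and the reference names are hypothetical, and sourmash's own thresholds and abundance handling are omitted.

```python
def greedy_gather(query, references):
    """Greedily assign query hashes to references.

    query: set of hashes from the metagenome sketch.
    references: dict mapping a reference name to its set of hashes.
    Returns (matches, unassigned), where matches is a list of
    (name, number_of_newly_explained_hashes) in the order chosen.
    """
    remaining = set(query)
    candidates = dict(references)  # shallow copy so the input is not modified
    matches = []
    while remaining and candidates:
        # Pick the reference with the largest overlap with what is left.
        best = max(candidates, key=lambda name: len(candidates[name] & remaining))
        overlap = candidates.pop(best) & remaining
        if not overlap:
            break  # no remaining reference explains anything new
        matches.append((best, len(overlap)))
        remaining -= overlap
    return matches, remaining


# Toy example with small integers standing in for k-mer hashes.
query = set(range(1, 11))
references = {
    "A": {1, 2, 3, 4, 5, 6},
    "B": {5, 6, 7, 8},
    "C": {1, 2},
    "D": {100},
}
matches, unassigned = greedy_gather(query, references)
print(matches)      # [('A', 6), ('B', 2)]
print(unassigned)   # {9, 10}
```

Reference C is never reported because everything it shares with the query was already assigned to A, which is the behavior that keeps redundant matches out of the result.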

The name is a riff on Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Maintainers: C. Titus Brown (@ctb), Luiz C. Irber, Jr (@luizirber), and N. Tessa Pierce-Ward (@bluegenes).

sourmash was initially developed by the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine, and now includes contributions from the global research and developer community.

Installation

We recommend using conda-forge to install sourmash:

conda install -c conda-forge sourmash-minimal

This will install the latest stable version of sourmash 4.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under Python 3.10 and later on Windows, macOS, and Linux. The base requirements are screed, cffi, numpy, matplotlib, and scipy. Conda will install everything necessary, and is our recommended installation method (see below).

Installation with conda

conda-forge is a community-maintained channel for the conda package manager. After installing conda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge sourmash-minimal
$ conda activate sourmash_env
$ sourmash --help

These commands install the latest released version into a new environment named sourmash_env.

Support

For questions, please open an issue on GitHub, or ask in our chat.

Development

Development happens on GitHub at sourmash-bio/sourmash.

sourmash is developed in Python and Rust, and you will need a Rust environment to build it; see the developer notes for our suggested development setup.

After installation, sourmash is the main command-line entry point; you can also run it with python -m sourmash. For a developer install in a virtual environment, run pip install -e /path/to/repo.

The sourmash/ directory contains the Python library and command-line interface code.

The src/core/ directory contains the Rust library implementing core functionality.

Tests require pytest and can be run with make test.

Please see the developer notes for more information on getting set up with a development environment.

CTB Jan 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • sourmash-4.8.6.tar.gz (13.3 MB): Source

Built Distributions

  • sourmash-4.8.6-py3-none-win_amd64.whl (1.9 MB): Python 3, Windows x86-64
  • sourmash-4.8.6-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB): Python 3, manylinux (glibc 2.17+), x86-64
  • sourmash-4.8.6-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (4.2 MB): Python 3, manylinux (glibc 2.17+), ppc64le
  • sourmash-4.8.6-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.3 MB): Python 3, manylinux (glibc 2.17+), ARM64
  • sourmash-4.8.6-py3-none-macosx_11_0_arm64.whl (2.3 MB): Python 3, macOS 11.0+, ARM64
  • sourmash-4.8.6-py3-none-macosx_10_14_x86_64.whl (2.6 MB): Python 3, macOS 10.14+, x86-64

File details

Details for the file sourmash-4.8.6.tar.gz.

File metadata

  • Download URL: sourmash-4.8.6.tar.gz
  • Size: 13.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for sourmash-4.8.6.tar.gz
Algorithm Hash digest
SHA256 5a09c5d12cb07f8d73eb60db8dbd30e4e74aabf53a718610ddb5e6c91b341dca
MD5 ed86dc8157966dbe0b72de21a53ce09f
BLAKE2b-256 164d545e575eb148fec48d6b7e085ea980cce35b448b2985239e992057a4baa5

See more details on using hashes here.
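One way to use the digests listed here is to recompute the SHA256 of a downloaded file and compare it with the published value. The helper names below (sha256_of, verify_download) are hypothetical, and the sketch uses only Python's standard hashlib module.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """True if the file's SHA256 matches the published digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution above, this would be called as verify_download("sourmash-4.8.6.tar.gz", "5a09c5d12cb07f8d73eb60db8dbd30e4e74aabf53a718610ddb5e6c91b341dca"), assuming the file has been downloaded to the current directory.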

File details

Details for the file sourmash-4.8.6-py3-none-win_amd64.whl.

File metadata

  • Download URL: sourmash-4.8.6-py3-none-win_amd64.whl
  • Size: 1.9 MB
  • Tags: Python 3, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for sourmash-4.8.6-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 c9bda2e971a19abcedaf9e42a39c4441c4e031b0e97ce3541ea6af71314e61a2
MD5 c82d45a29888157ab986d7e835e01c90
BLAKE2b-256 16119d4e4b91b13bc306787abf2d9413c76317a4e89568d502c3c12f81678329

File details

Details for the file sourmash-4.8.6-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for sourmash-4.8.6-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8e15a51b84467f04d1d47a27710a2320b3cf019650e4c0c26abee12ab1ec71eb
MD5 a0fcffc515c001a1f25db8435efe8325
BLAKE2b-256 fdf835ec688d89e0a3aa073459cc8df4fa2720f66e830ac51d61ef71d987d0d0

File details

Details for the file sourmash-4.8.6-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl.

File hashes

Hashes for sourmash-4.8.6-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl
Algorithm Hash digest
SHA256 79994cbda8692b51ea7cde5c52466b01fdef75cb2bd07da215d247bca4860558
MD5 396298ab1652be154ea3fbbd0c8755ad
BLAKE2b-256 cfb7827268e01f96da8e01322aa0094b7e96ae672fb70976288b4def820776a7

File details

Details for the file sourmash-4.8.6-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for sourmash-4.8.6-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 f587838a0e1f526dba7a2757f1e699cd72e1b7779ed3a6e6078c2092bcc8596a
MD5 46c348bfbef9b5b5614762ac17b9e32c
BLAKE2b-256 f2bafdee2c49c3e801f7f2e8d8a50b47b97c8f1dacd9a1ca9fa45a13f7b7b3d6

File details

Details for the file sourmash-4.8.6-py3-none-macosx_11_0_arm64.whl.

File hashes

Hashes for sourmash-4.8.6-py3-none-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 46a18ffae995adda42c6e1ec1f738f81236dcc50758e02c9489caf96f969f122
MD5 c6b67d20fee8a40c672fb927415f546d
BLAKE2b-256 b68bed559bffd75a91637c64509a48ee039bd05c605c810fe4b7ed96f45efff2

File details

Details for the file sourmash-4.8.6-py3-none-macosx_10_14_x86_64.whl.

File hashes

Hashes for sourmash-4.8.6-py3-none-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 2dd5c89f73e12e196488508010a6fbc90651c235a38418d7a24f274f3bcbc470
MD5 ea8048aab461abf2deea2c7bb47bec44
BLAKE2b-256 b0bb2b1161dbdbb7197e41e3c10c42f18f772d19660275d5cec35e98c99d983c
