tools for comparing biological sequences with k-mer sketches

Project description

sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.

Project Status: Active. The project has reached a stable, usable state and is being actively developed. License: 3-Clause BSD.

Usage:

sourmash sketch dna *.fq.gz
sourmash compare *.sig -o distances.cmp -k 31
sourmash plot distances.cmp
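
As a rough illustration of what the compare step produces, the sketch below builds a pairwise Jaccard similarity matrix from sets of k-mer hashes. It is a minimal sketch and not sourmash's implementation: the names jaccard, compare_matrix, and the toy hash sets are hypothetical, and real signatures hold hashed k-mers rather than small integers.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of k-mer hashes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_matrix(sketches):
    """Return (names, matrix); matrix[i][j] compares names[i] with names[j]."""
    names = sorted(sketches)
    matrix = [[jaccard(sketches[x], sketches[y]) for y in names] for x in names]
    return names, matrix


# Hypothetical toy sketches: each value stands in for the hashes of one .sig file.
toy = {"a.sig": {1, 2, 3}, "b.sig": {2, 3, 4}, "c.sig": {9}}
names, matrix = compare_matrix(toy)
for name, row in zip(names, matrix):
    print(name, ["%.2f" % value for value in row])
```

The matrix is symmetric with ones on the diagonal, which is the shape of data a clustering or heatmap plot would consume.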

sourmash 1.0 is published in JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).

The latest major release is sourmash v4, which has several command-line and Python incompatibilities with previous versions. Please visit our migration guide to upgrade!


sourmash is a k-mer analysis multitool, and we aim to provide stable, robust programmatic and command-line APIs for a variety of sequence comparisons. Some of our special sauce includes:

  • FracMinHash sketching, which enables accurate comparisons (including ANI) between data sets of different sizes
  • sourmash gather, a combinatorial k-mer approach for more accurate metagenomic profiling

Please see the sourmash publications for details.
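
The two ideas listed above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not sourmash's code: it uses a BLAKE2b-based stand-in hash where sourmash uses MurmurHash, it skips reverse-complement handling, and the function names (kmers, hash_kmer, frac_minhash, containment, gather) are hypothetical. The gather function shows the greedy cover idea only, without the thresholds and abundance handling of the real command.

```python
import hashlib

MAX_HASH = 2**64  # size of the 64-bit hash space


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (stand-in hash for illustration)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq, k=21, scaled=10):
    """Keep hashes below MAX_HASH / scaled: roughly 1/scaled of distinct k-mers."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def containment(a, b):
    """Fraction of the hashes in a that are also present in b."""
    return len(a & b) / len(a) if a else 0.0


def gather(query, references):
    """Greedy cover: repeatedly pick the reference sharing the most remaining hashes."""
    remaining = set(query)
    results = []
    while remaining:
        name, overlap = max(
            ((n, len(remaining & s)) for n, s in references.items()),
            key=lambda item: item[1],
            default=(None, 0),
        )
        if overlap == 0:
            break  # nothing left that any reference can explain
        results.append((name, overlap))
        remaining -= references[name]
    return results
```

Because the retained hashes depend only on the threshold and not on the input size, the sketch of a subsequence is a subset of the sketch of the full sequence at the same scaled value. That property is what makes containment between data sets of different sizes meaningful.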

The name is a riff off of Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Maintainers: C. Titus Brown (@ctb), Luiz C. Irber, Jr (@luizirber), and N. Tessa Pierce-Ward (@bluegenes).

sourmash was initially developed by the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine, and now includes contributions from the global research and developer community.

Installation

We recommend using conda-forge to install sourmash:

conda install -c conda-forge sourmash-minimal

This will install the latest stable version of sourmash 4.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under Python 3.10 and later on Windows, Mac OS X, and Linux. The base requirements are screed, cffi, numpy, matplotlib, and scipy. Conda will install everything necessary, and is our recommended installation method (see below).
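
If you want to check an existing environment against these requirements before installing, a small script along the following lines may help. It is only a sketch: the helper names python_is_supported and missing_modules are hypothetical, and it assumes each requirement is importable under the same name as its package.

```python
import importlib.util
import sys

# Base requirements as listed above.
BASE_REQUIREMENTS = ["screed", "cffi", "numpy", "matplotlib", "scipy"]


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter is Python 3.10 or later."""
    return tuple(version_info[:2]) >= (3, 10)


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("Missing modules:", missing_modules(BASE_REQUIREMENTS))
```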

Installation with conda

conda-forge is a community-maintained channel for the conda package manager. After installing conda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge sourmash-minimal
$ conda activate sourmash_env
$ sourmash --help

which will install the latest released version.

Support

For questions, please open an issue on GitHub, or ask in our chat.

Development

Development happens on GitHub at sourmash-bio/sourmash.

sourmash is developed in Python and Rust, and you will need a Rust environment to build it; see the developer notes for our suggested development setup.

After installation, sourmash is the main command-line entry point; you can also run it with python -m sourmash. For a developer install in a virtual environment, run pip install -e /path/to/repo.

The sourmash/ directory contains the Python library and command-line interface code.

The src/core/ directory contains the Rust library implementing core functionality.

Tests require py.test and can be run with make test.

Please see the developer notes for more information on getting set up with a development environment.

CTB Jan 2024

This version

4.8.8

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • sourmash-4.8.8.tar.gz (13.3 MB): Source

Built Distributions

  • sourmash-4.8.8-py3-none-win_amd64.whl (1.9 MB): Python 3, Windows x86-64
  • sourmash-4.8.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.8 MB): Python 3, manylinux: glibc 2.17+ x86-64
  • sourmash-4.8.8-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (4.2 MB): Python 3, manylinux: glibc 2.17+ ppc64le
  • sourmash-4.8.8-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.3 MB): Python 3, manylinux: glibc 2.17+ ARM64
  • sourmash-4.8.8-py3-none-macosx_11_0_arm64.whl (2.3 MB): Python 3, macOS 11.0+ ARM64
  • sourmash-4.8.8-py3-none-macosx_10_14_x86_64.whl (2.6 MB): Python 3, macOS 10.14+ x86-64

File details

Details for the file sourmash-4.8.8.tar.gz.

File metadata

  • Download URL: sourmash-4.8.8.tar.gz
  • Size: 13.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for sourmash-4.8.8.tar.gz
Algorithm Hash digest
SHA256 667da4b014f003032bdffa43318bb5668edbbc55055fe109dab959a914d69842
MD5 ee598d866bf3e5c75dcd3d5682a2339b
BLAKE2b-256 45db7ca1ce1eedfe86f0094c16809cb5ac444733ea7dc5ad7f3703e4fd607eef

See more details on using hashes here.
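
One way to check a downloaded file against the SHA256 digest listed for it is sketched below. The helper names sha256_of and verify are hypothetical; the approach uses only Python's standard hashlib module and reads the file in chunks so large archives do not need to fit in memory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True when the file's SHA256 digest matches the published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, after downloading the source distribution into the current directory, verify("sourmash-4.8.8.tar.gz", "667da4b014f003032bdffa43318bb5668edbbc55055fe109dab959a914d69842") should return True if the file is intact.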

File details

Details for the file sourmash-4.8.8-py3-none-win_amd64.whl.

File metadata

  • Download URL: sourmash-4.8.8-py3-none-win_amd64.whl
  • Size: 1.9 MB
  • Tags: Python 3, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for sourmash-4.8.8-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 b366a43e72e2a82fdd3cb11f96413df7b6a06b5d5513e61e605675fc205c409a
MD5 accb122662e0abb6a78f716ec6bc97e8
BLAKE2b-256 1dd1108db74fbcc8e473a5067c98095a770360e66768489e049f0135a4c06001

File details

Details for the file sourmash-4.8.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for sourmash-4.8.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 fef5bb197f0f15b04e1d6fd7191630e8e555dfc0e80137a070f6c6b4fed04aa1
MD5 875f4ad10eabf348a70ff8e9e79f04bc
BLAKE2b-256 edc83461d5ac61e6da92e9a48649c8743ffa413648e5d90d1552b4455b12f82d

File details

Details for the file sourmash-4.8.8-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl.

File hashes

Hashes for sourmash-4.8.8-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl
Algorithm Hash digest
SHA256 c0ab053336341cdec0bb42c2bc6326ebbbbaa2ac12d6ad7a0dd49b1d96b81380
MD5 01a86d42577518f8361f21b697e7ad60
BLAKE2b-256 e51459e0e24aa2e62a163cb07ebc6da038058846bf7df06f1e6a25d58e2c1354

File details

Details for the file sourmash-4.8.8-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for sourmash-4.8.8-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 560187f1a34116b9abb4df1536a0c599827af1a942e48cf8cef296a99c6ec628
MD5 df8c791b7e270912e1f6eeb7cb350a3f
BLAKE2b-256 1b71e9443277ce1fe40cf632829415385667b97c1a293ee0566c5a021adb6491

File details

Details for the file sourmash-4.8.8-py3-none-macosx_11_0_arm64.whl.

File hashes

Hashes for sourmash-4.8.8-py3-none-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9b6d76d5706fc0f144e6cc1057941d3bd2779176867feba6e04e7f03d1610b27
MD5 a1fcd92679cef41ae848ec969d6463b8
BLAKE2b-256 8b9f3bc03de95d816ba738fe2d3013a7102107895d0982f898343d1f9ddaed7c

File details

Details for the file sourmash-4.8.8-py3-none-macosx_10_14_x86_64.whl.

File hashes

Hashes for sourmash-4.8.8-py3-none-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 6733ffae9299600b57a18ce9c8a06a8da1618bc90506e6c4ad843f6fa3fb6fdf
MD5 26cb1f6c0e62147607629e2acf4ebaf4
BLAKE2b-256 0147ab15443978de816d423d13595930db8e1c85eaf4c727811cf5528f0c24e5
