diverse-seq provides alignment-free algorithms to facilitate phylogenetic workflows

diverse-seq implements computationally efficient alignment-free algorithms that enable rapid prototyping of phylogenetic workflows. It can accelerate parameter selection searches for sequence alignment and phylogeny estimation by identifying a subset of sequences that is representative of the diversity in a collection. We show that selecting representative sequences with an entropy measure of k-mer frequencies corresponds well to sampling via conventional genetic distances. Computation time scales linearly with the number of sequences, and the work can be run in parallel. Applied to a collection of 10.5k whole microbial genomes on a laptop, diverse-seq took ~12 minutes to prepare the data and ~2 minutes to select 100 representatives. diverse-seq can further boost the performance of phylogenetic estimation by providing a seed phylogeny that a more sophisticated algorithm can then refine. For ~1k whole microbial genomes on a laptop, it takes ~1.8 minutes to estimate a bifurcating tree from mash distances.
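To make the "entropy measure of k-mer frequencies" concrete, here is a minimal pure-Python sketch of the Jensen-Shannon divergence (JSD) between k-mer frequency distributions. It is an illustration only, not the diverse-seq implementation: the function names `kmer_freqs`, `shannon_entropy` and `jsd`, the toy sequences, and the choice of `k = 2` are all assumptions made for this example.

```python
import math
from collections import Counter


def kmer_freqs(seq: str, k: int) -> dict[str, float]:
    """Return the relative frequency of each k-mer in seq (hypothetical helper)."""
    counts = Counter(seq[i : i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def shannon_entropy(freqs: dict[str, float]) -> float:
    """Shannon entropy, in bits, of a frequency distribution."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def jsd(freq_dists: list[dict[str, float]]) -> float:
    """JSD of equally weighted distributions.

    Computed as H(mean distribution) minus the mean of the individual entropies.
    """
    n = len(freq_dists)
    mean: dict[str, float] = {}
    for freqs in freq_dists:
        for kmer, p in freqs.items():
            mean[kmer] = mean.get(kmer, 0.0) + p / n
    mean_entropy = sum(shannon_entropy(f) for f in freq_dists) / n
    return shannon_entropy(mean) - mean_entropy


# Toy sequences, invented for this example
seqs = {
    "a": "ACGTACGTACGT",
    "b": "ACGTACGTACGT",  # identical to "a"
    "c": "TTTTGGGGCCCC",  # shares no 2-mers with "a"
}
k = 2
freqs = {name: kmer_freqs(s, k) for name, s in seqs.items()}
identical = jsd([freqs["a"], freqs["b"]])
different = jsd([freqs["a"], freqs["c"]])
print(f"JSD(a, b) = {identical:.3f}, JSD(a, c) = {different:.3f}")
```

Identical sequences contribute no divergence, while sequences with distinct k-mer content raise it. A set of sequences with a high JSD is therefore a plausible stand-in for a set that spans the diversity of the collection.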

The methods implemented in diverse-seq are described in the accompanying paper.

User documentation is also available.

📣 Announcements 📣

Reimplemented core routines in Rust!

The prep step takes approximately the same amount of time as before, while sampling divergent sequences is ~2x faster 🏎️🎉.

Warning -- backwards incompatible changes

The Rust rewrite was accompanied by a switch to using the Zarr storage format instead of HDF5. The output file from dvs prep now has the suffix .dvseqsz instead of .dvseq. Old-format files are not compatible with this version.
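When migrating a pipeline, it can help to check which storage format an existing file uses before passing it to dvs. The sketch below classifies a path by its suffix alone, using the two suffixes named above. The helper `storage_format` and its return labels are hypothetical and not part of diverse-seq.

```python
from pathlib import Path

NEW_SUFFIX = ".dvseqsz"  # Zarr-based output of the current dvs prep
OLD_SUFFIX = ".dvseq"  # earlier HDF5-based output, not compatible


def storage_format(path: str) -> str:
    """Classify a dvs prep output path by suffix only (hypothetical helper)."""
    suffix = Path(path).suffix
    if suffix == NEW_SUFFIX:
        return "zarr"
    if suffix == OLD_SUFFIX:
        return "hdf5-legacy"
    return "unknown"


for name in ("genomes.dvseqsz", "genomes.dvseq", "genomes.fasta"):
    print(name, "->", storage_format(name))
```

Files reported as `hdf5-legacy` would need to be regenerated with dvs prep from the original sequences.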

Installation

For the full Jupyter experience, we recommend installing diverse-seq from PyPI with its optional extras:

pip install "diverse-seq[extra]"

For command-line-only usage, install as follows:

pip install diverse-seq

NOTE If you experience any errors during installation, we recommend using uv pip. This command provides much better error messages than the standard pip command. If you cannot resolve the installation problem, please open an issue on the GitHub repository.

Using uv

uv also offers a simple way to install dvs as a command-line-only tool:

uv tool install diverse-seq

In this case, invoke it as:

uvx --from diverse-seq dvs

Dependencies

For a full listing of dependencies, see the pyproject.toml file.

The command line interface

dvs is the command line interface for diverse-seq.

The `dvs` subcommands
Usage: dvs [OPTIONS] COMMAND [ARGS]...

  dvs -- alignment free detection of the most diverse sequences using JSD

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  demo-data  Export a demo sequence file
  prep       Writes processed sequences to <Zarr Storage>.dvseqsz.
  max        Identify the seqs that maximise average delta JSD
  nmost      Identify n seqs that maximise average delta JSD
  ctree      Quickly compute a cluster tree based on kmers for a collection...
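The description above mentions estimating a tree from mash distances, and ctree builds a cluster tree from k-mers. As a rough illustration of what a mash distance is, the sketch below computes it from exact k-mer sets using the published Mash formula D = -ln(2j / (1 + j)) / k, where j is the Jaccard index. This is a simplification under stated assumptions: Mash itself estimates j from MinHash sketches rather than full k-mer sets, and whether dvs ctree computes distances this way is not confirmed here. All function names and sequences are invented for the example.

```python
import math


def kmer_set(seq: str, k: int) -> set[str]:
    """Return the set of distinct k-mers in seq (hypothetical helper)."""
    return {seq[i : i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index: shared k-mers divided by all distinct k-mers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(seq1: str, seq2: str, k: int) -> float:
    """Mash distance from exact k-mer sets (no MinHash sketching)."""
    j = jaccard(kmer_set(seq1, k), kmer_set(seq2, k))
    if j == 0.0:
        return 1.0  # assumed convention: no shared k-mers gives the maximum distance
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


# Toy sequences, invented for this example
s1 = "ACGTACGTAC"
s2 = "ACGTACGTTC"
s3 = "GGGGGGGGGG"
k = 3
d_same = mash_distance(s1, s1, k)
d_close = mash_distance(s1, s2, k)
d_far = mash_distance(s1, s3, k)
print(d_same, round(d_close, 4), d_far)
```

A matrix of such pairwise distances is the usual input to a distance-based tree-building method, which then yields the kind of seed phylogeny described in the introduction.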

The Python API

We make comparable capabilities available as cogent3 apps. The main difference is that app instances operate directly on, and return, cogent3 sequence collections. See the docs for demonstrations of how to use the apps.

Project Information

diverse-seq is released under the BSD-3 license. If you want to contribute to the diverse-seq project (and we hope you do! 😇) the code of conduct and other useful developer information is available on the wiki.

