diverse-seq provides alignment-free algorithms to facilitate phylogenetic workflows

diverse-seq implements computationally efficient alignment-free algorithms that enable rapid prototyping of phylogenetic workflows. It can accelerate parameter selection searches for sequence alignment and phylogeny estimation by identifying a subset of sequences that are representative of the diversity in a collection. We show that selecting representative sequences with an entropy measure of k-mer frequencies corresponds well to sampling via conventional genetic distances. The computational cost scales linearly with the number of sequences, and the computation can be run in parallel. Applied to a collection of 10.5k whole microbial genomes on a laptop, diverse-seq took ~12 minutes to prepare the data and ~2 minutes to select 100 representatives. diverse-seq can further boost the performance of phylogenetic estimation by providing a seed phylogeny that a more sophisticated algorithm can then refine. For ~1k whole microbial genomes on a laptop, it takes ~1.8 minutes to estimate a bifurcating tree from mash distances.
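
To make the "entropy measure of k-mer frequencies" concrete, the following is a minimal pure-Python sketch of the textbook Jensen-Shannon divergence (JSD) applied to k-mer frequency vectors. It is an illustration only, not the package's Rust implementation; the function names `kmer_freqs`, `entropy` and `jsd` are hypothetical and do not belong to the diverse-seq API.

```python
from collections import Counter, defaultdict
from math import log2


def kmer_freqs(seq: str, k: int) -> dict:
    """Relative frequencies of the overlapping k-mers in seq."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i : i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def entropy(freqs: dict) -> float:
    """Shannon entropy (in bits) of a frequency distribution."""
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


def jsd(freq_list: list) -> float:
    """Textbook JSD: entropy of the mean distribution minus the mean entropy."""
    n = len(freq_list)
    mean = defaultdict(float)
    for freqs in freq_list:
        for kmer, p in freqs.items():
            mean[kmer] += p / n
    return entropy(mean) - sum(entropy(f) for f in freq_list) / n


# Two sequences with no shared 1-mers are maximally divergent.
print(jsd([kmer_freqs("AAAA", 1), kmer_freqs("CCCC", 1)]))  # 1.0
```

Identical sequences give a JSD of zero, so a set of sequences with a high JSD covers more of the k-mer diversity in the collection.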

You can read more about the methods implemented in diverse-seq in the paper here.

The user documentation is here.

📣 Announcements 📣

Reimplemented core routines in Rust!

The prep step takes approximately the same amount of time as before, while sampling divergent sequences is ~2x faster 🏎️🎉.

Warning -- backwards incompatible changes

The Rust rewrite was accompanied by a switch to using the Zarr storage format instead of HDF5. The output file from dvs prep now has the suffix .dvseqsz instead of .dvseq. Old-format files are not compatible with this version.
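
If you have existing data, a short script can help locate old-format files that need to be regenerated with dvs prep. The sketch below relies only on the two suffixes named above; the helper names `is_old_format` and `find_old_format` are hypothetical and not part of diverse-seq.

```python
from pathlib import Path

OLD_SUFFIX = ".dvseq"    # HDF5-based format written by earlier versions
NEW_SUFFIX = ".dvseqsz"  # Zarr-based format written by this version


def is_old_format(path) -> bool:
    """True if the path carries the old .dvseq suffix."""
    return Path(path).suffix == OLD_SUFFIX


def find_old_format(root) -> list:
    """Recursively list paths under root that use the old suffix."""
    return sorted(p for p in Path(root).rglob("*") if is_old_format(p))


for path in find_old_format("."):
    print(f"{path} must be regenerated with dvs prep")
```

The check is by suffix only, so a file that was renamed by hand will not be detected.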

Installation

We recommend installing diverse-seq from PyPI. For the full Jupyter experience, include the extra dependencies:

pip install "diverse-seq[extra]"

For command-line-only usage, install as follows:

pip install diverse-seq

NOTE If you experience any errors during installation, we recommend using uv pip. This command provides much better error messages than the standard pip command. If you cannot resolve the installation problem, please open an issue on the GitHub repository.
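
To confirm that an installation succeeded, you can query the installed distribution from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the default distribution name is taken from the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "diverse-seq") -> Optional[str]:
    """Return the installed version of dist_name, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()
print(f"diverse-seq {found}" if found else "diverse-seq is not installed")
```

A result of None from the interpreter you intend to use usually means the package was installed into a different environment.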

Using uv

uv also provides a simpler way to install dvs as a command-line-only tool:

uv tool install diverse-seq

In this case, dvs is run as follows:

uvx --from diverse-seq dvs

Dependencies

For a full listing of dependencies, see the pyproject.toml file.

The command line interface

dvs is the command line interface for diverse-seq.

The `dvs` subcommands
Usage: dvs [OPTIONS] COMMAND [ARGS]...

  dvs -- alignment free detection of the most diverse sequences using JSD

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  demo-data  Export a demo sequence file
  prep       Writes processed sequences to <Zarr Storage>.dvseqsz.
  max        Identify the seqs that maximise average delta JSD
  nmost      Identify n seqs that maximise average delta JSD
  ctree      Quickly compute a cluster tree based on kmers for a collection...
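
When scripting around dvs, it can be handy to discover the available subcommands programmatically. The sketch below parses help text of the shape shown above; `list_subcommands` is a hypothetical helper, and it assumes that command rows are indented by exactly two spaces, as in that output.

```python
def list_subcommands(help_text: str) -> list:
    """Return the names listed under 'Commands:' in the help text."""
    names = []
    in_commands = False
    for line in help_text.splitlines():
        if line.strip() == "Commands:":
            in_commands = True
            continue
        if not in_commands:
            continue
        if not line.strip():
            break  # a blank line ends the Commands section
        # Command rows are indented by exactly two spaces; wrapped
        # description lines are indented further and are skipped.
        if line.startswith("  ") and not line.startswith("   "):
            names.append(line.split()[0])
    return names


HELP_TEXT = """\
Usage: dvs [OPTIONS] COMMAND [ARGS]...

  dvs -- alignment free detection of the most diverse sequences using JSD

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  demo-data  Export a demo sequence file
  prep       Writes processed sequences to <Zarr Storage>.dvseqsz.
  max        Identify the seqs that maximise average delta JSD
  nmost      Identify n seqs that maximise average delta JSD
  ctree      Quickly compute a cluster tree based on kmers for a collection...
"""

print(list_subcommands(HELP_TEXT))
# ['demo-data', 'prep', 'max', 'nmost', 'ctree']
```

With dvs installed, the same function could be fed live output, for example from `subprocess.run(["dvs", "--help"], capture_output=True, text=True).stdout`.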

The Python API

We make comparable capabilities available as cogent3 apps. The main difference is that the app instances operate directly on, and return, cogent3 sequence collections. See the docs for demonstrations of how to use the apps.

Project Information

diverse-seq is released under the BSD-3 license. If you want to contribute to the diverse-seq project (and we hope you do! 😇) the code of conduct and other useful developer information is available on the wiki.
