
Phylo2Vec: integer vector representation of binary (phylogenetic) trees


Phylo2Vec


LGPL-3.0 License


Phylo2Vec (or phylo2vec) is a high-performance software package for encoding, manipulating, and analysing binary phylogenetic trees. At its core, the package contains a representation of binary trees that maps each tree topology with 𝑛 leaves bijectively to an integer vector of size 𝑛 − 1. Compared to the traditional Newick format, phylo2vec was designed with fast sampling, fast conversion/compression from Newick-format trees to the Phylo2Vec format, and rapid tree comparison in mind.

The current version features a core implementation in Rust, providing significant performance improvements and memory efficiency. It remains available in Python (superseding the version described in the original paper) and in R via dedicated wrappers, making it accessible to a broad audience in the bioinformatics community.

Link to the paper: https://doi.org/10.1093/sysbio/syae030
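
To make the size claim concrete, here is a minimal sketch in plain Python. It assumes the constraint described in the paper, namely that entry i (0-indexed) of a vector lies in {0, ..., 2i}. Under that assumption, every integer vector satisfying the constraint corresponds to exactly one tree, so counting and sampling trees reduces to counting and sampling vectors. The helper names (is_valid_vector, count_vectors, sample_vector) are illustrative and are not part of the phylo2vec API.

```python
import random
from math import prod


def is_valid_vector(v):
    """Check the assumed constraint 0 <= v[i] <= 2*i for every position i."""
    return all(0 <= x <= 2 * i for i, x in enumerate(v))


def count_vectors(n_leaves):
    """Number of valid vectors for n_leaves leaves: 1 * 3 * ... * (2n - 3)."""
    return prod(2 * i + 1 for i in range(n_leaves - 1))


def sample_vector(n_leaves, rng=random):
    """Draw each entry independently and uniformly from its allowed range."""
    return [rng.randint(0, 2 * i) for i in range(n_leaves - 1)]


v = sample_vector(6)
print(len(v), is_valid_vector(v))  # 5 True
print(count_vectors(6))            # 945
```

The count for 6 leaves (945) matches the known number of rooted binary trees on 6 labelled leaves, which is consistent with the bijection described above.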

Installation

Python package

Pip

The easiest way to install the standard Python package is using pip:

pip install phylo2vec

Several optimization schemes based on Phylo2Vec are also available, but require extra dependencies (see this notebook for a demo). To avoid bloating the standard package, these dependencies must be installed separately. To do so, run:

pip install "phylo2vec[opt]"
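
After either command, one quick way to confirm which version ended up in the active environment is to query the package metadata. This is a minimal sketch using only the Python standard library; the helper name installed_version is illustrative.

```python
from importlib import metadata


def installed_version(package="phylo2vec"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("phylo2vec is not installed in this environment")
else:
    print(f"phylo2vec {version} is installed")
```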

Manual installation

  • We recommend setting up the pixi package management tool.
  • Clone the repository and install using pixi:
git clone https://github.com/sbhattlab/phylo2vec.git
cd phylo2vec
pixi run -e py-phylo2vec install-python

This compiles and installs the package; the compilation step is needed because the core functionality is written in Rust.

R package

Option 1: from a release (Windows, Mac, Ubuntu >= 22.04)

Note: Pre-built Mac binaries are available for Apple Silicon (ARM64) and Intel (x86_64, macOS 15+ only). Intel Mac users on older macOS versions should use Option 2 or 3.

Download the compiled binary that matches your OS from the releases. Once the file is downloaded, run install.packages in your R console:

install.packages("/path/to/package_file", repos = NULL, type = 'source')

Option 2: using devtools

⚠️ This requires installing Rust to build the core package.

devtools::install_github("sbhattlab/phylo2vec", subdir="./r-phylo2vec", build = FALSE)

Note: to download a specific version, use:

devtools::install_github("sbhattlab/phylo2vec@vX.Y.Z", subdir="./r-phylo2vec", build = FALSE)

Option 3: manual installation

⚠️ This requires installing Rust to build the core package.

Clone the repository, then run install.packages from an R console as shown below.

Note: to install a specific version, check out the desired tag with git checkout before installing.

# In a shell: fetch the sources
git clone https://github.com/sbhattlab/phylo2vec
cd phylo2vec
# In an R session started from the phylo2vec directory
install.packages("./r-phylo2vec", repos = NULL, type = 'source')

Basic Usage

Python

Conversion between Newick and vector representations

import numpy as np
from phylo2vec import from_newick, to_newick

# Convert a vector to Newick string
v = np.array([0, 1, 2, 3, 4])
newick = to_newick(v)  # '(0,(1,(2,(3,(4,5)6)7)8)9)10;'

# Convert Newick string back to vector
v_converted = from_newick(newick)  # array([0, 1, 2, 3, 4], dtype=int16)
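
The relationship between vector length and leaf count can be checked directly on the Newick string from the example above. The sketch below does not use phylo2vec; it assumes unquoted labels without whitespace, and treats a label as a leaf when it directly follows "(" or "," (labels that follow ")" are internal nodes). The function name leaf_labels is illustrative.

```python
import re


def leaf_labels(newick):
    """Return the labels that directly follow '(' or ',' (the leaves)."""
    return re.findall(r"(?<=[(,])[^(),:;]+", newick)


newick = "(0,(1,(2,(3,(4,5)6)7)8)9)10;"
leaves = leaf_labels(newick)
print(leaves)           # ['0', '1', '2', '3', '4', '5']
print(len(leaves) - 1)  # 5, the length of the vector [0, 1, 2, 3, 4]
```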

Tree Manipulation

from phylo2vec.utils.vector import add_leaf, remove_leaf, reroot_at_random

# Add a leaf to an existing tree
v_new = add_leaf(v, 2)  # Add a leaf to the third position

# Remove a leaf
v_reduced = remove_leaf(v, 1)  # Remove the second leaf

# Random rerooting
v_rerooted = reroot_at_random(v)

Optimization

To run the hill climbing-based optimisation scheme presented in the original Phylo2Vec paper, run:

# A hill-climbing scheme to optimize Phylo2Vec vectors
from phylo2vec.opt import HillClimbing

hc = HillClimbing(verbose=True)
hc_result = hc.fit("/path/to/your_fasta_file.fa")
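
For readers unfamiliar with hill climbing, the general idea can be sketched without the library. The code below is a generic greedy search over integer vectors, not the HillClimbing class above and not the scheme from the paper: it changes one entry at a time and keeps the change only if a loss decreases. The allowed range {0, ..., 2i} for entry i is an assumption taken from the paper's description of the representation, and the loss is a placeholder with no phylogenetic meaning.

```python
import random


def hill_climb(v, loss, n_rounds=20, seed=0):
    """Greedy search: change one entry at a time, keep it if the loss drops."""
    rng = random.Random(seed)
    best = list(v)
    best_loss = loss(best)
    for _ in range(n_rounds):
        improved = False
        positions = list(range(1, len(best)))  # entry 0 can only be 0
        rng.shuffle(positions)
        for i in positions:
            for candidate_value in range(2 * i + 1):  # assumed range {0, ..., 2i}
                if candidate_value == best[i]:
                    continue
                candidate = best[:i] + [candidate_value] + best[i + 1:]
                candidate_loss = loss(candidate)
                if candidate_loss < best_loss:
                    best, best_loss = candidate, candidate_loss
                    improved = True
        if not improved:
            break  # local optimum: no single-entry change helps
    return best, best_loss


# Placeholder objective with no biological meaning: distance to a fixed target.
target = [0, 1, 2, 3, 4]


def toy_loss(v):
    return sum(abs(a - b) for a, b in zip(v, target))


start = [0, 0, 0, 0, 0]
best, best_loss = hill_climb(start, toy_loss)
print(best, best_loss)  # [0, 1, 2, 3, 4] 0
```

In a real setting the loss would score the tree encoded by the vector against sequence data, which is the part the library handles when given a FASTA file.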

Command-line interface (CLI)

We also provide a command-line interface for quick experimentation on phylo2vec-derived objects.

To see the available functions, run:

phylo2vec --help

Examples:

phylo2vec samplev 5 # Sample a vector with 5 leaves
phylo2vec samplem 5 # Sample a matrix with 5 leaves
phylo2vec from_newick '((0,1),2);' # Convert a Newick to a vector
phylo2vec from_newick '((0:0.3,1:0.1):0.5,2:0.4);' # Convert a Newick to a matrix
phylo2vec to_newick 0,1,2 # Convert a vector to Newick
phylo2vec to_newick $'0.0,1.0,2.0\n0.0,3.0,4.0' # Convert a matrix to Newick
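
The vector and matrix arguments in these examples are plain comma-separated text, with one row per line for a matrix. A small parser for that text format, assuming nothing beyond what the examples show, could look like the sketch below. The function name parse_phylo_text is hypothetical, and a single-line input is always treated as a vector.

```python
def parse_phylo_text(text):
    """Parse '0,1,2' into a list of ints, or multi-line text into rows of floats."""
    rows = [line.split(",") for line in text.strip().splitlines()]
    if len(rows) == 1:
        return [int(x) for x in rows[0]]
    return [[float(x) for x in row] for row in rows]


print(parse_phylo_text("0,1,2"))
# [0, 1, 2]
print(parse_phylo_text("0.0,1.0,2.0\n0.0,3.0,4.0"))
# [[0.0, 1.0, 2.0], [0.0, 3.0, 4.0]]
```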

Datasets

Descriptions of the datasets, as well as download links, are available in the datasets directory.

Datasets for which a FASTA file is available can be downloaded and loaded into Biopython:

from phylo2vec.datasets import load_alignment

load_alignment("zika")

Readily downloadable datasets can be listed using:

from phylo2vec.datasets import list_datasets

list_datasets()

Documentation

For comprehensive documentation, tutorials, and API reference, visit: https://phylo2vec.readthedocs.io

How to Contribute (issues, feature requests...)

Found a bug or want a new feature? We welcome contributions to phylo2vec! 🤗 Feel free to report any bugs or feature requests on our Issues page. If you want to contribute directly to the project, fork the repository, create a new branch, and open a pull request (PR) on our Pull requests page.

Please refer to our Contributing guidelines for more details on how to report bugs, request features, or submit code improvements.

Thanks to all our contributors so far!


License

This project is distributed under the GNU Lesser General Public License v3.0 (LGPL).

Citation

If you use Phylo2Vec in your research, please cite:

@article{10.1093/sysbio/syae030,
    author = {Penn, Matthew J and Scheidwasser, Neil and Khurana, Mark P and Duchêne, David A and Donnelly, Christl A and Bhatt, Samir},
    title = {Phylo2Vec: a vector representation for binary trees},
    journal = {Systematic Biology},
    year = {2024},
    month = {03},
    doi = {10.1093/sysbio/syae030},
    url = {https://doi.org/10.1093/sysbio/syae030},
}

If you use the software, please cite:

@article{10.21105/joss.09040,
    doi = {10.21105/joss.09040},
    url = {https://doi.org/10.21105/joss.09040},
    year = {2025},
    publisher = {The Open Journal},
    volume = {10},
    number = {114},
    pages = {9040},
    author = {Scheidwasser, Neil and Nag, Ayush and Penn, Matthew J. and Jakob, Anthony and Andersen, Frederik Mølkjær and Khurana, Mark Poulsen and Setiawan, Landung and Duchêne, David A. and Bhatt, Samir},
    title = {phylo2vec: a library for vector-based phylogenetic tree manipulation},
    journal = {Journal of Open Source Software}
}

Related Work



Download files

Source Distribution

  • phylo2vec-1.7.0.tar.gz (130.1 kB): Source

Built Distributions

  • phylo2vec-1.7.0-cp310-abi3-win_amd64.whl (918.9 kB): CPython 3.10+, Windows x86-64
  • phylo2vec-1.7.0-cp310-abi3-win32.whl (832.0 kB): CPython 3.10+, Windows x86
  • phylo2vec-1.7.0-cp310-abi3-musllinux_1_2_x86_64.whl (1.4 MB): CPython 3.10+, musllinux: musl 1.2+ x86-64
  • phylo2vec-1.7.0-cp310-abi3-musllinux_1_2_i686.whl (1.4 MB): CPython 3.10+, musllinux: musl 1.2+ i686
  • phylo2vec-1.7.0-cp310-abi3-musllinux_1_2_armv7l.whl (1.4 MB): CPython 3.10+, musllinux: musl 1.2+ ARMv7l
  • phylo2vec-1.7.0-cp310-abi3-musllinux_1_2_aarch64.whl (1.3 MB): CPython 3.10+, musllinux: musl 1.2+ ARM64
  • phylo2vec-1.7.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB): CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • phylo2vec-1.7.0-cp310-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.2 MB): CPython 3.10+, manylinux: glibc 2.17+ s390x
  • phylo2vec-1.7.0-cp310-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.3 MB): CPython 3.10+, manylinux: glibc 2.17+ ppc64le
  • phylo2vec-1.7.0-cp310-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.1 MB): CPython 3.10+, manylinux: glibc 2.17+ ARMv7l
  • phylo2vec-1.7.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.1 MB): CPython 3.10+, manylinux: glibc 2.17+ ARM64
  • phylo2vec-1.7.0-cp310-abi3-manylinux_2_5_i686.manylinux1_i686.whl (1.2 MB): CPython 3.10+, manylinux: glibc 2.5+ i686
  • phylo2vec-1.7.0-cp310-abi3-macosx_11_0_arm64.whl (1.0 MB): CPython 3.10+, macOS 11.0+ ARM64
  • phylo2vec-1.7.0-cp310-abi3-macosx_10_12_x86_64.whl (1.1 MB): CPython 3.10+, macOS 10.12+ x86-64

