Phylo2Vec: integer vector representation of binary (phylogenetic) trees

Phylo2Vec

Phylo2Vec (or phylo2vec) is a high-performance software package for encoding, manipulating, and analysing binary phylogenetic trees. At its core, the package contains a representation of binary trees that defines a bijection from the tree topologies with 𝑛 leaves to integer vectors of size 𝑛 − 1. Compared to the traditional Newick format, phylo2vec was designed with fast sampling, fast conversion/compression from Newick-format trees to the Phylo2Vec format, and rapid tree comparison in mind.
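
To make the bijection concrete, the sketch below checks whether a vector is valid and counts the valid vectors of a given size. It assumes, following the definition in the paper, that the entry at 0-indexed position i must lie in {0, ..., 2i}. The helper names is_valid_vector and count_topologies are hypothetical and not part of the phylo2vec API.

```python
from math import prod


def is_valid_vector(v):
    """Check the assumed constraint 0 <= v[i] <= 2*i at every 0-indexed position i."""
    return all(0 <= x <= 2 * i for i, x in enumerate(v))


def count_topologies(n_leaves):
    """Count valid vectors of length n_leaves - 1: 1 * 3 * 5 * ... * (2n - 3)."""
    return prod(2 * i + 1 for i in range(n_leaves - 1))


print(is_valid_vector([0, 1, 2, 3, 4]))  # True
print(is_valid_vector([0, 3]))           # False: position 1 allows at most 2
print(count_topologies(4))               # 15
```

Under this constraint, the count for 𝑛 leaves is (2𝑛 − 3)!!, which is also the number of rooted binary trees with 𝑛 labelled leaves.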

The current version features a core implementation in Rust, which provides significant performance improvements and memory efficiency. It remains available in Python (superseding the version described in the original paper) and in R via dedicated wrappers, making it accessible to a broad audience in the bioinformatics community.

Link to the paper: https://doi.org/10.1093/sysbio/syae030

Installation

Python package

Pip

The easiest way to install the standard Python package is using pip:

pip install phylo2vec

Several optimization schemes based on Phylo2Vec are also available but require extra dependencies (see this notebook for a demo). To avoid bloating the standard package, these dependencies must be installed separately. To do so, run:

pip install "phylo2vec[opt]"

Manual installation

  • We recommend setting up the pixi package management tool.
  • Clone the repository and install using pixi:
git clone https://github.com/sbhattlab/phylo2vec.git
cd phylo2vec
pixi run -e py-phylo2vec install-python

Because the core functionality is written in Rust, this command compiles the package before installing it.

R package

Option 1: from a release (Windows, Mac, Ubuntu >= 22.04)

From the releases, retrieve the compiled binary that matches your OS. Once the file is downloaded, run install.packages in your R command line.

install.packages("/path/to/package_file", repos = NULL, type = 'source')

Option 2: using devtools

⚠️ This requires installing Rust to build the core package.

devtools::install_github("sbhattlab/phylo2vec", subdir="./r-phylo2vec", build = FALSE)

Note: to download a specific version, use:

devtools::install_github("sbhattlab/phylo2vec@vX.Y.Z", subdir="./r-phylo2vec", build = FALSE)

Option 3: manual installation

⚠️ This requires installing Rust to build the core package.

Clone the repository and run the following install.packages command in your R command line.

Note: to install a specific version, check out the desired tag with git checkout before installing.

# In a shell
git clone https://github.com/sbhattlab/phylo2vec
cd phylo2vec
# In R, started from the repository root
install.packages("./r-phylo2vec", repos = NULL, type = 'source')

Basic Usage

Python

Conversion between Newick and vector representations

import numpy as np
from phylo2vec import from_newick, to_newick

# Convert a vector to Newick string
v = np.array([0, 1, 2, 3, 4])
newick = to_newick(v)  # '(0,(1,(2,(3,(4,5)6)7)8)9)10;'

# Convert Newick string back to vector
v_converted = from_newick(newick)  # array([0, 1, 2, 3, 4], dtype=int16)
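
As a quick sanity check on the sizes involved, the following sketch counts the leaves of a Newick string without using phylo2vec. The helper leaf_labels is hypothetical, and it assumes a Newick string without whitespace or quoted labels, like the one shown above.

```python
import re


def leaf_labels(newick):
    """Return leaf labels: tokens that directly follow '(' or ','."""
    # Labels after ')' belong to internal nodes; text after ':' is a branch length.
    return re.findall(r"(?<=[(,])[^(),:;]+", newick)


newick = "(0,(1,(2,(3,(4,5)6)7)8)9)10;"
leaves = leaf_labels(newick)
print(leaves)           # ['0', '1', '2', '3', '4', '5']
print(len(leaves) - 1)  # 5, the length of the vector [0, 1, 2, 3, 4]
```

A tree with 6 leaves corresponds to a vector of size 5, which matches the 𝑛 − 1 relationship stated in the introduction.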

Tree Manipulation

from phylo2vec.utils.vector import add_leaf, remove_leaf, reroot_at_random

# Add a leaf to an existing tree
v_new = add_leaf(v, 2)  # Add a leaf to the third position

# Remove a leaf
v_reduced = remove_leaf(v, 1)  # Remove the second leaf

# Random rerooting
v_rerooted = reroot_at_random(v)

Optimization

To run the hill climbing-based optimisation scheme presented in the original Phylo2Vec paper, run:

# A hill-climbing scheme to optimize Phylo2Vec vectors
from phylo2vec.opt import HillClimbing

hc = HillClimbing(verbose=True)
hc_result = hc.fit("/path/to/your_fasta_file.fa")

Command-line interface (CLI)

We also provide a command-line interface for quick experimentation on phylo2vec-derived objects.

To see the available functions, run:

phylo2vec --help

Examples:

phylo2vec samplev 5 # Sample a vector with 5 leaves
phylo2vec samplem 5 # Sample a matrix with 5 leaves
phylo2vec from_newick '((0,1),2);' # Convert a Newick to a vector
phylo2vec from_newick '((0:0.3,1:0.1):0.5,2:0.4);' # Convert a Newick to a matrix
phylo2vec to_newick 0,1,2 # Convert a vector to Newick
phylo2vec to_newick $'0.0,1.0,2.0\n0.0,3.0,4.0' # Convert a matrix to Newick
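
The matrix examples above pair a topology with branch lengths. The sketch below splits a matrix written in that text form into its parts. It assumes that the first column holds the vector and the remaining two columns hold branch lengths; that layout and the helper name split_matrix are assumptions made for illustration and are not confirmed by the examples above.

```python
def split_matrix(text):
    """Split comma-separated rows into (vector, branch_lengths).

    Assumed layout: column 0 is the topology vector, columns 1-2 are branch lengths.
    """
    rows = [[float(x) for x in line.split(",")] for line in text.splitlines()]
    vector = [int(row[0]) for row in rows]
    branch_lengths = [tuple(row[1:]) for row in rows]
    return vector, branch_lengths


vector, branch_lengths = split_matrix("0.0,1.0,2.0\n0.0,3.0,4.0")
print(vector)           # [0, 0]
print(branch_lengths)   # [(1.0, 2.0), (3.0, 4.0)]
print(len(vector) + 1)  # 3 leaves
```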

Documentation

For comprehensive documentation, tutorials, and API reference, visit: https://phylo2vec.readthedocs.io

How to Contribute

We welcome contributions to phylo2vec! Please refer to our Contributing guidelines for more details on how to report bugs, request features, or submit code improvements.

Thanks to all our contributors so far!

License

This project is distributed under the GNU Lesser General Public License v3.0 (LGPL).

Citation

If you use Phylo2Vec in your research, please cite:

@article{10.1093/sysbio/syae030,
    author = {Penn, Matthew J and Scheidwasser, Neil and Khurana, Mark P and Duchêne, David A and Donnelly, Christl A and Bhatt, Samir},
    title = {Phylo2Vec: a vector representation for binary trees},
    journal = {Systematic Biology},
    year = {2024},
    month = {03},
    doi = {10.1093/sysbio/syae030},
    url = {https://doi.org/10.1093/sysbio/syae030},
}

Related Work

Download files

Source Distribution

  • phylo2vec-1.4.1.tar.gz (111.2 kB): Source

Built Distributions

  • phylo2vec-1.4.1-cp310-abi3-win_amd64.whl (871.5 kB): CPython 3.10+, Windows x86-64
  • phylo2vec-1.4.1-cp310-abi3-win32.whl (797.4 kB): CPython 3.10+, Windows x86
  • phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_x86_64.whl (1.3 MB): CPython 3.10+, musllinux: musl 1.2+ x86-64
  • phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_i686.whl (1.3 MB): CPython 3.10+, musllinux: musl 1.2+ i686
  • phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_armv7l.whl (1.3 MB): CPython 3.10+, musllinux: musl 1.2+ ARMv7l
  • phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_aarch64.whl (1.3 MB): CPython 3.10+, musllinux: musl 1.2+ ARM64
  • phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB): CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.2 MB): CPython 3.10+, manylinux: glibc 2.17+ s390x
  • phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.3 MB): CPython 3.10+, manylinux: glibc 2.17+ ppc64le
  • phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.1 MB): CPython 3.10+, manylinux: glibc 2.17+ ARMv7l
  • phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.1 MB): CPython 3.10+, manylinux: glibc 2.17+ ARM64
  • phylo2vec-1.4.1-cp310-abi3-manylinux_2_5_i686.manylinux1_i686.whl (1.2 MB): CPython 3.10+, manylinux: glibc 2.5+ i686
  • phylo2vec-1.4.1-cp310-abi3-macosx_11_0_arm64.whl (991.5 kB): CPython 3.10+, macOS 11.0+ ARM64
  • phylo2vec-1.4.1-cp310-abi3-macosx_10_12_x86_64.whl (1.0 MB): CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file phylo2vec-1.4.1.tar.gz.

File metadata

  • Download URL: phylo2vec-1.4.1.tar.gz
  • Size: 111.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.9.4

File hashes

Hashes for phylo2vec-1.4.1.tar.gz
Algorithm Hash digest
SHA256 47551c4ae22ef62ac716e003c34abce5b5ba4d45386997cb2c5c31c57c84040b
MD5 825a492d6ec06fbc72459dd65950df09
BLAKE2b-256 e415d5154ded94682276be30da1a3fc403f80a5d2f58f8b4a3e7ab9512d45561

See more details on using hashes here.
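
A downloaded file can be checked against the SHA256 digests listed here with Python's standard hashlib module. The sketch below is one way to do so; the helper names and the local file path are hypothetical, and the expected digest is the one listed above for the source distribution.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Digest listed above for phylo2vec-1.4.1.tar.gz
EXPECTED = "47551c4ae22ef62ac716e003c34abce5b5ba4d45386997cb2c5c31c57c84040b"
archive = Path("phylo2vec-1.4.1.tar.gz")  # assumed download location
if archive.exists():
    print("digest matches:", matches_sha256(archive, EXPECTED))
```

Reading in chunks keeps memory use constant regardless of file size.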

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: phylo2vec-1.4.1-cp310-abi3-win_amd64.whl
  • Size: 871.5 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.9.4

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 61fa3cb46785700a14d7277d5dcdafed1dd11fa758f94194c39feb0573b45d78
MD5 eb33ac9c96576098dea684dabdb9d957
BLAKE2b-256 31fa4fc18c654a6e69ac46d68220ee9e66efc890e0d6fd4b73d5a0144d8a011f

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-win32.whl.

File metadata

  • Download URL: phylo2vec-1.4.1-cp310-abi3-win32.whl
  • Size: 797.4 kB
  • Tags: CPython 3.10+, Windows x86
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.9.4

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-win32.whl
Algorithm Hash digest
SHA256 6194e0637f2c3beebea2fdc7da46c2ee1163c440ba856b1e62401a62b2c9a148
MD5 c96ae62ed0ef32c66416be02c15f5509
BLAKE2b-256 96579854088339d5af1a10de42b12848584fbf1997c913b4b125e984f38af0ce

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_x86_64.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 9a5f53bff05954b8311b52abc746ff1c1d17bcb6222bea8c9d0d36b9903a8a6a
MD5 55622a8ba33eee1da4ee1155c99ad4e4
BLAKE2b-256 1df4c40171012eeb6ef99b7489f8c04311ecf4d1791323375e17b8e0058c7ee0

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_i686.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_i686.whl
Algorithm Hash digest
SHA256 e339399e349d441c27322f89f212dc57b04ebda2a8d09b492955f52209b1a987
MD5 ac47990c86e8dee36e3598f10118a221
BLAKE2b-256 9a2e76524abeaca171b4a43a603a74251f7682916fcbc8e960d8b7cae40cccd0

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_armv7l.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_armv7l.whl
Algorithm Hash digest
SHA256 3f14edeb6977940d539a276c67df80c032d0cf69c33dabfd4afec2e8f98a23aa
MD5 28010a314cfe6c8eb76dad1453de543f
BLAKE2b-256 84d44636165493d1a5daf4d764a418eb2c7d6af984653140e2d6930e1070f32f

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_aarch64.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 704c8d1b75fe81b71130152538d1a3b2ba7075e616893705e2fbedc32ab19206
MD5 22e5f4ee059fff764e5885a5341a5f33
BLAKE2b-256 1ec4aa4abee778feb96069c3183686442e6c1ee6881a1df61d450bd76589a554

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4cd43939160212ab1e689a1443e3e394e3d9ac0f95fdd5145abf7c5e61dd46aa
MD5 0772fd2dc86237b0d3b7254809c9e1d4
BLAKE2b-256 75882f1194013404dbec389e9095a4ced7f0b746e1f157d84753edc723bc6236

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl
Algorithm Hash digest
SHA256 0f78e726e0331e7a7984f2f063399fb1c0e4173876d6ee6684e7c9a3cf4bbe0b
MD5 f8ed4bb90850dcff9116a97c1d7c4fc8
BLAKE2b-256 49e97b67af428ba5c4c6a82d99e467428d7685179413e5194bb144b3f7ebacc3

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl
Algorithm Hash digest
SHA256 d5749267135e8b778973892d8a43d2afc60c3dcce4c549349642829d9528295c
MD5 4cd9f0cf0c5b52a7fe0634c0159bcdd7
BLAKE2b-256 3fa2e1a0bd6980ac7eff4297fe9f1215dfd11027ec430329b0b29d55efc8e26e

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl
Algorithm Hash digest
SHA256 105433aebabdde676ce207d12cdcf785123bd6bcf4ca4dee46608175392d933d
MD5 9e5f225156cd8411475b5ee76c5e8303
BLAKE2b-256 13fa2ca0b31353f9b186e38a8a94a719447f59f6d174b6909778a3bb9db5b399

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 f6b561409dfd246936428da2c4d29f33923347671113469c50bc4ac6af242116
MD5 8e526383bca772fef18628086a4b7092
BLAKE2b-256 e9d4cb57110b8146470f33619257ec01a0ba9be8a86e82358496cc0b98dd76f7

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-manylinux_2_5_i686.manylinux1_i686.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-manylinux_2_5_i686.manylinux1_i686.whl
Algorithm Hash digest
SHA256 2a26b9e581f0fd7cb8d3f8e009758f0dbd5bf109e5d57eeab088c1aa6567d536
MD5 50d92794c232e4a31236568ba11cab36
BLAKE2b-256 5b1986db9ff83ec5c82b654d640e0a339bbe587800f9035d3c46aa3e3368d7de

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2b2cfd173a6efd08f86ddd5a7516545d8995794efab42b40a5cf2996a62a0bba
MD5 bf841b08f9228d53851923217f2b181e
BLAKE2b-256 b0841548db63637c47cf6783b47f5d3e9808a2a99d1e50674f0b6426d221f2a5

File details

Details for the file phylo2vec-1.4.1-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for phylo2vec-1.4.1-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 27f1ca1444fc8e28fcb0f4595e430e1fe9cc408af4c466eabc010b0fcb102d67
MD5 291b8b5eacc658e58e5cead0a05f7445
BLAKE2b-256 80a14e18824dee1c3cc5ac053d4bc2f6bdf4ddc84b6f78a057e0ec35be747763
