Phylo2Vec: integer vector representation of binary (phylogenetic) trees

Phylo2Vec

LGPL-3.0 License

Phylo2Vec (or phylo2vec) is a high-performance software package for encoding, manipulating, and analysing binary phylogenetic trees. At its core, the package contains a representation of binary trees that defines a bijection from any tree topology with 𝑛 leaves to an integer vector of size 𝑛 − 1. Compared to the traditional Newick format, phylo2vec was designed with fast sampling, fast conversion/compression from Newick-format trees to the Phylo2Vec format, and rapid tree comparison in mind.
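As a rough illustration of why a vector of size 𝑛 − 1 can index every topology, the sketch below assumes the constraint described in the Phylo2Vec paper, 0 ≤ v[i] ≤ 2i, and checks that the number of such vectors equals (2𝑛 − 3)!!, the number of rooted binary trees with 𝑛 labelled leaves. The helper names are hypothetical and are not part of the phylo2vec API.

```python
import random
from math import prod


def is_valid_vector(v):
    """Check the assumed constraint 0 <= v[i] <= 2*i for every position i."""
    return all(0 <= x <= 2 * i for i, x in enumerate(v))


def sample_valid_vector(n_leaves, rng=random):
    """Draw each entry uniformly from its allowed range (hypothetical helper)."""
    return [rng.randint(0, 2 * i) for i in range(n_leaves - 1)]


def count_vectors(n_leaves):
    """Position i has 2*i + 1 allowed values, so multiply the range sizes."""
    return prod(2 * i + 1 for i in range(n_leaves - 1))


def count_rooted_binary_trees(n_leaves):
    """Double factorial (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3)."""
    return prod(range(1, 2 * n_leaves - 2, 2))


if __name__ == "__main__":
    n = 6
    print(is_valid_vector(sample_valid_vector(n)))
    print(count_vectors(n), count_rooted_binary_trees(n))
```

Under this assumption both counts agree for every 𝑛 ≥ 2, which is consistent with the bijection claimed above.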

The current version features a core implementation in Rust, which improves performance and memory efficiency. It remains available in Python (superseding the version described in the original paper) and in R via dedicated wrappers, making it accessible to a broad audience in the bioinformatics community.

Link to the paper: https://doi.org/10.1093/sysbio/syae030

Installation

Pip

The easiest way to install the standard Python package is using pip:

pip install phylo2vec

Several optimization schemes based on Phylo2Vec are also available, but require extra dependencies. (See this notebook for a demo). To avoid bloating the standard package, these dependencies must be installed separately. To do so, run:

pip install "phylo2vec[opt]"
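To check from Python whether the optional module can be used before calling it, a small guard along the following lines may help. It is a sketch that assumes importing phylo2vec.opt fails with an ImportError when the extra dependencies are missing; is_importable is a hypothetical helper, not part of phylo2vec.

```python
import importlib


def is_importable(module_name):
    """Return True if the module can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:  # also covers ModuleNotFoundError
        return False
    return True


if __name__ == "__main__":
    if is_importable("phylo2vec.opt"):
        print("Optimisation extras look available.")
    else:
        print("Optimisation extras not found; install the [opt] extras first.")
```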

Manual installation

  • We recommend setting up the pixi package management tool.
  • Clone the repository and install with pixi:
git clone https://github.com/sbhattlab/phylo2vec.git
cd phylo2vec
pixi run -e py-phylo2vec install-python

This compiles and installs the package; a build step is needed because the core functionality is written in Rust.

Installing R package

Option 1: from a release (Windows, Mac, Ubuntu >= 22.04)

Download the compiled binary that matches your OS from the releases. Once the file is downloaded, run install.packages in your R command line:

install.packages("/path/to/package_file", repos = NULL, type = 'source')

Option 2: using devtools

⚠️ This requires installing Rust to build the core package.

devtools::install_github("sbhattlab/phylo2vec", subdir="./r-phylo2vec", build = FALSE)

Note: to download a specific version, use:

devtools::install_github("sbhattlab/phylo2vec@vX.Y.Z", subdir="./r-phylo2vec", build = FALSE)

Option 3: manual installation

⚠️ This requires installing Rust to build the core package.

Clone the repository, then run install.packages from an R session started in the repository root.

Note: to install a specific version, git checkout the desired tag before installing.

# In a terminal
git clone https://github.com/sbhattlab/phylo2vec
cd phylo2vec
# In an R session started from this directory
install.packages("./r-phylo2vec", repos = NULL, type = 'source')

Basic Usage

Python

Conversion between Newick and vector representations

import numpy as np
from phylo2vec import from_newick, to_newick

# Convert a vector to Newick string
v = np.array([0, 1, 2, 3, 4])
newick = to_newick(v)  # '(0,(1,(2,(3,(4,5)6)7)8)9)10;'

# Convert Newick string back to vector
v_converted = from_newick(newick)  # array([0, 1, 2, 3, 4], dtype=int16)
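A quick sanity check on a conversion is that a Newick string with 𝑛 leaves should correspond to a vector of length 𝑛 − 1. The sketch below counts leaves with a regular expression and does not use phylo2vec; count_leaves is a hypothetical helper, and it assumes that leaf labels directly follow "(" or "," while internal labels follow ")", as in the strings shown above.

```python
import re


def count_leaves(newick):
    """Count labels that directly follow '(' or ',' (assumed to be leaves)."""
    return len(re.findall(r"(?<=[(,])[^(),:;]+", newick))


if __name__ == "__main__":
    newick = "(0,(1,(2,(3,(4,5)6)7)8)9)10;"
    n_leaves = count_leaves(newick)
    # Expected vector length for this tree under the n -> n - 1 mapping
    print(n_leaves, n_leaves - 1)
```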

Tree Manipulation

import numpy as np
from phylo2vec.utils.vector import add_leaf, remove_leaf, reroot_at_random

v = np.array([0, 1, 2, 3, 4])  # Same vector as in the conversion example

# Add a leaf to an existing tree
v_new = add_leaf(v, 2)  # Add a leaf to the third position

# Remove a leaf
v_reduced = remove_leaf(v, 1)  # Remove the second leaf

# Random rerooting
v_rerooted = reroot_at_random(v)

Optimization

To run the hill-climbing optimisation scheme presented in the original Phylo2Vec paper (this requires the extra dependencies described under Installation), run:

# A hill-climbing scheme to optimize Phylo2Vec vectors
from phylo2vec.opt import HillClimbing

hc = HillClimbing(verbose=True)
hc_result = hc.fit("/path/to/your_fasta_file.fa")

Command-line interface (CLI)

We also provide a command-line interface for quick experimentation on phylo2vec-derived objects.

To see the available functions, run:

phylo2vec --help

Examples:

phylo2vec samplev 5 # Sample a vector with 5 leaves
phylo2vec samplem 5 # Sample a matrix with 5 leaves
phylo2vec from_newick '((0,1),2);' # Convert a Newick to a vector
phylo2vec from_newick '((0:0.3,1:0.1):0.5,2:0.4);' # Convert a Newick to a matrix
phylo2vec to_newick 0,1,2 # Convert a vector to Newick
phylo2vec to_newick $'0.0,1.0,2.0\n0.0,3.0,4.0' # Convert a matrix to Newick
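When calling the CLI from a script, the vector and matrix arguments shown above can be built from Python lists. This is a minimal sketch that only mirrors the argument formats in the examples (comma-separated values, one matrix row per line); the function names are hypothetical.

```python
def format_vector_arg(v):
    """Join vector entries with commas, e.g. [0, 1, 2] -> '0,1,2'."""
    return ",".join(str(x) for x in v)


def format_matrix_arg(m):
    """Join each row with commas and separate rows with newlines."""
    return "\n".join(format_vector_arg(row) for row in m)


if __name__ == "__main__":
    print(format_vector_arg([0, 1, 2]))
    print(format_matrix_arg([[0.0, 1.0, 2.0], [0.0, 3.0, 4.0]]))
```

The resulting string can then be passed as a single argument, for instance through subprocess.run(["phylo2vec", "to_newick", arg]), which avoids the shell-specific $'...' quoting.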

Documentation

For comprehensive documentation, tutorials, and API reference, visit: https://phylo2vec.readthedocs.io

How to Contribute

We welcome contributions to Phylo2Vec! Here's how you can help:

  1. Fork the repository and create your branch from main
  2. Make your changes and add tests if applicable
  3. Run the tests to ensure they pass
  4. Submit a pull request with a detailed description of your changes

Please make sure to follow our coding standards and write appropriate tests for new features.

Thanks to our contributors so far!

License

This project is distributed under the GNU Lesser General Public License v3.0 (LGPL).

Citation

If you use Phylo2Vec in your research, please cite:

@article{10.1093/sysbio/syae030,
    author = {Penn, Matthew J and Scheidwasser, Neil and Khurana, Mark P and Duchêne, David A and Donnelly, Christl A and Bhatt, Samir},
    title = {Phylo2Vec: a vector representation for binary trees},
    journal = {Systematic Biology},
    year = {2024},
    month = {03},
    doi = {10.1093/sysbio/syae030},
    url = {https://doi.org/10.1093/sysbio/syae030},
}
