Phylo2Vec: integer vector representation of binary (phylogenetic) trees

Phylo2Vec

Phylo2Vec (or phylo2vec) is a high-performance software package for encoding, manipulating, and analysing binary phylogenetic trees. At its core, the package contains a representation of binary trees that defines a bijection from any tree topology with 𝑛 leaves to an integer vector of size 𝑛 − 1. Unlike the traditional Newick format, Phylo2Vec was designed for fast sampling, fast conversion/compression from Newick-format trees to the Phylo2Vec format, and rapid tree comparison.
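As a rough illustration of why such a bijection can exist, the sketch below counts the vectors allowed by the constraint used in the Phylo2Vec paper, where entry i (0-indexed) lies between 0 and 2i, and compares that count with (2𝑛 − 3)!!, the number of rooted binary tree topologies on 𝑛 labelled leaves. The constraint is taken from the paper rather than from this README, the helper names are hypothetical, and nothing here calls the phylo2vec package.

```python
from itertools import product
from math import prod


def count_vectors(n_leaves):
    """Number of vectors of length n_leaves - 1 with 0 <= v[i] <= 2*i."""
    return prod(2 * i + 1 for i in range(n_leaves - 1))


def double_factorial(k):
    """k!! = k * (k - 2) * (k - 4) * ... down to 1 or 2."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def enumerate_vectors(n_leaves):
    """Explicitly list every vector allowed by the constraint (small n only)."""
    ranges = [range(2 * i + 1) for i in range(n_leaves - 1)]
    return [list(v) for v in product(*ranges)]


for n in range(2, 8):
    # The two counts agree, as a bijection requires
    assert count_vectors(n) == double_factorial(2 * n - 3)

print(len(enumerate_vectors(4)))  # 15 vectors for 4 leaves
```

The count grows very quickly with the number of leaves, which is one reason a compact, fixed-length encoding is convenient for sampling and comparison.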

The current version features a core implementation in Rust, which provides significant performance improvements and better memory efficiency. It remains available in Python (superseding the version described in the original paper) and in R via dedicated wrappers, making it accessible to a broad audience in the bioinformatics community.

Link to the paper: https://doi.org/10.1093/sysbio/syae030

Installation

Python package

Pip

The easiest way to install the standard Python package is using pip:

pip install phylo2vec

Several optimization schemes based on Phylo2Vec are also available but require extra dependencies (see this notebook for a demo). To avoid bloating the standard package, these dependencies must be installed separately. To do so, run:

pip install "phylo2vec[opt]"

Manual installation

  • We recommend setting up the pixi package management tool.
  • Clone the repository and install using pixi:
git clone https://github.com/sbhattlab/phylo2vec.git
cd phylo2vec
pixi run -e py-phylo2vec install-python

This compiles and installs the package. Compilation is required because the core functionality is written in Rust.

R package

Option 1: from a release (Windows, Mac, Ubuntu >= 22.04)

Retrieve the compiled binary that fits your OS from the releases. Once the file is downloaded, run install.packages in your R command line:

install.packages("/path/to/package_file", repos = NULL, type = 'source')

Option 2: using devtools

⚠️ This requires installing Rust to build the core package.

devtools::install_github("sbhattlab/phylo2vec", subdir="./r-phylo2vec", build = FALSE)

Note: to download a specific version, use:

devtools::install_github("sbhattlab/phylo2vec@vX.Y.Z", subdir="./r-phylo2vec", build = FALSE)

Option 3: manual installation

⚠️ This requires installing Rust to build the core package.

Clone the repository, then run install.packages in your R command line.

Note: to install a specific version, use git checkout to switch to the desired tag before installing.

# In a terminal
git clone https://github.com/sbhattlab/phylo2vec
cd phylo2vec
# In an R session started from this directory
install.packages("./r-phylo2vec", repos = NULL, type = 'source')

Basic Usage

Python

Conversion between Newick and vector representations

import numpy as np
from phylo2vec import from_newick, to_newick

# Convert a vector to Newick string
v = np.array([0, 1, 2, 3, 4])
newick = to_newick(v)  # '(0,(1,(2,(3,(4,5)6)7)8)9)10;'

# Convert Newick string back to vector
v_converted = from_newick(newick)  # array([0, 1, 2, 3, 4], dtype=int16)
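To make the encoding more concrete, here is a minimal pure-Python sketch of how a vector can be decoded in the special case where every entry satisfies v[i] <= i, meaning each new leaf i + 1 is attached to the branch of an existing leaf. It is an illustration under that assumption, not the library's implementation (which is written in Rust and handles all valid vectors), and the function name decode_leaf_only is hypothetical. For the vector used above, it reproduces the Newick string shown in the comment.

```python
def decode_leaf_only(v):
    """Decode a vector with 0 <= v[i] <= i into a Newick string (sketch)."""
    n_leaves = len(v) + 1
    for i, value in enumerate(v):
        if not 0 <= value <= i:
            raise ValueError(f"v[{i}] = {value} is outside the range handled here")

    # clade[leaf] is the Newick text of the subtree currently represented by `leaf`
    clade = {leaf: str(leaf) for leaf in range(n_leaves)}
    next_label = n_leaves  # internal nodes are labelled n_leaves, n_leaves + 1, ...

    # Process leaves from last to first: leaf i + 1 pairs with leaf v[i]
    for i in range(len(v) - 1, -1, -1):
        a, b = v[i], i + 1
        clade[a] = f"({clade[a]},{clade.pop(b)}){next_label}"
        next_label += 1

    return clade[0] + ";"


print(decode_leaf_only([0, 1, 2, 3, 4]))
# (0,(1,(2,(3,(4,5)6)7)8)9)10;
```

Entries larger than i refer to internal branches and need extra bookkeeping, so the sketch rejects them rather than guessing; use to_newick for general vectors.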

Tree Manipulation

from phylo2vec.utils.vector import add_leaf, remove_leaf, reroot_at_random

# Add a leaf to an existing tree
v_new = add_leaf(v, 2)  # Add a leaf to the third position

# Remove a leaf
v_reduced = remove_leaf(v, 1)  # Remove the second leaf

# Random rerooting
v_rerooted = reroot_at_random(v)

Optimization

To use the hill-climbing optimisation scheme presented in the original Phylo2Vec paper, run:

# A hill-climbing scheme to optimize Phylo2Vec vectors
from phylo2vec.opt import HillClimbing

hc = HillClimbing(verbose=True)
hc_result = hc.fit("/path/to/your_fasta_file.fa")

Command-line interface (CLI)

We also provide a command-line interface for quick experimentation on phylo2vec-derived objects.

To see the available functions, run:

phylo2vec --help

Examples:

phylo2vec samplev 5 # Sample a vector with 5 leaves
phylo2vec samplem 5 # Sample a matrix with 5 leaves
phylo2vec from_newick '((0,1),2);' # Convert a Newick to a vector
phylo2vec from_newick '((0:0.3,1:0.1):0.5,2:0.4);' # Convert a Newick to a matrix
phylo2vec to_newick 0,1,2 # Convert a vector to Newick
phylo2vec to_newick $'0.0,1.0,2.0\n0.0,3.0,4.0' # Convert a matrix to Newick
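For intuition about what samplev returns, the sketch below draws a vector by picking each entry i uniformly from 0 to 2i (the constraint from the Phylo2Vec paper, which this README does not state). Because the encoding is a bijection, independent uniform draws of this kind give a uniform draw over tree topologies. The function name sample_vector_sketch and the seed parameter are hypothetical, and the CLI's own sampler may differ in its details.

```python
import random


def sample_vector_sketch(n_leaves, seed=None):
    """Draw a vector of length n_leaves - 1 with v[i] uniform on {0, ..., 2*i}."""
    rng = random.Random(seed)
    # randint is inclusive on both ends, so entry i lies in {0, ..., 2*i}
    return [rng.randint(0, 2 * i) for i in range(n_leaves - 1)]


v = sample_vector_sketch(5, seed=0)
assert len(v) == 4  # 5 leaves give a vector of size 4
assert all(0 <= x <= 2 * i for i, x in enumerate(v))
print(v)
```

Note that the first entry is always 0, since a tree with two leaves has only one topology.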

Documentation

For comprehensive documentation, tutorials, and API reference, visit: https://phylo2vec.readthedocs.io

How to Contribute

We welcome contributions to phylo2vec! Please refer to our Contributing guidelines for details on how to report bugs, request features, or submit code improvements.

Thanks to all our contributors so far!

License

This project is distributed under the GNU Lesser General Public License v3.0 (LGPL).

Citation

If you use Phylo2Vec in your research, please cite:

@article{10.1093/sysbio/syae030,
    author = {Penn, Matthew J and Scheidwasser, Neil and Khurana, Mark P and Duchêne, David A and Donnelly, Christl A and Bhatt, Samir},
    title = {Phylo2Vec: a vector representation for binary trees},
    journal = {Systematic Biology},
    year = {2024},
    month = {03},
    doi = {10.1093/sysbio/syae030},
    url = {https://doi.org/10.1093/sysbio/syae030},
}

Related Work
