Phylo2Vec: integer vector representation of binary (phylogenetic) trees

Phylo2Vec

Phylo2Vec (or phylo2vec) is a high-performance software package for encoding, manipulating, and analysing binary phylogenetic trees. At its core is a representation of binary trees that maps any tree topology with 𝑛 leaves to a unique integer vector of size 𝑛 − 1 (a bijection). Compared to the traditional Newick format, phylo2vec was designed with fast sampling, fast conversion/compression from Newick-format trees to the Phylo2Vec format, and rapid tree comparison in mind.
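
To make the size of the encoding concrete, here is a minimal sketch in plain Python. It assumes the validity constraint described in the paper, namely that entry i of a vector (0-indexed) lies between 0 and 2i. The helper names is_valid_vector and count_vectors are illustrative and are not part of the phylo2vec API.

```python
def is_valid_vector(v):
    """Check the assumed constraint 0 <= v[i] <= 2*i for every position i."""
    return all(0 <= x <= 2 * i for i, x in enumerate(v))


def count_vectors(n_leaves):
    """Count vectors of length n_leaves - 1 that satisfy the assumed constraint."""
    total = 1
    for i in range(n_leaves - 1):
        total *= 2 * i + 1  # position i can take 2*i + 1 distinct values
    return total


print(is_valid_vector([0, 1, 2, 3, 4]))  # True
print(count_vectors(6))                  # 945
```

Under that assumption, 6 leaves give 1 × 3 × 5 × 7 × 9 = 945 vectors, which equals the number of rooted binary tree topologies on 6 labelled leaves, as a bijection requires.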

The current version features a core implementation in Rust, which improves performance and memory efficiency. It remains available in Python (superseding the version described in the original paper) and in R via dedicated wrappers, making it accessible to a broad audience in the bioinformatics community.

Link to the paper: https://doi.org/10.1093/sysbio/syae030

Installation

Pip

The easiest way to install the Python package is using pip:

pip install phylo2vec

Manual installation

  • We recommend setting up the pixi package management tool.
  • Clone the repository and install using pixi:
git clone https://github.com/sbhattlab/phylo2vec.git
cd phylo2vec
pixi run -e py-phylo2vec install-python

This compiles and installs the package. Compilation is needed because the core functionality is written in Rust.

Installing R package

Option 1: from a release (Windows, Mac, Ubuntu >= 22.04)

Retrieve the compiled binary that fits your OS from the releases. Once the file is downloaded, run install.packages in your R command line:

install.packages("/path/to/package_file", repos = NULL, type = 'source')

Option 2: using devtools

⚠️ This requires installing Rust to build the core package.

devtools::install_github("sbhattlab/phylo2vec", subdir="./r-phylo2vec", build = FALSE)

Note: to download a specific version, use:

devtools::install_github("sbhattlab/phylo2vec@vX.Y.Z", subdir="./r-phylo2vec", build = FALSE)

Option 3: manual installation

⚠️ This requires installing Rust to build the core package.

Clone the repository and run the following install.packages command in your R command line.

Note: to install a specific version, use git checkout to switch to the desired tag before installing.

# In a terminal
git clone https://github.com/sbhattlab/phylo2vec
cd phylo2vec
# In an R session started from this directory
install.packages("./r-phylo2vec", repos = NULL, type = 'source')

Basic Usage

Python

Conversion between Newick and vector representations

import numpy as np
from phylo2vec import from_newick, to_newick

# Convert a vector to Newick string
v = np.array([0, 1, 2, 3, 4])
newick = to_newick(v)  # '(0,(1,(2,(3,(4,5)6)7)8)9)10;'

# Convert Newick string back to vector
v_converted = from_newick(newick)  # array([0, 1, 2, 3, 4], dtype=int16)
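
One way to read the Newick string in the example above is sketched below in plain Python. This is not the library's implementation: it only handles the restricted case where every v[i] is at most i, which we assume means that leaf i + 1 is joined to the subtree currently containing leaf v[i], processing the vector from its last entry to its first. The function name ordered_vector_to_newick is hypothetical.

```python
def ordered_vector_to_newick(v):
    """Sketch: build a Newick string from a vector with 0 <= v[i] <= i."""
    n_leaves = len(v) + 1
    if any(not 0 <= x <= i for i, x in enumerate(v)):
        raise ValueError("this sketch only handles vectors with 0 <= v[i] <= i")

    # root_of[leaf] is the label of the subtree that currently contains the leaf
    root_of = list(range(n_leaves))
    # text[label] is the Newick text of the subtree with that label
    text = {leaf: str(leaf) for leaf in range(n_leaves)}
    next_label = n_leaves  # internal nodes are labelled after the leaves

    # Join leaf i + 1 with the subtree containing leaf v[i], last entry first
    for i in range(len(v) - 1, -1, -1):
        left, right = root_of[v[i]], root_of[i + 1]
        text[next_label] = f"({text.pop(left)},{text.pop(right)}){next_label}"
        root_of = [next_label if r in (left, right) else r for r in root_of]
        next_label += 1

    return text[next_label - 1] + ";"


print(ordered_vector_to_newick([0, 1, 2, 3, 4]))
# (0,(1,(2,(3,(4,5)6)7)8)9)10;
```

For the vector used above, the sketch produces the same string as the comment next to to_newick. Vectors with entries larger than their index need the full algorithm from the paper and are rejected here.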

Tree Manipulation

from phylo2vec.utils.vector import add_leaf, remove_leaf, reroot_at_random

# Add a leaf to an existing tree
v_new = add_leaf(v, 2)  # Add a leaf to the third position

# Remove a leaf
v_reduced = remove_leaf(v, 1)  # Remove the second leaf

# Random rerooting
v_rerooted = reroot_at_random(v)

Optimization

from phylo2vec.opt import HillClimbingOptimizer

# Perform phylogenetic inference
hc = HillClimbingOptimizer(raxml_cmd="/path/to/raxml-ng", verbose=True)
v_opt, taxa_dict, losses = hc.fit("/path/to/your_fasta_file.fa")
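
To illustrate the general idea of hill climbing over vectors, here is a toy sketch in plain Python. It is not the HillClimbingOptimizer: the real optimizer takes a RAxML-NG command and a FASTA file, whereas this sketch uses a made-up loss (the number of entries that differ from a fixed target vector) and assumes the constraint 0 ≤ v[i] ≤ 2i on each entry. The names hill_climb, toy_loss and TARGET are hypothetical.

```python
def hill_climb(v, loss, max_sweeps=10):
    """Greedy search: change one entry at a time, keep strict improvements."""
    best = list(v)
    best_loss = loss(best)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(best)):
            # Assumed constraint: entry i may take any value in 0..2*i
            for candidate in range(2 * i + 1):
                trial = best[:i] + [candidate] + best[i + 1:]
                trial_loss = loss(trial)
                if trial_loss < best_loss:
                    best, best_loss, improved = trial, trial_loss, True
        if not improved:
            break  # no single-entry change lowers the loss any further
    return best, best_loss


TARGET = [0, 2, 1, 5, 3]  # arbitrary vector standing in for the "best" tree


def toy_loss(v):
    """Toy stand-in for a tree score: count entries that differ from TARGET."""
    return sum(a != b for a, b in zip(v, TARGET))


v_best, final_loss = hill_climb([0, 0, 0, 0, 0], toy_loss)
print(v_best, final_loss)  # [0, 2, 1, 5, 3] 0
```

With a real likelihood-based loss the search can stop at a local optimum, so the result depends on the starting vector.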

Documentation

For comprehensive documentation, tutorials, and API reference, visit: https://phylo2vec.readthedocs.io

How to Contribute

We welcome contributions to Phylo2Vec! Here's how you can help:

  1. Fork the repository and create your branch from main
  2. Make your changes and add tests if applicable
  3. Run the tests to ensure they pass
  4. Submit a pull request with a detailed description of your changes

Please make sure to follow our coding standards and write appropriate tests for new features.

Thanks to our contributors so far!

License

This project is distributed under the GNU Lesser General Public License v3.0 (LGPL).

Citation

If you use Phylo2Vec in your research, please cite:

@article{10.1093/sysbio/syae030,
    author = {Penn, Matthew J and Scheidwasser, Neil and Khurana, Mark P and Duchêne, David A and Donnelly, Christl A and Bhatt, Samir},
    title = {Phylo2Vec: a vector representation for binary trees},
    journal = {Systematic Biology},
    year = {2024},
    month = {03},
    doi = {10.1093/sysbio/syae030},
    url = {https://doi.org/10.1093/sysbio/syae030},
}

Related Work
