A Python package for the automated generation of heuristic phylogenetic trees from GenBank files

getphylo: GEnbank To PHYLOgeny

Description

getphylo was designed to automatically build multi-locus phylogenetic trees from GenBank files. The workflow consists of the following steps: i) extract protein coding sequences; ii) screen for suitable markers; iii) align individual marker sequences and create a combined alignment; and iv) produce a tree from the combined alignment. Please see the 'parameters' section below for a full list of parameters.
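Step iii, combining the individual marker alignments, can be pictured with a minimal sketch. This is an illustration of the general idea, not getphylo's actual code: the function name `concatenate_alignments` and the dictionary layout (genome name mapped to aligned sequence) are hypothetical, and padding a missing genome with gaps is an assumption made for the example.

```python
from typing import Dict, List


def concatenate_alignments(markers: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-marker alignments into one combined alignment.

    Each marker is a dict mapping a genome name to its aligned sequence.
    Hypothetical helper for illustration; not part of getphylo.
    """
    # Collect every genome name seen in any marker, in a stable order.
    genomes = sorted(set().union(*(marker.keys() for marker in markers)))
    combined = {genome: "" for genome in genomes}
    for marker in markers:
        # All sequences in one alignment must share the same length.
        lengths = {len(seq) for seq in marker.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in a marker alignment must have equal length")
        width = lengths.pop()
        for genome in genomes:
            # Assumption: a genome lacking this marker is padded with gaps.
            combined[genome] += marker.get(genome, "-" * width)
    return combined


marker_a = {"genome1": "MKT-", "genome2": "MKTA"}
marker_b = {"genome1": "LLV", "genome2": "L-V"}
combined = concatenate_alignments([marker_a, marker_b])
```

Here `combined` maps each genome to one long sequence made of its marker alignments joined end to end, which is the kind of input a tree-building program expects.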

Installation

The easiest way to install getphylo is using the command:

pip install getphylo

This will fetch and install the latest version from: https://pypi.org/project/getphylo/

For full installation instructions, please see the getphylo wiki.

Important: getphylo requires DIAMOND (>=2.0.14.152), MUSCLE (>=3.8.1551) and FastTree2 (>=2.1.11) to work correctly. These must be installed manually. Further instructions are available on the wiki.
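Before a first run, it can help to confirm that the external tools are on your PATH. The sketch below uses only the Python standard library. The executable names `diamond`, `muscle` and `fasttree` are assumptions (your installation may name them differently, for example `FastTree`), and the check does not verify version numbers.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Assumed executable names; adjust them to match your installation.
DEFAULT_EXECUTABLES = ("diamond", "muscle", "fasttree")


def locate_executables(names: Iterable[str] = DEFAULT_EXECUTABLES) -> Dict[str, Optional[str]]:
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_executables(names: Iterable[str] = DEFAULT_EXECUTABLES) -> List[str]:
    """Return the names that could not be found on PATH."""
    return [name for name, path in locate_executables(names).items() if path is None]
```

Calling `missing_executables()` returns an empty list when all three names resolve; any name it does return still needs to be installed or added to PATH.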

Quick-start

This package has been designed to be as easy to run as possible. Simply navigate to a working directory containing .gbk files and run:

getphylo

This will run the software with default settings.

A full list of options and flags can be viewed with:

getphylo -h

A full list of parameters and further usage examples are available on the wiki.
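If you prefer to launch the default run from a script rather than a shell, a minimal sketch might look like the following. The helper names `find_genbank_files` and `run_getphylo` are hypothetical; the only getphylo behaviour assumed is what is described above, namely that the bare `getphylo` command runs with default settings on the .gbk files in the working directory.

```python
import subprocess
from pathlib import Path
from typing import List


def find_genbank_files(directory: str) -> List[Path]:
    """Return the .gbk files directly inside the given directory, sorted by name."""
    return sorted(Path(directory).glob("*.gbk"))


def run_getphylo(directory: str) -> subprocess.CompletedProcess:
    """Run getphylo with default settings, using `directory` as the working directory."""
    if not find_genbank_files(directory):
        raise FileNotFoundError(f"no .gbk files found in {directory}")
    # Equivalent to changing into the directory and typing `getphylo`.
    return subprocess.run(["getphylo"], cwd=directory, check=True)
```

Checking for .gbk files first gives a clearer error than letting the command fail, and `check=True` raises `subprocess.CalledProcessError` if getphylo exits with a non-zero status.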

Example Analysis and Datasets

Example outputs and benchmarking data can be found in the getphylo benchmarking repository. The example data includes:

  1. A phylogeny of bacterial genomes,
  2. A phylogeny of a biosynthetic gene cluster,
  3. A phylogeny of primate genomes,
  4. A phylogeny of Eurotiomycete fungi.

Citation

If you use getphylo, please cite:

Booth, T.J., Shaw, S., Cruz-Morales, P. and Weber, T. getphylo: rapid and automatic generation of multi-locus phylogenetic trees. BMC Bioinformatics 26, 21 (2025). DOI: https://doi.org/10.1186/s12859-025-06035-1

Patch Notes

Version 1

  • 1.0.2
    • added executable checks to break early if executables are not defined correctly
    • updated executable error message to be informative about which executable is missing
    • getphylo now writes a copy of the log to getphylo.log
    • added missing '.csv' extension to thresholding_data file
    • fixed typos in the parser
  • 1.0.1
    • fixed the functionality of -ir/--ignore-bad-records; it will now skip records in the analysis that contain poorly formatted locus tags
    • -ir/--ignore-bad-records now only works if used in tandem with -ia/--ignore-bad-annotations; help text updated to reflect this
  • 1.0.0
    • full release for Booth et al. BMC Bioinformatics 26, 21 (2025)
    • fixed handling of CDSs with empty translations

Version 0

  • 0.3.2
    • added version info to setup.py and README for non-Python dependencies
    • now raises an error if the user provides too few input files
    • now raises an error if an invalid phylogenetic method is provided (e.g. not fasttree or iqtree)
  • 0.3.1
    • fixed issue with the query and subject cover in diamond
  • 0.3.0
    • now supports modifying blastp thresholds, including parameters for identity and coverage
    • fixed typos in parser
    • fixed crashing when provided with directories with spaces in the names
  • 0.2.2
    • added error message when users attempt to input a directory instead of a search string
  • 0.2.1
    • now able to provide custom paths for binary dependencies
    • parser now has argument groups and is more readable
    • file-exists error message is now more informative
  • 0.2.0
    • now supports iqtree using the --method parameter
  • 0.1.2
    • now raises an error if translations are present but empty
    • error messages from the extract module are now more informative
    • fixed a fatal issue with --build-all
  • 0.1.1
    • added support for MUSCLE5
  • 0.1.0
    • beta version initial release
