AutoPhylo generates phylogenetic trees automatically.

Project description

The Automatic Phylogenetic Tree Builder

The Automatic Phylogenetic Tree Builder (AutoPhylo) generates phylogenetic trees automatically. It samples the selected database (e.g., NCBI nr) and performs the tasks associated with traditional phylogenetic tree building: trimming, dropping overly similar sequences, and generating a maximum likelihood (ML) tree using RAxML.
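To illustrate what "dropping overly similar sequences" means, here is a minimal sketch in plain Python. It is not how AutoPhylo or USEARCH performs this step: it is a simple greedy filter over already aligned sequences, and the function names and the 0.9 identity threshold are assumptions chosen for illustration.

```python
def identity(a, b):
    """Fraction of identical columns between two aligned sequences.

    Columns where both sequences have a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def drop_similar(sequences, threshold=0.9):
    """Greedily keep a sequence only if it is below the identity
    threshold against every sequence kept so far.

    The threshold is a hypothetical value, not an AutoPhylo default.
    """
    kept = {}
    for name, seq in sequences.items():
        if all(identity(seq, other) < threshold for other in kept.values()):
            kept[name] = seq
    return kept


example = {
    "seq_a": "MKTAYIAKQR",
    "seq_b": "MKTAYIAKQK",  # differs from seq_a at one position only
    "seq_c": "MSSGHLLPWE",
}
print(sorted(drop_similar(example)))
```

Because the filter is greedy, the result depends on input order: the first sequence of a near-identical group is the one that survives.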

The user provides one or more protein sequences, and BLAST databases are used to sample sequences from phyla selected by the user. For example, the following phyla found in the human gut can be selected: Actinobacteriota, Bacteroidota, Desulfobacterota, Firmicutes, Proteobacteria, Synergistota, Verrucomicrobiota, Fusobacteria.

Installation

The tool requires Python >= 3.11 and conda >= 4.12.0. The latest release can be installed directly from pip or this repository.

pip install autophylo

Or

Create a conda environment using the environment.yml file which installs all the dependencies (listed below).

conda env create -f environment.yml

Usage

conda activate autophylo

Install autophylo using a tarball

Install the autophylo application within the created autophylo conda environment using a tarball.

python3 -m pip install /path/to/autophylo-1.0.0.tar.gz
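After either installation route, one way to confirm that the package is visible to the active Python interpreter is to query its installed metadata. This is a minimal sketch using only the standard library; it assumes the distribution is registered under the name autophylo, as in the pip command above, and the helper name installed_version is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="autophylo"):
    """Return the installed version string, or None if not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version if autophylo is installed in this environment,
# otherwise prints None.
print(installed_version())
```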

Dependencies

The following dependencies are required:

  • NCBI BLAST 2.15.0
  • BLAST databases (version 5)
  • FastTree 2.1.11
  • MUSCLE 5.1.0
  • RAxML 8.2.13
  • GBLOCKS 0.91b
  • USEARCH 12.0_beta
  • seqkit 2.8.2
  • Trimal 1.5.0
  • biopython 1.84
  • joblib 1.4.2
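A quick way to see which of the command-line dependencies are reachable from the active environment is to look them up on PATH. The sketch below is an illustration only, not part of AutoPhylo: the executable names are assumptions based on the usual names shipped by these tools (the RAxML binary name in particular varies by build), and it does not check versions.

```python
import shutil
import sys

# Assumed executable names; adjust them to match your installation.
EXECUTABLES = (
    "blastp",
    "blastdbcmd",
    "FastTree",
    "muscle",
    "raxmlHPC-PTHREADS-AVX",
    "Gblocks",
    "usearch",
    "seqkit",
    "trimal",
)


def python_is_supported(version_info=sys.version_info):
    """True if the interpreter meets the stated Python >= 3.11 requirement."""
    return tuple(version_info[:2]) >= (3, 11)


def missing_executables(names=EXECUTABLES, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


print("Python supported:", python_is_supported())
print("Missing executables:", missing_executables())
```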

Download pre-formatted blast databases

(https://ftp.ncbi.nlm.nih.gov/blast/db/v5/README)

  • The pre-formatted databases offer the following advantages:
    • Pre-formatting removes the need to run makeblastdb;
    • Species-level taxonomy ids are included for each database entry;
    • Databases are broken into smaller-sized volumes and are therefore easier to download;
    • Sequences in FASTA format can be generated from the pre-formatted databases by using the blastdbcmd utility;
    • A convenient script (update_blastdb.pl) is available in the blast+ package to download the pre-formatted databases.

Download nr

update_blastdb.pl --source ncbi --decompress --blastdb_version 5  --verbose 2 --num_threads 30 nr > log.nr 2>&1
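If the download is driven from Python rather than typed into a shell, the same command can be assembled as an argument list. This is a minimal sketch that only builds and prints the command; the function name and its defaults are illustrative, and the flags are exactly those used in the command above.

```python
import shlex


def build_update_command(database="nr", num_threads=30, blastdb_version=5):
    """Assemble the update_blastdb.pl call shown above as an argument list."""
    return [
        "update_blastdb.pl",
        "--source", "ncbi",
        "--decompress",
        "--blastdb_version", str(blastdb_version),
        "--verbose", "2",
        "--num_threads", str(num_threads),
        database,
    ]


command = build_update_command()
# Print the shell-quoted form for inspection before running anything.
print(shlex.join(command))
```

To run the download, the list could be passed to subprocess.run with stdout and stderr redirected to a log file, as the shell redirection above does.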

Update config file

Obtain the paths to the dependency programs using the $CONDA_PREFIX variable.

After activating the autophylo environment, run the following command to get the path, then use it to update the config file.

(autophylo) echo $CONDA_PREFIX

Example output

(autophylo) amos@Amogelangs-MacBook-Pro autophylo % echo $CONDA_PREFIX
/Users/amos/miniconda3/envs/autophylo
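Rather than pasting the prefix by hand, the tool entries of the config file can be generated from $CONDA_PREFIX. The sketch below is an assumption-laden illustration: the section names, keys, and executable names mirror the example config in this README, the helper name render_tool_sections is hypothetical, and the [DATABASES] section still has to be filled in manually.

```python
import os
from pathlib import PurePosixPath

# Section -> {config key: executable name}, mirroring the README example.
TOOLS = {
    "ALIGNMENT": {"MUSCLE": "muscle", "TRIMAL": "trimal"},
    "TREE": {
        "FastTree": "FastTree",
        "RAxML_PTHREADS": "raxmlHPC-PTHREADS-AVX",
        "RAxML_HYBRID": "raxmlHPC-HYBRID-AVX",
    },
    "CLUSTERING": {"GBLOCKS": "Gblocks", "USEARCH": "usearch"},
}


def render_tool_sections(prefix):
    """Return INI text pointing every tool at <prefix>/bin/<executable>."""
    bin_dir = PurePosixPath(prefix) / "bin"
    lines = []
    for section, tools in TOOLS.items():
        lines.append(f"[{section}]")
        for key, executable in tools.items():
            lines.append(f"{key}={bin_dir / executable}")
        lines.append("")
    return "\n".join(lines)


prefix = os.environ.get("CONDA_PREFIX")
if prefix:
    print(render_tool_sections(prefix))
else:
    print("CONDA_PREFIX is not set; activate the autophylo environment first.")
```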

Updated config file

NOTE: The databases can be placed anywhere in the filesystem; in this example they are in /Users/amos/datalake.

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[ALIGNMENT]
MUSCLE=/Users/amos/miniconda3/envs/autophylo/bin/muscle
TRIMAL=/Users/amos/miniconda3/envs/autophylo/bin/trimal

[TREE]
FastTree=/Users/amos/miniconda3/envs/autophylo/bin/FastTree
RAxML_PTHREADS=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-PTHREADS-AVX
RAxML_HYBRID=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-HYBRID-AVX

[CLUSTERING]
GBLOCKS=/Users/amos/miniconda3/envs/autophylo/bin/Gblocks
USEARCH=/Users/amos/miniconda3/envs/autophylo/bin/usearch

[DATABASES]
BLASTDB=/Users/amos/datalake/BLASTDB/NR/nr
TAXIDS=/Users/amos/datalake/BLASTDB/NR/taxids
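A file in this format can be read with Python's standard configparser module. The following is a minimal sketch, not AutoPhylo's own loading code: the function name read_tool_paths is hypothetical and the sample text is a shortened copy of the example above. Two configparser behaviours matter here: option names are lowercased unless optionxform is overridden, and values from [DEFAULT] appear in every section, so they are filtered out explicitly.

```python
import configparser

# Shortened copy of the example config, used only for this demonstration.
SAMPLE = """
[DEFAULT]
ServerAliveInterval = 45

[ALIGNMENT]
MUSCLE=/Users/amos/miniconda3/envs/autophylo/bin/muscle
TRIMAL=/Users/amos/miniconda3/envs/autophylo/bin/trimal

[TREE]
FastTree=/Users/amos/miniconda3/envs/autophylo/bin/FastTree
"""


def read_tool_paths(config_text, sections=("ALIGNMENT", "TREE", "CLUSTERING")):
    """Return {key: path} for the tool sections, skipping [DEFAULT] values."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # preserve key case (e.g. "FastTree")
    parser.read_string(config_text)
    defaults = parser.defaults()
    paths = {}
    for section in sections:
        if not parser.has_section(section):
            continue  # tolerate configs that omit a section
        for key, value in parser.items(section):
            if key in defaults:
                continue  # inherited from [DEFAULT], not a tool path
            paths[key] = value
    return paths


for key, path in read_tool_paths(SAMPLE).items():
    print(key, "->", path)
```

For a config file on disk, parser.read(path) could be used in place of read_string.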
