
Autophylo is used to generate phylogenetic trees automatically.

Project description

The Automatic Phylogenetic Tree Builder

The Automatic Phylogenetic Tree Builder (AutoPhylo) generates phylogenetic trees automatically by sampling a selected database (e.g., NCBI nr). It performs all the tasks associated with traditional phylogenetic tree building, including trimming, dropping overly similar sequences, and generating a maximum likelihood (ML) tree using RAxML.

The user provides one or more protein sequences, and BLAST databases are used to sample sequences from phyla selected by the user. For example, the following phyla can be used for the human gut: Actinobacteriota, Bacteroidota, Desulfobacterota, Firmicutes, Proteobacteria, Synergistota, Verrucomicrobiota, and Fusobacteria.

[Figure: autophylo overview]

Installation

The tool requires Python >= 3.11 and conda >= 4.12.0. The latest release can be installed directly with pip or from this repository.

pip install autophylo

Or

Create a conda environment from the environment.yml file, which installs all the dependencies (listed below).

conda env create -f environment.yml

Usage

conda activate autophylo

Install autophylo using tarball

Install the autophylo application within the created autophylo conda environment using a tarball.

python3 -m pip install /path/to/autophylo-1.0.0.tar.gz

Dependencies

The following dependencies are required:

  • NCBI BLAST 2.15.0
  • BLAST databases (version 5)
  • FastTree 2.1.11
  • MUSCLE 5.1.0
  • RAxML 8.2.13
  • Gblocks 0.91b
  • USEARCH 12.0_beta
  • seqkit 2.8.2
  • trimAl 1.5.0
  • biopython 1.84
  • joblib 1.4.2
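
Before running the tool, it can help to confirm that the external programs are reachable. The following is a minimal sketch, not part of autophylo itself. The executable names are assumptions taken from the example config file in this README (plus blastp and blastdbcmd from BLAST+), and they may differ on your system.

```python
import shutil

# Executable names are assumptions based on the example config in this README;
# adjust them to match your installation.
REQUIRED_TOOLS = [
    "blastp",
    "blastdbcmd",
    "FastTree",
    "muscle",
    "raxmlHPC-PTHREADS-AVX",
    "Gblocks",
    "usearch",
    "seqkit",
    "trimal",
]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```

Run it inside the activated autophylo environment so that the conda bin directory is on PATH.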

Download pre-formatted BLAST databases

(https://ftp.ncbi.nlm.nih.gov/blast/db/v5/README)

  • The pre-formatted databases offer the following advantages:
    • Pre-formatting removes the need to run makeblastdb;
    • Species-level taxonomy ids are included for each database entry;
    • Databases are broken into smaller-sized volumes and are therefore easier to download;
    • Sequences in FASTA format can be generated from the pre-formatted databases by using the blastdbcmd utility;
    • A convenient script (update_blastdb.pl) is available in the blast+ package to download the pre-formatted databases.
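
As an illustration of the blastdbcmd point above, the sketch below builds a blastdbcmd command line that dumps sequences from a pre-formatted database as FASTA. The helper name blastdbcmd_fasta_command and the example paths are hypothetical. The flags used (-db, -entry, -outfmt, -out) are standard blastdbcmd options.

```python
def blastdbcmd_fasta_command(db, entry="all", out=None):
    """Build a blastdbcmd argument list that writes FASTA records.

    db    : path prefix of the BLAST database (e.g. ".../NR/nr")
    entry : "all" or a comma-separated list of sequence identifiers
    out   : optional output file; blastdbcmd writes to stdout if omitted
    """
    cmd = ["blastdbcmd", "-db", db, "-entry", entry, "-outfmt", "%f"]
    if out is not None:
        cmd += ["-out", out]
    return cmd


# Example (hypothetical paths); pass the list to subprocess.run to execute it:
#   import subprocess
#   subprocess.run(blastdbcmd_fasta_command("/data/BLASTDB/NR/nr", out="nr.fasta"), check=True)
```

Dumping all of nr produces a very large file, so in practice a list of identifiers is usually passed through entry instead of "all".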

Download nr

update_blastdb.pl --source ncbi --decompress --blastdb_version 5  --verbose 2 --num_threads 30 nr > log.nr 2>&1

Update config file

Obtain the path to the dependency programs using the $CONDA_PREFIX variable.

After activating the autophylo environment, run the following command to get the path, then use it to update the config file.

(autophylo) echo $CONDA_PREFIX

Example output

(autophylo) amos@Amogelangs-MacBook-Pro autophylo % echo $CONDA_PREFIX
/Users/amos/miniconda3/envs/autophylo
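
The config entries for the tools can also be generated from $CONDA_PREFIX rather than typed by hand. This is a rough sketch under the assumption that every tool lives in the environment's bin directory. The key and executable names mirror the example config in this README, and the helper tool_paths is hypothetical.

```python
import os
from pathlib import Path

# Config keys and executable names mirror the example config in this README.
TOOL_EXECUTABLES = {
    "MUSCLE": "muscle",
    "TRIMAL": "trimal",
    "FastTree": "FastTree",
    "RAxML_PTHREADS": "raxmlHPC-PTHREADS-AVX",
    "RAxML_HYBRID": "raxmlHPC-HYBRID-AVX",
    "GBLOCKS": "Gblocks",
    "USEARCH": "usearch",
}


def tool_paths(conda_prefix):
    """Map each config key to <conda_prefix>/bin/<executable>."""
    bin_dir = Path(conda_prefix) / "bin"
    return {key: str(bin_dir / exe) for key, exe in TOOL_EXECUTABLES.items()}


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix is None:
        print("CONDA_PREFIX is not set; activate the autophylo environment first.")
    else:
        # Print KEY=path lines that can be pasted into the config file.
        for key, path in tool_paths(prefix).items():
            print(f"{key}={path}")
```

The printed lines still need to be placed under the matching section headers ([ALIGNMENT], [TREE], [CLUSTERING]).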

Updated config file

NOTE: The databases can be placed anywhere in the filesystem; in this example they are in /Users/amos/datalake.

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[ALIGNMENT]
MUSCLE=/Users/amos/miniconda3/envs/autophylo/bin/muscle
TRIMAL=/Users/amos/miniconda3/envs/autophylo/bin/trimal

[TREE]
FastTree=/Users/amos/miniconda3/envs/autophylo/bin/FastTree
RAxML_PTHREADS=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-PTHREADS-AVX
RAxML_HYBRID=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-HYBRID-AVX

[CLUSTERING]
GBLOCKS=/Users/amos/miniconda3/envs/autophylo/bin/Gblocks
USEARCH=/Users/amos/miniconda3/envs/autophylo/bin/usearch

[DATABASES]
BLASTDB=/Users/amos/datalake/BLASTDB/NR/nr
TAXIDS=/Users/amos/datalake/BLASTDB/NR/taxids
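
A config file in this format can be sanity-checked with Python's standard configparser module. The sketch below is only an illustration and is not necessarily how autophylo reads its config. The function names are hypothetical, and the section names are taken from the example above. It reports tool entries whose path does not point to an existing file.

```python
import configparser
from pathlib import Path

# Sections that hold paths to executables in the example config.
TOOL_SECTIONS = ("ALIGNMENT", "TREE", "CLUSTERING")


def parse_config(text):
    """Parse INI-formatted text, preserving key case (e.g. "FastTree")."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # the default would lowercase all keys
    parser.read_string(text)
    return parser


def missing_executables(parser, sections=TOOL_SECTIONS):
    """Return (section, key, path) for each tool path that is not a file."""
    defaults = parser.defaults()
    missing = []
    for section in sections:
        if not parser.has_section(section):
            continue
        for key, value in parser.items(section):
            if key in defaults:
                continue  # skip keys inherited from [DEFAULT]
            if not Path(value).is_file():
                missing.append((section, key, value))
    return missing
```

To check a file on disk, read its contents first, for example parse_config(Path("config.ini").read_text()), where config.ini stands in for wherever your config file lives.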
