AutoPhylo generates phylogenetic trees automatically.

The Automatic Phylogenetic Tree Builder

The Automatic Phylogenetic Tree Builder (AutoPhylo) generates phylogenetic trees automatically by sampling the selected database (e.g., NCBI nr). It performs all tasks associated with traditional phylogenetic tree building, including trimming, dropping overly similar sequences, and generating a maximum likelihood (ML) tree using RAxML.

The user provides one or more protein sequences, and BLAST databases are used to sample sequences from phyla selected by the user. For example, the following phyla found in the human gut can be used: Actinobacteriota, Bacteroidota, Desulfobacterota, Firmicutes, Proteobacteria, Synergistota, Verrucomicrobiota, Fusobacteria.

Installation

The tool requires Python >= 3.11 and conda >= 4.12.0. The latest release can be installed directly from pip or this repository.

pip install autophylo

Or

Create a conda environment using the environment.yml file which installs all the dependencies (listed below).

conda env create -f environment.yml

Usage

conda activate autophylo

Install autophylo using a tarball

Install the autophylo application within the created autophylo conda environment using a tarball.

python3 -m pip install /path/to/autophylo-1.0.0.tar.gz

Dependencies

The following dependencies are required:

  • NCBI BLAST 2.15.0
  • BLAST databases (version 5)
  • FastTree 2.1.11
  • MUSCLE 5.1.0
  • RAxML 8.2.13
  • GBLOCKS 0.91b
  • USEARCH 12.0_beta
  • seqkit 2.8.2
  • trimAl 1.5.0
  • biopython 1.84
  • joblib 1.4.2
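
Before running the tool, it can help to confirm that the dependencies are visible in the active environment. The following is a minimal sketch, not part of autophylo itself: the executable names are copied from the example config in this README, while `blastp` and `seqkit` are assumed names for the BLAST and seqkit binaries, and both helper functions are hypothetical.

```python
import shutil
from importlib import metadata

# Executable names from the example config in this README.
# "blastp" and "seqkit" are assumptions about how those tools are named on PATH.
EXECUTABLES = [
    "blastp", "muscle", "trimal", "FastTree",
    "raxmlHPC-PTHREADS-AVX", "Gblocks", "usearch", "seqkit",
]
PY_PACKAGES = ["biopython", "joblib"]


def find_executables(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def installed_versions(packages):
    """Map each Python package to its installed version, or None if it is missing."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = None
    return versions


if __name__ == "__main__":
    for name, path in find_executables(EXECUTABLES).items():
        print(f"{name:24s} {path or 'NOT FOUND'}")
    for pkg, version in installed_versions(PY_PACKAGES).items():
        print(f"{pkg:24s} {version or 'NOT INSTALLED'}")
```

Run it inside the activated autophylo environment and compare the reported versions with the list above.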

Download pre-formatted blast databases

(https://ftp.ncbi.nlm.nih.gov/blast/db/v5/README)

  • The pre-formatted databases offer the following advantages:
    • Pre-formatting removes the need to run makeblastdb;
    • Species-level taxonomy ids are included for each database entry;
    • Databases are broken into smaller-sized volumes and are therefore easier to download;
    • Sequences in FASTA format can be generated from the pre-formatted databases by using the blastdbcmd utility;
    • A convenient script (update_blastdb.pl) is available in the blast+ package to download the pre-formatted databases.

Download nr

update_blastdb.pl --source ncbi --decompress --blastdb_version 5  --verbose 2 --num_threads 30 nr > log.nr 2>&1
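
Once the download finishes, the database can be sanity-checked with the `blastdbcmd` utility mentioned above, whose `-info` option prints a summary of a database. The sketch below wraps that call in Python; the function names are hypothetical, and the database path in the usage section is the one from the example later in this README, so adjust it to your own location.

```python
import shutil
import subprocess


def blastdbcmd_info_argv(db):
    """Build the argument list for `blastdbcmd -db <db> -info`."""
    return ["blastdbcmd", "-db", db, "-info"]


def database_info(db):
    """Return the blastdbcmd summary for db, or None if it cannot be obtained."""
    if shutil.which("blastdbcmd") is None:
        return None  # BLAST+ is not on PATH (environment not activated?)
    result = subprocess.run(
        blastdbcmd_info_argv(db), capture_output=True, text=True, check=False
    )
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    # Example path only; point this at wherever nr was downloaded.
    info = database_info("/Users/amos/datalake/BLASTDB/NR/nr")
    print(info or "Could not read the database; check the path and that BLAST+ is installed.")
```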

Update config file

Obtain the path to the dependency programs using the $CONDA_PREFIX variable.

After activating the autophylo environment, run the following command to get the path, then use it to update the config file.

(autophylo) echo $CONDA_PREFIX

Example config file

(autophylo) amos@Amogelangs-MacBook-Pro autophylo % echo $CONDA_PREFIX
/Users/amos/miniconda3/envs/autophylo
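
Since every tool path in the config is the environment prefix followed by `bin/<executable>`, the entries can be generated rather than typed by hand. This is a minimal sketch under that assumption: the section names, keys, and binary names are copied from the example config in this README, and the `tool_sections` and `render` helpers are hypothetical, not part of autophylo.

```python
import os

# Section -> {config key: binary name}, copied from the example config.
TOOLS = {
    "ALIGNMENT": {"MUSCLE": "muscle", "TRIMAL": "trimal"},
    "TREE": {
        "FastTree": "FastTree",
        "RAxML_PTHREADS": "raxmlHPC-PTHREADS-AVX",
        "RAxML_HYBRID": "raxmlHPC-HYBRID-AVX",
    },
    "CLUSTERING": {"GBLOCKS": "Gblocks", "USEARCH": "usearch"},
}


def tool_sections(prefix):
    """Expand each binary name to <prefix>/bin/<binary>."""
    bin_dir = prefix.rstrip("/") + "/bin"
    return {
        section: {key: f"{bin_dir}/{exe}" for key, exe in tools.items()}
        for section, tools in TOOLS.items()
    }


def render(sections):
    """Format the sections as INI text in the same style as the example config."""
    lines = []
    for section, entries in sections.items():
        lines.append(f"[{section}]")
        lines.extend(f"{key}={value}" for key, value in entries.items())
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix:
        print(render(tool_sections(prefix)))
    else:
        print("CONDA_PREFIX is not set; activate the autophylo environment first.")
```

The printed sections can be pasted into the config file; the `[DATABASES]` section still has to be filled in manually because the databases live outside the conda environment.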

Updated config file

NOTE: The databases can be placed anywhere in the filesystem; in this example they are in /Users/amos/datalake.

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[ALIGNMENT]
MUSCLE=/Users/amos/miniconda3/envs/autophylo/bin/muscle
TRIMAL=/Users/amos/miniconda3/envs/autophylo/bin/trimal

[TREE]
FastTree=/Users/amos/miniconda3/envs/autophylo/bin/FastTree
RAxML_PTHREADS=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-PTHREADS-AVX
RAxML_HYBRID=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-HYBRID-AVX

[CLUSTERING]
GBLOCKS=/Users/amos/miniconda3/envs/autophylo/bin/Gblocks
USEARCH=/Users/amos/miniconda3/envs/autophylo/bin/usearch

[DATABASES]
BLASTDB=/Users/amos/datalake/BLASTDB/NR/nr
TAXIDS=/Users/amos/datalake/BLASTDB/NR/taxids
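
A config in this format can be read with Python's standard `configparser` module, which makes it easy to check the paths before starting a long run. The sketch below is an illustration only, not autophylo's own loader: the file name `config.ini` and both function names are assumptions. It skips keys inherited from `[DEFAULT]` and does not check the `[DATABASES]` entries, because `BLASTDB` is a database name prefix rather than a single file.

```python
import configparser
import os


def load_tool_paths(config_path):
    """Return {"SECTION.KEY": value} for every key set outside [DEFAULT]."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case (configparser lowercases by default)
    with open(config_path) as fh:
        parser.read_file(fh)
    defaults = parser.defaults()
    paths = {}
    for section in parser.sections():  # sections() does not include DEFAULT
        for key, value in parser[section].items():
            if key in defaults:
                continue  # inherited from [DEFAULT], not a tool path
            paths[f"{section}.{key}"] = value
    return paths


def missing_executables(paths):
    """List the entries whose path is not an executable file."""
    missing = []
    for name, path in paths.items():
        if name.startswith("DATABASES."):
            continue  # database prefix or directory, not an executable
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing


if __name__ == "__main__":
    config_file = "config.ini"  # hypothetical file name
    if os.path.exists(config_file):
        for name in missing_executables(load_tool_paths(config_file)):
            print(f"Not found or not executable: {name}")
    else:
        print(f"{config_file} not found")
```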
