AutoPhylo is used to generate phylogenetic trees automatically.
The Automatic Phylogenetic Tree Builder
The Automatic Phylogenetic Tree Builder (AutoPhylo) generates phylogenetic trees automatically by sampling a selected database (e.g., NCBI nr) and performing the tasks associated with traditional phylogenetic tree building: trimming, dropping overly similar sequences, and generating a maximum likelihood (ML) tree using RAxML.
The user provides one or more protein sequences, and BLAST databases are used to sample sequences from phyla selected by the user. For example, the following phyla found in the human gut can be used: Actinobacteriota, Bacteroidota, Desulfobacterota, Firmicutes, Proteobacteria, Synergistota, Verrucomicrobiota, Fusobacteria.
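To make the "dropping overly similar sequences" step concrete, here is a minimal Python sketch of a greedy identity filter. It only illustrates the idea and is not AutoPhylo's implementation (the dependency list names USEARCH for clustering). The function names, the default 0.9 threshold, the toy sequences, and the column-wise identity measure on pre-aligned, equal-length strings are all assumptions made for this example.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical columns between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns where both sequences have a gap.
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def drop_similar(seqs: dict[str, str], threshold: float = 0.9) -> dict[str, str]:
    """Keep a sequence only if it is below `threshold` identity to every kept one."""
    kept: dict[str, str] = {}
    for name, seq in seqs.items():
        if all(identity(seq, rep) < threshold for rep in kept.values()):
            kept[name] = seq
    return kept


# Toy, hypothetical aligned sequences (not real data).
aligned = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYIAKQK",  # differs from seqA in one column out of ten
    "seqC": "MLSPW-AKGR",  # differs from seqA in six columns out of ten
}

if __name__ == "__main__":
    print(list(drop_similar(aligned, threshold=0.85)))
```

The filter is order dependent: the first sequence seen in a group of near-duplicates becomes its representative, so in practice the query sequences would be placed first.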
Installation
The tool requires Python >= 3.11 and conda >= 4.12.0. The latest release can be installed with pip or from this repository.
pip install autophylo
Or create a conda environment using the environment.yml file, which installs all the dependencies (listed below).
conda env create -f environment.yml
Usage
conda activate autophylo
Install autophylo using a tarball
Install the autophylo application within the created autophylo conda environment using a tarball.
python3 -m pip install /path/to/autophylo-1.0.0.tar.gz
Dependencies
The following dependencies are required:
- NCBI BLAST 2.15.0
- BLAST databases (version 5)
- FastTree 2.1.11
- MUSCLE 5.1.0
- RAxML 8.2.13
- Gblocks 0.91b
- USEARCH 12.0_beta
- seqkit 2.8.2
- trimAl 1.5.0
- biopython 1.84
- joblib 1.4.2
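Before running the tool it can help to confirm that the command-line dependencies are visible in the active environment. The following is a small sketch, not part of AutoPhylo: the executable names are taken from the example config file in this README, except `blastp` and `seqkit`, which are assumed to be the binary names installed by BLAST+ and seqkit.

```python
import shutil

# Names from the example config in this README; blastp and seqkit are assumptions.
EXECUTABLES = [
    "blastp",
    "muscle",
    "trimal",
    "FastTree",
    "raxmlHPC-PTHREADS-AVX",
    "Gblocks",
    "usearch",
    "seqkit",
]


def find_executables(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_executables(EXECUTABLES).items():
        print(f"{name:24s} {path or 'NOT FOUND'}")
```

Any tool reported as NOT FOUND either needs to be installed or has a different binary name on your platform (for example, a RAxML build without the AVX suffix).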
Download pre-formatted BLAST databases
(https://ftp.ncbi.nlm.nih.gov/blast/db/v5/README)
The pre-formatted databases offer the following advantages:
- Pre-formatting removes the need to run makeblastdb;
- Species-level taxonomy ids are included for each database entry;
- Databases are broken into smaller-sized volumes and are therefore easier to download;
- Sequences in FASTA format can be generated from the pre-formatted databases by using the blastdbcmd utility;
- A convenient script (update_blastdb.pl) is available in the BLAST+ package to download the pre-formatted databases.
Download nr
update_blastdb.pl --source ncbi --decompress --blastdb_version 5 --verbose 2 --num_threads 30 nr > log.nr 2>&1
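Once the database is in place, the blastdbcmd utility mentioned above can pull entries back out as FASTA. The sketch below builds such a call from Python. It assumes `blastdbcmd` is on PATH and uses its `-db`, `-entry`, and `-outfmt %f` options; the helper names, the database path, and the accession in the usage lines are placeholders, not values from AutoPhylo.

```python
import shutil
import subprocess


def blastdbcmd_fasta_command(db: str, accession: str) -> list[str]:
    """Build a blastdbcmd call that prints one database entry in FASTA format."""
    return ["blastdbcmd", "-db", db, "-entry", accession, "-outfmt", "%f"]


def fetch_fasta(db: str, accession: str) -> str:
    """Run blastdbcmd and return the FASTA text; fails if the tool or entry is missing."""
    if shutil.which("blastdbcmd") is None:
        raise FileNotFoundError("blastdbcmd was not found on PATH")
    result = subprocess.run(
        blastdbcmd_fasta_command(db, accession),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder values: substitute your own database prefix and accession.
    print(" ".join(blastdbcmd_fasta_command("/path/to/BLASTDB/NR/nr", "ACCESSION")))
```

Passing the arguments as a list avoids shell quoting problems with the `%f` format string.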
Update config file
Obtain the path to the dependency programs from the $CONDA_PREFIX variable.
After activating the autophylo environment, run the following command to get the path and use it to update the config file.
(autophylo) echo $CONDA_PREFIX
Example config file
(autophylo) amos@Amogelangs-MacBook-Pro autophylo % echo $CONDA_PREFIX
/Users/amos/miniconda3/envs/autophylo
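Rather than pasting the prefix into each line by hand, the tool-path sections can be generated from $CONDA_PREFIX. This is a sketch only: the section names, keys, and binary names are copied from the example config in this README, while the helper names and the fallback prefix are hypothetical. Whether AutoPhylo treats keys as case sensitive is not stated here, so the sketch simply preserves the case shown in the README.

```python
import configparser
import io
import os

# Section layout and binary names copied from the example config in this README.
TOOLS = {
    "ALIGNMENT": {"MUSCLE": "muscle", "TRIMAL": "trimal"},
    "TREE": {
        "FastTree": "FastTree",
        "RAxML_PTHREADS": "raxmlHPC-PTHREADS-AVX",
        "RAxML_HYBRID": "raxmlHPC-HYBRID-AVX",
    },
    "CLUSTERING": {"GBLOCKS": "Gblocks", "USEARCH": "usearch"},
}


def tool_sections(prefix: str) -> configparser.ConfigParser:
    """Build the tool-path sections, pointing every binary at <prefix>/bin."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # keep key case (configparser lowercases keys by default)
    for section, tools in TOOLS.items():
        cfg[section] = {
            key: os.path.join(prefix, "bin", exe) for key, exe in tools.items()
        }
    return cfg


def render(cfg: configparser.ConfigParser) -> str:
    """Serialize the config as KEY=value lines, matching the README's style."""
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # The fallback path is a placeholder for when no conda environment is active.
    prefix = os.environ.get("CONDA_PREFIX", "/path/to/envs/autophylo")
    print(render(tool_sections(prefix)))
```

The [DEFAULT] and [DATABASES] sections are left out on purpose, since they do not depend on the conda prefix.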
Updated config file
NOTE: The databases can be placed anywhere in the filesystem; in this example they are in /Users/amos/datalake.
[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes
[ALIGNMENT]
MUSCLE=/Users/amos/miniconda3/envs/autophylo/bin/muscle
TRIMAL=/Users/amos/miniconda3/envs/autophylo/bin/trimal
[TREE]
FastTree=/Users/amos/miniconda3/envs/autophylo/bin/FastTree
RAxML_PTHREADS=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-PTHREADS-AVX
RAxML_HYBRID=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-HYBRID-AVX
[CLUSTERING]
GBLOCKS=/Users/amos/miniconda3/envs/autophylo/bin/Gblocks
USEARCH=/Users/amos/miniconda3/envs/autophylo/bin/usearch
[DATABASES]
BLASTDB=/Users/amos/datalake/BLASTDB/NR/nr
TAXIDS=/Users/amos/datalake/BLASTDB/NR/taxids
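A quick way to catch typos in the config file is to parse it and check that each tool path points at an existing file. The sketch below uses Python's standard configparser and is not AutoPhylo's own loader; the function name is hypothetical. It skips keys inherited from [DEFAULT], which configparser exposes in every section, and it does not check [DATABASES], because BLASTDB in the example is a database prefix rather than a single file.

```python
import configparser
import os

TOOL_SECTIONS = ("ALIGNMENT", "TREE", "CLUSTERING")


def missing_paths(config_text: str) -> list[str]:
    """Return a description of every tool entry whose path is not an existing file."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # keep key case as written in the file
    cfg.read_string(config_text)
    defaults = cfg.defaults()
    problems = []
    for section in TOOL_SECTIONS:
        if not cfg.has_section(section):
            problems.append(f"[{section}] (section absent)")
            continue
        for key, value in cfg[section].items():
            if key in defaults:
                continue  # inherited from [DEFAULT], not a tool path
            if not os.path.isfile(value):
                problems.append(f"[{section}] {key}={value}")
    return problems


if __name__ == "__main__":
    sample = """
[DEFAULT]
Compression = yes
[ALIGNMENT]
MUSCLE=/nonexistent/bin/muscle
"""
    for line in missing_paths(sample):
        print(line)
```

To check a real file, read its contents and pass the text to `missing_paths`; an empty list means every listed binary was found.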