Phynteny: Synteny-based prediction of bacteriophage genes

Edwards Lab. License: MIT.

Approximately 65% of all bacteriophage (phage) genes cannot be assigned a known biological function. Phynteny uses a long short-term memory (LSTM) model trained on phage synteny (the conserved gene order across phages) to assign hypothetical phage proteins to a PHROG category.

Phynteny is still a work in progress and the LSTM model has not yet been optimised. Use with caution!

NOTE: This version of Phynteny will only annotate phages with 120 genes or fewer due to the architecture of the LSTM. We aim to adjust this in future versions.
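
A quick pre-check against this limit can be sketched in Python. This is a minimal sketch using only the standard library, assuming each gene appears as a CDS feature in the GenBank feature table and that the file holds a single record. The names count_cds, within_limit and MAX_GENES are hypothetical and not part of Phynteny, and Phynteny's own gene counting may differ.

```python
import re

MAX_GENES = 120  # limit stated in the note above

# Feature keys start after five spaces in a GenBank feature-table line;
# qualifier lines are indented further, so they do not match.
CDS_LINE = re.compile(r"^ {5}CDS\s")


def count_cds(genbank_text: str) -> int:
    """Count CDS feature lines in the text of a GenBank file."""
    return sum(1 for line in genbank_text.splitlines() if CDS_LINE.match(line))


def within_limit(genbank_text: str, limit: int = MAX_GENES) -> bool:
    """Return True if the number of CDS features does not exceed the limit."""
    return count_cds(genbank_text) <= limit
```

For a file with several records the count is summed over all of them, so such a file would need to be split per record first.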

Dependencies

Phynteny requires Python 3.8 or above. You will need the following Python dependencies to run Phynteny and its support scripts. The latest tested versions are:

  • python - version 3.10.0
  • sklearn - version 1.2.2
  • biopython - version 1.81
  • numpy - version 1.21.0 (Windows, Linux, Apple Intel), version 1.24.0 (Apple M1/M2)
  • tensorflow - version 2.9.0 (Windows, Linux, Apple Intel), tensorflow-macos version 2.11 (Apple M1/M2)
  • pandas - version 2.0.2
  • loguru - version 0.7.0
  • click - version 8.1.3
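
To compare an existing environment against the list above, the installed versions can be read with the standard library. This is a minimal sketch, not part of Phynteny: installed_versions and PACKAGES are hypothetical names, and the distribution names are assumptions based on how these packages are published on PyPI (the sklearn module is distributed as scikit-learn; on Apple M1/M2 the TensorFlow distribution is tensorflow-macos, as listed above).

```python
from importlib import metadata

# Distribution names as published on PyPI (assumed, see the note above).
PACKAGES = [
    "scikit-learn",
    "biopython",
    "numpy",
    "tensorflow",
    "pandas",
    "loguru",
    "click",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```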

We recommend GPU support if you are training Phynteny. This requires CUDA and cuDNN.

Installation

Option 1: Installing Phynteny using conda (recommended)

You can install Phynteny from bioconda at https://anaconda.org/bioconda/phynteny. Make sure you have conda installed.

# create conda environment and install phynteny 
conda create -n phynteny -c bioconda phynteny
 
# activate environment
conda activate phynteny

# install phynteny (only needed if it was not installed when the environment was created)
conda install -c bioconda phynteny

NOTE: bioconda installations of Phynteny do not have GPU support. This is fine for most uses but does not enable training of Phynteny models.

Now you can go to Install Models to install pre-trained phynteny models.

Option 2: Installing Phynteny using pip

You can install Phynteny from PyPI at https://pypi.org/project/phynteny/. Make sure you have pip and mamba installed.

pip install phynteny

NOTE: pip installation is recommended for training Phynteny models.

Now you can go to Install Models to install pre-trained phynteny models.

Option 3: Installing Phynteny from source

If all else fails you can install Phynteny from this repo.

git clone https://github.com/susiegriggo/Phynteny.git --branch main --depth 1 
cd Phynteny 
pip install . 

Now you can go to Install Models to install pre-trained phynteny models.

Install Models

Once you've installed Phynteny, you'll need to download the pre-trained models:

install_models 

If you would like to specify a particular location to download the models to, run:

install_models -o <path/to/database_dir>

If for some reason this does not work, you can download the pre-trained models from Zenodo and untar them in a location of your choice.
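
The manual untar step can also be done from Python with the standard tarfile module. This is a minimal sketch: extract_models is a hypothetical helper, not part of Phynteny, and the archive path is whatever file you downloaded from Zenodo (its name is not assumed here). The helper refuses archive members that would be written outside the destination directory.

```python
import tarfile
from pathlib import Path


def extract_models(archive_path, dest_dir):
    """Extract a tar archive into dest_dir and return the member names."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r" (the default) detects gzip, bz2 or xz compression automatically.
    with tarfile.open(archive_path) as archive:
        members = archive.getmembers()
        for member in members:
            target = (dest / member.name).resolve()
            # Refuse members that would land outside dest_dir.
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(dest)
    return [member.name for member in members]
```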

Usage

Phynteny takes a GenBank file containing PHROG annotations as input. If your phage is not yet in this format, pharokka can convert your phage (in FASTA format) to a GenBank file with PHROG annotations. Phynteny then returns a GenBank file and a table containing the details of the predictions it made. Each prediction is accompanied by a 'phynteny score', which ranges from 1 to 10, and a recalibrated confidence score.

Recommended

phynteny tests/data/test_phage.gbk -o test_phynteny

Custom

If you wish to specify your own LSTM model, run:

phynteny test_phage.gbk -o test_phage_phynteny -m your_models -t confidence_dict.pkl 
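
To annotate several GenBank files in one go, the command above can be wrapped in a short Python script. This is a minimal sketch, assuming one phynteny invocation per input file and using only the options shown in this section (-o, -m, -t); build_command and run_batch are hypothetical helpers, not part of Phynteny, and the output directory naming is an illustrative choice.

```python
import subprocess
from pathlib import Path


def build_command(genbank, out_dir, models=None, thresholds=None):
    """Build the phynteny command line for one GenBank file."""
    cmd = ["phynteny", str(genbank), "-o", str(out_dir)]
    if models is not None:
        cmd += ["-m", str(models)]  # custom LSTM models
    if thresholds is not None:
        cmd += ["-t", str(thresholds)]  # confidence dictionary
    return cmd


def run_batch(genbank_files, out_root, **options):
    """Run phynteny once per input file, stopping at the first failure."""
    for gbk in map(Path, genbank_files):
        out_dir = Path(out_root) / f"{gbk.stem}_phynteny"
        subprocess.run(build_command(gbk, out_dir, **options), check=True)
```

For example, run_batch(["test_phage.gbk"], "results") would write to results/test_phage_phynteny, provided the phynteny executable is on your PATH.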

Details of how to train the Phynteny models and generate confidence estimates are given below.

Train Phynteny

Phynteny has already been trained for you on a dataset containing over 1 million prophages! If you feel inclined to generate your own Phynteny model using your own dataset, instructions and training scripts are provided here.

Performance

Coming soon: Notebooks demonstrating the performance of the model

Bugs and Suggestions

If you break Phynteny or would like to make any suggestions, please open an issue or email me at susie.grigson@flinders.edu.au.

Wow! How can I cite this incredible piece of work?

The Phynteny manuscript is currently in preparation. In the meantime, please cite Phynteny as:

Grigson, S. R., Mallawaarachchi, V., Roach, M. R., Papudeshi, B., Bouras, G., Decewicz, P., Dinsdale, E. A., & Edwards, R. A. (2023). Phynteny: Synteny-based annotation of phage genomes. DOI: 10.5281/zenodo.8128917

If you use pharokka to annotate your phage before using Phynteny, please cite it as well:

Bouras, G., Nepal, R., Houtak, G., Psaltis, A. J., Wormald, P. J., & Vreugde, S. (2023). Pharokka: a fast scalable bacteriophage annotation tool. Bioinformatics, 39(1), btac776.

If you found Phynteny useful and would like to get even better annotations for your phages, check out phold!
