Prophage finder using multiple metrics

What is PhiSpy?

PhiSpy identifies prophages in bacterial (and probably archaeal) genomes. Given an annotated genome, it uses several approaches to identify the most likely prophage regions.

Initial versions of PhiSpy were written by

Sajia Akhter (sajia@stanford.edu) Edwards Bioinformatics Lab

Improvements, bug fixes, and other changes were made by

Katelyn McNair Edwards Bioinformatics Lab and Przemyslaw Decewicz University of Warsaw

Software Requirements

PhiSpy requires the following software to be installed. Most of it is likely already on your system.

Python - version 3.4 or later
Biopython - version 1.58 or later
gcc - GNU project C and C++ compiler - version 4.4.1 or later
The Python.h header file. This is included in python3-dev that is available on most systems.
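
As a rough way to see which of the requirements above are present, the following sketch uses only the Python standard library. The function name `check_requirements` is hypothetical and not part of PhiSpy. It checks the Python version, whether Biopython can be imported, whether `gcc` is on the `PATH`, and whether `Python.h` sits in the interpreter's include directory. It does not check the Biopython or gcc version numbers.

```python
import importlib.util
import shutil
import sys
import sysconfig
from pathlib import Path


def check_requirements(min_python=(3, 4)):
    """Return a dict mapping each requirement name to True or False.

    This is an illustrative helper, not something PhiSpy provides.
    """
    # Directory where this interpreter expects its C headers to live.
    include_dir = sysconfig.get_paths().get("include", "")
    return {
        "python": sys.version_info[:2] >= min_python,
        # Biopython is imported as the top-level package "Bio".
        "biopython": importlib.util.find_spec("Bio") is not None,
        "gcc": shutil.which("gcc") is not None,
        "Python.h": bool(include_dir)
        and (Path(include_dir) / "Python.h").is_file(),
    }


for name, ok in sorted(check_requirements().items()):
    print("{}: {}".format(name, "found" if ok else "missing"))
```

A "missing" entry for Python.h usually means the development headers (python3-dev on Ubuntu) still need to be installed.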

INSTALLATION

For a brand new Ubuntu installation (e.g. on Google Cloud Platform), you can install these dependencies with these commands:

sudo apt install -y build-essential python3-dev python3-pip
python3 -m pip install --user PhiSpy

This will install PhiSpy.py in ~/.local/bin, which should be in your $PATH but might not be.

If you try PhiSpy.py -v and get an error like this:

$ PhiSpy.py -v
-bash: PhiSpy.py: command not found

Then you can either use the full path:

~/.local/bin/PhiSpy.py -v

or add that location to your $PATH:

echo "export PATH=\$HOME/.local/bin:\$PATH" >> ~/.bashrc
source ~/.bashrc
PhiSpy.py -v
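
If you are unsure whether ~/.local/bin is already on your $PATH, a small check along these lines may help. It is only a sketch; the function name `dir_on_path` is made up for this example.

```python
import os


def dir_on_path(directory, path_value):
    """Return True if `directory` is one of the entries of a PATH-style string."""
    target = os.path.normpath(os.path.expanduser(directory))
    # Ignore empty entries such as those produced by a trailing separator.
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(
        os.path.normpath(os.path.expanduser(entry)) == target
        for entry in entries
    )


local_bin = os.path.join("~", ".local", "bin")
print(dir_on_path(local_bin, os.environ.get("PATH", "")))
```

If this prints False, use the full path or add the directory to your $PATH as described above.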

Advanced Users

For advanced users, you can clone the git repository and use that (though pip is the recommended install method).

git clone https://github.com/linsalrob/PhiSpy.git
cd PhiSpy
python3 setup.py install --user

If you have root and you want to install globally, you can change the setup command.

git clone https://github.com/linsalrob/PhiSpy.git
cd PhiSpy
python3 setup.py install

For ease of use, you may wish to add the location of PhiSpy.py to your $PATH.

Testing PhiSpy.py

Download the Streptococcus pyogenes M1 genome

curl -Lo Streptococcus_pyogenes_M1_GAS.gb https://bit.ly/37qFArb
PhiSpy.py -o Streptococcus.phages Streptococcus_pyogenes_M1_GAS.gb

or to run it with the Streptococcus training set:

PhiSpy.py -o Streptococcus.phages -t data/trainSet_160490.61.txt Streptococcus_pyogenes_M1_GAS.gb

This uses the GenBank format file for Streptococcus pyogenes M1 GAS that we provide in the tests/ directory, and we use the training set for S. pyogenes M1 GAS that we have pre-calculated. This quickly identifies the four prophages in this genome, runs the repeat finder on all of them, and outputs the answers.

You will find the output files from this query in the directory given to -o (Streptococcus.phages in the commands above).
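
If you want to drive these test runs from a script, the command line can be assembled as a list and passed to `subprocess`. The sketch below uses only the -o and -t options shown above. The helper name `build_phispy_command` is hypothetical, and the code assumes PhiSpy.py is on your $PATH.

```python
import shlex


def build_phispy_command(genbank_file, output_dir, training_set=None,
                         executable="PhiSpy.py"):
    """Assemble the argument list for the PhiSpy.py calls shown above."""
    cmd = [executable, "-o", output_dir]
    if training_set is not None:
        # Optional pre-calculated training set, passed with -t.
        cmd += ["-t", training_set]
    cmd.append(genbank_file)
    return cmd


cmd = build_phispy_command(
    "Streptococcus_pyogenes_M1_GAS.gb",
    "Streptococcus.phages",
    training_set="data/trainSet_160490.61.txt",
)
# Print the command in a form that can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in cmd))

# To actually run it (requires PhiSpy to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a file name contains spaces.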

Running PhiSpy.py

The simplest command is:

PhiSpy.py -f genbank_file -o output_directory

where:

genbank_file: The input DNA sequence file in GenBank format.
output_directory: The directory where the final output files will be created.

If you have a new genome, we recommend annotating it using the RAST server or PROKKA.

After annotation, you can download the genome directory from the server.

Help

For the help menu use the -h option:

python PhiSpy.py -h

Output Files

There are 3 output files, located in the output directory.

prophage.tbl: This file has two columns separated by tabs [id, location]. The id is in the format pp_number, where number is a sequential number of the prophage (starting at 1). Location is in the format contig_start_stop and encompasses the prophage.
prophage_tbl.tsv: This is a tab-separated file that contains all the genes of the genome. The tenth column represents the status of a gene: a nonzero value marks a phage-like gene, and 0 marks a bacterial gene.

This file has 16 columns:
(i) fig_no: the id of each gene
(ii) function: function of the gene
(iii) contig
(iv) start: start location of the gene
(v) stop: end location of the gene
(vi) position: a sequential number of the gene (starting at 1)
(vii) rank: rank of each gene provided by random forest
(viii) my_status: status of each gene based on random forest
(ix) pp: classification of each gene based on their function
(x) Final_status: the status of each gene. For prophages, this column has the number of the prophage as listed in prophage.tbl above. If the column contains a 0 we believe that it is a bacterial gene.

If we can detect the att sites, the additional columns will be:
(xi) start of attL
(xii) end of attL
(xiii) start of attR
(xiv) end of attR
(xv) sequence of attL
(xvi) sequence of attR

prophage_coordinates.tsv: This file has the prophage ID, contig, start, stop, and potential att sites identified for the phages.
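
The formats described above can be read with the Python standard library. The following is a minimal sketch, not part of PhiSpy: the function names are hypothetical, and the code assumes the column order given above (gene id in column 1, Final_status in column 10). Rows whose tenth column is not a whole number are skipped, which also covers a header row if the file has one.

```python
import csv
import io
from collections import defaultdict

FINAL_STATUS_INDEX = 9  # tenth column, counted from zero


def parse_location(location):
    """Split a 'contig_start_stop' string into (contig, start, stop).

    Splitting from the right keeps contig names that themselves
    contain underscores (for example NC_002737) intact.
    """
    contig, start, stop = location.rsplit("_", 2)
    return contig, int(start), int(stop)


def genes_by_prophage(handle):
    """Group gene ids by prophage number from prophage_tbl.tsv-style rows."""
    groups = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) <= FINAL_STATUS_INDEX:
            continue  # too short to carry a Final_status column
        status = row[FINAL_STATUS_INDEX].strip()
        if not status.isdigit():
            continue  # header row or unexpected value
        if int(status) > 0:  # 0 means a bacterial gene
            groups[int(status)].append(row[0])
    return dict(groups)


# Made-up rows for illustration only; real files come from a PhiSpy run.
example = io.StringIO(
    "geneA\tfunc\tNC_002737\t10\t90\t1\t0.1\t0\t0\t0\n"
    "geneB\tfunc\tNC_002737\t100\t190\t2\t0.9\t1\t1\t1\n"
    "geneC\tfunc\tNC_002737\t200\t290\t3\t0.8\t1\t1\t1\n"
)
print(genes_by_prophage(example))
print(parse_location("NC_002737_529631_569288"))
```

With a real run you would pass an open file instead of the in-memory example, for instance `open("output_directory/prophage_tbl.tsv", newline="")`.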

Example Data

The example genome is Streptococcus pyogenes M1 GAS, which has a single genome contig. The genome contains four prophages.

To analyze this data, you can use:

PhiSpy.py -o output_directory -t data/trainSet_160490.61.txt tests/Streptococcus_pyogenes_M1_GAS.gb

And you should get a prophage table that has this information (for example, take a look at output_directory/prophage.tbl).

Prophage number	Contig	Start	Stop
pp_1	NC_002737	529631	569288
pp_2	NC_002737	778642	820599
pp_3	NC_002737	1192630	1222549
pp_4	NC_002737	1775862	1782822
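
To compare your own run against this table, you can compute the length of each predicted prophage from its coordinates. The sketch below copies the four rows above into a string. The function name `prophage_lengths` is hypothetical, and the code assumes that start and stop are inclusive coordinates, which is why it adds 1.

```python
import csv
import io

# The four rows from the table above, tab separated.
TABLE = (
    "pp_1\tNC_002737\t529631\t569288\n"
    "pp_2\tNC_002737\t778642\t820599\n"
    "pp_3\tNC_002737\t1192630\t1222549\n"
    "pp_4\tNC_002737\t1775862\t1782822\n"
)


def prophage_lengths(text):
    """Map each prophage id to its length in base pairs.

    Assumes four tab-separated columns (id, contig, start, stop)
    and inclusive coordinates.
    """
    lengths = {}
    for pp_id, _contig, start, stop in csv.reader(io.StringIO(text), delimiter="\t"):
        lengths[pp_id] = int(stop) - int(start) + 1
    return lengths


for pp_id, length in prophage_lengths(TABLE).items():
    print("{}\t{} bp".format(pp_id, length))
```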

PhiSpy 3.7.8
