Probabilistic Phage Protein Functions: Phage genomes and their annotations

Edwards Lab

PPPF

Probabilistic Phage Protein Families

Author

Rob Edwards

Synopsis

We are exploring different ways of annotating phage proteins (because it never gets old), and this is a database of complete phage genomes and their annotations.

It also includes some phage protein clustering and tools associated with those clusters.

At the moment this is very much a pre-alpha project. We are defining the tables and relations, building the code base to access those tables, and trying to explore what we should do next.

However, we have made all our data, and the code to recreate it, available to everyone in case it is of use.

Installation

PIP installation

pip install pppf

Getting started

The download_databases script (run it with python scripts/download_databases.py) will download the two databases, phages.sql [2.6 GB] and clusters.sql [1.8 GB], to the default location (currently PPPF/data/databases/) or to a location of your choosing.

Most of the code in scripts requires that you provide a phage or clusters database as a command line option, but we are implementing code in pppfdb that will use the default location. If you use a different location, you may need to change the location in that code.
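As a rough illustration of the behaviour described above (a database path given on the command line, otherwise the default location), here is a minimal sketch. The option name --phagedb and the helper names resolve_db_path and parse_args are hypothetical and are not taken from the PPPF code; only the default directory and the file name come from this README.

```python
import argparse
from pathlib import Path

# Default location named in this README; change it if your databases live elsewhere.
DEFAULT_DB_DIR = Path("PPPF") / "data" / "databases"


def resolve_db_path(cli_value, filename, default_dir=DEFAULT_DB_DIR):
    """Return the path given on the command line, or fall back to the default."""
    if cli_value:
        return Path(cli_value)
    return Path(default_dir) / filename


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Example script that needs a phage database")
    # Hypothetical option name; the real scripts may use a different flag.
    parser.add_argument("--phagedb", default=None, help="path to phages.sql")
    return parser.parse_args(argv)


# No option given: fall back to the default location.
args = parse_args([])
print(resolve_db_path(args.phagedb, "phages.sql"))

# Option given: use it as-is.
args = parse_args(["--phagedb", "my_data/phages.sql"])
print(resolve_db_path(args.phagedb, "phages.sql"))
```

Keeping the fallback in one small function means a different default location only needs to be changed in one place.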

Building from scratch

If you want to build the databases from scratch, you can do so using snakemake and the snakefiles that we provide.

Then, you can use snakemake to start making it better. Probably.

You will need a process_phages.json file, and then you can update the databases with the latest phage genomes like this:

snakemake -s PPPF/snakefiles/download_phages.snakefile --configfile process_phages.json

If you are running on the Edwards lab's local compute resources, you can use this command to run the download on the cluster:

snakemake -s ~/GitHubs/PPPF/snakefiles/download_phages.snakefile --cluster 'qsub -cwd -o sge_download.out -e sge_download.err -V' -j 200 --latency-wait 60

It will download a new set of accessions, and then check the database to see what needs to be added. Note that currently we do not delete anything from the database.
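The add-only update described above can be sketched as a set difference between the downloaded accessions and those already stored. This is an illustration only: it assumes the database is SQLite, and the table name genome, the column name accession, and the accession strings are all hypothetical placeholders rather than the real PPPF schema.

```python
import sqlite3


def missing_accessions(conn, downloaded, table="genome", column="accession"):
    """Return the downloaded accessions that are not yet in the database."""
    existing = {row[0] for row in conn.execute(f"SELECT {column} FROM {table}")}
    return sorted(set(downloaded) - existing)


def add_accessions(conn, accessions, table="genome", column="accession"):
    """Insert new accessions; nothing is ever deleted."""
    conn.executemany(
        f"INSERT INTO {table} ({column}) VALUES (?)",
        [(acc,) for acc in accessions],
    )
    conn.commit()


# Demonstration on an in-memory database with placeholder accessions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genome (accession TEXT PRIMARY KEY)")
add_accessions(conn, ["ACC0001", "ACC0002"])

downloaded = ["ACC0002", "ACC0003", "ACC0004"]
to_add = missing_accessions(conn, downloaded)
add_accessions(conn, to_add)
print(to_add)
```

Note that ACC0001 stays in the table even though it is absent from the new download, which mirrors the behaviour of never deleting anything.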

Using PPPF

The basic structure is that each of the directories is a library, and the scripts directory contains scripts that use those libraries.

Take a look at the database schema for a more detailed discussion of the schema we designed.
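If you would rather inspect the tables directly, the following sketch lists the tables and columns in a database file. It assumes the downloaded files are SQLite databases, which this README does not state; the helper names and the demonstration table are hypothetical.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted alphabetically."""
    cur = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in cur]


def table_columns(conn, table):
    """Return the column names of one table."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info('{table}')")]


# Demonstration on an in-memory database with a made-up table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, label TEXT)")
for name in list_tables(demo):
    print(name, table_columns(demo, name))
```

To look at a downloaded file instead, open it with sqlite3.connect and the path to phages.sql or clusters.sql, then call the same two functions.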

Information

License

PPPF is released under the MIT License

Issues

Please use the issue tracker for any bugs, enhancements, suggestions, or comments

Download files

Source Distribution

PPPF-0.1.0.tar.gz (18.9 kB view details)

Uploaded Source

Built Distribution

PPPF-0.1.0-py3-none-any.whl (31.3 kB view details)

Uploaded Python 3

File details

Details for the file PPPF-0.1.0.tar.gz.

File metadata

  • Download URL: PPPF-0.1.0.tar.gz
  • Size: 18.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.5

File hashes

Hashes for PPPF-0.1.0.tar.gz
Algorithm Hash digest
SHA256 c714275a06b307f08b66ddf4e73ca8074286186b0fabfdacb1494e21f64da506
MD5 dd75bb62a7fd40aa9534987dabcf8fb5
BLAKE2b-256 edf8d5262ededeb717ebd97f2a20f373870d17ecec4e610cea22f02482c4b4d5

File details

Details for the file PPPF-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: PPPF-0.1.0-py3-none-any.whl
  • Size: 31.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.5

File hashes

Hashes for PPPF-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 08dad2a54a2f1143cd745ddfc6aff922c38f1be595edd7b3b5f5d3f084b9ecc0
MD5 b868d22021a9643b919833fbe28126b4
BLAKE2b-256 ae9046670c056b394026f929656c44b9d69baedf1a830b18aaf9f10302cba6a1
