PHAGES2050 is a novel Python 3.8+ framework to boost bacteriophage research and therapy

"Keep calm, use AI for phages and stop AMR"

PHAGES2050 is a novel Python 3.8+ framework that applies Natural Language Processing (NLP) and Deep Learning to boost bacteriophage research, therapy and infrastructure, so that phages can reach their full potential in the fight against antimicrobial-resistant bacteria.

Our project is about developing an AI-based framework for microbiologists and bioinformaticians who hunt, explore and classify phages. Applying the framework will shorten the time needed by the computational methods that match phages with bacteria for specific patient cases. Having such an organised, freely available framework at hand will help develop personalized phage therapy and make it accessible to people worldwide.

Watch the PHAVES #3 talk to learn more.

Table of Contents

Framework modules | Usage | Documentation | Installation | Community and Contributions | Have a question? | Found a bug? | Team | Change log | Code of Conduct | License

Framework modules

crawlers - a set of functions for scraping bacteriophage data from different sources (MillardLab, NCBI)
features - a set of functions for extracting nucleotide and protein features for Machine Learning classification and deeper analysis
embeddings - a set of pre-trained embedding models for vectorizing nucleotides and proteins
classifiers - a set of pre-trained Machine Learning models dedicated to bacteriophage research
explore - a set of 2D and 3D data visualization techniques for deeper exploration of bacteriophages
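
To make the idea of nucleotide feature extraction concrete, here is a minimal sketch of two common sequence features, GC content and k-mer frequencies, in plain Python. The function names `gc_content` and `kmer_frequencies` are hypothetical and chosen for illustration only; they are not the PHAGES2050 API, so check the documentation for what the features module actually provides.

```python
from collections import Counter
from typing import Dict


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def kmer_frequencies(sequence: str, k: int = 3) -> Dict[str, float]:
    """Return the relative frequency of every overlapping k-mer."""
    sequence = sequence.upper()
    total = len(sequence) - k + 1  # number of overlapping windows
    if k <= 0 or total <= 0:
        return {}
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {kmer: count / total for kmer, count in counts.items()}
```

Feature dictionaries of this kind can be turned into fixed-length vectors and passed to a Machine Learning classifier.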

Usage

The repository includes numerous examples of using the framework in Jupyter Notebook format (*.ipynb). The ones most requested by the community are listed below:

Crawlers
  • MillardLab bacteriophage crawler
  • NCBI bacteriophages crawlers (planned):
    • taxonomy, host and other expected meta-data;
    • complete genome sequences in FASTA format;
    • set of genes and proteins in FASTA format;
Embeddings
Classifiers
Explore
  • Bacteriophages in 3D space based on:
    • DNA embedding (planned)
    • proteins embedding (planned)
    • biological and biochemical features (planned)
    • custom user features (planned)
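
Several of the planned crawlers deliver sequences in FASTA format. As a rough illustration of what that format looks like to downstream code, the sketch below parses FASTA text into a dictionary using only the standard library. `parse_fasta` is a hypothetical helper written for this example and is not part of PHAGES2050.

```python
from typing import Dict, Iterable, List, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without '>') to its concatenated sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Made-up example records, for illustration only.
example = [">phage_A example record", "ATGCGT", "TTAG", ">phage_B", "GGCC"]
records = parse_fasta(example)
```

For real workloads an established parser, such as `Bio.SeqIO.parse` from Biopython, is usually a better choice than a hand-written one.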

Documentation

The official documentation is hosted on ReadTheDocs: https://phages2050.readthedocs.io

Installation

PHAGES2050 can be installed by running:

pip install phages2050

It requires Python 3.8.0+ to run. You can also use Conda:

conda install -c conda-forge phages2050
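
After installing, you may want to confirm that the interpreter meets the 3.8+ requirement and that the package is visible to it. The sketch below uses only the standard library (`importlib.metadata` is available from Python 3.8); the helper names `python_is_supported` and `installed_version` are made up for this example, and the distribution name `phages2050` is taken from the `pip install` command above.

```python
import sys
from importlib import metadata
from typing import Optional, Sequence


def python_is_supported(version_info: Sequence[int] = sys.version_info) -> bool:
    """Return True if the given interpreter version is 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


def installed_version(distribution: str = "phages2050") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("phages2050 version:", installed_version())
```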

Install from GitHub

If you can't wait for the latest hotness and want to install from GitHub, use:

pip install git+https://github.com/ptynecki/PHAGES2050

Protein embeddings

If you want to use the bacteriophage protein vectorizers, remember to install the extra packages for protein embedding:

pip install -U "bio-embeddings[all] @ git+https://github.com/sacdallago/bio_embeddings.git"
pip install git+https://github.com/facebookresearch/esm.git

Community and Contributions

We are happy to see you willing to make PHAGES2050 better. Development on the latest stable version of Python 3+ is preferred. As of this writing it's 3.8. You can use any operating system.

If you're fixing a bug or adding a new feature, add a test with pytest and check the code with Black and mypy. Before adding any large feature, first open an issue so the idea can be discussed with the core devs and the community.
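
As an illustration of the kind of contribution described above, a type-annotated helper together with a pytest-style test might look like the sketch below. `reverse_complement` is a hypothetical helper invented for this example, not an existing PHAGES2050 function, and the project's real test layout may differ.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def test_reverse_complement() -> None:
    # pytest collects functions whose names start with "test_".
    assert reverse_complement("ATGC") == "GCAT"
    assert reverse_complement("") == ""
    # Applying the operation twice returns the original sequence.
    assert reverse_complement(reverse_complement("AACGT")) == "AACGT"
```

Running `pytest`, `black .` and `mypy .` from the repository root before opening a pull request is one way to cover the three checks mentioned above.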

Have a question?

If you have a private question or want to cooperate with us, you can always reach out to us directly via our Phage Directory Slack (channel #PHAGES2050).

Found a bug?

Feel free to open a new issue with a descriptive title and description on the PHAGES2050 repository. If you have already found a solution to your problem, we would be happy to review your pull request.

Team

Core Developers, Domain Experts, Community Managers and Educators who are contributing to PHAGES2050:

  • Piotr Tynecki
  • Yana Minina
  • Iwona Świętochowska
  • Przemysław Mitura
  • Joanna Kazimierczak
  • Arkadiusz Guziński
  • Bogusław Zimnoch
  • Jessica Sacher, PhD
  • Shawna McCallin, PhD
  • Marie-Agnes Petit, PhD
  • Jan Zheng

Change log

The change log will become rather long, so it has been moved to its own file.

See CHANGELOG.md.

Code of Conduct

Everyone interacting in the PHAGES2050 project's development, issue trackers and Slack discussion is expected to follow the Code of Conduct.

License

The PHAGES2050 package and pre-trained models are released under the terms of the MIT License.
