An embedding-based phage protein annotation tool by hierarchical assignment

EmPATHi
Embedding-based Phage Protein Annotation Tool by Hierarchical assignment

Table of Contents
  1. About the Project
  2. Getting Started
  3. Usage
  4. Contact

About the Project

Little description.

Preprint can be found at: [link]

Getting Started

EmPATHi is available on PyPI and as an Apptainer container for ease of use.
The source code can also be downloaded from HuggingFace.

Prerequisites

The full list of dependencies, with the versions we tested for compatibility, can be found in requirements.txt. pip and Apptainer install these dependencies automatically. See instructions below.

python/3.11.5
joblib==1.2.0
numpy==1.26.4
pandas==2.2.1
matplotlib==3.9.0
torch==2.3.0
scipy==1.13.1
scikit-learn==1.5.0
transformers==4.43.1
sentencepiece==0.2.0
seaborn==0.13.2
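
To compare an existing environment against the versions listed above, a small check along these lines may help. It is only a sketch: `compare_versions` and `TESTED_VERSIONS` are hypothetical names that are not part of EmPATHi, and a mismatch does not necessarily mean EmPATHi will fail, only that the combination is not the one that was tested.

```python
from importlib import metadata

# Versions copied from the list above.
TESTED_VERSIONS = {
    "joblib": "1.2.0",
    "numpy": "1.26.4",
    "pandas": "2.2.1",
    "matplotlib": "3.9.0",
    "torch": "2.3.0",
    "scipy": "1.13.1",
    "scikit-learn": "1.5.0",
    "transformers": "4.43.1",
    "sentencepiece": "0.2.0",
    "seaborn": "0.13.2",
}


def compare_versions(tested, lookup=metadata.version):
    """Return {package: (tested_version, installed_version_or_None)} for each mismatch."""
    mismatches = {}
    for package, wanted in tested.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in compare_versions(TESTED_VERSIONS).items():
        print(f"{package}: tested with {wanted}, found {installed or 'not installed'}")
```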

The models used by EmPATHi must be obtained separately. See instructions below.
The models folder for EmPATHi must be obtained from HuggingFace.
ProtT5 must also be downloaded from HuggingFace.

Installation

First, create a virtual environment with Python 3.11.5. This can be done using tools such as conda or virtualenv.

Download models for EmPATHi and ProtT5:

git lfs install
git clone https://huggingface.co/AlexandreBoulay/EmPATHi
git clone https://huggingface.co/Rostlab/prot_t5_xl_half_uniref50-enc Rostlab/prot_t5_xl_half_uniref50-enc
export PATH="/path/to/EmPATHi/models:$PATH"
export PATH="/path/to/Rostlab/prot_t5_xl_half_uniref50-enc:$PATH"
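
To confirm that the two exports took effect in the current shell, the PATH entries can be inspected from Python. This is a minimal sketch: `folders_on_path` is a hypothetical helper, and it only reads the PATH variable. How EmPATHi itself locates the folders is not shown here.

```python
import os
from pathlib import Path


def folders_on_path(folder_name, path_value=None):
    """Return the PATH entries whose last component equals folder_name."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [Path(entry) for entry in path_value.split(os.pathsep) if entry]
    return [entry for entry in entries if entry.name == folder_name]


if __name__ == "__main__":
    # Folder names taken from the export commands above.
    for name in ("models", "prot_t5_xl_half_uniref50-enc"):
        matches = folders_on_path(name)
        print(name, "->", [str(m) for m in matches] or "not found on PATH")
```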

1. PIP

pip install empathi

2. Apptainer

Download or load Apptainer or Singularity.

Fetch EmPATHi from Docker Hub:

apptainer fetch

Launch EmPATHi

# Bind both folders in a single comma-separated variable (a second export would overwrite the first).
export APPTAINER_BINDPATH="$PWD/path/to/input_file/,$PWD/path/to/output_folder/" #or replace $PWD by absolute path
apptainer run empathi.sif $PWD/path/to/input_file

3. From source code

Clone the repo if it isn't already done:

git lfs install
git clone https://huggingface.co/AlexandreBoulay/EmPATHi

Install dependencies:

cd EmPATHi
pip install -r requirements.txt

Usage

For pip:

python
from empathi import empathi
empathi(input_file, name, output_folder="path/to/output")

For Apptainer:

From command line:

python src/empathi/empathi.py -h

Options:

  • input_file: Path to the input file containing protein sequences (.fa*) or protein embeddings (.pkl/.csv).
  • name: Name of the file to save results to (without extension). Use a different name for each run to avoid overwriting files.
  • --models_folder: Path to folder containing EmPATHi models. Can be left unspecified if it was added to PATH earlier.
  • --only_embeddings: Whether to only calculate embeddings (no functional prediction).
  • --output_folder: Path to the output folder. Default is ./empathi_out/.
  • --mode: Which types of proteins you want to predict. Accepted arguments are "all", "pvp", "rbp", "lysin", "regulator"...
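
The input_file option accepts two kinds of files, distinguished by extension. The rule stated above can be sketched as follows. `input_kind` is a hypothetical helper written for illustration and is not EmPATHi's own code, which may apply different or additional checks.

```python
from pathlib import Path


def input_kind(input_file):
    """Classify an input path as 'sequences' (.fa*) or 'embeddings' (.pkl/.csv)."""
    suffix = Path(input_file).suffix.lower()
    if suffix in {".pkl", ".csv"}:
        return "embeddings"
    if suffix.startswith(".fa"):  # .fa, .faa, .fasta, ...
        return "sequences"
    raise ValueError(f"Unsupported input file extension: {suffix!r}")
```

A check like this can be run before a long job to catch a wrongly named input file early.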

When calling EmPATHi from Python, omit the '--' in front of the option names and pass them as keyword arguments.
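
As an illustration of that naming rule, the sketch below turns command-line style options into Python keyword arguments. `cli_to_kwargs` is a hypothetical helper, and the file name `proteins.faa` and run name `run1` are placeholders. The final call is left commented out because it requires the package and models to be installed, and it assumes that every option listed above is accepted as a keyword argument.

```python
def cli_to_kwargs(cli_options):
    """Strip the leading '--' from option names, e.g. '--mode' becomes 'mode'."""
    return {flag.removeprefix("--"): value for flag, value in cli_options.items()}


kwargs = cli_to_kwargs({"--output_folder": "empathi_out", "--mode": "pvp"})
print(kwargs)

# With EmPATHi installed, the call would then look like:
# from empathi import empathi
# empathi("proteins.faa", "run1", **kwargs)
```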

Contact
