
AntiFP2: A tool for prediction of Antifungal Proteins


AntiFP2

AntiFP2 is a tool for the prediction of antifungal proteins using a fine-tuned ESM2 language model, optionally enhanced by post-prediction adjustment with BLAST and MERCI motif detection.

This pipeline combines deep learning-based embeddings with classical bioinformatics methods for improved reliability in antifungal protein prediction.


🚀 Features

  • Fine-tuned ESM2-t36_3B_UR50D model for antifungal prediction
  • Post-prediction adjustment using:
    • BLAST: Sequence similarity matching to known antifungal/negative examples
    • MERCI: Motif Enrichment Recognition to enhance biological relevance
  • One-by-one and batch prediction modes
  • Rejection logging for low-quality or invalid sequences
  • Hugging Face integration for model loading

📦 Installation

pip install git+https://github.com/patrik-ackerman/antifp2.git

Note: Requires Python ≥ 3.12

Ensure that BLAST+ and MERCI binaries are properly configured via the envfile as shown below.


📁 Project Structure

antifp2/
│
├── python_scripts/
│   ├── antifp2_ESM2.py      # Main pipeline with ESM2 + BLAST + MERCI
│   ├── antifp2_BLAST.py     # ESM2-only one-by-one predictor
│   └── envfile              # Config file for paths to BLAST and MERCI tools
│
├── MERCI/                   # MERCI motif files
├── blast_db/                # Preformatted BLAST database
├── README.md
├── setup.py
└── ...

🧪 Usage

🔮 ESM2-Only Prediction (one-by-one)

antifp2_blast --fasta path/to/input.fasta --output results.csv
  • Output includes the columns ID, probability, and prediction
  • Invalid sequences are logged to rejected_log.txt
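Since invalid sequences end up in rejected_log.txt, it can be handy to pre-check a FASTA file before running the predictor. The following is a minimal sketch, not AntiFP2's own validation: the rule used here (non-empty, only the 20 standard amino acid letters) and the helper names read_fasta and split_valid are assumptions made for illustration.

```python
from typing import Iterator

# Assumption: a sequence is "valid" if it is non-empty and uses only the
# 20 standard amino acid letters. AntiFP2's real criteria may differ.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta(text: str) -> Iterator[tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from FASTA-formatted text."""
    seq_id, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            header = line[1:].split()
            seq_id = header[0] if header else ""
            chunks = []
        elif seq_id is not None:
            chunks.append(line.upper())
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def split_valid(text: str):
    """Return (accepted, rejected): accepted records and rejected IDs."""
    accepted, rejected = [], []
    for seq_id, seq in read_fasta(text):
        if seq and set(seq) <= VALID_RESIDUES:
            accepted.append((seq_id, seq))
        else:
            rejected.append(seq_id)
    return accepted, rejected


example = ">seq1\nMKTLLV\n>seq2\nMKXB12\n>seq3\n"
accepted, rejected = split_valid(example)
print(accepted)  # records that pass the assumed check
print(rejected)  # IDs that would likely be worth inspecting by hand
```

Running a check like this first makes it easier to tell which entries to look for in rejected_log.txt afterwards.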

🧬 Full Pipeline (ESM2 + BLAST + MERCI)

antifp2_esm --fasta path/to/input.fasta --output ./output_dir/
  • Performs predictions
  • Runs BLAST against the provided database
  • Executes MERCI with the motif file
  • Adjusts predictions and saves the final output to output_dir/<input>.adjusted.csv

Optional flag:

--no-cleanup   # Retains intermediate files like raw BLAST output, logs, etc.
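For scripted runs, the command above can be assembled and launched from Python. This is a small sketch that uses only the flags documented here (--fasta, --output, --no-cleanup); the helper names build_command and run_pipeline are hypothetical, and creating the output directory beforehand is a precaution rather than a documented requirement.

```python
import subprocess
from pathlib import Path


def build_command(fasta, output_dir, keep_intermediate=False):
    """Build the antifp2_esm argument list from the documented flags."""
    cmd = ["antifp2_esm", "--fasta", str(fasta), "--output", str(output_dir)]
    if keep_intermediate:
        # Documented optional flag: keep raw BLAST output, logs, etc.
        cmd.append("--no-cleanup")
    return cmd


def run_pipeline(fasta, output_dir, keep_intermediate=False):
    """Run the full pipeline; raises CalledProcessError on a non-zero exit."""
    # Precaution (assumption): make sure the output directory exists.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    return subprocess.run(
        build_command(fasta, output_dir, keep_intermediate), check=True
    )


# Only builds the command here; call run_pipeline(...) to actually execute it.
print(build_command("path/to/input.fasta", "./output_dir/", keep_intermediate=True))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.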

🔧 Configuring Environment

The tool reads environment-specific paths from a file named envfile. Example format:

# Path settings for different OS
BLAST_ubuntu=/usr/bin/blastp
BLAST_windows=C:/Program Files/NCBI/blastp.exe
BLAST_macos=/usr/local/bin/blastp

BLAST_database=antifp2/blast_db/antifungal_db
MERCI=antifp2/MERCI/merci
MERCI_motif_file=antifp2/MERCI/motifs.motif

Make sure this file is saved as antifp2/python_scripts/envfile.
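To illustrate how a KEY=VALUE file of this shape can be read, here is a minimal sketch of a parser that skips comments and picks the BLAST path for the current operating system. It is not AntiFP2's own loader: the functions parse_envfile and blast_path, and the mapping from platform.system() values to the ubuntu/windows/macos suffixes, are assumptions for illustration.

```python
import platform


def parse_envfile(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        settings[key.strip()] = value.strip()
    return settings


# Assumption: how platform.system() names map onto the envfile key suffixes.
OS_SUFFIX = {"Linux": "ubuntu", "Windows": "windows", "Darwin": "macos"}


def blast_path(settings, system=None):
    """Return the BLAST executable path for the given (or current) OS."""
    system = system or platform.system()
    suffix = OS_SUFFIX.get(system)
    if suffix is None:
        raise KeyError(f"no BLAST key known for platform {system!r}")
    return settings[f"BLAST_{suffix}"]


example = """\
# Path settings for different OS
BLAST_ubuntu=/usr/bin/blastp
BLAST_windows=C:/Program Files/NCBI/blastp.exe
BLAST_macos=/usr/local/bin/blastp

BLAST_database=antifp2/blast_db/antifungal_db
MERCI=antifp2/MERCI/merci
MERCI_motif_file=antifp2/MERCI/motifs.motif
"""

settings = parse_envfile(example)
print(blast_path(settings, "Darwin"))
print(settings["MERCI_motif_file"])
```

Splitting on the first "=" only keeps values such as the Windows path, which contains spaces, intact.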


📋 Output Format

adjusted.csv columns:

Column            Description
ID                Sequence ID from FASTA
probability       Raw ESM2-based antifungal probability
blast_adjustment  Adjustment based on BLAST hit
motif_adjustment  Adjustment based on MERCI hit
combined          Final adjusted probability
prediction        1 if combined ≥ 0.5, else 0
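The thresholding rule in the last row can be checked when post-processing the output. The sketch below assumes adjusted.csv is comma-separated with a header row using exactly the column names listed above; the sample rows are made up for illustration and are not real AntiFP2 output, and nothing here shows how the pipeline derives the combined value.

```python
import csv
import io

THRESHOLD = 0.5  # documented cutoff: prediction is 1 if combined >= 0.5


def load_rows(handle):
    """Read adjusted.csv-style rows and recompute the label from 'combined'."""
    rows = []
    for row in csv.DictReader(handle):
        combined = float(row["combined"])
        rows.append({
            "ID": row["ID"],
            "combined": combined,
            "prediction": int(row["prediction"]),
            "recomputed": int(combined >= THRESHOLD),
        })
    return rows


# Made-up rows in the assumed layout; replace with open("...adjusted.csv").
sample = io.StringIO(
    "ID,probability,blast_adjustment,motif_adjustment,combined,prediction\n"
    "seq1,0.42,0.5,0.0,0.92,1\n"
    "seq2,0.61,-0.5,0.0,0.11,0\n"
)

rows = load_rows(sample)
positives = [r["ID"] for r in rows if r["recomputed"] == 1]
print(positives)
```

Comparing the prediction column with the recomputed value is a cheap consistency check before passing results downstream.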

💾 Model Files

Downloaded automatically from Hugging Face:

  • config.json
  • pytorch_model.bin
  • alphabet.bin

Repo: raghavagps-group/antifp2
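Although the files are fetched automatically, they can also be downloaded ahead of time, for example to inspect them or to warm the local cache. This is a sketch under assumptions: it relies on the separately installed huggingface_hub package and its hf_hub_download function, the helper name prefetch_model is hypothetical, and whether AntiFP2 reuses the same cache location is not confirmed by this README.

```python
MODEL_REPO = "raghavagps-group/antifp2"
MODEL_FILES = ["config.json", "pytorch_model.bin", "alphabet.bin"]


def prefetch_model(repo_id=MODEL_REPO, filenames=MODEL_FILES):
    """Download each listed file and return a {filename: local_path} mapping."""
    # Imported lazily so this module loads even without huggingface_hub;
    # install it with `pip install huggingface_hub` before calling.
    from huggingface_hub import hf_hub_download

    return {
        name: hf_hub_download(repo_id=repo_id, filename=name)
        for name in filenames
    }


# Requires network access; uncomment to run.
# paths = prefetch_model()
# print(paths["config.json"])
```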


📝 License

This project is licensed under the terms of the MIT License. See the LICENSE.txt file for details.


👨‍🔬 Author

Pratik Shinde, Indian Institute of Information Technology Delhi




Download files


Source Distribution

antifp2-1.0.0.tar.gz (29.8 MB)

Uploaded Source

Built Distribution


antifp2-1.0.0-py3-none-any.whl (29.9 MB)

Uploaded Python 3

File details

Details for the file antifp2-1.0.0.tar.gz.

File metadata

  • Download URL: antifp2-1.0.0.tar.gz
  • Upload date:
  • Size: 29.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for antifp2-1.0.0.tar.gz
Algorithm    Hash digest
SHA256       625cfb00039dfad956bb1c3868e64d5ae9799f07312ec4fb3df16907c50968b9
MD5          3a3d775ac7b2679135e0b3ce3e560894
BLAKE2b-256  02840e31e8a3306a47e5c20d4f622535b6ae42fd33b93d883c74fba1a7441779


File details

Details for the file antifp2-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: antifp2-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 29.9 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for antifp2-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       05def1a5b6c9a4b4aea27877080c80fd6679f3a822fc7b7a96935c52da8b635d
MD5          379b320096f0e8c70496c8f64d48b0d4
BLAKE2b-256  82f3b0e2dc1327aaf7ce86c30df4676d037deb2fda8a389497e2c55fb243da25
