Python module for running Defense Predictor, a machine learning model to predict antiphage defense systems

DefensePredictor: A Machine Learning Model to Discover Novel Prokaryotic Immune Systems

Python package to run DefensePredictor, a machine-learning model that leverages embeddings from a protein language model, ESM2, to classify proteins as anti-phage defensive.

Installation

In a fresh conda or other virtual environment, run:

pip install defense_predictor
defense_predictor_download

The first command installs the Python package and the second command downloads the model weights. Once the model weights are downloaded, you do not need to run the second command again.

Requirements

Requires Python >= 3.10

Usage

defense_predictor can be called from Python:

import defense_predictor as dfp

ncbi_feature_table = 'GCF_003333385.1_ASM333338v1_feature_table.txt'
ncbi_cds_from_genomic = 'GCF_003333385.1_ASM333338v1_cds_from_genomic.fna'
ncbi_protein_fasta = 'GCF_003333385.1_ASM333338v1_protein.faa'
output_df = dfp.run_defense_predictor(ncbi_feature_table=ncbi_feature_table,
                                      ncbi_cds_from_genomic=ncbi_cds_from_genomic,
                                      ncbi_protein_fasta=ncbi_protein_fasta)
print(output_df.head())  # preview the first rows of the predictions

Or from the command line:

defense_predictor \
     --ncbi_feature_table GCF_003333385.1_ASM333338v1_feature_table.txt \
     --ncbi_cds_from_genomic GCF_003333385.1_ASM333338v1_cds_from_genomic.fna \
     --ncbi_protein_fasta GCF_003333385.1_ASM333338v1_protein.faa \
     --output GCF_003333385_defense_predictor_output.csv

defense_predictor outputs the predicted probability and log-odds of defense for each input protein. We recommend using a stringent log-odds cutoff of 7.2 to call a protein defensive.
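As a rough illustration of applying the cutoff, the sketch below filters prediction rows by log-odds and converts a log-odds value back to a probability. It makes several assumptions that are not confirmed by this README: the column name `log_odds` is hypothetical (check the header of your own output), the log-odds are taken to be natural-log odds, and proteins exactly at the cutoff are kept. The helper names are illustrative and are not part of defense_predictor.

```python
import csv
import math

LOG_ODDS_CUTOFF = 7.2  # stringent cutoff recommended in the text above


def log_odds_to_probability(log_odds: float) -> float:
    """Convert log-odds to a probability with the logistic function.

    Assumes natural-log odds, which this README does not state explicitly.
    """
    return 1.0 / (1.0 + math.exp(-log_odds))


def read_predictions(csv_path):
    """Load a defense_predictor output CSV as a list of dict rows."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


def predicted_defensive(rows, log_odds_column="log_odds", cutoff=LOG_ODDS_CUTOFF):
    """Keep rows whose log-odds are at or above the cutoff.

    `log_odds_column` is a hypothetical column name; adjust it to match
    the header of the real output file.
    """
    return [row for row in rows if float(row[log_odds_column]) >= cutoff]
```

Under the natural-log assumption, a log-odds of 7.2 corresponds to a predicted probability of about 0.9993, which gives a sense of how strict the cutoff is. Typical use would be `predicted_defensive(read_predictions('GCF_003333385_defense_predictor_output.csv'))`, after adjusting the column name.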

To see an example, you can run the defense_predictor_example.ipynb notebook in Colab.

We recommend running defense_predictor on a computer with a CUDA-enabled GPU to maximize computational efficiency.
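Before a long run, it can help to confirm that a CUDA device is visible from Python. The check below is a minimal sketch that assumes defense_predictor relies on PyTorch to compute ESM2 embeddings, which this README does not state; `cuda_available` is an illustrative helper, not part of the package.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True if PyTorch is importable and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch is not installed in this environment
    try:
        import torch
    except ImportError:
        return False  # PyTorch is present but could not be imported
    return torch.cuda.is_available()


if __name__ == "__main__":
    print("CUDA GPU detected" if cuda_available() else "No CUDA GPU detected")
```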

Inputs

Input files can be downloaded from the FTP page for any genome of interest, which is linked on its assembly page. Input files can also be generated from an unannotated nucleotide assembly using NCBI's Prokaryotic Genome Annotation Pipeline.
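Files on the NCBI genomes FTP site are distributed gzip-compressed (for example, `..._feature_table.txt.gz`). This README does not say whether defense_predictor reads compressed files directly, so the sketch below simply decompresses a download next to itself, producing file names like those used in the Usage section. `gunzip` is an illustrative helper, not part of the package.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(path):
    """Decompress a .gz file next to itself and return the uncompressed path.

    Paths that do not end in .gz are returned unchanged.
    """
    source = Path(path)
    if source.suffix != ".gz":
        return source  # already uncompressed
    target = source.with_suffix("")  # drops only the trailing .gz
    with gzip.open(source, "rb") as compressed, open(target, "wb") as plain:
        shutil.copyfileobj(compressed, plain)
    return target
```

For instance, `gunzip('GCF_003333385.1_ASM333338v1_protein.faa.gz')` would write `GCF_003333385.1_ASM333338v1_protein.faa` in the same directory.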

Alternatively, defense_predictor accepts inputs generated by Prokka through the arguments prokka_gff, prokka_ffn, and prokka_faa.
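Prokka writes its outputs as `<prefix>.gff`, `<prefix>.ffn`, and `<prefix>.faa` inside its output directory, so the three arguments can be assembled from a directory and a prefix. The sketch below assumes that `run_defense_predictor` takes the Prokka arguments as keyword arguments in the same way as the NCBI ones shown in the Usage section; `prokka_inputs` is an illustrative helper, and the directory and prefix in the usage comment are placeholders.

```python
from pathlib import Path


def prokka_inputs(prokka_dir, prefix):
    """Build the Prokka keyword arguments from an output directory and prefix."""
    base = Path(prokka_dir)
    return {
        "prokka_gff": str(base / f"{prefix}.gff"),  # annotation in GFF3 format
        "prokka_ffn": str(base / f"{prefix}.ffn"),  # nucleotide FASTA of features
        "prokka_faa": str(base / f"{prefix}.faa"),  # protein FASTA of CDS translations
    }


# Possible usage, assuming the keyword-argument form described above:
#   import defense_predictor as dfp
#   output_df = dfp.run_defense_predictor(**prokka_inputs("prokka_out", "my_genome"))
```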

