Antioxidant peptide predictor

Project description

ANTIOXIPRED

A method for predicting antioxidant potential in peptides

Introduction

AntioxiPred is a computational framework designed to predict the antioxidant potential of peptide sequences with high accuracy using machine learning approaches. The method integrates a similarity-based search using the Basic Local Alignment Search Tool (BLAST) with a CatBoost (Categorical Boosting) classifier. The predictive model is trained on composition-based features, including Pseudo Amino Acid Composition (PAAC) and one-hot encoded sequence profiles, which capture both global compositional characteristics and positional sequence information.
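To make the two feature families concrete, here is a minimal sketch of a one-hot sequence profile and of plain amino acid composition, the simple base that PAAC extends with sequence-order terms. The function names, the fixed `max_len` padding, and the handling of non-standard residues are assumptions for illustration; they are not taken from the AntioxiPred source, and the full PAAC calculation is not reproduced here.

```python
# Illustrative sketch only; not the feature code shipped with AntioxiPred.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_encode(peptide, max_len):
    """Return a flat 0/1 vector of length max_len * 20 (hypothetical layout)."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    vector = [0] * (max_len * len(AMINO_ACIDS))
    # Peptides longer than max_len are truncated; shorter ones are zero-padded.
    for pos, residue in enumerate(peptide.upper()[:max_len]):
        if residue in index:  # non-standard residues stay all-zero
            vector[pos * len(AMINO_ACIDS) + index[residue]] = 1
    return vector


def amino_acid_composition(peptide):
    """Return the fraction of each standard residue in the peptide."""
    peptide = peptide.upper()
    n = len(peptide) or 1  # avoid division by zero for empty input
    return [peptide.count(aa) / n for aa in AMINO_ACIDS]
```

The one-hot vector keeps positional information (which residue sits at which position), while the composition vector summarizes the peptide as a whole regardless of residue order.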

Requirements

  • scikit-learn (version 1.6.1)
  • pandas
  • numpy
  • joblib

You can set up the environment using either requirements.txt (for pip users) or environment.yml (for Conda users).

Using requirements.txt

pip install -r requirements.txt

Using environment.yml

conda env create -f environment.yml

No additional package or tool is required for model = 1 (the default model).

The hybrid prediction mode (model = 2) requires the NCBI BLAST+ software.

BLAST executables are platform-specific. Therefore, users must download the appropriate version of NCBI BLAST+ from the official NCBI website:

https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
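Before running the hybrid mode, it can help to confirm that a BLAST+ executable is reachable. The sketch below assumes that the protein search program `blastp` is the one needed and that it is looked up on the system PATH; how AntioxiPred actually locates BLAST is not stated here, so treat both points as assumptions.

```python
import shutil


def find_blastp(executable="blastp"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(executable)


path = find_blastp()
if path is None:
    # Assumption: the hybrid model needs blastp to be discoverable on PATH.
    print("blastp was not found on PATH; install NCBI BLAST+ before using model = 2")
else:
    print(f"Found blastp at {path}")
```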

Minimum Usage

To list the available options for the standalone tool, type the following command:

antioxipred -h

To run the example, type the following command:

antioxipred -f AOPP.test.2023.fasta -o output

Here, the -f argument specifies the input file in FASTA format, and the -o argument specifies the path to the output directory. By default, the package uses model (-m) = 1, which employs only the ML algorithm (Categorical Boosting) to classify the peptide sequences and writes a prediction file "classification_ml_(datetime).csv" to the specified output directory. If model (-m) = 2 is selected, the hybrid model (ML + BLAST) is used instead, and the prediction file "classification_hybrid(datetime).csv" is written to the specified output directory.
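For scripted or batch runs, the same command line can be assembled from Python. The following is a minimal sketch: the `-f`, `-o`, `-m`, and `-t` flags come from the usage text, while the helper names `build_command` and `run` are hypothetical and not part of the package.

```python
import subprocess


def build_command(fasta_path, output_dir, model=1, threshold=None):
    """Assemble the antioxipred command line as a list of arguments."""
    cmd = ["antioxipred", "-f", str(fasta_path), "-o", str(output_dir), "-m", str(model)]
    if threshold is not None:
        # Only pass -t when the caller wants to override the default threshold.
        cmd += ["-t", str(threshold)]
    return cmd


def run(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)


# Example: build (but do not execute) a hybrid-mode command.
print(" ".join(build_command("AOPP.test.2023.fasta", "output", model=2)))
```

Calling `run(build_command(...))` only works once the `antioxipred` command is installed and on PATH.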

Full Usage

usage: antioxipred [-h] --file FILE --output OUTPUT [--model MODEL] [--threshold THRESHOLD]
Please provide following arguments for successful run
required arguments:
  --file FILE, -f FILE                   Path to fasta file
  --output OUTPUT, -o OUTPUT             Path to output

optional arguments:

  --model MODEL, -m MODEL                Model selection: 1 for ML only, 2 for ML + BLAST (By default model = 1)
  --threshold THRESHOLD, -t THRESHOLD    Threshold for classification (can be any value between 0-1 for model = 1 (by default = 0.5) and 0-2 for model = 2 (by default = 0.52))

For help:
  -h, --help            show this help message and exit

Standalone Minimum Usage

Run the program using:

python3 antioxipred.py -f AOPP.test.2023.fasta -o result

Arguments Description

Input File (-f)

Allows users to provide input peptide sequences in FASTA format.
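As a reminder of what FASTA input looks like, each record is a header line starting with ">" followed by one or more sequence lines. The sketch below is a small standalone reader that can be used to sanity-check an input file before prediction; it is not the parser used inside AntioxiPred, and the example peptides are made up.

```python
def read_fasta(lines):
    """Parse an iterable of text lines into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; join them and normalize case.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = """>pep1
AYGHK
>pep2
LLPHH
"""
for name, seq in read_fasta(example.splitlines()):
    print(name, seq)
```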

Output File (-o)

The program saves the prediction results in the specified output folder.

Model (-m)

Users can choose which model to run:

  • model = 1 → Runs only the Machine Learning model (CatBoost classifier)
  • model = 2 → Runs the Hybrid model (Machine Learning + BLAST)

Default: model = 1

Threshold (-t)

Users can provide a threshold for classification.

  • For model = 1 → threshold range: 0 – 1 (default = 0.50)
  • For model = 2 → threshold range: 0 – 2 (default = 0.52)
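The role of the threshold can be sketched as a simple cutoff on the prediction score. The ranges below come from the description above; the label strings, the use of `>=` rather than `>`, and the function names are assumptions, and the way the hybrid score is computed from the ML and BLAST results is not covered here.

```python
# Documented threshold ranges per model: model 1 uses 0-1, model 2 uses 0-2.
THRESHOLD_RANGES = {1: (0.0, 1.0), 2: (0.0, 2.0)}


def validate_threshold(model, threshold):
    """Return the threshold if it lies in the documented range for the model."""
    low, high = THRESHOLD_RANGES[model]
    if not low <= threshold <= high:
        raise ValueError(f"threshold for model = {model} must be between {low} and {high}")
    return threshold


def classify(score, threshold):
    """Label a score against the cutoff (hypothetical labels and comparison)."""
    return "Antioxidant" if score >= threshold else "Non-antioxidant"


print(classify(0.73, validate_threshold(1, 0.5)))
```

Raising the threshold makes the positive call stricter, trading sensitivity for fewer false positives.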

ANTIOXIPRED Package Files

The package contains the following files:

File                        Description
INSTALLATION                Installation instructions
README.md                   Documentation and usage instructions
catboost_model_server.pkl   Pickled CatBoost prediction model
antioxipred.py              Main Python script used to run the prediction
pfeature_comp.py            Python script used to extract the PAAC feature
AOPP.test.2023.fasta        Example FASTA file containing peptide sequences
blast_db/                   Database used for BLAST similarity search

Project details


This version

1.1

Download files

Source Distribution

antioxipred-1.1.tar.gz (8.4 kB view details)

Uploaded Source

Built Distribution

antioxipred-1.1-py3-none-any.whl (8.1 kB view details)

Uploaded Python 3

File details

Details for the file antioxipred-1.1.tar.gz.

File metadata

  • Download URL: antioxipred-1.1.tar.gz
  • Upload date:
  • Size: 8.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for antioxipred-1.1.tar.gz
Algorithm     Hash digest
SHA256        7194ed639fe4e8183bbd170d957518f0386fc233f00bbdeafa793f6fc0571a54
MD5           572cba886118f02cfbaf380ac2843273
BLAKE2b-256   2ec8cd6b482cfc3711113f6c152653f9cfe2bacb0e9975c275320e92a6885a59

File details

Details for the file antioxipred-1.1-py3-none-any.whl.

File metadata

  • Download URL: antioxipred-1.1-py3-none-any.whl
  • Upload date:
  • Size: 8.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for antioxipred-1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        d1c584b629c4e1ed8dd3a80007e3e202eff17fec389d8a615b77e2d32266d3bd
MD5           a3dc19814f3701ee15161a377b0f1318
BLAKE2b-256   ddfe2feafdcfc30498fb64e173a1c5a1315ecbee9440185f4935dea335831d53
