Antioxidant peptide predictor
ANTIOXIPRED
A method for predicting antioxidant potential in peptides
Introduction
AntioxiPred is a computational framework that uses machine learning to predict the antioxidant potential of peptide sequences. The method combines a similarity-based search using the Basic Local Alignment Search Tool (BLAST) with a CatBoost (Categorical Boosting) classifier. The predictive model is trained on composition-based features, including Pseudo Amino Acid Composition (PAAC) and one-hot encoded sequence profiles, which capture both global compositional characteristics and positional sequence information.
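To make the two feature families more concrete, the following is a minimal sketch and not the package's own feature extraction (that lives in pfeature_comp.py). It assumes the 20 standard residues, uses plain amino acid composition as a simplified stand-in for PAAC (which additionally includes sequence-order terms), and pads or truncates peptides to a hypothetical `max_len` for the one-hot profile.

```python
# Illustrative only: simplified stand-ins for the features named above.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def amino_acid_composition(seq):
    """Fraction of each standard residue in seq (global, order-free)."""
    counts = [0] * len(AMINO_ACIDS)
    for aa in seq.upper():
        if aa in AA_INDEX:
            counts[AA_INDEX[aa]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(AMINO_ACIDS)


def one_hot(seq, max_len=20):
    """Flattened max_len x 20 one-hot profile (positional information).

    max_len is a hypothetical choice: shorter peptides are zero-padded,
    longer ones are truncated.
    """
    vector = [0.0] * (max_len * len(AMINO_ACIDS))
    for pos, aa in enumerate(seq.upper()[:max_len]):
        if aa in AA_INDEX:
            vector[pos * len(AMINO_ACIDS) + AA_INDEX[aa]] = 1.0
    return vector
```

The composition vector ignores residue order, while the one-hot vector records which residue sits at each position, which is why the two are complementary.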
Requirements
scikit-learn=1.6.1
Pandas
Numpy
Joblib
You can set up the environment using either requirements.txt (for pip users) or environment.yml (for Conda users).
Using requirements.txt
pip install -r requirements.txt
Using environment.yml
conda env create -f environment.yml
No additional package or tool is required for Model 1 (the default model).
The hybrid prediction mode (Model 2) requires the NCBI BLAST+ software.
BLAST executables are platform-specific. Therefore, users must download the appropriate version of NCBI BLAST+ from the official NCBI website:
https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
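Before selecting Model 2, it can help to confirm that a BLAST+ executable is reachable. The sketch below checks the system PATH for `blastp`, which ships with NCBI BLAST+. Whether AntioxiPred looks for BLAST on the PATH or at a specific location is not stated here, so treat this as an assumption and consult the INSTALLATION file.

```python
import shutil


def find_blast(executable="blastp"):
    """Return the full path of a BLAST+ executable found on PATH, or None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_blast()
    if path is None:
        print("blastp not found on PATH; Model 2 may not run.")
    else:
        print(f"blastp found at {path}")
```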
Minimum Usage
To see the options available for the standalone version, type the following command:
antioxipred -h
To run the example, type the following command:
antioxipred -f AOPP.test.2023.fasta -o output
Here, the -f argument specifies the input file in FASTA format and the -o argument specifies the path to the output directory. By default, the package uses model (-m) = 1, which classifies the peptide sequences using only the ML algorithm (Categorical Boosting) and writes a prediction file "classification_ml_(datetime).csv" to the specified output directory. If model (-m) = 2 is selected, the hybrid model (ML + BLAST) is used instead, and the prediction file is named "classification_hybrid(datetime).csv".
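Because the output file name carries a timestamp, a script that consumes the predictions has to locate the file first. The helper below is a hypothetical convenience, not part of the package: it picks the most recently modified CSV whose name starts with the prefix documented above for the chosen model. The exact timestamp format is not assumed.

```python
from pathlib import Path

# File name prefixes taken from the description above.
PREFIXES = {1: "classification_ml", 2: "classification_hybrid"}


def latest_prediction(output_dir, model=1):
    """Return the newest prediction CSV for the given model, or None."""
    if model not in PREFIXES:
        raise ValueError("model must be 1 or 2")
    candidates = sorted(
        Path(output_dir).glob(f"{PREFIXES[model]}*.csv"),
        key=lambda p: p.stat().st_mtime,  # order by modification time
    )
    return candidates[-1] if candidates else None
```

For example, `latest_prediction("output", model=2)` would return the most recent hybrid result in the `output` directory, if one exists.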
Full Usage
usage: antioxipred [-h] --file FILE --output OUTPUT [--model MODEL] [--threshold THRESHOLD]
Please provide following arguments for successful run
required arguments:
--file FILE, -f FILE Path to fasta file
--output OUTPUT, -o OUTPUT Path to output
optional arguments:
--model MODEL, -m MODEL Model selection: 1 for ML only, 2 for ML + BLAST (By default model = 1)
--threshold THRESHOLD, -t THRESHOLD Threshold for classification (can be any value between 0-1 for model = 1 (by default = 0.5) and 0-2 for model = 2 (by default = 0.52))
For help:
-h, --help show this help message and exit
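The option list above maps naturally onto Python's `argparse`. The following sketch mirrors the documented flags for readers who want to wrap or re-implement the interface. It is not the package's actual parser: the `int`/`float` types, the `choices` restriction, and leaving the threshold as `None` until a model-specific default is applied are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented antioxipred options."""
    parser = argparse.ArgumentParser(prog="antioxipred")
    parser.add_argument("--file", "-f", required=True,
                        help="Path to fasta file")
    parser.add_argument("--output", "-o", required=True,
                        help="Path to output")
    parser.add_argument("--model", "-m", type=int, choices=[1, 2], default=1,
                        help="1 for ML only, 2 for ML + BLAST")
    # None means "use the default threshold of the selected model".
    parser.add_argument("--threshold", "-t", type=float, default=None,
                        help="Threshold for classification")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```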
Standalone Minimum Usage
Run the program using:
python3 antioxipred.py -f AOPP.test.2023.fasta -o result
Arguments Description
Input File (-f)
Allows users to provide input peptide sequences in FASTA format.
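For reference, FASTA input consists of a header line starting with `>` followed by one or more sequence lines. The reader below is a minimal sketch for checking an input file before submitting it. It is not the parser used by antioxipred.py, and upper-casing the residues is an assumption.

```python
def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.upper())  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```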
Output Directory (-o)
Path to the directory in which the program saves the prediction results.
Model (-m)
Users can choose which model to run:
- model = 1 → Runs only the Machine Learning model (CatBoost classifier)
- model = 2 → Runs the Hybrid model (Machine Learning + BLAST)
Default: model = 1
Threshold (-t)
Users can provide a threshold for classification.
- For model = 1 → threshold range: 0 – 1 (default = 0.50)
- For model = 2 → threshold range: 0 – 2 (default = 0.50)
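Since the valid threshold range depends on the selected model, a wrapper script may want to validate the pair before launching a run. This is a small sketch under that assumption. `check_threshold` is a hypothetical helper, and the ranges are treated as inclusive at both ends.

```python
# Documented threshold ranges per model: (lowest, highest), assumed inclusive.
THRESHOLD_RANGES = {1: (0.0, 1.0), 2: (0.0, 2.0)}


def check_threshold(model, threshold):
    """Return threshold unchanged if it is valid for model, else raise."""
    if model not in THRESHOLD_RANGES:
        raise ValueError("model must be 1 or 2")
    low, high = THRESHOLD_RANGES[model]
    if not low <= threshold <= high:
        raise ValueError(
            f"threshold for model {model} must be between {low} and {high}"
        )
    return threshold
```

A value such as 1.5 is therefore accepted for model 2 but rejected for model 1.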
ANTIOXIPRED Package Files
The package contains the following files:
| File | Description |
|---|---|
| INSTALLATION | Installation instructions |
| README.md | Documentation and usage instructions |
| catboost_model_server.pkl | Pickled CatBoost prediction model |
| antioxipred.py | Main Python script used to run the prediction |
| pfeature_comp.py | Python script used to extract the PAAC feature |
| AOPP.test.2023.fasta | Example FASTA file containing peptide sequences |
| blast_db/ | Database used for BLAST similarity search |