AmyloDeep: pLM-based ensemble model for predicting amyloid propensity from the amino acid sequence
AmyloDeep
AmyloDeep is a Python package that uses an ensemble model to predict amyloidogenic regions in protein sequences with a rolling window approach.
Features
- Multi-model ensemble: Combines 5 different models for robust predictions
- Rolling window analysis: Analyzes sequences using sliding windows of configurable size
- Easy-to-use API: Simple Python interface and command-line tool
- Web interface: A light version of the tool is available at amylodeep.com
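The rolling window idea can be illustrated without the package. The sketch below only shows how full-length windows might be enumerated; `iter_windows` is a hypothetical helper, not part of AmyloDeep, and the 0-based start index is an assumption. The window count it produces, `len(sequence) - window_size + 1`, matches the number of bars drawn in the Quick Start plot.

```python
def iter_windows(sequence, window_size=10):
    """Yield (start, window) pairs for every full-length window (0-based start)."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    # A sequence shorter than window_size yields no windows at all.
    for start in range(len(sequence) - window_size + 1):
        yield start, sequence[start:start + window_size]


windows = list(iter_windows("MKTFFFLLLLFTIGF", window_size=10))
print(len(windows))   # 15 - 10 + 1 = 6
print(windows[0])     # (0, 'MKTFFFLLLL')
```

Each window would then be scored by the ensemble; how the five models are combined is not described here, so that step is left out of the sketch.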
Installation
From PyPI (recommended)
pip install amylodeep
From source
git clone https://github.com/AlisaDavtyan/protein_classification.git
cd protein_classification
pip install .
Quick Start
from amylodeep import predict_ensemble_rolling
# Predict amyloid propensity for a protein sequence
sequence = "MKTFFFLLLLFTIGFCYVQFSKLKLENLHFKDNSEGLKNGGLQRQLGLTLKFNSNSLHHTSNL"
window_size = 10
result = predict_ensemble_rolling(sequence, window_size=window_size)
print(f"Average probability: {result['avg_probability']:.4f}")
print(f"Maximum probability: {result['max_probability']:.4f}")
# Access position-wise probabilities
for position, probability in result['position_probs']:
    print(f"Position {position}: {probability:.4f}")
# Plot probabilities
import numpy as np
import matplotlib.pyplot as plt
positions, probs = zip(*result['position_probs'])
x = np.arange(0, len(sequence) - window_size + 1)
bar_colors = [
    (0, 0, 1, 0.8) if p > 0.8 else (0, 0, 1, 0.6) if p > 0.5 else (0, 0, 1, 0.2)
    for p in probs
]
fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(x, probs, color=bar_colors, width=1, edgecolor="black")
ax.set_ylabel("Probability", fontsize=12)
ax.set_xlabel("Residue", fontsize=12)
L = len(sequence)
ax.set_xlim(-1, L - window_size + 1)
if L < 100:
    ax.set_xticks(np.arange(0, L+1, 5))
else:
    step = int(np.ceil(L/5/10) * 10)
    tick_labels = np.arange(0, L+1, step)
    tick_positions = np.minimum(tick_labels, L - window_size)
    ax.set_xticks(tick_positions)
    ax.set_xticklabels([str(t) for t in tick_labels])
ax.axhline(y=0.5, color='green', linestyle='--', alpha=0.7)
ax.axhline(y=0.8, color='red', linestyle='--', alpha=0.7)
ax.tick_params(axis='both', labelsize=12)
ax.set_title('Amyloidogenicity probability per window', fontsize=16)
plt.show()
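Per-window probabilities can also be turned into candidate regions by merging consecutive windows that score above a cutoff. The following is a minimal sketch under several assumptions: `position_probs` is a list of `(position, probability)` tuples as described under Main Functions, positions are consecutive window start indices, and 0.5 (the lower dashed line in the plot) is used as the cutoff. `high_scoring_regions` is a hypothetical helper, not part of the package.

```python
def high_scoring_regions(position_probs, window_size=10, threshold=0.5):
    """Merge consecutive windows scoring above `threshold` into (start, end) spans.

    `start` is the first window position of a run; `end` is the last residue
    covered by the final window of that run (inclusive), using the same
    indexing as the input positions.
    """
    regions = []
    run_start = prev = None
    for pos, prob in position_probs:
        if prob > threshold:
            if run_start is None:
                run_start = pos
            elif pos != prev + 1:
                # Positions are not adjacent: close the previous run first.
                regions.append((run_start, prev + window_size - 1))
                run_start = pos
            prev = pos
        elif run_start is not None:
            # A low-scoring window ends the current run.
            regions.append((run_start, prev + window_size - 1))
            run_start = None
    if run_start is not None:
        regions.append((run_start, prev + window_size - 1))
    return regions


# Toy values for illustration only (not real model output).
toy = [(0, 0.2), (1, 0.7), (2, 0.9), (3, 0.4), (4, 0.3), (5, 0.1), (6, 0.6)]
print(high_scoring_regions(toy, window_size=3))  # [(1, 4), (6, 8)]
```

Because windows overlap, neighbouring spans can share residues; whether to merge such spans is a choice left to the user.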
Command Line Interface
# Basic prediction
amylodeep "MKTFFFLLLLFTIGFCYVQFSKLKLENLHFKDNSEGLKNGGLQRQLGLTLKFNSNSLHHTSNL"
# With custom window size
amylodeep "SEQUENCE" --window-size 10
# Save results to file
amylodeep "SEQUENCE" --output results.json --format json
# CSV output
amylodeep "SEQUENCE" --output results.csv --format csv
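When working from the Python API instead of the command line, a comparable CSV file can be written by hand. This is only a sketch: the column names `position` and `probability` and the four-decimal formatting are assumptions and may not match the layout the CLI produces. `write_position_probs_csv` is a hypothetical helper.

```python
import csv
import io


def write_position_probs_csv(position_probs, handle):
    """Write (position, probability) pairs as CSV rows with a header line."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["position", "probability"])  # assumed column names
    for position, probability in position_probs:
        writer.writerow([position, f"{probability:.4f}"])


# Toy values for illustration only (not real model output).
buffer = io.StringIO()
write_position_probs_csv([(0, 0.1234), (1, 0.9)], buffer)
print(buffer.getvalue())
```

To write to disk, pass a file opened with `open("results.csv", "w", newline="")` in place of the in-memory buffer.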
Requirements
- torch>=1.12.0
- transformers>=4.30.0
- huggingface_hub>=0.14.0
- xgboost>=1.7.0
- numpy>=1.20
- pandas>=1.3
- scikit-learn>=1.0
- jax-unirep>=2.0.0
Main Functions
predict_ensemble_rolling(sequence, window_size=10)
Predict amyloid propensity for a protein sequence using rolling window analysis.
Parameters:
- sequence (str): Protein sequence (amino acid letters)
- window_size (int): Size of the rolling window (default: 10)
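It may help to sanity-check inputs before calling the predictor. The sketch below assumes that only the 20 standard amino acid letters are expected and that a sequence should be at least `window_size` residues long; the package may handle other letters or short sequences on its own, so treat this as an optional pre-check. `check_sequence` is a hypothetical helper.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_sequence(sequence, window_size=10):
    """Return a whitespace-free, upper-case sequence or raise ValueError."""
    cleaned = "".join(sequence.split()).upper()
    unexpected = sorted(set(cleaned) - STANDARD_AMINO_ACIDS)
    if unexpected:
        raise ValueError(f"Unexpected residue letters: {', '.join(unexpected)}")
    if len(cleaned) < window_size:
        raise ValueError(
            f"Sequence length {len(cleaned)} is shorter than window_size {window_size}"
        )
    return cleaned


print(check_sequence("mktfff llllftigf", window_size=10))  # MKTFFFLLLLFTIGF
```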
Returns: Dictionary containing:
- position_probs: List of (position, probability) tuples
- avg_probability: Average probability across all windows
- max_probability: Maximum probability across all windows
- sequence_length: Length of the input sequence
- num_windows: Number of windows analyzed
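The relationship between these fields can be sketched with plain Python. This is not the package's implementation; `summarize` is a hypothetical helper that rebuilds a dictionary of the documented shape from a list of per-window probabilities, assuming the average and maximum are taken directly over the window scores.

```python
def summarize(position_probs, sequence_length):
    """Build a result-like dictionary from (position, probability) tuples."""
    probs = [prob for _, prob in position_probs]
    if not probs:
        raise ValueError("at least one window is required")
    return {
        "position_probs": list(position_probs),
        "avg_probability": sum(probs) / len(probs),
        "max_probability": max(probs),
        "sequence_length": sequence_length,
        "num_windows": len(probs),
    }


# Toy values for illustration only (not real model output).
summary = summarize([(0, 0.25), (1, 0.75)], sequence_length=11)
print(summary["avg_probability"], summary["max_probability"], summary["num_windows"])
# 0.5 0.75 2
```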
License
This project is licensed under the MIT License; see the LICENSE file for details.
Citation
If you use AmyloDeep in your research, please cite:
@software{amylodeep2025,
title={AmyloDeep: pLM-based ensemble model for predicting amyloid propensity from the amino acid sequence},
author={Alisa Davtyan and Anahit Khachatryan and Rafayel Petrosyan},
year={2025},
url={https://github.com/AlisaDavtyan/protein_classification}
}
Support
For questions and support:
- Open an issue on GitHub
- Contact: alisadavtyan7@gmail.com, xachatryan96an@gmail.com