In silico mining of encrypted antimicrobial peptides from proteomes
decryptAMP
Bioinformatics tool for the identification and prediction of encrypted Antimicrobial Peptides (ecAMPs) from proteome data.
decryptAMP is an end-to-end pipeline that mines proteomes for encrypted antimicrobial peptides (ecAMPs). It performs in silico proteolytic digestion, computes 22 physicochemical and compositional descriptors per peptide, and classifies each peptide using AMPidentifier (a tuned soft-voting ensemble of five base classifiers). All results are saved with a complete provenance manifest (JSON) and a self-contained HTML report.
Table of contents
- Quick start
- Pipeline overview
- Installation
- Usage
- Output layout
- Scientific notes
- AMPidentifier models
- Troubleshooting
- Testing
- Citation
Quick start
pip install decryptamp
# Run on the bundled E. coli K-12 MG1655 demo proteome (4298 proteins)
decryptamp
Outputs land in results/bacteria/:
results/bacteria/
├── encrypted_peptides_results.csv # ecAMP candidates with 22 features + probability
├── encrypted_peptides_results_manifest.json # full provenance (versions, hashes, counts, parameters)
├── encrypted_peptides_results_report.html # human-readable summary
└── encrypted_peptides_results_dedup_stats.txt # deduplication breakdown
results/bacteria.zip # compressed archive of the run directory
Pipeline overview
proteome FASTA
│
▼ in silico digestion (trypsin / chymotrypsin / caspase / pseudoenzyme)
encrypted peptides (8-50 aa, canonical residues only)
│
▼ 22 physicochemical + compositional descriptors (AMPidentifier)
feature matrix
│
▼ exact deduplication (always) + optional CD-HIT clustering
unique encrypted peptides
│
▼ AMPidentifier classifier (voting / rf / svm / gb / xgb / lgbm)
ecAMP candidates above the decision threshold
Installation
PyPI
Installing from PyPI is the recommended route. Python ≥ 3.10 is required.
pip install decryptamp
The package includes all AMPidentifier model weights (~63 MB). No additional downloads are needed.
Optional: install cd-hit for sequence-identity deduplication (--dedup-cdhit):
brew install cd-hit # macOS
sudo apt-get install cd-hit # Debian/Ubuntu
conda install -c bioconda cd-hit # conda
Local install from source
Requirements: Python ≥ 3.10, optional cd-hit binary for --dedup-cdhit.
git clone https://github.com/madsondeluna/decryptAMP.git
cd decryptAMP
python -m venv venv
source venv/bin/activate # Linux/macOS (Windows: venv\Scripts\activate)
pip install .
Optional, for CD-HIT clustering:
brew install cd-hit # macOS
sudo apt-get install cd-hit # Debian/Ubuntu
conda install -c bioconda cd-hit # conda
Tested versions
The bundled AMPidentifier weights were generated and validated against the package versions below. pyproject.toml declares minimum constraints for installation flexibility, but if a .pkl fails to deserialize or numeric results drift unexpectedly, pin to these exact versions:
| Package | Version |
|---|---|
| Python | 3.13.7 |
| biopython | 1.86 |
| joblib | 1.5.2 |
| lightgbm | 4.6.0 |
| modlamp | 4.3.2 |
| numpy | 2.3.4 |
| pandas | 2.3.3 |
| scikit-learn | 1.8.0 |
| scipy | 1.16.3 |
| tqdm | 4.67.1 |
| xgboost | 3.2.0 |
| pytest (dev only) | 9.0.3 |
Quick install of the exact tested set:
pip install \
biopython==1.86 joblib==1.5.2 lightgbm==4.6.0 modlamp==4.3.2 \
numpy==2.3.4 pandas==2.3.3 scikit-learn==1.8.0 scipy==1.16.3 \
tqdm==4.67.1 xgboost==3.2.0
Native shell alias (no Docker)
After pip install . inside a virtual environment, the decryptamp entry point is placed at <venv>/bin/decryptamp. To call it from any directory without activating the environment, add an alias pointing to that binary. Adjust the path to where you cloned the repository.
# Linux/macOS (zsh)
echo "alias decryptamp='/abs/path/to/decryptAMP/venv/bin/decryptamp'" >> ~/.zshrc
source ~/.zshrc
# Linux/macOS (bash)
echo "alias decryptamp='/abs/path/to/decryptAMP/venv/bin/decryptamp'" >> ~/.bashrc
source ~/.bashrc
After that, the tool is available everywhere:
cd /any/working/dir
decryptamp --input myproteome.faa --high-discovery-mode
# Output goes to ./results/myproteome/ in the current working directory.
Docker
The bundled Dockerfile is multi-stage, slim, and includes cd-hit and the AMPidentifier model weights.
docker build -t decryptamp .
Run the demo proteome (results land in ./results on the host):
docker run --rm -v "$PWD/results:/work/results" decryptamp
Run on your own proteome (mounted read-only):
docker run --rm \
-v "/abs/path/to/proteomes:/data:ro" \
-v "$PWD/results:/work/results" \
decryptamp --input /data/myproteome.faa --model voting --high-discovery-mode
Pass any decryptAMP flag after the image name; it is forwarded directly to the decryptamp entry point.
Docker shell alias
For an experience identical to the native install, add an alias that bind-mounts the current working directory as /data inside the container. Both the input FASTA and the results/ output directory then resolve transparently to your host CWD.
# Linux/macOS (zsh)
echo "alias decryptamp='docker run --rm -v \"\$PWD:/data\" -w /data decryptamp'" >> ~/.zshrc
source ~/.zshrc
# Linux/macOS (bash)
echo "alias decryptamp='docker run --rm -v \"\$PWD:/data\" -w /data decryptamp'" >> ~/.bashrc
source ~/.bashrc
Use exactly like the native command:
cd /any/working/dir
decryptamp --input myproteome.faa --high-discovery-mode
# Output appears in ./results/myproteome/ on the host.
This alias keeps containers ephemeral (--rm) and produces no Docker-specific footprint in the output directory; files end up owned by your host user on macOS and Linux.
Usage
Command-line interface
usage: decryptamp [-h] [--input FASTA] [--output NAME] [--results-dir DIR]
[--force] [--workers N]
[--enzyme {trypsin,chymotrypsin,caspase,pseudoenzyme}]
[--model {voting,rf,svm,gb,xgb,lgbm}] [--threshold FLOAT]
[--high-discovery-mode] [--no-prediction]
[--dedup-cdhit FLOAT] [--keep-redundant] [--list-thresholds]
Mine encrypted antimicrobial peptides (ecAMPs) from proteome data.
input / output:
--input FASTA proteome FASTA (default: bundled E. coli demo)
--output NAME output CSV name or explicit path
--results-dir DIR parent dir for run outputs (default: results)
--force overwrite the run directory if it exists
--workers N parallel worker processes (default: 8)
digestion:
--enzyme cleavage rule (default: trypsin)
prediction:
--model AMPidentifier model (default: voting)
--threshold FLOAT decision threshold (default: per-model MCC-optimized)
--high-discovery-mode override threshold to 0.9 (high precision)
--no-prediction skip prediction; save all unique peptides with features only
deduplication:
--dedup-cdhit FLOAT optional CD-HIT clustering at this identity (e.g. 0.95)
--keep-redundant also save the pre-deduplication CSV
utilities:
--list-thresholds print per-model MCC thresholds and exit
Run `decryptamp --help` to see the live grouped help in your terminal (with
ANSI colours when stdout is a TTY).
| Flag | Default | Description |
|---|---|---|
| `--input PATH` | bundled E. coli demo | Input proteome FASTA. Aborts with a clear error if the file looks like nucleotide data (>90% A/C/G/T/U/N). Reports duplicate IDs and suffixes them with `__dup1`, `__dup2`, etc., without losing data. |
| `--output NAME` | `encrypted_peptides_results.csv` | Output CSV name. If it has no path separator, the file is placed inside the run directory (see `--results-dir`). If it contains a path (e.g. `/tmp/x.csv`), the path is respected literally. |
| `--results-dir DIR` | `results` | Parent directory for run outputs. A subdirectory named after the input filename (without FASTA extension) is created inside. |
| `--force` | off | Overwrite the run directory if it already exists. Without this flag, decryptAMP aborts with a clear error to prevent accidental data loss. |
| `--workers N` | `os.cpu_count()` | Parallel worker processes for digestion. |
| `--enzyme {trypsin,chymotrypsin,caspase,pseudoenzyme}` | `trypsin` | In silico cleavage rule. See Scientific notes for the regex of each enzyme. |
| `--model {voting,rf,svm,gb,xgb,lgbm}` | `voting` | AMPidentifier model. The voting ensemble (Acc=92.9%, AUC=0.977, MCC=0.859 on validation) is recommended. |
| `--threshold FLOAT` | per-model MCC-optimized | Decision threshold for `ecAMP_Probability`. If omitted, uses the AMPidentifier MCC-optimized threshold for the selected model (e.g. 0.56 for voting). |
| `--high-discovery-mode` | off | Override the threshold with the high-precision discovery setting (0.9). Reduces false positives at the cost of recall. Calibrated for voting; emits a warning when used with other models. Ignored if `--threshold` is given explicitly. |
| `--no-prediction` | off | Skip the AMPidentifier prediction step. Saves all unique encrypted peptides with their 22 features only. |
| `--dedup-cdhit FLOAT` | off | Apply CD-HIT clustering at the given identity threshold (e.g. 0.95) after exact deduplication. Requires the `cd-hit` binary in PATH. |
| `--keep-redundant` | off | Also save the pre-deduplication CSV (one row per peptide occurrence) as `<output>_redundant.csv`. |
| `--list-thresholds` | off | Print the per-model MCC-optimized threshold table and exit without running the pipeline. |
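The two input checks described for `--input` can be illustrated with a minimal sketch. It assumes the nucleotide fraction is computed over the whole file and that the first occurrence of an ID keeps its original name; the function names are hypothetical and not part of the decryptAMP API.

```python
from collections import Counter

NUCLEOTIDE_LETTERS = set("ACGTUN")


def looks_like_nucleotide(sequences, cutoff=0.90):
    """Return True if more than `cutoff` of all residues are A/C/G/T/U/N."""
    total = sum(len(s) for s in sequences)
    if total == 0:
        return False
    hits = sum(1 for s in sequences for ch in s.upper() if ch in NUCLEOTIDE_LETTERS)
    return hits / total > cutoff


def suffix_duplicate_ids(ids):
    """Rename repeated IDs as id__dup1, id__dup2, ... (first occurrence unchanged)."""
    seen = Counter()
    renamed = []
    for rec_id in ids:
        n = seen[rec_id]
        renamed.append(rec_id if n == 0 else f"{rec_id}__dup{n}")
        seen[rec_id] += 1
    return renamed
```

A protein FASTA will normally fall well below the 90% cutoff because A, C, G, T and N together cover only a small part of the amino-acid alphabet.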
Examples with bacteria.faa
The bundled demo proteome (bacteria.faa) is a 4298-protein RefSeq proteome of Escherichia coli str. K-12 substr. MG1655. Numbers below are reproducible with the default seeds and the AMPidentifier weights shipped in this repository.
1. Default run (trypsin + voting + MCC threshold)
decryptamp
Run directory: /abs/path/results/bacteria
Selected enzyme for digestion: Trypsin
Loading proteome from: /path/to/decryptamp/example-data/bacteria.faa
Successfully loaded 4298 protein sequences (1330117 aa total).
Organism (consensus): Escherichia coli str. K-12 substr. MG1655
Source database: RefSeq
Computing AMPidentifier features for 257845 peptides...
Deduplicating 257845 encrypted peptides...
Exact dedup: 257845 -> 251756 (2.36% reduction).
Predicting AMP activity with AMPidentifier (VOTING)...
AMPidentifier model loaded: VOTING (threshold=0.56, 22 features).
Found 25784 ecAMPs (out of 251756 unique encrypted peptides) with ecAMP_Probability >= 0.56.
| metric | value |
|---|---|
| Proteins input | 4 298 |
| Encrypted peptides generated | 257 845 |
| After exact deduplication | 251 756 |
| ecAMPs predicted (threshold 0.56) | 25 784 |
| Yield per protein | 6.00 |
| Yield per kb of proteome | 19.39 |
2. High-precision discovery (threshold 0.9)
decryptamp --high-discovery-mode
Use this when downstream synthesis or screening is expensive and you want to triage only the highest-confidence candidates. The voting ensemble's cutoff moves from its MCC-optimized threshold (0.56) to a fixed 0.9.
| metric | default (0.56) | --high-discovery-mode (0.9) |
|---|---|---|
| ecAMPs predicted | 25 784 | 2 711 |
| Yield per protein | 6.00 | 0.63 |
| Yield per kb of proteome | 19.39 | 2.04 |
3. Use a single base classifier instead of the ensemble
decryptamp --model rf --threshold 0.7
Available models with their MCC-optimized thresholds:
| Model | MCC-optimized threshold | Notes |
|---|---|---|
| `voting` | 0.56 | Soft-voting ensemble (recommended) |
| `rf` | 0.56 | Random Forest |
| `svm` | 0.47 | Support Vector Machine (RBF) |
| `gb` | 0.55 | Gradient Boosting |
| `xgb` | 0.48 | XGBoost |
| `lgbm` | 0.71 | LightGBM |
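How the threshold flags interact can be summarised in a short sketch. The precedence (explicit `--threshold` first, then `--high-discovery-mode`, then the per-model default) follows the flag descriptions above, and the source labels mirror those recorded in the manifest. The function itself is hypothetical, not decryptAMP's internal code.

```python
# Per-model MCC-optimized thresholds, copied from the table above.
MCC_THRESHOLDS = {
    "voting": 0.56, "rf": 0.56, "svm": 0.47,
    "gb": 0.55, "xgb": 0.48, "lgbm": 0.71,
}
HIGH_DISCOVERY_THRESHOLD = 0.9


def resolve_threshold(model, threshold=None, high_discovery=False):
    """Return (threshold, source) following the documented precedence."""
    if threshold is not None:
        # An explicit --threshold wins over --high-discovery-mode.
        return threshold, "user-override"
    if high_discovery:
        return HIGH_DISCOVERY_THRESHOLD, "high-discovery"
    return MCC_THRESHOLDS[model], "mcc-optimized"
```

For example, `resolve_threshold("rf", threshold=0.7)` corresponds to the command shown in this section and returns `(0.7, "user-override")`.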
4. Try a different enzyme
decryptamp --enzyme chymotrypsin
decryptamp --enzyme caspase # cleaves after D (aspartic acid)
decryptamp --enzyme pseudoenzyme # random control, fixed seed=42
The pseudoenzyme setting generates non-overlapping fragments of length sampled uniformly from [8, 50] using a fixed-seed RNG (seed=42) for reproducibility. It serves as a negative control to demonstrate that biological enzyme cleavage is non-random.
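A minimal sketch of this kind of seeded random fragmentation is shown below. It assumes the RNG is created per call and that a trailing remainder shorter than 8 residues is dropped; the actual decryptAMP implementation may seed once per run or handle the tail differently.

```python
import random


def pseudo_digest(sequence, min_len=8, max_len=50, seed=42):
    """Cut `sequence` into consecutive, non-overlapping fragments of random length."""
    rng = random.Random(seed)  # fixed seed makes the fragmentation reproducible
    fragments = []
    pos = 0
    while len(sequence) - pos >= min_len:
        length = rng.randint(min_len, max_len)  # uniform over [min_len, max_len]
        fragment = sequence[pos:pos + length]   # may be shorter than drawn near the end
        fragments.append((pos + 1, fragment))   # 1-based start position
        pos += len(fragment)
    return fragments
```

Because the seed is fixed, calling the function twice on the same sequence returns the same fragments, which is what makes the control comparable across runs.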
5. Remove near-duplicate peptides with CD-HIT
decryptamp --dedup-cdhit 0.95
After exact deduplication, near-duplicates differing in 1-2 residues (e.g. missed-cleavage variants of the same core) are collapsed at the given identity threshold. Output gains Cluster_ID, Cluster_Size, and Cluster_Members columns. Typical reduction on bacterial proteomes is 60-80% at 0.95 identity.
6. Audit redundancy before deduplication
decryptamp --dedup-cdhit 0.95 --keep-redundant
Adds <output>_redundant.csv with one row per peptide occurrence (before any dedup), useful for tracing each ecAMP back to all source proteins and start positions.
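The relationship between the redundant table and the deduplicated one can be sketched as a simple group-by on the peptide sequence. The column names follow the main CSV; the semicolon separator for positions and the input tuple layout are assumptions made for illustration.

```python
def collapse_occurrences(occurrences):
    """Collapse (peptide, protein_id, start) occurrences into one row per peptide."""
    grouped = {}
    for peptide, protein_id, start in occurrences:
        entry = grouped.setdefault(peptide, {"proteins": [], "positions": []})
        entry["proteins"].append(protein_id)
        entry["positions"].append(str(start))
    rows = []
    for peptide, entry in grouped.items():
        rows.append({
            "Peptide": peptide,
            "Multiplicity": len(entry["proteins"]),
            "Source_Proteins": ";".join(entry["proteins"]),
            "Source_Positions": ";".join(entry["positions"]),  # parallel to proteins
        })
    return rows
```

The i-th entry of `Source_Positions` belongs to the i-th entry of `Source_Proteins`, so every occurrence stays traceable after deduplication.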
7. Skip prediction (feature-only mode)
decryptamp --no-prediction
Computes the 22 features for every unique encrypted peptide and saves them without filtering. Useful for downstream analyses (PCA, UMAP, clustering, custom classifiers).
8. Override output destination
decryptamp --output /tmp/my_results.csv
When --output contains a path separator, the run directory is not managed automatically. Sibling artifacts (manifest, HTML report, dedup stats) are written next to the CSV.
9. Multiple proteomes side by side
decryptamp --input proteomes/Ecoli.faa
decryptamp --input proteomes/Athaliana.faa
decryptamp --input proteomes/Hsapiens.faa
Each produces its own subdirectory under results/ (Ecoli/, Athaliana/, Hsapiens/), so multiple proteomes coexist without overwriting each other.
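The naming rules for the run directory and the output CSV can be approximated as follows. The set of recognised FASTA extensions is an assumption (the source only says "without FASTA extension"), and both helper names are hypothetical.

```python
import os
from pathlib import Path

FASTA_EXTENSIONS = {".faa", ".fasta", ".fa"}  # assumed; the real list may differ


def run_directory(input_path, results_dir="results"):
    """results/<input_stem>/, where the stem drops a trailing FASTA extension."""
    path = Path(input_path)
    stem = path.stem if path.suffix.lower() in FASTA_EXTENSIONS else path.name
    return Path(results_dir) / stem


def resolve_output(output, run_dir):
    """Bare names go inside the run directory; names with a separator are literal."""
    if "/" in output or os.sep in output:
        return Path(output)
    return Path(run_dir) / output
```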
10. Override the parent results directory
decryptamp --input data/myproteome.faa --results-dir /scratch/runs --force
Useful in HPC setups where outputs should land outside the working directory.
11. Full feature combination on bacteria.faa
A reference command exercising every flag at once. Useful as a smoke test of a fresh installation.
decryptamp \
--output ecoli_k12_full.csv \
--results-dir results \
--force \
--workers 8 \
--enzyme trypsin \
--model voting \
--high-discovery-mode \
--dedup-cdhit 0.95 \
--keep-redundant
This will generate, inside results/bacteria/:
ecoli_k12_full.csv # high-confidence ecAMPs with 22 features
ecoli_k12_full.fasta # same candidates as FASTA, score in header
ecoli_k12_full_manifest.json # full provenance
ecoli_k12_full_report.html # one-page HTML summary
ecoli_k12_full_dedup_stats.txt # exact + CD-HIT 0.95 breakdown
ecoli_k12_full_redundant.csv # pre-deduplication CSV (one row per occurrence)
Expected (rounded) on the bundled E. coli K-12 MG1655 demo:
| stage | count |
|---|---|
| Input proteins | 4 298 |
| Encrypted peptides generated | 257 845 |
| After exact deduplication | 251 756 |
| After CD-HIT @ 0.95 | ~50-80 thousand |
| ecAMPs (voting + threshold 0.9) | a few hundred to ~1 thousand |
Output layout
By default every run creates results/<input_stem>/:
results/<input_stem>/
├── encrypted_peptides_results.csv # main output, full feature table (always)
├── encrypted_peptides_results.fasta # ecAMP sequences with score in header (always)
├── encrypted_peptides_results_manifest.json # full provenance JSON (always)
├── encrypted_peptides_results_report.html # self-contained HTML report (always)
├── encrypted_peptides_results_dedup_stats.txt # dedup breakdown (always)
├── encrypted_peptides_results_failed.csv # only if any peptide was dropped
└── encrypted_peptides_results_redundant.csv # only if --keep-redundant
results/<input_stem>.zip # compressed archive of the run directory (always)
The FASTA file is ready for downstream tools (alignment, BLAST, structure prediction) and for synthesis ordering. Header format:
>ecAMP_000001 ecAMP_score=0.9876 source=NP_414543.1:682 multiplicity=1 length=11
KLLILARETGR
>ecAMP_000002 ecAMP_score=0.9742 source=NP_414544.1:35 multiplicity=3 length=18
KWKLFKKIEKVGQNVRDG
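Headers in this format are easy to parse downstream. The sketch below assumes every field after the ID is a whitespace-separated `key=value` token and that `source` has the form `<protein_id>:<start>`, as in the two examples above; `parse_ecamp_header` is a hypothetical helper.

```python
def parse_ecamp_header(header):
    """Parse a decryptAMP FASTA header line into a dictionary."""
    fields = header.lstrip(">").split()
    record = {"id": fields[0]}
    for token in fields[1:]:
        key, _, value = token.partition("=")
        record[key] = value
    # Split on the last colon so protein IDs containing ':' are left intact.
    protein, _, start = record["source"].rpartition(":")
    record["source_protein"] = protein
    record["source_start"] = int(start)
    record["ecAMP_score"] = float(record["ecAMP_score"])
    record["multiplicity"] = int(record["multiplicity"])
    record["length"] = int(record["length"])
    return record
```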
The main CSV contains, for each ecAMP candidate:
| Column | Meaning |
|---|---|
| `Peptide` | amino-acid sequence (8-50 aa, canonical residues only) |
| `Length` | number of residues |
| `Multiplicity` | number of times this peptide was generated across the proteome |
| `Source_Proteins` | semicolon-separated list of source protein IDs |
| `Source_Positions` | parallel list of 1-based start positions |
| `Cluster_ID` | CD-HIT cluster ID (only if `--dedup-cdhit` was used) |
| `Cluster_Size` | number of peptides in the cluster (only with `--dedup-cdhit`) |
| `Cluster_Members` | semicolon-separated peptide sequences in the cluster |
| `Charge`, `pI`, `InstabilityInd`, ... | the 22 AMPidentifier features |
| `ecAMP_Probability` | model probability of being an ecAMP (range 0-1) |
| `ecAMP_Prediction` | binary call (1 if probability ≥ threshold, else 0) |
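Because the probability is stored per row, candidates can be re-ranked or re-filtered at a stricter cutoff without rerunning the pipeline. The sketch below uses only the standard library and the `ecAMP_Probability` column named above; the function name and default cutoff are illustrative.

```python
import csv


def top_candidates(csv_path, min_probability=0.9):
    """Return rows with ecAMP_Probability >= cutoff, highest probability first."""
    with open(csv_path, newline="") as handle:
        rows = [
            row for row in csv.DictReader(handle)
            if float(row["ecAMP_Probability"]) >= min_probability
        ]
    rows.sort(key=lambda row: float(row["ecAMP_Probability"]), reverse=True)
    return rows
```

Note that a stricter cutoff can only be applied to peptides already present in the CSV, so a lower cutoff than the one used in the run requires a new run (or `--no-prediction` output scored separately).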
The 22 features
| Group | Count | Names |
|---|---|---|
| Global descriptors (modlAMP) | 6 | Charge, pI, InstabilityInd, AliphaticInd, BomanInd, HydrophRatio |
| Hydrophobic moment (modlAMP, Eisenberg, angle 100°) | 1 | HydrophobicMoment |
| Grouped amino-acid composition | 9 | f_acidic, f_basic, f_polar, f_nonpolar, f_aliphatic, f_aromatic, f_charged, f_small, f_tiny |
| Free Energy of Transition local (D1) | 3 | FET_low_D1, FET_mid_D1, FET_high_D1 |
| Solvent accessibility local (D1) | 3 | SA_buried_D1, SA_exposed_D1, SA_inter_D1 |
Charges are computed at pH 7.0 with amide=True (matching the AMPidentifier training convention).
Manifest JSON
Every run writes a complete _manifest.json covering tool version, git commit, full command line, input file SHA-256, proteome organism and source database (extracted from FASTA headers), digestion parameters, feature parameters, deduplication statistics, model SHA-256, decision threshold and its source (mcc-optimized / high-discovery / user-override / deprecated-min-prob), and SHA-256 of every output artifact.
A typical pipeline_summary block:
{
"n_proteins_input": 4298,
"n_encrypted_peptides_generated": 257845,
"n_encrypted_peptides_dropped_nonfinite": 0,
"n_encrypted_peptides_after_exact_dedup": 251756,
"n_encrypted_peptides_after_cdhit": null,
"n_ecamps_predicted": 25784,
"ecamps_yield_per_protein": 5.999069,
"ecamps_yield_per_kb_proteome": 19.386758
}
The manifest is sufficient to bit-identically reproduce the run from the same input.
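Before reproducing a run, it can be worth checking that the input file is the one the manifest describes. The helper below computes a SHA-256 digest with the standard library; comparing it against the value stored in the manifest is left to the reader, since the exact key names of the manifest are not shown here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large proteomes do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```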
HTML report
A self-contained HTML page (no JavaScript, no external resources, plain CSS) is written next to every CSV. It renders the manifest as a one-page summary with KPI cards, the proteome → encrypted-peptides → unique → ecAMPs flow, organism and source-database metadata extracted from the FASTA, and tables for every parameter used. Suitable for sharing with collaborators or attaching to a manuscript as supplementary material.
Open with any browser:
open results/bacteria/encrypted_peptides_results_report.html
The CSV and JSON outputs are structured as direct inputs for ecAMPdb, an open database of encrypted antimicrobial peptides covering organisms from all six kingdoms and viruses.
Scientific notes
Canonical residues only. Peptides containing any residue outside the 20 canonical amino acids (A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y) are discarded silently during digestion. This avoids the silent feature bias that arises when ambiguous codes (X, B, Z, J, U, O) are substituted with arbitrary canonical residues.
Enzymatic cleavage rules.
| Enzyme | Regex | Description |
|---|---|---|
| `trypsin` | `(?<=[RK])(?!P)` | Cleaves after R or K, not before P |
| `chymotrypsin` | `(?<=[FWY])(?!P)` | Cleaves after F, W, Y, not before P |
| `caspase` | `(?<=D)` | Cleaves after any D (aspartic acid) |
| `pseudoenzyme` | random, seed=42 | Negative control: uniform random fragmentation |
Length filter. Generated peptides are kept only if 8 ≤ length ≤ 50 (configurable in src/decryptamp/config.py).
Missed cleavages. Up to 2 missed cleavages allowed by default (configurable in src/decryptamp/config.py).
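Taken together, the cleavage regexes, the missed-cleavage allowance, the length filter and the canonical-residue filter can be sketched in a few lines. This is an illustration under stated assumptions, not the code in decryptAMP's digestion module: the `digest` function and its signature are hypothetical, and missed cleavages are modelled as joining up to `missed_cleavages` additional consecutive fragments.

```python
import re

CLEAVAGE_RULES = {
    "trypsin": r"(?<=[RK])(?!P)",
    "chymotrypsin": r"(?<=[FWY])(?!P)",
    "caspase": r"(?<=D)",
}
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")


def digest(sequence, enzyme="trypsin", missed_cleavages=2, min_len=8, max_len=50):
    """Return (peptide, 1-based start) pairs for one protein sequence."""
    # Zero-width split; drop the empty string produced at the C-terminus.
    fragments = [f for f in re.split(CLEAVAGE_RULES[enzyme], sequence) if f]
    starts, pos = [], 1
    for fragment in fragments:
        starts.append(pos)
        pos += len(fragment)
    peptides = []
    for i in range(len(fragments)):
        for extra in range(missed_cleavages + 1):
            if i + extra >= len(fragments):
                break
            peptide = "".join(fragments[i:i + extra + 1])
            # Keep only peptides in the length window made of canonical residues.
            if min_len <= len(peptide) <= max_len and set(peptide) <= CANONICAL:
                peptides.append((peptide, starts[i]))
    return peptides
```

With `min_len=1` and `missed_cleavages=0`, the toy sequence `AKRPGKC` splits into `AK`, `RPGK` and `C`: the R at position 3 is followed by P, so trypsin does not cut there.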
Charge calculation. pH=7.0, amide=True. The amidation flag matches the AMPidentifier training convention; many natural AMPs (for example cecropins and melittin) are C-terminally amidated in vivo, which removes the negative charge of the C-terminal carboxylate and raises the net charge by about +1 at neutral pH.
Hydrophobic moment. Computed with the Eisenberg scale and a 100° angle (canonical α-helix amphipathicity).
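For intuition, the hydrophobic moment at a fixed angle can be computed as the length of the vector sum of per-residue hydrophobicities placed 100° apart, normalised by peptide length. The sketch below uses Eisenberg consensus values as commonly tabulated and scores the whole peptide in one window; modlAMP may use sliding windows or a different normalisation, so treat its output as illustrative rather than as the value decryptAMP reports.

```python
import math

# Eisenberg consensus hydrophobicity scale (commonly tabulated values).
EISENBERG = {
    "A": 0.62, "R": -2.53, "N": -0.78, "D": -0.90, "C": 0.29,
    "Q": -0.85, "E": -0.74, "G": 0.48, "H": -0.40, "I": 1.38,
    "L": 1.06, "K": -1.50, "M": 0.64, "F": 1.19, "P": 0.12,
    "S": -0.18, "T": -0.05, "W": 0.81, "Y": 0.26, "V": 1.08,
}


def hydrophobic_moment(peptide, angle_deg=100.0):
    """Whole-peptide hydrophobic moment per residue at the given angle."""
    delta = math.radians(angle_deg)
    sin_sum = sum(EISENBERG[aa] * math.sin(i * delta) for i, aa in enumerate(peptide))
    cos_sum = sum(EISENBERG[aa] * math.cos(i * delta) for i, aa in enumerate(peptide))
    return math.hypot(sin_sum, cos_sum) / len(peptide)
```

A peptide made of a single residue type spread evenly around the helix has a moment near zero, whereas a helix with hydrophobic residues on one face and charged residues on the other scores high.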
Failure handling. Peptides whose feature vector contains any NaN or Inf value are dropped before classification and logged to <output>_failed.csv. The classifier itself raises ValueError if NaN/Inf reaches it (defense in depth). Zero-vectors are never silently fed to the model.
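The two layers of this check can be sketched as a filter followed by a guard. Rows are represented here as plain dictionaries of floats and the function names are hypothetical; the real pipeline works on its own feature table.

```python
import math


def split_finite(rows, feature_names):
    """Separate rows with all-finite features from rows containing NaN/Inf."""
    kept, failed = [], []
    for row in rows:
        is_finite = all(math.isfinite(row[name]) for name in feature_names)
        (kept if is_finite else failed).append(row)
    return kept, failed


def assert_finite(rows, feature_names):
    """Second line of defense: refuse to classify non-finite feature vectors."""
    bad = sum(
        1 for row in rows
        if not all(math.isfinite(row[name]) for name in feature_names)
    )
    if bad:
        raise ValueError(f"received {bad} rows with NaN/Inf in feature columns")
```

The `failed` list corresponds to what would be written to the failed-peptides CSV, and `assert_finite` should never fire if the upstream filter ran.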
Reproducibility. All randomness is seeded (pseudoenzyme: 42). Per-model MCC-optimized thresholds are loaded from src/ampidentifier/models/threshold_<model>.txt. The manifest records SHA-256 of input, model file, and output CSV.
AMPidentifier models
The classifier is the bundled AMPidentifier (vendored under src/ampidentifier/). The voting ensemble is a soft average of five base learners, each tuned via 5-fold StratifiedKFold and RandomizedSearchCV (n_iter=50, scoring='roc_auc').
| Model | Accuracy | AUC-ROC | MCC | Notes |
|---|---|---|---|---|
| Voting (default) | 92.9% | 0.977 | 0.859 | Soft-voting ensemble of the five below |
| Random Forest | 91.9% | 0.972 | 0.839 | |
| Support Vector Machine (RBF) | 91.9% | 0.969 | 0.839 | Uses StandardScaler |
| Gradient Boosting | 92.0% | 0.974 | 0.839 | |
| XGBoost | 92.2% | 0.974 | 0.843 | |
| LightGBM | 92.7% | 0.975 | 0.855 | |
Metrics computed on a 20% holdout of the AMPidentifier training set (13 246 peptides total, balanced 6 623 AMP / 6 623 non-AMP).
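Soft voting, as described above, averages the class probabilities of the base learners before applying the decision threshold. The sketch below assumes an unweighted mean; whether the bundled ensemble uses weights is not stated here, and the function names are illustrative.

```python
def soft_vote(probabilities):
    """Unweighted mean of the base learners' AMP probabilities."""
    return sum(probabilities) / len(probabilities)


def classify(probabilities, threshold=0.56):
    """Return (ensemble probability, binary call) for one peptide."""
    probability = soft_vote(probabilities)
    return probability, int(probability >= threshold)
```

Averaging probabilities rather than counting votes lets one confident learner outweigh several marginal ones, which is why the ensemble's calibrated threshold differs from those of the individual models.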
Troubleshooting
Error: 'X.faa' looks like a nucleotide sequence — The input FASTA contains too many A/C/G/T/U/N residues to be a protein. Translate it first (e.g. Prodigal, six-frame translation) or pass a protein FASTA.
Error: run directory '...' already exists and is not empty — Pass --force to overwrite, --results-dir to write elsewhere, or --output PATH (with separators) to fully override.
cd-hit binary not found in PATH — Install CD-HIT (brew install cd-hit, apt-get install cd-hit, conda install -c bioconda cd-hit) or omit --dedup-cdhit.
AmpPredictor received N rows with NaN/Inf in feature columns — A feature calculation produced non-finite values for some peptides. The orchestrator should have dropped them upstream; this error indicates a bug. Check <output>_failed.csv for context and please open an issue.
Warning: --high-discovery-mode applies a fixed threshold of 0.9 calibrated for the voting ensemble — You combined --high-discovery-mode with a non-voting model. The 0.9 cutoff is calibrated for voting; per-model probability distributions differ. For per-model calibrated cutoffs use --threshold explicitly.
Sklearn version warning when loading models — The bundled .pkl files were trained with scikit-learn ≥ 1.8.0. Older versions still load but may produce slightly different numeric results in edge cases. pip install --upgrade scikit-learn to silence.
Testing
A pytest suite covers the scientific contract of the digestion module, the AMP classifier input validation, and the manifest schema. The default invocation runs only the fast unit tests; opt-in flags expand coverage.
pip install ".[dev]" # only needs pytest
# Default: fast unit tests, no model loading (~3 s, 49 tests)
pytest
# Add the slow tests that load the AMPidentifier weights (~30 s)
pytest --run-slow
# Full suite, including end-to-end runs against bacteria.faa (~10 min)
pytest --run-all
Test layout:
| File | Coverage | Marker |
|---|---|---|
| `tests/test_peptide_processor.py` | enzyme regexes (trypsin/chymotrypsin/caspase), canonical-AA filter, pseudoenzyme determinism, missed cleavages, 1-based positions | none (fast) |
| `tests/test_amp_predictor.py` | NaN/Inf input validation, missing feature columns, MCC threshold values per model | mostly fast; model-loading tests marked `@slow` |
| `tests/test_manifest.py` | JSON schema completeness, SHA-256 validity, `--no-prediction` handling | none (fast) |
Citation
If decryptAMP supports your research, please cite:
Luna-Aragão, M. A., da Silva, R. L., Santos, D. E., Pacífico, J., & Benko-Iseppon, A. M. decryptAMP: A bioinformatics tool for the identification and prediction of encrypted Antimicrobial Peptides (ecAMPs) from proteome data.
Repository: https://github.com/madsondeluna/decryptAMP