Mutant peptide ranking for personalized cancer vaccines
vaxrank
Selection of mutated protein fragments for therapeutic personalized cancer vaccines.
Usage
vaxrank \
--vcf tests/data/b16.f10/b16.vcf \
--bam tests/data/b16.f10/b16.combined.bam \
--vaccine-peptide-length 25 \
--mhc-predictor netmhc \
--mhc-alleles H2-Kb,H2-Db \
--padding-around-mutation 5 \
--output-ascii-report vaccine-peptides.txt \
--output-pdf-report vaccine-peptides.pdf \
--output-html-report vaccine-peptides.html
Using a YAML Configuration File
You can specify common parameters in a YAML configuration file to avoid repeating them on every run:
vaxrank --config my_config.yaml --vcf variants.vcf --bam tumor.bam
Example my_config.yaml:
epitopes:
  min_score: 0.00001  # drop epitopes below this score
  scoring_mode: affinity  # "affinity" or "percentile_rank"
  logistic_midpoint: 350.0  # IC50 (nM) at which score = 0.5
  logistic_width: 150.0  # steepness of logistic curve
  affinity_cutoff: 5000.0  # IC50 >= this → score 0
  percentile_rank_cutoff: 10.0  # rank >= this → score 0 (percentile mode)
  top_epitopes_per_candidate: 1000  # 0 = keep all
vaccine_peptides:
  preferred_length: 25  # target amino acids per vaccine peptide
  min_length: 25  # minimum vaccine peptide length
  max_length: 25  # maximum vaccine peptide length
  padding_around_mutation: 5  # off-centre windows to consider
  per_mutation: 1  # peptides to keep per variant
  max_epitopes_per_candidate: 1000  # 0 = keep all
  score_fraction_of_best: 0.99  # drop candidates scoring < 99% of best
manufacturability:  # GRAVY = mean hydropathy
  max_c_terminal_hydropathy: 1.5  # max GRAVY of C-terminal 7-mer
  min_kmer_hydropathy: 0.0  # min max-7mer GRAVY (floor)
  max_kmer_hydropathy_low_priority: 1.5  # low-priority max-7mer GRAVY cap
  max_kmer_hydropathy_high_priority: 2.5  # high-priority max-7mer GRAVY cap
CLI arguments override values from the config file. You can also use
--config-value to override any config value without editing the file:
vaxrank --config my_config.yaml \
--config-value vaccine_peptides.score_fraction_of_best=0.95 \
--config-value epitopes.percentile_rank_cutoff=5.0
Use --config-text when the right-hand side should be kept as a raw
string instead of being YAML-parsed.
Installation
Vaxrank can be installed using pip:
pip install vaxrank
Requirements: Python 3.9+
Note: to generate PDF reports, you first need to install wkhtmltopdf, which you can do (on macOS) like so:
brew install --cask wkhtmltopdf
Vaxrank uses PyEnsembl for accessing information about the reference genome. You must install an Ensembl release corresponding to the reference genome associated with the mutations provided to Vaxrank.
Example for GRCh38 (adjust release to match your reference):
pyensembl install --release 113 --species human
Example for GRCh37 (legacy):
pyensembl install --release 75 --species human
If your variants were called from alignments against hg19, you can still use GRCh37, but you should ignore mitochondrial variants because the hg19 mitochondrial sequence (chrM) differs from the one used by GRCh37.
Features
Reference Proteome Filtering
Vaxrank filters out peptides that exist in the reference proteome to focus on truly novel mutant sequences. This uses a set-based kmer index for O(1) membership testing. The index is built once and cached locally for subsequent runs.
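The idea of a set-based k-mer index can be illustrated with a minimal sketch. This is not Vaxrank's own code: the function names `build_kmer_index` and `filter_novel` are hypothetical, the toy sequences are made up, and the on-disk caching mentioned above is left out. It only shows why a Python set gives constant-time (average case) membership checks once the index has been built.

```python
from typing import Iterable, List, Set


def build_kmer_index(protein_sequences: Iterable[str], k: int) -> Set[str]:
    """Collect every length-k substring of every reference protein."""
    index: Set[str] = set()
    for sequence in protein_sequences:
        for start in range(len(sequence) - k + 1):
            index.add(sequence[start:start + k])
    return index


def filter_novel(peptides: Iterable[str], index: Set[str]) -> List[str]:
    """Keep only peptides that do not occur in the reference index."""
    # Each `in` check is a hash lookup, so it is O(1) on average.
    return [peptide for peptide in peptides if peptide not in index]


# Toy "reference proteome" (hypothetical sequences, for illustration only).
reference = ["MKTAYIAKQR", "GSHMKTAY"]
index = build_kmer_index(reference, k=8)

candidates = ["MKTAYIAK", "MKTAYIAR"]  # second one carries a K>R change
novel = filter_novel(candidates, index)
print(novel)
```

Building the index costs time proportional to the total proteome length, which is why caching it between runs pays off.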
Cancer Hotspot Annotation
Vaxrank annotates variants that occur at known cancer mutation hotspots using bundled data from cancerhotspots.org (Chang et al. 2016, 2017). This helps identify clinically relevant mutations. The hotspot data includes ~2,700 recurrently mutated positions across cancer types.
MHC Binding Prediction
Vaxrank integrates with MHC binding predictors via mhctools. Use --mhc-predictor <name> to select one:
| --mhc-predictor | Tool | MHC Class | Notes |
|---|---|---|---|
| mhcflurry | MHCflurry | I | Open-source neural network; installed with mhctools |
| bigmhc | BigMHC | I | Auto-detects EL or IM model |
| bigmhc-el | BigMHC EL | I | Presentation (eluted ligand) model |
| bigmhc-im | BigMHC IM | I | Immunogenicity model |
| pepsickle | Pepsickle | I | Proteasomal cleavage predictor |
| netmhc | NetMHC | I | Auto-detects NetMHC3 or NetMHC4 |
| netmhc3 | NetMHC 3.x | I | Requires local install |
| netmhc4 | NetMHC 4.0 | I | Requires local install |
| netmhcpan | NetMHCpan | I | Auto-detects installed version |
| netmhcpan28 | NetMHCpan 2.8 | I | Requires local install |
| netmhcpan3 | NetMHCpan 3.x | I | Requires local install |
| netmhcpan4 | NetMHCpan 4.0 | I | Default mode (EL + BA) |
| netmhcpan4-ba | NetMHCpan 4.0 | I | Binding affinity mode only |
| netmhcpan4-el | NetMHCpan 4.0 | I | Eluted ligand mode only |
| netmhcpan41 | NetMHCpan 4.1 | I | Default mode (EL + BA) |
| netmhcpan41-ba | NetMHCpan 4.1 | I | Binding affinity mode only |
| netmhcpan41-el | NetMHCpan 4.1 | I | Eluted ligand mode only |
| netmhcpan42 | NetMHCpan 4.2 | I | Default mode (EL + BA) |
| netmhcpan42-ba | NetMHCpan 4.2 | I | Binding affinity mode only |
| netmhcpan42-el | NetMHCpan 4.2 | I | Eluted ligand mode only |
| netmhccons | NetMHCcons | I | Requires local install |
| netmhcstabpan | NetMHCstabpan | I | Stability predictor; requires local install |
| netchop | NetChop | -- | Proteasomal cleavage predictor |
| netmhciipan | NetMHCIIpan | II | Auto-detects installed version |
| netmhciipan3 | NetMHCIIpan 3.x | II | Requires local install |
| netmhciipan4 | NetMHCIIpan 4.0 | II | Default mode (EL + BA) |
| netmhciipan4-ba | NetMHCIIpan 4.0 | II | Binding affinity mode only |
| netmhciipan4-el | NetMHCIIpan 4.0 | II | Eluted ligand mode only |
| netmhciipan43 | NetMHCIIpan 4.3 | II | Default mode (EL + BA) |
| netmhciipan43-ba | NetMHCIIpan 4.3 | II | Binding affinity mode only |
| netmhciipan43-el | NetMHCIIpan 4.3 | II | Eluted ligand mode only |
| mixmhcpred | MixMHCpred | I | Requires local install |
| netmhcpan-iedb | NetMHCpan via IEDB | I | Uses IEDB web API |
| netmhccons-iedb | NetMHCcons via IEDB | I | Uses IEDB web API |
| netmhciipan-iedb | NetMHCIIpan via IEDB | II | Uses IEDB web API |
| smm-iedb | SMM via IEDB | I | Uses IEDB web API |
| smm-pmbec-iedb | SMM-PMBEC via IEDB | I | Uses IEDB web API |
| random | Random | -- | Returns random scores; for testing only |
Paper & Citation
The original Vaxrank paper describes an earlier version of the software. The current codebase has been substantially rewritten since publication (updated configuration system, reference proteome filtering, cancer hotspot annotation, expanded predictor support, etc.), but the core algorithm for selecting neoantigen vaccine peptides remains the same.
Vaxrank: A Computational Tool For Designing Personalized Cancer Vaccines can be cited as:
@article {Rubinsteyn142919,
author = {Rubinsteyn, Alex and Hodes, Isaac and Kodysh, Julia and Hammerbacher, Jeffrey},
title = {Vaxrank: A Computational Tool For Designing Personalized Cancer Vaccines},
year = {2017},
doi = {10.1101/142919},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Therapeutic vaccines targeting mutant tumor antigens ({\textquotedblleft}neoantigens{\textquotedblright}) are an increasingly popular form of personalized cancer immunotherapy. Vaxrank is a computational tool for selecting neoantigen vaccine peptides from tumor mutations, tumor RNA data, and patient HLA type. Vaxrank is freely available at www.github.com/hammerlab/vaxrank under the Apache 2.0 open source license and can also be installed from the Python Package Index.},
URL = {https://www.biorxiv.org/content/early/2017/05/27/142919},
eprint = {https://www.biorxiv.org/content/early/2017/05/27/142919.full.pdf},
journal = {bioRxiv}
}
Development
To install Vaxrank for local development:
git clone git@github.com:openvax/vaxrank.git
cd vaxrank
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -e .
# Examples; adjust release to match your reference
pyensembl install --release 113 --species human
pyensembl install --release 113 --species mouse
Run linting and tests:
./lint.sh && ./test.sh
The first run of the tests may take a while to build the reference proteome kmer index, but subsequent runs will use the cached index.
Architecture
Configuration
Vaxrank uses msgspec frozen Struct objects for configuration, with all defaults centralised in vaxrank/config/defaults.py. Config values are resolved in order:
- Compiled-in defaults
- YAML config file (--config)
- --config-value / --config-text CLI overrides
- Dedicated CLI flags (e.g. --vaccine-peptide-length)
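The layering above can be sketched with plain dictionaries. This is an illustration, not Vaxrank's implementation (which uses msgspec Structs): the helper names `deep_merge`, `apply_dotted`, and `resolve` are hypothetical, and the sketch assumes that each later source in the list overrides the earlier ones.

```python
from typing import Any, Dict, Mapping


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `override` applied section by section."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_dotted(config: Mapping[str, Any], dotted_key: str, value: Any) -> Dict[str, Any]:
    """Return a copy of `config` with a `section.key` style path set to `value`."""
    head, _, rest = dotted_key.partition(".")
    updated = dict(config)
    if rest:
        updated[head] = apply_dotted(config.get(head, {}), rest, value)
    else:
        updated[head] = value
    return updated


def resolve(defaults, yaml_config, dotted_overrides, cli_flags):
    """Apply the four layers in order; assumed: later layers win."""
    resolved = deep_merge(defaults, yaml_config)
    for dotted_key, value in dotted_overrides.items():
        resolved = apply_dotted(resolved, dotted_key, value)
    return deep_merge(resolved, cli_flags)


defaults = {"vaccine_peptides": {"preferred_length": 25, "score_fraction_of_best": 0.99}}
yaml_config = {"vaccine_peptides": {"score_fraction_of_best": 0.97}}
dotted_overrides = {"vaccine_peptides.score_fraction_of_best": 0.95}
cli_flags = {"vaccine_peptides": {"preferred_length": 21}}

resolved = resolve(defaults, yaml_config, dotted_overrides, cli_flags)
print(resolved)
```

Returning new dictionaries instead of mutating the inputs mirrors the immutability that frozen Structs provide: the defaults stay untouched no matter how many layers are applied.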
EpitopeConfig — epitope scoring and filtering
| Field | Default | Description |
|---|---|---|
| logistic_epitope_score_midpoint | 350.0 | IC50 (nM) at which epitope score = 0.5 |
| logistic_epitope_score_width | 150.0 | Steepness of logistic scoring curve |
| min_epitope_score | 0.00001 | Epitopes scoring below this are dropped |
| binding_affinity_cutoff | 5000.0 | IC50 >= this → score 0 |
| scoring_mode | "affinity" | "affinity" (IC50-based) or "percentile_rank" |
| percentile_rank_cutoff | 10.0 | Rank >= this → score 0 (percentile mode) |
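To show how the affinity-mode fields interact, here is a minimal sketch of a logistic score over IC50. It assumes a plain logistic curve that equals 0.5 at the midpoint, as the table describes. Vaxrank's actual function may differ in detail (for example, it may normalize the curve), and the function names here are chosen for illustration only. Percentile-rank mode is not shown.

```python
import math


def logistic_epitope_score(
    ic50_nm: float,
    midpoint: float = 350.0,
    width: float = 150.0,
    affinity_cutoff: float = 5000.0,
) -> float:
    """Map an IC50 (nM) to a score in [0, 1); lower IC50 gives a higher score."""
    if ic50_nm >= affinity_cutoff:
        # Binders at or beyond the cutoff contribute nothing.
        return 0.0
    # Assumed form: a decreasing logistic centred on `midpoint`.
    return 1.0 / (1.0 + math.exp((ic50_nm - midpoint) / width))


def keep_epitope(score: float, min_score: float = 0.00001) -> bool:
    """Epitopes scoring below `min_score` are dropped."""
    return score >= min_score


strong = logistic_epitope_score(50.0)
at_midpoint = logistic_epitope_score(350.0)  # 0.5 by construction
weak = logistic_epitope_score(1000.0)
beyond_cutoff = logistic_epitope_score(5000.0)  # 0.0
print(strong, at_midpoint, weak, beyond_cutoff)
```

A smaller width makes the transition around the midpoint sharper, so predictions just above and just below 350 nM are separated more aggressively.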
VaccineConfig — peptide assembly and manufacturability
| Field | Default | Description |
|---|---|---|
| preferred_peptide_length | 25 | Preferred amino acids per vaccine peptide |
| min_peptide_length | 25 | Minimum vaccine peptide length |
| max_peptide_length | 25 | Maximum vaccine peptide length |
| padding_around_mutation | 5 | Off-centre window positions to consider |
| max_vaccine_peptides_per_variant | 1 | Peptides to keep per variant |
| num_mutant_epitopes_to_keep | 1000 | Max epitope predictions per peptide (0 = all) |
| score_fraction_of_best | 0.99 | Drop candidates scoring below this fraction of the best |
| max_c_terminal_hydropathy | 1.5 | Max GRAVY score of the C-terminal 7-mer |
| min_kmer_hydropathy | 0.0 | Minimum max-7mer GRAVY (floor) |
| max_kmer_hydropathy_low_priority | 1.5 | Low-priority max-7mer GRAVY cap |
| max_kmer_hydropathy_high_priority | 2.5 | High-priority max-7mer GRAVY cap |
The four *_hydropathy* fields control the manufacturability tie-breaking
in vaccine peptide ranking. See VaccinePeptide.peptide_synthesis_difficulty_score_tuple
for details on how each threshold is applied.
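For readers unfamiliar with GRAVY, the quantities these thresholds are compared against can be sketched as follows. The sketch assumes the Kyte-Doolittle hydropathy scale, which is the usual basis for GRAVY. Whether Vaxrank uses exactly this scale, and how it applies each threshold, is not shown here, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """GRAVY: mean hydropathy over all residues of the peptide."""
    return sum(KYTE_DOOLITTLE[residue] for residue in peptide) / len(peptide)


def c_terminal_gravy(peptide: str, k: int = 7) -> float:
    """GRAVY of the last k residues (the C-terminal k-mer)."""
    return gravy(peptide[-k:])


def max_kmer_gravy(peptide: str, k: int = 7) -> float:
    """Highest GRAVY over all windows of length k in the peptide."""
    if len(peptide) <= k:
        return gravy(peptide)
    return max(
        gravy(peptide[start:start + k])
        for start in range(len(peptide) - k + 1)
    )


# Hypothetical peptide: hydrophilic N-terminal half, hydrophobic C-terminal half.
peptide = "KKKKKKKIIIIIII"
print(gravy(peptide), c_terminal_gravy(peptide), max_kmer_gravy(peptide))
```

In this example the C-terminal 7-mer is all isoleucine, so both the C-terminal GRAVY and the max-7mer GRAVY would exceed the default caps listed in the table.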
Key Modules
- reference_proteome.py: Set-based kmer index for checking if peptides exist in the reference proteome
- cancer_hotspots.py: Lookup for known cancer mutation hotspots
- epitope_logic.py: Epitope scoring and filtering logic
- core_logic.py: Main vaccine peptide selection algorithm
- report.py: Report generation (ASCII, HTML, PDF, XLSX)
Dependencies
Key dependencies:
- pyensembl: Reference genome annotation
- varcode: Variant effect prediction
- isovar: RNA-based variant calling
- mhctools: MHC binding prediction
- msgspec: Configuration serialization (YAML/JSON)
- pandas, numpy: Data processing
- jinja2, pdfkit: Report generation
Scripts
Helper scripts included in the repo:
- develop.sh: installs the package in editable mode and sets PYTHONPATH to the repo root.
- lint.sh: runs ruff on vaxrank and tests.
- test.sh: runs pytest with coverage.
- deploy.sh: runs lint/tests, builds a distribution with build, uploads via twine, and tags the release (vX.Y.Z). Deploy is restricted to the main/master branch.