MHC Binding Predictor
mhcflurry
MHC I ligand prediction package with competitive accuracy and a fast and documented implementation.
2.3.0 release candidate
2.3.0 is currently a release candidate (2.3.0rc1), not yet a final
release. It keeps the same API and pre-trained models as 2.2.x. Install it by
pinning the version:
pip install mhcflurry==2.3.0rc1
For now, pip install --upgrade mhcflurry still installs the latest stable
release (2.2.x), because pip skips pre-releases unless you pin the version or
pass --pre. Once 2.3.0 is released, pip install --upgrade mhcflurry will
upgrade to it as usual.
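The selection behaviour described above can be illustrated with a small sketch. This is not pip's actual resolver: the `parse` and `pick_version` helpers are hypothetical, the regular expression covers only a simplified subset of PEP 440 (numeric releases with an optional `a`, `b`, or `rc` suffix), and pip's fallback to pre-releases when no stable release exists is left out.

```python
import re

# Simplified version pattern: "2.2.0", "2.3.0rc1", "1.0b2", ...
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")


def parse(version):
    """Return (release_tuple, pre_release_key, is_prerelease)."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    if match.group(2) is None:
        # A final release sorts after every pre-release of the same number.
        return release, (3, 0), False
    rank = {"a": 0, "b": 1, "rc": 2}[match.group(2)]
    return release, (rank, int(match.group(3))), True


def pick_version(available, pinned=None, allow_pre=False):
    """Mimic the default choice: newest stable unless pinned or allow_pre."""
    if pinned is not None:
        return pinned if pinned in available else None
    candidates = [v for v in available if allow_pre or not parse(v)[2]]
    return max(candidates, key=lambda v: parse(v)[:2], default=None)


available = ["2.2.0", "2.3.0rc1"]
print(pick_version(available))                      # plain upgrade
print(pick_version(available, allow_pre=True))      # like passing --pre
print(pick_version(available, pinned="2.3.0rc1"))   # like pinning the version
```

With only a stable release and a release candidate available, the unpinned call selects the stable one, which mirrors why an explicit pin or `--pre` is needed for now.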
2.3.0 adds speed and tooling for people who train their own models or run large prediction jobs:
- Training keeps data on the GPU for the whole fit, avoiding per-batch host/device copies.
- mhcflurry-predict, mhcflurry-predict-scan, and mhcflurry-calibrate-percentile-ranks use all visible GPUs by default.
- mhcflurry-class1-train-pan-allele-models auto-tunes job and worker counts from the hardware, so the same command runs on a laptop, a single GPU, or an 8×A100 host.
- torch.compile, TF32, and matmul precision are available as flags on the training commands.
Version 2.2.0 switched the neural-network backend from TensorFlow/Keras to PyTorch and added Apple Silicon (MPS) support.
MHCflurry implements class I peptide/MHC binding affinity prediction. The current version provides pan-MHC I predictors supporting any MHC allele of known sequence. MHCflurry runs on Python 3.10+ using the PyTorch neural network library. It exposes command-line and Python library interfaces.
MHCflurry also includes two experimental predictors: an "antigen processing" predictor that attempts to model MHC allele-independent effects such as proteasomal cleavage, and a "presentation" predictor that integrates processing predictions with binding affinity predictions to give a composite "presentation score." Both models are trained on MHC ligands identified by mass spectrometry.
If you find MHCflurry useful in your research please cite:
T. O'Donnell, A. Rubinsteyn, U. Laserson. "MHCflurry 2.0: Improved pan-allele prediction of MHC I-presented peptides by incorporating antigen processing," Cell Systems, 2020. https://doi.org/10.1016/j.cels.2020.06.010
T. O'Donnell, A. Rubinsteyn, M. Bonsack, A. B. Riemer, U. Laserson, and J. Hammerbacher, "MHCflurry: Open-Source Class I MHC Binding Affinity Prediction," Cell Systems, 2018. https://doi.org/10.1016/j.cels.2018.05.014
Please file an issue if you have questions or encounter problems.
Have a bugfix or other contribution? We would love your help. See our contributing guidelines.
Try it now
You can generate MHCflurry predictions without any setup by running our Google Colaboratory notebook.
Installation (pip)
Install the package:
$ pip install mhcflurry
Download our datasets and trained models:
$ mhcflurry-downloads fetch
You can now generate predictions:
$ mhcflurry-predict \
--alleles HLA-A0201 HLA-A0301 \
--peptides SIINFEKL SIINFEKD SIINFEKQ \
--out /tmp/predictions.csv
Wrote: /tmp/predictions.csv
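The CSV written above can be post-processed with any tool. As a minimal sketch using only the Python standard library, the hypothetical helper `rows_below` keeps rows whose value in a numeric column is at or below a cutoff. The default column name `mhcflurry_affinity` and the cutoff of 500 are assumptions for illustration, so check the header of your own output file before relying on them.

```python
import csv


def rows_below(csv_path, column="mhcflurry_affinity", threshold=500.0):
    """Return the rows whose numeric `column` is at or below `threshold`.

    The column name and threshold are illustrative assumptions, not values
    documented here; pass the ones that match your output file.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if column not in (reader.fieldnames or []):
            raise KeyError(
                f"{column!r} not found; columns are {reader.fieldnames}"
            )
        return [row for row in reader if float(row[column]) <= threshold]
```

For example, `rows_below("/tmp/predictions.csv")` would return a list of dictionaries, one per retained row, keyed by the CSV header names.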
Or scan protein sequences for potential epitopes:
$ mhcflurry-predict-scan \
--sequences MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS \
--alleles HLA-A*02:01 \
--out /tmp/predictions.csv
Wrote: /tmp/predictions.csv
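Conceptually, scanning a protein means scoring every subsequence of the peptide lengths of interest. The sketch below only enumerates those candidate peptides and does not call MHCflurry. The function name `enumerate_peptides` is hypothetical, and the default length range of 8 to 11 is an assumption; see `mhcflurry-predict-scan --help` for the lengths the tool actually uses.

```python
def enumerate_peptides(sequence, lengths=range(8, 12)):
    """Yield (offset, peptide) for every window of each length in `lengths`.

    `lengths` defaults to 8..11 as an assumption for illustration.
    """
    for length in lengths:
        # A sequence shorter than `length` yields no windows for that length.
        for offset in range(len(sequence) - length + 1):
            yield offset, sequence[offset:offset + length]


sequence = "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS"
peptides = list(enumerate_peptides(sequence))
print(len(peptides), "candidate peptides; first:", peptides[0])
```

Each candidate is paired with its zero-based offset so that a prediction can be traced back to its position in the protein.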
Unified mhcflurry parent command
Starting in 2.3.0 there is also a single mhcflurry command that dispatches
to every subcommand:
$ mhcflurry predict \
--alleles HLA-A0201 HLA-A0301 \
--peptides SIINFEKL SIINFEKD SIINFEKQ \
--out /tmp/predictions.csv
$ mhcflurry compare-models \
--a results/new_run/ \
--b public \
--out results/comparison/
$ mhcflurry plot-model-comparison --input results/comparison/
Every historical command is reachable as a subcommand
(mhcflurry-predict ↔ mhcflurry predict, mhcflurry-downloads ↔
mhcflurry downloads, mhcflurry-class1-train-pan-allele-models ↔
mhcflurry class1-train-pan-allele-models, etc.). Both forms run the
same underlying entry point; the legacy mhcflurry-* scripts remain
installed as compat shims and are not changing. mhcflurry --help
lists every available subcommand.
The two new-in-2.3.0 model-comparison tools, compare-models and
plot-model-comparison, only have the unified form.
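The naming convention between the two forms can be summarised in a few lines. This sketch captures only the name mapping described above, not how the package implements dispatch. The helpers `to_unified` and `to_legacy` are hypothetical and do not check that a subcommand actually exists.

```python
# Tools that, per the text above, exist only in the unified form.
UNIFIED_ONLY = {"compare-models", "plot-model-comparison"}

LEGACY_PREFIX = "mhcflurry-"


def to_unified(legacy_command):
    """Map 'mhcflurry-predict' to the argv prefix ['mhcflurry', 'predict']."""
    if not legacy_command.startswith(LEGACY_PREFIX):
        raise ValueError(f"not a legacy mhcflurry script: {legacy_command!r}")
    return ["mhcflurry", legacy_command[len(LEGACY_PREFIX):]]


def to_legacy(subcommand):
    """Map 'predict' to 'mhcflurry-predict'; None if no legacy script exists."""
    if subcommand in UNIFIED_ONLY:
        return None
    return LEGACY_PREFIX + subcommand


print(to_unified("mhcflurry-downloads"))
print(to_legacy("compare-models"))
```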
See the documentation for more details.
Development and tests
From a checkout, source develop.sh to create and activate the editable
environment:
$ source develop.sh
For quick feedback, run lint plus a focused unit subset:
$ ./lint.sh
$ pytest -q test/test_amino_acid.py test/test_random_negative_peptides.py
pytest test/ is the full test suite, not a fast unit-only loop. It includes
small end-to-end training runs, command subprocess tests, public-model smoke
tests that require cached MHCflurry download bundles, and speed/regression
checks, so it can take many minutes. Use
pytest -q test -m "not slow and not downloads" for the broad fast tier, and
pytest -q test --durations=25 when auditing slow tests. See the
testing documentation for
the current test tiers.
Docker
You can also try the latest (GitHub master) version of MHCflurry using the Docker image hosted on Docker Hub:
$ docker run -p 9999:9999 --rm openvax/mhcflurry:latest
This starts a Jupyter notebook server in an
environment that has MHCflurry installed. Go to http://localhost:9999 in a
browser to use it.
To build the Docker image yourself, from a checkout run:
$ docker build -t mhcflurry:latest .
$ docker run -p 9999:9999 --rm mhcflurry:latest
Predicted sequence motifs
Sequence logos for the binding motifs learned by MHCflurry BA are available here.
Common issues and fixes
Problems downloading data and models
Some users have reported HTTP connection issues when using mhcflurry-downloads fetch. As a workaround, you can download the data manually (e.g. using wget) and then use mhcflurry-downloads just to copy the data to the right place.
To do this, first get the URL(s) of the downloads you need using mhcflurry-downloads url:
$ mhcflurry-downloads url models_class1_presentation
https://github.com/openvax/mhcflurry/releases/download/1.6.0/models_class1_presentation.20200205.tar.bz2
Then make a directory and download the needed files to this directory:
$ mkdir downloads
$ wget --directory-prefix downloads https://github.com/openvax/mhcflurry/releases/download/1.6.0/models_class1_presentation.20200205.tar.bz2
HTTP request sent, awaiting response... 200 OK
Length: 72616448 (69M) [application/octet-stream]
Saving to: 'downloads/models_class1_presentation.20200205.tar.bz2'
Now call mhcflurry-downloads fetch with the --already-downloaded-dir option to indicate that the downloads should be retrieved from the specified directory:
$ mhcflurry-downloads fetch models_class1_presentation --already-downloaded-dir downloads
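The three steps above can also be scripted. The following is a minimal sketch, assuming the `mhcflurry-downloads` commands behave as shown in this section. The helper names are hypothetical, and `urllib` stands in for `wget` (any downloader that works on your network will do). `manual_fetch` is defined but not called here, since it needs network access and an MHCflurry installation.

```python
import os
import subprocess
import urllib.parse
import urllib.request


def local_path_for(url, directory):
    """Path where the archive at `url` is saved: `directory` + URL basename."""
    filename = os.path.basename(urllib.parse.urlparse(url).path)
    return os.path.join(directory, filename)


def fetch_command(name, directory):
    """Build the argv for the final step shown above."""
    return ["mhcflurry-downloads", "fetch", name,
            "--already-downloaded-dir", directory]


def manual_fetch(name, directory="downloads"):
    """Look up the URL(s), download them, then hand them to mhcflurry."""
    result = subprocess.run(
        ["mhcflurry-downloads", "url", name],
        check=True, capture_output=True, text=True,
    )
    os.makedirs(directory, exist_ok=True)
    # The command may print more than one URL, separated by whitespace.
    for url in result.stdout.split():
        urllib.request.urlretrieve(url, local_path_for(url, directory))
    subprocess.run(fetch_command(name, directory), check=True)
```

Calling `manual_fetch("models_class1_presentation")` would then reproduce the manual workflow in a single step.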