
MHC Binding Predictor


mhcflurry

MHC I ligand prediction package with competitive accuracy and a fast and documented implementation.

If you find MHCflurry useful in your research please cite:

T. O'Donnell, A. Rubinsteyn, U. Laserson. "MHCflurry 2.0: Improved pan-allele prediction of MHC I-presented peptides by incorporating antigen processing," Cell Systems, 2020. https://doi.org/10.1016/j.cels.2020.06.010

T. O'Donnell, A. Rubinsteyn, M. Bonsack, A. B. Riemer, U. Laserson, and J. Hammerbacher, "MHCflurry: Open-Source Class I MHC Binding Affinity Prediction," Cell Systems, 2018. https://doi.org/10.1016/j.cels.2018.05.014

Please file an issue if you have questions or encounter problems.

Have a bugfix or other contribution? We would love your help. See our contributing guidelines.

2.3.0 release candidate

2.3.0 is currently a release candidate (2.3.0rc3), not yet a final release. It keeps the same API and pre-trained models as 2.2.x. Install it by pinning the version:

pip install mhcflurry==2.3.0rc3

For now, pip install --upgrade mhcflurry still installs the latest stable release (2.2.x), because pip skips pre-releases unless you pin the version or pass --pre. Once 2.3.0 is released, pip install --upgrade mhcflurry will upgrade to it as usual.
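The pre-release rule described above can be illustrated with a small sketch. This is a simplified stand-in for pip's version selection, not pip's actual code: the helper names (`sort_key`, `is_prerelease`, `pick_version`) are hypothetical, only plain `X.Y.Z` versions with optional `aN`/`bN`/`rcN` suffixes are handled, and the list of available versions is illustrative (`2.2.0` stands in for the latest stable 2.2.x release).

```python
import re

# Simplified version pattern: "2.2.0", "2.3.0rc3", "1.0b2", ...
_VERSION = re.compile(r"(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")


def sort_key(version):
    """Return a tuple that orders pre-releases before their final release."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release = tuple(int(part) for part in match.group(1).split("."))
    if match.group(2) is None:
        return (release, 1, "", 0)  # final release
    return (release, 0, match.group(2), int(match.group(3)))  # pre-release


def is_prerelease(version):
    return sort_key(version)[1] == 0


def pick_version(available, pinned=None, allow_pre=False):
    """Mimic `pip install pkg`, `pkg==pinned`, and `--pre` (simplified)."""
    if pinned is not None:
        # An exact pin selects that version, pre-release or not.
        return pinned if pinned in available else None
    candidates = [v for v in available if allow_pre or not is_prerelease(v)]
    if not candidates:
        # If only pre-releases exist, they become eligible.
        candidates = list(available)
    return max(candidates, key=sort_key)


# Illustrative index contents, not the real release list.
available = ["2.1.0", "2.2.0", "2.3.0rc3"]
print(pick_version(available))                     # plain upgrade
print(pick_version(available, allow_pre=True))     # with --pre
print(pick_version(available, pinned="2.3.0rc3"))  # pinned version
```

With this illustrative list, the plain upgrade selects the stable version, while `--pre` or an exact pin selects the release candidate.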

2.3.0 adds speed improvements and tooling for people who train their own models or run large prediction jobs:

  • Training keeps data on the GPU for the whole fit, avoiding per-batch host/device copies.
  • mhcflurry-predict, mhcflurry-predict-scan, and mhcflurry-calibrate-percentile-ranks use all visible GPUs by default.
  • mhcflurry-class1-train-pan-allele-models auto-tunes job and worker counts from the hardware, so the same command runs on a laptop, a single GPU, or an 8×A100 host.
  • torch.compile and matmul precision (including TF32) are available as flags on the training commands.

Try it now

You can generate MHCflurry predictions without any setup by running our Google Colaboratory notebook.

Installation (pip)

Install the package:

$ pip install mhcflurry

Download our datasets and trained models:

$ mhcflurry-downloads fetch

You can now generate predictions:

$ mhcflurry-predict \
       --alleles HLA-A0201 HLA-A0301 \
       --peptides SIINFEKL SIINFEKD SIINFEKQ \
       --out /tmp/predictions.csv

Wrote: /tmp/predictions.csv

Or scan protein sequences for potential epitopes:

$ mhcflurry-predict-scan \
        --sequences MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS \
        --alleles 'HLA-A*02:01' \
        --out /tmp/predictions.csv

Wrote: /tmp/predictions.csv
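Conceptually, scanning a protein means enumerating its sub-peptides and scoring each one. The sketch below shows only the enumeration step as a sliding window. It is not MHCflurry's implementation: `enumerate_peptides` is a hypothetical helper, and the default lengths 8 to 11 are an assumption based on typical MHC class I ligand lengths, not a statement about the command's defaults.

```python
def enumerate_peptides(sequence, lengths=(8, 9, 10, 11)):
    """Yield (start_offset, peptide) for every window of each length.

    `lengths` is an assumed default, chosen for illustration.
    """
    for k in lengths:
        # A sequence of length n has n - k + 1 windows of length k.
        for start in range(len(sequence) - k + 1):
            yield start, sequence[start:start + k]


# The same sequence passed to mhcflurry-predict-scan above.
sequence = "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS"
peptides = list(enumerate_peptides(sequence))
print(len(peptides), "candidate peptides")
print(peptides[0])
```

Each enumerated peptide would then be scored against the requested alleles, which is the part the real command handles.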

Unified mhcflurry parent command

Starting in 2.3.0 there is also a single mhcflurry command that dispatches to every subcommand:

$ mhcflurry predict \
        --alleles HLA-A0201 HLA-A0301 \
        --peptides SIINFEKL SIINFEKD SIINFEKQ \
        --out /tmp/predictions.csv

Every historical command is reachable as a subcommand (mhcflurry-predict → mhcflurry predict, mhcflurry-downloads → mhcflurry downloads, mhcflurry-class1-train-pan-allele-models → mhcflurry class1-train-pan-allele-models, etc.). Both forms run the same underlying entry point; the legacy mhcflurry-* scripts remain installed as compatibility shims and are not changing. mhcflurry --help lists every available subcommand.
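For scripts that still build legacy command lines, the naming pattern in the examples above can be applied mechanically. The following is a minimal sketch under that assumption: `to_parent_command` is a hypothetical helper, and the rule (strip the `mhcflurry-` prefix and use the remainder as the subcommand) is inferred from the examples given here; `mhcflurry --help` remains the authoritative list.

```python
def to_parent_command(argv):
    """Rewrite ['mhcflurry-predict', ...] as ['mhcflurry', 'predict', ...].

    Argument lists that do not start with a legacy mhcflurry-* script
    are returned unchanged (as a copy).
    """
    prefix = "mhcflurry-"
    if not argv or not argv[0].startswith(prefix):
        return list(argv)
    subcommand = argv[0][len(prefix):]
    return ["mhcflurry", subcommand] + list(argv[1:])


legacy = ["mhcflurry-predict", "--alleles", "HLA-A0201", "--peptides", "SIINFEKL"]
print(" ".join(to_parent_command(legacy)))
```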

See the documentation for more details.

Development and tests

From a checkout, source develop.sh to create and activate the editable environment:

$ source develop.sh

For quick feedback, run lint plus a focused unit subset:

$ ./lint.sh
$ pytest -q test/test_amino_acid.py test/test_random_negative_peptides.py

pytest test/ is the full test suite, not a fast unit-only loop. It includes small end-to-end training runs, command subprocess tests, public-model smoke tests that require cached MHCflurry download bundles, and speed/regression checks, so it can take many minutes. Use pytest -q test -m "not slow and not downloads" for the broad fast tier, and pytest -q test --durations=25 when auditing slow tests. See the testing documentation for the current test tiers.

Docker

You can also try the latest (GitHub master) version of MHCflurry using the Docker image hosted on Docker Hub by running:

$ docker run -p 9999:9999 --rm openvax/mhcflurry:latest

This will start a Jupyter notebook server in an environment that has MHCflurry installed. Go to http://localhost:9999 in a browser to use it.

To build the Docker image yourself, from a checkout run:

$ docker build -t mhcflurry:latest .
$ docker run -p 9999:9999 --rm mhcflurry:latest

Predicted sequence motifs

Sequence logos for the binding motifs learned by MHCflurry BA are available here.

Common issues and fixes

Problems downloading data and models

Some users have reported HTTP connection issues when using mhcflurry-downloads fetch. As a workaround, you can download the data manually (e.g. using wget) and then use mhcflurry-downloads just to copy the data to the right place.

To do this, first get the URL(s) of the downloads you need using mhcflurry-downloads url:

$ mhcflurry-downloads url models_class1_presentation
https://github.com/openvax/mhcflurry/releases/download/1.6.0/models_class1_presentation.20200205.tar.bz2

Then make a directory and download the needed files to this directory:

$ mkdir downloads
$ wget --directory-prefix downloads https://github.com/openvax/mhcflurry/releases/download/1.6.0/models_class1_presentation.20200205.tar.bz2

HTTP request sent, awaiting response... 200 OK
Length: 72616448 (69M) [application/octet-stream]
Saving to: 'downloads/models_class1_presentation.20200205.tar.bz2'

Now call mhcflurry-downloads fetch with the --already-downloaded-dir option to indicate that the downloads should be retrieved from the specified directory:

$ mhcflurry-downloads fetch models_class1_presentation --already-downloaded-dir downloads
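The three steps above could be scripted roughly as follows. This is a sketch, not part of MHCflurry: the helper names are hypothetical, it assumes mhcflurry-downloads url prints only URLs (whitespace-separated) on standard output, and it uses Python's urllib in place of wget, so any downloader that works on your network can be substituted.

```python
import os
import subprocess
import urllib.request
from urllib.parse import urlparse


def archive_name(url):
    """Return the file name component of a download URL."""
    return os.path.basename(urlparse(url).path)


def fetch_via_manual_download(download_name, directory="downloads"):
    """Download archives manually, then let mhcflurry-downloads install them."""
    os.makedirs(directory, exist_ok=True)

    # Step 1: ask mhcflurry-downloads for the URL(s).
    # Assumption: stdout contains only URLs.
    result = subprocess.run(
        ["mhcflurry-downloads", "url", download_name],
        check=True, capture_output=True, text=True)

    # Step 2: download each archive into the directory (skip existing files).
    for url in result.stdout.split():
        target = os.path.join(directory, archive_name(url))
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)

    # Step 3: install from the already-downloaded files.
    subprocess.run(
        ["mhcflurry-downloads", "fetch", download_name,
         "--already-downloaded-dir", directory],
        check=True)
```

Calling `fetch_via_manual_download("models_class1_presentation")` would mirror the manual steps shown above.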
