Cython bindings and Python interface to HMMER3.

Project description

🐍🟡♦️🟦 pyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is maintained by members of the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python module, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you have control over, making it easier to format your inputs to pass to HMMER without needing to write them to a file. Output retrieval is also done in memory, through instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a Sequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs stored on disk is typically not slower than directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which helps get the most out of multiple CPUs.

This library is still a work-in-progress, and in a very experimental stage, but it should already pack enough features to run simple biological analyses involving hmmsearch.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.
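
As a rough illustration of the platform notes above, the following sketch checks whether pip is likely to find a pre-built wheel or will have to compile from source. The function name prebuilt_wheel_expected is hypothetical, and the set of Python versions is an assumption taken from the wheel tags listed for release 0.1.2 (CPython 3.6 to 3.9, manylinux x86-64); it is not part of pyhmmer.

```python
import platform
import sys

# Assumption: CPython versions with wheels in the 0.1.2 file listing.
WHEEL_PYTHONS = {(3, 6), (3, 7), (3, 8), (3, 9)}


def prebuilt_wheel_expected(system, machine, implementation, version):
    """Return True if a pre-built wheel is likely to match this platform.

    Hypothetical helper: it only mirrors the wheel tags of one release
    and does not query PyPI.
    """
    return (
        system == "Linux"
        and machine == "x86_64"
        and implementation == "CPython"
        and tuple(version[:2]) in WHEEL_PYTHONS
    )


if prebuilt_wheel_expected(
    platform.system(),
    platform.machine(),
    platform.python_implementation(),
    sys.version_info,
):
    print("pip should pick up a pre-built wheel")
else:
    print("pip will probably need to compile from source")
```

When the check fails, pip falls back to the source distribution, which needs a working C compiler.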

A bioconda package is planned once this package leaves alpha status.

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch, and obtain an iterable over TopHits that can be used for further sorting/querying in Python:

import pyhmmer

# Load the target sequences and convert them to the digital encoding.
with pyhmmer.easel.SequenceFile("938293.PRJEB85.HG003687.faa") as file:
    alphabet = file.guess_alphabet()
    sequences = [seq.digitize(alphabet) for seq in file]

# Search every HMM of the file against the sequences, using 4 threads.
with pyhmmer.plan7.HMMFile("Pfam.hmm") as hmms:
    all_hits = list(pyhmmer.hmmsearch(hmms, sequences, cpus=4))

Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable. Note that for optimal performance, you should pass the number of physical cores to the cpus argument of the pyhmmer.hmmsearch function, as HMMER requires too many SIMD registers to benefit from hyperthreading.
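
The standard library only reports logical cores through os.cpu_count(), so the advice above needs a small adjustment on hyperthreaded machines. The sketch below is one way to estimate a value for the cpus argument; guess_physical_cores is a hypothetical helper, and the default of two hardware threads per core is an assumption that does not hold on every CPU.

```python
import os


def guess_physical_cores(logical=None, threads_per_core=2):
    """Estimate the number of physical cores from the logical core count.

    Assumption: every core exposes `threads_per_core` hardware threads
    (2 with typical hyperthreading). Pass 1 if hyperthreading is disabled.
    """
    if logical is None:
        # os.cpu_count() may return None when the count is unknown.
        logical = os.cpu_count() or 1
    return max(1, logical // threads_per_core)


cpus = guess_physical_cores()
print(f"suggested value for the cpus argument: {cpus}")
```

The result could then be passed as cpus=cpus in the call to pyhmmer.hmmsearch shown earlier. If the third-party psutil package is installed, psutil.cpu_count(logical=False) reports the physical core count directly and avoids the guess.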

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-8550U CPU running at 1.80 GHz, using a FASTA file containing 2100 protein sequences (tests/data/seqs/938293.PRJEB85.HG003687.faa) and a subset of the Pfam HMM library containing 2873 domains. Commands were run 20 times.

| Command | # CPUs | mean (s) | σ (ms) | min (s) | max (s) | Speedup |
|---|---|---|---|---|---|---|
| python -m pyhmmer hmmsearch | 4 | 20.706 | 316 | 19.960 | 42.457 | x1.00 |
| python -m pyhmmer hmmsearch | 2 | 24.076 | 842 | 22.289 | 21.118 | x1.16 |
| hmmsearch | 2 | 35.046 | 161 | 34.734 | 35.183 | x1.69 |
| hmmsearch | 4 | 37.721 | 78 | 37.605 | 37.847 | x1.82 |
| python -m pyhmmer hmmsearch | 1 | 39.022 | 1346 | 36.081 | 40.644 | x1.88 |
| hmmsearch | 1 | 44.360 | 243 | 44.184 | 45.018 | x2.14 |
| hmmscan | 2 | 102.248 | 381 | 101.479 | 102.765 | x4.93 |
| hmmscan | 4 | 106.779 | 375 | 106.197 | 107.482 | x5.15 |
| hmmscan | 1 | 107.945 | 326 | 107.460 | 108.502 | x5.21 |
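
The Speedup column appears to be each command's mean runtime divided by the fastest mean (pyhmmer with 4 CPUs), so larger values mean slower runs. The sketch below recomputes it from the means in the table. Truncating to two decimals instead of rounding is an assumption, chosen because it reproduces the published figures.

```python
import math

# Mean wall-clock times in seconds, copied from the benchmark table.
mean_seconds = {
    ("python -m pyhmmer hmmsearch", 4): 20.706,
    ("python -m pyhmmer hmmsearch", 2): 24.076,
    ("hmmsearch", 2): 35.046,
    ("hmmsearch", 4): 37.721,
    ("python -m pyhmmer hmmsearch", 1): 39.022,
    ("hmmsearch", 1): 44.360,
    ("hmmscan", 2): 102.248,
    ("hmmscan", 4): 106.779,
    ("hmmscan", 1): 107.945,
}

# The fastest configuration serves as the x1.00 baseline.
fastest = min(mean_seconds.values())


def relative_time(mean, baseline):
    """Ratio of a mean runtime to the baseline, truncated to two decimals."""
    return math.floor(mean / baseline * 100) / 100


relative = {key: relative_time(mean, fastest) for key, mean in mean_seconds.items()}

for (command, cpus), ratio in sorted(relative.items(), key=lambda item: item[1]):
    print(f"{command:<28} {cpus} CPU(s)  x{ratio:.2f}")
```

Read this way, hmmsearch from the HMMER binaries with 4 CPUs takes about 1.8 times as long as pyhmmer with 4 CPUs on this dataset.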

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

Project details


This version

0.1.2

Download files

Source Distribution

  • pyhmmer-0.1.2.tar.gz (2.1 MB): Source

Built Distributions

  • pyhmmer-0.1.2-cp39-cp39-manylinux2010_x86_64.whl (3.4 MB): CPython 3.9, manylinux: glibc 2.12+ x86-64
  • pyhmmer-0.1.2-cp39-cp39-manylinux1_x86_64.whl (3.4 MB): CPython 3.9
  • pyhmmer-0.1.2-cp38-cp38-manylinux2010_x86_64.whl (3.4 MB): CPython 3.8, manylinux: glibc 2.12+ x86-64
  • pyhmmer-0.1.2-cp38-cp38-manylinux1_x86_64.whl (3.4 MB): CPython 3.8
  • pyhmmer-0.1.2-cp37-cp37m-manylinux2010_x86_64.whl (3.2 MB): CPython 3.7m, manylinux: glibc 2.12+ x86-64
  • pyhmmer-0.1.2-cp37-cp37m-manylinux1_x86_64.whl (3.2 MB): CPython 3.7m
  • pyhmmer-0.1.2-cp36-cp36m-manylinux2010_x86_64.whl (3.2 MB): CPython 3.6m, manylinux: glibc 2.12+ x86-64
  • pyhmmer-0.1.2-cp36-cp36m-manylinux1_x86_64.whl (3.2 MB): CPython 3.6m

