Cython bindings and Python interface to HMMER3.

🐍🟡♦️🟦 PyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is developed and maintained by the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python package, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you have control over, making it easier to pass your inputs to HMMER without needing to write them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs stored on disk is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy from the hmmsearch binary from HMMER, which can help get the most out of multiple CPUs when annotating smaller sequence databases.
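
Because results come back as Python objects, a tab-separated report can be produced with the standard csv module rather than by parsing fixed-width text. The sketch below is illustrative only: the helper hits_to_tsv is hypothetical, and it assumes each hit exposes name, score and evalue attributes, as pyhmmer.plan7.Hit objects are expected to. Stand-in objects are used so that it runs without any HMMER data.

```python
import csv
import io
from types import SimpleNamespace


def hits_to_tsv(hits, handle):
    """Write one tab-separated row per hit: name, bit score, E-value."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["target", "score", "evalue"])
    for hit in hits:
        # Names are bytes in the example of this README; accept str as well.
        name = hit.name.decode() if isinstance(hit.name, bytes) else hit.name
        writer.writerow([name, f"{hit.score:.1f}", f"{hit.evalue:.2e}"])


# Stand-in objects (assumption): with pyhmmer, pass a TopHits object
# obtained from a search instead of this list.
fake_hits = [
    SimpleNamespace(name=b"seq1", score=123.456, evalue=1.5e-30),
    SimpleNamespace(name=b"seq2", score=42.0, evalue=3e-5),
]
buffer = io.StringIO()
hits_to_tsv(fake_hits, buffer)
print(buffer.getvalue(), end="")
```

Any file-like object opened in text mode can replace the StringIO buffer.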

This library is still a work in progress and at an experimental stage, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer
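
After either installation route, a quick check from Python can confirm which version ended up in the environment. This is a small sketch using only the standard library (importlib.metadata, which assumes Python 3.8 or later); the helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("pyhmmer")
print(f"pyhmmer {found}" if found else "pyhmmer is not installed")
```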

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch, and obtain an iterable over TopHits that can be used for further sorting/querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

with pyhmmer.easel.SequenceFile("pyhmmer/tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

with pyhmmer.plan7.HMMFile("pyhmmer/tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    for hits in pyhmmer.hmmsearch(hmm_file, sequences, cpus=4):
        print(f"HMM {hits.query_name.decode()} found {len(hits)} hits in the target sequences")

Have a look at more in-depth examples such as building an HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome in the Examples page of the online documentation.
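
As a sketch of the "further sorting/querying" mentioned above, the hits of one query can be filtered by E-value and ranked by bit score with plain Python. The helper significant_hits and its default cutoff are hypothetical, and the code assumes each hit exposes name, score and evalue attributes. Stand-in objects replace a real TopHits so the snippet runs on its own.

```python
from types import SimpleNamespace


def significant_hits(hits, max_evalue=1e-5):
    """Return (name, score, evalue) tuples under an E-value cutoff, best score first."""
    kept = [hit for hit in hits if hit.evalue <= max_evalue]
    kept.sort(key=lambda hit: hit.score, reverse=True)
    return [(hit.name, hit.score, hit.evalue) for hit in kept]


# Stand-in hits (assumption): with pyhmmer, iterate over a TopHits object instead.
example = [
    SimpleNamespace(name=b"weak", score=10.0, evalue=1e-3),
    SimpleNamespace(name=b"good", score=50.0, evalue=1e-20),
    SimpleNamespace(name=b"best", score=80.0, evalue=1e-40),
]
for name, score, evalue in significant_hits(example):
    print(name.decode(), score, evalue)
```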

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running at 1.10 GHz with 6 physical / 12 logical cores, using a FASTA file containing 4,489 protein sequences extracted from the genome of Escherichia coli (562.PRJEB4685) and version 33.1 of the Pfam HMM library containing 18,259 domains. Commands were run 3 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation is that HMMER's platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
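
The rule of thumb above can be turned into a small helper for choosing the cpus argument. This is a sketch under stated assumptions: suggested_cpus is a hypothetical name, and because the standard library only reports logical cores, the physical count is estimated with an assumed threads_per_core of 2 (typical with simultaneous multi-threading, but not universal).

```python
import os


def suggested_cpus(command, logical_cpus, threads_per_core=2):
    """Pick a thread count following the benchmark observation above.

    `threads_per_core` is an assumption (2 is common with SMT); the standard
    library cannot report physical cores directly.
    """
    physical = max(1, logical_cpus // threads_per_core)
    # phmmer: all logical cores; anything else: physical cores only.
    return logical_cpus if command == "phmmer" else physical


logical = os.cpu_count() or 1
for command in ("phmmer", "hmmsearch"):
    print(command, suggested_cpus(command, logical))
```

On a machine where the estimate is wrong, a third-party package such as psutil can report the physical core count directly.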

To read more about how PyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.
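
One plausible reading of a many-to-many search that yields one result per query is that each query is dispatched to its own worker thread. The toy model below shows only that scheduling shape with the standard library; it is an assumption for illustration, not PyHMMER's actual implementation, and the Performance page remains the authority. The functions count_matches and search_all are hypothetical stand-ins, with substring counting in place of a real profile search.

```python
from concurrent.futures import ThreadPoolExecutor


def count_matches(motif, sequences):
    """Stand-in for one query-vs-all-targets search: count sequences containing `motif`."""
    return motif, sum(motif in seq for seq in sequences)


def search_all(motifs, sequences, cpus=2):
    """Run one task per query on a thread pool, yielding results in query order."""
    with ThreadPoolExecutor(max_workers=cpus) as pool:
        # Executor.map preserves input order, so results line up with queries.
        yield from pool.map(lambda motif: count_matches(motif, sequences), motifs)


targets = ["MKVLAA", "GGKVLP", "MAAAGG"]
for motif, n in search_all(["KVL", "AA", "W"], targets):
    print(motif, n)
```

Pure-Python work like this gains little from threads; the sketch is only meant to show how one result per query can be produced in input order.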

🔍 See Also

Building an HMM from scratch? Then you may be interested in the pyfamsa package, which provides bindings to FAMSA, a very fast multiple sequence aligner. In addition, you may want to trim alignments: in that case, consider pytrimal, which wraps trimAl 2.0.

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You can instead check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.


