Cython bindings and Python interface to HMMER3.

Project description

🐍🟡♦️🟦 PyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is developed and maintained by the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python package, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end user's machine.
  • no intermediate files: Everything happens in memory, in Python objects you control, making it easier to pass your inputs to HMMER without needing to write them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs stored on disk is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which can help make the most of multiple CPUs when annotating smaller sequence databases.
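To illustrate the "no input formatting" point, here is a minimal sketch of turning sequences that already live in memory as Python strings into DigitalSequence objects. The helper name to_digital_sequences and the (name, residues) input shape are assumptions made for this example; it relies on the TextSequence, Alphabet.amino and digitize members of pyhmmer.easel.

```python
from typing import Iterable, List, Tuple


def to_digital_sequences(records: Iterable[Tuple[str, str]]) -> List:
    """Convert (name, residues) string pairs into digital protein sequences.

    Hypothetical helper: the input shape is an assumption for illustration.
    """
    # Imported here so that this module can be loaded even when pyhmmer
    # is not installed; a top-level import works just as well.
    import pyhmmer.easel

    alphabet = pyhmmer.easel.Alphabet.amino()
    return [
        # Sequence names are bytes in pyhmmer, residues are given as text
        pyhmmer.easel.TextSequence(name=name.encode(), sequence=residues).digitize(alphabet)
        for name, residues in records
    ]
```

The resulting list can then be passed wherever a collection of digital target sequences is expected, for instance in place of sequences read from a SequenceFile.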

This library is still a work in progress and at an experimental stage, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts pre-built CPython and PyPy wheels for Linux and macOS, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer
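After either install route, a quick check that the package is visible to the current interpreter can be sketched with the standard library alone. The function name installed_version is an assumption for this example; importlib.metadata requires Python 3.8 or later.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "pyhmmer") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"pyhmmer {version}" if version else "pyhmmer is not installed")
```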

🔖 Citation

PyHMMER is scientific software, with a paper published in Bioinformatics. Please cite both PyHMMER and HMMER if you are using it in an academic work, for instance as:

PyHMMER (Larralde et al., 2023), a Python library binding to HMMER (Eddy, 2011).

Detailed references are available on the Publications page of the online documentation.

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch to search for Type 2 PKS domains (t2pks.hmm) inside proteins extracted from the genome of Anaerococcus provencensis (938293.PRJEB85.HG003687.faa). This will produce an iterable over TopHits that can be used for further sorting/querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

# Load all target sequences in memory, in digital (encoded) mode
with pyhmmer.easel.SequenceFile("pyhmmer/tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

# Search every HMM in the file against the sequences, using 4 threads
with pyhmmer.plan7.HMMFile("pyhmmer/tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    for hits in pyhmmer.hmmsearch(hmm_file, sequences, cpus=4):
        # One TopHits object is yielded per query HMM
        print(f"HMM {hits.query_name.decode()} found {len(hits)} hits in the target sequences")
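As a rough sketch of the "further sorting/querying" mentioned above, the helper below keeps the hits under an E-value cutoff and orders them by score. It only assumes that each hit exposes name (as bytes), score and evalue attributes, as the hits stored in a TopHits object do; the function name best_hits and the default cutoff are illustrative choices, not part of pyhmmer.

```python
from typing import Iterable, List, Tuple


def best_hits(hits: Iterable, max_evalue: float = 1e-5) -> List[Tuple[str, float, float]]:
    """Return (name, score, evalue) rows for hits under the cutoff, best score first."""
    kept = [
        (hit.name.decode(), hit.score, hit.evalue)
        for hit in hits
        if hit.evalue <= max_evalue
    ]
    # Higher bit scores indicate better matches, so sort in descending order
    return sorted(kept, key=lambda row: row[1], reverse=True)
```

Inside the loop of the example, calling best_hits(hits) on each yielded TopHits would give a plain list of tuples that is easy to print or load into a table.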

Have a look at more in-depth examples, such as building an HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome, on the Examples page of the online documentation.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running at 1.10 GHz with 6 physical / 12 logical cores, using a FASTA file containing 4,489 protein sequences extracted from the genome of Escherichia coli (562.PRJEB4685) and version 33.1 of the Pfam HMM library, which contains 18,259 domains. Commands were run 3 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation is that the platform-specific code in HMMER requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
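The rule of thumb above can be sketched as a small helper for choosing a cpus value. This is only an illustration: the function name suggested_cpus is hypothetical, and when no physical core count is supplied it guesses half the logical count, which assumes 2-way simultaneous multi-threading. A library such as psutil (psutil.cpu_count(logical=False)) could provide the real physical count.

```python
import os
from typing import Optional


def suggested_cpus(tool: str, logical: Optional[int] = None, physical: Optional[int] = None) -> int:
    """Suggest a thread count: logical cores for phmmer, physical cores otherwise."""
    logical = logical or os.cpu_count() or 1
    # Assumption: without a measured value, guess 2 hardware threads per core
    physical = physical or max(1, logical // 2)
    return logical if tool == "phmmer" else physical
```

On the benchmark machine described above (6 physical / 12 logical cores), this would suggest 12 threads for phmmer and 6 for hmmsearch.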

To read more about how PyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.

🔍 See Also

Building an HMM from scratch? Then you may be interested in the pyfamsa package, which provides bindings to FAMSA, a very fast multiple sequence aligner. In addition, you may want to trim alignments: in that case, consider pytrimal, which wraps trimAl 2.0.

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You can instead check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

Project details



This version

0.9.0

Download files

Source Distribution

  • pyhmmer-0.9.0.tar.gz (11.0 MB): source

Built Distributions

  • pyhmmer-0.9.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (10.9 MB): PyPy, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
  • pyhmmer-0.9.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl (10.7 MB): PyPy, macOS 10.9+, x86-64
  • pyhmmer-0.9.0-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (10.9 MB): PyPy, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
  • pyhmmer-0.9.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl (10.7 MB): PyPy, macOS 10.9+, x86-64
  • pyhmmer-0.9.0-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (10.9 MB): PyPy, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
  • pyhmmer-0.9.0-pp37-pypy37_pp73-macosx_10_9_x86_64.whl (10.7 MB): PyPy, macOS 10.9+, x86-64
  • pyhmmer-0.9.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.8 MB): CPython 3.11, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
  • pyhmmer-0.9.0-cp311-cp311-macosx_10_9_universal2.whl (11.3 MB): CPython 3.11, macOS 10.9+, universal2 (ARM64, x86-64)
  • pyhmmer-0.9.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.6 MB): CPython 3.10, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
  • pyhmmer-0.9.0-cp310-cp310-macosx_11_0_x86_64.whl (11.3 MB): CPython 3.10, macOS 11.0+, x86-64
  • pyhmmer-0.9.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.6 MB): CPython 3.9, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
  • pyhmmer-0.9.0-cp39-cp39-macosx_11_0_x86_64.whl (11.3 MB): CPython 3.9, macOS 11.0+, x86-64
  • pyhmmer-0.9.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (17.0 MB): CPython 3.8, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
  • pyhmmer-0.9.0-cp38-cp38-macosx_11_0_x86_64.whl (11.2 MB): CPython 3.8, macOS 11.0+, x86-64
  • pyhmmer-0.9.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.4 MB): CPython 3.7m, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
  • pyhmmer-0.9.0-cp37-cp37m-macosx_11_0_x86_64.whl (11.2 MB): CPython 3.7m, macOS 11.0+, x86-64
  • pyhmmer-0.9.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.3 MB): CPython 3.6m, manylinux (glibc 2.17+ and glibc 2.24+), x86-64
