Cython bindings and Python interface to HMMER3.

🐍🟡♦️🟦 PyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is developed and maintained by the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python package, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you have control over, making it easier to pass your inputs to HMMER without needing to write them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences). With pyhmmer, results are retrieved as Python objects instead.
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs stored on disk is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which can help get the most out of multiple CPUs when annotating smaller sequence databases.
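
As a rough illustration of the "no input formatting" point, the sketch below turns sequences that are already in memory into DigitalSequence objects. It assumes the input is a list of plain (name, sequence) string pairs; the helper name to_digital_sequences is hypothetical, and only the Alphabet and TextSequence classes come from pyhmmer.easel.

```python
def to_digital_sequences(records):
    """Convert (name, sequence) string pairs into pyhmmer DigitalSequence objects."""
    # Imported inside the function so that merely defining this helper
    # does not require pyhmmer to be installed.
    from pyhmmer.easel import Alphabet, TextSequence

    alphabet = Alphabet.amino()  # protein alphabet
    digital = []
    for name, residues in records:
        # Sequence names are bytes in pyhmmer; the residues stay a str.
        text_seq = TextSequence(name=name.encode(), sequence=residues)
        # digitize() encodes the residues with the given alphabet.
        digital.append(text_seq.digitize(alphabet))
    return digital
```

The resulting list could then play the role of the target sequences in a search, in place of sequences read from a file.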

This library is still a work in progress and at an experimental stage, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer

🔖 Citation

PyHMMER is scientific software, with a published paper in Bioinformatics. Please cite both PyHMMER and HMMER if you use it in academic work, for instance as:

PyHMMER (Larralde et al., 2023), a Python library binding to HMMER (Eddy, 2011).

Detailed references are available on the Publications page of the online documentation.

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch and look for Type 2 PKS domains (t2pks.hmm) inside proteins extracted from the genome of Anaerococcus provencensis (938293.PRJEB85.HG003687.faa). This produces an iterable over TopHits that can be used for further sorting and querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

# Load all target sequences into memory, in digital (encoded) form
with pyhmmer.easel.SequenceFile("pyhmmer/tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

# Search every HMM of the file against the sequences, using 4 threads
with pyhmmer.plan7.HMMFile("pyhmmer/tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    for hits in pyhmmer.hmmsearch(hmm_file, sequences, cpus=4):
        # One TopHits object is yielded per query HMM
        print(f"HMM {hits.query_name.decode()} found {len(hits)} hits in the target sequences")
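
To give an idea of the "further sorting/querying" step, here is a minimal sketch that filters and ranks hits. It only assumes that each hit exposes name (bytes), score and evalue attributes; the helper name best_hits, the 1e-5 threshold and the stand-in hit objects are hypothetical and chosen purely for illustration.

```python
from types import SimpleNamespace


def best_hits(hits, max_evalue=1e-5):
    """Return (name, score, evalue) tuples for hits at or below max_evalue, best score first."""
    selected = [
        (hit.name.decode(), hit.score, hit.evalue)
        for hit in hits
        if hit.evalue <= max_evalue
    ]
    # Highest bit score first
    return sorted(selected, key=lambda row: row[1], reverse=True)


# Stand-in objects with made-up values, mimicking the attributes read above;
# in practice the argument would be a TopHits object yielded by a search.
fake_hits = [
    SimpleNamespace(name=b"protein_B", score=45.2, evalue=1e-8),
    SimpleNamespace(name=b"protein_C", score=10.1, evalue=0.5),
    SimpleNamespace(name=b"protein_A", score=120.5, evalue=1e-30),
]

for name, score, evalue in best_hits(fake_hits):
    print(f"{name}\t{score:.1f}\t{evalue:.1e}")
```

Because the helper relies only on attribute access, it can be tried out without running a search first.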

Have a look at more in-depth examples such as building an HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome on the Examples page of the online documentation.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running at 1.10 GHz with 6 physical / 12 logical cores, using a FASTA file containing 4,489 protein sequences extracted from the genome of Escherichia coli (562.PRJEB4685) and version 33.1 of the Pfam HMM library containing 18,259 domains. Commands were run 3 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation is that the HMMER platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
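
The rule of thumb above can be written down as a small helper. This is only a sketch: the function name suggested_cpus is hypothetical, and it deliberately covers just the two tools discussed in the benchmarks. The standard library reports logical cores through os.cpu_count(); obtaining the physical core count usually needs a third-party package, so it is passed in explicitly here.

```python
import os


def suggested_cpus(tool, logical_cores, physical_cores):
    """Pick a thread count following the benchmark observations."""
    if tool == "phmmer":
        # phmmer appeared to benefit from simultaneous multi-threading
        return logical_cores
    if tool == "hmmsearch":
        # hmmsearch did not, so stay at (or below) the physical core count
        return physical_cores
    raise ValueError(f"no benchmark-based suggestion for {tool!r}")


# Core counts of the benchmark machine: 6 physical / 12 logical cores
print(suggested_cpus("phmmer", logical_cores=12, physical_cores=6))     # 12
print(suggested_cpus("hmmsearch", logical_cores=12, physical_cores=6))  # 6

# Logical core count of the current machine (os.cpu_count() may return None)
logical_here = os.cpu_count() or 1
```

The returned value would typically be passed as the cpus argument of the corresponding pyhmmer function.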

To read more about how PyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.

🔍 See Also

Building an HMM from scratch? Then you may be interested in the pyfamsa package, providing bindings to FAMSA, a very fast multiple sequence aligner. In addition, you may want to trim alignments: in that case, consider pytrimal, which wraps trimAl 2.0.

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You can instead check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
