Cython bindings and Python interface to HMMER3.

🐍🟡♦️🟦 PyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is developed and maintained by the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python package, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project and stop worrying about the HMMER binaries being properly set up on the end user's machine.
  • no intermediate files: Everything happens in memory, in Python objects you control, making it easier to pass your inputs to HMMER without writing them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles on some sequences). pyhmmer returns results as Python objects instead, so no parsing is needed.
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs stored on disk is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which can help get the most out of multiple CPUs when annotating smaller sequence databases.
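
As a minimal sketch of the "no input formatting" point, sequences already held as Python strings can be wrapped in TextSequence objects and digitized in memory. The helper name `to_digital` and the shape of its input are hypothetical; the sketch assumes pyhmmer is installed and that sequence names are given as `str`.

```python
def to_digital(records):
    """Convert (name, residues) string pairs into digital protein sequences.

    `records` is a hypothetical in-memory input, for instance proteins
    obtained from another Python library; nothing is written to disk.
    """
    # Imported lazily so this helper can be defined without pyhmmer present.
    import pyhmmer.easel

    alphabet = pyhmmer.easel.Alphabet.amino()
    digital = []
    for name, residues in records:
        # Sequence names are stored as bytes in the Easel object model.
        text_seq = pyhmmer.easel.TextSequence(name=name.encode(), sequence=residues)
        # Digitizing encodes the residues with the chosen alphabet.
        digital.append(text_seq.digitize(alphabet))
    return digital
```

The returned list is intended to be used as the target sequences of a search, in place of sequences read from a file.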

This library is still a work in progress and at an experimental stage, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer
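
Whichever installer is used, a quick way to confirm that the package is visible to the current interpreter is to query its installed version. The following is a small sketch using only the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of `distribution`, or None."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # No distribution with that name is installed in this environment.
        return None


version = installed_version("pyhmmer")
print(f"pyhmmer {version}" if version else "pyhmmer is not installed")
```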

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch, and obtain an iterable over TopHits that can be used for further sorting/querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

with pyhmmer.easel.SequenceFile("pyhmmer/tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

with pyhmmer.plan7.HMMFile("pyhmmer/tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    for hits in pyhmmer.hmmsearch(hmm_file, sequences, cpus=4):
      print(f"HMM {hits.query_name.decode()} found {len(hits)} hits in the target sequences")
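
The "further sorting/querying" mentioned above can be sketched as a small filter over one TopHits object. This assumes each hit exposes `name` (as bytes), `score` and `evalue` attributes; the helper name `summarize_hits` and the 1e-5 threshold are illustrative choices, not part of pyhmmer.

```python
def summarize_hits(top_hits, max_evalue=1e-5):
    """Return (name, score, evalue) tuples for hits at or below `max_evalue`.

    The result is sorted by increasing e-value, so the most significant
    hit comes first.
    """
    rows = [
        (hit.name.decode(), hit.score, hit.evalue)
        for hit in top_hits
        if hit.evalue <= max_evalue
    ]
    rows.sort(key=lambda row: row[2])
    return rows
```

Inside the loop of the example, `summarize_hits(hits)` would then give a plain list that is easy to print or load into a table.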

Have a look at more in-depth examples such as building an HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome in the Examples page of the online documentation.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running at 1.10 GHz with 6 physical / 12 logical cores, using a FASTA file containing 4,489 protein sequences extracted from the genome of Escherichia coli (562.PRJEB4685) and version 33.1 of the Pfam HMM library containing 18,259 domains. Commands were run 3 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation is that HMMER's platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
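
The rule of thumb above can be sketched as a helper that picks a value for the `cpus` argument. The standard library only reports logical cores, so this sketch assumes two hardware threads per physical core; that ratio and the helper name `suggest_cpus` are assumptions to adjust for your machine.

```python
import os


def suggest_cpus(tool, logical=None, threads_per_core=2):
    """Suggest a thread count for `tool` following the benchmark heuristic.

    phmmer gets every logical core; any other tool (hmmsearch in the
    benchmarks) gets an estimate of the physical core count.
    """
    if logical is None:
        # os.cpu_count() can return None when the count is undetermined.
        logical = os.cpu_count() or 1
    if tool == "phmmer":
        return logical
    # Estimate physical cores from the assumed threads-per-core ratio.
    return max(1, logical // threads_per_core)
```

On the benchmark machine (12 logical cores), this gives 6 for hmmsearch and 12 for phmmer.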

To read more about how PyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.

🔍 See Also

Building an HMM from scratch? Then you may be interested in the pyfamsa package, providing bindings to FAMSA, a very fast multiple sequence aligner. In addition, you may want to trim alignments: in that case, consider pytrimal, which wraps trimAl 2.0.

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You can instead check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
