Cython bindings and Python interface to HMMER3.

🐍🟡♦️🟦 PyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is developed and maintained by the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python package, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you have control over, making it easier to pass your inputs to HMMER without needing to write them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs stored on disk is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which can help get the most out of multiple CPUs when annotating smaller sequence databases.

This library is still a work-in-progress, and in an experimental stage, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer

🔖 Citation

PyHMMER is scientific software, with a paper published in Bioinformatics. Please cite both PyHMMER and HMMER if you are using it in academic work, for instance as:

PyHMMER (Larralde et al., 2023), a Python library binding to HMMER (Eddy, 2011).

Detailed references are available on the Publications page of the online documentation.

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch to search for Type 2 PKS domains (t2pks.hmm) inside proteins extracted from the genome of Anaerococcus provencensis (938293.PRJEB85.HG003687.faa). This will produce an iterable over TopHits that can be used for further sorting/querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

# Load all target sequences into memory, in digital (encoded) mode.
with pyhmmer.easel.SequenceFile("pyhmmer/tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

# Search every HMM of the file against the sequences; one TopHits is yielded per HMM.
with pyhmmer.plan7.HMMFile("pyhmmer/tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    for hits in pyhmmer.hmmsearch(hmm_file, sequences, cpus=4):
        print(f"HMM {hits.query_name.decode()} found {len(hits)} hits in the target sequences")

Have a look at more in-depth examples, such as building an HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome, on the Examples page of the online documentation.
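
As a small sketch of the "further sorting/querying" mentioned above, the helper below filters and ranks hits in plain Python. It assumes each hit exposes name, score and evalue attributes; the function name best_hits, the E-value cutoff and the stand-in objects are hypothetical and only there so that the sketch runs without any data file.

```python
from types import SimpleNamespace

def _as_text(value):
    # Names may be bytes (as in the example above) or str; normalise to str.
    return value.decode() if isinstance(value, bytes) else value

def best_hits(hits, max_evalue=1e-5, limit=5):
    """Return (name, score, evalue) for the best-scoring hits under an E-value cutoff."""
    kept = [hit for hit in hits if hit.evalue <= max_evalue]
    kept.sort(key=lambda hit: hit.score, reverse=True)
    return [(_as_text(hit.name), hit.score, hit.evalue) for hit in kept[:limit]]

# Stand-in objects (NOT real pyhmmer hits) so the sketch is self-contained.
fake_hits = [
    SimpleNamespace(name=b"protein_A", score=120.5, evalue=1e-30),
    SimpleNamespace(name=b"protein_B", score=15.2, evalue=0.5),
    SimpleNamespace(name=b"protein_C", score=310.0, evalue=1e-90),
]

for name, score, evalue in best_hits(fake_hits):
    print(name, score, evalue)
```

In the example above, the same function could be called on each hits object yielded by the search loop.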

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running at 1.10 GHz with 6 physical / 12 logical cores, using a FASTA file containing 4,489 protein sequences extracted from the genome of Escherichia coli (562.PRJEB4685) and version 33.1 of the Pfam HMM library containing 18,259 domains. Commands were run 3 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation is that HMMER's platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
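
The rule of thumb above can be sketched as a small helper for choosing the value passed as cpus. The function name suggested_cpus is hypothetical, and the estimate of physical cores (half of the logical count) is an assumption that only holds on machines with two hardware threads per core; the standard library does not report physical cores directly.

```python
import os

def suggested_cpus(tool, physical_cores, logical_cores):
    """Pick a thread count following the benchmark observations above."""
    if tool == "phmmer":
        return logical_cores
    if tool == "hmmsearch":
        return physical_cores
    raise ValueError(f"no benchmark-based suggestion for {tool!r}")

# os.cpu_count() reports logical cores (it may return None, hence the fallback).
logical = os.cpu_count() or 1
# ASSUMPTION: two hardware threads per physical core; adjust for your machine.
physical = max(1, logical // 2)

print("hmmsearch:", suggested_cpus("hmmsearch", physical, logical))
print("phmmer:", suggested_cpus("phmmer", physical, logical))
```

These are starting points derived from a single machine; measuring on the target hardware remains the safer option.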

To read more about how PyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.

🔍 See Also

Building an HMM from scratch? Then you may be interested in the pyfamsa package, providing bindings to FAMSA, a very fast multiple sequence aligner. In addition, you may want to trim alignments: in that case, consider pytrimal, which wraps trimAl 2.0.

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You can instead check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

Project details


This version

0.8.2

Download files

Source Distribution

pyhmmer-0.8.2.tar.gz (11.0 MB view details)

Uploaded Source

Built Distributions

pyhmmer-0.8.2-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (10.8 MB view details)

Uploaded PyPy manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.8.2-pp39-pypy39_pp73-macosx_10_9_x86_64.whl (10.6 MB view details)

Uploaded PyPy macOS 10.9+ x86-64

pyhmmer-0.8.2-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (10.8 MB view details)

Uploaded PyPy manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.8.2-pp38-pypy38_pp73-macosx_10_9_x86_64.whl (10.6 MB view details)

Uploaded PyPy macOS 10.9+ x86-64

pyhmmer-0.8.2-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (10.8 MB view details)

Uploaded PyPy manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.8.2-pp37-pypy37_pp73-macosx_10_9_x86_64.whl (10.6 MB view details)

Uploaded PyPy macOS 10.9+ x86-64

pyhmmer-0.8.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.5 MB view details)

Uploaded CPython 3.11 manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.8.2-cp311-cp311-macosx_10_9_universal2.whl (11.2 MB view details)

Uploaded CPython 3.11 macOS 10.9+ universal2 (ARM64, x86-64)

pyhmmer-0.8.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.4 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.8.2-cp310-cp310-macosx_11_0_x86_64.whl (11.2 MB view details)

Uploaded CPython 3.10 macOS 11.0+ x86-64

pyhmmer-0.8.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.6 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.8.2-cp39-cp39-macosx_11_0_x86_64.whl (11.3 MB view details)

Uploaded CPython 3.9 macOS 11.0+ x86-64

pyhmmer-0.8.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (17.0 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.8.2-cp38-cp38-macosx_10_15_x86_64.whl (11.2 MB view details)

Uploaded CPython 3.8 macOS 10.15+ x86-64

pyhmmer-0.8.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.4 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.8.2-cp37-cp37m-macosx_10_15_x86_64.whl (11.2 MB view details)

Uploaded CPython 3.7m macOS 10.15+ x86-64

pyhmmer-0.8.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (16.4 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

File details

Details for the file pyhmmer-0.8.2.tar.gz.

File metadata

  • Download URL: pyhmmer-0.8.2.tar.gz
  • Upload date:
  • Size: 11.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for pyhmmer-0.8.2.tar.gz
Algorithm Hash digest
SHA256 1f34ff301d77a2d49060aa3cb4c320119bc4b79a27c39563aa902f3c77d6d23a
MD5 386666fd90636c8fa225d8a360e5c92e
BLAKE2b-256 f1c9c047535dbe805d92a3d6fb8b602f19728c4c91d897a4da2f8d46f5f83cc6

See more details on using hashes here.
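
A downloaded file can be checked against the SHA256 digest listed above with the standard library alone. This is a minimal sketch; the helper names sha256_of_file and matches_expected are made up for the example.

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected_hex):
    """Return True if the file's digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()

# Example call (requires the downloaded archive in the working directory):
# matches_expected(
#     "pyhmmer-0.8.2.tar.gz",
#     "1f34ff301d77a2d49060aa3cb4c320119bc4b79a27c39563aa902f3c77d6d23a",
# )
```

Reading in chunks keeps memory usage flat, which matters little for an 11 MB archive but makes the helper safe for larger files.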

File details

Details for the file pyhmmer-0.8.2-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 9decd51f3c3717a0af3f80a8bf2b43d450cf32e75dd818232ad40178426c5b64
MD5 92989884b5b5bc058ee29725f9644fd9
BLAKE2b-256 816d1fd4669db2d3a14e80bdb59377b0a537b26fb98798cbce9a4ccc2943d5b1

File details

Details for the file pyhmmer-0.8.2-pp39-pypy39_pp73-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-pp39-pypy39_pp73-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 69f248f9f43098fc723ffcd7621e0e605417a261d57bdc172fd76291f4f2746c
MD5 de2f9c4120f223c73a2ed36446dbf7b0
BLAKE2b-256 f67c17274676ff47462404c50fd7d03eb2793ab33d0405fb8830c674f29cd7e0

File details

Details for the file pyhmmer-0.8.2-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 8e2e8348f549fda3a9363ddd77e9f1226baed373c6d2e21a73de01e8163d9aa3
MD5 f4e255cd1e519564ad92e863db5e9d27
BLAKE2b-256 7f64845f17d9680d7703d8deaa729141dd33da734b9602cebc224a7533c0f239

File details

Details for the file pyhmmer-0.8.2-pp38-pypy38_pp73-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-pp38-pypy38_pp73-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 c6cd2ede881388857c0592ee1ea3e71ab56903662d71eb400f341b22a42197bb
MD5 8330766ceba2d01bc3fc91c975592f21
BLAKE2b-256 c0f428ed40514d1f483669b4f80de51a7c1770ebd84255890ea468f641ef17c4

File details

Details for the file pyhmmer-0.8.2-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 d6075ea4f4303297ae82699370fada8a73e1016f221212814b0344b90ac6c104
MD5 4b88e0535c861f8ffd01ccf247499876
BLAKE2b-256 47d94e2212224e53a8b9373c0831a3c16d115dce9c6e2c847c867a1f1b8284fd

File details

Details for the file pyhmmer-0.8.2-pp37-pypy37_pp73-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-pp37-pypy37_pp73-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 a08e058479e382944d70d622f9a1fd49c62555ed5750b2950cf56c1bb7659596
MD5 78dd3170fb1e29ab8dea2df8e3bfe1d3
BLAKE2b-256 1aeed6c8985da391696aaa09eb694172201c1eead5f366a8e61790c735c525e9

File details

Details for the file pyhmmer-0.8.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 329c3dda3252b8d8843d504bfdfe36874bb9141c28771eca0491ee595ddbaa46
MD5 06fc4b031adab8b0511217fd30b4095e
BLAKE2b-256 35622dd4b543a1be28c87b20c0176c7f6c9793fec2f948c67dc4d9e15e920d74

File details

Details for the file pyhmmer-0.8.2-cp311-cp311-macosx_10_9_universal2.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp311-cp311-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 53c328762c61f08975ca07bdfeabef95abbb60236e96dcadde3f64752dd39207
MD5 7c1d22aed38b40c91e9c88173fe0cac1
BLAKE2b-256 1d34b23cdbe9358907f35c6cbf7af8fd223ba982372fa67842540d2a453bb6bb

File details

Details for the file pyhmmer-0.8.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 2b0e0957943c3766c8e7660c8607116d29c41560145cef413e7f395e71658f49
MD5 09996fd08756badf98fc715bd0c6abf5
BLAKE2b-256 8b7f11a8225560199c6b92c062668c24691cfdf4d6803930ea51578c0ae0f2c6

File details

Details for the file pyhmmer-0.8.2-cp310-cp310-macosx_11_0_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp310-cp310-macosx_11_0_x86_64.whl
Algorithm Hash digest
SHA256 9bbf4981531543b30a1da679992f270b7115b4735613d2cbd765093f1595ec7e
MD5 52c0eadafd67a2b559a3b448ef964017
BLAKE2b-256 14ee272986b394c29bd06766c742c2bb717747d0f10fe27b9d639878d0b91b7b

File details

Details for the file pyhmmer-0.8.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 ee308e6b49b76ffc0183afb5c7d8c43c7df823dab7b2c79120f4e9e63bbc07eb
MD5 0683d0c719bca1902e7e895659bb61b3
BLAKE2b-256 dbd78cbf5c5a3b14fa0ffcbf08c862f2f9e2ac6753be228af04e8feaf724b4da

File details

Details for the file pyhmmer-0.8.2-cp39-cp39-macosx_11_0_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp39-cp39-macosx_11_0_x86_64.whl
Algorithm Hash digest
SHA256 22c0ac4996afbb6421bbba1bc40b8241dd689d53b3ef5e283dc78fb5b5c38614
MD5 964026fa4b07889d9dba76c3ff152058
BLAKE2b-256 48aa173348482e94cb75d7ca8140cb644583294a51c947263b075ae98d7d451c

File details

Details for the file pyhmmer-0.8.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 d73a2be27ced6e0105e5904a174556b51a97faf0018cb3b3a5f7153a1015b7f2
MD5 58e06e4f0424e8bda6a5ff4e86beeaa4
BLAKE2b-256 3b96a3db6773bdb59ce89c4b5806de0ea3d37b07c4bf930fab20dbc60873c737

File details

Details for the file pyhmmer-0.8.2-cp38-cp38-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp38-cp38-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 b29cd4d630d6d4d79cbd0e5a9fc04fbdaecf19bd7b89b2d0db081aed5313c99a
MD5 f69b3670cccdc70653a1b9d7fd045691
BLAKE2b-256 19ccdd603f6c31b90171f2a74e8bc83cd981e3e5c9275035127f32b2e709e0c1

File details

Details for the file pyhmmer-0.8.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 9540a6f32bd5a91aafde592d34031457963e90525c302c8d94709c33a50bd07e
MD5 172f38c773996d5dec11a2e6f43e8fc8
BLAKE2b-256 5df25bf9ca5b718e665b613914fa33b5299040268e289c8bd0b0794c5429981d

File details

Details for the file pyhmmer-0.8.2-cp37-cp37m-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp37-cp37m-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 eecf23f2ad649d3e38ac029289feae445d37bf8921ad4f6fab5a62a020143b4a
MD5 d7cafefffcd795c2c1a9acf6f49a8e14
BLAKE2b-256 3804689d911435b871b9f9beeed09997002bd8869a80efb217cd4dc12fa3be0c

File details

Details for the file pyhmmer-0.8.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File metadata

File hashes

Hashes for pyhmmer-0.8.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 5a1c920d546c57df03e63da0055c2e45cc6be826430a15c15c4091bbc08ac9aa
MD5 d9b2ed59cdf1eb02e23125a037ecdb07
BLAKE2b-256 0ab4d6842c31ab612714e693d13852d0b8e30877583e2c1e4c7a54e5072cf6e3
