
Cython bindings and Python interface to HMMER3.


🐍🟡♦️🟦 PyHMMER


🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is developed and maintained by the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python package, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you control, making it easier to pass your inputs to HMMER without writing them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs in disk storage is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which can help get the most out of multiple CPUs when annotating smaller sequence databases.

This library is still a work-in-progress, and in an experimental stage, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer
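
To confirm from Python that either installation route succeeded, the following minimal sketch queries the installed distribution metadata using only the standard library (Python 3.8 or later). The helper name installed_version is hypothetical and not part of pyhmmer.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report whether pyhmmer is visible to the current interpreter.
version = installed_version("pyhmmer")
print("pyhmmer", version if version is not None else "is not installed")
```

A pipeline distributed as a Python package could run a check like this at start-up and fail early with a clear message.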

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch to search for Type 2 PKS domains (t2pks.hmm) inside proteins extracted from the genome of Anaerococcus provencensis (938293.PRJEB85.HG003687.faa). This will produce an iterable over TopHits that can be used for further sorting/querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

# Load all target sequences into memory, in digital (encoded) form
with pyhmmer.easel.SequenceFile("pyhmmer/tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

# Search every HMM of the file against the sequences, using 4 threads
with pyhmmer.plan7.HMMFile("pyhmmer/tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    for hits in pyhmmer.hmmsearch(hmm_file, sequences, cpus=4):
        print(f"HMM {hits.query_name.decode()} found {len(hits)} hits in the target sequences")
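
As a minimal sketch of the "further sorting/querying" step, the helper below keeps the hits under an E-value threshold and orders them by decreasing bit score. It only relies on each hit exposing name, score and evalue attributes, as pyhmmer.plan7.Hit objects do. The FakeHit tuple, the significant_hits function and the demo values are hypothetical, introduced here so that the snippet runs without pyhmmer or any data file.

```python
from collections import namedtuple

# Hypothetical stand-in for pyhmmer.plan7.Hit, exposing only the
# attributes used below (name as bytes, bit score, E-value).
FakeHit = namedtuple("FakeHit", ["name", "score", "evalue"])


def significant_hits(hits, max_evalue=1e-5):
    """Return the hits with evalue <= max_evalue, best bit score first."""
    kept = [hit for hit in hits if hit.evalue <= max_evalue]
    return sorted(kept, key=lambda hit: hit.score, reverse=True)


# Made-up values, for illustration only.
demo = [
    FakeHit(b"protA", 35.2, 1e-8),
    FakeHit(b"protB", 12.1, 0.3),
    FakeHit(b"protC", 80.5, 1e-22),
]

for hit in significant_hits(demo):
    print(hit.name.decode(), hit.score, hit.evalue)
```

With real results, the same function should be usable on each hits object yielded by pyhmmer.hmmsearch, since a TopHits can be iterated to obtain its individual hits.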

Have a look at more in-depth examples such as building a HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome in the Examples page of the online documentation.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running @1.10GHz with 6 physical / 12 logical cores, using a FASTA file containing 4,489 protein sequences extracted from the genome of Escherichia coli (562.PRJEB4685) and version 33.1 of the Pfam HMM library containing 18,259 domains. Commands were run 3 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation is that HMMER's platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
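
The recommendation above can be written as a small helper that picks a thread count per tool. This is only a sketch: suggested_cpus is a hypothetical name, the standard library reports logical cores only (os.cpu_count()), so the physical core count should be supplied by the caller, and the fallback of two hardware threads per core is an assumption that does not hold on every machine.

```python
import os


def suggested_cpus(tool, logical=None, physical=None):
    """Suggest a thread count following the benchmark observations."""
    if logical is None:
        logical = os.cpu_count() or 1
    if physical is None:
        # Assumption: 2 hardware threads per core (a common SMT setup).
        physical = max(1, logical // 2)
    if tool == "phmmer":
        return logical
    if tool == "hmmsearch":
        return physical
    raise ValueError(f"no benchmark-based suggestion for {tool!r}")


# Values matching the benchmark machine (6 physical / 12 logical cores).
print(suggested_cpus("phmmer", logical=12, physical=6))
print(suggested_cpus("hmmsearch", logical=12, physical=6))
```

The value returned for hmmsearch can then be passed as the cpus argument of pyhmmer.hmmsearch.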

To read more about how PyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.

🔍 See Also

Building a HMM from scratch? Then you may be interested in the pyfamsa package, providing bindings to FAMSA, a very fast multiple sequence aligner. In addition, you may want to trim alignments: in that case, consider pytrimal, which wraps trimAl 2.0.

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You can instead check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated, sponsored, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
