Cython bindings and Python interface to HMMER3.

Project description

🐍🟡♦️🟦 pyHMMER

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is maintained by members of the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python module, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you control, making it easier to pass your inputs to HMMER without needing to write them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles on some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs in disk storage is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which can help get the most out of multiple CPUs when annotating smaller sequence databases.

This library is still a work-in-progress, and in an experimental stage, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts pre-built CPython and PyPy wheels for x86-64 Linux and macOS, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer
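
For pipelines that install pyhmmer automatically, a pre-flight check along the following lines may help fail early on unsupported machines. This is only a sketch: the function name and the two sets are illustrative assumptions derived from the platforms named in this page, not an official support matrix.

```python
import platform

# Assumption: these sets mirror the platforms named in this page
# (x86-64 Linux and macOS wheels, UNIX PowerPC from source).
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}
SUPPORTED_MACHINES = {"x86_64", "amd64", "ppc64", "ppc64le"}


def platform_supported(system: str, machine: str) -> bool:
    """Return True if the (system, machine) pair is one HMMER is expected to build on."""
    return system in SUPPORTED_SYSTEMS and machine.lower() in SUPPORTED_MACHINES


if __name__ == "__main__":
    # Check the machine running this script.
    if platform_supported(platform.system(), platform.machine()):
        print("pyhmmer should be installable on this platform")
    else:
        print("HMMER does not support this platform")
```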

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch, and obtain an iterable over TopHits that can be used for further sorting/querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

# Load all target sequences into memory in digital (encoded) form.
with pyhmmer.easel.SequenceFile("tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

# Search every HMM in the file against the sequences, using 4 threads.
with pyhmmer.plan7.HMMFile("tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    all_hits = list(pyhmmer.hmmsearch(hmm_file, sequences, cpus=4))
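
The list of TopHits obtained this way can then be sorted and filtered with ordinary Python. The helper below is a sketch, not part of pyhmmer: best_hits and its max_evalue threshold are hypothetical, and it only assumes that iterating a TopHits yields hit objects exposing name, score and evalue attributes.

```python
from typing import Iterable, List, Tuple


def best_hits(all_hits: Iterable, max_evalue: float = 1e-5) -> List[Tuple[bytes, float, float]]:
    """Flatten hits from several TopHits-like objects, keep significant ones, sort by score."""
    rows = []
    for top_hits in all_hits:      # one TopHits per query HMM
        for hit in top_hits:       # one hit per matching target sequence
            if hit.evalue <= max_evalue:
                rows.append((hit.name, hit.score, hit.evalue))
    # Highest bit score first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

Calling best_hits(all_hits) returns (name, score, evalue) tuples that can be printed, written to a table, or passed to another library.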

Have a look at more in-depth examples such as building an HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome in the Examples page of the online documentation.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running @1.10GHz with 6 physical / 12 logical cores, using a FASTA file containing 2,100 protein sequences extracted from the genome of Anaerococcus provencensis (938293.PRJEB85.HG003687.faa) and version 33.1 of the Pfam HMM library containing 18,259 domains. Commands were run 4 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: benchmark run times for pressed HMMs and for HMMs in text format]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation is that HMMER's platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
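
The recommendation above can be turned into a small helper when choosing the cpus argument. This is a sketch only: suggested_cpus is a hypothetical function, and the core counts have to be supplied by the caller (os.cpu_count() reports logical cores; physical cores need a platform-specific query or a third-party library).

```python
def suggested_cpus(program: str, physical: int, logical: int) -> int:
    """Pick a thread count following the benchmark observations in this section."""
    if program == "phmmer":
        # phmmer appeared to benefit from simultaneous multi-threading.
        return logical
    if program == "hmmsearch":
        # hmmsearch did not scale beyond the number of physical cores.
        return physical
    raise ValueError(f"no benchmark-based suggestion for {program!r}")


# On the benchmark machine (6 physical / 12 logical cores):
print(suggested_cpus("hmmsearch", physical=6, logical=12))
print(suggested_cpus("phmmer", physical=6, logical=12))
```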

To read more about how pyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.

🔍 See Also

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You should then check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated with, sponsored, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

Project details


This version

0.6.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • pyhmmer-0.6.0.tar.gz (9.9 MB): Source

Built Distributions

  • pyhmmer-0.6.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl (9.2 MB): PyPy, macOS 10.9+ x86-64
  • pyhmmer-0.6.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl (9.2 MB): PyPy, macOS 10.9+ x86-64
  • pyhmmer-0.6.0-pp37-pypy37_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (9.3 MB): PyPy, manylinux: glibc 2.12+ x86-64, manylinux: glibc 2.5+ x86-64
  • pyhmmer-0.6.0-pp37-pypy37_pp73-macosx_10_9_x86_64.whl (9.2 MB): PyPy, macOS 10.9+ x86-64
  • pyhmmer-0.6.0-pp36-pypy36_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (9.3 MB): PyPy, manylinux: glibc 2.12+ x86-64, manylinux: glibc 2.5+ x86-64
  • pyhmmer-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (14.0 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.24+ x86-64
  • pyhmmer-0.6.0-cp310-cp310-macosx_10_15_x86_64.whl (9.7 MB): CPython 3.10, macOS 10.15+ x86-64
  • pyhmmer-0.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (14.0 MB): CPython 3.9, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.24+ x86-64
  • pyhmmer-0.6.0-cp39-cp39-macosx_10_15_x86_64.whl (9.7 MB): CPython 3.9, macOS 10.15+ x86-64
  • pyhmmer-0.6.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (14.2 MB): CPython 3.8, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.24+ x86-64
  • pyhmmer-0.6.0-cp38-cp38-macosx_10_14_x86_64.whl (9.6 MB): CPython 3.8, macOS 10.14+ x86-64
  • pyhmmer-0.6.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (13.8 MB): CPython 3.7m, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.24+ x86-64
  • pyhmmer-0.6.0-cp37-cp37m-macosx_10_14_x86_64.whl (9.6 MB): CPython 3.7m, macOS 10.14+ x86-64
  • pyhmmer-0.6.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (13.8 MB): CPython 3.6m, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.24+ x86-64
  • pyhmmer-0.6.0-cp36-cp36m-macosx_10_14_x86_64.whl (9.6 MB): CPython 3.6m, macOS 10.14+ x86-64

File details

Details for the file pyhmmer-0.6.0.tar.gz.

File metadata

  • Download URL: pyhmmer-0.6.0.tar.gz
  • Upload date:
  • Size: 9.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.12

File hashes

Hashes for pyhmmer-0.6.0.tar.gz
Algorithm Hash digest
SHA256 9579b481e09387b97548aa7fa8920f3ff143805849e146118cceeeff27ea04e5
MD5 b7f04ab85c1e81079df420418b3c2ca5
BLAKE2b-256 6634775e5243113c2d97ae0ed9e6a39f145549bf85c6215840b728fa9a3f6b89
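
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. The sketch below assumes the file was saved under its original name in the current directory; sha256_of is an illustrative helper, not part of pyhmmer or pip.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published for pyhmmer-0.6.0.tar.gz in the table above.
EXPECTED = "9579b481e09387b97548aa7fa8920f3ff143805849e146118cceeeff27ea04e5"

# Example (assumes the archive is in the current directory):
# assert sha256_of("pyhmmer-0.6.0.tar.gz") == EXPECTED
```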

File details

Details for the file pyhmmer-0.6.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 678f2898262bcc9eeb8969033937bdeae754cf3583d7eb1fa63f2d94853e663f
MD5 6256cec963bee3dccde95b8decc10434
BLAKE2b-256 5be57dae65e0e3f3724535d49307db46975b4eb1bc62cc77a1b2dd26ddc308a9

File details

Details for the file pyhmmer-0.6.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 9f00ad4b7fb881394a81c3edbf76e04dc20b07b78c2e0f09e48f5f9b2b2084f3
MD5 71d5c6d6ef230231481bfe66031f6f62
BLAKE2b-256 5e8d2743d1944b793b1182955de2143776ec786c239e6c69fb8da18fb4db17ed

File details

Details for the file pyhmmer-0.6.0-pp37-pypy37_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-pp37-pypy37_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 89982a0f9a3d6ee6671d2a45a3d26e27ad09620f0fca34b3f53f6a89098b8ecd
MD5 3d06c96b96352767d2d9817bb69b73c4
BLAKE2b-256 7b72119583107572a60f86e0f8578a7bf5d03928eeb3107d3e469336c5c41245

File details

Details for the file pyhmmer-0.6.0-pp37-pypy37_pp73-macosx_10_9_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-pp37-pypy37_pp73-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 45b22fa4b8a70d1d687e4553192d29ee770de2b19c3c93a77dc2dd1b9fa4f58b
MD5 337fe7c4860915b993c62840ba9491c8
BLAKE2b-256 7ce93a1f5a5aa46b482a3bca632fff9cb173c91ba5f02a96a3f5c0d730660519

File details

Details for the file pyhmmer-0.6.0-pp36-pypy36_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-pp36-pypy36_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 2d6880834f5aa4b48bf161bfd577a6496ecf26b26c14ce4b04c0781c735b6860
MD5 39636a51a17a316a86c837986c13f4f9
BLAKE2b-256 b52dcd5b9729a37b0cdbe932461c79f74c75d17ac665f080f95633a8c275852d

File details

Details for the file pyhmmer-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 2747651528b8392cb8d221a59fb9369f29e147b63a245b6ecaf2ce2428c9c492
MD5 9cacaf733be793796ebc514b07ef5276
BLAKE2b-256 6024ff258c00e8a3d8f9d0f3a34f32f3405bee99ef0339a6fc4d529b6970cc49

File details

Details for the file pyhmmer-0.6.0-cp310-cp310-macosx_10_15_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp310-cp310-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 44c5994b13f87d8ef564976d8371c96243d53377c0eb7eace78cb5b0690fbcd5
MD5 c2193d1f1364faf2e379ed98181e69d0
BLAKE2b-256 8381e44ca051aa0c14b3d371da0e11c53b8c97915c587a4d2d8958b761bca739

File details

Details for the file pyhmmer-0.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 c0803280408fee9738b71ab52b29998563840e093922c11b5f72b88b486d52d3
MD5 71635781c7a7b6c11ef359e9adfa33f7
BLAKE2b-256 a0ad68a14e424200099462bd97c58fefe3b428a6522ae11b2e2b195bbbd6d15a

File details

Details for the file pyhmmer-0.6.0-cp39-cp39-macosx_10_15_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp39-cp39-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 2c0462a34cb3bcc62f1bb861a79d84beea453f8df1b9cd947f5692a982c792ff
MD5 807787e2888cc8f800ac509f471ca0d8
BLAKE2b-256 e124e5cba7b51dc391c8bc45dd7f98e61804088f8cc35346b362dedc94fae471

File details

Details for the file pyhmmer-0.6.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 0add9591449fa3ba7e00694a4f6e7bcbf8ba3fdfd12968696cde45cd48d181c2
MD5 be61d1a185a3a3865aed4618f972e081
BLAKE2b-256 b46a8a3467282158fac08ea0dc646378789abe5dfa9e77af5f25a1fb658e3a3e

File details

Details for the file pyhmmer-0.6.0-cp38-cp38-macosx_10_14_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 a8b1667d4e90eab1aedc744840d7e104524f209b05c9fff5e35ec5e92c0f65e6
MD5 a5ecdfad61520f7605b2f3b2569aee8e
BLAKE2b-256 b085ec2e8f9b2eb9ac3c6a71f1a9cd29c43ffa2a0bdf23da96dcd6c8135b5a88

File details

Details for the file pyhmmer-0.6.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 1040559f17eec8fa3f7f0a19138c3bbe79a67238de4e89b0967569af657de545
MD5 ef90e63fed824cb6c6b3b5548ce862ea
BLAKE2b-256 c6379e560cd8eb371df3b31e9e60b9503b51155607f8ca1948cb4009a0c47519

File details

Details for the file pyhmmer-0.6.0-cp37-cp37m-macosx_10_14_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 0ad2559975cca19011be9c89bb1c802865a154aead10699e7a2e5ecdd468954d
MD5 74c800e2cc9434c0164f7af3bd24130c
BLAKE2b-256 e216950a9891b018accdc4cf2eac23cd4b2cc64eced8b6774a206ac9bdcfe79f

File details

Details for the file pyhmmer-0.6.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 491ee8ec3ba784571a42e711d331a29a268f9584bd0e39c4012d625ff1002b95
MD5 e4b3491d61866a58b40f9a7fc5815c31
BLAKE2b-256 aacbe9b4113ab1049e1f13828043ee41dec823111fa0d1ba95b8ed17aa2479da

File details

Details for the file pyhmmer-0.6.0-cp36-cp36m-macosx_10_14_x86_64.whl.

File hashes

Hashes for pyhmmer-0.6.0-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 4f24ef0a662387030e8476666c34a64613a7e53cbc05f1c308a55f18aec58c8c
MD5 14044d40bc033e40b353d505a2684551
BLAKE2b-256 98b44fb7c0d1956678b92ba1deac8e1f26fe6391e362a88aac7a1c6743fa85e2
