🐍🟡♦️🟦 PyHMMER

Cython bindings and Python interface to HMMER3.

🗺️ Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is maintained by members of the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python module, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers (like hmmer-py):

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end-user machine.
  • no intermediate files: Everything happens in memory, in Python objects you have control over, making it easier to pass your inputs to HMMER without needing to write them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.
  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a DigitalSequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).
  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences).
  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs in disk storage is typically as fast as directly using the hmmsearch binary (see the Benchmarks section). pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy compared to the hmmsearch binary from HMMER, which can help get the most out of multiple CPUs when annotating smaller sequence databases.

This library is still a work-in-progress, and in an experimental stage, but it should already pack enough features to run biological analyses or workflows involving hmmsearch, hmmscan, nhmmer, phmmer, hmmbuild and hmmalign.

🔧 Installing

pyhmmer can be installed from PyPI, which hosts some pre-built CPython wheels for x86-64 Linux, as well as the code required to compile from source with Cython:

$ pip install pyhmmer

Compilation for UNIX PowerPC is not tested in CI, but should work out of the box. Other architectures (e.g. Arm) and OSes (e.g. Windows) are not supported by HMMER.

A Bioconda package is also available:

$ conda install -c bioconda pyhmmer
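
After either install route, it can be handy to confirm from Python that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name installed_version is made up for this illustration and is not part of pyhmmer.

```python
import importlib.util
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration; it is not provided by pyhmmer.
    """
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but without distribution metadata (e.g. a bare source tree).
        return None


version = installed_version("pyhmmer")
print("pyhmmer", version if version is not None else "is not installed")
```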

📖 Documentation

A complete API reference can be found in the online documentation, or directly from the command line using pydoc:

$ pydoc pyhmmer.easel
$ pydoc pyhmmer.plan7

💡 Example

Use pyhmmer to run hmmsearch, and obtain an iterable over TopHits that can be used for further sorting/querying in Python. Processing happens in parallel using Python threads, and a TopHits object is yielded for every HMM passed in the input iterable.

import pyhmmer

# Load the target sequences into memory in digital (encoded) form.
with pyhmmer.easel.SequenceFile("tests/data/seqs/938293.PRJEB85.HG003687.faa", digital=True) as seq_file:
    sequences = list(seq_file)

# Search every HMM in the file against the sequences; one TopHits per HMM.
with pyhmmer.plan7.HMMFile("tests/data/hmms/txt/t2pks.hmm") as hmm_file:
    all_hits = list(pyhmmer.hmmsearch(hmm_file, sequences, cpus=4))
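
As an illustration of the "further sorting/querying in Python" mentioned above, here is a small sketch that keeps the hits under an E-value cutoff and orders them by bit score. It assumes each hit exposes name (as bytes), score and evalue attributes, as the pyhmmer.plan7.Hit class is documented to do; the significant_hits helper, the 1e-5 cutoff and the stand-in objects are hypothetical and only there so the sketch runs without HMMER data.

```python
from types import SimpleNamespace


def significant_hits(hits, max_evalue=1e-5):
    """Return (name, score, evalue) tuples under the cutoff, best score first.

    `hits` is any iterable of objects with `name` (bytes), `score` and
    `evalue` attributes. Hypothetical helper, not part of pyhmmer.
    """
    rows = [
        (hit.name.decode(), hit.score, hit.evalue)
        for hit in hits
        if hit.evalue <= max_evalue
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Stand-in objects mimicking the assumed attributes of a hit.
fake_hits = [
    SimpleNamespace(name=b"seq1", score=120.5, evalue=1e-30),
    SimpleNamespace(name=b"seq2", score=15.2, evalue=0.5),
    SimpleNamespace(name=b"seq3", score=250.0, evalue=1e-70),
]

for name, score, evalue in significant_hits(fake_hits):
    print(name, score, evalue)
```

With real results, the same function could be applied to each element of all_hits from the example above.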

Have a look at more in-depth examples such as building an HMM from an alignment, analysing the active site of a hit, or fetching marker genes from a genome in the Examples page of the online documentation.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⏱️ Benchmarks

Benchmarks were run on an i7-10710U CPU running @1.10GHz with 6 physical / 12 logical cores, using a FASTA file containing 2,100 protein sequences extracted from the genome of Anaerococcus provencensis (938293.PRJEB85.HG003687.faa) and version 33.1 of the Pfam HMM library containing 18,259 domains. Commands were run 4 times on a warm SSD. Solid lines show the times for pressed HMMs, and dashed lines the times for HMMs in text format.

[Figure: Benchmarks]

Raw numbers can be found in the benches folder. They suggest that phmmer should be run with the number of logical cores, while hmmsearch should be run with the number of physical cores (or fewer). A possible explanation for this observation would be that HMMER platform-specific code requires too many SIMD registers per thread to benefit from simultaneous multi-threading.
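
That rule of thumb can be written down as a small helper for picking a thread count. This is only a sketch: suggested_cpus is a made-up name, the standard library reports logical cores only, and the fallback of two hardware threads per physical core is an assumption that will not hold on every machine.

```python
import os


def suggested_cpus(program, logical_cores=None, physical_cores=None):
    """Suggest a thread count following the benchmark observation above.

    Hypothetical helper, not part of pyhmmer: phmmer gets all logical
    cores, every other program is capped at the physical core count.
    """
    if logical_cores is None:
        # os.cpu_count() reports logical cores and may return None.
        logical_cores = os.cpu_count() or 1
    if program == "phmmer":
        return logical_cores
    if physical_cores is None:
        # Assumption: two hardware threads per physical core.
        physical_cores = max(1, logical_cores // 2)
    return max(1, min(physical_cores, logical_cores))


# Values for the benchmark machine described above (6 physical / 12 logical).
print(suggested_cpus("phmmer", logical_cores=12, physical_cores=6))
print(suggested_cpus("hmmsearch", logical_cores=12, physical_cores=6))
```

The returned value could then be passed as the cpus argument shown in the Example section.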

To read more about how PyHMMER achieves better parallelism than HMMER for many-to-many searches, have a look at the Performance page of the documentation.

🔍 See Also

Building an HMM from scratch? Then you may be interested in the pyfamsa package, providing bindings to FAMSA, a very fast multiple sequence aligner. In addition, you may want to trim alignments: in that case, consider pytrimal, which wraps trimAl 2.0.

If, despite all the advantages listed earlier, you would rather use HMMER through its CLI, this package will not be of great help. You can instead check the hmmer-py package developed by Danilo Horta at the EMBL-EBI.

⚖️ License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license. See vendor/hmmer/LICENSE and vendor/easel/LICENSE for more information.

This project is in no way affiliated, sponsored, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyhmmer-0.6.2.tar.gz (9.9 MB view details)

Uploaded Source

Built Distributions

pyhmmer-0.6.2-pp39-pypy39_pp73-macosx_10_9_x86_64.whl (9.2 MB view details)

Uploaded PyPy macOS 10.9+ x86-64

pyhmmer-0.6.2-pp38-pypy38_pp73-macosx_10_9_x86_64.whl (9.2 MB view details)

Uploaded PyPy macOS 10.9+ x86-64

pyhmmer-0.6.2-pp37-pypy37_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (9.4 MB view details)

Uploaded PyPy manylinux: glibc 2.12+ x86-64 manylinux: glibc 2.5+ x86-64

pyhmmer-0.6.2-pp37-pypy37_pp73-macosx_10_9_x86_64.whl (9.2 MB view details)

Uploaded PyPy macOS 10.9+ x86-64

pyhmmer-0.6.2-pp36-pypy36_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (9.4 MB view details)

Uploaded PyPy manylinux: glibc 2.12+ x86-64 manylinux: glibc 2.5+ x86-64

pyhmmer-0.6.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (14.1 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.6.2-cp310-cp310-macosx_10_15_x86_64.whl (9.7 MB view details)

Uploaded CPython 3.10 macOS 10.15+ x86-64

pyhmmer-0.6.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (14.3 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.6.2-cp39-cp39-macosx_10_15_x86_64.whl (9.8 MB view details)

Uploaded CPython 3.9 macOS 10.15+ x86-64

pyhmmer-0.6.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (14.6 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.6.2-cp38-cp38-macosx_10_15_x86_64.whl (9.7 MB view details)

Uploaded CPython 3.8 macOS 10.15+ x86-64

pyhmmer-0.6.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (14.1 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.6.2-cp37-cp37m-macosx_10_15_x86_64.whl (9.7 MB view details)

Uploaded CPython 3.7m macOS 10.15+ x86-64

pyhmmer-0.6.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (14.2 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.17+ x86-64 manylinux: glibc 2.24+ x86-64

pyhmmer-0.6.2-cp36-cp36m-macosx_10_14_x86_64.whl (9.7 MB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64

