
Hidden Markov Model profile tools (reader/writer/data structures)


HMM_profile

License: MIT

Hidden Markov Model profile toolkit.

Written on the basis of the HMMER User's Guide, p. 107.

Usage

With this package you can read and write HMM profile files. It is easy to use and easy to read: well-written code is the best documentation, so don't be afraid to read the source.

Reader

Read all HMMs from a file

The read_all function returns a generator to reduce memory usage, since one file commonly contains many profiles.

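from hmm_profile import reader


with open('/your/hmm/profile/file.hmm') as f:
    model_generator = reader.read_all(f)  # IMPORTANT: returns generator
    # Consume the generator while the file is still open; a lazy generator
    # cannot read from a handle that has already been closed.
    profiles = list(model_generator)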
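
Because a generator does no work until it is iterated, it has to be consumed while the file handle is still open. The sketch below illustrates this with a hypothetical stand-in reader (read_blocks is not part of hmm_profile) and an in-memory file. It assumes only that records end with a `//` line, as in HMMER profile files.

```python
import io


def read_blocks(handle):
    """Hypothetical stand-in for a lazy reader: yield one '//'-terminated record at a time."""
    block = []
    for line in handle:
        if line.strip() == "//":
            yield "".join(block)
            block = []
        else:
            block.append(line)


text = "NAME a\n//\nNAME b\n//\n"

# Consuming inside the `with` block works: the handle is still open.
with io.StringIO(text) as f:
    names = [block.strip() for block in read_blocks(f)]

# Consuming after the block fails: the generator only starts reading
# when it is first iterated, and by then the handle is closed.
with io.StringIO(text) as f:
    lazy = read_blocks(f)
try:
    list(lazy)
    closed_error = False
except ValueError:
    closed_error = True
```

To process a large file without holding every profile in memory, loop over the generator inside the `with` block instead of calling list().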

Read single model

If your files contain only a single model each, you can use read_single. It returns a models.HMM instance, ready to use.

from hmm_profile import reader


with open('/your/hmm/profile/file.hmm') as f:
    model = reader.read_single(f) 

Writer

Write multiple profiles to a single file

from hmm_profile import writer

profiles = [...]
path = '/your/hmm/profile/file.hmm'

writer.save_many_to_file(hmms=profiles, output=path)

Write a single model to a file

from hmm_profile import writer

model = ...
path = '/your/hmm/profile/file.hmm'

writer.save_to_file(hmm=model, output=path)

Get file content without saving

from hmm_profile import writer

model = ...

lines = writer.get_lines(model)  # IMPORTANT: returns generator
content = ''.join(lines)

Support/bugs

If you have a file that cannot be read or that is corrupted on save, please create an issue and attach the file. Bug reports without files (or good examples, if you can't provide the full file) will be ignored.

Guarantees

Full database test

The badge above shows whether all HMM profiles from Pfam work. The tests run every day.

Test flow:

  1. Download all hmm profiles from Pfam.
  2. Load profiles sequentially.
  3. Write model to file.
  4. Load saved model from file.
  5. Check if both loaded profiles are equal.

This test uses the latest version of hmm_profile from PyPI.

The full DB test also runs before each release, but the badge above shows only the results of the periodic tests.
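
The load, write, reload and compare steps of the test flow can be sketched as follows. This is a toy illustration under stated assumptions, not the package's actual test: ToyProfile, write_profile and read_profiles are hypothetical stand-ins, and only two header fields (NAME and LENG) are modelled.

```python
import io
from dataclasses import dataclass


@dataclass
class ToyProfile:
    """Hypothetical stand-in for a parsed profile; not the package's models.HMM."""
    name: str
    length: int


def write_profile(profile, handle):
    """Serialise one toy profile, terminated by a '//' line."""
    handle.write(f"NAME {profile.name}\nLENG {profile.length}\n//\n")


def read_profiles(handle):
    """Lazily parse toy profiles from a file-like object."""
    fields = {}
    for line in handle:
        line = line.strip()
        if line == "//":
            yield ToyProfile(name=fields["NAME"], length=int(fields["LENG"]))
            fields = {}
        elif line:
            key, value = line.split(maxsplit=1)
            fields[key] = value


def round_trip_ok(source_text):
    """Load each profile, write it to an in-memory file, reload it, compare."""
    for original in read_profiles(io.StringIO(source_text)):
        buffer = io.StringIO()
        write_profile(original, buffer)
        buffer.seek(0)
        reloaded = next(read_profiles(buffer))
        if reloaded != original:  # dataclasses compare field by field
            return False
    return True


sample = "NAME globin\nLENG 149\n//\nNAME kinase\nLENG 260\n//\n"
result = round_trip_ok(sample)
```

Comparing the parsed objects rather than the raw text means the check tolerates formatting differences that do not change the model.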

Performance

The whole package is written in pure Python, without C extensions.

You can treat the full DB test as a benchmark.

The benchmark should depend mainly on single-core CPU performance, secondarily on storage, and lastly on RAM. Storage is used only for reading; the files are then written to an in-memory file (StringIO).

Remember: results may vary when the CPU is under load. Also, the HMM profiles in the database may be modified in the future, and profiles may be added to or removed from it.

Processor             Storage                      Time [s]  Profiles  Date        Version  Python
Intel Core i7-4702MQ  Crucial MX500 500 GB         342       17928     2020.02.22  0.0.9    3.7
Intel Core i7-4702MQ  Crucial MX500 500 GB         322       17928     2020.02.22  0.0.9    3.6
Intel Core i7-4702MQ  GoodRAM Iridium Pro 240 GB   TBA       TBA       TBA         TBA      3.6

To run the benchmark:

pip install .
export HMM_PROFILE_RUN_INTEGRITY_TESTS=TRUE
python setup.py test --addopts -s

Run the test at least 3 times if you want to share results (the last line of the output), and close as many other processes as possible. Important: do not run the tests in an IDE's built-in terminal; it does much more work on the output, which will affect the benchmark result.

As you can see, Python 3.6 is a little faster, probably due to a different implementation of the backported dataclasses, but I'm not sure.
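
If you want to time a workload of your own in the same spirit (several runs, one summary), a minimal sketch could look like the one below. The helper time_runs and the placeholder toy_workload are hypothetical and not part of the package; the real benchmark is the test command shown above.

```python
import statistics
import time


def time_runs(task, repeats=3):
    """Run `task` `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def toy_workload():
    # Placeholder CPU-bound work; swap in a real load/write/reload loop.
    return sum(i * i for i in range(100_000))


durations = time_runs(toy_workload, repeats=3)
summary = {"min": min(durations), "median": statistics.median(durations)}
```

Reporting the minimum or median of several runs reduces the influence of background load on a single run.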

Development

Release

  1. Change the version in setup.py to x.y.z.dev0 (or leave it as is for a minor version bump) and ensure the changelog is up to date. (A changelog entry of "Nothing changed yet." is not acceptable; CI will fail.)
  2. Tag the head of the master branch with x.y.z (without .dev0).

Important: releases are ALWAYS made from the master branch, so keep master untouched when you want to release.

