A native Python binding for the BLAT suite
PxBLAT
An Efficient and Ergonomic Python Binding Library for BLAT
Why PxBLAT?
When conducting extensive queries, using the `blat` program of the BLAT suite can prove quite inefficient, especially if the queries aren't grouped, for example when they are issued sporadically and interspersed among other tasks.
In general, the choice narrows down to either using `blat` alone or combining `gfServer` with `gfClient`.
Indeed, `blat` is a program that launches `gfServer`, conducts the sequence query via `gfClient`, and then terminates the server.
This approach is far from ideal when performing numerous ungrouped queries, since `blat` repeatedly initializes and shuts down `gfServer` for each query, resulting in substantial overhead.
This overhead consists of the time required for the server to index the reference, which depends on the reference's size.
Indexing the human genome (hg38), for example, takes approximately five minutes.
A more efficient solution is to initialize `gfServer` once and invoke `gfClient` multiple times for the queries.
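To make the overhead concrete, the following minimal sketch compares the two strategies under a simple cost model. The five-minute indexing time comes from the hg38 figure above; the one-second per-query time and both function names are assumptions chosen for illustration, not measurements.

```python
def repeated_blat_seconds(n_queries: int, index_seconds: float, query_seconds: float) -> float:
    """Total time when every query pays the indexing cost again."""
    return n_queries * (index_seconds + query_seconds)


def persistent_server_seconds(n_queries: int, index_seconds: float, query_seconds: float) -> float:
    """Total time when the reference is indexed once and the server is reused."""
    return index_seconds + n_queries * query_seconds


if __name__ == "__main__":
    index_s = 5 * 60  # roughly five minutes to index hg38, as stated in the text
    query_s = 1.0     # assumed per-query time, for illustration only
    for n in (1, 10, 100):
        repeated = repeated_blat_seconds(n, index_s, query_s)
        persistent = persistent_server_seconds(n, index_s, query_s)
        print(f"{n:>3} queries: repeated={repeated:.0f}s persistent={persistent:.0f}s")
```

Under this model the indexing cost dominates as soon as there is more than a handful of queries, which is why keeping the server alive matters.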
However, `gfServer` and `gfClient` are only accessible via the command line.
This necessitates managing system calls (for instance, `subprocess` or `os.system`), intermediate temporary files, and format conversion, further diminishing performance.
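For comparison, the command-line route described above might look like the following sketch. It assumes the `gfClient host port seqDir in.fa out.psl` calling convention and a server already listening on a hypothetical port 65000; `build_gfclient_command` and `query_via_cli` are illustrative names, not part of any library. By default it only builds the command and the temporary FASTA file, so it runs without BLAT installed.

```python
import subprocess
import tempfile
from pathlib import Path


def build_gfclient_command(host, port, seq_dir, fasta_path, psl_path):
    """Return the argument list for one call: gfClient host port seqDir in.fa out.psl."""
    return ["gfClient", host, str(port), seq_dir, str(fasta_path), str(psl_path)]


def query_via_cli(sequence, host="localhost", port=65000, seq_dir=".", run=False):
    """Write the query to a temporary FASTA file and optionally call gfClient.

    Returns the command that was built and the raw PSL text
    (the PSL text is empty unless run=True).
    """
    with tempfile.TemporaryDirectory() as tmp:
        fasta = Path(tmp) / "query.fa"
        psl = Path(tmp) / "out.psl"
        # Format conversion step: in-memory string -> FASTA file on disk.
        fasta.write_text(f">query\n{sequence}\n")
        cmd = build_gfclient_command(host, port, seq_dir, fasta, psl)
        psl_text = ""
        if run:  # needs gfClient on PATH and a running gfServer
            subprocess.run(cmd, check=True)
            # Second conversion step: PSL file on disk -> string to parse.
            psl_text = psl.read_text()
        return cmd, psl_text
```

Every query in this pattern costs one process launch, two temporary files, and two format conversions, which is the bookkeeping the text refers to.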
This is the gap `PxBLAT` fills.
It resolves the issues mentioned above while introducing handy features such as port retry and reuse of an already running server.
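The port retry idea can be sketched with the standard library alone. This is not PxBLAT's implementation; `port_is_free` and `find_free_port` are hypothetical helpers that only illustrate the technique: try a starting port and move on to the next one if it is already taken.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def find_free_port(host: str, start_port: int, max_tries: int = 10) -> int:
    """Return the first free port in [start_port, start_port + max_tries)."""
    last_port = min(start_port + max_tries, 65536)  # stay inside the valid port range
    for port in range(start_port, last_port):
        if port_is_free(host, port):
            return port
    raise RuntimeError(f"no free port found starting at {start_port}")
```

Note that a port found this way can still be claimed by another process before the server binds to it, so a real server would also need to handle a failed bind.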
🔮 Features
- Zero System Calls: Avoids system calls, leading to smoother, quicker operation.
- Ergonomics: With an ergonomic design, `PxBLAT` aims for a seamless user experience.
- No External Dependencies: `PxBLAT` operates independently without any external dependencies.
- Self-Monitoring: No need to trawl through log files; `PxBLAT` monitors its status internally.
- Robust Validation: Extensively tested to ensure reliable performance and stability on par with BLAT.
- Format-Agnostic: `PxBLAT` doesn't require you to worry about file formats.
- In-Memory Processing: `PxBLAT` removes the need for intermediate files by performing all operations in memory, ensuring speed and efficiency.
📎 Citation
PxBLAT is scientific software, described in a preprint on bioRxiv. See the entry below for the paper.
@article {Li2023pxblat,
author = {Yangyang Li and Rendong Yang},
title = {PxBLAT: An Ergonomic and Efficient Python Binding Library for BLAT},
elocation-id = {2023.08.02.551686},
year = {2023},
doi = {10.1101/2023.08.02.551686},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Summary: We introduce PxBLAT, a Python library designed to enhance usability and efficiency in interacting with the BLAST-like alignment tool (BLAT). PxBLAT provides an intuitive application programming interface (API) design, allowing the incorporation of its functionality directly into Python-based bioinformatics workflows. Besides, it integrates seamlessly with Biopython and comes equipped with user-centric features like server readiness checks and port retry mechanisms. PxBLAT removes the necessity for system calls and intermediate files, as well as reducing latency and data conversion overhead. Benchmark tests reveal PxBLAT gains a ~20\% performance boost compared to BLAT in the Python environment. Availability and Implementation: PxBLAT supports Python (version 3.8+), and pre-compiled packages are released via PyPI (https://pypi.org/project/pxblat/) and Bioconda (https://anaconda.org/bioconda/pxblat). The source code of PxBLAT is available under the terms of an open-source MIT license and hosted on GitHub (https://github.com/ylab-hi/pxblat). Its documentation is available on ReadTheDocs (https://pxblat.readthedocs.io/en/latest/). Competing Interest Statement: The authors have declared no competing interest.},
URL = {https://www.biorxiv.org/content/early/2023/08/05/2023.08.02.551686},
eprint = {https://www.biorxiv.org/content/early/2023/08/05/2023.08.02.551686.full.pdf},
journal = {bioRxiv}
}
🚀 Getting Started
Please see the documentation for details and more examples.
🤝 Contributing
Contributions are always welcome! Please follow these steps:
- Fork the project repository. This creates a copy of the project on your account that you can modify without affecting the original project.
- Clone the forked repository to your local machine using a Git client like Git or GitHub Desktop.
- Create a new branch with a descriptive name (e.g., `new-feature-branch` or `bugfix-issue-123`).
git checkout -b new-feature-branch
- Make changes to the project's codebase.
- Install the latest package
poetry install
- Test your changes
pytest -vlsx tests
- Commit your changes to your local branch with a clear commit message that explains the changes you've made.
git commit -m 'Implemented new feature.'
- Push your changes to your forked repository on GitHub using the following command
git push origin new-feature-branch
- Create a pull request to the original project repository. In the pull request, describe the changes you've made and why they're necessary. The project maintainers will review your changes and provide feedback or merge them into the main branch.
🪪 License
PxBLAT is modified from BLAT and is distributed under the same license as BLAT. The source code and executables are freely available for academic, nonprofit, and personal use. Commercial licensing information is available on the Kent Informatics website (https://kentinformatics.com/).
Contributors
- yangliz5 🚧
- Joshua Zhuang 🚇
🙏 Acknowledgments
Download files
Source Distribution
Built Distributions
File details
Details for the file pxblat-0.3.10.tar.gz.
File metadata
- Download URL: pxblat-0.3.10.tar.gz
- Upload date:
- Size: 626.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | a3ff4c8684ef47a2d642a1c72f853b8722d601cc5f50bf38ce583b042a550a28
MD5 | 35e275e9b85321a2601bd679847ad040
BLAKE2b-256 | 8eb12928f58ea35b0ac1497aac05367d29b8b210c38336c5e2cd110ab47524ac
File details
Details for the file pxblat-0.3.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 6.7 MB
- Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | c39794bee39ef9bc91415c25a8ac28885d6ff6468363dd275f8066cdb0a2c685
MD5 | 6e6ab6b19a1131f94e62079c287f290c
BLAKE2b-256 | efab77f9050153cafd564889ab51feb430f2d7f374a7dc1c7e2b70bd69d86cfa
File details
Details for the file pxblat-0.3.10-cp311-cp311-macosx_12_0_x86_64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp311-cp311-macosx_12_0_x86_64.whl
- Upload date:
- Size: 3.9 MB
- Tags: CPython 3.11, macOS 12.0+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 573772cd3d55ff42d0fe0f62081542d2d7508e172103dbb3dd6d221967113ae1
MD5 | 0143917fa5147913af2629bd3d6b22a6
BLAKE2b-256 | 0b500034576296ba282ffff873ce15c7729415b7fc18c26f2dc66304bec2c3fa
File details
Details for the file pxblat-0.3.10-cp311-cp311-macosx_12_0_arm64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp311-cp311-macosx_12_0_arm64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.11, macOS 12.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 64b1b492df83db9c4e45561c1d2c0f5520eb480d05905fc3a87a3e732618e00a
MD5 | 6cc3a03f797a37c71f83227e1de04374
BLAKE2b-256 | 6bde0bb8df084ca3249f2e85ddd6753fd2d345e4fea3d45fb8602bc9d28bf7a7
File details
Details for the file pxblat-0.3.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 6.7 MB
- Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | dc0bd6e288ccdbf20392339589f245ac81952e32112325186b52b8613eb00d3a
MD5 | 0f34ce6d7db1826ff34c2e1f3c61db44
BLAKE2b-256 | b6d189703353e40ad33440ea090c259e49b2db58402d12e8ad927d73e4b20a98
File details
Details for the file pxblat-0.3.10-cp310-cp310-macosx_12_0_x86_64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp310-cp310-macosx_12_0_x86_64.whl
- Upload date:
- Size: 3.9 MB
- Tags: CPython 3.10, macOS 12.0+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | b5a67717270cc0829d51e53701f4748da6e9532f2ad2ecbdd553ff18d8ac1464
MD5 | 74c24e99cf21fd6fa831fd4b766d58bc
BLAKE2b-256 | ed680996372e4620b34d8ecdc68fd548d8729b707efc9a01b77312a26147459b
File details
Details for the file pxblat-0.3.10-cp310-cp310-macosx_12_0_arm64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp310-cp310-macosx_12_0_arm64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.10, macOS 12.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 069f4ea5a91c2b8dde7a303ea2fb22645292ec4a6c7413d02588b2208b36ad7e
MD5 | d6f1f6404e520ede05a6d8208572c67e
BLAKE2b-256 | e3c1c19c45b20c33f46657ff039956ee3e4886c36d07a4d55819d54579c000a0
File details
Details for the file pxblat-0.3.10-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 6.7 MB
- Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | fc65c62afaba66b74af6eea58bcf465a8ee2330d22a08b90f7c6edf7ccc13996
MD5 | 1f3cd9ebfeab6f1c71b28178bdba2fc1
BLAKE2b-256 | 3e08206806ba0007840b98aab1787a047f7c46ee647599471a2b2618b9960d5d
File details
Details for the file pxblat-0.3.10-cp39-cp39-macosx_12_0_x86_64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp39-cp39-macosx_12_0_x86_64.whl
- Upload date:
- Size: 3.9 MB
- Tags: CPython 3.9, macOS 12.0+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 21c7e0cd9398abf8137a3b5da5f89ae6b5c4dc3afa5502e0ae888d093c25b861
MD5 | 24577fa900fadde9291e5d2dd4fa8250
BLAKE2b-256 | 63188ce319bd8f37970f31dceb1917333bfe6cc5b113d8b64657dcabd060c197
File details
Details for the file pxblat-0.3.10-cp39-cp39-macosx_12_0_arm64.whl.
File metadata
- Download URL: pxblat-0.3.10-cp39-cp39-macosx_12_0_arm64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.9, macOS 12.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | a254291113df358be9b38ffdb7d2680093f30a34725de2b08daf5065ea40314d
MD5 | a6cf4a60591c48b522f899a3718b2100
BLAKE2b-256 | b2fe586c6de5f7191812315ec25b83e05104492ab6ad31f3ef7883181dcb838f
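The SHA256 digests listed for each file can be checked after download. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are assumptions made for this example, as is the local path of the downloaded file.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large wheels do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, one would call `matches_published_digest("pxblat-0.3.10.tar.gz", "a3ff4c8684ef47a2d642a1c72f853b8722d601cc5f50bf38ce583b042a550a28")` from the directory the file was saved in.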