A native Python binding for the BLAT suite

An Efficient and Ergonomic Python Binding Library for BLAT

Why PxBLAT?

When conducting extensive queries, using the blat program from the BLAT suite can be quite inefficient, especially when the queries are not grouped but issued sporadically, interspersed among other tasks. In general, the choice is between using blat and combining gfServer with gfClient. In essence, blat launches gfServer, runs the sequence query via gfClient, and then terminates the server.

This approach is far from ideal when performing numerous ungrouped queries, because blat initializes and shuts down gfServer for every query, which adds substantial overhead. The overhead is the time the server needs to index the reference, and it depends on the reference's size. Indexing the human genome (hg38), for example, takes approximately five minutes.
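As a rough illustration of this overhead, the sketch below compares restarting the server for every query with starting it once. The five-minute indexing time is the hg38 figure quoted above; the one-second query time, the workload size, and the function names are assumptions chosen only for illustration.

```python
INDEX_SECONDS = 5 * 60  # approximate hg38 indexing time quoted above


def total_seconds_restart_each_time(n_queries, query_seconds):
    """Every query pays the indexing cost again (ungrouped blat runs)."""
    return n_queries * (INDEX_SECONDS + query_seconds)


def total_seconds_single_server(n_queries, query_seconds):
    """The index is built once and reused by all queries."""
    return INDEX_SECONDS + n_queries * query_seconds


if __name__ == "__main__":
    n, q = 100, 1.0  # hypothetical workload: 100 queries, 1 s each
    print(total_seconds_restart_each_time(n, q))  # 30100.0
    print(total_seconds_single_server(n, q))  # 400.0
```

Under these assumed numbers, nearly all of the time in the first case is spent re-indexing the reference rather than answering queries.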

A more efficient solution would involve initializing gfServer once and invoking gfClient multiple times for the queries. However, gfServer and gfClient are only accessible via the command line. This necessitates managing system calls (for instance, subprocess or os.system), intermediate temporary files, and format conversion, further diminishing performance.
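For comparison, here is a minimal sketch of the command-line workflow described above, driven through subprocess. It assumes the gfServer and gfClient executables are on PATH and uses their standard positional arguments; the helper names, file paths, and the missing readiness check are simplifications for illustration, not part of PxBLAT.

```python
import subprocess


def server_start_cmd(host, port, two_bit):
    """Command that starts gfServer and indexes the .2bit reference."""
    return ["gfServer", "start", host, str(port), two_bit]


def client_query_cmd(host, port, seq_dir, query_fa, out_psl):
    """Command that sends one FASTA file to the running server."""
    return ["gfClient", host, str(port), seq_dir, query_fa, out_psl]


def server_stop_cmd(host, port):
    """Command that asks the running server to shut down."""
    return ["gfServer", "stop", host, str(port)]


def run_queries(host, port, two_bit, seq_dir, jobs):
    """Start the server once, run every (query_fa, out_psl) job, then stop."""
    server = subprocess.Popen(server_start_cmd(host, port, two_bit))
    try:
        # A real script must wait here until indexing has finished,
        # for example by polling `gfServer status host port`.
        for query_fa, out_psl in jobs:
            cmd = client_query_cmd(host, port, seq_dir, query_fa, out_psl)
            subprocess.run(cmd, check=True)
    finally:
        subprocess.run(server_stop_cmd(host, port), check=False)
        server.wait()
```

Every query in this sketch still needs a FASTA file on disk as input and produces a PSL file that must be parsed afterwards, which is the temporary-file and format-conversion cost mentioned above.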

This is the gap PxBLAT fills. It resolves the issues mentioned above while adding handy features such as port retry and reuse of an already running server.
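To make the idea of port retry concrete, the following is a generic sketch using only the Python standard library. It is not PxBLAT's implementation; the function name, default port, and retry count are assumptions for illustration.

```python
import socket


def find_free_port(host="127.0.0.1", start=65000, max_tries=10):
    """Return the first port in [start, start + max_tries) that can be bound."""
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except (OSError, OverflowError):
                continue  # port is in use or out of range: try the next one
            return port
    raise RuntimeError(f"no free port found in {max_tries} tries from {start}")
```

The probe socket is closed before the port is returned, so another process could still claim the port in the meantime; a server that retries on a failed start covers that case as well.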

🔮 Features

  • Zero System Calls: Avoids system calls, leading to a smoother, quicker operation.
  • Ergonomics: With an ergonomic design, PxBLAT aims for a seamless user experience.
  • No External Dependencies: PxBLAT operates independently without any external dependencies.
  • Self-Monitoring: No need to trawl through log files; PxBLAT monitors its status internally.
  • Robust Validation: Extensively tested to ensure reliable performance and stability on par with BLAT.
  • Format-Agnostic: PxBLAT doesn't require you to worry about file formats.
  • In-Memory Processing: PxBLAT discards the need for intermediate files by doing all its operations in memory, ensuring speed and efficiency.

📎 Citation

PxBLAT is scientific software with a paper published on bioRxiv. See the published preprint to read the paper.

@article {Li2023pxblat,
	author = {Yangyang Li and Rendong Yang},
	title = {PxBLAT: An Ergonomic and Efficient Python Binding Library for BLAT},
	elocation-id = {2023.08.02.551686},
	year = {2023},
	doi = {10.1101/2023.08.02.551686},
	publisher = {Cold Spring Harbor Laboratory},
	abstract = {Summary: We introduce PxBLAT, a Python library designed to enhance usability and efficiency in interacting with the BLAST-like alignment tool (BLAT). PxBLAT provides an intuitive application programming interface (API) design, allowing the incorporation of its functionality directly into Python-based bioinformatics workflows. Besides, it integrates seamlessly with Biopython and comes equipped with user-centric features like server readiness checks and port retry mechanisms. PxBLAT removes the necessity for system calls and intermediate files, as well as reducing latency and data conversion overhead. Benchmark tests reveal PxBLAT gains a ~20\% performance boost compared to BLAT in the Python environment. Availability and Implementation: PxBLAT supports Python (version 3.8+), and pre-compiled packages are released via PyPI (https://pypi.org/project/pxblat/) and Bioconda (https://anaconda.org/bioconda/pxblat). The source code of PxBLAT is available under the terms of an open-source MIT license and hosted on GitHub (https://github.com/ylab-hi/pxblat). Its documentation is available on ReadTheDocs (https://pxblat.readthedocs.io/en/latest/). Competing Interest Statement: The authors have declared no competing interest.},
	URL = {https://www.biorxiv.org/content/early/2023/08/05/2023.08.02.551686},
	eprint = {https://www.biorxiv.org/content/early/2023/08/05/2023.08.02.551686.full.pdf},
	journal = {bioRxiv}
}

🚀 Getting Started

Please see the documentation for details and more examples.

🤝 Contributing

Contributions are always welcome! Please follow these steps:

  1. Fork the project repository. This creates a copy of the project under your account that you can modify without affecting the original project.
  2. Clone the forked repository to your local machine using a Git client such as Git or GitHub Desktop.
  3. Create a new branch with a descriptive name (e.g., new-feature-branch or bugfix-issue-123).
     git checkout -b new-feature-branch
  4. Make changes to the project's codebase.
  5. Install the latest package.
     poetry install
  6. Test your changes.
     pytest -vlsx tests
  7. Commit your changes to your local branch with a clear commit message that explains the changes you've made.
     git commit -m 'Implemented new feature.'
  8. Push your changes to your forked repository on GitHub using the following command:
     git push origin new-feature-branch
  9. Open a pull request against the original project repository. In the pull request, describe the changes you've made and why they're necessary. The project maintainers will review your changes and provide feedback or merge them into the main branch.

🪪 License

PxBLAT is modified from blat, so it is distributed under the same license as blat. The source code and executables are freely available for academic, nonprofit, and personal use. Commercial licensing information is available on the Kent Informatics website (https://kentinformatics.com/).

Contributors

  • yangliz5 🚧
  • Joshua Zhuang 🚇

🙏 Acknowledgments



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pxblat-0.3.6.tar.gz (626.3 kB)

Uploaded Source

Built Distribution

pxblat-0.3.6-cp310-cp310-manylinux_2_35_x86_64.whl (2.3 MB)

Uploaded CPython 3.10 manylinux: glibc 2.35+ x86-64

File details

Details for the file pxblat-0.3.6.tar.gz.

File metadata

  • Download URL: pxblat-0.3.6.tar.gz
  • Upload date:
  • Size: 626.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for pxblat-0.3.6.tar.gz
Algorithm Hash digest
SHA256 d83a1dc414789f7da59d1bf490d20f6544c7ddaa41df5da6d751cc8189c94b2f
MD5 0f0a9d77fec20d751d10ba6ba636412c
BLAKE2b-256 98eb63dd8db226a5d4cc5230f6beaaef3d97381add475b603d4942431786b508
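A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard hashlib module; the function names and the local file path in the usage note are assumptions for illustration.

```python
import hashlib

# SHA256 digest listed above for pxblat-0.3.6.tar.gz
EXPECTED_SDIST_SHA256 = (
    "d83a1dc414789f7da59d1bf490d20f6544c7ddaa41df5da6d751cc8189c94b2f"
)


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SDIST_SHA256):
    """Return True when the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, matches_published_hash("pxblat-0.3.6.tar.gz") should return True for an intact copy of the source distribution saved under that (assumed) local name.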

File details

Details for the file pxblat-0.3.6-cp310-cp310-manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for pxblat-0.3.6-cp310-cp310-manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 4594c6262703bb3b3e6a8b59f673017fca3cf0c862de11e6a5f28dfdd733bb7d
MD5 478840829521255b82e3883d6184a36d
BLAKE2b-256 8b37f0456df44427ae8f3a2211adb9ba9a0f10b7ef4b8d84e8b652bf43f034b9
