High-performance FASTQ parsing with Mojo-backed Python bindings
blazeseq (Python)
Python bindings for BlazeSeq — high-performance FASTQ parsing.
Wheels only: install from PyPI. No source build of the extension.
# Install (uv recommended)
uv pip install blazeseq
# Or with pip
pip install blazeseq
Quick start
import blazeseq
# quality_schema defaults to "generic"; use keyword args for clarity
parser = blazeseq.parser("file.fastq", quality_schema="sanger")
while parser.has_more():
    rec = parser.next_record()
    print(rec.id, rec.sequence)
Or use the iterator over records:
for rec in parser.records:
    print(rec.id, rec.sequence)
For batched iteration (default 100 records per batch):
for batch in parser.batches:
    for rec in batch:
        print(rec.id, rec.sequence)
Custom batch size:
for batch in parser.batches_with_size(50):
    for rec in batch:
        print(rec.id, rec.sequence)
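The batching behaviour described above (fixed-size batches, with a shorter final batch when the input runs out) can be illustrated without the extension installed. The sketch below is plain Python and is not BlazeSeq's implementation: `iter_batches` is a hypothetical helper name, and the strings stand in for records.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield lists of up to batch_size items; the last list may be shorter."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:  # input exhausted
            return
        yield batch


# Stand-in records (hypothetical): seven read identifiers, grouped three at a time.
read_ids = [f"read{i}" for i in range(7)]
batch_lengths = [len(batch) for batch in iter_batches(read_ids, batch_size=3)]
print(batch_lengths)  # [3, 3, 1]
```

Batching mainly reduces per-record call overhead between Python and the extension, so the best batch size depends on how much work is done per record.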
Gzip files: pass a path ending in .fastq.gz or .fq.gz and set parallelism to the number of decompression threads (default 4):
parser = blazeseq.parser("reads.fastq.gz", quality_schema="sanger", parallelism=8)
for rec in parser.records:
    print(rec.id, rec.sequence)
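To try the gzip path without real sequencing data, a small compressed FASTQ file can be generated with the standard library. This is a minimal sketch: the two reads are made up, and the file name only mirrors the example above.

```python
import gzip
import os
import tempfile

# Two made-up reads in four-line FASTQ layout: @id, bases, '+', qualities.
FASTQ_TEXT = (
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nTTGCA\n+\n!!5II\n"
)

tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "reads.fastq.gz")

# "wt" opens the gzip stream in text mode; newline="\n" keeps Unix line endings.
with gzip.open(gz_path, "wt", encoding="ascii", newline="\n") as handle:
    handle.write(FASTQ_TEXT)

# Read the file back with the standard library to confirm the round trip.
with gzip.open(gz_path, "rt", encoding="ascii") as handle:
    lines = handle.read().splitlines()

record_count = len(lines) // 4  # four lines per FASTQ record
print(gz_path, record_count)
```

The resulting gz_path can then be passed to blazeseq.parser in place of "reads.fastq.gz".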
API reference
Module-level
| Function | Description |
|---|---|
| parser(path, quality_schema="generic", parallelism=4) | Create a FASTQ parser. Supports .fastq, .fq, .fastq.gz, .fq.gz. quality_schema: "generic", "sanger", "solexa", "illumina_1.3", "illumina_1.5", "illumina_1.8". parallelism: decompression threads for gzip (default 4). Returns a parser supporting records, batches, batches_with_size(n), has_more(), next_record(), next_batch(n). |
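The schema names correspond to well-known FASTQ quality encodings, which differ mainly in the ASCII offset subtracted from each quality character. The sketch below uses the conventional offsets (33 for Sanger and Illumina 1.8+, 64 for Solexa and Illumina 1.3/1.5). Whether BlazeSeq maps its schema names to exactly these offsets is an assumption, the meaning of "generic" is not documented here, and `decode_quality` is a hypothetical helper, not part of the blazeseq API.

```python
from typing import Dict, List

# Conventional ASCII offsets for the named encodings (assumed, not taken
# from BlazeSeq's source). "generic" is omitted because its meaning is not
# documented in this README.
ASCII_OFFSETS: Dict[str, int] = {
    "sanger": 33,
    "illumina_1.8": 33,
    "solexa": 64,  # Solexa scores use a different scale and can be negative
    "illumina_1.3": 64,
    "illumina_1.5": 64,
}


def decode_quality(quality: str, schema: str = "sanger") -> List[int]:
    """Convert a quality string to integer scores by subtracting the offset."""
    offset = ASCII_OFFSETS[schema]
    return [ord(char) - offset for char in quality]


sanger_scores = decode_quality("!5I", "sanger")
illumina13_scores = decode_quality("@Th", "illumina_1.3")
print(sanger_scores)      # [0, 20, 40]
print(illumina13_scores)  # [0, 20, 40]
```

Choosing the wrong schema shifts every score by 31, which is why it helps to pass quality_schema explicitly when the data source is known.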
Parser (returned by parser / create_parser)
| Method / attribute | Description |
|---|---|
| has_more() | Return True if there may be more records to read. |
| next_record() | Return the next record as a FastqRecord. Raises on EOF or parse error. |
| next_ref_as_record() | Return the next record (from zero-copy ref) as a FastqRecord. Raises on EOF or parse error. |
| next_batch(max_records) | Return a batch of up to max_records records as a FastqBatch. Returns a partial batch at EOF. |
| records | Iterable over records: for rec in parser.records. |
| batches | Iterable over batches (default 100 records per batch): for batch in parser.batches then for rec in batch. |
| batches_with_size(batch_size) | Iterable over batches of the given size. |
| __iter__ / __next__ | Iterator protocol; equivalent to iterating over records. |
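Code that consumes a parser only through has_more() and next_record() can be exercised without a FASTQ file by substituting a stand-in object with the same two methods. The sketch below is a test double, not BlazeSeq code: `StubParser` and `drain` are hypothetical names, and raising EOFError is an assumption, since the exception type raised at EOF is not stated here.

```python
from typing import Iterator, List


class StubParser:
    """Hypothetical stand-in exposing has_more() and next_record() only."""

    def __init__(self, records: List[str]) -> None:
        self._records = list(records)
        self._position = 0

    def has_more(self) -> bool:
        return self._position < len(self._records)

    def next_record(self) -> str:
        if not self.has_more():
            # Assumed exception type; the real parser's is not documented here.
            raise EOFError("no more records")
        record = self._records[self._position]
        self._position += 1
        return record


def drain(parser) -> Iterator[str]:
    """Yield records from any object with has_more() and next_record()."""
    while parser.has_more():
        yield parser.next_record()


collected = list(drain(StubParser(["r1", "r2", "r3"])))
print(collected)  # ['r1', 'r2', 'r3']
```

Because has_more() is documented as returning True when there may be more records, a loop over the real parser may still need to handle an exception from next_record() at the end of input.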
FastqRecord
| Property / method | Description |
|---|---|
| id | Read identifier (without leading @). |
| sequence | Sequence line (bases). |
| quality | Quality line (raw quality string). |
| __len__() | Sequence length (number of bases). |
| phred_scores | Phred quality scores as a Python list of integers. |
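A common use of phred_scores is filtering reads by mean quality. The sketch below relies only on the id and phred_scores attributes listed in the table, so it runs against stand-in objects. `mean_phred`, `keep_high_quality`, and the threshold of 20 are illustrative choices, not part of the blazeseq API.

```python
from statistics import mean
from types import SimpleNamespace


def mean_phred(record) -> float:
    """Average of record.phred_scores; 0.0 for an empty read."""
    scores = record.phred_scores
    return mean(scores) if scores else 0.0


def keep_high_quality(records, min_mean: float = 20.0):
    """Yield records whose mean Phred score is at least min_mean."""
    for record in records:
        if mean_phred(record) >= min_mean:
            yield record


# Stand-in records with made-up values (offset 33: 'I' -> 40, '#' -> 2).
stand_ins = [
    SimpleNamespace(id="good", sequence="ACGT", quality="IIII",
                    phred_scores=[40, 40, 40, 40]),
    SimpleNamespace(id="poor", sequence="ACGT", quality="####",
                    phred_scores=[2, 2, 2, 2]),
]
kept_ids = [record.id for record in keep_high_quality(stand_ins)]
print(kept_ids)  # ['good']
```

With the real parser, the same generator could presumably be applied to parser.records, since it depends only on the documented attributes.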
FastqBatch
| Method | Description |
|---|---|
| num_records() | Number of records in the batch. |
| get_record(index) | Return the record at the given index as a FastqRecord. |
| __iter__ | Iterate over records: for rec in batch. |
Local development (uv)
From the repo root, after building the Mojo extension into python/blazeseq/_extension/:
uv pip install -e python/
uv run python tests/test_python_bindings.py
Build and upload to PyPI
Prerequisites: Build the Mojo extension so that python/blazeseq/_extension/ contains the platform wheel (.so). Ensure version in pyproject.toml is bumped for releases.
- Install build tools and twine:
  uv pip install build twine
- Build the package (from the python/ directory):
  cd python
  uv run python -m build
  This produces dist/blazeseq-<version>.tar.gz (sdist) and dist/blazeseq-<version>-*.whl (wheel).
- Upload to PyPI (use a PyPI API token; create one at pypi.org):
  uv run twine upload dist/*
  For Test PyPI first:
  uv run twine upload --repository testpypi dist/*
File details
Details for the file blazeseq-0.3.0-py3-none-any.whl.
File metadata
- Download URL: blazeseq-0.3.0-py3-none-any.whl
- Upload date:
- Size: 133.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 95f317d88e52b5ee1c6d319b961d38c856589f7275530124c4cc5b0ed3370771 |
| MD5 | 0f2e2779f38385e8ce4ce68743e6481b |
| BLAKE2b-256 | b6c8917ab12d114931f5b6042eb7e31982a603d9bc6165b7f88cb2cbed162a2e |
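A downloaded wheel can be checked against the SHA256 digest listed above using the standard library. The sketch below is one way to do it: `sha256_of_file` and `matches_digest` are hypothetical helper names, and the demonstration hashes a small temporary file rather than the real wheel.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file containing the bytes b"abc".
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(demo_path, "wb") as handle:
    handle.write(b"abc")

demo_digest = sha256_of_file(demo_path)
demo_ok = matches_digest(demo_path, hashlib.sha256(b"abc").hexdigest())
print(demo_digest, demo_ok)
```

For the real file, pass the local path of blazeseq-0.3.0-py3-none-any.whl and the SHA256 value from the table to matches_digest.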