A Zarr v3 convention for per-base resolution genomic data

pbzarr

A Python library for PBZ (Per-Base Zarr), a Zarr v3 convention for storing per-base resolution genomic data such as read depths, methylation levels, and boolean masks.

PBZ is a modern alternative to D4 and bigWig, leveraging the Zarr ecosystem for compression, chunking, and cloud-native access.

Installation

pip install pbzarr

With optional Dask support:

pip install "pbzarr[dask]"

Quick Start

import pbzarr

# Create a store
store = pbzarr.create(
    "sample.pbz.zarr",
    contigs=["chr1", "chr2"],
    contig_lengths=[248_956_422, 242_193_529],
)

# Add a track
track = store.create_track("depths", dtype="uint32", columns=["sample_A", "sample_B"])

# Write data
import numpy as np
track["chr1", 0:1000] = np.random.randint(0, 100, size=(1000, 2), dtype="uint32")

# Query data
data = track.query("chr1:0-1000", columns="sample_A")

# Open an existing store
store = pbzarr.open("sample.pbz.zarr")
track = store["depths"]

# Slice-based access
data = track["chr1", 0:1000, "sample_A"]

# Dask backend for lazy/parallel computation
store = pbzarr.open("sample.pbz.zarr", backend="dask")
lazy = store["depths"].query("chr1:0-1000000")
result = lazy.compute()
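To see why chunked storage suits region queries like the ones above, here is a rough sketch of the arithmetic involved. It does not use pbzarr at all: the chunk length of 1,000,000 bases is a hypothetical value chosen for illustration (not a documented pbzarr default), and `chunks_for_region` is a made-up helper name. Only the chr1 length, the `uint32` dtype, and the two-column layout come from the Quick Start.

```python
def chunks_for_region(start: int, end: int, chunk_len: int) -> range:
    """Indices of the chunks overlapped by the half-open interval [start, end)."""
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    if not 0 <= start <= end:
        raise ValueError("expected 0 <= start <= end")
    if start == end:
        return range(0)  # empty region touches no chunks
    first = start // chunk_len
    last = (end - 1) // chunk_len  # end is exclusive
    return range(first, last + 1)


CHR1_LENGTH = 248_956_422  # contig length from the Quick Start
CHUNK_LEN = 1_000_000      # ASSUMPTION: illustrative chunk length only
N_COLUMNS = 2              # sample_A and sample_B
ITEMSIZE = 4               # bytes per uint32 value

# Ceiling division: number of chunks needed to cover chr1 along the position axis.
total_chunks = (CHR1_LENGTH + CHUNK_LEN - 1) // CHUNK_LEN

# Uncompressed size of a dense two-column uint32 track over chr1.
raw_bytes = CHR1_LENGTH * N_COLUMNS * ITEMSIZE

# A 1 kb query only needs the first chunk, not the whole contig.
touched = chunks_for_region(0, 1000, CHUNK_LEN)

print(total_chunks, raw_bytes, list(touched))
```

Under these assumptions the full chr1 track is close to 2 GB uncompressed, while a 1 kb query reads a single chunk out of 249. The real numbers depend on whatever chunk shape and codecs a given track is created with.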

Features

  • Zarr v3 only, with full codec and storage backend support
  • NumPy and Dask backends for eager or lazy computation
  • Region query syntax: "chr1:1000-2000", tuples, or slice notation
  • Column filtering by name or index
  • Escape hatches to raw zarr.Group and zarr.Array objects
  • Self-describing tracks with independent dtype, chunking, and metadata
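As an illustration of the region query syntax listed above, the following is a minimal, standalone sketch of how a string such as "chr1:1000-2000" could be turned into a contig name and a slice. It is not pbzarr's parser: `Region`, `parse_region`, and `to_slice` are hypothetical names, and treating the coordinates as 0-based and half-open is an assumption inferred from the Quick Start, where "chr1:0-1000" appears alongside the slice `0:1000`.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    contig: str
    start: int
    end: int


# Greedy contig match so names that contain ':' (e.g. some HLA contigs) still parse.
_REGION_RE = re.compile(r"(?P<contig>.+):(?P<start>\d[\d,]*)-(?P<end>\d[\d,]*)")


def parse_region(text: str) -> Region:
    """Parse 'contig:start-end' into a Region; commas in numbers are ignored."""
    match = _REGION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return Region(match["contig"], start, end)


def to_slice(region: Region) -> slice:
    """ASSUMPTION: coordinates are 0-based, half-open, so they map directly to a slice."""
    return slice(region.start, region.end)


region = parse_region("chr1:1000-2000")
print(region.contig, to_slice(region))
```

A parser along these lines lets the string, tuple, and slice forms all reduce to the same `(contig, slice)` pair before any array indexing happens.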
