Python implementation of the iopsystems h2 histogram, interoperable with Rezolus

Maintainers

thinkingfish

h2histogram-py

A pure-Python implementation of the iopsystems h2 histogram.

h2histogram produces histograms with byte-for-byte identical bucketing to the Rust histogram crate, so histograms recorded here can be consumed by Rezolus — and, conversely, you can open a Parquet/Arrow column of h2histogram values produced by Rezolus and analyze it in Python.

What is an h2 histogram?

An h2 histogram quantizes values into buckets using two parameters:

- grouping_power: each power-of-two range is split into 2^grouping_power buckets, which bounds the relative error at 2^-grouping_power (e.g. grouping_power=7 → ~0.78% error).
- max_value_power: the largest representable value is 2^max_value_power - 1.

Values below 2^(grouping_power+1) are stored exactly (linear buckets of width 1); larger values fall into logarithmic buckets. This gives HDR-histogram-like guarantees with a simpler, faster bucket index computation. Rezolus records histograms with grouping_power=3 and max_value_power=64.
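
The index computation described above can be sketched in a few lines. This is an illustration derived from the description in this section, not code taken from the library; the function names `value_to_index` and `total_buckets` here are standalone helpers and may not match the signatures of the `Config` methods.

```python
def value_to_index(value: int, grouping_power: int, max_value_power: int = 64) -> int:
    """Map a value to a bucket index (sketch of the scheme described above)."""
    if not 0 <= value < 2 ** max_value_power:
        raise ValueError("value out of range")
    cutoff_power = grouping_power + 1
    if value < (1 << cutoff_power):
        return value  # linear region: one bucket per value
    power = value.bit_length() - 1        # floor(log2(value))
    segment = power - cutoff_power        # which power-of-two range past the linear region
    # Position inside the range, in units of the bucket width 2^(power - grouping_power).
    offset = (value - (1 << power)) >> (power - grouping_power)
    return (1 << cutoff_power) + (segment << grouping_power) + offset


def total_buckets(grouping_power: int, max_value_power: int = 64) -> int:
    """Linear buckets plus 2^grouping_power buckets per remaining power of two."""
    cutoff_power = grouping_power + 1
    return (1 << cutoff_power) + (max_value_power - cutoff_power) * (1 << grouping_power)


# With grouping_power=3 the linear region covers 0..15:
#   value_to_index(15, 3) == 15   (exact)
#   value_to_index(16, 3) == 16 and value_to_index(17, 3) == 16   (width-2 bucket)
#   value_to_index(32, 3) == 24   (first bucket of the next power of two)
```

Under these assumptions the bucket width in a logarithmic range starting at 2^p is 2^(p - grouping_power), so the width never exceeds 2^-grouping_power of the bucket's lower bound, which is where the relative-error figure comes from.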

Install

pip install h2histogram              # core library (no dependencies)
pip install "h2histogram[parquet]"   # + pyarrow, for the Arrow/Parquet interop
pip install "h2histogram[numpy]"     # + numpy, for a vectorized bulk-record fast path

For local development from a checkout:

pip install -e ".[dev]"
pytest

Quick start

from h2histogram import Histogram

h = Histogram(grouping_power=7, max_value_power=64)
h.increment(42)                           # one observation
h.record(1000, count=5)                   # five observations of the same value
h.record_many([12, 15, 900, 1_000_000])   # bulk (uses numpy if available)

print(h.total_count())          # 10 (1 + 5 + 4)
p99 = h.percentile(0.99)        # a Bucket
print(p99.range, p99.midpoint)  # ((..lo.., ..hi..), midpoint estimate)

# Combine / reduce
other_h = Histogram(grouping_power=7, max_value_power=64)  # a second histogram with the same config
other_h.increment(7)
merged = h.merge(other_h)       # element-wise sum (also: h + other_h)
coarse = h.downsample(4)        # fewer buckets, higher error, same total count
sparse = h.to_sparse()          # columnar (index, count) form for storage

Fast repeated quantile queries

For a snapshot you'll query many times, convert to a CumulativeHistogram (the crate's CumulativeROHistogram). It stores non-zero buckets with cumulative counts, so percentiles are answered with a binary search, and it precomputes a midpoint-estimated mean:

c = h.to_cumulative()           # read-only; also SparseHistogram.to_cumulative()
c.percentile(0.99)              # O(log n) binary search -> Bucket (individual count)
c.mean()                        # midpoint-estimated mean, computed once
c.bucket_quantile_range(0)      # (lower, upper) quantile fraction of a stored bucket
for bucket, lo, hi in c.iter_with_quantiles():
    ...                         # each non-zero bucket with its quantile span
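
To show why the cumulative form makes repeated queries cheap, here is a minimal standalone sketch of the idea: keep only non-zero buckets, store running totals, and answer each percentile with a binary search. The class name `CumulativeSketch`, the `(start, end, count)` tuples, and the rounding rule are all assumptions made for illustration; the library's `CumulativeHistogram` may differ in detail.

```python
from bisect import bisect_left
from itertools import accumulate
from math import ceil


class CumulativeSketch:
    """Read-only snapshot: non-zero buckets plus their cumulative counts."""

    def __init__(self, buckets):
        # buckets: non-empty list of (start, end, count), sorted by start, all counts > 0
        self.buckets = list(buckets)
        self.cumulative = list(accumulate(count for _, _, count in self.buckets))

    def percentile(self, p: float):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        total = self.cumulative[-1]
        # Rank of the target observation (assumed rounding: ceil, at least 1).
        target = max(1, ceil(p * total))
        # First bucket whose cumulative count reaches the target rank.
        return self.buckets[bisect_left(self.cumulative, target)]

    def mean(self) -> float:
        # Midpoint estimate: treat every observation as sitting at its bucket's midpoint.
        total = self.cumulative[-1]
        weighted = sum(count * (start + end) / 2 for start, end, count in self.buckets)
        return weighted / total


# Hypothetical snapshot with three non-zero buckets (10 observations in total).
snapshot = CumulativeSketch([(5, 5, 1), (16, 17, 5), (32, 35, 4)])
median_bucket = snapshot.percentile(0.5)    # (16, 17, 5)
p99_bucket = snapshot.percentile(0.99)      # (32, 35, 4)
```

Each query costs one binary search over the non-zero buckets rather than a scan of the full dense array, which matters when a snapshot has thousands of mostly empty buckets.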

Reading histograms from a Rezolus Parquet file

Rezolus writes one row per sample interval. Histogram metrics are stored as a dense "{metric}:buckets" column, or a sparse "{metric}:bucket_indices" / "{metric}:bucket_counts" pair — all List<UInt64>.

from h2histogram.arrow import histogram_columns, read_histograms

# Discover histogram metrics in the file
for col in histogram_columns("rezolus.parquet"):
    print(col.name, col.kind)   # e.g. "syscall/read/latency standard"

# Read a metric's time series: one Histogram per row (None for missing rows)
series = read_histograms("rezolus.parquet", "syscall/read/latency")

for i, hist in enumerate(series):
    if hist is not None:
        print(i, hist.percentile(0.99).midpoint)

# Aggregate the whole recording, skipping missing rows
present = [hist for hist in series if hist is not None]
if present:
    total = present[0]
    for hist in present[1:]:
        total = total.merge(hist)
    print("overall p99:", total.percentile(0.99).midpoint)
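
The dense and sparse column layouts mentioned at the start of this section carry the same information. The sketch below converts one row between the two using plain Python lists; the helper names are hypothetical, and the bucket count of 496 is an assumption that corresponds to grouping_power=3, max_value_power=64.

```python
def sparse_to_dense(indices, counts, total_buckets):
    """Expand a (bucket_indices, bucket_counts) pair into a dense bucket list."""
    if len(indices) != len(counts):
        raise ValueError("indices and counts must have the same length")
    dense = [0] * total_buckets
    for index, count in zip(indices, counts):
        dense[index] = count
    return dense


def dense_to_sparse(dense):
    """Keep only the non-zero buckets, as parallel index and count lists."""
    indices = [i for i, count in enumerate(dense) if count]
    counts = [dense[i] for i in indices]
    return indices, counts


# One hypothetical row: three non-zero buckets out of 496.
row_indices, row_counts = [12, 63, 200], [4, 1, 7]
dense_row = sparse_to_dense(row_indices, row_counts, total_buckets=496)
round_trip = dense_to_sparse(dense_row)   # ([12, 63, 200], [4, 1, 7])
```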

The bucketing config is resolved from (in order): an explicit config=/grouping_power= argument, grouping_power/max_value_power recorded in the field metadata, inference from a dense column's bucket count, and finally the Rezolus defaults (grouping_power=3, max_value_power=64).
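
The resolution order above could look roughly like the following. This is a sketch under stated assumptions, not the library's implementation: `resolve_config` and `infer_grouping_power` are hypothetical names, metadata is modelled as a plain dict of strings, and the inference step assumes max_value_power=64 and the bucket-count formula (max_value_power - grouping_power + 1) * 2^grouping_power implied by the bucketing scheme.

```python
REZOLUS_DEFAULT = (3, 64)  # (grouping_power, max_value_power)


def infer_grouping_power(bucket_count, max_value_power=64):
    """Find the grouping_power whose total bucket count matches, if any."""
    for grouping_power in range(max_value_power):
        if (max_value_power - grouping_power + 1) * (1 << grouping_power) == bucket_count:
            return grouping_power
    return None


def resolve_config(explicit=None, metadata=None, dense_bucket_count=None):
    """Return (grouping_power, max_value_power) using the documented priority order."""
    if explicit is not None:                       # 1. explicit argument
        return explicit
    metadata = metadata or {}
    if "grouping_power" in metadata and "max_value_power" in metadata:
        return int(metadata["grouping_power"]), int(metadata["max_value_power"])  # 2. field metadata
    if dense_bucket_count is not None:             # 3. inference from a dense column
        grouping_power = infer_grouping_power(dense_bucket_count)
        if grouping_power is not None:
            return grouping_power, 64
    return REZOLUS_DEFAULT                         # 4. Rezolus defaults


from_dense = resolve_config(dense_bucket_count=7424)   # (7, 64)
fallback = resolve_config()                            # (3, 64)
```

Inference is unambiguous in this sketch because the bucket count grows strictly with grouping_power for a fixed max_value_power.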

Writing a Rezolus-compatible file

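from h2histogram.arrow import write_histograms

# `series` is a list of Histogram objects (e.g. from read_histograms above);
# `timestamps_ns` is a caller-supplied list with one timestamp per row.
write_histograms(
    "out.parquet",
    {"syscall/read/latency": series},   # {metric_name: [Histogram, ...]}
    timestamps=timestamps_ns,           # optional; one per row
    histogram_type="standard",          # or "sparse"
)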

Files written this way match the metriken/Rezolus column layout and additionally record grouping_power/max_value_power in the field metadata so they are fully self-describing on read.

See the runnable examples in examples/:

- basic_usage.py: record and query percentiles
- read_rezolus_parquet.py: open a Parquet column of h2histogram values (synthesizes a sample file if you don't pass one)

API overview

Type	Purpose
`Config`	Bucketing parameters; `value_to_index`, `index_to_range`, `total_buckets`, `error`
`Histogram`	Dense histogram; `increment`, `record`, `record_many`, `percentile(s)`, `merge`, `subtract`, `downsample`, `to_sparse`, `to_cumulative`, `from_buckets`
`SparseHistogram`	Columnar `(index, count)` form; `from_histogram`, `from_parts`, `to_dense`, `to_cumulative`
`CumulativeHistogram`	Read-only cumulative form (crate's `CumulativeROHistogram`); binary-search `percentile(s)`, `mean`, `bucket_quantile_range`, `iter_with_quantiles`
`Bucket`	A bucket's `count` and inclusive `[start, end]` range, plus `midpoint`/`width`
`h2histogram.arrow`	Read/write the Rezolus Arrow/Parquet layout

Correctness

The bucketing math is verified against the exact assertions from the Rust crate's own unit tests (src/config.rs), and the NumPy bulk-record fast path is checked against the scalar path across the full u64 range. Run pytest to see for yourself.

License

MIT — see LICENSE.

Release history

This version

0.1.0

Jul 3, 2026

Download files

Source Distribution

h2histogram-0.1.0.tar.gz (23.9 kB)

Uploaded Jul 3, 2026 Source

Built Distribution

h2histogram-0.1.0-py3-none-any.whl (20.6 kB)

Uploaded Jul 3, 2026 Python 3

File details

Details for the file h2histogram-0.1.0.tar.gz.

File metadata

Download URL: h2histogram-0.1.0.tar.gz
Upload date: Jul 3, 2026
Size: 23.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for h2histogram-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`537f517523f41656f93e7ac887d62812080b77c6e2712c44db0dc8fbffc06b70`
MD5	`23ceb9c139bcb42ee801bd3cba7b6c40`
BLAKE2b-256	`48cc7faecd6fc466ea56207c907cda20ba340c83fc77ddb993e08d7b9a382fa1`
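
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; the helper names and the local file path in the usage note are assumptions, and only the digest itself comes from the table.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for h2histogram-0.1.0.tar.gz (copied from the table above).
EXPECTED_SHA256 = "537f517523f41656f93e7ac887d62812080b77c6e2712c44db0dc8fbffc06b70"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_digest("h2histogram-0.1.0.tar.gz")` should return True for an intact copy of the source distribution saved under that name in the current directory.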

Provenance

The following attestation bundles were made for h2histogram-0.1.0.tar.gz:

Publisher: release.yml on iopsystems/h2histogram-py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: h2histogram-0.1.0.tar.gz
- Subject digest: 537f517523f41656f93e7ac887d62812080b77c6e2712c44db0dc8fbffc06b70
- Sigstore transparency entry: 2055950567
- Sigstore integration time: Jul 3, 2026
Source repository:
- Permalink: iopsystems/h2histogram-py@1700ee6968d33b49fc28db44f8169681161a3b7a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/iopsystems
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@1700ee6968d33b49fc28db44f8169681161a3b7a
- Trigger Event: release

File details

Details for the file h2histogram-0.1.0-py3-none-any.whl.

File metadata

Download URL: h2histogram-0.1.0-py3-none-any.whl
Upload date: Jul 3, 2026
Size: 20.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for h2histogram-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6ee03444fc8015b5bfd20ebdb3eb0b3652dabefd918520edbf779c1340483201`
MD5	`9aa38f911b73e6e3429e617b8086f781`
BLAKE2b-256	`e6c7099467017b646361a75e36d2b1cfd29e35f15e15dcf2ba3797706ec1148a`

Provenance

The following attestation bundles were made for h2histogram-0.1.0-py3-none-any.whl:

Publisher: release.yml on iopsystems/h2histogram-py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: h2histogram-0.1.0-py3-none-any.whl
- Subject digest: 6ee03444fc8015b5bfd20ebdb3eb0b3652dabefd918520edbf779c1340483201
- Sigstore transparency entry: 2055950611
- Sigstore integration time: Jul 3, 2026
Source repository:
- Permalink: iopsystems/h2histogram-py@1700ee6968d33b49fc28db44f8169681161a3b7a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/iopsystems
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@1700ee6968d33b49fc28db44f8169681161a3b7a
- Trigger Event: release
