
Histogram data into fixed-width bins, with 2**64 (almost unbounded) bins available.

Project description


This UnBoundHistogram has bins with a fixed width. It is sparse and thus does not allocate memory for bins with zero content. Its range is almost unbounded (restricted only by integer limits). Bins are allocated and populated as needed during assignment. Making a histogram over an almost unbounded range is useful when one does not know the range of the data in advance and when streaming through the data is costly. UnBoundHistogram was created to histogram vast streams of data generated in costly simulations for particle physics. Buzzword bingo: big data.
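The idea of a sparse, fixed-width histogram can be sketched in a few lines. This is only an illustration of the concept, not the package's implementation: the helper name `assign_sparse` and the plain `dict` storage are assumptions made for this example. Each value is mapped to an integer bin index with `floor(x / bin_width)`, and a bin is created only when a value first lands in it.

```python
import numpy as np


def assign_sparse(bins, x, bin_width):
    """Add the values in x to the sparse histogram `bins` (a dict).

    Hypothetical helper for illustration; not part of un_bound_histogram.
    """
    # Fixed-width binning: the integer bin index is floor(x / bin_width).
    idx = np.floor(np.asarray(x, dtype=float) / bin_width).astype(np.int64)
    # Count how often each index occurs, then touch only those bins.
    unique_idx, counts = np.unique(idx, return_counts=True)
    for i, c in zip(unique_idx, counts):
        key = int(i)
        bins[key] = bins.get(key, 0) + int(c)
    return bins


bins = {}
assign_sparse(bins, [0.2, 0.3, 12.3, -3.9], bin_width=0.5)
# A second assignment grows the histogram; a far-away value costs one bin.
assign_sparse(bins, [0.1, 1e6], bin_width=0.5)
```

Only four bins exist after these two calls, even though the populated indices span from -8 to 2000000. Memory scales with the number of populated bins, not with the range of the data.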

Install

pip install un_bound_histogram

Usage

import un_bound_histogram
import numpy

prng = numpy.random.Generator(numpy.random.PCG64(1337))

h = un_bound_histogram.UnBoundHistogram(bin_width=0.1)

h.assign(x=prng.normal(loc=5.0, scale=2.0, size=1000000))

# assign multiple times to grow the histogram.
h.assign(x=prng.normal(loc=-3.0, scale=1.0, size=1000000))
h.assign(x=prng.normal(loc=1.0, scale=0.5, size=1000000))

assert 0.9 < h.percentile(50) < 1.1
assert h.sum() == 3 * 1000000

The UnBoundHistogram has a few statistical estimators built in, such as modus() and quantile()/percentile().
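To show how such estimators can work on sparse bin counts, here is a minimal sketch. It assumes the counts are held in a plain `dict` that maps integer bin index to count; the function names `estimate_quantile` and `mode_bin` are hypothetical and are not the package's API. The quantile is found by walking the populated bins in order and interpolating linearly inside the bin where the cumulative count crosses the target.

```python
def estimate_quantile(bins, bin_width, q):
    """Estimate the q-quantile (0 <= q <= 1) from sparse bin counts.

    Hypothetical helper for illustration; `bins` maps bin index -> count.
    """
    total = sum(bins.values())
    if total == 0:
        raise ValueError("histogram is empty")
    target = q * total
    cumulative = 0
    for idx in sorted(bins):
        count = bins[idx]
        if cumulative + count >= target:
            # Interpolate linearly inside the bin that crosses the target.
            fraction = (target - cumulative) / count
            return (idx + fraction) * bin_width
        cumulative += count
    # Only reachable through rounding; fall back to the upper edge.
    return (max(bins) + 1) * bin_width


def mode_bin(bins):
    """Return the index of the most populated bin."""
    return max(bins, key=bins.get)


example_bins = {-2: 1, 0: 2, 3: 1}
median = estimate_quantile(example_bins, bin_width=0.5, q=0.5)
most_populated = mode_bin(example_bins)
```

The resolution of such an estimate is limited by the bin width, since the distribution inside a single bin is assumed to be flat.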

There is also a two-dimensional implementation, UnBoundHistogram2d. See the tests for more examples.

import un_bound_histogram
import numpy as np

prng = np.random.Generator(np.random.PCG64(9))
SIZE = 100000
XLOC = 3.0
YLOC = -4.5

ubh = un_bound_histogram.UnBoundHistogram2d(
    x_bin_width=0.1,
    y_bin_width=0.1,
)

ubh.assign(
    x=prng.normal(loc=XLOC, scale=1.0, size=SIZE),
    y=prng.normal(loc=YLOC, scale=1.0, size=SIZE),
)

xb_max, yb_max = ubh.argmax()
x_max = xb_max * ubh.x_bin_width
y_max = yb_max * ubh.y_bin_width

assert XLOC - 0.5 < x_max < XLOC + 0.5
assert YLOC - 0.5 < y_max < YLOC + 0.5

x_range, y_range = ubh.range()

assert x_range[0] <= xb_max <= x_range[1]
assert y_range[0] <= yb_max <= y_range[1]

assert ubh.sum() == SIZE
