The Boost::Histogram Python wrapper.

boost-histogram for Python

Python bindings for Boost::Histogram (source), a C++14 library. This should become one of the fastest libraries for histogramming, while still providing the power of a full histogram object.

Version 0.5.2: Public beta

Please feel free to try out boost-histogram and give feedback. Join the discussion on gitter or open an issue!

Installation

You can install this library from PyPI with pip:

python -m pip install boost-histogram

The usual Python best practices apply: install into a virtual environment, or add --user if you are not using one.

Conda support is planned.

Usage

import boost_histogram as bh

# Compose axes however you like; this is a 2D histogram
hist = bh.histogram(bh.axis.regular(2, 0, 1),
                    bh.axis.regular(4, 0.0, 1.0))

# Filling can be done with arrays, one per dimension
hist.fill([.3, .5, .2],
          [.1, .4, .9])

# Numpy array view into histogram counts, no overflow bins
counts = hist.view()
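
As a rough illustration of what the fill call above does for regular axes, the following plain-Python sketch maps each value to a bin index and counts the pairs. The helper names regular_index and fill_2d are hypothetical and are not part of boost-histogram; the library itself does this work in C++ and also keeps underflow and overflow bins, which this sketch simply drops.

```python
def regular_index(x, n, start, stop):
    """Bin index of x on an axis with n equal-width bins over [start, stop).

    Hypothetical helper for illustration only.
    Returns -1 for underflow and n for overflow.
    """
    z = (x - start) / (stop - start)  # position on the axis, scaled to [0, 1)
    if z < 0:
        return -1
    if z >= 1:
        return n
    return int(z * n)


def fill_2d(xs, ys, x_axis, y_axis):
    """Count (x, y) pairs into a nested list; out-of-range pairs are dropped.

    Each axis is given as a tuple (n, start, stop).
    """
    nx, ny = x_axis[0], y_axis[0]
    counts = [[0] * ny for _ in range(nx)]
    for x, y in zip(xs, ys):
        i = regular_index(x, *x_axis)
        j = regular_index(y, *y_axis)
        if 0 <= i < nx and 0 <= j < ny:
            counts[i][j] += 1
    return counts


# Same axes and data as the usage example above
counts = fill_2d([.3, .5, .2], [.1, .4, .9], (2, 0, 1), (4, 0.0, 1.0))
print(counts)  # [[1, 0, 0, 1], [0, 1, 0, 0]]
```

The nested list has one row per bin of the first axis and one column per bin of the second, which is the layout one would expect from a view without flow bins.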

Features

  • Many axis types (all support metadata=...)

    • bh.axis.regular(n, start, stop, underflow=True, overflow=True, growth=False): shortcut to make the types below. flow=False is also supported.
    • bh.axis.circular(n, start, stop): Values outside the range wrap into the range
    • bh.axis.regular_log(n, start, stop): Regularly spaced values in log 10 scale
    • bh.axis.regular_sqrt(n, start, stop): Regularly spaced values in sqrt scale
    • bh.axis.regular_pow(n, start, stop, power): Regularly spaced values to some power
    • bh.axis.integer(start, stop, underflow=True, overflow=True, growth=False): Special high-speed version of regular for evenly spaced bins of width 1
    • bh.axis.variable([start, edge1, edge2, ..., stop], underflow=True, overflow=True): Uneven bin spacing
    • bh.axis.category([...], growth=False): Integer or string categories
  • Axis features:

    • .index(values): The index at a point (or points) on the axis
    • .value(indexes): The value for a fractional bin in the axis
    • .bin(i): The bin edges or a bin value (categories)
    • .centers: The N bin centers (if continuous)
    • .edges: The N+1 bin edges (if continuous)
    • .extent: The number of bins (including under/overflow)
    • .metadata: Anything a user wants to store
    • .options: The options set on the axis (bh.axis.options)
    • .size: The number of bins (not including under/overflow)
    • .widths: The N bin widths
  • Many storage types

    • bh.storage.double: Doubles for weighted values (default)
    • bh.storage.int: 64 bit unsigned integers
    • bh.storage.unlimited: Starts small, but can go up to unlimited precision ints or doubles.
    • bh.storage.atomic_int: Threadsafe filling, experimental. Does not support growing axes in threads. (.view not yet supported)
    • bh.storage.weight: Stores a weight and sum of weights squared. (.view not yet supported)
    • bh.storage.mean: Accepts a sample and computes the mean of the samples (profile). (.view not yet supported)
    • bh.storage.weighted_mean: Accepts a sample and a weight. It computes the weighted mean of the samples. (.view not yet supported)
  • Accumulators

    • bh.accumulator.sum: High accuracy sum (Neumaier) - used by the sum method when summing a numerical histogram
    • bh.accumulator.weighted_sum: Tracks a weighted sum and variance
    • bh.accumulator.weighted_mean: Tracks a weighted sum, mean, and variance (West's incremental algorithm)
    • bh.accumulator.mean: Running count, mean, and variance (Welford's incremental algorithm)
  • Histogram operations

    • h.fill(arr, ..., weight=...) Fill with N arrays or single values
    • h.rank: The number of dimensions
    • h.size or len(h): The number of bins
    • .reset(): Set counters to 0
    • +: Add two histograms
    • *=: Multiply by a scalar (not all storages) (hist * scalar and scalar * hist supported too)
    • /=: Divide by a scalar (not all storages) (hist / scalar supported too)
    • .to_numpy(flow=False): Convert to a numpy style tuple (with or without under/overflow bins)
    • .view(flow=False): Get a view on the bin contents (with or without under/overflow bins)
    • .axes: Get the axes
      • .axes[0]: Get the 0th axis
      • .axes.edges: The lower values as a broadcasting-ready array
      • All other properties of axes available here, too
    • .sum(flow=False): The total count of all bins
    • .project(ax1, ax2, ...): Project down to the listed axes (given by number)
    • .reduce(ax, reduce_option, ...): shrink, rebin, or slice, or any combination
  • Indexing - Supports the Unified Histogram Indexing (UHI) proposal

  • Details

    • Use bh.histogram(..., storage=...) to make a histogram (there are several different types)
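
The accumulator entries above name two classic numerical algorithms: Neumaier summation and Welford's incremental mean and variance. The sketch below shows the idea behind each in plain Python. The names neumaier_sum and RunningMean are hypothetical; this is not the library's implementation, and the choice of the sample variance (dividing by count - 1) is an assumption made for the example.

```python
def neumaier_sum(values):
    """Compensated sum: track the low-order bits lost in each addition."""
    total = 0.0
    comp = 0.0  # running compensation term
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            comp += (total - t) + x  # low-order bits of x were lost
        else:
            comp += (x - t) + total  # low-order bits of total were lost
        total = t
    return total + comp


class RunningMean:
    """Welford-style running count, mean, and variance (illustrative only)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; undefined for fewer than two samples
        if self.count < 2:
            return float("nan")
        return self._m2 / (self.count - 1)


print(neumaier_sum([1.0, 1e100, 1.0, -1e100]))  # 2.0

acc = RunningMean()
for v in [2, 4, 4, 4, 5, 5, 7, 9]:
    acc.add(v)
print(acc.count, acc.mean, acc.variance)
```

A plain left-to-right floating-point loop over the same four values returns 0.0, because both additions of 1.0 are absorbed by 1e100; the compensation term is what recovers them. The running mean needs only one pass and constant memory, which is why this style of update suits per-bin storage.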

Supported platforms

Binaries available:

The easiest way to get boost-histogram is to use a binary wheel. These are the supported platforms for which wheels are produced:

System                        Arch          Python versions
ManyLinux1 (custom GCC 9.2)   64 & 32-bit   2.7, 3.5, 3.6, 3.7
ManyLinux2010                 64-bit        2.7, 3.5, 3.6, 3.7, 3.8
macOS 10.9+                   64-bit        2.7, 3.6, 3.7, 3.8
Windows                       64 & 32-bit   2.7, 3.6, 3.7

  • Linux: I'm not supporting 3.4 because I have to build the Numpy wheels to do so.
  • manylinux1: Built in a custom Docker container with GCC 9.2. This should work, but the library can't be called directly by other compiled extensions unless they do the same thing (I think that's the main caveat). 32-bit is supported because it's there. Numpy does not build correctly with Python 3.8, GCC 9.2, and manylinux1, so Python 3.8 is not supported; use manylinux2010 instead.
  • manylinux2010: Requires pip 10+ and a version of Linux newer than 2010. This is very new technology.
  • MacOS: Uses the dedicated 64 bit 10.9+ Python.org builds. We are not supporting 3.5 because those no longer provide binaries (could add a 32+64 fat 10.6+ that really was 10.9+, but not worth it unless there is a need for it).
  • Windows: PyBind11 requires compilation with a newer copy of Visual Studio than Python 2.7's Visual Studio 2008; you need to have the Visual Studio 2015 redistributable installed (the DLL is included in 2017 and 2019 as well). Wheels are not provided for 3.8; this is waiting on support from Azure.

If you are on a Linux system that is not part of the "many" in manylinux, such as Alpine or ClearLinux, building from source is usually fine, since the compilers on those systems are often quite new. It will just take a little longer to install when it's using the sdist instead of a wheel.

Source builds

For a source build, for example from an "sdist" package, the only requirement is a C++14-compatible compiler. The compiler requirements are dictated by Boost.Histogram's C++ requirements: gcc >= 5.5, clang >= 3.8, msvc >= 14.1.

If you are using Python 2.7 on Windows, you will need to use a recent version of Visual Studio and force distutils to use it, or just upgrade to Python 3.6 or newer. Check the PyBind11 documentation for more help. On some Linux systems, you may need to use a newer compiler than the one your distribution ships with.

Having Numpy installed before building is recommended (it enables multithreaded builds). A separate Boost 1.71 installation is not required (the build depends only on the included header-only dependencies).

This library is under active development; you can install directly from GitHub if you would like.

python -m pip install git+https://github.com/scikit-hep/boost-histogram.git@develop

For the moment, you need to uninstall and reinstall to ensure you have the latest version - pip will not rebuild if it thinks the version number has not changed. In the future, this may be addressed differently in boost-histogram.

Developing

See CONTRIBUTING.md for details on how to set up a development environment.

Talks and other documentation/tutorial sources


Acknowledgements

Support for this work was provided by the National Science Foundation cooperative agreement OAC-1836650 (IRIS-HEP) and OAC-1450377 (DIANA/HEP). Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
