Convenience wrappers around numpy histograms

https://github.com/JelleAalbers/multihist

Thin wrapper around numpy’s histogram and histogramdd.

Numpy has great histogram functions, which return (histogram, bin_edges) tuples. This package wraps these in a class with methods for adding new data to existing histograms, taking averages, projecting, etc.
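To illustrate what "adding new data to an existing histogram" amounts to, here is a minimal numpy-only sketch. It is not multihist's implementation; the helper name `add` and the fixed bin edges are assumptions made for the example.

```python
import numpy as np

# Fixed bin edges: 100 bins over (-3, 4). Keeping the edges fixed is what
# makes it possible to add counts from separate batches of data.
bin_edges = np.linspace(-3, 4, 101)
histogram = np.zeros(len(bin_edges) - 1)


def add(histogram, bin_edges, data):
    """Hypothetical helper: return the histogram plus the counts of `data`."""
    counts, _ = np.histogram(data, bins=bin_edges)
    return histogram + counts


rng = np.random.default_rng(0)
a = rng.normal(0, 0.5, 10**4)
b = rng.normal(2, 0.2, 10**3)

histogram = add(histogram, bin_edges, a)
histogram = add(histogram, bin_edges, b)

# Filling in two steps gives the same counts as histogramming everything at once.
all_at_once, _ = np.histogram(np.concatenate([a, b]), bins=bin_edges)
```

Because the edges never change, the histogram can be filled batch by batch without keeping the raw data in memory.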

For 1-dimensional histograms you can access cumulative and density information, as well as basic statistics (mean and std). For d-dimensional histograms you can name the axes, and refer to them by their names when projecting / summing / averaging.
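As a rough guide to what these derived quantities mean, the sketch below computes them directly from a (histogram, bin_edges) tuple with plain numpy. These are common textbook definitions based on bin centers; multihist's own conventions may differ in detail, and the variable names are chosen for the example only.

```python
import numpy as np

histogram, bin_edges = np.histogram([0, 3, 1, 6, 2, 9], bins=3)

bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
bin_widths = np.diff(bin_edges)
n = histogram.sum()

# Fraction of entries per bin (sums to 1).
normalized = histogram / n
# Empirical PDF: fraction per unit of x (integrates to 1).
density = histogram / (n * bin_widths)
# Empirical CDF evaluated at the right edge of each bin.
cumulative = np.cumsum(histogram) / n

# Mean and standard deviation, treating all entries as sitting at bin centers.
mean = np.average(bin_centers, weights=histogram)
std = np.sqrt(np.average((bin_centers - mean) ** 2, weights=histogram))
```

Statistics computed this way only see the binned data, so they approach the sample mean and standard deviation as the bins get narrower.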

NB: For a faster and richer histogram package, check out hist from scikit-hep. Alternatively, look at its parent library boost-histogram, which has numpy-compatible features. Multihist was created back in 2015, long before those libraries existed.

Synopsis:

import numpy as np
import matplotlib.pyplot as plt
from multihist import Hist1d, Histdd

# Create histograms just like from numpy...
m = Hist1d([0, 3, 1, 6, 2, 9], bins=3)

# ...or add data incrementally:
m = Hist1d(bins=100, range=(-3, 4))
m.add(np.random.normal(0, 0.5, 10**4))
m.add(np.random.normal(2, 0.2, 10**3))

# Get the data back out:
print(m.histogram, m.bin_edges)

# Access derived quantities like bin_centers, normalized_histogram, density, cumulative_density, mean, std
plt.plot(m.bin_centers, m.normalized_histogram, label="Normalized histogram", drawstyle='steps')
plt.plot(m.bin_centers, m.density, label="Empirical PDF", drawstyle='steps')
plt.plot(m.bin_centers, m.cumulative_density, label="Empirical CDF", drawstyle='steps')
plt.title("Estimated mean %0.2f, estimated std %0.2f" % (m.mean, m.std))
plt.legend(loc='best')
plt.show()

# Slicing and arithmetic behave just like ordinary ndarrays
print("The fourth bin has %d entries" % m[3])
m[1:4] += 4 + 2 * m[-27:-24]
print("Now it has %d entries" % m[3])

# Of course I couldn't resist adding a canned plotting function:
m.plot()
plt.show()

# Create and show a 2d histogram. Axis names are optional.
m2 = Histdd(bins=100, range=[[-5, 3], [-3, 5]], axis_names=['x', 'y'])
m2.add(np.random.normal(1, 1, 10**6), np.random.normal(1, 1, 10**6))
m2.add(np.random.normal(-2, 1, 10**6), np.random.normal(2, 1, 10**6))
m2.plot()
plt.show()

# x and y projections return Hist1d objects
m2.projection('x').plot(label='x projection')
m2.projection(1).plot(label='y projection')
plt.legend()
plt.show()
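For readers curious what a projection does underneath, the following numpy-only sketch shows the idea: projecting a 2d histogram onto one axis means summing the counts over the other axis. It illustrates the concept and is not multihist's code; the sample sizes and bin counts are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(1, 1, 10**4)
y = rng.normal(1, 1, 10**4)

# 2d histogram with 20 bins per axis; rows index x bins, columns index y bins.
hist2d, (x_edges, y_edges) = np.histogramdd(
    np.column_stack([x, y]), bins=20, range=[[-5, 3], [-3, 5]]
)

# Project onto x by summing over the y axis, and vice versa.
x_projection = hist2d.sum(axis=1)
y_projection = hist2d.sum(axis=0)
```

Each projection is a 1d array of counts that pairs with the bin edges of the axis that was kept (`x_edges` for `x_projection`, `y_edges` for `y_projection`).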

History

0.6.5 (2022-01-26)

  • ‘model’ option for error bars, showing Poisson quantiles (#14)

  • Fix vmin/vmax for matplotlib >3.3, resume CI tests (#15)

  • Hist1d.data_for_plot returns numbers used in error calculation

0.6.4 (2021-01-17)

  • Prevent object array creation (#12)

0.6.3 (2020-01-22)

  • Feldman-Cousins errors for Hist1d.plot (#10)

0.6.2 (2020-01-15)

  • Fix rebinning for empty histograms (#9)

0.6.1 (2019-12-05)

  • Fixes for #7 (#8)

0.6.0 (2019-06-30)

  • Correct step plotting at edges, other plotting fixes

  • Histogram numpy structured arrays

  • Fix deprecation warnings (#6)

  • lookup_hist

  • .max() and .min() methods

  • percentile support for higher-dimensional histograms

  • Improve Hist1d.get_random (also randomize in bin)

0.5.4 (2017-09-20)

  • Fix issue with input from dask

0.5.3 (2017-09-18)

  • Fix python 2 support

0.5.2 (2017-08-08)

  • Fix colorbar arguments to Histdd.plot (#4)

  • percentile for Hist1d

  • rebin method for Histdd (experimental)

0.5.1 (2017-03-22)

  • get_random for Histdd no longer just returns bin centers (Hist1d does still…)

  • lookup for Hist1d. When will I finally merge the classes…

0.5.0 (2016-10-07)

  • pandas.DataFrame and dask.dataframe support

  • dimensions option to Histdd to init axis_names and bin_centers at once

0.4.3 (2016-10-03)

  • Remove matplotlib requirement (still required for plotting features)

0.4.2 (2016-08-10)

  • Fix small bug for >=3 d histograms

0.4.1 (2016-17-14)

  • get_random and lookup for Histdd. Not really tested yet.

0.4.0 (2016-02-05)

  • .std function for Histdd

  • Fix off-by-one errors

0.3.0 (2015-09-28)

  • Several new histdd functions: cumulate, normalize, percentile…

  • Python 2 compatibility

0.2.1 (2015-08-18)

  • Histdd functions sum, slice, average now also work

0.2 (2015-08-06)

  • Multidimensional histograms

  • Axes naming

0.1.1-4 (2015-08-04)

Correct various rookie mistakes in packaging… Hey, it’s my first pypi package!

0.1 (2015-08-04)

Initial release

  • Hist1d, Hist2d

  • Basic test suite

  • Basic readme
