
Convenience wrappers around numpy histograms

Project description


https://github.com/JelleAalbers/multihist

Thin wrapper around numpy’s histogram and histogramdd.

Numpy has great histogram functions, which return (histogram, bin_edges) tuples. This package wraps these in a class with methods for adding new data to existing histograms, taking averages, projecting, etc.
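
To illustrate what the wrapper saves you from doing by hand, here is a minimal numpy-only sketch of adding data to an existing histogram. It does not use multihist and is not how multihist is implemented; the variable names and the seeded generator are chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# numpy returns a (histogram, bin_edges) tuple
counts, bin_edges = np.histogram(
    rng.normal(0, 0.5, 10_000), bins=100, range=(-3, 4))

# "Adding" more data by hand: histogram it on the same edges, then sum the counts
more_counts, _ = np.histogram(rng.normal(2, 0.2, 1_000), bins=bin_edges)
counts = counts + more_counts

print(counts.shape, bin_edges.shape)
```

Note that values falling outside the chosen range are silently dropped by np.histogram, so the total count can be smaller than the number of samples.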

For 1-dimensional histograms you can access cumulative and density information, as well as basic statistics (mean and std). For d-dimensional histograms you can name the axes, and refer to them by their names when projecting / summing / averaging.
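
As a rough guide to what such derived quantities mean, the sketch below computes them from a plain (histogram, bin_edges) tuple using numpy only. These are the usual textbook definitions, with statistics estimated from bin centers; treat them as an assumption, since multihist's own conventions (for example, how the cumulative density is aligned with the bins) may differ.

```python
import numpy as np

counts, bin_edges = np.histogram([0, 3, 1, 6, 2, 9], bins=3)

bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
bin_widths = np.diff(bin_edges)

# Fraction of entries per bin (sums to 1)
normalized = counts / counts.sum()
# Empirical PDF: fraction per unit of x (integrates to 1)
density = normalized / bin_widths
# Empirical CDF evaluated at the right edge of each bin
cumulative = np.cumsum(normalized)

# Mean and standard deviation estimated from bin centers, weighted by counts
mean = np.average(bin_centers, weights=counts)
std = np.sqrt(np.average((bin_centers - mean) ** 2, weights=counts))

print(counts, mean, std)
```

Because the statistics are computed from bin centers rather than the raw values, they are only as accurate as the binning allows.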

NB: The Scikit-HEP project (especially Henry Schreiner and Hans Dembinski) created a very cool library called boost-histogram. This is faster, more fully featured, and much more robust than multihist. It also has a numpy compatibility layer at boost_histogram.numpy. If you are starting a new project, I would recommend looking into boost-histogram rather than multihist.

Synopsis:

import numpy as np
import matplotlib.pyplot as plt
from multihist import Hist1d, Histdd

# Create histograms just like from numpy...
m = Hist1d([0, 3, 1, 6, 2, 9], bins=3)

# ...or add data incrementally:
m = Hist1d(bins=100, range=(-3, 4))
m.add(np.random.normal(0, 0.5, 10**4))
m.add(np.random.normal(2, 0.2, 10**3))

# Get the data back out:
print(m.histogram, m.bin_edges)

# Access derived quantities like bin_centers, normalized_histogram, density, cumulative_density, mean, std
# (drawstyle='steps' replaces linestyle='steps', which recent matplotlib versions no longer accept)
plt.plot(m.bin_centers, m.normalized_histogram, label="Normalized histogram", drawstyle='steps')
plt.plot(m.bin_centers, m.density, label="Empirical PDF", drawstyle='steps')
plt.plot(m.bin_centers, m.cumulative_density, label="Empirical CDF", drawstyle='steps')
plt.title("Estimated mean %0.2f, estimated std %0.2f" % (m.mean, m.std))
plt.legend(loc='best')
plt.show()

# Slicing and arithmetic behave just like ordinary ndarrays
print("The fourth bin has %d entries" % m[3])
m[1:4] += 4 + 2 * m[-27:-24]
print("Now it has %d entries" % m[3])

# Of course I couldn't resist adding a canned plotting function:
m.plot()
plt.show()

# Create and show a 2d histogram. Axis names are optional.
m2 = Histdd(bins=100, range=[[-5, 3], [-3, 5]], axis_names=['x', 'y'])
m2.add(np.random.normal(1, 1, 10**6), np.random.normal(1, 1, 10**6))
m2.add(np.random.normal(-2, 1, 10**6), np.random.normal(2, 1, 10**6))
m2.plot()
plt.show()

# x and y projections return Hist1d objects
m2.projection('x').plot(label='x projection')
m2.projection(1).plot(label='y projection')
plt.legend()
plt.show()
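
Conceptually, projecting onto an axis means summing the counts over all other axes. The following numpy-only sketch shows that idea with axes addressed by name or by index. The project helper and the axis_names list are hypothetical names made up for this illustration; they are not multihist's API or its implementation.

```python
import numpy as np

rng = np.random.default_rng(1)  # seeded only so the sketch is reproducible
x = rng.normal(1, 1, 10_000)
y = rng.normal(1, 1, 10_000)

# A 2d histogram as a plain (counts, edges) pair from numpy
counts, edges = np.histogramdd(
    np.column_stack([x, y]), bins=20, range=[[-5, 3], [-3, 5]])
axis_names = ['x', 'y']  # hypothetical bookkeeping for named axes


def project(counts, axis_names, axis):
    """Sum over every axis except the requested one (given by name or index)."""
    keep = axis_names.index(axis) if isinstance(axis, str) else axis
    others = tuple(i for i in range(counts.ndim) if i != keep)
    return counts.sum(axis=others)


x_proj = project(counts, axis_names, 'x')  # 1d counts along x
y_proj = project(counts, axis_names, 1)    # 1d counts along y

print(x_proj.shape, y_proj.shape)
```

Both projections keep the total number of entries of the original histogram, which makes for a quick sanity check.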

History

0.6.4 (2021-01-17)

  • Prevent object array creation (#12)

0.6.3 (2020-01-22)

  • Feldman-Cousins errors for Hist1d.plot (#10)

0.6.2 (2020-01-15)

  • Fix rebinning for empty histograms (#9)

0.6.1 (2019-12-05)

  • Fixes for #7 (#8)

0.6.0 (2019-06-30)

  • Correct step plotting at edges, other plotting fixes
  • Histogram numpy structured arrays
  • Fix deprecation warnings (#6)
  • lookup_hist
  • .max() and .min() methods
  • percentile support for higher-dimensional histograms
  • Improve Hist1d.get_random (also randomize in bin)

0.5.4 (2017-09-20)

  • Fix issue with input from dask

0.5.3 (2017-09-18)

  • Fix python 2 support

0.5.2 (2017-08-08)

  • Fix colorbar arguments to Histdd.plot (#4)
  • percentile for Hist1d
  • rebin method for Histdd (experimental)

0.5.1 (2017-03-22)

  • get_random for Histdd no longer just returns bin centers (Hist1d does stil…)
  • lookup for Hist1d. When will I finally merge the classes…

0.5.0 (2016-10-07)

  • pandas.DataFrame and dask.dataframe support
  • dimensions option to Histdd to init axis_names and bin_centers at once

0.4.3 (2016-10-03)

  • Remove matplotlib requirement (still required for plotting features)

0.4.2 (2016-08-10)

  • Fix small bug for >=3 d histograms

0.4.1 (2016-17-14)

  • get_random and lookup for Histdd. Not really tested yet.

0.4.0 (2016-02-05)

  • .std function for Histdd
  • Fix off-by-one errors

0.3.0 (2015-09-28)

  • Several new Histdd functions: cumulate, normalize, percentile…
  • Python 2 compatibility

0.2.1 (2015-08-18)

  • Histdd functions sum, slice, average now also work

0.2 (2015-08-06)

  • Multidimensional histograms
  • Axes naming

0.1.1-4 (2015-08-04)

Correct various rookie mistakes in packaging… Hey, it’s my first pypi package!

0.1 (2015-08-04)

Initial release

  • Hist1d, Hist2d
  • Basic test suite
  • Basic readme
