PowerBin: Fast Adaptive Data Binning with Centroidal Power Diagrams

The PowerBin Package

This PowerBin package provides a Python implementation of the PowerBin algorithm — a modern alternative to the classic Voronoi binning method. Like Voronoi binning, it performs 2D adaptive spatial binning to achieve a nearly constant value per bin of a chosen capacity (e.g., signal‑to‑noise ratio or any other user‑defined function of the bin spaxels).

Key advances over the classic method include:

  • Centroidal Power Diagram: Produces bins that are nearly round, convex, and connected, and eliminates the disconnected or nested bins that could occur with earlier approaches.

  • Scalability: The entire algorithm scales with O(N log N) complexity, removing the O(N^2) bottleneck previously present in both the bin-accretion and regularization steps. This makes processing million‑pixel datasets practical.

  • Stable CPD construction: Generates the tessellation via a heuristic inspired by packed soap bubbles, avoiding the numerical fragility of formal CPD solvers with realistic non-additive capacities (e.g., correlated noise).

The algorithm combines a fast initial bin-accretion phase with iterative regularization, and is described in detail in Cappellari (2025).

Attribution

If you use this software for your research, please cite Cappellari (2025). The BibTeX entry for the paper is:

@Article{Cappellari2025,
    author   = {Cappellari, Michele},
    journal  = {MNRAS},
    title    = {PowerBin: fast adaptive data binning with Centroidal Power Diagrams},
    year     = {2025},
    month    = dec,
    number   = {2},
    pages    = {1432--1446},
    volume   = {544},
    doi      = {10.1093/mnras/staf1726},
    url      = {https://ui.adsabs.harvard.edu/abs/2025MNRAS.544.1432C},
}

Installation

Install with:

pip install powerbin

Without write access to the global site-packages directory, use:

pip install --user powerbin

To upgrade PowerBin to the latest version use:

pip install --upgrade powerbin

Usage Examples

To learn how to use the PowerBin package, copy, modify, and run the example programs in the powerbin/examples directory, located inside the powerbin installation folder in site-packages. Detailed documentation is available in the docstring of the file powerbin/powerbin.py and on PyPI.
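As a convenience, the installed version and the examples folder can be located from Python. This is a minimal sketch using only the standard library; the helper name locate_examples is hypothetical, and the "examples" subfolder name is taken from the paragraph above.

```python
from importlib import metadata, resources


def locate_examples(package="powerbin"):
    """Return (version, examples_path), or (None, None) if not installed."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None, None
    # The 'examples' subfolder name follows the description in the text.
    return version, resources.files(package) / "examples"


version, examples_dir = locate_examples()
if version is None:
    print("powerbin is not installed in this environment")
else:
    print(f"powerbin {version}: examples in {examples_dir}")
```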

Minimal example

Below is a minimal, runnable example. It demonstrates how to use PowerBin and highlights the two ways to specify the bin capacity. In this example, we define the capacity as (S/N)^2, so the target capacity is set to target_sn**2.

The capacity can be specified in two forms:

  1. As an array (the additive flag in the example script below is set to True): This is the simplest approach, recommended when the capacity is additive (e.g., when noise is Poissonian, the total (S/N)^2 of a bin is the sum of the individual pixel values). For very large datasets (millions of pixels), this form is also significantly faster; for small or moderate datasets, the speed difference is negligible.

  2. As a function (the additive flag in the example script below is set to False): This provides maximum flexibility for complex, non-additive capacity definitions. In the example, the S/N of a bin is calculated as np.sum(signal) / np.sqrt(np.sum(noise**2)), the standard formula for uncorrelated noise, which should be the default choice for most applications. It only needs to be modified in special cases, such as:

    • To account for known covariance in the noise between pixels. The example code shows a commented-out empirical correction, but this is data-dependent and not a general prescription.

    • To perform complex calculations on each bin (e.g., fitting a model to extract kinematics), which is a possible but advanced use case.

from importlib import resources
import numpy as np
import matplotlib.pyplot as plt
from powerbin import PowerBin

# Load example data: x, y, signal, noise
data_path = resources.files('powerbin') / 'examples/sample_data_ngc2273.txt'
x, y, signal, noise = np.loadtxt(data_path).T
xy = np.column_stack([x, y])

target_sn = 50

# --- Define Capacity Specification ---
# Toggle this flag to switch between the two methods.
additive = False

if additive:
    # 1. Additive case: Provide a pre-calculated array of pixel capacities.
    # This is efficient for capacities like (S/N)^2 with Poissonian noise.
    capacity_spec = (signal / noise)**2

else:
    # 2. Non-additive case: Provide a function for custom capacity logic.
    def capacity_spec(index):
        """Calculates (S/N)^2 for a bin from its pixel indices."""
        # Standard S/N formula for uncorrelated noise
        sn = np.sum(signal[index]) / np.sqrt(np.sum(noise[index]**2))
        # Example for correlated noise (see full example file for details):
        # sn /= 1 + 1.07 * np.log10(len(index))
        return sn**2

# Perform the binning. The target is target_sn**2 to match the capacity definition.
pb = PowerBin(xy, capacity_spec, target_capacity=target_sn**2)

# Plot the results. We use capacity_scale='sqrt' to display S/N instead of (S/N)^2.
pb.plot(capacity_scale='sqrt', ylabel='S/N')

plt.show()
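The additivity statement for Poissonian noise can be checked numerically. The sketch below uses synthetic data rather than the packaged sample file, and the helper name bin_sn2 is an illustrative assumption. It compares the bin (S/N)^2 from the uncorrelated-noise formula with the sum of the per-pixel (S/N)^2 values.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.uniform(1.0, 100.0, size=20)   # synthetic pixel signals


def bin_sn2(signal, noise):
    """(S/N)^2 of a bin for uncorrelated noise and unweighted co-addition."""
    return np.sum(signal)**2 / np.sum(noise**2)


# Poissonian noise: noise**2 == signal, so the bin (S/N)^2 equals sum(signal),
# which is also the sum of the per-pixel (S/N)^2 values.
noise_poisson = np.sqrt(signal)
additive_sum = np.sum((signal / noise_poisson)**2)
print(np.isclose(bin_sn2(signal, noise_poisson), additive_sum))   # True

# Constant noise: the per-pixel sum no longer matches the bin value.
noise_const = np.full_like(signal, 5.0)
additive_sum_const = np.sum((signal / noise_const)**2)
print(bin_sn2(signal, noise_const) < additive_sum_const)          # True
```

For non-Poissonian noise the per-pixel sum is an upper bound on the unweighted bin (S/N)^2 (by the Cauchy-Schwarz inequality), so the array form is then an approximation and the function form gives the exact value.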

PowerBin Class

PowerBin Purpose

Performs 2D adaptive spatial binning using Centroidal Power Diagrams.

This class implements the PowerBin algorithm described in Cappellari (2025). It partitions a set of 2D points (pixels) into bins, aiming for a nearly constant capacity per bin (e.g., signal-to-noise squared).

Key advances over classic Voronoi binning include:

  • Centroidal Power Diagram: Produces nearly round, convex, and connected bins, avoiding issues like disconnected or nested bins.

  • Scalability: Uses O(N log N) algorithms, making it practical for datasets with millions of pixels.

  • Stable Construction: Employs a robust heuristic for building the diagram, avoiding numerical fragility.

The algorithm has two main stages:

  1. Bin Accretion: Generates an initial set of bin centers.

  2. Regularization: Iteratively adjusts bin shapes to equalize their capacities.
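For orientation, the defining rule of a power diagram can be sketched in a few lines: each point is assigned to the generator with the smallest power distance |x - c_k|^2 - r_k^2, which reduces to an ordinary Voronoi tessellation when all radii are equal. The function name power_assign and the toy data are illustrative assumptions, not part of the package, and the brute-force distance matrix is used for clarity only. The two arrays play the role of the generator coordinates and radii that the class exposes as xybin and rbin; whether the stored bin_num matches this rule exactly in every case is not stated in this documentation.

```python
import numpy as np


def power_assign(xy, xybin, rbin):
    """Index of the generator with the smallest power distance to each point."""
    # Squared Euclidean distances, shape (npix, nbin)
    d2 = np.sum((xy[:, None, :] - xybin[None, :, :])**2, axis=-1)
    # Power distance subtracts the squared radius of each generator
    return np.argmin(d2 - rbin**2, axis=1)


# Two generators on the x-axis and three test points between them
xybin = np.array([[0.0, 0.0], [4.0, 0.0]])
xy = np.array([[1.0, 0.0], [2.5, 0.0], [3.5, 0.0]])

# Equal radii: boundary at x = 2 (plain Voronoi)
print(power_assign(xy, xybin, np.array([1.0, 1.0])))   # [0 1 1]

# Larger radius for the first generator: boundary moves to x = 3
print(power_assign(xy, xybin, np.array([3.0, 1.0])))   # [0 0 1]
```

Changing a radius moves the straight boundary between two bins without moving the generators, which is the extra freedom a power diagram has over a Voronoi tessellation.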

Parameters

xy: array_like of shape (npix, 2)

Coordinates of the pixels to be binned.

capacity_spec: callable or array_like of shape (npix,)

The rule for calculating capacity, given in one of two forms:

  • Callable: A function fun(indices, *args) -> float that returns the total capacity of a bin containing the pixels at indices. This allows for non-additive capacity definitions (for example with correlated noise).

  • Array-like: A 1D array dens of length npix, where dens[j] is the additive capacity of pixel j. The capacity of a bin is the sum of dens over its member pixels. This is faster.

target_capacity: float

The target capacity value for each bin.

pixelsize: float, optional

The size of a pixel in the input coordinate units. This is used to internally work in pixel units for numerical stability. If None, it is estimated as the median distance to the second-nearest neighbor.

verbose: int, optional

Controls the level of printed output:

  • 0: No output.

  • 1: Basic summary (default).

  • 2: Detailed iteration-by-iteration progress.

  • 3: Same as 2, but also plots the binning at each iteration.

regul: bool, optional

If True (default), performs the iterative regularization step after the initial accretion. If False, only accretion is performed.

args: tuple, optional

Additional positional arguments passed to capacity_spec when it is a callable function.

maxiter: int, optional

Maximum number of iterations for the regularization step (default: 50).
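The callable convention fun(indices, *args) described above can be illustrated without running the binning itself. In this sketch the function name sn2_capacity and the small arrays are assumptions for illustration; the data are passed through args instead of being captured from the enclosing scope. The final commented line shows how such a function would presumably be handed to the class, based only on the parameter list above.

```python
import numpy as np


def sn2_capacity(index, signal, noise):
    """(S/N)^2 of the bin made of pixels `index`, for uncorrelated noise."""
    sn = np.sum(signal[index]) / np.sqrt(np.sum(noise[index]**2))
    return sn**2


# Synthetic stand-in data (not from the package)
signal = np.array([4.0, 9.0, 16.0, 25.0])
noise = np.sqrt(signal)              # Poissonian noise
args = (signal, noise)

# Capacity of a trial bin, evaluated as fun(indices, *args)
trial_bin = np.array([0, 2, 3])
capacity = sn2_capacity(trial_bin, *args)
print(capacity)                      # close to 45, the summed signal

# Hypothetical call following the documented parameters (not executed here):
# pb = PowerBin(xy, sn2_capacity, target_capacity=50**2, args=(signal, noise))
```

Passing the arrays through args keeps the capacity function free of global state, which makes it easier to reuse with different datasets.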

Attributes

xy: ndarray of shape (npix, 2)

The original input pixel coordinates.

bin_num: ndarray of int of shape (npix,)

An array where bin_num[j] gives the index of the bin containing pixel j. This is the primary output for mapping pixels to bins.

pixel_capacity: ndarray of shape (npix,)

The capacity of each individual input pixel, derived from capacity_spec.

bin_capacity: ndarray of shape (nbin,)

The final calculated capacity for each output bin.

xybin: ndarray of shape (nbin, 2)

The coordinates of the Power Diagram generators (bin centers), in the same units as the input xy.

rbin: ndarray of shape (nbin,)

The radii of the Power Diagram generators, in the same units as xy.

npix: ndarray of int of shape (nbin,)

The number of pixels in each bin.

single: ndarray of bool of shape (nbin,)

A boolean array indicating which bins have a single pixel.

rms_frac: float

The fractional root-mean-square scatter of the bin_capacity values, calculated as a percentage for non-single bins.

target_capacity, pixelsize, verbose, args:

Stored values of the corresponding input parameters.
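A common next step is to combine pixel values within each bin using bin_num. The sketch below uses synthetic arrays in place of a real PowerBin result and assumes that bin indices run contiguously from 0 to nbin - 1, which the attribute descriptions above do not state explicitly. It reproduces per-bin pixel counts, flags single-pixel bins, and computes a per-bin S/N for uncorrelated noise.

```python
import numpy as np

# Synthetic stand-ins: 8 pixels assigned to 3 bins
bin_num = np.array([0, 0, 1, 2, 2, 2, 1, 0])
signal = np.array([3.0, 5.0, 2.0, 1.0, 4.0, 1.0, 6.0, 2.0])
noise = np.array([1.0, 1.0, 2.0, 1.0, 2.0, 2.0, 1.0, 2.0])

nbin = bin_num.max() + 1                                  # assumes contiguous indices

# Per-bin pixel counts and single-pixel flags
npix = np.bincount(bin_num, minlength=nbin)
single = npix == 1

# Per-bin summed signal and noise added in quadrature
bin_signal = np.bincount(bin_num, weights=signal, minlength=nbin)
bin_noise = np.sqrt(np.bincount(bin_num, weights=noise**2, minlength=nbin))
bin_sn = bin_signal / bin_noise

# Broadcast a per-bin value back onto the pixels, e.g. to draw a map
sn_per_pixel = bin_sn[bin_num]

print(npix)          # [3 2 3]
print(bin_signal)    # [10.  8.  6.]
```

Indexing a per-bin array with bin_num, as in the last step, is the usual way to map any binned quantity back to the original pixel grid.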

License

Copyright (C) 2025-2026 Michele Cappellari E-mail: michele.cappellari_at_physics.ox.ac.uk

Updated versions of this software are available at: https://pypi.org/project/powerbin/

If you use this software in published research, please acknowledge it as: “PowerBin method by Cappellari (2025, MNRAS, 544, 1432)” https://ui.adsabs.harvard.edu/abs/2025MNRAS.544.1432C

This software is provided “as is”, without any warranty of any kind, express or implied.

Permission is granted for:

  • Non-commercial use.

  • Modification for personal or internal use, provided that this copyright notice and disclaimer remain intact and unaltered at the beginning of the file.

All other rights are reserved. Redistribution of the code, in whole or in part, is strictly prohibited without prior written permission from the author.


Changelog

V1.1.11: Oxford, 26 January 2026

  • Fixed possible program stop due to a plotting alpha > 1 bug.

V1.1.10: Oxford, 17 November 2025

  • Updated citation information to reflect publication in MNRAS.

V1.1.9: Oxford, 09 October 2025

  • Added a fast, vectorized code path for additive (array) capacities; callable capacities remain supported.

  • Improved bin‑accretion (incremental centroid and r^2 updates) for lower runtime and memory use.

  • Renamed and clarified key attributes (capacity_spec, pixel_capacity, bin_capacity) and updated docs/examples.

  • Miscellaneous bug fixes and robustness improvements.

V1.0.6: Oxford, 17 September 2025

  • Initial release.
