Area-conserving rebinning of 2D data and images via Sutherland–Hodgman clipping.

rebinning

Area-conserving rebinning of 2D data. Implemented in Rust, exposed to Python through PyO3 and maturin.

Every input cell is treated as a (possibly transformed) quadrilateral. That quadrilateral is geometrically clipped against each overlapping cell of a regular output grid, and the input value is redistributed (or averaged) across the output bins in proportion to overlap area. In sum mode the total is conserved (up to floating-point rounding), and no interpolation kernels are involved.

For image resizing (axis-aligned scaling) it is also a fast alternative to OpenCV's cv2.INTER_AREA.

Algorithm overview


Installation

You need a recent Rust toolchain (the build script uses maturin automatically). With uv:

uv sync
uv run maturin develop --uv --release

uv sync sets up .venv/ and installs the Python source as an editable package; maturin develop compiles the Rust extension and drops the _rebinning.*.so into python/rebinning/. Re-run maturin develop --uv --release whenever you edit Rust.

Quick start

Resize an image

import numpy as np
from PIL import Image
import rebinning

img = np.asarray(Image.open("portrait.jpg"))          # (H, W, 3) uint8
face = rebinning.rebin_image(img, 112, 112,
                             keep_aspect=True,
                             crop="center")
Image.fromarray(face).save("portrait_112.png")

rebin_image handles both grayscale (H, W) and multichannel (H, W, C) inputs, preserves the dtype, and clips/rounds integer outputs. With keep_aspect=True it first crops the input to the target aspect ratio, anchored at the center or the top-left according to crop, and then resamples.
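The aspect-ratio crop can be pictured with a few lines of arithmetic. This is only a sketch, not the library's code: aspect_crop_box is a hypothetical helper, and rounding the crop window to whole pixels is an assumption made here for simplicity.

```python
def aspect_crop_box(in_h, in_w, out_h, out_w, crop="center"):
    """Return (top, left, height, width) of the largest input window that
    has the output's aspect ratio. Hypothetical helper for illustration."""
    # Compare in_w / in_h with out_w / out_h by cross-multiplication.
    if in_w * out_h > in_h * out_w:
        # Input is too wide: keep the full height and trim the width.
        crop_h, crop_w = in_h, round(in_h * out_w / out_h)
    else:
        # Input is too tall (or already matches): keep the full width.
        crop_h, crop_w = round(in_w * out_h / out_w), in_w
    if crop == "center":
        top, left = (in_h - crop_h) // 2, (in_w - crop_w) // 2
    elif crop == "top-left":
        top, left = 0, 0
    else:
        raise ValueError(f"unknown crop mode: {crop!r}")
    return top, left, crop_h, crop_w


# A 480x640 landscape image cropped for a square output:
print(aspect_crop_box(480, 640, 112, 112))                   # (0, 80, 480, 480)
print(aspect_crop_box(480, 640, 112, 112, crop="top-left"))  # (0, 0, 480, 480)
```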

Resizing a portrait to 112×112 with center crop

Reproduce with examples/resize_portrait.py. The bundled portrait.jpg is a public-domain US Navy photo of Grace Hopper (taken from matplotlib's sample data); swap in any other image to try your own.

Rebin a 2D field through a coordinate transform

import math

import numpy as np
import rebinning

# 100×100 regular input grid carrying some values
nx, ny = 100, 100
input_x = np.arange(nx) + 0.5
input_y = np.arange(ny) + 0.5
values = np.random.default_rng(0).normal(size=(nx, ny))

# rotate every input cell by 30 degrees about the origin
c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
def rotate(points):
    # points has shape (..., 2); the last axis holds (x, y)
    x, y = points[..., 0], points[..., 1]
    return np.stack([c*x - s*y, s*x + c*y], axis=-1)

output_x = np.arange(-50, 150) + 0.5
output_y = np.arange(-50, 150) + 0.5

out = rebinning.rebin(
    input_x, input_y, values,
    bin_width_x=1.0, bin_width_y=1.0,
    output_x=output_x, output_y=output_y,
    transform=rotate,
    mode="sum",   # 'sum' conserves totals; 'mean' gives an area-weighted mean
)

The transform is a vectorised numpy callable. It is invoked once with an (Nx, Ny, 4, 2) array of corner positions and must return an array of the same shape. Affine maps, spherical projections, beam-line geometry, anything you can express with numpy works.
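Any function that maps the last axis of a (..., 2) array and leaves the leading axes alone satisfies this contract. As an illustration, here is a small factory for affine maps; make_affine is a hypothetical helper written for this README, not part of the package.

```python
import numpy as np

def make_affine(matrix, offset=(0.0, 0.0)):
    """Build a transform callable from a 2x2 matrix and a translation."""
    matrix = np.asarray(matrix, dtype=float)
    offset = np.asarray(offset, dtype=float)

    def transform(points):
        # points has shape (..., 2). The matmul acts on the last axis only,
        # so leading axes such as (Nx, Ny, 4) pass through unchanged.
        return points @ matrix.T + offset

    return transform


# Shear along x, then shift by (10, 0).
shear = make_affine([[1.0, 0.5], [0.0, 1.0]], offset=(10.0, 0.0))
corners = np.zeros((3, 2, 4, 2))
corners[..., 1] = 2.0              # every corner sits at (0, 2)
moved = shear(corners)             # every corner moves to (11, 2)
assert moved.shape == corners.shape
```

The returned callable could then be passed as transform= in the call above.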

Rebinning a 100×100 grid through a 30° rotation

The dashed line in the output panel is the rotated boundary of the original input domain. Reproduce with examples/test_readme_rebin.py.


How the algorithm works

Step 1: Build the input quadrilaterals

The input is a regular grid of values. Each cell is a small rectangle around (input_x[i], input_y[j]). The library computes the four corners and (optionally) passes them through the user-supplied transform. The result is one quadrilateral per input cell, in output coordinates.
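In numpy terms, the corner construction might look like the sketch below. It assumes that each cell is centred on (input_x[i], input_y[j]) and that corners are listed counter-clockwise from the lower-left; the name cell_corners and the corner order are assumptions of this example, not documented behaviour.

```python
import numpy as np

def cell_corners(input_x, input_y, bin_width_x, bin_width_y):
    """Corners of every input cell as an (Nx, Ny, 4, 2) array."""
    hx, hy = bin_width_x / 2.0, bin_width_y / 2.0
    # Corner offsets from the cell centre, counter-clockwise from lower-left.
    offsets = np.array([[-hx, -hy], [hx, -hy], [hx, hy], [-hx, hy]])
    cx, cy = np.meshgrid(input_x, input_y, indexing="ij")   # each (Nx, Ny)
    centres = np.stack([cx, cy], axis=-1)                   # (Nx, Ny, 2)
    # Broadcast (Nx, Ny, 1, 2) against (4, 2) to get (Nx, Ny, 4, 2).
    return centres[:, :, None, :] + offsets


quads = cell_corners(np.arange(3) + 0.5, np.arange(2) + 0.5, 1.0, 1.0)
print(quads.shape)      # (3, 2, 4, 2)
print(quads[0, 0])      # corners (0,0), (1,0), (1,1), (0,1)
```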

Step 2: Clip against each output bin

For each input quad we know its bounding box in output coordinates, so we know which output bins it might overlap. For each of those candidate bins (a small axis-aligned rectangle), we run Sutherland–Hodgman to clip the quad to the bin. Sutherland–Hodgman is just four passes of "drop everything outside one half-plane":

Sutherland-Hodgman in four steps

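The algorithm is correct whenever the clip polygon is convex, which a rectangle always is. The subject polygon (our input quad) does not need to be convex; only the four boundary tests need to know the rectangle's geometry. For a concave subject the clipped polygon may contain degenerate zero-width edges, but these contribute nothing to its area.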

The area of the resulting (possibly empty) polygon is computed with the shoelace formula: ½ |Σ (xᵢ yᵢ₊₁ − xᵢ₊₁ yᵢ)|.
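The clipping and area steps fit in a few lines of pure Python. The sketch below is written for readability and is not a transcription of clip.rs; the function names are made up for this example.

```python
def clip_half_plane(poly, axis, bound, keep_greater):
    """One Sutherland-Hodgman pass: keep the part of poly on one side of an
    axis-aligned line (x = bound when axis=0, y = bound when axis=1)."""
    def inside(p):
        return p[axis] >= bound if keep_greater else p[axis] <= bound

    out = []
    for i, cur in enumerate(poly):
        prev = poly[i - 1]               # index -1 wraps to the last vertex
        if inside(cur) != inside(prev):
            # The edge prev -> cur crosses the line: emit the crossing point.
            t = (bound - prev[axis]) / (cur[axis] - prev[axis])
            out.append((prev[0] + t * (cur[0] - prev[0]),
                        prev[1] + t * (cur[1] - prev[1])))
        if inside(cur):
            out.append(cur)
    return out


def clip_to_rect(poly, xmin, xmax, ymin, ymax):
    """Clip a polygon to an axis-aligned rectangle with four passes."""
    for axis, bound, keep_greater in ((0, xmin, True), (0, xmax, False),
                                      (1, ymin, True), (1, ymax, False)):
        if not poly:
            break                        # nothing left to clip
        poly = clip_half_plane(poly, axis, bound, keep_greater)
    return poly


def shoelace_area(poly):
    """Polygon area: 0.5 * |sum(x_i * y_{i+1} - x_{i+1} * y_i)|."""
    total = 0.0
    for i, (x1, y1) in enumerate(poly):
        x0, y0 = poly[i - 1]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# A diamond of area 2 centred on (1, 1), clipped to the unit bin [0,1] x [0,1].
diamond = [(1, 0), (2, 1), (1, 2), (0, 1)]
print(shoelace_area(clip_to_rect(diamond, 0, 1, 0, 1)))   # 0.5
```

Clipping the same diamond against the four unit bins that tile [0, 2] × [0, 2] gives four overlaps of 0.5 each, which add up to the diamond's full area of 2.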

Step 3: Redistribute by overlap area

For each input cell we now have a list of (output_bin, overlap_area) pairs. Let A = Σ overlap_area. In mode="sum" we add value × overlap_area / A to each touched output bin. In mode="mean" we accumulate the value weighted by overlap area and divide each output bin by the sum of weights it received.

Redistribution by overlap area

Every overlap area is shared between one input quad and one output bin, so no mass is lost or created. For mode="sum" the operation is provably area-conserving.
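The two accumulation rules can be written out directly. The sketch below works on a flat list of output bins and takes the overlap lists as given; the accumulate helper and the choice to leave untouched bins at zero in mean mode are assumptions of this example.

```python
import numpy as np

def accumulate(values, overlaps, n_bins, mode="sum"):
    """values[k] is the value of input cell k; overlaps[k] is a list of
    (output_bin_index, overlap_area) pairs for that cell."""
    out = np.zeros(n_bins)
    weight = np.zeros(n_bins)
    for value, pairs in zip(values, overlaps):
        total_area = sum(area for _, area in pairs)    # A in the text above
        if total_area == 0.0:
            continue                                   # cell touches no bin
        for b, area in pairs:
            if mode == "sum":
                out[b] += value * area / total_area
            else:
                out[b] += value * area
                weight[b] += area
    if mode == "mean":
        # Divide only where some weight arrived; other bins stay at zero.
        np.divide(out, weight, out=out, where=weight > 0)
    return out


values = [10.0, 4.0]
overlaps = [[(0, 0.25), (1, 0.75)], [(1, 0.5), (2, 0.5)]]
print(accumulate(values, overlaps, 3, mode="sum"))    # [2.5 9.5 2. ]
print(accumulate(values, overlaps, 3, mode="mean"))   # [10.   7.6  4. ]
```

In sum mode the output adds up to 14, the sum of the two input values.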


Fast path for image resizing

When the transform is axis-aligned (image resizing, where you only scale and translate), every output bin is a rectangle covering a few input pixels. Each output pixel is the area-weighted mean of those input pixels, with weights set by the fractional overlap:

One output bin and its input-pixel weights

The 2D overlap area further factors into a product of two 1D overlaps:

area(input_pixel ∩ output_bin) = Δx × Δy

where Δx and Δy are the lengths of the 1D overlaps between the pixel and the bin along x and y.

This makes the operation separable: instead of clipping Nx · Ny · Mx · My quadrilateral/rectangle pairs, we build two small 1D weight matrices W_y ∈ ℝ^{H_out × H_in} and W_x ∈ ℝ^{W_out × W_in}, each row summing to 1, and apply them as two passes over the image:

out[i,j,c] = Σ_{a,b} W_y[i,a] · W_x[j,b] · img[a,b,c]

Separable axis-aligned scaling

Each row of W_y (and W_x) is non-zero only over the short contiguous range of input rows the corresponding output row covers, typically ceil(scale) + 1 entries. rebin_image iterates that range per output row instead of doing a dense matmul over a mostly-zero matrix. The public rebin_axis_weights helper still returns the dense weight matrix for inspection and custom processing.
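To make the separable formula concrete, here is a dense numpy sketch. The axis_weights function below only mirrors the argument list of rebin_axis_weights; it is not the library's implementation, and it assumes that each origin is the left edge of bin 0 and that every output bin overlaps at least one input bin.

```python
import numpy as np

def axis_weights(in_n, in_bw, in_origin, out_n, out_bw, out_origin):
    """Dense (out_n, in_n) matrix of 1D overlap lengths, rows normalised to 1.
    Assumes origin = left edge of bin 0 (an assumption of this sketch)."""
    in_lo = in_origin + in_bw * np.arange(in_n)
    out_lo = out_origin + out_bw * np.arange(out_n)
    # Overlap length of every (output bin, input bin) pair, floored at zero.
    lo = np.maximum(out_lo[:, None], in_lo[None, :])
    hi = np.minimum(out_lo[:, None] + out_bw, in_lo[None, :] + in_bw)
    w = np.clip(hi - lo, 0.0, None)
    return w / w.sum(axis=1, keepdims=True)


img = np.arange(24, dtype=float).reshape(4, 6)
w_y = axis_weights(4, 1.0, 0.0, 2, 2.0, 0.0)   # 4 rows    -> 2 rows
w_x = axis_weights(6, 1.0, 0.0, 4, 1.5, 0.0)   # 6 columns -> 4 columns

# out[i, j] = sum over a, b of W_y[i, a] * W_x[j, b] * img[a, b]
out = np.einsum("ia,jb,ab->ij", w_y, w_x, img)
print(out.shape)                 # (2, 4)
print(out.mean(), img.mean())    # both 11.5, up to rounding
```

Because the output grid covers exactly the same extent as the input, the mean of the image is preserved.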


Comparison with cv2.INTER_AREA

For downscaling, cv2.INTER_AREA is mathematically equivalent to area-weighted rebinning. OpenCV uses a hand-tuned fixed-point routine; we apply the same 1D overlap weights separably, iterating only over the short contiguous range of input bins each output bin touches.

From tests/test_rebin.py:

  • For random uint8 RGB inputs, the two agree to within a maximum difference of 1 grey level and a mean difference below 0.1.
  • For integer downscaling factors on float32, the two agree to within 1e-4.
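The integer-factor case has a simple independent reference: every input pixel lies entirely inside one output bin, so all overlap weights are equal and area-weighted rebinning reduces to a plain block mean. A numpy-only sketch (block_mean is a hypothetical helper, not part of the package or of the test suite):

```python
import numpy as np

def block_mean(img, factor):
    """Downscale a 2D array by an integer factor by averaging
    factor x factor blocks."""
    h, w = img.shape
    if h % factor or w % factor:
        raise ValueError("image size must be a multiple of factor")
    # Split each axis into (blocks, factor) and average over the block axes.
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


img = np.arange(16, dtype=np.float32).reshape(4, 4)
small = block_mean(img, 2)
print(small)    # [[ 2.5  4.5]
                #  [10.5 12.5]]
```

Either library's output for an integer factor can be compared against such a block mean.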

Single-threaded wall times for downscaling to 112×112, measured by benchmarks/bench_image.py (best of 9 × 50 timed calls after warm-up, AMD Ryzen 9 7945HX):

| input size | rebin_image u8 | rebin_image f32 | cv2.INTER_AREA u8 |
| --- | --- | --- | --- |
| 480×640 | 0.32 ms | 0.27 ms | 0.74 ms |
| 1024×1024 | 0.66 ms | 0.54 ms | 2.32 ms |
| 2000×2000 | 1.92 ms | 3.05 ms | 8.49 ms |

Where the libraries differ:

| | rebin_image | cv2.INTER_AREA |
| --- | --- | --- |
| Crop to aspect ratio | built in (keep_aspect=True) | requires a separate cv2.resize step |
| Boundary handling | partial overlap → fractional weight | same, in fixed-point |
| Upscaling | falls back to nearest-neighbour | not designed for upscaling |
| Algorithm exposure | 1D weight matrix is a public API | hidden inside the C implementation |

Use rebin_image when you want the same numeric result everywhere (cross-platform parity, training-vs-serving consistency) without adding a second runtime dependency.


API

rebinning.rebin_image(
    image: np.ndarray,                 # (H, W) or (H, W, C)
    out_height: int,
    out_width: int,
    *,
    keep_aspect: bool = True,
    crop: Literal["center", "top-left"] = "center",
    out_dtype = None,
) -> np.ndarray

rebinning.rebin(
    input_x: np.ndarray, input_y: np.ndarray, values: np.ndarray,  # (Nx,), (Ny,), (Nx, Ny)
    *,
    bin_width_x: float, bin_width_y: float,
    output_x: np.ndarray, output_y: np.ndarray,                    # uniformly-spaced
    transform: Callable[[np.ndarray], np.ndarray] | None = None,   # (..., 2) -> (..., 2)
    mode: Literal["sum", "mean"] = "sum",
    skip_partial: bool = False,
) -> np.ndarray

rebinning.rebin_quads(
    quads: np.ndarray,                 # (N, 4, 2), pre-transformed corners
    values: np.ndarray,                # (N,)
    *,
    output_x, output_y,
    mode: Literal["sum", "mean"] = "sum",
    skip_partial: bool = False,
) -> np.ndarray

rebinning.rebin_axis_weights(
    in_n: int, in_bw: float, in_origin: float,
    out_n: int, out_bw: float, out_origin: float,
) -> np.ndarray                        # (out_n, in_n), rows sum to 1

skip_partial=True reproduces the original Fortran reference's behaviour of dropping any input cell whose bounding box crosses the output-grid boundary. The default False clamps the iteration range instead, so partial-overlap cells still contribute their in-grid fraction.
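The difference between the two settings comes down to how the candidate bin range is derived from a quad's bounding box. A one-axis sketch, assuming the output grid is described by the left edge of bin 0 and a bin width (candidate_bins is a hypothetical helper, not the library's internal function):

```python
import math

def candidate_bins(lo, hi, origin, bin_width, n_bins, skip_partial=False):
    """Indices of output bins along one axis that the bounding-box interval
    [lo, hi] may overlap. Returns an empty range if there are none."""
    first = math.floor((lo - origin) / bin_width)
    last = math.ceil((hi - origin) / bin_width) - 1
    if skip_partial and (first < 0 or last >= n_bins):
        return range(0)                  # drop the cell entirely
    # Default: clamp to the grid so the in-grid part still contributes.
    return range(max(first, 0), min(last, n_bins - 1) + 1)


# Output grid of 4 unit bins starting at 0; a cell spanning [-0.5, 1.5].
print(list(candidate_bins(-0.5, 1.5, 0.0, 1.0, 4)))                     # [0, 1]
print(list(candidate_bins(-0.5, 1.5, 0.0, 1.0, 4, skip_partial=True)))  # []
```

A cell that lies fully inside the grid gets the same range under both settings.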


Crate layout

src/
├── lib.rs        # PyO3 module: type marshalling and shape checking
├── clip.rs       # Sutherland–Hodgman edge clip + shoelace area
├── rebin.rs      # The quadrilateral rebinning algorithm (the core)
├── weights.rs    # 1D overlap weights (sparse + dense) for the fast path
└── image.rs      # Separable u8/u16/f32/f64 image resizing using those weights

python/rebinning/
└── __init__.py   # Public Python API: rebin, rebin_image, rebin_quads, rebin_axis_weights

The Rust modules have their own #[cfg(test)] unit tests (clipping & weights); the end-to-end tests live in tests/test_rebin.py and include a comparison against cv2.INTER_AREA when OpenCV is installed.

Development tasks

# Set up the venv and build the extension:
uv sync
uv run maturin develop --uv --release

# Run the Python tests:
uv run python tests/test_rebin.py

# Run the Rust unit tests (clip + weights):
cargo test --lib

# Rebuild in place after editing Rust:
uv run maturin develop --uv --release

Provenance

I initially wrote the algorithm for myself as a Fortran 90 routine for grazing-incidence X-ray scattering data reduction. That code is preserved verbatim under legacy/ for reference; the Rust port is a clean rewrite around Sutherland–Hodgman clipping and is no longer specific to scattering geometry.

License

Released under the MIT license.
