Exact-fractional-area zonal statistics over weather grids

geohalo

Exact-fractional-area zonal statistics over regular lat/lon grids.

Given a regular lat/lon mesh of gridded values — temperature, precipitation, population density, a land-cover fraction, a satellite band, … (loaded with xarray from GRIB, NetCDF, Zarr, …) — and an arbitrary set of polygons, geohalo reduces the spatial dimensions of the mesh to one value per polygon with sub-cell precision and millisecond-scale aggregation in the hot path.

The expensive geometric work happens once; every subsequent grid collapses to a single sparse · dense matmul.

📖 Full documentation: https://campiohe.github.io/geohalo/

How it works

Aggregation is a linear operator:

aggregates = W @ flat_grid_values

where W ∈ ℝ^(N_polygons × N_cells) is a sparse matrix whose entries are the exact fractional area of cell ∩ polygon weighted by each cell's true surface area on a sphere. W (the Stencil) depends only on the grid topology and the polygon set — not on the grid values — so it is built once (and cacheable) and reused for every slice. See Aggregation as a linear operator.
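The idea of aggregation as a matrix product can be sketched with scipy.sparse. This is a minimal illustration, not geohalo's implementation: the 2 × 3 grid, the two polygons, and the overlap fractions are all made up, and the real Stencil additionally folds in each cell's surface area on the sphere.

```python
import numpy as np
from scipy import sparse

# Hypothetical 2 x 3 grid (6 cells, flattened row-major) and 2 polygons.
# Each stored entry is the fraction of a cell covered by a polygon;
# the numbers are invented for illustration.
rows = np.array([0, 0, 0, 1, 1])          # polygon index
cols = np.array([0, 1, 3, 4, 5])          # flat cell index
frac = np.array([1.0, 0.5, 0.25, 1.0, 0.5])
W = sparse.csr_matrix((frac, (rows, cols)), shape=(2, 6))

# Normalise each row to sum to 1 so that W_mean @ values is a weighted mean.
row_sums = np.asarray(W.sum(axis=1)).ravel()
W_mean = sparse.diags(1.0 / row_sums) @ W

# A batch of 4 grids (e.g. time steps), each flattened to 6 cell values.
grids = np.arange(24, dtype=float).reshape(4, 2, 3)
flat = grids.reshape(4, -1)               # (n_slices, n_cells)

# One sparse-dense product reduces every slice at once.
aggregates = (W_mean @ flat.T).T          # (n_slices, n_polygons)
```

Because W_mean never looks at the grid values, the same matrix can be applied to any number of slices that share the grid; only the dense right-hand side changes.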

Install

geohalo targets Python ≥ 3.12.

uv add geohalo            # or: pip install geohalo

Optional extras: redis (the RedisCache backend) and matplotlib (the helpers in geohalo.plot).

Quickstart

import numpy as np
import geopandas as gpd
import xarray as xr
from shapely.geometry import box
import geohalo as ghl

# any regular lat/lon DataArray works; a synthetic field so this runs as-is
lats = np.arange(-25.0, -19.0, 0.25)
lons = np.arange(-50.0, -42.0, 0.25)
lon2d, lat2d = np.meshgrid(lons, lats)
field = 290.0 + 5.0 * np.cos(np.deg2rad(4 * lat2d)) + 0.1 * lon2d

da = xr.DataArray(
    field, dims=("latitude", "longitude"),
    coords={"latitude": lats, "longitude": lons}, name="value",
)

geoms = gpd.GeoSeries(
    [box(-49, -24, -47, -22), box(-47, -24, -45, -22), box(-46, -22, -44, -20)],
    index=["SP", "RJ", "MG"],            # the index holds the keys
)

out = ghl.reduce(da, geoms)                  # hot path; ms-scale
out_fine = ghl.reduce(da, geoms, target_resolution=0.05)   # refine the grid first
# out: xr.DataArray over (..., geom)

The output preserves every non-spatial dim of da (time, ensemble member, band, vertical level, …) and replaces (latitude, longitude) with a single geom dim indexed by the GeoSeries keys.

reduce also accepts an xr.Dataset (every spatial data var is reduced), how={"mean", "sum"}, a weight_key naming a per-cell weight variable, and spherical_correction=False to disable the latitude-area correction.
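To see why a latitude-area correction matters, the sketch below computes relative cell areas on a sphere for the quickstart latitudes. It assumes the standard result that a lat/lon cell's area is proportional to the difference of the sines of its bounding latitudes; the helper name cell_area_weights is hypothetical, and whether geohalo uses exactly this formula internally is not stated here.

```python
import numpy as np

def cell_area_weights(lat_centres_deg, dlat_deg):
    """Relative area of equal-width lat/lon cells on a sphere, per latitude row."""
    half = np.deg2rad(dlat_deg) / 2.0
    lat = np.deg2rad(np.asarray(lat_centres_deg, dtype=float))
    # The area between two parallels is proportional to sin(north) - sin(south).
    band = np.sin(lat + half) - np.sin(lat - half)
    # Scale so the largest cell (closest to the equator) has weight 1.
    return band / band.max()

# Same latitude centres as the quickstart grid (0.25 degree spacing).
lats = np.arange(-25.0, -19.0, 0.25)
weights = cell_area_weights(lats, 0.25)
```

Over this six-degree span the southernmost row comes out about 4 % smaller than the northernmost, which is the kind of difference that planar weights ignore.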

Documentation

Everything is covered in depth at https://campiohe.github.io/geohalo/.

Performance

The hot path is a single sparse · dense matmul: a 50-member batch over the ~5,570 GADM Brazil L2 municipalities reduces in single-digit milliseconds, and the one-time Stencil precompute takes seconds and can be cached. Methodology, full tables, and the fused-operator size win are on the Performance page; re-run the suite with uv run python -m benchmarks.run.

Non-goals

  • No reprojection — EPSG:4326 throughout (grids and polygons).
  • No per-variable cache — the Stencil depends on grid + polygons only.
  • No WGS84-ellipsoidal cell areas — spherical is within ~0.3 % (spherical_correction=False gives planar/equal-area weights).
  • No DAG hierarchies — each child has exactly one parent (tree only).
  • No how={"min", "max"}: mean and sum only.

Development

uv sync                                          # install deps
uv run pytest                                    # tests
uv run ruff check .                              # lint
uv run --group docs mkdocs serve                 # preview the docs locally
uv run --group docs python docs/gen_figures.py   # regenerate the doc figures

Docs are built with Material for MkDocs and deployed to https://campiohe.github.io/geohalo/ on every push to main.
