Exact-fractional-area zonal statistics over weather grids
geohalo
Exact-fractional-area zonal statistics over regular lat/lon grids.
Given a regular lat/lon mesh of gridded values — temperature, precipitation,
population density, a land-cover fraction, a satellite band, … (loaded with
xarray from GRIB, NetCDF, Zarr, …) — and an arbitrary set of polygons,
geohalo reduces the spatial dimensions of the mesh to one value per
polygon with sub-cell precision and millisecond-scale aggregation
in the hot path.
The expensive geometric work happens once; every subsequent grid collapses to a single sparse · dense matmul.
📖 Full documentation: https://campiohe.github.io/geohalo/
How it works
Aggregation is a linear operator:
aggregates = W @ flat_grid_values
where W ∈ ℝ^(N_polygons × N_cells) is a sparse matrix whose entries are the
exact fractional area of cell ∩ polygon weighted by each cell's true
surface area on a sphere. W (the Stencil) depends only on the grid topology
and the polygon set — not on the grid values — so it is built once (and
is cacheable) and reused for every slice. See
Aggregation as a linear operator.
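As a rough illustration of the operator view above, the sketch below builds a tiny weight matrix by hand and applies it with one matmul. It is not geohalo's implementation: `box_coverage` is a hypothetical helper that only handles axis-aligned boxes, the weights are planar coverage fractions (the spherical cell-area factor is left out for brevity), and `W` is dense here where the real operator would be sparse.

```python
import numpy as np

def box_coverage(lat_edges, lon_edges, box):
    """Fraction of each grid cell covered by an axis-aligned box.

    box = (lon_min, lat_min, lon_max, lat_max); returns a flat, row-major vector.
    Hypothetical helper for illustration only.
    """
    lon_min, lat_min, lon_max, lat_max = box
    # 1-D overlap length between each cell interval and the box interval.
    lat_ov = np.clip(np.minimum(lat_edges[1:], lat_max)
                     - np.maximum(lat_edges[:-1], lat_min), 0.0, None)
    lon_ov = np.clip(np.minimum(lon_edges[1:], lon_max)
                     - np.maximum(lon_edges[:-1], lon_min), 0.0, None)
    # The covered fraction of a rectangular cell is the product of the 1-D fractions.
    frac = np.outer(lat_ov / np.diff(lat_edges), lon_ov / np.diff(lon_edges))
    return frac.ravel()

# 2 x 3 grid of 1-degree cells with one value per cell.
lat_edges = np.array([0.0, 1.0, 2.0])
lon_edges = np.array([0.0, 1.0, 2.0, 3.0])
grid = np.array([[10.0, 20.0, 30.0],
                 [40.0, 50.0, 60.0]])

# One row of W per polygon, one column per cell.
boxes = [(0.5, 0.0, 2.0, 2.0), (2.0, 1.0, 3.0, 2.0)]
W = np.vstack([box_coverage(lat_edges, lon_edges, b) for b in boxes])

# Normalising each row turns the weighted sum into a weighted mean.
W_mean = W / W.sum(axis=1, keepdims=True)
aggregates = W_mean @ grid.ravel()
```

The first box covers half of the cells in the first column and all of the cells in the second, so those half-covered cells contribute with weight 0.5 rather than being counted fully or dropped.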
Install
geohalo targets Python ≥ 3.12.
uv add geohalo # or: pip install geohalo
Optional extras: redis (the RedisCache backend) and matplotlib
(the helpers in geohalo.plot).
Quickstart
import numpy as np
import geopandas as gpd
import xarray as xr
from shapely.geometry import box
import geohalo as ghl
# any regular lat/lon DataArray works; a synthetic field so this runs as-is
lats = np.arange(-25.0, -19.0, 0.25)
lons = np.arange(-50.0, -42.0, 0.25)
lon2d, lat2d = np.meshgrid(lons, lats)
field = 290.0 + 5.0 * np.cos(np.deg2rad(4 * lat2d)) + 0.1 * lon2d
da = xr.DataArray(
    field, dims=("latitude", "longitude"),
    coords={"latitude": lats, "longitude": lons}, name="value",
)
geoms = gpd.GeoSeries(
    [box(-49, -24, -47, -22), box(-47, -24, -45, -22), box(-46, -22, -44, -20)],
    index=["SP", "RJ", "MG"],  # the index holds the keys
)
out = ghl.reduce(da, geoms) # hot path; ms-scale
out_fine = ghl.reduce(da, geoms, target_resolution=0.05) # refine the grid first
# out: xr.DataArray over (..., geom)
The output preserves every non-spatial dim of da (time, ensemble member,
band, vertical level, …) and replaces (latitude, longitude) with a single
geom dim indexed by the GeoSeries keys.
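The dimension handling described above can be mimicked with plain NumPy. This is a sketch under assumptions, not geohalo code: the weight matrix `W` is random and dense, and the sizes are arbitrary. It only shows how leading dims survive while the two spatial dims collapse into one polygon axis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_member, n_lat, n_lon, n_geom = 3, 2, 4, 5, 2

# A batch of grids with two leading non-spatial dims.
values = rng.normal(size=(n_time, n_member, n_lat, n_lon))

# Stand-in weight matrix (rows sum to 1, so each output is a weighted mean).
W = rng.random((n_geom, n_lat * n_lon))
W /= W.sum(axis=1, keepdims=True)

# Flatten (lat, lon) into one cell axis, then contract it against W.
flat = values.reshape(n_time, n_member, n_lat * n_lon)
out = flat @ W.T  # shape: (time, member, geom)
```

Because the contraction is a single matmul over the flattened cell axis, the whole batch is reduced in one call rather than slice by slice.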
reduce also accepts an xr.Dataset (every spatial data var is reduced),
how={"mean", "sum"}, a weight_key naming a per-cell weight variable, and
spherical_correction=False to disable the latitude-area correction.
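To make the latitude-area correction concrete, here is a minimal sketch assuming a unit sphere and cells of equal longitude width. On a sphere, the area of a latitude band is proportional to sin(top) - sin(bottom), so cells shrink toward the poles. The function name `lat_band_weights` is hypothetical and this is not taken from geohalo's source.

```python
import numpy as np

def lat_band_weights(lat_centers_deg, dlat_deg):
    """Relative cell area per latitude row on a unit sphere (equal lon width)."""
    lat = np.deg2rad(np.asarray(lat_centers_deg, dtype=float))
    half = np.deg2rad(dlat_deg) / 2.0
    # Band area per unit longitude: sin(upper edge) - sin(lower edge).
    return np.sin(lat + half) - np.sin(lat - half)

lats = np.array([0.0, 30.0, 60.0])          # cell-centre latitudes
values = np.array([300.0, 290.0, 270.0])    # one value per latitude row

w = lat_band_weights(lats, dlat_deg=1.0)

planar_mean = values.mean()                        # every cell counts equally
spherical_mean = np.sum(w * values) / np.sum(w)    # cells weighted by true area
```

A cell at 60° has half the area of an equatorial cell of the same angular size, so the colder high-latitude value pulls the spherical mean down less than it pulls the planar mean.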
Documentation
Everything is covered in depth at https://campiohe.github.io/geohalo/:
- Quickstart — runnable examples: batching, datasets, weights, NaN handling, refining, rollups.
- Concepts — linear operator · the stencil · why exact fractional coverage · latitude correction · mean-preserving downscaling · the fused reduce operator · NaN-aware & weighted reduction · hierarchical rollups
- Guides — caching the precompute · resampling grids
- API reference
Performance
The hot path is a single sparse · dense matmul: a 50-member batch over the
~5,570 GADM Brazil L2 municipalities reduces in single-digit milliseconds,
and the one-time Stencil precompute takes seconds and is cacheable. Methodology,
full tables, and the fused-operator size win are on the
Performance page; re-run the
suite with uv run python -m benchmarks.run.
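The "cacheable" part can be pictured as a lookup keyed only on the grid coordinates and the polygons. The sketch below uses an in-memory dict and hypothetical names (`stencil_key`, `get_stencil`, `fake_build`); it is not geohalo's cache API, and the polygons are passed as WKT strings purely for illustration.

```python
import hashlib
import json

_stencil_cache = {}  # in-memory stand-in for a real cache backend

def stencil_key(lats, lons, polygons_wkt):
    """Hash only what the operator depends on: grid coordinates and polygons."""
    payload = json.dumps({
        "lats": [float(v) for v in lats],
        "lons": [float(v) for v in lons],
        "polygons": list(polygons_wkt.items()),
    })
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def get_stencil(lats, lons, polygons_wkt, build):
    """Return a cached operator, running the expensive build only on a miss."""
    key = stencil_key(lats, lons, polygons_wkt)
    if key not in _stencil_cache:
        _stencil_cache[key] = build()
    return _stencil_cache[key]

calls = []

def fake_build():
    # Placeholder for the expensive geometric precompute.
    calls.append(1)
    return "W"

lats = [-25.0, -24.75]
lons = [-50.0, -49.75]
polys = {"SP": "POLYGON ((-49 -24, -47 -24, -47 -22, -49 -22, -49 -24))"}

# Two different variables on the same grid and polygons share one operator.
w_temperature = get_stencil(lats, lons, polys, fake_build)
w_precip = get_stencil(lats, lons, polys, fake_build)
```

Grid values never enter the key, which is why the second lookup hits the cache even though it is for a different variable.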
Non-goals
- No reprojection: EPSG:4326 throughout (grids and polygons).
- No per-variable cache: the Stencil depends on grid + polygons only.
- No WGS84-ellipsoidal cell areas: spherical is within ~0.3 % (spherical_correction=False gives planar/equal-area weights).
- No DAG hierarchies: each child has exactly one parent (tree only).
- No how={"min", "max"}: mean and sum only.
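The tree-only constraint on hierarchies is what keeps rollups simple: every child contributes to exactly one parent, so a parent mean is the weight-combined mean of its children. The sketch below illustrates that idea with plain dicts. The function `rollup_mean` and its arguments are hypothetical, and it assumes each child carries its total weight (for example its covered area); geohalo's own rollup mechanism may differ.

```python
from collections import defaultdict

def rollup_mean(child_mean, child_weight, parent_of):
    """Combine child means into parent means, weighting by each child's total weight."""
    num = defaultdict(float)
    den = defaultdict(float)
    for child, mean in child_mean.items():
        parent = parent_of[child]  # exactly one parent per child (tree)
        num[parent] += child_weight[child] * mean
        den[parent] += child_weight[child]
    return {parent: num[parent] / den[parent] for parent in num}

child_mean = {"a": 10.0, "b": 20.0, "c": 30.0}
child_weight = {"a": 1.0, "b": 3.0, "c": 2.0}
parent_of = {"a": "X", "b": "X", "c": "Y"}

parents = rollup_mean(child_mean, child_weight, parent_of)
```

With a DAG, a child shared by two parents would need its weight split between them, which is the complication the tree restriction avoids.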
Development
uv sync # install deps
uv run pytest # tests
uv run ruff check . # lint
uv run --group docs mkdocs serve # preview the docs locally
uv run --group docs python docs/gen_figures.py # regenerate the doc figures
Docs are built with Material for MkDocs
and deployed to https://campiohe.github.io/geohalo/ on every push to main.