Python bindings for the shed watershed delineation engine

pyshed

Python bindings for the shed watershed delineation engine. pyshed loads HFX-format v0.2.1 datasets and returns watershed polygons from a (lat, lon) outlet. HFX v0.1 datasets no longer load. The full native stack (GDAL, PROJ, GEOS, libtiff, SQLite, and more) is bundled inside the wheel — no system install required.

Install

pip install pyshed

Platform support (v0.2.0): Apple Silicon macOS only (macosx_11_0_arm64). Linux, Intel macOS, and Windows wheels are not yet built — community contributions are welcome. See CONTRIBUTING.md if you want to help port the build.

Quickstart

import pyshed

engine = pyshed.Engine("/path/to/hfx/dataset")
result = engine.delineate(lat=47.3769, lon=8.5417)
print(result.area_km2)

Snapping options belong on the constructor, not on delineate:

# Correct — snap_radius is an Engine constructor kwarg
engine = pyshed.Engine("/path/to/hfx/dataset", snap_radius=5000)
result = engine.delineate(lat=47.3769, lon=8.5417)

Geometry repair defaults to the pure-Rust topology cleaner. Pass repair_geometry="gdal" to opt into the GDAL repairer; repair_geometry="auto", "clean", False, and None all use the default cleaner.
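The value mapping described above can be summarised in a few lines of plain Python. This is only an illustrative sketch: resolve_repairer is a hypothetical helper, not part of pyshed, and raising ValueError for any other value is an assumption rather than documented behaviour.

```python
def resolve_repairer(repair_geometry="auto"):
    """Return which repairer the README describes for a repair_geometry value.

    Hypothetical helper for illustration; pyshed does not export it.
    """
    if repair_geometry == "gdal":
        # Explicit opt-in to the GDAL repairer.
        return "gdal"
    if repair_geometry in ("auto", "clean", False, None):
        # All of these select the default pure-Rust topology cleaner.
        return "clean"
    # Assumption: anything else is rejected. The README does not say.
    raise ValueError(f"unsupported repair_geometry value: {repair_geometry!r}")
```

Only "gdal" changes the behaviour; every other documented value is a synonym for the default.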

Engine also accepts dataset root URLs backed by the object-store integration:

local_engine = pyshed.Engine("/data/hfx/rhine")
file_url_engine = pyshed.Engine("file:///data/hfx/rhine")
s3_engine = pyshed.Engine("s3://bucket/path/to/hfx/rhine")
r2_engine = pyshed.Engine(
    "https://<account>.r2.cloudflarestorage.com/<bucket>/path/to/hfx/rhine"
)
public_r2_engine = pyshed.Engine(
    "https://basin-delineations-public.upstream.tech/global/hfx"
)

Remote dataset sessions cache manifest.json and graph.arrow under ~/.cache/hfx/<fabric_name>/<adapter_version>/ by default. Set HFX_CACHE_DIR=/path/to/cache before constructing pyshed.Engine(...) to use a different cache root. Parquet artifacts are read with object-store range reads; they are not copied into the cache wholesale.
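To see where cached manifest.json and graph.arrow files would land, the path rule above can be sketched as follows. hfx_cache_dir is a hypothetical helper, not a pyshed function, and it assumes that HFX_CACHE_DIR replaces only the ~/.cache/hfx root while the <fabric_name>/<adapter_version> subdirectories are still appended.

```python
import os
from pathlib import Path


def hfx_cache_dir(fabric_name, adapter_version, environ=None):
    """Compute the expected cache directory for a remote dataset session.

    Hypothetical helper; the subdirectory layout under HFX_CACHE_DIR
    is an assumption based on the default layout.
    """
    env = os.environ if environ is None else environ
    override = env.get("HFX_CACHE_DIR")
    # Fall back to the documented default root when no override is set.
    root = Path(override) if override else Path.home() / ".cache" / "hfx"
    return root / fabric_name / adapter_version
```

The fabric name and adapter version come from the dataset itself, so the values passed in here are placeholders.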

GDAL raster URI and configuration plumbing is wired through the Python engine, but public Cloudflare R2 raster access still depends on the target bucket, credentials, and GDAL driver behavior. Verify the specific remote raster dataset you plan to use.

Verbose mode

Enable structured log output from both the Python and Rust layers:

import pyshed

pyshed.set_log_level("info")
engine = pyshed.Engine("https://basin-delineations-public.upstream.tech/global/hfx")
# INFO lines stream during manifest/graph/catchment loading
result = engine.delineate(lat=47.3769, lon=8.5417)

Valid levels: "trace", "debug", "info", "warn"/"warning", and "error"/"critical". Set PYSHED_LOG to one of those values to opt in at import time.
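If level names come from user configuration, it can help to validate them before calling set_log_level. The sketch below is not pyshed code: normalize_level is a hypothetical helper, and both the alias pairing (taken from the slashes in the list above) and the case-insensitive matching are assumptions.

```python
# Assumed pairing: "warning" is an alias of "warn", "critical" of "error".
_LEVEL_ALIASES = {
    "trace": "trace",
    "debug": "debug",
    "info": "info",
    "warn": "warn",
    "warning": "warn",
    "error": "error",
    "critical": "error",
}


def normalize_level(level):
    """Map a user-supplied level name to one of the five listed levels."""
    key = level.strip().lower()
    if key not in _LEVEL_ALIASES:
        raise ValueError(f"unknown log level: {level!r}")
    return _LEVEL_ALIASES[key]
```

The normalized name can then be passed to pyshed.set_log_level(...) or exported as PYSHED_LOG before the import.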

Speeding up repeated delineations

Enable the in-memory Parquet column-chunk cache to avoid redundant range reads across overlapping watersheds:

engine = pyshed.Engine(
    "https://basin-delineations-public.upstream.tech/global/hfx",
    parquet_cache=True,
    parquet_cache_max_mb=512,
)

The cache is enabled by default for remote dataset URLs and disabled by default for local paths. parquet_cache_max_mb defaults to 512 when caching is enabled. Cache state is per-Engine instance and is not persisted to disk.
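The defaulting rule can be written down as a small function for reference. This is a sketch under stated assumptions, not pyshed's implementation: parquet_cache_defaults is hypothetical, None is used here to mean "argument not given", and only s3, http, and https roots are treated as remote (file:// URLs are assumed to count as local).

```python
from urllib.parse import urlparse

# Assumption: these schemes are the "remote dataset URLs" the README refers to.
REMOTE_SCHEMES = {"s3", "http", "https"}


def parquet_cache_defaults(dataset_root, parquet_cache=None, parquet_cache_max_mb=None):
    """Return (enabled, max_mb) after applying the documented defaults."""
    scheme = urlparse(dataset_root).scheme.lower()
    if parquet_cache is None:
        # On by default for remote roots, off by default for local paths.
        enabled = scheme in REMOTE_SCHEMES
    else:
        enabled = bool(parquet_cache)
    if not enabled:
        return False, None
    # 512 MB is the documented default size when caching is enabled.
    return True, 512 if parquet_cache_max_mb is None else parquet_cache_max_mb
```

Passing parquet_cache=True on a local path is therefore the only case where the flag needs to be set explicitly to get caching.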

Batch delineation with progress

import pyshed

# tqdm is a user dependency — not bundled with pyshed
from tqdm.auto import tqdm

url = "https://basin-delineations-public.upstream.tech/global/hfx"
engine = pyshed.Engine(url)

outlets = [
    {"lat": 47.3769, "lon": 8.5417},
    {"lat": 46.9480, "lon": 7.4474},
    {"lat": 48.1351, "lon": 11.5820},
]

bar = tqdm(total=len(outlets), unit="outlet")

def on_progress(event):
    bar.update(1)
    bar.set_postfix(status=event.get("status"), ms=event.get("duration_ms"))

results = engine.delineate_batch(outlets, progress=on_progress)
bar.close()

The progress callback receives a dict with keys index, total, lat, lon, duration_ms, status ("ok" or "error"), plus n_catchments on success and error on failure. Exceptions raised inside the callback are swallowed and logged; they do not interrupt the batch.
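Because callback exceptions are swallowed, a callback that records events and defers any checks until the batch has finished is a safer pattern than raising inside it. The sketch below uses only the event keys listed above; BatchSummary is a hypothetical class, and the sample events (including the error text and timings) are made up so the example runs without a dataset.

```python
from collections import Counter


class BatchSummary:
    """Callable that tallies progress events for later inspection."""

    def __init__(self):
        self.counts = Counter()   # number of events per status
        self.total_ms = 0         # summed duration_ms across events
        self.failures = []        # (index, error) pairs for failed outlets

    def __call__(self, event):
        status = event.get("status")
        self.counts[status] += 1
        self.total_ms += event.get("duration_ms") or 0
        if status == "error":
            self.failures.append((event.get("index"), event.get("error")))


summary = BatchSummary()

# Made-up events standing in for what delineate_batch would send.
sample_events = [
    {"index": 0, "total": 3, "lat": 47.3769, "lon": 8.5417,
     "duration_ms": 120, "status": "ok", "n_catchments": 14},
    {"index": 1, "total": 3, "lat": 46.9480, "lon": 7.4474,
     "duration_ms": 5, "status": "error", "error": "example failure"},
    {"index": 2, "total": 3, "lat": 48.1351, "lon": 11.5820,
     "duration_ms": 98, "status": "ok", "n_catchments": 9},
]
for event in sample_events:
    summary(event)
```

With a real engine, the same object would be passed as engine.delineate_batch(outlets, progress=summary), and summary.failures inspected once the call returns.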

Staged delineation

delineate() is the convenience composition of the staged API:

level = engine.select_level(selection=pyshed.LevelSelection.FINEST)
outlet = engine.resolve_outlet(level, lat=47.3769, lon=8.5417)
upstream = engine.traverse(outlet)
units = engine.pre_merge_units(upstream)
refinement = engine.refine(outlet, units)
dissolved = engine.dissolve(units, refinement)
result = engine.compose_result(outlet, upstream, units, refinement, dissolved)

LevelSelection.FINEST is the only level selection in 0.2.0; multi-level selection is on the roadmap.

The staged result matches engine.delineate(lat=47.3769, lon=8.5417). The merged result exposes the final geometry_wkb, the final area_km2, and lightweight per-unit metadata (id, level, area_km2, up_area_km2, outlet). Whole per-unit geometry is available only on PreMergeDrainageUnits.unit_geometry_wkb.

R3 note: pre-merge units are whole source drainage units, including the whole terminal unit. If terminal refinement is applied, summing or unioning those whole units is not the same as the merged area_km2 or geometry_wkb.

GeoParquet export

Exports are explicit writer-object calls and write complete batches:

basin_writer = pyshed.BasinGeoParquetWriter()
basin_writer.write(engine, "basins.parquet", [result], basin_ids=["rhine-basel"])

bundle_writer = pyshed.UnitBundleGeoParquetWriter()
bundle_writer.write(engine, "units.parquet", [units], [refinement])

BasinGeoParquetWriter writes one merged basin row per result. basin_ids are caller-owned, filesystem-safe identifiers. Omitting basin_ids is allowed only with allow_default_basin_id=True and exactly one result, where the terminal unit ID becomes the basin ID.
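The basin_ids rule can be checked before calling the writer. The following is a hedged sketch rather than the writer's own validation: resolve_basin_ids is a hypothetical helper, the "filesystem-safe" pattern is an assumption (letters, digits, dot, underscore, hyphen), and requiring one ID per result is inferred from "one merged basin row per result".

```python
import re

# Assumption: what counts as filesystem-safe is not specified in the README.
_SAFE_ID = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def resolve_basin_ids(terminal_unit_ids, basin_ids=None, allow_default_basin_id=False):
    """Return the basin IDs to write, one per result.

    terminal_unit_ids holds the terminal unit ID of each result, in order.
    """
    if basin_ids is None:
        # Defaulting is allowed only on explicit opt-in with exactly one result.
        if not allow_default_basin_id or len(terminal_unit_ids) != 1:
            raise ValueError("basin_ids is required unless allow_default_basin_id=True "
                             "and there is exactly one result")
        return [str(terminal_unit_ids[0])]
    if len(basin_ids) != len(terminal_unit_ids):
        raise ValueError("expected one basin ID per result")
    for basin_id in basin_ids:
        if not _SAFE_ID.match(basin_id):
            raise ValueError(f"basin ID is not filesystem-safe: {basin_id!r}")
    return list(basin_ids)
```

Running a check like this up front surfaces ID problems before any file is written.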

UnitBundleGeoParquetWriter writes one row per pre-merge drainage unit. Unit rows use dataset-local unit_id, include terminal_unit_id and delineation grouping columns, and store whole-unit geometry.

Default delineation labels are {fabric_name}/{fabric_version}/{method}. The default method is d8-best-effort when refinement is enabled and no-refine when refine=False. The actual outcome is stored separately in refinement_status.
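As a quick illustration of the label format, the default can be reproduced with a one-line f-string. default_delineation_label is a hypothetical helper, and the fabric name and version in the usage note are placeholders, not real dataset values.

```python
def default_delineation_label(fabric_name, fabric_version, refine=True):
    """Build the documented default label: {fabric_name}/{fabric_version}/{method}."""
    # The method reflects what was requested, not what actually happened;
    # the actual outcome is recorded separately in refinement_status.
    method = "d8-best-effort" if refine else "no-refine"
    return f"{fabric_name}/{fabric_version}/{method}"
```

For example, default_delineation_label("example-fabric", "1.0", refine=False) yields "example-fabric/1.0/no-refine".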

API Reference

For the full developer-oriented API surface, including argument types, return types, and the exception hierarchy, see API.md.

What it does

Resolves the outlet coordinate to a terminal HFX unit (via snap.parquet or point-in-polygon on catchments.parquet).
Walks the upstream graph in graph.parquet collecting all contributing units.
Optionally refines the terminal unit geometry using flow_dir.tif / flow_acc.tif rasters when present.
Returns a dissolved MultiPolygon + geodesic area in km².
Bundles GDAL / PROJ / GEOS / libtiff / SQLite — no system GDAL install needed.
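The upstream walk in the second step is conceptually a graph traversal. The sketch below shows the idea with a breadth-first search over a plain dict; it is not the engine's implementation, the dict is a stand-in for the HFX graph artifact, and including the terminal unit in the returned set is a choice made for this example.

```python
from collections import deque


def collect_upstream(upstream_of, terminal_id):
    """Return the terminal unit plus every unit that drains into it.

    upstream_of maps a unit ID to the IDs of its direct upstream neighbours.
    """
    seen = {terminal_id}
    queue = deque([terminal_id])
    while queue:
        unit = queue.popleft()
        for neighbour in upstream_of.get(unit, ()):
            if neighbour not in seen:
                # Visit each unit once, even if several paths reach it.
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Toy graph: units 2 and 3 drain into 1, unit 4 drains into 2.
# Units 5 and 6 belong to a separate basin.
toy_graph = {1: [2, 3], 2: [4], 3: [], 4: [], 5: [6]}
contributing = collect_upstream(toy_graph, 1)
```

Here contributing holds units 1, 2, 3, and 4; units 5 and 6 are never reached because they do not drain to unit 1.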

Release history

0.2.4: Jun 20, 2026
0.2.3: Jun 7, 2026
0.2.2: Jun 6, 2026
0.2.1: Jun 6, 2026 (this version)
0.2.0: Jun 5, 2026
0.1.11: May 6, 2026
0.1.10: May 4, 2026
0.1.9: May 4, 2026
0.1.8: Apr 27, 2026
0.1.7: Apr 21, 2026
0.1.3: Apr 21, 2026
0.1.2: Apr 18, 2026
0.1.1: Apr 17, 2026
0.1.0: Apr 17, 2026

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

pyshed-0.2.1-cp39-abi3-macosx_11_0_arm64.whl (22.6 MB)

Uploaded Jun 6, 2026. CPython 3.9+, macOS 11.0+ ARM64.

File details

Details for the file pyshed-0.2.1-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: pyshed-0.2.1-cp39-abi3-macosx_11_0_arm64.whl
Upload date: Jun 6, 2026
Size: 22.6 MB
Tags: CPython 3.9+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pyshed-0.2.1-cp39-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`c3ef9cda95509eba96615dc625a9c5870d876c5faf718bc64f9a4caeef31e9d0`
MD5	`aa36f17812c3934096a551e1c84431c0`
BLAKE2b-256	`7919c5279700b18a230adac01744463f55b85087bc3c41a1597cd88db8e52b3a`
