Python bindings for Whitebox backend tool runtime

Whitebox Workflows for Python

Whitebox Workflows for Python is the Python interface for the Whitebox backend runtime.

The API is under active modernization, with emphasis on:

  • clearer data-object ergonomics,
  • better discoverability and IntelliSense,
  • memory-first workflows,
  • stronger interoperability with Python data tooling.

Current API highlights

  • Harmonized metadata access:
    • Raster.metadata()
    • Vector.metadata()
    • Lidar.metadata()
  • Vector attribute readability aliases:
    • schema(), attributes(), attribute()
    • update_attributes(), update_attribute(), add_field()
  • Dataset-aware vector write/copy for multifile formats.
  • Raster and NumPy bridge:
    • Raster.to_numpy(...)
    • Raster.from_numpy(...)
  • Lidar and NumPy bridge:
    • Lidar.to_numpy(...)
    • Lidar.from_numpy(...)
    • Lidar.to_numpy_chunks(...)
    • Lidar.from_numpy_chunks(...)

Preferred API conventions

Use these conventions as the default style for new code:

  1. Prefer harmonized metadata methods:
    • Raster.metadata()
    • Vector.metadata()
    • Lidar.metadata()
  2. Use canonical vector attribute methods:
    • schema()/attributes()/attribute() for reads.
    • update_attributes()/update_attribute()/add_field() for writes.
  3. Use explicit namespaces for utility functions and avoid flat helper patterns:
    • wbe.projection.* for CRS/projection utilities.
    • wbe.topology.* for geometry/topology utilities.
    • wbe.topology_tools for topology tool-category access.
  4. Prefer object-first workflows:
    • read_* -> inspect metadata() -> run category tools -> write_*.
  5. Use strict output controls when reproducibility matters:
    • Set strict_format_options=True and explicit format/layout/compression options.

Intent-driven entry points

If you are starting from a task instead of an API name, use these jump points:

Migration quick map

Common updates from the removed legacy style to the canonical API:

| Legacy style | Current style |
|---|---|
| v.attribute_fields() | v.schema() |
| v.get_attributes(i) | v.attributes(i) |
| v.get_attribute(i, field) | v.attribute(i, field) |
| v.set_attributes(i, values) | v.update_attributes(i, values) |
| v.set_attribute(i, field, value) | v.update_attribute(i, field, value) |
| v.add_attribute_field(...) | v.add_field(...) |

Notes:

  • Pre-release aliases were removed to reduce API ambiguity.
  • Canonical methods improve readability and consistency across object types.
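
The rename table above can be applied mechanically to existing scripts. The following is a minimal sketch of such a pass; migrate_source is a hypothetical helper (not part of whitebox_workflows), it only rewrites calls of the form .name(, and it assumes the legacy names are not used by unrelated objects in the same file.

```python
import re

# Legacy -> canonical vector attribute method names, copied from the table above.
RENAMES = {
    "attribute_fields": "schema",
    "get_attributes": "attributes",
    "get_attribute": "attribute",
    "set_attributes": "update_attributes",
    "set_attribute": "update_attribute",
    "add_attribute_field": "add_field",
}

# Match ".legacy_name(" exactly; the trailing "(" keeps get_attribute
# from matching inside get_attributes.
_PATTERN = re.compile(r"\.(" + "|".join(map(re.escape, RENAMES)) + r")\(")


def migrate_source(text: str) -> str:
    """Return text with legacy method calls renamed (hypothetical helper)."""
    return _PATTERN.sub(lambda m: "." + RENAMES[m.group(1)] + "(", text)
```

Review the resulting diff before committing; a plain text substitution cannot tell a Vector object from any other object that happens to expose a method with the same name.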

Tool reference docs

Design and migration notes:

Development install

From workspace root:

./scripts/dev_python_install.sh

To build the Python extension with Pro support compiled in:

./scripts/dev_python_install.sh --pro

You can also enable the same behavior with an environment variable:

WBW_PYTHON_ENABLE_PRO=1 ./scripts/dev_python_install.sh

This performs an editable install via maturin for the wbw_python crate.

Quick smoke test

python3 crates/wbw_python/examples/python_import_smoke_test.py

Interoperability-focused smoke test (optional dependencies):

python3 crates/wbw_python/examples/interop_roundtrip_smoke_test.py

Optional packages used when available:

  • numpy
  • rasterio
  • geopandas
  • shapely
  • pyproj

Recommended examples

Suggested run order for new users:

| Order | Script | Focus |
|---|---|---|
| 1 | examples/quickstart_harmonized_api.py | Raster/vector/lidar metadata quickstart |
| 2 | examples/current_api_data_handling_demo.py | End-to-end object read/process/write |
| 3 | examples/sensor_bundle_overview.py | Supported sensor bundle families, band/measurement access, and preview outputs |
| 4 | examples/vector_attributes_harmonized_api.py | Vector schema + attribute access aliases |
| 5 | examples/vector_multifile_write_demo.py | Shapefile/MapInfo dataset-aware outputs |
| 6 | examples/raster_numpy_roundtrip.py | 2D NumPy roundtrip |
| 7 | examples/raster_numpy_multiband_roundtrip.py | 3D NumPy roundtrip (bands-first and rows-cols-bands) |
| | examples/licensing_offline_example.py | Offline signed entitlement startup |
| | examples/licensing_floating_online_example.py | Floating license startup |

Run from this directory:

python examples/quickstart_harmonized_api.py

Canonical workflows

The following five workflows are the preferred foundation for new user docs and examples. Each includes one end-to-end reference script.

| Workflow | Preferred pattern | End-to-end example |
|---|---|---|
| Raster analysis | read_raster -> tools via wbe.raster/wbe.terrain/wbe.hydrology -> write_raster | examples/current_api_data_handling_demo.py |
| Vector attribute and geometry processing | read_vector -> schema/attributes/update_* -> write_vector | examples/vector_attributes_harmonized_api.py |
| Lidar processing | read_lidar -> tools via wbe.lidar -> write_lidar | examples/current_api_data_handling_demo.py |
| Reprojection pipeline | reproject_raster/reproject_vector/reproject_lidar -> targeted write | examples/current_api_data_handling_demo.py |
| Interop-first exchange | wbw object -> ecosystem bridge -> wbw object re-ingest | examples/interop_roundtrip_smoke_test.py |

Network service-area workflows

Service-area analysis now supports both per-origin and merged polygon coverage, plus optional mode-aware costing.

Per-origin polygons (default):

import whitebox_workflows as wb

wbe = wb.WbEnvironment()

wbe.network_service_area(
  input="network.gpkg",
  origins="origins.gpkg",
  max_cost=15.0,
  output_mode="polygons",
  output="service_area_per_origin.gpkg",
)

Merged coverage polygons by ring:

import whitebox_workflows as wb

wbe = wb.WbEnvironment()

wbe.network_service_area(
  input="network.gpkg",
  origins="origins.gpkg",
  max_cost=15.0,
  ring_costs="5,10,15",
  output_mode="polygons",
  polygon_merge_origins=True,
  output="service_area_merged_by_ring.gpkg",
)

Mode-aware service area using per-mode speeds:

import whitebox_workflows as wb

wbe = wb.WbEnvironment()

wbe.network_service_area(
  input="network.gpkg",
  origins="origins.gpkg",
  max_cost=20.0,
  output_mode="edges",
  mode_field="MODE",
  default_mode_speed=1.0,
  mode_speed_overrides="walk:1.4,drive:12.0",
  allowed_modes="walk,drive",
  output="service_area_mode_aware.gpkg",
)
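
The examples above pass ring_costs and mode_speed_overrides as delimited strings. When those values live in Python lists or dictionaries, small formatting helpers avoid hand-built strings. This is a sketch only: format_ring_costs and format_mode_speeds are hypothetical names, and sorting the costs and rejecting non-positive values are assumptions about sensible input rather than documented backend requirements.

```python
def _num(value):
    """Render 5.0 as '5' and 1.4 as '1.4' without scientific notation for typical costs."""
    value = float(value)
    return str(int(value)) if value.is_integer() else str(value)


def format_ring_costs(costs):
    """Join ring costs into the comma-separated form used by ring_costs."""
    ordered = sorted(float(c) for c in costs)
    if any(c <= 0 for c in ordered):
        raise ValueError("ring costs must be positive")
    return ",".join(_num(c) for c in ordered)


def format_mode_speeds(speeds):
    """Join {mode: speed} into the 'mode:speed,...' form used by mode_speed_overrides."""
    return ",".join(f"{mode}:{_num(speed)}" for mode, speed in speeds.items())
```

The results can then be passed as ring_costs=format_ring_costs([5, 10, 15]) and mode_speed_overrides=format_mode_speeds({'walk': 1.4, 'drive': 12.0}).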

Recommended API pattern

import whitebox_workflows as wb

wbe = wb.WbEnvironment()
wbe.working_directory = '/path/to/data'

dem = wbe.read_raster('dem.tif')
meta = dem.metadata()
print(meta.rows, meta.columns, meta.nodata)

filled = wbe.hydrology.fill_depressions(dem)
accum = wbe.hydrology.d8_flow_accum(filled)
wbe.write_raster(accum, 'flow_accum.tif')

Quick start examples by data type

Raster

# Read and inspect
dem = wbe.read_raster('dem.tif')
meta = dem.metadata()
print(f'Size: {meta.rows} x {meta.columns}, CRS: {meta.epsg_code}')

# Apply a tool
slope = wbe.terrain.slope(dem)

# Write result (default GeoTIFF behavior uses backend defaults)
wbe.write_raster(slope, 'slope_default.tif')

# Extensionless path defaults to COG-style GeoTIFF
wbe.write_raster(slope, 'slope_default')  # writes slope_default.tif

# Write with explicit GeoTIFF/COG controls
wbe.write_raster(
  slope,
  'slope_cog.tif',
  options={
    'compress': True,
    'strict_format_options': True,
    'geotiff': {
      'compression': 'deflate',
      'bigtiff': False,
      'layout': 'cog',
      'tile_size': 512,
    },
  },
)

# Batch write with one option profile
wbe.write_rasters(
  [slope],
  ['slope_tiled.tif'],
  options={
    'compress': False,
    'geotiff': {
      'layout': 'tiled',
      'tile_width': 256,
      'tile_height': 256,
    },
  },
)

Raster output controls

WbEnvironment.write_raster(...) and WbEnvironment.write_rasters(...) accept an options dictionary for output control.

Recommended vs advanced:

  • Recommended: begin with default write behavior or minimal compress/layout settings.
  • Advanced: use strict_format_options=True and explicit codec/layout/tile controls when exact output reproducibility is required.

Supported keys:

  • compress (True/False): convenience toggle for GeoTIFF compression.
    • True maps to deflate.
    • False maps to uncompressed GeoTIFF.
  • strict_format_options (True/False): when True, using GeoTIFF options on non-GeoTIFF outputs raises an error.
  • geotiff (dict): GeoTIFF/COG-specific controls.
    • compression: none, deflate, lzw, packbits, jpeg, webp, jpegxl
    • bigtiff: True or False
    • layout: standard, stripped, tiled, cog
    • rows_per_strip (for stripped)
    • tile_width, tile_height (for tiled)
    • tile_size (for cog)

Notes:

  • For GeoTIFF outputs (.tif, .tiff), options are applied directly.

  • For non-GeoTIFF outputs, GeoTIFF-specific options are ignored unless strict_format_options=True.

  • Backend GeoTIFF default compression is Deflate when no explicit override is provided.

  • If no output extension is provided (for example "my_file"), write_raster(...) defaults to COG-style GeoTIFF output at "my_file.tif".
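
Because the options dictionary is plain Python, typos in layout or codec names are easy to miss until write time. The sketch below builds the dictionary from the values listed above and rejects obvious mismatches early. geotiff_options is a hypothetical client-side helper; whether the backend itself rejects a size key that does not match the layout is not stated here, so that check is an assumption made for convenience.

```python
# Values copied from the "Supported keys" list above.
COMPRESSIONS = {"none", "deflate", "lzw", "packbits", "jpeg", "webp", "jpegxl"}
LAYOUTS = {"standard", "stripped", "tiled", "cog"}
# Size keys that the list above associates with each layout.
LAYOUT_KEYS = {
    "standard": set(),
    "stripped": {"rows_per_strip"},
    "tiled": {"tile_width", "tile_height"},
    "cog": {"tile_size"},
}


def geotiff_options(layout, compression="deflate", bigtiff=False, strict=True, **sizes):
    """Build a write_raster options dict, rejecting mismatched keys early."""
    if layout not in LAYOUTS:
        raise ValueError(f"unknown layout: {layout!r}")
    if compression not in COMPRESSIONS:
        raise ValueError(f"unknown compression: {compression!r}")
    extra = set(sizes) - LAYOUT_KEYS[layout]
    if extra:
        raise ValueError(f"{sorted(extra)} not applicable to layout {layout!r}")
    geotiff = {"compression": compression, "bigtiff": bigtiff, "layout": layout, **sizes}
    return {"strict_format_options": strict, "geotiff": geotiff}
```

For example, options=geotiff_options('cog', tile_size=512) produces the same shape of dictionary as the explicit COG call shown earlier.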

Common output profiles

# 1) Standard GeoTIFF (default backend behavior)
wbe.write_raster(result, 'out_standard.tif')

# 2) Explicit stripped GeoTIFF
wbe.write_raster(
  result,
  'out_stripped.tif',
  options={
    'geotiff': {
      'layout': 'stripped',
      'rows_per_strip': 32,
    },
  },
)

# 3) Explicit tiled GeoTIFF
wbe.write_raster(
  result,
  'out_tiled.tif',
  options={
    'geotiff': {
      'layout': 'tiled',
      'tile_width': 256,
      'tile_height': 256,
    },
  },
)

# 4) Cloud-Optimized GeoTIFF (COG)
wbe.write_raster(
  result,
  'out_cog.tif',
  options={
    'compress': True,
    'geotiff': {
      'layout': 'cog',
      'tile_size': 512,
      'bigtiff': False,
    },
  },
)

Sensor bundle (Sentinel-2)

# Open a Sentinel-2 SAFE bundle
s2 = wbe.read_sentinel2('S2A_MSIL2A_20250714T160911_N0511_R097_T17TNH_20250714T221309.SAFE')

# Inspect bundle metadata
print(s2.family)
print(s2.tile_id(), s2.processing_level(), s2.cloud_cover_percent())
print(s2.list_band_keys())

# Read individual bands by key
red = s2.read_band('B04')
green = s2.read_band('B03')
blue = s2.read_band('B02')

# Build and persist composites using the bundle-aware helpers
rgb = wbe.true_colour_composite(s2.bundle_root, output='sentinel2_rgb.tif')
nir = wbe.false_colour_composite(s2.bundle_root, output='sentinel2_nir.tif')

# Or use the Bundle convenience delegates (same result)
rgb = s2.true_colour_composite(wbe, output='sentinel2_rgb.tif')

For a broader multi-family example, see examples/sensor_bundle_overview.py.

Vector

# Read and inspect
roads = wbe.read_vector('roads.shp')
schema = roads.schema()
print(f'Geometry: {schema.geometry_type}, Fields: {len(schema.fields)}')

# Access attributes
for i in range(min(3, roads.num_records())):
    attrs = roads.attributes(i)
    print(f'Record {i}: {attrs}')

# Process and persist
centroids = wbe.vector.geometry_processing.centroid_vector(roads)
wbe.write_vector(centroids, 'roads_centroids.shp')

# Extensionless path defaults to GeoPackage
wbe.write_vector(centroids, 'roads_centroids')  # writes roads_centroids.gpkg

Vector output controls

WbEnvironment.write_vector(...), WbEnvironment.read_vector(...), and WbEnvironment.read_vectors(...) accept optional options dictionaries.

Recommended vs advanced:

  • Recommended: rely on extension-driven defaults and add only minimal format options.
  • Advanced: enable strict_format_options=True and tune GeoParquet/OSM-specific controls for reproducibility and performance.

Supported keys:

  • strict_format_options (True/False): when True, using format-specific vector options on non-matching formats raises an error.
  • geoparquet (dict): GeoParquet write controls.
    • compression: none, snappy, gzip, lz4, zstd, brotli
    • max_rows_per_group
    • data_page_size_limit
    • write_batch_size
    • data_page_row_count_limit
  • osmpbf (dict): OSM PBF read controls.
    • highways_only (True/False)
    • named_ways_only (True/False)
    • polygons_only (True/False)
    • include_tag_keys (list of strings)

Notes:

  • GeoParquet options are applied only when writing .parquet outputs.
  • OSM PBF options are applied only when reading .osm.pbf inputs.
  • When strict_format_options=False, non-applicable vector options are ignored.
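
The interaction between strict_format_options and format-specific option groups can be summarized in a few lines. The function below is an illustrative model of the rule described in the notes above, not the backend implementation: effective_vector_options is a hypothetical name, the ValueError type is an assumption, and the model ignores the read/write direction (GeoParquet options apply on write, OSM PBF options on read).

```python
# Format-specific option groups and the extension each one applies to,
# per the notes above.
GROUP_SUFFIX = {"geoparquet": ".parquet", "osmpbf": ".osm.pbf"}


def effective_vector_options(path, options):
    """Model which option groups apply to path (illustration only)."""
    strict = options.get("strict_format_options", False)
    kept = {}
    for group, suffix in GROUP_SUFFIX.items():
        if group not in options:
            continue
        if path.lower().endswith(suffix):
            kept[group] = options[group]   # applicable: passed through
        elif strict:
            raise ValueError(f"{group!r} options do not apply to {path!r}")
        # non-strict and not applicable: silently ignored
    return kept
```

Running a planned set of options through a model like this in a unit test is a cheap way to document which settings a pipeline actually relies on.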

Common vector option profiles

# 1) GeoParquet write with explicit compression/row-group controls
wbe.write_vector(
  roads,
  'roads.parquet',
  options={
    'strict_format_options': True,
    'geoparquet': {
      'compression': 'zstd',
      'max_rows_per_group': 250000,
      'write_batch_size': 8192,
    },
  },
)

# 2) OSM PBF read with filtering controls
roads_from_osm = wbe.read_vector(
  'region.osm.pbf',
  options={
    'osmpbf': {
      'highways_only': True,
      'named_ways_only': True,
      'include_tag_keys': ['name', 'highway', 'maxspeed'],
    },
  },
)

Lidar

# Read and inspect
las = wbe.read_lidar('survey.las')
meta = las.metadata()
print(f'Points: {meta.num_points}, CRS: {meta.crs_epsg}')

# Apply a tool
norms = wbe.lidar.calculate_point_normals(las)

# Write result
wbe.write_lidar(norms, 'survey_normals.las')

# Extensionless path defaults to COPC
wbe.write_lidar(norms, 'survey_normals')  # writes survey_normals.copc.laz

# Optional format-specific write controls
wbe.write_lidar(
  norms,
  'survey_normals.copc.laz',
  options={
    'copc': {
      'max_points_per_node': 75000,
      'max_depth': 8,
      'node_point_ordering': 'hilbert',
    },
  },
)

wbe.write_lidar(
  norms,
  'survey_normals.laz',
  options={
    'laz': {
      'chunk_size': 25000,
      'compression_level': 7,
    },
  },
)

Lidar output controls

WbEnvironment.write_lidar(...) accepts an optional options dictionary.

Recommended vs advanced:

  • Recommended: use default write behavior for .las/.laz/.copc.laz unless specific delivery constraints apply.
  • Advanced: tune LAZ chunk/compression and COPC octree controls for large-scene optimization and deterministic packaging.

Supported keys:

  • laz (dict): LAZ write controls.
    • chunk_size (positive integer)
    • compression_level (0-9)
  • copc (dict): COPC write controls.
    • max_points_per_node (positive integer)
    • max_depth (positive integer)
    • node_point_ordering: auto, morton, hilbert

Notes:

  • LAZ options are applied when writing .laz outputs.
  • COPC options are applied when writing .copc.laz outputs.
  • Non-applicable lidar options are ignored for other output formats.
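
The value ranges listed above are simple enough to check before a long-running write starts. A minimal sketch follows; check_lidar_options is a hypothetical helper, and raising ValueError is a choice made here rather than documented backend behavior.

```python
ORDERINGS = {"auto", "morton", "hilbert"}


def check_lidar_options(options):
    """Raise ValueError if options fall outside the documented ranges."""
    def positive_int(group, key):
        value = options.get(group, {}).get(key)
        if value is not None and not (isinstance(value, int) and value > 0):
            raise ValueError(f"{group}.{key} must be a positive integer")

    positive_int("laz", "chunk_size")
    positive_int("copc", "max_points_per_node")
    positive_int("copc", "max_depth")

    level = options.get("laz", {}).get("compression_level")
    if level is not None and level not in range(10):
        raise ValueError("laz.compression_level must be in 0-9")

    ordering = options.get("copc", {}).get("node_point_ordering")
    if ordering is not None and ordering not in ORDERINGS:
        raise ValueError(f"copc.node_point_ordering must be one of {sorted(ORDERINGS)}")
    return options
```

The function returns its input unchanged on success, so it can wrap the dictionary inline: options=check_lidar_options({...}).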

Common lidar option profiles

Chunked lidar streaming

For large point clouds, use chunked column workflows to avoid materializing the entire point matrix at once.

Recommended flow:

  • Read chunks with to_numpy_chunks(...).
  • Apply vectorized edits per chunk.
  • Write with from_numpy_chunks(...).

lidar = wbe.read_lidar('survey.las')
cols = ['x', 'y', 'z', 'classification']

chunks = lidar.to_numpy_chunks(chunk_size=200_000, cols=cols)
for chunk in chunks:
  high = chunk[:, 2] > 250.0
  chunk[high, 3] = 6

edited = wb.Lidar.from_numpy_chunks(
  chunks,
  base=lidar,
  cols=cols,
  output_path='survey_chunked_reclassified.laz',
)

Notes:

  • The chunked write path uses shared core streaming rewrite for LAS/LAZ outputs.
  • Callback-driven chunk decode is supported via to_numpy_chunks(..., callback=...) when you prefer processing without collecting a chunk list.

# LAZ write with tuned compression/chunking
wbe.write_lidar(
  norms,
  'survey_normals.laz',
  options={
    'laz': {
      'chunk_size': 25000,
      'compression_level': 7,
    },
  },
)

# COPC write with octree controls
wbe.write_lidar(
  norms,
  'survey_normals.copc.laz',
  options={
    'copc': {
      'max_points_per_node': 75000,
      'max_depth': 8,
      'node_point_ordering': 'hilbert',
    },
  },
)


Extensionless defaults (all data objects)

When output_path has no extension:

  • write_raster(...) writes COG-style GeoTIFF to *.tif
  • write_vector(...) writes GeoPackage to *.gpkg
  • write_lidar(...) writes COPC to *.copc.laz

write_lidar(...) also accepts optional per-format write controls through options={...}.

Examples:

wbe.write_raster(raster, 'my_file')  # my_file.tif (COG-style default)
wbe.write_vector(vector, 'my_file')  # my_file.gpkg
wbe.write_lidar(lidar, 'my_file')    # my_file.copc.laz
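
When a script needs to know the final file name in advance (for logging or cleanup, for instance), the defaults above can be mirrored in a small lookup. This is a sketch: resolved_output_path is a hypothetical helper, and treating any dotted suffix (such as run.v2) as an explicit extension is an assumption about how the writers detect extensions.

```python
from pathlib import Path

# Default suffix per writer when the output path has no extension,
# as listed above.
DEFAULT_SUFFIX = {
    "raster": ".tif",
    "vector": ".gpkg",
    "lidar": ".copc.laz",
}


def resolved_output_path(path, kind):
    """Predict the file name a write_* call is documented to produce."""
    p = Path(path)
    if p.suffix:  # an explicit extension is left untouched
        return str(p)
    return str(p) + DEFAULT_SUFFIX[kind]
```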

Progress and feedback

Long-running tools can report progress via a callback function:

filled = wbe.hydrology.fill_depressions(
  input_dem=dem.file_path,
  callback=wb.callbacks.print_progress,
)

You can also use the root-level alias:

filled = wbe.hydrology.fill_depressions(
  input_dem=dem.file_path,
  callback=wb.print_progress,
)

For custom verbosity, use the callback factory:

progress_cb = wb.callbacks.make_progress_printer(
  min_increment=5,
  show_messages=True,
)

filled = wbe.hydrology.fill_depressions(
  input_dem=dem.file_path,
  callback=progress_cb,
)

The standard callback also parses percentages embedded in message text when an explicit numeric progress field is missing (for example: "Progress (loop 1 of 2): 50%").

You can also wrap progress in a more structured way (e.g., with a progress bar). If you implement your own callback, handle all event payload shapes (JSON string, dictionary, or object attributes):

import json
import re
from tqdm import tqdm

PERCENT_IN_MESSAGE = re.compile(r"(-?\d+(?:\.\d+)?)\s*%")

def normalize_event(event):
  parsed = event
  if isinstance(event, str):
    try:
      parsed = json.loads(event)
    except json.JSONDecodeError:
      return None, None, event

  if isinstance(parsed, dict):
    return parsed.get("type"), parsed.get("percent"), parsed.get("message")

  return (
    getattr(parsed, "type", None),
    getattr(parsed, "percent", None),
    getattr(parsed, "message", None),
  )

def infer_percent_from_message(message):
  if message is None:
    return None
  m = PERCENT_IN_MESSAGE.search(str(message))
  return float(m.group(1)) if m else None

class ProgressTracker:
  def __init__(self):
    self.pbar = tqdm(total=100, desc="Running")
    self.last = 0

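  def __call__(self, event):
    event_type, raw_percent, message = normalize_event(event)
    # Fall back to a percentage embedded in the message text.
    if raw_percent is None and message:
      raw_percent = infer_percent_from_message(message)

    if event_type == "progress" and raw_percent is not None:
      percent = float(raw_percent)
      # Heuristic: values <= 1.0 are treated as fractions (0.5 -> 50%),
      # so a literal 1% is read as 100%.
      if percent <= 1.0:
        percent *= 100.0
      pct = max(0, min(100, int(percent)))
      # tqdm.update() takes an increment, so advance by the delta only.
      self.pbar.update(max(0, pct - self.last))
      self.last = max(self.last, pct)

    if message:
      self.pbar.set_postfix_str(str(message), refresh=False)

    # Close the bar once the tool reports completion.
    if self.last >= 100:
      self.pbar.close()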

tracker = ProgressTracker()
result = wbe.hydrology.fill_depressions(input_dem=dem.file_path, callback=tracker)
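
Where tqdm is not available (for example in minimal containers), a throttled text reporter needs only the standard library. The sketch below assumes the same event keys as the example above ("type", "percent") and a 0-100 percent scale. It handles dictionaries and JSON strings only, not attribute-style event objects, and make_throttled_printer is a hypothetical name rather than part of wb.callbacks.

```python
import json


def make_throttled_printer(min_increment=5, sink=print):
    """Return a callback that reports progress every min_increment percent."""
    last = -min_increment  # ensures the first event (often 0%) is reported

    def callback(event):
        nonlocal last
        if isinstance(event, str):
            try:
                event = json.loads(event)
            except json.JSONDecodeError:
                return  # plain-text messages are skipped in this sketch
        if not isinstance(event, dict) or event.get("type") != "progress":
            return
        percent = event.get("percent")
        if percent is None:
            return
        pct = max(0, min(100, int(float(percent))))
        # Report on a large enough step, and always report the first 100%.
        if pct - last >= min_increment or (pct == 100 and last < 100):
            last = pct
            sink(f"{pct}%")

    return callback
```

The sink parameter defaults to print; passing a logger method or a list's append keeps the callback easy to test.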

Memory-first execution model

Default behavior is memory-first for intermediates:

  • If an output path is omitted, operations return memory-backed objects.
  • Persist only when needed with write_raster, write_vector, or write_lidar.
  • This reduces unnecessary disk I/O in chained workflows.

Example:

tmp = wbe.raster.sqrt(dem)
out = wbe.raster.log10(tmp)
wbe.write_raster(out, 'sqrt_log10.tif')

ArcPy-style I/O vs wbw_python memory-first chaining

ArcPy workflows often write each intermediate to disk as input -> tool -> output, which can increase I/O overhead in multi-step pipelines.

In wbw_python, intermediate objects are memory-backed unless an explicit output path is provided. This means you can chain operations in memory and persist only final artifacts, reducing unnecessary disk writes.
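
The memory-first pattern is easiest to keep honest when the pipeline is a function that takes the environment as an argument and writes at most once. The sketch below uses only the calls shown in the example above (wbe.raster.sqrt, wbe.raster.log10, wbe.write_raster). sqrt_log10 and make_stub_env are hypothetical names, and the stub is a stand-in for WbEnvironment that records calls so the flow can be dry-run without data.

```python
from types import SimpleNamespace


def sqrt_log10(wbe, dem, output_path=None):
    """Chain two raster tools in memory; write only if output_path is given."""
    tmp = wbe.raster.sqrt(dem)      # memory-backed intermediate
    out = wbe.raster.log10(tmp)     # memory-backed result
    if output_path is not None:
        wbe.write_raster(out, output_path)  # the only disk write
    return out


def make_stub_env(log):
    """Stand-in for WbEnvironment that records calls (dry runs and tests only)."""
    raster = SimpleNamespace(
        sqrt=lambda r: log.append("sqrt") or f"sqrt({r})",
        log10=lambda r: log.append("log10") or f"log10({r})",
    )
    return SimpleNamespace(
        raster=raster,
        write_raster=lambda r, path: log.append(f"write:{path}"),
    )
```

With a real environment the call is simply sqrt_log10(wbe, dem, 'sqrt_log10.tif'); omitting the path returns the in-memory result for further chaining.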

Reprojection patterns

dst_epsg = 32618

dem_utm = wbe.reproject_raster(
  dem,
  dst_epsg=dst_epsg,
  resample='bilinear',
)
roads = wbe.read_vector('roads.shp')
roads_utm = wbe.reproject_vector(roads, dst_epsg=dst_epsg)

wbe.write_raster(dem_utm, 'dem_utm.tif')
wbe.write_vector(roads_utm, 'roads_utm.shp')

Raster reprojection controls

wbe.reproject_raster(...) accepts several optional controls beyond dst_epsg. Most commonly used:

  • resample: nearest, bilinear, cubic, average
  • cols, rows: force output grid shape
  • x_res, y_res: force output cell size
  • extent: (xmin, ymin, xmax, ymax) output extent
  • nodata_policy, antimeridian_policy, grid_size_policy, destination_footprint

Example:

dem_utm = wbe.reproject_raster(
  dem,
  dst_epsg=32618,
  resample='cubic',
  x_res=10.0,
  y_res=10.0,
)

Reprojection best practices

  • Preserve precision: Use high-precision resampling ('bilinear' or 'cubic') for continuous data; use 'nearest' for categorical data.
  • Verify CRS: Always inspect metadata CRS values before and after reprojection to confirm the transform.
  • CRS mismatch: If input CRS is unknown or incorrect, call set_crs_epsg() before reprojection.
  • Memory-first chaining: Reprojection returns memory-backed objects; persist with write_raster or write_vector.
  • Coordinate order: Some EPSG definitions (for example EPSG:4326) specify lat/lon axis order, but Whitebox always uses lon/lat (x, y) internally. Transforms are applied automatically.
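
Several of the practices above (choosing a resampler by data kind, verifying CRS before and after) can be folded into one wrapper. This is a sketch under stated assumptions: reproject_checked is a hypothetical helper, it assumes metadata().epsg_code is None or 0 when the CRS is unknown, and it assumes the reprojected raster's metadata reports the destination EPSG code.

```python
def reproject_checked(wbe, raster, dst_epsg, categorical=False):
    """Reproject a raster and confirm the result reports the target EPSG."""
    src_epsg = raster.metadata().epsg_code
    if not src_epsg:  # assumption: None/0 means the CRS is unknown
        raise ValueError("source CRS is unknown; assign it before reprojecting")
    # Nearest-neighbour keeps class codes intact; bilinear suits continuous data.
    resample = "nearest" if categorical else "bilinear"
    out = wbe.reproject_raster(raster, dst_epsg=dst_epsg, resample=resample)
    got = out.metadata().epsg_code
    if got != dst_epsg:
        raise RuntimeError(f"expected EPSG:{dst_epsg}, got {got}")
    return out
```

Failing loudly on an unknown source CRS is deliberate: a silent reprojection from the wrong CRS produces plausible-looking but misplaced output.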

Projection utilities

WbEnvironment now exposes lightweight CRS helpers for common projection tasks:

  • wbe.projection.to_ogc_wkt(epsg)
  • wbe.projection.identify_epsg(crs_text)
  • wbe.projection.reproject_points(points, src_epsg, dst_epsg)
  • wbe.projection.reproject_point(x, y, src_epsg, dst_epsg)

# EPSG -> WKT
wkt_3857 = wbe.projection.to_ogc_wkt(3857)

# WKT/CRS text -> EPSG (or None)
epsg = wbe.projection.identify_epsg(wkt_3857)

# Reproject XY points (list[dict])
pts_wgs84 = [
  {'x': -79.3832, 'y': 43.6532},
  {'x': -73.5673, 'y': 45.5017},
]
pts_utm18 = wbe.projection.reproject_points(pts_wgs84, src_epsg=4326, dst_epsg=32618)

# Single-point convenience helper
pt_utm18 = wbe.projection.reproject_point(-79.3832, 43.6532, src_epsg=4326, dst_epsg=32618)

Topology utilities

WbEnvironment also exposes narrow WKT-focused topology helpers:

  • wbe.topology.intersects_wkt(a_wkt, b_wkt)
  • wbe.topology.contains_wkt(a_wkt, b_wkt)
  • wbe.topology.within_wkt(a_wkt, b_wkt)
  • wbe.topology.touches_wkt(a_wkt, b_wkt)
  • wbe.topology.disjoint_wkt(a_wkt, b_wkt)
  • wbe.topology.crosses_wkt(a_wkt, b_wkt)
  • wbe.topology.overlaps_wkt(a_wkt, b_wkt)
  • wbe.topology.covers_wkt(a_wkt, b_wkt)
  • wbe.topology.covered_by_wkt(a_wkt, b_wkt)
  • wbe.topology.relate_wkt(a_wkt, b_wkt)
  • wbe.topology.distance_wkt(a_wkt, b_wkt)
  • wbe.topology.vector_feature_relation(a_vector, a_feature_index, b_vector, b_feature_index)
  • wbe.topology.is_valid_polygon_wkt(wkt)
  • wbe.topology.make_valid_polygon_wkt(wkt, epsilon=1e-9)
  • wbe.topology.buffer_wkt(wkt, distance)

Tool-category access remains available via wbe.topology_tools (or wbe.category('topology')).

a = 'POLYGON((0 0,10 0,10 10,0 10,0 0))'
b = 'POINT(5 5)'

print(wbe.topology.contains_wkt(a, b))
print(wbe.topology.intersects_wkt(a, b))
print(wbe.topology.relate_wkt(a, b))
print(wbe.topology.distance_wkt(a, b))

invalid = 'POLYGON((0 0,4 4,4 0,0 4,0 0))'
fixed = wbe.topology.make_valid_polygon_wkt(invalid)
buf = wbe.topology.buffer_wkt('LINESTRING(0 0, 10 0)', 1.5)

# Compare specific features directly from vector objects
roads_rel = wbe.topology.vector_feature_relation(roads, 0, roads, 1)
print(roads_rel['intersects'], roads_rel['distance'], roads_rel['relate'])

NumPy interoperability

arr = dem.to_numpy(dtype='float64')
arr = arr + 1.0
new_dem = wb.Raster.from_numpy(arr, dem, output_path='dem_plus1.tif')

Multiband workflows support both (bands, rows, cols) and (rows, cols, bands) 3D arrays.

Rasterio interoperability

Rasterio interoperability is best approached through GeoTIFF exchange when you want to reuse Rasterio profiles, windows, masking, or block-aware I/O.

import rasterio

# Export a wbw raster to a Rasterio-friendly format
wbe.write_raster(dem, 'dem_for_rasterio.tif')

with rasterio.open('dem_for_rasterio.tif') as src:
  arr = src.read(1)
  profile = src.profile

# Example Rasterio-side processing
arr = arr * 1.05

profile.update(dtype='float32', count=1)
with rasterio.open('dem_rasterio_processed.tif', 'w', **profile) as dst:
  dst.write(arr.astype('float32'), 1)

# Bring result back into wbw_python
dem_processed = wbe.read_raster('dem_rasterio_processed.tif')

GeoPandas interoperability

For vector workflows, write to a GeoPandas-friendly dataset (e.g., GeoPackage), process with GeoPandas/Shapely, then read back into wbw_python.

import geopandas as gpd

wbe.write_vector(roads, 'roads_for_gpd.gpkg')
gdf = gpd.read_file('roads_for_gpd.gpkg')

# Example GeoPandas processing
gdf['length_m'] = gdf.length
gdf = gdf[gdf['length_m'] > 25.0]

gdf.to_file('roads_gpd_filtered.gpkg', driver='GPKG')
roads_filtered = wbe.read_vector('roads_gpd_filtered.gpkg')

Shapely interoperability

Shapely integrates naturally with GeoPandas geometry columns; this is a convenient path for advanced geometry operations before returning data to wbw_python.

import geopandas as gpd
from shapely import simplify

wbe.write_vector(streams, 'streams_for_shapely.gpkg')
gdf = gpd.read_file('streams_for_shapely.gpkg')

# Example Shapely operation
gdf['geometry'] = gdf.geometry.apply(lambda geom: simplify(geom, tolerance=2.0))

gdf.to_file('streams_simplified.gpkg', driver='GPKG')
streams_simplified = wbe.read_vector('streams_simplified.gpkg')

xarray/rioxarray interoperability

Use rioxarray for labeled raster workflows (coordinates, lazy loading, xarray ops), then write results back to GeoTIFF for wbw_python ingestion.

import rioxarray as rxr

wbe.write_raster(dem, 'dem_for_xarray.tif')
da = rxr.open_rasterio('dem_for_xarray.tif').squeeze(drop=True)

# Example xarray computation
da_smooth = da.rolling(x=3, y=3, center=True).mean()

da_smooth.rio.to_raster('dem_xarray_smoothed.tif')
dem_smoothed = wbe.read_raster('dem_xarray_smoothed.tif')

pyproj interoperability

Use pyproj when you need explicit CRS inspection, custom transformation pipelines, or CRS comparisons alongside wbw_python metadata.

from pyproj import CRS

src_epsg = dem.metadata().epsg_code
src_crs = CRS.from_epsg(src_epsg)
dst_crs = CRS.from_epsg(32618)

print('Source:', src_crs.to_string())
print('Destination:', dst_crs.to_string())

dem_utm = wbe.reproject_raster(dem, dst_epsg=dst_crs.to_epsg(), resample='bilinear')

Interoperability strategy

  • In-memory numeric exchange: use to_numpy(...) / from_numpy(...).
  • Rich raster ecosystem tools: exchange via GeoTIFF (write_raster / read_raster).
  • Rich vector ecosystem tools: exchange via GeoPackage/Shapefile (write_vector / read_vector).
  • Keep wbw_python as the geoprocessing engine and use ecosystem libraries where they are strongest.

Interoperability behavior matrix

This table summarizes current Phase 1 behavior for common ecosystem bridges.

| Bridge | Primary path | Spatial metadata handling | Attribute/CRS handling | Copy vs view behavior |
|---|---|---|---|---|
| NumPy | Raster.to_numpy() / Raster.from_numpy() | Array carries values only; georeferencing comes from base raster in from_numpy | N/A (raster array exchange) | Materialized array copy on export/import boundary |
| Rasterio | write_raster(...) -> rasterio.open(...) -> read_raster(...) | GeoTIFF metadata persists via file profile/transform/CRS | N/A (raster exchange) | File-based copy boundary |
| GeoPandas | write_vector(...) -> gpd.read_file(...) -> read_vector(...) | Geometry + CRS preserved by container format (recommended: GPKG) | Tabular attributes round-trip through file driver support | File-based copy boundary |
| Shapely | Through GeoPandas geometry workflows | Geometry handled by GeoPandas/Shapely object model | Attributes managed by GeoPandas dataframe columns | In-memory object copies under GeoPandas/Shapely semantics |
| xarray/rioxarray | write_raster(...) -> rxr.open_rasterio(...) -> .rio.to_raster(...) -> read_raster(...) | CRS/transform preserved through rioxarray raster metadata | N/A (raster exchange) | File-based copy boundary; xarray ops may create derived arrays |
| pyproj | metadata().epsg_code with pyproj.CRS/transform tools | CRS interpretation and transform pipelines handled by pyproj | N/A (CRS utility interoperability) | No raster/vector payload transfer unless combined with file exchange |

Interoperability copy-vs-view notes

  • to_numpy() returns a materialized NumPy array intended for safe downstream mutation.
  • from_numpy() writes array values into a new raster using the provided base raster for geospatial context.
  • File-based ecosystem bridges (Rasterio, GeoPandas, rioxarray) are explicit copy boundaries by design.
  • When lossless round-trip behavior matters, prefer stable containers (.tif for raster, .gpkg for vector) and verify metadata after re-ingest with metadata().
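
Verifying metadata after re-ingest, as suggested in the last note, amounts to comparing a handful of fields. A minimal sketch follows, using the raster metadata attributes that appear elsewhere in this document (rows, columns, nodata, epsg_code); metadata_differences is a hypothetical helper and the field list is only a starting point.

```python
def _same(a, b):
    # NaN never equals itself, so treat two NaN nodata values as equal.
    return a == b or (a != a and b != b)


def metadata_differences(before, after, fields=("rows", "columns", "nodata", "epsg_code")):
    """Return {field: (before, after)} for metadata fields that changed."""
    diffs = {}
    for name in fields:
        a, b = getattr(before, name, None), getattr(after, name, None)
        if not _same(a, b):
            diffs[name] = (a, b)
    return diffs
```

A typical use is metadata_differences(dem.metadata(), dem_processed.metadata()), treating a non-empty result as a failed round trip.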

Engineering detail note:

  • A deeper internal matrix with follow-up test/documentation targets is tracked in docs/internal/wbw_py_interop_behavior_matrix.md.

Supported file formats

The Python API format support comes from backend crates:

Raster (via wbraster)

Read/write support includes:

  • GeoTIFF / BigTIFF / COG (.tif, .tiff)
  • JPEG2000 / GeoJP2 (.jp2)
  • GeoPackage raster (.gpkg)
  • ENVI (.hdr with sidecar data files)
  • ER Mapper (.ers)
  • Esri ASCII (.asc, .grd)
  • Esri Binary Grid (.adf workspace)
  • GRASS ASCII (.asc, .txt)
  • Idrisi (.rdc, .rst)
  • PCRaster (.map)
  • SAGA (.sgrd, .sdat)
  • Surfer GRD (.grd)
  • Zarr (.zarr)

Satellite sensor bundles (read-only)

wbraster also supports read-only satellite sensor bundle ingestion. These are package-level readers (bundle metadata + band/measurement/asset resolution), not generic raster write targets.

Supported bundle families:

  • Sentinel-2 SAFE
  • Sentinel-1 SAFE
  • Landsat Collection bundles
  • ICEYE bundles
  • PlanetScope bundles
  • SPOT/Pleiades DIMAP bundles
  • Maxar/WorldView bundles
  • RADARSAT-2 bundles
  • RCM bundles

Vector (via wbvector)

Read/write support includes:

  • Shapefile (.shp + sidecars)
  • GeoPackage (.gpkg)
  • GeoJSON (.geojson)
  • FlatGeobuf (.fgb)
  • GML (.gml)
  • GPX (.gpx)
  • KML (.kml)
  • MapInfo Interchange (.mif + .mid)

Additional feature-gated formats in wbvector:

  • GeoParquet (.parquet)
  • KMZ (.kmz)
  • OSM PBF (.osm.pbf, read-only)

LiDAR (via wblidar)

Read/write support includes:

  • LAS
  • LAZ
  • COPC
  • PLY
  • E57

Licensing overview

The runtime supports open and licensed modes.

  • Open mode: instantiate WbEnvironment() directly.
  • Signed entitlement mode: bootstrap from signed JSON or file.
  • Floating license mode: online provider-verified activation using from_floating_license_id(...).

See:

Licensing and Pro workflows

This section focuses on day-to-day patterns for integrating licensing into production scripts, notebooks, services, and plugin-style applications.

1) Choose a startup mode

  • Open mode: best for open-tier workflows and development where Pro tools are not required.
  • Signed entitlement mode: best when users can provide a signed offline entitlement.
  • Floating license mode: best when you want online lease activation against the provider service.

2) Keep initialization centralized

Use a single startup function that creates and validates the environment once, then pass that WbEnvironment instance through your pipeline. Choose the factory that matches your deployment:

import whitebox_workflows as wb

# ---- Open mode (no license required) ----
wbe = wb.WbEnvironment()                    # include_pro=False, tier='open'

# ---- Floating license (online lease) ----
wbe = wb.WbEnvironment.from_floating_license_id(
  floating_license_id='fl_12345',
  include_pro=True,
  fallback_tier='open',
  provider_url='https://license.example.com',
  machine_id='machine-01',
  customer_id='customer-abc',
)
# Tip: provider_url can also be supplied by environment variable WBW_LICENSE_PROVIDER_URL.

# ---- Signed entitlement (offline, from file) ----
wbe = wb.WbEnvironment.from_signed_entitlement_file(
    entitlement_file='./signed_entitlement.json',
    public_key_kid='k1',
    public_key_b64url='REPLACE_WITH_PROVIDER_PUBLIC_KEY',
    include_pro=True,
    fallback_tier='open',
)

# ---- Signed entitlement (offline, from string) ----
wbe = wb.WbEnvironment.from_signed_entitlement_json(
    signed_entitlement_json=my_entitlement_json_string,
    public_key_kid='k1',
    public_key_b64url='REPLACE_WITH_PROVIDER_PUBLIC_KEY',
    include_pro=True,
    fallback_tier='open',
)

Recent Checkpoints

  • 2026-04-10: Restored package loading by aligning the compiled module path with the package layout (whitebox_workflows.whitebox_workflows).
  • 2026-04-10: Restored memory-first object workflows for dynamic category calls, including object inputs, working-directory-relative outputs, memory-backed unary raster chaining, and typed single-output returns.
  • 2026-04-10: Added typed multi-output coercion for dynamic category calls so multi-output raster/vector/lidar tools return data objects instead of raw output-path dictionaries.

3) Check Pro tool visibility at startup

Verify that expected Pro tools are actually available before entering a Pro workflow branch. Use this as a guard when include_pro=True but the entitlement may have fallen back to open tier.

pro_tools = {'raster_power', 'sar_coregistration'}   # tools that require Pro
available = set(wbe.list_tools())
missing = pro_tools - available
if missing:
    raise RuntimeError(
        'Required Pro tools are not available '
        f'(the entitlement may have fallen back to open tier): {sorted(missing)}'
    )

4) Gate Pro workflows explicitly

Treat Pro execution as a deliberate branch in your application logic. Return a result from both sides so callers do not need to know which path was taken.

def run_backscatter_correction(wbe, image, use_pro: bool):
    if use_pro:
        # Pro branch: full refined radiometric correction.
        coregistered = wbe.sar_tools.sar_coregistration(image)
        return wbe.sar_tools.refined_lee_filter(coregistered)
    # Open fallback: basic spatial filter only.
    return wbe.image_tools.lee_filter(image)

Keeping the fallback explicit makes it easy to audit which capabilities require a license and to test open-mode coverage in CI without a Pro entitlement.
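
As a rough illustration of testing the open branch in CI without an entitlement, the sketch below swaps the environment for a stub that only records which tools were called. The stub, the make_fake_env helper, and the test function name are hypothetical; run_backscatter_correction is repeated from above so the example is self-contained.

```python
from types import SimpleNamespace


def run_backscatter_correction(wbe, image, use_pro: bool):
    if use_pro:
        coregistered = wbe.sar_tools.sar_coregistration(image)
        return wbe.sar_tools.refined_lee_filter(coregistered)
    return wbe.image_tools.lee_filter(image)


def make_fake_env(calls):
    """Build a stand-in for WbEnvironment whose tools only record their names."""
    def record(name):
        def tool(image):
            calls.append(name)
            return f'{name}({image})'
        return tool

    return SimpleNamespace(
        sar_tools=SimpleNamespace(
            sar_coregistration=record('sar_coregistration'),
            refined_lee_filter=record('refined_lee_filter'),
        ),
        image_tools=SimpleNamespace(lee_filter=record('lee_filter')),
    )


def test_open_fallback_uses_only_open_tools():
    calls = []
    result = run_backscatter_correction(make_fake_env(calls), 'img', use_pro=False)
    # The open branch must not touch any SAR (Pro) tool.
    assert calls == ['lee_filter']
    assert result == 'lee_filter(img)'
```

A stub like this checks only the branching logic, not the tools themselves; Pro-path correctness still needs a run in an approved, licensed environment.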

5) Operational recommendations

  • Keep secrets and signed entitlement payloads out of source control.
  • Prefer configuration-driven startup (env vars or config file) over hard-coded license values.
  • In CI, run open-mode smoke tests by default and isolate Pro tests to approved environments.
  • For long-running jobs with floating licenses, include retry and renewal-aware error handling.
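
The retry recommendation can be sketched as a generic wrapper with exponential backoff around whatever callable performs activation. This is a minimal sketch under stated assumptions: activate_with_retry is a hypothetical helper, not a package API, and it catches Exception broadly because the exception types raised on a failed lease are not documented here; narrow the except clause once you know them.

```python
import time
from typing import Callable, TypeVar

T = TypeVar('T')


def activate_with_retry(
    activate: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call activate() up to `attempts` times, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_error = None
    for attempt in range(attempts):
        try:
            return activate()
        except Exception as exc:  # assumption: failure types are unknown
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f'activation failed after {attempts} attempts') from last_error
```

In practice the callable would be a lambda that wraps the from_floating_license_id(...) call from step 2, so the retry policy stays separate from the license parameters.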

Discovery APIs

tools = wbe.list_tools()
categories = wbe.categories()
rs_tools = wbe.remote_sensing.list_tools()
info = wbe.describe_tool('slope')
matches = wbe.search_tools('flow accumulation')

Subcategory Browsing (Autocomplete-Friendly)

Large categories expose optional subcategory groupings for easier discovery in editors:

# Category -> subcategory -> tool
out1 = wbe.remote_sensing.filters.canny_edge_detection(input='image.tif')
out2 = wbe.raster.overlay_math.add(input1='a.tif', input2='b.tif')
out3 = wbe.terrain.derivatives.slope(dem='dem.tif', units='degrees')

# Introspection helpers
print(wbe.remote_sensing.list_subcategories())
print(wbe.terrain.derivatives.list_tools())

Compatibility note: direct category tool access still works (for example, wbe.terrain.slope(...)).

The other category remains available as wbe.other, but wbe.categories() omits it when it contains no currently visible tools.

IntelliSense in VS Code

If completions are stale, ensure VS Code uses the same interpreter where wbw_python is installed.

  1. Run Python: Select Interpreter.
  2. Select your .venv-wbw interpreter.
  3. Reload window.
  4. Restart language server.
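
To confirm which interpreter is actually in use, a short check run from the VS Code integrated terminal or a notebook cell can help. The sketch below uses only the standard library; interpreter_report is a hypothetical helper name, and it assumes the import name whitebox_workflows used elsewhere in this document.

```python
import importlib.util
import sys


def interpreter_report(package: str = 'whitebox_workflows') -> dict:
    """Report the running interpreter and whether `package` is importable from it."""
    spec = importlib.util.find_spec(package)
    return {
        'executable': sys.executable,
        'package_found': spec is not None,
        'package_location': spec.origin if spec is not None else None,
    }
```

If package_found is False, or executable points outside the expected virtual environment, the editor is most likely bound to a different interpreter than the one where the package was installed.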

Optional workspace pin in .vscode/settings.json:

{
  "python.defaultInterpreterPath": "/Users/<you>/Documents/programming/Rust/whitebox_next_gen/.venv-wbw/bin/python"
}

If needed, remove legacy global path overrides such as:

  • python.analysis.extraPaths
  • python.autoComplete.extraPaths
