Tiles router for xpublish
xpublish-tiles
Web mapping plugins for Xpublish
Project Overview
This project contains a set of web mapping plugins for Xpublish - a framework for serving xarray datasets via HTTP APIs.
The goal of this project is to transform xarray datasets to raster, vector and other types of tiles, which can then be served via HTTP APIs. To do this, the package implements a set of xpublish plugins:
- xpublish_tiles.xpublish.tiles.TilesPlugin: An OGC Tiles conformant plugin for serving raster, vector and other types of tiles.
- xpublish_tiles.xpublish.wms.WMSPlugin: An OGC Web Map Service conformant plugin for serving raster, vector and other types of tiles.
Development
Sync the environment with uv
uv sync
Run the type checker
uv run ty check
Run the tests
uv run pytest tests
Run setup tests (create local datasets, these can be deployed using the CLI)
uv run pytest --setup
CLI Usage
The package includes a command-line interface for quickly serving datasets with tiles and WMS endpoints:
uv run xpublish-tiles [OPTIONS]
Options
- --port PORT: Port to serve on (default: 8080)
- --dataset DATASET: Dataset to serve (default: global)
  - global: Generated global dataset with synthetic data
  - air: Tutorial air temperature dataset from xarray tutorial
  - hrrr: High-Resolution Rapid Refresh dataset
  - para: Parameterized dataset
  - eu3035: European dataset in ETRS89 / LAEA Europe projection
  - eu3035_hires: High-resolution European dataset
  - ifs: Integrated Forecasting System dataset
  - curvilinear: Curvilinear coordinate dataset
  - sentinel: Sentinel-2 dataset (without coordinates)
  - global-6km: Global dataset at 6km resolution
  - xarray://<tutorial_name>: Load any xarray tutorial dataset (e.g., xarray://rasm)
  - local://<dataset_name>: Load dataset from local icechunk repository at /tmp/tiles-icechunk/ (datasets created with uv run pytest --setup)
  - local:///path/to/repo::<dataset_name>: Load dataset from custom icechunk repository path
  - For Arraylake datasets: specify the dataset name in {arraylake_org}/{arraylake_dataset} format (requires Arraylake credentials)
- --branch BRANCH: Branch to use for Arraylake or icechunk datasets (default: main)
- --group GROUP: Group to use for Arraylake datasets (default: '')
- --cache: Enable icechunk cache for Arraylake and local icechunk datasets (default: enabled)
- --spy: Run benchmark requests with the specified dataset for performance testing
- --bench-suite: Run benchmarks for all local datasets and tabulate results (requires uv run pytest --setup to create local datasets first)
- --concurrency INT: Number of concurrent requests for benchmarking (default: 12)
- --where CHOICE: Where to run benchmark requests (choices: local, local-booth, prod; default: local)
  - local: Start server on localhost and run benchmarks against it
  - local-booth: Run benchmarks against existing localhost server (no server startup)
  - prod: Run benchmarks against production server
- --log-level LEVEL: Set the logging level for xpublish_tiles (choices: debug, info, warning, error; default: warning)
Tip: To use local datasets (e.g., local://ifs, local://para_hires), first create them with uv run pytest --setup. This creates icechunk repositories at /tmp/tiles-icechunk/.
Examples
# Serve synthetic global dataset on default port 8080
xpublish-tiles
# Serve air temperature tutorial dataset on port 9000
xpublish-tiles --port 9000 --dataset air
# Serve built-in test datasets
xpublish-tiles --dataset hrrr
xpublish-tiles --dataset para
xpublish-tiles --dataset eu3035_hires
# Load xarray tutorial datasets
xpublish-tiles --dataset xarray://rasm
xpublish-tiles --dataset xarray://ersstv5
# Serve locally stored datasets (first create them with `uv run pytest --setup`)
xpublish-tiles --dataset local://ifs
xpublish-tiles --dataset local://para_hires
# Serve local icechunk data from custom path
xpublish-tiles --dataset local:///path/to/my/repo::my_dataset
# Serve Arraylake dataset with specific branch and group
xpublish-tiles --dataset earthmover-public/aifs-outputs --branch main --group 2025-04-01/12z
# Run benchmark with a specific dataset
xpublish-tiles --dataset local://para_hires --spy
# Run benchmark with custom concurrency and against production
xpublish-tiles --dataset para --spy --concurrency 20 --where prod
# Run benchmark suite for all local datasets (creates tabulated results)
xpublish-tiles --bench-suite
# Enable debug logging
xpublish-tiles --dataset hrrr --log-level debug
Benchmarking
The CLI includes a benchmarking feature that can be used to test tile server performance:
# Run benchmark with a specific dataset (starts server automatically)
xpublish-tiles --dataset local://para_hires --spy
# Run benchmark against existing localhost server
xpublish-tiles --dataset para --spy --where local-booth
# Run benchmark against production server with custom concurrency
xpublish-tiles --dataset para --spy --where prod --concurrency 8
# Run benchmark suite for all local datasets
xpublish-tiles --bench-suite
Benchmark Suite
The --bench-suite option runs performance tests on all available local datasets and creates a tabulated summary of results. This is useful for comparing performance across different dataset types and configurations.
Prerequisites: You must first create the local test datasets:
uv run pytest --setup
The benchmark suite will test the following local datasets:
- ifs: Integrated Forecasting System dataset
- hrrr: High-Resolution Rapid Refresh dataset
- para_hires: High-resolution parameterized dataset
- eu3035_hires: High-resolution European dataset
- utm50s_hires: High-resolution UTM Zone 50S dataset
- sentinel: Sentinel-2 dataset
- global-6km: Global dataset at 6km resolution
The output includes a performance table showing tiles processed, success/failure rates, wall time, average request time, and requests per second for each dataset.
Individual Benchmarking
The --spy flag enables benchmarking mode. The benchmarking behavior depends on the --where option:
- --where local (default): Starts the tile server and automatically runs benchmark requests against it
- --where local-booth: Runs benchmarks against an existing localhost server (doesn't start a new server)
- --where prod: Runs benchmarks against a production server
The benchmarking process:
- Warms up the server with initial tile requests
- Makes concurrent tile requests (configurable with --concurrency, default: 12) to test performance
- Uses dataset-specific benchmark tiles or falls back to global tiles
- Automatically exits after completing the benchmark run
- Uses appropriate colorscale ranges based on dataset attributes
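The warm-up and concurrent-request steps can be illustrated with a small standalone sketch. This is not the CLI's implementation: `run_benchmark`, `http_fetch`, and the returned field names are hypothetical, chosen to mirror the quantities the benchmark table reports (tiles, success/failure, wall time, average request time, requests per second).

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def http_fetch(url: str) -> bool:
    """Hypothetical fetcher: True if the tile request returns HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def run_benchmark(
    fetch: Callable[[str], bool],
    tile_urls: Sequence[str],
    concurrency: int = 12,
    warmup: int = 1,
) -> dict:
    """Warm up, then request every tile URL concurrently and summarize."""
    # Warm-up requests are not timed; they absorb one-off costs such as
    # JIT compilation on the server.
    for url in tile_urls[:warmup]:
        fetch(url)

    def timed(url: str) -> tuple[bool, float]:
        start = time.perf_counter()
        ok = fetch(url)
        return ok, time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed, tile_urls))
    wall = time.perf_counter() - wall_start

    total = len(results)
    n_ok = sum(1 for ok, _ in results if ok)
    return {
        "tiles": total,
        "success": n_ok,
        "failure": total - n_ok,
        "wall_time_s": wall,
        "avg_request_s": sum(t for _, t in results) / total if total else 0.0,
        "requests_per_s": total / wall if wall > 0 else 0.0,
    }
```

Passing the fetcher as an argument keeps the timing logic testable without a running server; against a live server, `run_benchmark(http_fetch, urls)` would issue real requests.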
Once running, the server provides:
- Tiles API at http://localhost:8080/tiles/
- WMS API at http://localhost:8080/wms/
- Interactive API documentation at http://localhost:8080/docs
An example tile URL:
http://localhost:8080/tiles/WebMercatorQuad/4/4/14?variables=2t&style=raster/viridis&colorscalerange=280,300&width=256&height=256&valid_time=2025-04-03T06:00:00
Here 4/4/14 gives the tile coordinates in {z}/{y}/{x} order.
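A tile URL of this shape can be assembled programmatically. The sketch below uses only the Python standard library; `tile_url` is a hypothetical helper, and the query parameters are the ones that appear in the example URL rather than a complete list of what the endpoint accepts.

```python
from urllib.parse import urlencode


def tile_url(base, tile_matrix_set, z, y, x, **params):
    """Build a tile URL; the path order is {z}/{y}/{x}."""
    # Keep '/', ',' and ':' unescaped so values such as 'raster/viridis',
    # '280,300', and ISO timestamps stay readable.
    query = urlencode(params, safe="/,:")
    return f"{base}/tiles/{tile_matrix_set}/{z}/{y}/{x}?{query}"


url = tile_url(
    "http://localhost:8080", "WebMercatorQuad", 4, 4, 14,
    variables="2t",
    style="raster/viridis",
    colorscalerange="280,300",
    width=256,
    height=256,
    valid_time="2025-04-03T06:00:00",
)
print(url)
```

With these arguments the helper reproduces the example URL character for character.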
Integration Examples
Deployment notes
- Make sure to limit NUMBA_NUM_THREADS; this is used for rendering categorical data with datashader.
- The first invocation of a render will block while datashader functions are JIT-compiled. Our attempts to add a precompilation step to remove this have been unsuccessful.
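One way to apply the thread limit, when the server is started from a Python entry point, is to set the variable before numba or datashader is first imported. The value 4 below is an arbitrary illustration, not a recommendation from this project.

```python
import os

# Numba reads NUMBA_NUM_THREADS when it is first imported, so this must run
# before any import of numba or datashader in the process.
# setdefault leaves an existing value from the deployment environment untouched.
os.environ.setdefault("NUMBA_NUM_THREADS", "4")
```

Setting the variable in the container or service definition achieves the same thing without touching code.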
Configuration
Settings can be configured via environment variables or config files. The async loading setting has been moved to the config system (use async_load in config files or XPUBLISH_TILES_ASYNC_LOAD environment variable).
- XPUBLISH_TILES_NUM_THREADS: int - controls the size of the threadpool
- XPUBLISH_TILES_ASYNC_LOAD: bool - whether to use Xarray's async loading
- XPUBLISH_TILES_TRANSFORM_CHUNK_SIZE: int - when transforming coordinates, do so by submitting (NxN) chunks to the threadpool.
- XPUBLISH_TILES_DETECT_APPROX_RECTILINEAR: bool - detect whether a curvilinear grid is approximately rectilinear
- XPUBLISH_TILES_RECTILINEAR_CHECK_MIN_SIZE: int - check for rectilinearity if array.shape > (N, N)
- XPUBLISH_TILES_MAX_RENDERABLE_SIZE: int - do not attempt to load or render arrays with size greater than this value
- XPUBLISH_TILES_DEFAULT_PAD: int - how much to pad a selection on either side
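As an illustration of the environment-variable route, the sketch below builds an environment for a server process. `tiles_env` is a hypothetical helper, the numeric values are arbitrary, and the `true`/`false` spelling for booleans is an assumption; check which spellings the config system accepts.

```python
import os


def tiles_env(**settings):
    """Return a copy of the environment with XPUBLISH_TILES_* overrides."""
    env = dict(os.environ)
    for key, value in settings.items():
        # Assumption: booleans are written as 'true' / 'false'.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        env[f"XPUBLISH_TILES_{key.upper()}"] = text
    return env


env = tiles_env(num_threads=8, async_load=True, transform_chunk_size=1024)
# The result can be handed to a child process, for example:
# subprocess.run(["xpublish-tiles", "--dataset", "air"], env=env)
```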
Performance Notes
For context, the rendering pipeline is:
- Receive dataset ds and QueryParams from the plugin.
- Grab GridSystem for ds and requested DataArray. The inference here is complex and is cached internally using ds.attrs['_xpublish_id'] and the requested DataArray.name. Be sure to set this attribute to a unique string.
- Based on the grid system, the data are subset to the bounding box using slices. For datasets with a geographic CRS, padding is applied to the slicers if needed to account for the meridian or anti-meridian and depending on the dataset's longitude convention (0→360 or -180→180).
- This plugin supports parsing multiple "grid mappings" for a single DataArray. If present, we pick coordinates corresponding to the output CRS. If not, we look for coordinates corresponding to epsg:4326; if there are none, we use the native coordinates.
- Coordinates are transformed to the output CRS, if needed. This is usually a very slow step. For performance,
  a. We reimplement the epsg:4326 -> epsg:3857 transformation because it is separable (x is fully determined by longitude, and y is fully determined by latitude). This allows us to preserve the regular or rectilinear nature of the grid if possible.
  b. If (a) is not possible, we broadcast the input coordinates against each other, then cut up the coordinates into chunks and process them in a threadpool using pyproj.
- Xarray's new load_async is used to load the data into memory.
- Next we check whether the grid, if curvilinear, may be approximated by a rectilinear grid.
  a. The Rectilinear mesh codepath in datashader can be 3-10X faster than the Curvilinear codepath, so this approximation is worth it.
  b. We replicate the logic in datashader that constructs an array containing the output pixel id for each input pixel; this is done for each axis.
  c. If these arrays, constructed from the curvilinear and rectilinear meshes, differ by no more than one pixel, then we approximate the grid as rectilinear. This threshold is pretty tight, and requires some experimentation to loosen further. If loosening, we will need to pad appropriately.
  d. Realistically this optimization is triggered on high resolution data at zoom levels where the grid distortion isn't very high.
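The separability of the epsg:4326 -> epsg:3857 transformation can be seen in a few lines. The sketch below uses the standard spherical Web Mercator formulas; the function names are illustrative and this is not the package's own code.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by Web Mercator


def lon_to_x(lon_deg):
    """x depends on longitude only."""
    return EARTH_RADIUS_M * math.radians(lon_deg)


def lat_to_y(lat_deg):
    """y depends on latitude only."""
    return EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))


# A rectilinear lon/lat grid is described by two 1D axes. Transforming each
# axis independently yields 1D x and y axes, so the grid stays rectilinear
# and only len(lons) + len(lats) values are converted.
lons = [-10.0, 0.0, 10.0]
lats = [40.0, 50.0]
xs = [lon_to_x(lon) for lon in lons]
ys = [lat_to_y(lat) for lat in lats]
```

A non-separable projection would instead require transforming every (lon, lat) pair, which is the broadcast-and-chunk path described in (b).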
Performance recommendations:
- Make sure _xpublish_id is set in Dataset.attrs.
- If CRS transformations are a bottleneck,
  - Assign reprojected coordinates for the desired output CRS using multiple grid mapping variables. This will take reprojection time down to 0.
  - See if you can approximate the coordinate system with rectilinear coordinates as much as possible. This triggers a much faster rendering pathway in datashader.
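To judge whether a dataset's 2D coordinates can be replaced by 1D rectilinear axes, a simple check is whether x varies only along columns and y only along rows. The sketch below is a dependency-free illustration with a hypothetical `is_rectilinear` helper and an arbitrary tolerance; it is not the pixel-id comparison the package performs at render time. With NumPy arrays the same test is usually written with `np.allclose`.

```python
def is_rectilinear(x2d, y2d, tol=1e-9):
    """True if x2d[i][j] depends only on j and y2d[i][j] only on i."""
    x_ok = all(abs(v - x2d[0][j]) <= tol for row in x2d for j, v in enumerate(row))
    y_ok = all(abs(v - row[0]) <= tol for row in y2d for v in row)
    return x_ok and y_ok


# Coordinates stored as 2D arrays of shape (ny, nx) that are really rectilinear.
x2d = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
y2d = [[10.0, 10.0, 10.0], [20.0, 20.0, 20.0]]

if is_rectilinear(x2d, y2d):
    # Collapse to 1D axes: first row of x, first column of y.
    x1d = x2d[0]
    y1d = [row[0] for row in y2d]
```

Storing the collapsed 1D axes as the dataset's coordinates lets the renderer treat the grid as rectilinear from the start.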
File details
Details for the file xpublish_tiles-0.1.22.tar.gz.
File metadata
- Download URL: xpublish_tiles-0.1.22.tar.gz
- Size: 4.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 18a14d7126fc35ca80ae48be3a9441bca2e66213cce9e19e85dc0cd43db07ca0 |
| MD5 | c56fc3ffb730c785197d41650f20c2b2 |
| BLAKE2b-256 | e148c9632c303ecb66ac9efcc236a4510e23ed4ff9a5fc27ebaf0112ce6742d8 |
Provenance
The following attestation bundles were made for xpublish_tiles-0.1.22.tar.gz:
Publisher: publish.yml on earth-mover/xpublish-tiles
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: xpublish_tiles-0.1.22.tar.gz
- Subject digest: 18a14d7126fc35ca80ae48be3a9441bca2e66213cce9e19e85dc0cd43db07ca0
- Sigstore transparency entry: 529240603
- Permalink: earth-mover/xpublish-tiles@4faa14209f36b1f73be871f1a9b7733a7c4b9332
- Branch / Tag: refs/tags/0.1.22
- Owner: https://github.com/earth-mover
- Access: internal
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4faa14209f36b1f73be871f1a9b7733a7c4b9332
- Trigger Event: push
File details
Details for the file xpublish_tiles-0.1.22-py3-none-any.whl.
File metadata
- Download URL: xpublish_tiles-0.1.22-py3-none-any.whl
- Size: 106.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e35792d9f66f78ba2c7e06f3d1236bf257ed7e24efaafc1e45ad2d26a37a2df9 |
| MD5 | 0069c829bce621a4d5833d006c743896 |
| BLAKE2b-256 | 8adcd66bf14bcc6a1ca9ead27befcce640c1f3ddb1c4d93d8ed7127698df5510 |
Provenance
The following attestation bundles were made for xpublish_tiles-0.1.22-py3-none-any.whl:
Publisher: publish.yml on earth-mover/xpublish-tiles
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: xpublish_tiles-0.1.22-py3-none-any.whl
- Subject digest: e35792d9f66f78ba2c7e06f3d1236bf257ed7e24efaafc1e45ad2d26a37a2df9
- Sigstore transparency entry: 529240611
- Permalink: earth-mover/xpublish-tiles@4faa14209f36b1f73be871f1a9b7733a7c4b9332
- Branch / Tag: refs/tags/0.1.22
- Owner: https://github.com/earth-mover
- Access: internal
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4faa14209f36b1f73be871f1a9b7733a7c4b9332
- Trigger Event: push