Lightweight multiscale zarr
topozarr - lightweight multiscale zarr pyramids
A Python library for creating multiscale Zarr pyramids for use with zarr-layer.
Attempts to follow the GeoZarr spec, writing metadata under these conventions:
- multiscales — pyramid structure and resolution levels
- proj: — coordinate reference system (CRS)
- spatial: — affine transform, bounding box, and dimension names
Warning: experimental
Usage
Installation
You can install the tutorial optional dependency group to run this example.
uv add 'topozarr[tutorial]'
# or
pip install 'topozarr[tutorial]'
Example
import xarray as xr
import xproj # for crs assignment
from topozarr.coarsen import create_pyramid
# Load the air_temperature Xarray tutorial dataset
ds = xr.tutorial.open_dataset('air_temperature', chunks="auto")
ds = ds.proj.assign_crs(spatial_ref="EPSG:4326")
print(ds)
pyramid = create_pyramid(
    ds,
    levels=2,
    x_dim="lon",
    y_dim="lat",
    method="mean",  # "mean" (default) | "max" | "min" | "sum"
)
print(pyramid.encoding)
print(pyramid.dt)
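To illustrate what the method argument controls, here is a minimal NumPy sketch of block coarsening. It is not topozarr's implementation: it assumes each level halves the resolution by reducing non-overlapping 2x2 blocks, and the factor of 2, the edge trimming, and the helper name coarsen_2d are all assumptions made for illustration.

```python
import numpy as np


def coarsen_2d(arr, factor=2, method="mean"):
    """Reduce non-overlapping factor x factor blocks of a 2D array.

    Hypothetical helper for illustration only; not part of topozarr.
    """
    reducers = {"mean": np.mean, "max": np.max, "min": np.min, "sum": np.sum}
    ny, nx = arr.shape
    # Trim trailing rows/columns so both dimensions divide evenly (assumption)
    ny_t, nx_t = ny - ny % factor, nx - nx % factor
    # Split each axis into (block index, position within block)
    blocks = arr[:ny_t, :nx_t].reshape(ny_t // factor, factor, nx_t // factor, factor)
    # Reduce over the two within-block axes
    return reducers[method](blocks, axis=(1, 3))


grid = np.arange(16, dtype=float).reshape(4, 4)
level1 = coarsen_2d(grid, method="mean")
print(level1)
```

With method="mean" each output cell is the average of its four source cells, which suits continuous fields such as temperature; "max", "min", and "sum" keep extremes or totals instead.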
Visualization hints
Use layer_hints to embed colormap and color range hints for zarr-layer directly in the pyramid metadata:
from topozarr.metadata import ZarrLayerVarConfig
pyramid = create_pyramid(
    ds,
    levels=2,
    x_dim="lon",
    y_dim="lat",
    layer_hints={"air": ZarrLayerVarConfig(colormap="blues", clim=[230, 310])},
)
These are written into the root zarr-layer metadata key and are optional — omitting layer_hints has no effect on the pyramid structure or encoding.
Chunking
create_pyramid returns a Pyramid with two attributes: pyramid.dt (the DataTree) and pyramid.encoding (recommended chunk and shard sizes per variable per level). Always pass pyramid.encoding as the encoding argument when writing — this is what applies the chunking strategy to the output store.
# Inspect the recommended encoding before writing
print(pyramid.encoding)
The heuristics target ~500KB chunks for web visualization. You can tune shard size with chunks_per_shard (default: 4, giving 16 chunks per shard and ~8MB shards). Valid values are powers of 2: 1, 2, 4, 8, 16, 32. Larger shards reduce task graph overhead when using Dask but increase memory usage.
| chunks_per_shard | chunks/shard | approx shard size |
|---|---|---|
| 1 | 1 | ~500KB |
| 4 | 16 | ~8MB (default) |
| 8 | 64 | ~32MB |
| 16 | 256 | ~128MB |
Pass chunks_per_shard=None to disable sharding entirely.
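The table's numbers follow from simple arithmetic, sketched below. The sketch assumes that chunks_per_shard applies along each of the two spatial dimensions (so the chunk count is its square) and that every chunk is about 500KB. The helper name shard_summary is hypothetical and not part of topozarr.

```python
VALID_CHUNKS_PER_SHARD = (1, 2, 4, 8, 16, 32)
TARGET_CHUNK_BYTES = 500_000  # ~500KB target stated above


def shard_summary(chunks_per_shard, chunk_bytes=TARGET_CHUNK_BYTES):
    """Return (chunks per shard, approximate shard size in bytes).

    Hypothetical helper; assumes sharding along two spatial dimensions.
    """
    if chunks_per_shard not in VALID_CHUNKS_PER_SHARD:
        raise ValueError(f"chunks_per_shard must be one of {VALID_CHUNKS_PER_SHARD}")
    n_chunks = chunks_per_shard ** 2  # chunks along x times chunks along y
    return n_chunks, n_chunks * chunk_bytes


for cps in (1, 4, 8, 16):
    n_chunks, size = shard_summary(cps)
    print(f"chunks_per_shard={cps}: {n_chunks} chunks/shard, ~{size / 1e6:g} MB")
```

Because the chunk count grows with the square of chunks_per_shard, doubling the setting roughly quadruples the shard size, which is worth keeping in mind when balancing Dask task overhead against memory.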
Writing
Always pass pyramid.encoding to apply the recommended chunking:
# Write to Zarr
from obstore.store import from_url
from zarr.storage import ObjectStore
store = ObjectStore(from_url(url="<your_bucket_url>", region="<your_region>"))
pyramid.dt.to_zarr(store, mode="w", encoding=pyramid.encoding, zarr_format=3)
# Write to Icechunk
import icechunk
storage = icechunk.s3_storage(bucket="<your_bucket>", prefix="<your_prefix>", from_env=True)
repo = icechunk.Repository.create(storage)
session = repo.writable_session("main")
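pyramid.dt.to_zarr(session.store, mode="w", encoding=pyramid.encoding, consolidated=False)
# Commit the session so the write is recorded as a snapshot on "main"
session.commit("Write topozarr pyramid")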
Contributing
Clone the repo and install with the test dependency group:
git clone https://github.com/carbonplan/topozarr
cd topozarr
uv sync --group test
Run tests:
uv run pytest -n auto
Run conformance tests against the GeoZarr spec (requires the conformance group):
uv sync --group conformance
uv run pytest -n auto -m conformance
Lint and format:
uv run pre-commit run --all-files
To regenerate the demo datasets in S3 (requires AWS credentials), install the demo extra and run the build script:
uv sync --extra demo
uv run python scripts/build_demo_data.py --help
License
This code is licensed under the MIT License. See the LICENSE file for details.
About Us
CarbonPlan is a nonprofit organization that uses data and science for climate action. We aim to improve the transparency and scientific integrity of climate solutions through open data and tools. Find out more at carbonplan.org or get in touch by opening an issue or sending us an email.