Lightweight multiscale zarr

topozarr - lightweight multiscale zarr pyramids

Python library for creating multiscale Zarr pyramids for use with zarr-layer.

Attempts to follow the GeoZarr spec, using these conventions:

  • multiscales — pyramid structure and resolution levels
  • proj: — coordinate reference system (CRS)
  • spatial: — affine transform, bounding box, and dimension names

Warning: experimental

Usage

Installation

You can install the tutorial optional dependency group to run this example.

uv add 'topozarr[tutorial]'
# or
pip install 'topozarr[tutorial]'

Example

import xarray as xr
import xproj # for crs assignment
from topozarr.coarsen import create_pyramid

# Load the air_temperature Xarray tutorial dataset
ds = xr.tutorial.open_dataset('air_temperature', chunks="auto")
ds = ds.proj.assign_crs(spatial_ref="EPSG:4326")
print(ds)
pyramid = create_pyramid(
    ds,
    levels=2,
    x_dim="lon",
    y_dim="lat",
    method="mean",  # "mean" (default) | "max" | "min" | "sum"
)
print(pyramid.encoding)
print(pyramid.dt)
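
To make the method options concrete, here is a minimal NumPy sketch of what reducing one pyramid level to the next can look like. It is an illustration only, not topozarr's implementation: the factor of 2 per level, the trimming of odd edges, treating level 0 as the full-resolution input, and the helper names coarsen_2x and build_levels are all assumptions made for this example.

```python
import numpy as np


def coarsen_2x(arr: np.ndarray, method: str = "mean") -> np.ndarray:
    """Reduce a 2D (y, x) array by a factor of 2 along each axis.

    Hypothetical helper: the factor of 2 and the edge trimming are
    assumptions for illustration, not documented topozarr behavior.
    """
    reducers = {"mean": np.mean, "max": np.max, "min": np.min, "sum": np.sum}
    if method not in reducers:
        raise ValueError(f"unknown method: {method!r}")
    ny2, nx2 = arr.shape[0] // 2, arr.shape[1] // 2
    # Drop a trailing odd row/column so the array splits into 2x2 blocks.
    trimmed = arr[: ny2 * 2, : nx2 * 2]
    # Axes 1 and 3 index the position inside each 2x2 block.
    blocks = trimmed.reshape(ny2, 2, nx2, 2)
    return reducers[method](blocks, axis=(1, 3))


def build_levels(arr: np.ndarray, levels: int, method: str = "mean") -> list:
    """Return [full resolution, 2x coarser, 4x coarser, ...] with `levels` entries."""
    out = [arr]
    for _ in range(levels - 1):
        out.append(coarsen_2x(out[-1], method))
    return out


base = np.arange(24, dtype="float64").reshape(4, 6)
levels = build_levels(base, levels=2, method="mean")
print([lvl.shape for lvl in levels])
```

With "mean", each coarse cell is the average of the four fine cells it covers; "sum" preserves totals instead, which matters for quantities such as counts.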

Chunking

A recommended encoding is returned with the pyramid; access it with pyramid.encoding. Basic heuristics target chunk sizes of ~500KB for web visualization. Shards group several chunks, and you can tune their size with the chunks_per_shard parameter (default: 4, giving 16 chunks per shard and ~8MB shards). Valid values are powers of 2: 1, 2, 4, 8, 16, 32. Larger shards increase memory usage but reduce task graph overhead when using Dask.

chunks_per_shard   chunks/shard   approx. shard size
1                  1              ~500KB
4                  16             ~8MB (default)
8                  64             ~32MB
16                 256            ~128MB
# Example: larger shards (64 chunks per shard, ~32MB)
pyramid = create_pyramid(ds, levels=2, x_dim="lon", y_dim="lat", chunks_per_shard=8)

Pass chunks_per_shard=None to disable sharding entirely.
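
The shard sizes quoted above follow from simple arithmetic, sketched below. This is a rough back-of-the-envelope helper, not part of topozarr: the name shard_summary is hypothetical, and it assumes chunks_per_shard applies along each of two spatial dimensions (consistent with 4 giving 16 chunks per shard) and that every chunk hits the ~500KB target exactly.

```python
TARGET_CHUNK_BYTES = 500_000  # ~500KB chunk target mentioned in the text
VALID_CHUNKS_PER_SHARD = (1, 2, 4, 8, 16, 32)


def shard_summary(chunks_per_shard):
    """Return (chunks per shard, approximate shard bytes), or None if sharding is off.

    Hypothetical helper for illustration; assumes two sharded dimensions.
    """
    if chunks_per_shard is None:
        return None  # sharding disabled
    if chunks_per_shard not in VALID_CHUNKS_PER_SHARD:
        raise ValueError(f"chunks_per_shard must be one of {VALID_CHUNKS_PER_SHARD}")
    n_chunks = chunks_per_shard ** 2
    return n_chunks, n_chunks * TARGET_CHUNK_BYTES


for cps in (1, 4, 8, 16):
    n_chunks, n_bytes = shard_summary(cps)
    print(f"chunks_per_shard={cps}: {n_chunks} chunks, ~{n_bytes / 1e6:g}MB")
```

Real shard sizes will differ with dtype, compression, and array shape, so treat these numbers as order-of-magnitude guidance.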

# Optional: Write to Zarr
# !pip install obstore zarr
from obstore.store import from_url
from zarr.storage import ObjectStore

store = from_url(url="<add_your_bucket_url>", region="<add_your_region>")
zstore = ObjectStore(store)
pyramid.dt.to_zarr(zstore, mode="w", encoding=pyramid.encoding, zarr_format=3)

# Optional: Write to Icechunk
# !pip install icechunk
import icechunk

storage = icechunk.s3_storage(bucket="<add_your_bucket_name>", prefix="<add_your_prefix>", from_env=True)
repo = icechunk.Repository.create(storage)
session = repo.writable_session("main")

# Write through the Icechunk session store; no obstore-backed store is needed here
pyramid.dt.to_zarr(session.store, mode="w", encoding=pyramid.encoding, consolidated=False)
# Commit so the write is persisted to the repository
session.commit("Add multiscale pyramid")

Development

This project uses uv for dependency management, pytest and hypothesis for testing and ruff for linting.

Sync development environment

uv sync --all-extras

Run linter

uv run pre-commit run --all-files

Run tests

uv run pytest tests/

Run conformance tests against the GeoZarr spec (requires geozarr-toolkit)

uv sync --group conformance
uv run pytest tests/ -m conformance
