Lazily load COG assets from STAC items into xarray DataArrays using async-geotiff

lazycogs

Open a lazy (band, time, y, x) xarray DataArray from thousands of cloud-optimized geotiffs (COGs). No GDAL required.

What is lazycogs?

stackstac and odc-stac established the pattern that lazycogs builds on: take a STAC item collection and expose it as a spatially-aligned xarray DataArray ready for dask-parallel computation. Both are excellent tools that cover most satellite imagery workflows well. They rely on the trusty combination of rasterio and GDAL for data i/o and warping operations.

lazycogs takes the same approach but replaces GDAL and rasterio with a Rust-native stack: rustac for STAC queries over stac-geoparquet files, async-geotiff for COG i/o, and obstore for cloud storage access.

The result is a tool that can instantly expose a lazy xarray DataArray view of massive STAC item archives in any CRS and resolution. Each array operation triggers a targeted spatial query on the stac-geoparquet file to find only the assets needed for that specific chunk — no upfront scan of every item required.
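The per-chunk lookup described above can be pictured as a bounding-box intersection filter. The sketch below is only a conceptual illustration in plain Python: in lazycogs the query runs against the stac-geoparquet file through rustac, not over an in-memory list, and the names `Item`, `intersects`, and `items_for_chunk` are hypothetical.

```python
from typing import List, NamedTuple, Tuple

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


class Item(NamedTuple):
    """Hypothetical stand-in for one STAC item row: an id plus its bounding box."""
    id: str
    bbox: BBox


def intersects(a: BBox, b: BBox) -> bool:
    """Return True if two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def items_for_chunk(items: List[Item], chunk_bbox: BBox) -> List[Item]:
    """Keep only the items whose footprint intersects the requested chunk."""
    return [item for item in items if intersects(item.bbox, chunk_bbox)]


items = [
    Item("west", (0.0, 0.0, 10.0, 10.0)),
    Item("east", (20.0, 0.0, 30.0, 10.0)),
    Item("north", (0.0, 20.0, 10.0, 30.0)),
]
chunk_bbox = (5.0, 5.0, 15.0, 15.0)

needed = items_for_chunk(items, chunk_bbox)
print([item.id for item in needed])  # ['west']
```

Only the item overlapping the chunk is returned, so only its assets would need to be read for that chunk.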

One constraint worth naming: lazycogs only reads Cloud Optimized GeoTIFFs. If your assets are in another format, this is not the right tool.

Task                                Library
----------------------------------  ----------------------------------------------
STAC search + spatial indexing      rustac (DuckDB + geoparquet)
COG I/O                             async-geotiff (Rust, no GDAL)
Cloud storage                       obstore
Reprojection                        pyproj + numpy
Lazy dataset construction           xarray BackendEntrypoint + LazilyIndexedArray

Installation

Install from PyPI:

pip install lazycogs

Coordinate convention

lazycogs.open() returns a DataArray whose y coordinates follow the standard north-up raster convention, with the origin at the top left rather than the bottom left. The y coordinates descend from north to south: y[0] is the northernmost pixel row and y[-1] is the southernmost. This matches the affine transform and is consistent with odc-stac, rioxarray, and GDAL.

Use sel(y=slice(north, south)) (high to low) for spatial subsetting.
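To make the convention concrete, here is a minimal plain-Python sketch that builds coordinate labels for a north-up grid from a bounding box and a resolution. It assumes labels sit at pixel centers, which is a common convention but an assumption here rather than something this page states; `pixel_center_coords` and `select_y` are hypothetical helpers, not part of lazycogs.

```python
def pixel_center_coords(bbox, resolution):
    """Return (xs, ys) pixel-center labels for a north-up grid.

    Assumption: labels are pixel centers. x ascends west to east,
    y descends north to south (origin at the top left).
    """
    minx, miny, maxx, maxy = bbox
    width = round((maxx - minx) / resolution)
    height = round((maxy - miny) / resolution)
    xs = [minx + (i + 0.5) * resolution for i in range(width)]
    ys = [maxy - (j + 0.5) * resolution for j in range(height)]
    return xs, ys


def select_y(ys, north, south):
    """Mimic sel(y=slice(north, south)) on descending labels (bounds inclusive)."""
    return [y for y in ys if south <= y <= north]


xs, ys = pixel_center_coords((380000.0, 4928000.0, 420000.0, 4984000.0), 10.0)

print(ys[0], ys[-1])  # 4983995.0 4928005.0
print(select_y(ys, north=4984000.0, south=4983970.0))
# [4983995.0, 4983985.0, 4983975.0]
```

Because the y labels are descending, a label-based slice in xarray has to run in the same direction as the index: slice(south, north) on such an axis selects nothing.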

Example

import rustac
import lazycogs
from pyproj import Transformer

# set a target CRS and extent
dst_crs = "EPSG:32615"
dst_bbox = (380000.0, 4928000.0, 420000.0, 4984000.0)

# transform to 4326 for STAC search
transformer = Transformer.from_crs(dst_crs, "epsg:4326", always_xy=True)
bbox_4326 = transformer.transform_bounds(*dst_bbox)

# Search a STAC API and cache results to a local stac-geoparquet file.
# Top-level await works in Jupyter/IPython; in a script, run this inside an async function.
await rustac.search_to(
    "items.parquet",
    "https://earth-search.aws.element84.com/v1",
    collections=["sentinel-2-l2a"],
    datetime="2023-06-01/2023-08-31",
    bbox=bbox_4326,
)

# Open a fully lazy (band, time, y, x) DataArray. No COGs are read yet.
da = lazycogs.open(
    "items.parquet",
    bbox=dst_bbox,
    crs=dst_crs,
    resolution=10.0,
)

Async loading

When you are already inside an async context (for example, a Jupyter notebook running on an asyncio loop), you can trigger chunk reads without blocking the event loop:

# Fetch data asynchronously and load into memory in-place.
subset = await da.isel(x=slice(0, 10), y=slice(0, 10), time=slice(0, 10)).load_async()

load_async uses xarray's async protocol, which dispatches through MultiBandStacBackendArray.async_getitem and stays on the caller's event loop. Multiple concurrent chunk reads overlap naturally, so the async path can be faster than the synchronous da.compute() when reading many chunks inside an already-running loop.
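The benefit of overlapping reads can be illustrated with a standard-library sketch. It does not use lazycogs or xarray: `fetch_chunk` is a hypothetical stand-in that sleeps instead of doing network I/O, and the comparison is between awaiting reads one at a time and awaiting them together with asyncio.gather.

```python
import asyncio
import time


async def fetch_chunk(chunk_id, delay=0.05):
    """Hypothetical chunk read: sleeps to simulate I/O latency."""
    await asyncio.sleep(delay)
    return chunk_id


async def read_sequential(chunk_ids):
    """Await each read before starting the next one."""
    return [await fetch_chunk(c) for c in chunk_ids]


async def read_concurrent(chunk_ids):
    """Start all reads at once so their waits overlap on one event loop."""
    return list(await asyncio.gather(*(fetch_chunk(c) for c in chunk_ids)))


async def main():
    chunk_ids = list(range(8))

    start = time.perf_counter()
    seq = await read_sequential(chunk_ids)
    t_seq = time.perf_counter() - start

    start = time.perf_counter()
    con = await read_concurrent(chunk_ids)
    t_con = time.perf_counter() - start

    return seq, con, t_seq, t_con


seq, con, t_seq, t_con = asyncio.run(main())
print(f"sequential: {t_seq:.2f}s, concurrent: {t_con:.2f}s")
```

Both paths return the same values in the same order; the concurrent one finishes in roughly the time of a single read because the waits overlap. Inside a notebook that already runs an event loop, use `await main()` instead of `asyncio.run(main())`.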

Documentation
