Extend xarray.open_dataset to accept pystac objects

xpystac

xpystac provides the glue that allows xarray.open_dataset to accept pystac objects.

The goal is that as long as this library is installed in your environment, you should never need to think about it.

  • Open one asset: Reads data for an asset pointing to a COG, a zarr store, or a kerchunk reference file.
  • Open one item: Reads data for all the assets in a particular item (commonly each COG represents a band).
  • Open many items: Reads all the assets in all the items for a particular item collection, iterable of items, or output of pystac_client.Client.search.

What works

file format   one asset (item or collection-level)   one item   many items
COG           x                                      x          x
zarr          x
kerchunk      x                                      x*         x*

* if the kerchunk references are stored in the item alongside the datacube extension properties

Install

pip install xpystac

Examples

Open a single asset

Read from a COG

import pystac
import xarray as xr

item = pystac.Item.from_file(
    "https://raw.githubusercontent.com/stac-utils/pystac/v1.12.2/tests/data-files/examples/1.0.0/simple-item.json"
)
asset = item.assets["visual"]

xr.open_dataset(asset)

The following examples draw on the Planetary Computer docs, which include good examples of collection-level assets used to catalog zarr stores and kerchunk reference files.

import planetary_computer
import pystac_client
import xarray as xr


catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

Read from a kerchunk reference file:

collection = catalog.get_collection("nasa-nex-gddp-cmip6")
asset = collection.assets["ACCESS-CM2.historical"]

xr.open_dataset(asset, patch_url=planetary_computer.sign)

Read from a zarr store:

collection = catalog.get_collection("daymet-daily-hi")
asset = collection.assets["zarr-abfs"]

xr.open_dataset(asset, patch_url=planetary_computer.sign)

Note that this zarr asset uses the xarray-assets extension to store open_kwargs and storage_options, which xpystac then passes along to xr.open_dataset.
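
To make the role of the xarray-assets fields concrete, here is a minimal sketch of how stored options could be merged into keyword arguments for xr.open_dataset. It works on a plain dict standing in for a STAC asset, so it needs no network access or extra packages. The helper name open_kwargs_from_asset, the example href, and the example option values are all hypothetical, and the merge order (caller arguments override stored ones) is an assumption rather than a description of xpystac's internals. Only the field names xarray:open_kwargs and xarray:storage_options come from the extension itself.

```python
from typing import Any


def open_kwargs_from_asset(asset: dict[str, Any], **overrides: Any) -> dict[str, Any]:
    """Collect keyword arguments for xr.open_dataset from a STAC asset dict.

    Hypothetical helper for illustration; not part of xpystac's API.
    """
    # Copy the stored kwargs so the asset metadata is never mutated.
    kwargs = dict(asset.get("xarray:open_kwargs", {}))

    # Storage options, if present, are nested under their own key.
    storage_options = asset.get("xarray:storage_options")
    if storage_options:
        kwargs["storage_options"] = dict(storage_options)

    # Assumption: values supplied by the caller win over stored ones.
    kwargs.update(overrides)
    return kwargs


# Hypothetical asset with made-up values, shaped like a zarr asset
# that uses the xarray-assets extension.
asset = {
    "href": "abfs://example-container/example.zarr",
    "type": "application/vnd+zarr",
    "xarray:open_kwargs": {"consolidated": True},
    "xarray:storage_options": {"account_name": "exampleaccount"},
}

kwargs = open_kwargs_from_asset(asset, chunks={})
```

With real data, the resulting dict would be unpacked into a call along the lines of xr.open_dataset(asset["href"], engine="zarr", **kwargs).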

Open a single item

A single item containing many COGs:

import pystac
import xarray as xr


item = pystac.Item.from_file(
    "https://earth-search.aws.element84.com/v1/collections/landsat-c2-l2/items/LC09_L2SR_081108_20250311_02_T2"
)

xr.open_dataset(item)

This takes advantage of a stacking library, either odc-stac or stackstac, which you can choose via the stacking_library option.

Open many items

Read all the data from the search results for a collection of COGs:

import pystac_client
import xarray as xr


catalog = pystac_client.Client.open(
    "https://earth-search.aws.element84.com/v1",
)

search = catalog.search(
    intersects=dict(type="Point", coordinates=[-105.78, 35.79]),
    collections=['sentinel-2-l2a'],
    datetime="2022-04-01/2022-05-01",
)

xr.open_dataset(search, engine="stac")

Read data from an item collection that uses the exploratory approach of storing kerchunked metadata within the datacube extension metadata:

import pystac
import xarray as xr

item_collection = pystac.ItemCollection.from_file(
    "https://raw.githubusercontent.com/stac-utils/xpystac/main/tests/data/data-cube-kerchunk-item-collection.json"
)

xr.open_dataset(item_collection)

How it works

When you call xarray.open_dataset(obj, engine="stac"), this library routes that call to the appropriate reader. Depending on the type of object, that is either a stacking library (odc-stac or stackstac) or xarray.open_dataset itself, this time called with the engine and other options pulled from the pystac object.
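
The routing described above can be sketched as a small dispatch function. This is an illustration under stated assumptions, not xpystac's actual code: it inspects plain STAC JSON dicts instead of pystac objects, the function name choose_reader is hypothetical, the zarr media types and the kerchunk roles are assumed conventions, and the case of kerchunk references stored in datacube extension metadata is left out.

```python
from typing import Any

# Media type pystac uses for cloud-optimized GeoTIFFs.
COG = "image/tiff; application=geotiff; profile=cloud-optimized"
# Assumption: media types commonly used to mark zarr stores.
ZARR_TYPES = {"application/vnd+zarr", "application/x-zarr"}
# Assumption: roles that mark a JSON asset as a kerchunk reference file.
KERCHUNK_ROLES = {"index", "references"}


def choose_reader(obj: dict[str, Any], stacking_library: str = "odc-stac") -> str:
    """Name the reader that a STAC JSON dict is routed to in this sketch."""
    # Items and item collections hold many assets, so they are handed
    # to a stacking library.
    if obj.get("type") in ("Feature", "FeatureCollection"):
        return stacking_library

    # Anything else is treated as a single asset, whose "type" is a media type.
    media_type = obj.get("type", "")
    roles = set(obj.get("roles", []))
    if media_type == COG:
        return "rasterio"
    if media_type in ZARR_TYPES:
        return "zarr"
    if media_type == "application/json" and roles & KERCHUNK_ROLES:
        return "kerchunk"
    raise ValueError(f"no reader known for media type {media_type!r}")


item_reader = choose_reader({"type": "Feature"}, stacking_library="stackstac")
cog_reader = choose_reader({"href": "https://example.com/band.tif", "type": COG})
```

The design point is that the caller never names a reader: the object's own metadata carries enough information to pick one, which is what lets xr.open_dataset accept pystac objects directly.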

Prior Art

This work is inspired by https://github.com/TomAugspurger/staccontainers and the discussion in https://github.com/stac-utils/pystac/issues/846
