
Extend xarray.open_dataset to accept pystac objects


xpystac

xpystac provides the glue that allows xarray.open_dataset to accept pystac objects.

The goal is that as long as this library is installed in your environment, you should never need to think about it.

  • Open one asset: Reads data for an asset pointing to a COG, a zarr store, or a kerchunk reference file.
  • Open one item: Reads data for all the assets in a particular item (commonly each COG represents a band).
  • Open many items: Reads all the assets in all the items for a particular item collection, iterable of items, or output of pystac_client.Client.search.

What works

file format one asset (item or collection-level) one item many items
COG x
Zarr x
Kerchunk x x*
virtual Icechunk x

* Only if the kerchunk metadata is stored in the item alongside the datacube extension properties.

Install

pip install xpystac

Examples

Open a single asset

Read from a COG

import pystac
import xarray as xr

item = pystac.Item.from_file(
    "https://raw.githubusercontent.com/stac-utils/pystac/v1.12.2/tests/data-files/examples/1.0.0/simple-item.json"
)
asset = item.assets["visual"]

xr.open_dataset(asset)

Read from a virtual Icechunk store

import pystac
import xarray as xr

collection = pystac.Collection.from_file(
    "https://raw.githubusercontent.com/stac-utils/xpystac/refs/heads/main/tests/data/virtual-icechunk-collection.json"
)

# Get the latest version of the collection-level asset
assets = collection.get_assets(role="latest-version")
asset = next(iter(assets.values()))

xr.open_dataset(asset)

The next few examples come from the Planetary Computer docs, which include good examples of collection-level assets used to catalog zarr stores and kerchunk reference files.

import planetary_computer
import pystac_client
import xarray as xr


catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

Read from a kerchunk reference file:

collection = catalog.get_collection("nasa-nex-gddp-cmip6")
asset = collection.assets["ACCESS-CM2.historical"]

xr.open_dataset(asset, patch_url=planetary_computer.sign)

Read from a zarr store:

collection = catalog.get_collection("daymet-daily-hi")
asset = collection.assets["zarr-abfs"]

xr.open_dataset(asset, patch_url=planetary_computer.sign)

Note that this zarr asset uses the xarray-assets extension to store open_kwargs and storage_options, which xpystac then passes along to xr.open_dataset.
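
To make the role of the xarray-assets extension concrete, here is a minimal sketch of how its two fields, `xarray:open_kwargs` and `xarray:storage_options`, could be combined into keyword arguments for an open call. It works on a plain dict standing in for the asset's JSON, so it runs without any network access. `open_kwargs_from_asset` is a hypothetical helper and the sample values are illustrative; this is not xpystac's internal code.

```python
def open_kwargs_from_asset(asset_fields: dict) -> dict:
    """Merge xarray-assets extension fields into one kwargs dict.

    ``asset_fields`` is the asset's JSON as a plain dict (hypothetical input).
    """
    # Start from a copy so the asset metadata itself is never mutated.
    kwargs = dict(asset_fields.get("xarray:open_kwargs", {}))
    storage_options = asset_fields.get("xarray:storage_options")
    if storage_options:
        # Keep an explicit storage_options inside open_kwargs if one exists.
        kwargs.setdefault("storage_options", dict(storage_options))
    return kwargs


# Illustrative asset metadata; the href and account name are made up.
example_asset = {
    "href": "abfs://example-container/example.zarr",
    "type": "application/vnd+zarr",
    "xarray:open_kwargs": {"consolidated": True},
    "xarray:storage_options": {"account_name": "exampleaccount"},
}

kwargs = open_kwargs_from_asset(example_asset)
```

With a real asset, the same fields are available through `asset.extra_fields`, and the resulting dict could be unpacked into an open call by hand if you ever need to bypass xpystac.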

Open many items

Read all the data from the search results for a collection of COGs:

import pystac_client
import xarray as xr


catalog = pystac_client.Client.open(
    "https://earth-search.aws.element84.com/v1",
)

search = catalog.search(
    intersects=dict(type="Point", coordinates=[-105.78, 35.79]),
    collections=['sentinel-2-l2a'],
    datetime="2022-04-01/2022-05-01",
)

xr.open_dataset(search, engine="stac")

Read data from an item collection that uses the exploratory approach of storing kerchunked metadata within the datacube extension metadata:

import pystac
import xarray as xr

item_collection = pystac.ItemCollection.from_file(
    "https://raw.githubusercontent.com/stac-utils/xpystac/main/tests/data/data-cube-kerchunk-item-collection.json"
)

xr.open_dataset(item_collection)

How it works

When you call xarray.open_dataset(object, engine="stac"), this library routes the call to the appropriate reader for that object. Depending on the type of object, that may be xarray.open_dataset itself, called again with the engine and other options taken from the pystac object.
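
The routing idea can be illustrated with a small dispatch function that picks an engine from an asset's media type and roles. This is only a sketch under stated assumptions: `guess_engine` is a hypothetical helper, and the mapping (including the engine names and the roles used to spot kerchunk references) is an illustration, not xpystac's actual logic.

```python
from typing import Sequence

# Media type strings as defined by the STAC best practices.
COG = "image/tiff; application=geotiff; profile=cloud-optimized"
ZARR = "application/vnd+zarr"
JSON = "application/json"


def guess_engine(media_type: str, roles: Sequence[str] = ()) -> str:
    """Pick an xarray engine name for an asset (illustrative mapping only)."""
    if media_type == ZARR:
        return "zarr"
    if media_type == COG:
        # Assumption: COGs are read through the rasterio engine (rioxarray).
        return "rasterio"
    if media_type == JSON and {"index", "references"} & set(roles):
        # Assumption: a JSON asset with one of these roles is a kerchunk file.
        return "kerchunk"
    raise ValueError(f"no reader known for media type {media_type!r}")


engine = guess_engine(ZARR)
```

Once an engine is chosen, the call can go back to xarray.open_dataset with the asset's href and that engine, which matches the round trip described above.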

Prior Art

This work is inspired by https://github.com/TomAugspurger/staccontainers and by the discussion in https://github.com/stac-utils/pystac/issues/846.
