STAC-GeoParquet

Convert STAC items between JSON, GeoParquet, pgstac, and Delta Lake.

Purpose

The STAC spec defines a JSON-based schema, but it can be hard to manage and search through many millions of STAC items in JSON format. JSON is verbose on disk, and extracting even a small piece of information, say the datetime and one asset of an Item, generally requires parsing the entire JSON document into memory.
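To make the parsing cost concrete, here is a minimal sketch using only the Python standard library. The Item below is a hypothetical, heavily trimmed example (the `id`, URLs, and property values are invented for illustration), and `datetime_and_asset` is an illustrative helper, not part of stac-geoparquet.

```python
import json

# Hypothetical, heavily trimmed STAC Item; real Items carry many more fields.
ITEM_JSON = json.dumps({
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-item",
    "geometry": {"type": "Point", "coordinates": [-74.0, 40.7]},
    "bbox": [-74.0, 40.7, -74.0, 40.7],
    "properties": {"datetime": "2024-06-15T15:30:00Z", "eo:cloud_cover": 12.5},
    "assets": {
        "visual": {"href": "https://example.com/visual.tif", "type": "image/tiff"},
        "thumbnail": {"href": "https://example.com/thumb.png", "type": "image/png"},
    },
    "links": [],
})


def datetime_and_asset(item_text: str, asset_key: str) -> tuple[str, str]:
    """Return (datetime, asset href) for one serialized STAC Item."""
    # json.loads cannot skip fields: geometry, links, and every other asset
    # are decoded into Python objects before we can read the two values we want.
    item = json.loads(item_text)
    return item["properties"]["datetime"], item["assets"][asset_key]["href"]


result = datetime_and_asset(ITEM_JSON, "visual")
print(result)
```

Repeating this for millions of Items means decoding millions of complete documents, even though only two values per Item are used.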

GeoParquet can be a good complement to JSON for many bulk-access and analytic use cases. While STAC Items are commonly distributed as individual JSON files on object storage or through a STAC API, STAC GeoParquet allows users to access a large number of STAC items in bulk without making repeated HTTP requests.

For analytic questions like "find the items in the Sentinel-2 collection in June 2024 over New York City with cloud cover of less than 20%" it can be much faster to find the relevant data from a GeoParquet source than from JSON, because a GeoParquet reader needs to load only the columns relevant to that query, not the full data.
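The idea of reading only the relevant columns can be illustrated with a toy column-oriented table in plain Python. This is a sketch of the access pattern only, not of how Parquet or stac-geoparquet work internally. The table contents, the collection names, the approximate `NYC_BBOX`, and the `find_items` helper are all assumptions made up for illustration.

```python
from datetime import datetime, timezone

# Toy column-oriented table: one list per column, aligned by row index.
# All values are invented for illustration.
AREA = (-74.5, 40.3, -73.5, 41.0)  # (xmin, ymin, xmax, ymax)
columns = {
    "id": ["a", "b", "c", "d"],
    "collection": ["sentinel-2-l2a", "sentinel-2-l2a", "landsat-c2-l2", "sentinel-2-l2a"],
    "datetime": [
        datetime(2024, 6, 3, tzinfo=timezone.utc),
        datetime(2024, 6, 20, tzinfo=timezone.utc),
        datetime(2024, 6, 10, tzinfo=timezone.utc),
        datetime(2024, 7, 1, tzinfo=timezone.utc),
    ],
    "eo:cloud_cover": [5.0, 45.0, 10.0, 3.0],
    "bbox": [AREA, AREA, AREA, AREA],
    # Stand-in for a large column that the query below never reads.
    "assets": [{"visual": {"href": f"https://example.com/{i}.tif"}} for i in "abcd"],
}

NYC_BBOX = (-74.26, 40.48, -73.70, 40.92)  # approximate, for illustration only


def intersects(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find_items(table, collection, start, end, max_cloud, area):
    """Return ids of matching rows, touching only the columns the filter needs."""
    needed = ("collection", "datetime", "eo:cloud_cover", "bbox")
    rows = zip(*(table[name] for name in needed))
    return [
        table["id"][i]
        for i, (coll, dt, cloud, bbox) in enumerate(rows)
        if coll == collection
        and start <= dt < end
        and cloud < max_cloud
        and intersects(bbox, area)
    ]


matches = find_items(
    columns,
    collection="sentinel-2-l2a",
    start=datetime(2024, 6, 1, tzinfo=timezone.utc),
    end=datetime(2024, 7, 1, tzinfo=timezone.utc),
    max_cloud=20.0,
    area=NYC_BBOX,
)
print(matches)
```

The `assets` column is never accessed by the filter. In a columnar file format, that corresponds to bytes that never have to be read from disk or fetched over the network.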

See the STAC-GeoParquet specification for details on the exact schema of the written Parquet files.

Installation

Install via pip or conda:

  • pip install stac-geoparquet
  • conda install conda-forge::stac-geoparquet

Documentation

Documentation website

Development

Get uv, then:

git clone git@github.com:stac-utils/stac-geoparquet.git
cd stac-geoparquet
uv sync
uv run pre-commit install
uv run pytest
scripts/lint

Validate the example collection metadata against the jsonschema:

check-jsonschema --schemafile spec/json-schema/metadata.json spec/example-metadata.json


