STAC-GeoParquet
Convert STAC items between JSON, GeoParquet, pgstac, and Delta Lake.
Purpose
The STAC spec defines a JSON-based schema, but it can be hard to manage and search through many millions of STAC Items in JSON format.
For one, JSON is large on disk.
For another, you need to parse an entire JSON document into memory to extract even a small piece of information, such as the datetime and one asset of an Item.
GeoParquet can be a good complement to JSON for many bulk-access and analytic use cases. While STAC Items are commonly distributed as individual JSON files on object storage or through a STAC API, STAC GeoParquet allows users to access a large number of STAC items in bulk without making repeated HTTP requests.
For analytic questions like "find the items in the Sentinel-2 collection in June 2024 over New York City with cloud cover of less than 20%" it can be much, much faster to find the relevant data from a GeoParquet source than from JSON, because GeoParquet needs to load only the relevant columns for that query, not the full data.
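The difference between the two access patterns can be sketched with the standard library alone. This is a toy illustration, not how Parquet or this package is implemented: the three items, the `query_rows` and `query_columns` helpers, and the example URLs are all hypothetical, and the spatial part of the query is left out to keep the sketch short.

```python
import json

# Hypothetical, hand-written items; real STAC Items carry many more fields.
ITEMS = [
    {
        "id": "a",
        "properties": {"datetime": "2024-06-03T15:30:00Z", "eo:cloud_cover": 12.0},
        "assets": {"B04": {"href": "https://example.com/a/B04.tif"}},
    },
    {
        "id": "b",
        "properties": {"datetime": "2024-06-17T15:30:00Z", "eo:cloud_cover": 64.0},
        "assets": {"B04": {"href": "https://example.com/b/B04.tif"}},
    },
    {
        "id": "c",
        "properties": {"datetime": "2024-07-02T15:30:00Z", "eo:cloud_cover": 5.0},
        "assets": {"B04": {"href": "https://example.com/c/B04.tif"}},
    },
]


def query_rows(json_lines):
    """Row-oriented access: each item is fully parsed before it can be filtered."""
    hits = []
    for line in json_lines:
        item = json.loads(line)  # parses assets and every other field too
        props = item["properties"]
        if props["datetime"].startswith("2024-06") and props["eo:cloud_cover"] < 20:
            hits.append(item["id"])
    return hits


def to_columns(items):
    """Pivot the items into one list per field, mimicking a columnar layout."""
    return {
        "id": [i["id"] for i in items],
        "datetime": [i["properties"]["datetime"] for i in items],
        "eo:cloud_cover": [i["properties"]["eo:cloud_cover"] for i in items],
        "assets": [i["assets"] for i in items],
    }


def query_columns(columns):
    """Column-oriented access: only the three columns the query needs are read."""
    return [
        item_id
        for item_id, dt, cloud in zip(
            columns["id"], columns["datetime"], columns["eo:cloud_cover"]
        )
        if dt.startswith("2024-06") and cloud < 20
    ]


json_lines = [json.dumps(item) for item in ITEMS]
columns = to_columns(ITEMS)

print(query_rows(json_lines))
print(query_columns(columns))
```

Both functions return the same item ids. The difference is that `query_columns` never touches the `assets` column, which is the property a Parquet reader exploits when it loads only the columns a query references.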
See the STAC-GeoParquet specification for details on the exact schema of the written Parquet files.
Installation
Install via pip or conda:
pip install stac-geoparquet
conda install conda-forge::stac-geoparquet
Documentation
Development
Get uv, then:
git clone git@github.com:stac-utils/stac-geoparquet.git
cd stac-geoparquet
uv sync
uv run pre-commit install
uv run pytest
scripts/lint
Validate the example collection metadata against the jsonschema:
check-jsonschema --schemafile spec/json-schema/metadata.json spec/example-metadata.json