Arrow-native STAC asset lock packaging

stacpkg

Reproducible STAC packages for handoff, audit, verification, and relocation.

stacpkg turns selected STAC Items into a compact package that another environment can inspect, validate, share, and move. It is for the moment after a STAC search, when "these are the items" needs to become a durable artifact.

package = selected STAC Items + verifiable asset lock + optional content

The package keeps two required tables:

  • items.parquet: the selected STAC Items as a STAC GeoParquet-style table.
  • assets.lock.parquet: one row per locked STAC Asset, with structured location fields and observed object facts such as size, ETag, and last modified time when available.

With that lock in place, you can verify that referenced assets still match, relocate assets into controlled storage, enrich STAC metadata with alternate hrefs, and move the package through an OCI registry.

Install

pip install stacpkg

The quickstart below fetches Items with curl from a STAC API, so curl must be available. stacpkg itself can start from STAC JSON, STAC NDJSON, or existing STAC GeoParquet.

Quickstart

Search the OpenAerialMap STAC API for two Items over Austria and build a package from them:

tmpdir=$(mktemp -d "${TMPDIR:-/tmp}/stacpkg-openaerialmap-austria.XXXXXX")
bbox='16.415,47.1705,16.431,47.734'

curl -fsS https://api.imagery.hotosm.org/stac/search \
  --header "Accept: application/geo+json" \
  --header "Content-Type: application/json" \
  --data-binary "{
    \"collections\": [\"openaerialmap\"],
    \"bbox\": [$bbox],
    \"sortby\": [{\"field\": \"start_datetime\", \"direction\": \"asc\"}],
    \"limit\": 2
  }" \
  | stacpkg items from-json \
  | stacpkg build "$tmpdir/openaerialmap-austria.pkg"

echo "created $tmpdir/openaerialmap-austria.pkg"

This curl example keeps the request intentionally small and does not page through all matches. For larger catalogs, use a scalable STAC client such as rustac to stream newline-delimited STAC Items into stacpkg items from-ndjson.

Sample output:

created /tmp/stacpkg-openaerialmap-austria.ABC123/openaerialmap-austria.pkg
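
The escaped quotes in the quickstart request body are easy to get wrong when the query grows. One possible alternative, sketched here, is to generate the same body from a here-document. The helper name search_body is hypothetical and not part of stacpkg; the fields and values are the ones used in the quickstart.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of stacpkg): print the quickstart's STAC
# search body as JSON. $1 is a comma-separated bbox, $2 is the item limit.
search_body() {
  local bbox=$1 limit=$2
  cat <<EOF
{
  "collections": ["openaerialmap"],
  "bbox": [$bbox],
  "sortby": [{"field": "start_datetime", "direction": "asc"}],
  "limit": $limit
}
EOF
}

# Possible usage, feeding the body to curl on stdin with --data-binary @- :
#   search_body "$bbox" 2 \
#     | curl -fsS https://api.imagery.hotosm.org/stac/search \
#         --header "Accept: application/geo+json" \
#         --header "Content-Type: application/json" \
#         --data-binary @- \
#     | stacpkg items from-json \
#     | stacpkg build "$tmpdir/openaerialmap-austria.pkg"
```

The helper does no escaping, so it is only suitable for trusted, numeric arguments like the ones shown.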

The package is just files on disk:

/tmp/stacpkg-openaerialmap-austria.ABC123/openaerialmap-austria.pkg/
  items.parquet
  assets.lock.parquet
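
Because a package is a plain directory, a receiving environment can do a quick layout check before running anything heavier. The sketch below only tests that the two required tables exist; it does not open or validate the Parquet files. The function name check_pkg_layout is hypothetical and not part of stacpkg.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of stacpkg): succeed only if the directory
# contains both required package tables named in this README.
check_pkg_layout() {
  local pkg_dir=$1 missing=0 name
  for name in items.parquet assets.lock.parquet; do
    if [ ! -f "$pkg_dir/$name" ]; then
      # Report each missing table on stderr and remember the failure.
      echo "missing: $pkg_dir/$name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage:
#   check_pkg_layout "$tmpdir/openaerialmap-austria.pkg" && echo "layout ok"
```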

Inspect it:

stacpkg inspect "$tmpdir/openaerialmap-austria.pkg" --format markdown

Sample output:

# stacpkg Inspect

- Package: `/tmp/stacpkg-openaerialmap-austria.ABC123/openaerialmap-austria.pkg`
- Items: 2
- Collections: openaerialmap
- Assets: 4
- Asset keys: thumbnail, visual
- Known asset bytes: 16750068

Verify Assets

Validate the package asset lock against the current live objects:

stacpkg asset-lock from-parquet "$tmpdir/openaerialmap-austria.pkg/assets.lock.parquet" \
  | stacpkg asset-lock validate

Sample output (excerpt):

{"asset_key":"thumbnail","errors":[],"item_id":"631ee6653cdf1c0006b63c5b","store_type":"https","valid":true}
{"asset_key":"visual","errors":[],"item_id":"631ee6653cdf1c0006b63c5b","store_type":"https","valid":true}

Validation prints JSON lines and exits non-zero when an asset no longer matches the locked facts.
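
For an audit log it can help to reduce those JSON lines to a one-line summary while keeping a failing exit status. The sketch below is one way to do that in plain bash. It assumes the compact "valid":true and "valid":false serialization seen in the sample output (the false form is an assumption, since only passing records are shown), and the function name summarize_validation is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of stacpkg): read validation JSON lines on
# stdin, print "valid=N invalid=M", and return non-zero if any are invalid.
# Assumes compact JSON as in the sample output; lines matching neither
# pattern are ignored.
summarize_validation() {
  local valid=0 invalid=0 line
  # The "|| [ -n ... ]" keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      *'"valid":true'*)  valid=$((valid + 1)) ;;
      *'"valid":false'*) invalid=$((invalid + 1)) ;;
    esac
  done
  echo "valid=$valid invalid=$invalid"
  [ "$invalid" -eq 0 ]
}

# Possible usage; pipefail makes the pipeline fail if any stage fails:
#   set -o pipefail
#   stacpkg asset-lock from-parquet "$tmpdir/openaerialmap-austria.pkg/assets.lock.parquet" \
#     | stacpkg asset-lock validate \
#     | tee "$tmpdir/validation.jsonl" \
#     | summarize_validation
```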

Relocate Assets

Copy locked assets into storage you control and write a new asset lock for the relocated locations:

mkdir -p "$tmpdir/local-assets"

stacpkg asset-lock from-parquet "$tmpdir/openaerialmap-austria.pkg/assets.lock.parquet" \
  | stacpkg asset-lock relocate \
      --source-prefix https://oin-hotosm-temp.s3.amazonaws.com/ \
      --store-type file \
      --key "$tmpdir/local-assets/" \
      --max-workers 4 \
      --memory-limit-bytes 512MiB \
      --chunk-size-bytes 8MiB \
  | stacpkg asset-lock to-parquet \
      "$tmpdir/openaerialmap-austria.local.assets.lock.parquet"

echo "created $tmpdir/openaerialmap-austria.local.assets.lock.parquet"

Sample output:

created /tmp/stacpkg-openaerialmap-austria.ABC123/openaerialmap-austria.local.assets.lock.parquet

Validate the relocated files the same way:

stacpkg asset-lock from-parquet "$tmpdir/openaerialmap-austria.local.assets.lock.parquet" \
  | stacpkg asset-lock validate
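
As an independent sanity check alongside validation, the bytes that landed under the local target can be totalled and compared by eye with the "Known asset bytes" figure from stacpkg inspect. This is a sketch under assumptions: that relocated files end up somewhere beneath the directory passed as --key, and that the inspect figure is the sum of asset sizes. The function name total_bytes is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of stacpkg): print the total size in bytes
# of all regular files under a directory. It streams every file through
# wc -c, which is portable but reads all data, so it suits small packages.
total_bytes() {
  local dir=$1
  # The arithmetic expansion strips the padding some wc builds emit.
  printf '%s\n' "$(( $(find "$dir" -type f -exec cat {} + | wc -c) ))"
}

# Possible usage:
#   total_bytes "$tmpdir/local-assets"
```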

Common Flows

  • Start from a STAC API search, package selected Items, and keep the exact package inputs.
  • Verify remote assets before a run, handoff, or audit.
  • Relocate referenced assets into S3-compatible, local, or other object-store locations.
  • Enrich STAC Items with File Info and Alternate Assets fields from an asset lock.
  • Push and pull packages through OCI registries.

Development Commands

Use the repository Makefile as the source of truth for local quality gates:

  • make sync: install all dependency groups.
  • make pre-commit: run formatting, lint, and metadata checks.
  • make test-unit: run fast unit tests.
  • make test-integration: run optional local cross-tool integration tests.
  • make test-e2e: run the CI-sized kind/MinIO/registry e2e suite.
  • make test-e2e-full: run all e2e tests, including performance checks.
  • make test-all: run pre-commit, docs, unit, integration, and full e2e gates.

License

Apache 2.0 (Apache License Version 2.0, January 2004) https://www.apache.org/licenses/LICENSE-2.0
