A (git-)versioned STAC catalog storage and management system

STAC Repository

A (git-)versioned STAC catalog storage and management system.

The project is in a late stage of development.

Introduction and Features

stac-repository is a storage system and command-line interface for managing STAC catalogs. It implements the advanced features needed to build and maintain a complex STAC catalog:

  • Automated Ingestions (Non-STAC)

    Automated ingestion of non-STAC products via custom stac-processor modules. These are designed to be easy to implement through the StacProcessor Protocol. To view installed processors, run stac-repository show-processors.

  • Automated Ingestions (STAC)

    Automated ingestion of STAC items and collections using a built-in stac-processor.

  • Backend Support

    Support for multiple storage backends, including built-in Git+LFS "git" and local filesystem "file". To view installed backends, run stac-repository show-backends. The architecture is also designed to facilitate the development of additional backends (e.g. FTP, NoSQL databases).

  • Transactional Operations

    Transactional ingestions, updates, and deletions to ensure catalog integrity and atomicity.

  • Immutable History

    Immutable history of all transactions (note: this feature is not supported by the local filesystem backend).

  • Backup and Rollback

    Backup and rollback of the catalog at any point in history.

  • Export

    Export command to extract the catalog to the local filesystem independently of the underlying storage backend.

These capabilities make stac-repository a powerful tool for robust STAC catalog management.
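The transactional behaviour listed above can be illustrated with a small, backend-independent sketch. It is not how stac-repository implements transactions; the `transaction` helper is a hypothetical name introduced here, and it only shows the general idea of staging changes in a copy and publishing them when every step has succeeded.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def transaction(catalog_dir):
    """Yield a staging copy of catalog_dir; publish it only if the block succeeds.

    Hypothetical helper for illustration, not part of stac-repository.
    """
    catalog_dir = Path(catalog_dir)
    # Stage next to the catalog so the final renames stay on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=".staging-", dir=catalog_dir.parent))
    shutil.copytree(catalog_dir, staging, dirs_exist_ok=True)
    try:
        yield staging
    except BaseException:
        # Any failure discards the staged changes; the catalog is untouched.
        shutil.rmtree(staging)
        raise
    # Success: swap the staged copy in, then drop the old version.
    previous = catalog_dir.with_name(catalog_dir.name + ".previous")
    catalog_dir.rename(previous)
    staging.rename(catalog_dir)
    shutil.rmtree(previous)
```

Changes are written to the yielded staging directory inside a `with transaction(path) as staged:` block. If the block raises, the original directory is left exactly as it was. The final swap uses two renames, so it is close to atomic but not strictly so; a Git-based backend would rely on commits instead.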

Installation

stac-repository is available directly on PyPI.

pip install stac-repository

This installs two commands: stac-repository, and stac-processor, which lets you try out a processor without ingesting.

Usage

stac-repository --help
Usage: stac_repository_cli [OPTIONS] COMMAND [ARGS]...

 🌍🛰️     STAC Repository

 The interface to manage STAC catalogs.

╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --repository                TEXT  Repository URI - interpreted by the chosen backend. [default: None]                         │
│ --backend                   TEXT  Backend. [default: file]                                                                    │
│ --install-completion              Install completion for the current shell.                                                   │
│ --show-completion                 Show completion for the current shell, to copy it or customize the installation.            │
│ --help                            Show this message and exit.                                                                 │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ version           Shows stac-repository version number.                                                                       │
│ show-backends     Shows installed stac-repository backends.                                                                   │
│ show-processors   Shows installed stac-repository processors.                                                                 │
│ init              Initializes the repository.                                                                                 │
│ config            Get or set the repository configuration options - interpreted by the chosen backend.                        │
│ ingest            Ingests some products from various sources (eventually using an installed processor).                       │
│ prune             Removes some products from the catalog.                                                                     │
│ history           Logs the catalog history.                                                                                   │
│ rollback          Rollbacks the catalog to a previous commit. Support depends on the chosen backend.                          │
│ export            Exports the catalog. If a commit ref is specified, exports the catalog as it was at that point in time.     │
│ backup            Backups the repository. If a commit ref is specified, backups the repository only up to this point in time. │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Let's initialize a new repository using the default "file" backend (implicit --backend file):

stac-repository --repository test_repository init

Since we didn't specify --root-catalog, we are prompted for some basic information:

Initialize from an existing root catalog file ? (Leave blank to use the interactive initializer):
id (root): test_catalog
title: A Simple Demo Catalog
description: This is a simple demo catalog.
license (proprietary):
{
    'id': 'test_catalog',
    'description': 'This is a simple demo catalog.',
    'stac_version': '1.0.0',
    'links': [],
    'title': 'A Simple Demo Catalog',
    'type': 'Catalog',
    'license': 'proprietary'
}
Use as root catalog ? [y/n] (n): y

The newly created catalog:

test_repository/
test_repository/catalog.json
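The root catalog written by init can be sanity-checked with the standard library alone. The sketch below assumes catalog.json holds the object printed by the interactive initializer as plain JSON; `missing_catalog_fields` is a hypothetical helper, and the field list is the set the STAC 1.0.0 specification requires on a Catalog.

```python
import json
from pathlib import Path

# Fields the STAC 1.0.0 specification requires on a Catalog object.
REQUIRED_CATALOG_FIELDS = ("type", "stac_version", "id", "description", "links")


def missing_catalog_fields(catalog_path):
    """Return the required Catalog fields absent from the given JSON file."""
    catalog = json.loads(Path(catalog_path).read_text(encoding="utf-8"))
    return [field for field in REQUIRED_CATALOG_FIELDS if field not in catalog]
```

Calling `missing_catalog_fields("test_repository/catalog.json")` should return an empty list if the file matches the object shown by the initializer.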

Let's ingest a STAC product using the default processor (implicit --processor stac):

stac-repository --repository test_repository ingest ~/test_catalogs/thermavolc/
 • ~/test_catalogs/thermavolc/ : Discovered products ~/test_catalogs/thermavolc/collection.json
 • ~/test_catalogs/thermavolc/collection.json : Cataloged

Our catalog now looks like this:

test_repository/
test_repository/orthophotos-Summit-20221116
test_repository/orthophotos-Summit-20221116/collection.json
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Visible
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Visible/20221116_Summit_Visible_orthophoto.png
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Visible/20221116_Summit_Visible_orthophoto.tif.aux.xml
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Visible/20221116_Summit_Visible_orthophoto.tif
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Visible/orthophoto-Summit-20221116-Visible.json
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM/20221116_Summit_DTM.tif.aux.xml
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM/orthophoto-Summit-20221116-DTM.json
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM/20221116_Summit_DTM.png
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM/20221116_Summit_DTM.tif
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal/20221116_Summit_Thermal_orthophoto.png
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal/20221116_Summit_Thermal_orthophoto.tif
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal/orthophoto-Summit-20221116-Thermal.json
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal/20221116_Summit_Thermal_orthophoto.tif.aux.xml
test_repository/catalog.json

Now let's delete a product:

stac-repository --repository test_repository prune orthophoto-Summit-20221116-Visible
 • orthophoto-Summit-20221116-Visible : Uncataloged

Finally, our demo catalog looks like this:

test_repository/
test_repository/orthophotos-Summit-20221116
test_repository/orthophotos-Summit-20221116/collection.json
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM/20221116_Summit_DTM.tif.aux.xml
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM/orthophoto-Summit-20221116-DTM.json
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM/20221116_Summit_DTM.png
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-DTM/20221116_Summit_DTM.tif
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal/20221116_Summit_Thermal_orthophoto.png
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal/20221116_Summit_Thermal_orthophoto.tif
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal/orthophoto-Summit-20221116-Thermal.json
test_repository/orthophotos-Summit-20221116/orthophoto-Summit-20221116-Thermal/20221116_Summit_Thermal_orthophoto.tif.aux.xml
test_repository/catalog.json
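With the "file" backend the catalog is an ordinary directory tree, so its contents can be summarised without any STAC library. The sketch below uses a hypothetical helper, `count_stac_objects`, and assumes every STAC document is a .json file carrying the standard top-level "type" field ("Catalog", "Collection", or "Feature" for items).

```python
import json
from collections import Counter
from pathlib import Path


def count_stac_objects(root):
    """Count STAC objects under root by their top-level "type" field."""
    counts = Counter()
    for path in sorted(Path(root).rglob("*.json")):
        try:
            document = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # not readable as JSON; skip it
        # Only count JSON objects that declare a string "type".
        if isinstance(document, dict) and isinstance(document.get("type"), str):
            counts[document["type"]] += 1
    return counts
```

For the demo catalog above, `count_stac_objects("test_repository")` would be expected to report one Catalog, one Collection, and two Feature objects, assuming the four JSON files are valid STAC documents.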

Motivation - Why this Project?

Full motivation is detailed in motivation.md.

Presenting scientific data products as STAC catalogs is primarily motivated by the objective of achieving data FAIR-ness (Findable, Accessible, Interoperable, Reusable). The mature STAC ecosystem makes building such a catalog relatively straightforward for simple cases (e.g. pystac | Creating a Landsat 8 STAC).

However, building and maintaining a complex STAC catalog - one subject to incremental changes over an extended period and encompassing diverse product types (e.g., satellite scenes, InSAR interferograms, InSAR time series) - introduces significant challenges. Effective maintenance requires the ability to roll back data, back it up, and explore historical changes. Routine data ingestion, in turn, requires automating product conversion and ingestion, which itself requires transactional operations to ensure data integrity.

stac-repository was developed precisely to address these complex catalog management challenges. It provides a robust storage system and CLI that integrates with, and abstracts away the complexities of, the chosen backend (such as Git+LFS). By treating the STAC catalog as a versioned data product, stac-repository can offer transactional integrity, immutable history, and backup/rollback capabilities without requiring users to interact directly with the underlying backend.

While stac-repository greatly simplifies complex STAC catalog management, the underlying architecture introduces limitations. The Git+LFS backend, for instance, provides strong versioning capabilities but introduces a dependency on Git and Git LFS, which may require some foundational understanding for advanced operations or troubleshooting. For extremely large catalogs with millions of items, or for very high update frequencies, the performance of the current backends will not be sufficient. And while the local filesystem backend simplifies setup, it forgoes the immutable history provided by Git-based backends.

The Processor Protocol

A processor is a Python module implementing the processor protocol described in processor.py.

An example can be found in stac-processor.py.
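To give a feel for what implementing a protocol means in Python, here is a minimal sketch built on typing.Protocol. It is not the protocol defined in processor.py: the class name `ProcessorSketch` and the methods `discover` and `process` are assumptions made for illustration, loosely inspired by the "Discovered products" and "Cataloged" steps printed by the ingest command. Refer to processor.py for the real method names and signatures.

```python
from pathlib import Path
from typing import Iterator, Protocol, runtime_checkable


@runtime_checkable
class ProcessorSketch(Protocol):
    """Hypothetical processor shape; NOT the protocol from processor.py."""

    def discover(self, source: str) -> Iterator[str]:
        """Yield the products found at the given source."""
        ...

    def process(self, product_source: str) -> str:
        """Convert one product and return the path to its STAC file."""
        ...


class CollectionFileProcessor:
    """Toy processor: treats a directory's collection.json as its only product."""

    def discover(self, source: str) -> Iterator[str]:
        # glob() yields nothing when the file does not exist.
        yield from sorted(str(p) for p in Path(source).glob("collection.json"))

    def process(self, product_source: str) -> str:
        # The product is assumed to be STAC already, so there is nothing to convert.
        return product_source
```

Because the protocol is structural, `CollectionFileProcessor` satisfies it without inheriting from it; `isinstance(CollectionFileProcessor(), ProcessorSketch)` only checks that the methods are present.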

Source & Contributing

just --list

See the Justfile.

History

stac-repository is being actively developed at the OPGC, an observatory for the sciences of the universe (OSU) belonging to the CNRS and the UCA, by its main author Pierre Fontbonne @fntb.

License

OPEN LICENCE 2.0
