Command-line interface (CLI) to rate and crawl STAC

heystac

A command-line utility (CLI) for rating and crawling STAC catalogs. heystac generates the ratings for https://www.gadom.ski/heystac/.

Usage

python -m pip install heystac
heystac --help

To rate a STAC catalog, collection, or item:

$ heystac rate https://landsatlook.usgs.gov/stac-server/collections/landsat-c2l2-st/items/LC09_L2SP_090091_20241118_20241119_02_T2_ST
5.0 ★★★★★

Any issues will be printed to standard output:

$ heystac rate https://landsatlook.usgs.gov/stac-server/collections/landsat-c2l2-st
1.7 ★★

High importance issues
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
|    Rule id    |                                                                          Message                                                                           |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| validate-core | Validation failed for Collection with ID landsat-c2l2-st against schema at https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json |
+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+

To run JSON Schema validation on a STAC value:

$ heystac validate https://landsatlook.usgs.gov/stac-server/collections/landsat-c2l2-st 2>&1 | tail -n7
Failed validating 'pattern' in schema['allOf'][0]['properties']['license']:
    {'title': 'Collection License Name',
     'type': 'string',
     'pattern': '^[\\w\\-\\.\\+]+$'}

On instance['license']:
    'https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/atoms/files/Landsat_Data_Policy.pdf'

To crawl a catalog and save the crawl to a directory:

heystac crawl https://landsatlook.usgs.gov/stac-server usgs-landsat

Definitions

We've made some opinionated decisions about behavior in this CLI.

Rate

A Rating is generated by applying a set of Rules to a STAC value. This produces one Check per rule. Each Check has a score between zero and one:

  • 0: the STAC value failed the check
  • 1: the STAC value passed the check
  • A value between 0 and 1: the STAC value partially passed the check, e.g. a check for valid links where some links were valid and some were not
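
As an illustration of a partial score, here is a minimal sketch of a link check. It assumes the score is the fraction of links that are valid; heystac's actual rule may compute it differently. The function name and the choice to treat "no links" as a pass are hypothetical.

```python
def link_check_score(link_ok: list[bool]) -> float:
    """Score a link check as the fraction of valid links.

    Assumption: each entry says whether one link was valid. This is an
    illustrative scoring rule, not necessarily the one heystac uses.
    """
    if not link_ok:
        # Assumption: a value with no links has nothing to fail.
        return 1.0
    return sum(link_ok) / len(link_ok)
```

Under this assumption, a value with two valid links and two broken ones would score halfway between a fail and a pass.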

Each rule also has an Importance:

  • high
  • medium
  • low

heystac applies a configurable weight to each check based on its importance, then combines the weighted checks into a score for the STAC value. That score is converted to stars by the following formula: 5 * score / total, where total is the maximum possible score, i.e. the score the value would get if it passed every check.
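
The star formula can be sketched in a few lines of Python. This is a sketch under stated assumptions, not heystac's implementation: the weight values, the `Check` class, and the `stars` function are hypothetical, and it assumes the weighted check scores are summed.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration; the real values come from
# heystac's configuration.
WEIGHTS = {"high": 8.0, "medium": 2.0, "low": 1.0}


@dataclass
class Check:
    rule_id: str
    importance: str  # "high", "medium", or "low"
    score: float  # between 0 and 1


def stars(checks: list[Check], weights: dict[str, float] = WEIGHTS) -> float:
    """Return 5 * score / total for a list of checks."""
    # Weighted sum of the scores actually achieved.
    score = sum(weights[c.importance] * c.score for c in checks)
    # Maximum possible score: every check scores 1.
    total = sum(weights[c.importance] for c in checks)
    if total == 0:
        # Assumption: with no checks there is nothing to rate.
        return 0.0
    return 5 * score / total
```

With these made-up weights, a failed high-importance check lowers the rating far more than a failed low-importance one, which matches the idea of weighting by importance.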

Crawl

When heystac crawls a STAC API, it gets every collection and one item from each collection. The catalog is saved to the local filesystem in the following layout:

  • catalog.json
  • collection-a/collection.json
  • collection-a/item-from-collection-a.json
  • collection-b/collection.json
  • collection-b/item-from-collection-b.json

The item file names are generated from the item ID, with all / characters replaced by _.
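
The naming rule can be sketched as follows. The helper names are hypothetical, and the sketch assumes that each collection directory is named after the collection ID and that `.json` is appended to the sanitized item ID, as the layout above suggests.

```python
from pathlib import Path


def item_file_name(item_id: str) -> str:
    """Replace every "/" in the item ID with "_" and add a .json suffix."""
    return item_id.replace("/", "_") + ".json"


def item_path(root: str, collection_id: str, item_id: str) -> Path:
    """Build the path of an item file inside a crawl directory.

    Assumption: the collection directory is named after the collection ID.
    """
    return Path(root) / collection_id / item_file_name(item_id)
```

Replacing `/` keeps an item ID that contains slashes from being interpreted as nested directories when the file is written.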

Configuration

heystac comes with a default configuration that should work for most use cases. If you want to customize anything, such as the importance weights or the rule descriptions, save the default configuration to a file called heystac.toml:

heystac config > heystac.toml

You can then edit that file to your heart's content. By default, the CLI will read heystac.toml in your current working directory. To specify a config file in another location:

heystac --config a/nother/path/config.toml

License

MIT
