NOAA High-Resolution Rapid Refresh (HRRR) stactools package

stactools-noaa-hrrr

Figure: wind speed forecast from 2024-05-10T12:00:00Z for 2024-05-10T14:00:00Z

This package can be used to generate STAC metadata for the NOAA High Resolution Rapid Refresh (HRRR) atmospheric forecast dataset.

The data are available in cloud storage on AWS, Azure, and Google. Use the cloud_provider argument to the functions in stactools.noaa_hrrr.stac to choose which provider the grib and index hrefs point to.

Background

The NOAA HRRR dataset is a continuously updated atmospheric forecast data product.

Data structure

  • There are two regions: CONUS and Alaska
  • Every hour, new hourly forecasts are generated for many atmospheric attributes for each region
    • All hours (00-23) get an 18-hour forecast in the conus region
    • Forecasts are generated every three hours (00, 03, 06, etc.) in the alaska region
    • On hours 00, 06, 12, and 18, a 48-hour forecast is generated
    • One of the products (subh) gets 15-minute forecasts (four per hour per attribute). The sub-hourly forecasts are stored as layers within a single GRIB2 file for the forecast hour rather than in separate files.
  • The forecasts are broken up into four products (sfc, prs, nat, subh)
  • Each GRIB2 file has hundreds to thousands of variables
  • Each .grib2 file is accompanied by a .grib2.idx file with variable-level metadata. This includes the starting byte of the data for each variable (useful for making range requests instead of reading the entire file) and some other descriptive metadata.
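
As a rough illustration of how the index sidecar supports range requests, the sketch below parses index text into per-message byte ranges. It assumes the colon-delimited wgrib2-style inventory layout (message number, start byte, date, variable, level, forecast) and infers each message's size from the start byte of the next message. The sample lines and their byte offsets are invented for illustration, and IdxEntry and parse_idx are hypothetical names that are not part of this package.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IdxEntry:
    message: int               # position of the message within the GRIB2 file
    start_byte: int            # offset of the first byte of the message
    byte_size: Optional[int]   # None for the last message (file length unknown)
    variable: str
    level: str
    forecast: str


def parse_idx(text: str) -> List[IdxEntry]:
    """Parse .grib2.idx text, assuming one colon-delimited line per message."""
    rows = [line.split(":") for line in text.splitlines() if line.strip()]
    entries = []
    for i, row in enumerate(rows):
        start = int(row[1])
        # The size of a message is the gap to the next message's start byte.
        size = int(rows[i + 1][1]) - start if i + 1 < len(rows) else None
        entries.append(IdxEntry(int(row[0]), start, size, row[3], row[4], row[5]))
    return entries


# Invented sample content; real index files have hundreds of lines.
sample = """\
1:0:d=2024051012:REFC:entire atmosphere:2 hour fcst:
2:375441:d=2024051012:RETOP:cloud top:2 hour fcst:
3:588115:d=2024051012:VIL:entire atmosphere:2 hour fcst:
"""

entries = parse_idx(sample)
for entry in entries:
    print(entry.message, entry.variable, entry.start_byte, entry.byte_size)
```

The last message has no following start byte, so its size would have to come from the total file length (for example, from a Content-Length header), which this sketch leaves as None.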

Summary of Considerations for Organizing STAC Metadata

After extensive discussions, we decided to organize the STAC metadata with the following structure:

  1. Collections: Separate collections for each region-product combination

    • regions: conus and alaska
    • products: sfc, prs, nat, and subh
  2. Items: Each GRIB file in the archive is represented as an item with two assets:

    • "grib": Contains the actual data.
    • "index": The .grib2.idx sidecar file.

    Each GRIB file contains the forecasts for all of a product's variables for a particular forecast hour from a reference time, so you need to combine data from multiple items to construct a time series for a forecast.

  3. grib:layers: Within each "grib" asset, a grib:layers property details each layer's information, including description, units, and byte ranges. This enables applications to access specific parts of the GRIB2 files without downloading the entire file.

    • We intend to propose a GRIB STAC extension with the grib:layers property for storing byte ranges after testing this specification on other GRIB2 datasets.
    • The layer-level metadata is worth storing in STAC because you can construct URIs for specific layers that GDAL can read using either /vsisubfile or vrt://:
      • /vsisubfile/{start_byte}_{byte_size},/vsicurl/{grib_href}
      • vrt:///vsicurl/{grib_href}?bands={grib_message}, where grib_message is the index of the layer within the GRIB2 file.
        • Under the hood, GDAL's VRT driver reads the sidecar .grib2.idx file and translates it into a /vsisubfile URI.
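
The two URI patterns above can be assembled with plain string formatting. The sketch below is only an illustration: the function names and the example href are hypothetical, and the byte values are invented rather than taken from a real item.

```python
def vsisubfile_uri(grib_href: str, start_byte: int, byte_size: int) -> str:
    """Build a GDAL /vsisubfile URI that addresses one layer by byte range."""
    return f"/vsisubfile/{start_byte}_{byte_size},/vsicurl/{grib_href}"


def vrt_uri(grib_href: str, grib_message: int) -> str:
    """Build a GDAL vrt:// URI that addresses one layer by its message index."""
    return f"vrt:///vsicurl/{grib_href}?bands={grib_message}"


# Hypothetical href and byte range, for illustration only.
href = "https://example.com/hrrr.t12z.wrfsfcf02.grib2"
print(vsisubfile_uri(href, start_byte=375441, byte_size=212674))
print(vrt_uri(href, grib_message=2))
```

Either string should be usable wherever a GDAL dataset path is accepted, although that is not exercised here. The /vsisubfile form needs the byte range up front, which is what grib:layers stores, while the vrt:// form only needs the message index.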

Advantages

  • Applications can use grib:layers to build layer-specific datasets and read only the bytes they need instead of whole GRIB2 files.
  • Splitting by region and product makes it possible to define coherent collection-level datacube metadata.

Disadvantages

  • Storing layer-level metadata like byte ranges in the STAC metadata bloats the STAC items because there are hundreds to thousands of layers in each GRIB2 file.

For more details, please refer to the related issue discussion and pull requests #3 and #6.

STAC examples

Python usage example

  • Check out the example notebook for examples of how to create STAC metadata and how to use STAC items with grib:layers metadata to load the data into xarray.

Installation

Install stactools-noaa-hrrr with pip:

pip install stactools-noaa-hrrr

Command-line usage

To create a collection object:

stac noaahrrr create-collection {region} {product} {cloud_provider} {destination_file}

e.g.

stac noaahrrr create-collection conus sfc azure example-collection.json

To create an item:

stac noaahrrr create-item \
  {region} \
  {product} \
  {cloud_provider} \
  {reference_datetime} \
  {forecast_hour} \
  {destination_file}

e.g.

stac noaahrrr create-item conus sfc azure 2024-05-01T12 10 example-item.json
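
Because each item covers a single forecast hour, building a full forecast from one reference time means running create-item once per forecast hour. The sketch below only assembles the argument lists for those calls and does not run them. It assumes the conus rule from the Data structure section (48 hours on the 00, 06, 12, and 18 cycles, 18 hours otherwise) and assumes that forecast hour 0 is included. The helper names and the output file naming are hypothetical.

```python
from typing import List


def forecast_horizon(reference_hour: int) -> int:
    """Longest forecast hour for a conus cycle (assumed rule, see lead-in)."""
    return 48 if reference_hour % 6 == 0 else 18


def create_item_commands(
    region: str,
    product: str,
    cloud_provider: str,
    reference_datetime: str,
    destination_folder: str,
) -> List[List[str]]:
    """Build one `stac noaahrrr create-item` argument list per forecast hour."""
    # reference_datetime is expected to look like "2024-05-01T12".
    reference_hour = int(reference_datetime.split("T")[1][:2])
    commands = []
    for forecast_hour in range(forecast_horizon(reference_hour) + 1):
        # Hypothetical file naming; any unique destination path would do.
        destination = (
            f"{destination_folder}/{region}-{product}-"
            f"{reference_datetime}-f{forecast_hour:02d}.json"
        )
        commands.append([
            "stac", "noaahrrr", "create-item",
            region, product, cloud_provider,
            reference_datetime, str(forecast_hour), destination,
        ])
    return commands


commands = create_item_commands("conus", "sfc", "azure", "2024-05-01T12", "/tmp/items")
print(len(commands))
print(" ".join(commands[10]))
```

Each argument list could then be passed to subprocess.run(command, check=True) once the package is installed.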

To create all items for a date range:

stac noaahrrr create-item-collection \
  {region} \
  {product} \
  {cloud_provider} \
  {start_date} \
  {end_date} \
  {destination_folder}

e.g.

stac noaahrrr create-item-collection conus sfc azure 2024-05-01 2024-05-31 /tmp/items

Use stac noaahrrr --help to see all subcommands and options.

Docker

You can launch a jupyterhub server in a Docker container with all of the dependencies installed using these commands:

docker/build
docker/jupyter

Contributing

We use pre-commit to check any changes. To set up your development environment:

pip install -e '.[dev]'
pre-commit install

To check all files:

pre-commit run --all-files

To run the tests:

pytest -vv

If you've updated the STAC metadata output, update the examples:

scripts/update-examples
