# stactools-noaa-hrrr

NOAA High-Resolution Rapid Refresh (HRRR) stactools package
- Name: `noaa-hrrr`
- Package: `stactools.noaa_hrrr`
- PyPI: stactools-noaa-hrrr on PyPI
- Owner: @hrodmn
- Dataset homepage
- STAC extensions used:
- Extra fields:
  - `noaa-hrrr:forecast_cycle_type`: either `standard` (18-hour) or `extended` (48-hour)
  - `noaa-hrrr:region`: either `conus` or `alaska`
- Browse the example in human-readable form
- Browse a notebook demonstrating the example item and collection
This package can be used to generate STAC metadata for the NOAA High Resolution Rapid Refresh (HRRR) atmospheric forecast dataset.
The data are uploaded to cloud storage in AWS, Azure, and Google, so you can
pick which cloud provider you want to use for the grib and index hrefs using
the `cloud_provider` argument to the functions in `stactools.noaa_hrrr.stac`.
## Background
The NOAA HRRR dataset is a continuously updated atmospheric forecast data product.
## Data structure
- There are two regions: CONUS and Alaska
- Every hour, new hourly forecasts are generated for many atmospheric
  attributes for each region
  - All hours (`00`-`23`) get an 18-hour forecast in the `conus` region
  - Forecasts are generated every three hours (`00`, `03`, `06`, etc.) in the
    `alaska` region
  - On hours `00`, `06`, `12`, and `18`, a 48-hour forecast is generated
  - One of the products (`subh`) gets 15-minute forecasts (four per hour per
    attribute), but the sub-hourly forecasts are stored as layers within a
    single GRIB2 file for the forecast hour rather than in separate files
- The forecasts are broken up into 4 products (`sfc`, `prs`, `nat`, `subh`)
- Each GRIB2 file has hundreds to thousands of variables
- Each `.grib2` file is accompanied by a `.grib2.idx` file which has
  variable-level metadata, including the starting byte for the data in that
  variable (useful for making range requests instead of reading the entire
  file) and some other descriptive metadata
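The range-request trick enabled by the sidecar file can be illustrated with a
short parser. The sample `.grib2.idx` lines below are invented for
illustration, but follow the real format
(`message_number:start_byte:d=YYYYMMDDHH:variable:level:forecast:`); each
message's byte range runs from its start byte up to the next message's start
byte.

```python
# Parse .grib2.idx lines into per-message records with byte ranges.
# The sample content is hypothetical but follows the real idx line format:
#   message_number:start_byte:d=YYYYMMDDHH:variable:level:forecast:
SAMPLE_IDX = """\
1:0:d=2024050112:REFC:entire atmosphere:10 hour fcst:
2:375155:d=2024050112:TMP:2 m above ground:10 hour fcst:
3:738422:d=2024050112:UGRD:10 m above ground:10 hour fcst:
"""

def parse_idx(text: str) -> list[dict]:
    rows = [line.split(":") for line in text.strip().splitlines()]
    records = []
    for i, row in enumerate(rows):
        start = int(row[1])
        # The last message runs to the end of the file, so its end byte is
        # unknown from the idx file alone.
        end = int(rows[i + 1][1]) - 1 if i + 1 < len(rows) else None
        records.append(
            {
                "message": int(row[0]),
                "variable": row[3],
                "level": row[4],
                "forecast": row[5],
                "byte_range": (start, end),
            }
        )
    return records

records = parse_idx(SAMPLE_IDX)
# A record's byte range can back an HTTP Range request for just that message,
# e.g. "Range: bytes=375155-738421" for the TMP message above.
```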
## Summary of Considerations for Organizing STAC Metadata
After extensive discussions, we decided to organize the STAC metadata with the following structure:
- **Collections**: Separate collections for each region-product combination
  - regions: `conus` and `alaska`
  - products: `sfc`, `prs`, `nat`, and `subh`
- **Items**: Each GRIB file in the archive is represented as an item with two
  assets:
  - `"grib"`: contains the actual data
  - `"index"`: the `.grib2.idx` sidecar file

  Each GRIB file contains the forecasts for all of a product's variables for a
  particular forecast hour from a reference time, so you need to combine data
  from multiple items to construct a time series for a forecast.
- `grib:messages`: Within each `"grib"` asset, a `grib:messages` property
  describes each layer's GRIB message index within the GRIB2 file. This
  enables applications to access specific parts of the GRIB2 files without
  downloading the entire file.
- `grib:layer_definitions`: This field is meant to be populated in the
  item-assets definitions at the collection level. It describes the variable
  name, units, and other relevant properties for each layer within the GRIB
  file.
- We intend to propose a GRIB STAC extension with the `grib:messages` and
  `grib:layer_definitions` properties for describing the GRIB message indexes
  of the variables within the GRIB files.
- The layer-level metadata is worth storing in STAC because you can construct
  URIs for specific layers that GDAL can read using the VRT URI syntax
  `vrt:///vsicurl/{grib_href}?bands={grib_message}`, where `grib_message` is
  the index of the layer within the GRIB2 file.
  - Under the hood, GDAL's `vrt` driver reads the sidecar `.grib2.idx` file
    and translates it into a `/vsisubfile` URI.
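Constructing such a URI is just string formatting. In this sketch, the href
and message index are placeholders, not a real file:

```python
# Build a GDAL-readable URI for one GRIB2 layer using the vrt:// syntax.
# Both values below are hypothetical, for illustration only.
grib_href = "https://example.com/hrrr.t12z.wrfsfcf10.grib2"  # asset href from a STAC item
grib_message = 42  # layer index from the item's grib:messages property

vrt_uri = f"vrt:///vsicurl/{grib_href}?bands={grib_message}"
# vrt_uri can now be handed to GDAL-based readers (gdal.Open, rasterio.open, ...)
# which will fetch only the bytes for that message.
```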
### Advantages
- Applications can use `grib:layer_definitions` and `grib:messages` to create
  layer-specific data sets, facilitating efficient data handling.
- Splitting by region and product allows defining coherent collection-level
  datacube metadata, enhancing accessibility.
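As a sketch of the first point: assuming `grib:layer_definitions` maps layer
keys to descriptive metadata and `grib:messages` maps the same keys to message
indexes (the exact property shapes are defined by the proposed extension, and
all values below are invented), an application could resolve a variable of
interest to a message index without touching the GRIB file:

```python
# Hypothetical collection-level layer definitions, keyed by layer id.
layer_definitions = {
    "TMP__2_m_above_ground": {"variable": "TMP", "unit": "K", "level": "2 m above ground"},
    "UGRD__10_m_above_ground": {"variable": "UGRD", "unit": "m/s", "level": "10 m above ground"},
}
# Hypothetical item-level message indexes for the same layer ids.
messages = {"TMP__2_m_above_ground": 71, "UGRD__10_m_above_ground": 77}

# Look up the GRIB message index for 2 m temperature using metadata alone.
wanted = [
    messages[key]
    for key, defn in layer_definitions.items()
    if defn["variable"] == "TMP" and defn["level"] == "2 m above ground"
]
```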
### Disadvantages
- Storing layer-level metadata like byte ranges in the STAC metadata bloats the
STAC items because there are hundreds to thousands of layers in each GRIB2 file.
For more details, please refer to the related issue discussion and pull requests #3 and #6.
## STAC examples
## Python usage example
- Check out the example notebook for examples of how to create STAC metadata
  and how to use STAC items with `grib:layers` metadata to load the data into
  xarray.
## Installation
Install stactools-noaa-hrrr with pip:

```shell
pip install stactools-noaa-hrrr
```
## Command-line usage
To create a collection object:

```shell
stac noaahrrr create-collection {region} {product} {cloud_provider} {destination_file}
```

e.g.

```shell
stac noaahrrr create-collection conus sfc azure example-collection.json
```
To create an item:

```shell
stac noaahrrr create-item \
  {region} \
  {product} \
  {cloud_provider} \
  {reference_datetime} \
  {forecast_hour} \
  {destination_file}
```

e.g.

```shell
stac noaahrrr create-item conus sfc azure 2024-05-01T12 10 example-item.json
```
To create all items for a date range:

```shell
stac noaahrrr create-item-collection \
  {region} \
  {product} \
  {cloud_provider} \
  {start_date} \
  {end_date} \
  {destination_folder}
```

e.g.

```shell
stac noaahrrr create-item-collection conus sfc azure 2024-05-01 2024-05-31 /tmp/items
```
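Conceptually, a date-range run covers one `create-item` call per forecast
cycle in the range. The sketch below enumerates those calls for `conus`
(hourly cycles, per the data structure above); the fixed forecast hour and
output paths are illustrative choices, not what the CLI does internally:

```python
from datetime import datetime, timedelta

def create_item_commands(start: str, end: str, forecast_hour: int = 0) -> list[str]:
    """Enumerate per-cycle `stac noaahrrr create-item` calls covering the
    reference datetimes in [start, end], assuming the hourly conus cycle."""
    t = datetime.fromisoformat(start)
    stop = datetime.fromisoformat(end)
    commands = []
    while t <= stop:
        ref = t.strftime("%Y-%m-%dT%H")  # reference_datetime argument
        commands.append(
            f"stac noaahrrr create-item conus sfc azure {ref} "
            f"{forecast_hour} /tmp/items/{ref}-FH{forecast_hour}.json"
        )
        t += timedelta(hours=1)
    return commands

# Three hourly cycles: 00, 01, and 02 UTC on 2024-05-01.
cmds = create_item_commands("2024-05-01", "2024-05-01T02")
```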
## Docker
You can launch a jupyterhub server in a docker container with all of the dependencies installed using these commands:
```shell
docker/build
docker/jupyter
```
Use `stac noaahrrr --help` to see all subcommands and options.
## Contributing
We use pre-commit to check any changes. To set up your development
environment, install uv, then run:

```shell
uv sync
uv run pre-commit install
```
To check all files:

```shell
uv run pre-commit run --all-files
```
To run the tests:

```shell
uv run pytest -vv
```
If you've updated the STAC metadata output, update the examples:

```shell
uv run scripts/update-examples
```