Create valid STAC Collections, Items and Assets from existing raster datasets
Raster-to-STAC
This component allows the creation of STAC Collections with Items and Assets starting from different kinds of raster datasets. It also enables automatic upload of the resulting files to an Amazon S3 Bucket, making them publicly accessible worldwide. The goal is to make datasets easily accessible, interoperable, and shareable.
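For orientation, the sketch below shows roughly what a STAC Item with one Asset looks like as plain JSON. The field names follow the STAC 1.0.0 Item layout; every value (ids, URLs, coordinates) is a placeholder chosen for illustration and is not taken from raster2stac output.

```python
import json

# Placeholder values throughout; only the field names follow the STAC 1.0.0 Item layout.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-item",  # hypothetical item id
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [[11.0, 46.0], [12.0, 46.0], [12.0, 47.0], [11.0, 47.0], [11.0, 46.0]]
        ],
    },
    "bbox": [11.0, 46.0, 12.0, 47.0],
    "properties": {"datetime": "2022-06-30T00:00:00Z"},
    "collection": "example-collection",  # hypothetical collection id
    "links": [
        {
            "rel": "collection",
            "href": "https://example.com/collections/example-collection",
        }
    ],
    "assets": {
        # One asset per data file; the key is a free-form name such as a band.
        "B04": {
            "href": "https://example.com/example-item/B04.tif",
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
            "roles": ["data"],
        }
    },
}

print(json.dumps(item, indent=2))
```

A Collection groups many such Items and carries the shared metadata (description, license, providers, extents).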
Approaches
Depending on your requirements, different output formats are available:
1. Via COGs (Cloud Optimized GeoTIFFs)
This approach reads the input dataset and generates multiple Cloud Optimized GeoTIFFs on local disk. This provides high interoperability with third-party libraries for data reading and visualization.
2. Via ZARR
This approach converts the data to ZARR, a format optimized for cloud storage and efficient chunked access to large datasets. The result is a STAC Collection that references a single Zarr object.
3. Via netCDF
This approach keeps data in netCDF format while generating STAC metadata for discovery and access.
Installation
Prerequisites
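When installing from source, first set a PEP 440-compliant version in pyproject.toml by replacing the SEMANTIC_VERSION placeholder. The recommended format is Calendar Versioning (CalVer), for example 2025.10.1. With the chosen version stored in the VERSION shell variable: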
sed -i "s/SEMANTIC_VERSION/$VERSION/g" pyproject.toml
Then install the package:
pip install .
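On systems without sed, the same placeholder substitution can be sketched in Python. This is a minimal illustration, not part of the package: the helper name `set_version`, the sample `version = "SEMANTIC_VERSION"` line, and the regular expression (a simple YYYY.M.PATCH shape, a subset of what PEP 440 allows) are all assumptions.

```python
import re

# Simple CalVer shape such as 2025.10.1 (year.month.patch); a subset of PEP 440.
CALVER = re.compile(r"\d{4}\.(?:[1-9]|1[0-2])\.\d+")


def set_version(pyproject_text: str, version: str) -> str:
    """Return the text with the SEMANTIC_VERSION placeholder replaced."""
    if not CALVER.fullmatch(version):
        raise ValueError(f"not a CalVer version like 2025.10.1: {version!r}")
    return pyproject_text.replace("SEMANTIC_VERSION", version)


# Hypothetical pyproject.toml fragment used only for this demonstration.
sample = 'version = "SEMANTIC_VERSION"\n'
print(set_version(sample, "2025.10.1"))
```

To apply it to a real file, read pyproject.toml with `pathlib.Path.read_text`, pass the text through the helper, and write the result back.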
Quick Installation
You can also install directly using pip:
pip install raster2stac
Quickstart
This section provides a quick overview of how to use raster2stac to convert your raster data into STAC-compliant assets.
Get Sample Data
You can download sample data to test the package:
wget https://github.com/euracresearch/raster2stac/raw/main/tests/data/S2_L2A_sample.nc
wget https://github.com/Open-EO/openeo-localprocessing-data/raw/main/sample_netcdf/S2_L2A_sample.nc
Usage Examples
Basic Usage
The main class Raster2STAC provides different methods to generate STAC items and collections for various data formats.
Generate COG STAC from netCDF file
Convert a netCDF file to Cloud Optimized GeoTIFFs (COGs) and generate STAC items:
from raster2stac import Raster2STAC
rs2stac = Raster2STAC(
data="S2_L2A_sample.nc", # The netCDF which will be converted into COGs
collection_id="SENTINEL2_L2A_SAMPLE", # The Collection id we want to set
collection_url="https://stac.eurac.edu/collections/", # The URL where the collection will be exposed
output_folder="SENTINEL2_L2A_SAMPLE_STAC_COG"
).generate_cog_stac()
Generate STAC from netCDF file (keep netCDF format)
Generate STAC items while keeping the data in netCDF format:
from raster2stac import Raster2STAC
rs2stac = Raster2STAC(
data="S2_L2A_sample.nc", # The netCDF file, which is kept in netCDF format
collection_id="SENTINEL2_L2A_SAMPLE", # The Collection id we want to set
collection_url="https://stac.eurac.edu/collections/", # The URL where the collection will be exposed
output_folder="SENTINEL2_L2A_SAMPLE_STAC_NETCDF"
).generate_netcdf_stac()
You can then load the STAC item using pystac_client and odc.stac:
import pystac_client
import json
import odc.stac
item_path = "./SENTINEL2_L2A_SAMPLE_STAC/items/20220630000000.json"
stac_api = pystac_client.stac_api_io.StacApiIO()
stac_dict = json.loads(stac_api.read_text(item_path))
item = stac_api.stac_object_from_dict(stac_dict)
ds_stac = odc.stac.load([item])
print(ds_stac)
> <xarray.Dataset> Size: 13MB
> Dimensions: (y: 705, x: 935, time: 1)
> Coordinates:
> * y (y) float64 6kB 5.155e+06 5.155e+06 ... 5.148e+06 5.148e+06
> * x (x) float64 7kB 6.75e+05 6.75e+05 ... 6.843e+05 6.843e+05
> spatial_ref int32 4B 32632
> * time (time) datetime64[ns] 8B 2022-06-30
> Data variables:
> B04 (time, y, x) float32 3MB 278.0 302.0 274.0 ... 306.0 236.0
> B03 (time, y, x) float32 3MB 506.0 520.0 456.0 ... 378.0 367.0
> B02 (time, y, x) float32 3MB 237.0 240.0 249.0 ... 246.0 212.0
> B08 (time, y, x) float32 3MB 3.128e+03 2.958e+03 ... 1.854e+03
> SCL (time, y, x) float32 3MB 4.0 4.0 4.0 4.0 ... 4.0 4.0 4.0 4.0
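The item file name in the path above looks like a compact timestamp that matches the `time` coordinate in the printed dataset. The sketch below parses it with the standard library; that the file stem always follows a `YYYYMMDDHHMMSS` pattern is an assumption drawn from this one example, not a documented guarantee.

```python
from datetime import datetime
from pathlib import Path

item_path = Path("./SENTINEL2_L2A_SAMPLE_STAC/items/20220630000000.json")

# Assumption: the file stem is a YYYYMMDDHHMMSS timestamp.
acquired = datetime.strptime(item_path.stem, "%Y%m%d%H%M%S")
print(acquired.isoformat())  # 2022-06-30T00:00:00
```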
Generate ZARR STAC from netCDF file
Convert a netCDF file to ZARR format and generate STAC items with additional metadata:
from raster2stac import Raster2STAC
rs2stac = Raster2STAC(
data="S2_L2A_sample.nc",
collection_id="R2S_TEST_COLLECTION",
collection_url="https://10.8.244.74:8082/collections/",
item_prefix="R2S_TEST",
output_folder="S2_L2A_sample_ZARR",
description="Test Collection",
title="Raster2STAC Test Collection",
keywords=["test", "stac", "collection"],
providers=[
{
"url": "https://www.eurac.edu",
"name": "Eurac Research",
"roles": ["producer"],
}
],
stac_version="1.0.0",
s3_upload=False,
license="CC-BY-4.0",
sci_citation="Test citation",
).generate_zarr_stac(item_id="S2_L2A_sample_ZARR")
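Before passing a `providers` list like the one above, it can help to check it against the provider roles defined by the STAC specification (licensor, producer, processor, host). The helper below is a hypothetical pre-flight check written for this README; it is not part of raster2stac, and whether the package performs its own validation is not stated here.

```python
# Provider roles defined by the STAC specification.
ALLOWED_ROLES = {"licensor", "producer", "processor", "host"}


def check_providers(providers):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for index, provider in enumerate(providers):
        if not provider.get("name"):
            problems.append(f"provider {index}: missing 'name'")
        unknown = set(provider.get("roles", [])) - ALLOWED_ROLES
        if unknown:
            problems.append(f"provider {index}: unknown roles {sorted(unknown)}")
    return problems


providers = [
    {
        "url": "https://www.eurac.edu",
        "name": "Eurac Research",
        "roles": ["producer"],
    }
]
print(check_providers(providers))  # []
```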
Generate ZARR STAC from a 5-dimensional dataset
- Get the sample netCDF file:
wget https://github.com/Open-EO/openeo-localprocessing-data/raw/refs/heads/main/sample_netcdf/sample_5D.nc
- Call raster2stac:
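import xarray as xr
import rioxarray  # registers the .rio accessor on xarray objects
from raster2stac import Raster2STAC
# Open the dataset and assign EPSG:4326 as its CRS
ds = xr.open_dataset("sample_5D.nc").rio.write_crs(4326, inplace=True)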
rs2stac = Raster2STAC(
data=ds,
collection_id="DATA_5D",
collection_url="https://stac.eurac.edu/collections/",
item_prefix="R2S_TEST",
output_folder="DATA_5D",
description="Test Collection with 5 dimensional data",
title="Raster2STAC Test Collection 5D",
keywords=["test", "stac", "collection"],
providers=[
{
"url": "https://www.eurac.edu",
"name": "Eurac Research",
"roles": ["producer"],
}
],
links= [{
"rel": "license",
"href": "https://cds.climate.copernicus.eu/api/v2/terms/static/licence-to-use-copernicus-products.pdf",
"title": "License to use Copernicus Products"
}],
stac_version="1.0.0",
s3_upload=False,
license="proprietary",
sci_doi='https://doi.org/10.24381/cds.622a565a',
sci_citation= "Schimanke S., Ridal M., Le Moigne P., Berggren L., Undén P., Randriamampianina R., Andrea U., \
Bazile E., Bertelsen A., Brousseau P., Dahlgren P., Edvinsson L., El Said A., Glinton M., Hopsch S., \
Isaksson L., Mladek R., Olsson E., Verrelle A., Wang Z.Q., (2021): CERRA sub-daily regional reanalysis \
data for Europe on single levels from 1984 to present. Copernicus Climate Change Service (C3S) Climate \
Data Store (CDS), DOI: 10.24381/cds.622a565a (Accessed on 15-02-2024)"
).generate_zarr_stac(item_id="DATA_5D")
You can then load the 5D dataset using openEO:
from openeo.local import LocalConnection
conn = LocalConnection("")
ds = conn.load_stac("DATA_5D/items/DATA_5D.json").execute()
> <xarray.DataArray (bands: 1, time: 216, level: 2, y: 96, x: 161, number: 25)> Size: 167MB
> dask.array<stack, shape=(1, 216, 2, 96, 161, 25), dtype=int8, chunksize=(1, 54, 1, 24, 81, 13), chunktype=numpy.ndarray>
> Coordinates:
> * level (level) int32 8B 500 850
> * number (number) int64 200B 0 1 2 3 4 5 6 7 ... 17 18 19 20 21 22 23 24
> spatial_ref int64 8B ...
> * time (time) datetime64[ns] 2kB 2016-01-01 2016-01-02 ... 2016-08-03
> * x (x) float64 1kB 5.084 5.151 5.218 5.285 ... 15.69 15.76 15.82
> * y (y) float64 768B 43.62 43.69 43.75 43.82 ... 49.86 49.93 50.0
> * bands (bands) object 8B 'z'
> Attributes:
> CDI: Climate Data Interface version 2.0.4 (https://mpimet.mpg.de...
> CDO: Climate Data Operators version 2.0.4 (https://mpimet.mpg.de...
> Conventions: CF-1.6
> history: Tue Feb 27 09:39:09 2024: cdo remapbil,/mnt/CEPH_PROJECTS/I...
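The shape and chunk size printed above determine how many chunks the array is split into and how large it is in memory. The arithmetic below reproduces the reported 167MB from the `int8` dtype (one byte per element); it uses only the numbers shown in the output.

```python
import math

shape = (1, 216, 2, 96, 161, 25)      # bands, time, level, y, x, number
chunksize = (1, 54, 1, 24, 81, 13)

# Chunks along each dimension: ceiling division, since the last chunk may be partial.
chunks_per_dim = [math.ceil(n / c) for n, c in zip(shape, chunksize)]
total_chunks = math.prod(chunks_per_dim)

# int8 stores one byte per element, so the element count is the size in bytes.
total_bytes = math.prod(shape)

print(chunks_per_dim)   # [1, 4, 2, 4, 2, 2]
print(total_chunks)     # 128
print(total_bytes)      # 166924800, about 167 MB
```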
Generate STAC from xarray Dataset (netCDF)
Use an existing xarray Dataset to generate STAC items in netCDF format:
import xarray as xr
from raster2stac import Raster2STAC
ds = xr.open_dataset("S2_L2A_sample.nc")
rs2stac = Raster2STAC(
data=ds, # The xarray Dataset which will be converted
collection_id="SENTINEL2_L2A_SAMPLE", # The Collection id we want to set
collection_url="https://stac.eurac.edu/collections/", # The URL where the collection will be exposed
output_folder="SENTINEL2_L2A_SAMPLE_STAC"
).generate_netcdf_stac()
Generate ZARR STAC from xarray Dataset
Use an existing xarray Dataset to generate STAC items in ZARR format:
import xarray as xr
from raster2stac import Raster2STAC
ds = xr.open_dataset("S2_L2A_sample.nc") # The xarray Dataset which will be converted
rs2stac = Raster2STAC(
data=ds,
collection_id="R2S_TEST_COLLECTION",
collection_url="https://10.8.244.74:8082/collections/",
item_prefix="R2S_TEST",
output_folder="S2_L2A_sample_ZARR_dataset",
description="Test Collection",
title="Raster2STAC Test Collection",
keywords=["test", "stac", "collection"],
providers=[
{
"url": "https://www.eurac.edu",
"name": "Eurac Research",
"roles": ["producer"],
}
],
stac_version="1.0.0",
s3_upload=False,
license="CC-BY-4.0",
sci_citation="Test citation",
).generate_zarr_stac(item_id="S2_L2A_sample_ZARR")
Key Features
- Convert netCDF files to COG or ZARR, or keep them in netCDF format
- Generate STAC-compliant items and collections
- Support for both file paths and xarray Dataset objects
- Process multiple netCDF files as a list
- Flexible metadata configuration
- Easy data loading using pystac_client and odc.stac
- Integration with STAC APIs and clients
Common Workflow
A typical workflow with raster2stac involves the following steps:
- Convert your raster data to STAC-compliant assets using Raster2STAC
- Generate STAC collection and items in your desired format (COG/netCDF/ZARR)
- Publish or serve the STAC metadata
- Load and use the data through STAC clients using pystac_client and odc.stac
- Integrate with STAC catalogs and APIs for discovery and access
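When the output format is chosen at run time (for example from a command-line option), the format-specific methods used in the examples above can be selected through a small lookup table. The sketch below is an illustration only: the `generate` helper and the `FakeConverter` stand-in are hypothetical, and only the three method names come from this README. In real use, pass a `Raster2STAC` instance instead of the stand-in.

```python
# Method names as used in the usage examples of this README.
GENERATORS = {
    "cog": "generate_cog_stac",
    "netcdf": "generate_netcdf_stac",
    "zarr": "generate_zarr_stac",
}


def generate(converter, output_format, **kwargs):
    """Call the generator method matching output_format on the converter."""
    try:
        method_name = GENERATORS[output_format.lower()]
    except KeyError:
        raise ValueError(f"unsupported output format: {output_format!r}") from None
    return getattr(converter, method_name)(**kwargs)


class FakeConverter:
    """Stand-in for a Raster2STAC instance so the sketch runs without the package."""

    def generate_cog_stac(self):
        return "cog"

    def generate_netcdf_stac(self):
        return "netcdf"

    def generate_zarr_stac(self, item_id=None):
        return f"zarr:{item_id}"


print(generate(FakeConverter(), "ZARR", item_id="DATA_5D"))  # zarr:DATA_5D
```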
License
This project is distributed under the MIT License. See the LICENSE file for details.