cog2zarr
TIFF to Zarr translator library which proposes a new geo Zarr v3 extension. The extension currently supports several different configurations / encodings of georeferencing information (affine transform + CRS):
- CF conventions (via `rioxarray`).
- GDAL raster data model (via `rioxarray`).
- STAC proj extension (via `pystac`).
- GeoTIFF (via `async-tiff`).
Refer to the jsonschemas directory for JSON schemas, or to the pydantic models for more detail on each configuration. The examples directory contains sample outputs generated from a Sentinel-2 STAC item.
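Every configuration above encodes the same two things: an affine transform and a CRS. As a minimal sketch of what the affine transform does (this is not code from the library; the `Affine` tuple, `pixel_to_world` helper, and the coordinate values are illustrative assumptions), the six coefficients map a pixel's column and row to a coordinate in the CRS:

```python
from typing import NamedTuple


class Affine(NamedTuple):
    """Six coefficients of a 2D affine transform (hypothetical helper type)."""

    a: float  # pixel width
    b: float  # row rotation term (0 for north-up images)
    c: float  # x coordinate of the upper-left corner
    d: float  # column rotation term (0 for north-up images)
    e: float  # pixel height (negative for north-up images)
    f: float  # y coordinate of the upper-left corner


def pixel_to_world(t: Affine, col: float, row: float) -> tuple[float, float]:
    """Map a (col, row) pixel position to (x, y) in the raster's CRS."""
    x = t.a * col + t.b * row + t.c
    y = t.d * col + t.e * row + t.f
    return x, y


# Illustrative values only: a north-up grid with 10 m pixels.
transform = Affine(10.0, 0.0, 499980.0, 0.0, -10.0, 5400000.0)
upper_left = pixel_to_world(transform, 0, 0)
lower_right = pixel_to_world(transform, 10980, 10980)
print(upper_left, lower_right)
```

The CRS then says which coordinate system those x and y values belong to, for example a UTM zone.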
Caveats:
- Only works on STAC items from the `sentinel-2-l2a` Earth Search collection (https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a). It may work on Microsoft Planetary Computer, but I haven't tested this yet.
- Ignores georectification / georeferencing edge cases such as RPCs and GCPs. The code assumes the image has an affine transform + CRS information, which is true for the large majority of TIFFs found in the wild.
- The `geozarr` extension is stored in the `attributes` key of each Zarr group (`node_type = 'group'`). Zarr extensions are supposed to be stored at the top level of the Zarr group; however, `zarr-python` doesn't support this yet.
- You must use `xarray.open_datatree("path/to/group.zarr", consolidated=True, engine="zarr")` to open these stores. `xarray.open_dataset` does not work, and I'm not sure why.
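To see where the extension metadata ends up, the group metadata can be inspected without xarray. The following is a rough sketch, assuming a Zarr v3 store saved as a local directory, where every node has a `zarr.json` file with a `node_type` field. The `group_attributes` helper is hypothetical and not part of this library:

```python
import json
from pathlib import Path


def group_attributes(store: Path) -> dict[str, dict]:
    """Map each group's path (relative to the store root) to its attributes.

    Assumes a Zarr v3 directory store; arrays are skipped.
    """
    found = {}
    for meta_path in sorted(store.rglob("zarr.json")):
        meta = json.loads(meta_path.read_text())
        if meta.get("node_type") == "group":
            rel = meta_path.parent.relative_to(store).as_posix()
            found[rel] = meta.get("attributes", {})
    return found
```

Calling `group_attributes(Path("path/to/group.zarr"))` returns one entry per group, with the root group keyed as `"."`. The exact key under which the extension appears inside `attributes` is not shown here; check the JSON schemas for it.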
Usage
git clone https://github.com/geospatial-jeff/cog2zarr
cd cog2zarr
pip install poetry==2.1.3
poetry install
CLI
The CLI provides one command to generate JSON schemas and another to convert a STAC item to zarr.
Usage: cog2zarr [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  convert             Convert STAC item to zarr
  create-json-schema  Create JSON schemas.
The convert command has several options which determine how the resulting Zarr store is created:
Usage: cog2zarr convert [OPTIONS] STAC_ITEM_ID OUT_STORE

  Convert STAC item to zarr

Options:
  --extension-type [stac|gdal|cf|geotiff]
                                  [required]
  --group-layout [planar|chunky]  [required]
  --chunk-size-x INTEGER          [required]
  --chunk-size-y INTEGER          [required]
  --simple / --complex            [required]
  --help                          Show this message and exit.
- Sentinel-2 is typically organized into a single TIFF file per band. The `--group-layout` parameter determines how these individual TIFFs are organized into Zarr groups. `chunky` creates one Zarr group for each homogeneous set of bands (e.g. `10m`, `20m`, and `60m`), and `planar` creates a single Zarr group for each band. The `chunky` layout encodes geospatial metadata more efficiently, yields smaller consolidated metadata, and lets the user chunk each Zarr array across multiple bands, which is ideal for accessing the same pixel across multiple bands (e.g. an R/G/B composite). The `planar` layout is best for accessing individual bands, but it duplicates the geospatial metadata once per band.
- The `geo` extension contains all geospatial metadata required to georeference the Zarr array. `rioxarray`, by default, includes the `bands`, `x`, and `y` variables when saving to Zarr. The `--simple` flag may be added to drop these variables, greatly reducing the size of the Zarr store. The default (`--complex`) is to include these variables, as they are important for interoperability with current software.
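The difference between the two layouts can be sketched in a few lines. This is an illustration of the grouping idea only, not the library's implementation: the `plan_groups` helper, the band subset, and the group names are assumptions.

```python
from collections import defaultdict

# Native resolution in metres for an illustrative subset of Sentinel-2 bands.
BAND_RESOLUTION_M = {
    "B02": 10, "B03": 10, "B04": 10, "B08": 10,
    "B05": 20, "B11": 20,
    "B01": 60, "B09": 60,
}


def plan_groups(bands: dict[str, int], layout: str) -> dict[str, list[str]]:
    """Return {group_name: [band, ...]} for the requested layout."""
    if layout == "planar":
        # One group per band: metadata is repeated for every band.
        return {band: [band] for band in bands}
    if layout == "chunky":
        # One group per resolution: bands sharing a grid share metadata.
        groups = defaultdict(list)
        for band, resolution in bands.items():
            groups[f"{resolution}m"].append(band)
        return dict(groups)
    raise ValueError(f"unknown layout: {layout!r}")


print(plan_groups(BAND_RESOLUTION_M, "chunky"))
print(len(plan_groups(BAND_RESOLUTION_M, "planar")), "planar groups")
```

With this subset, `chunky` yields three groups that each carry one copy of the georeferencing, while `planar` yields eight groups that each repeat it.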
Examples
The examples included in the repo were generated with the following commands:
# First example.
cog2zarr convert S2A_33UWP_20250620_0_L2A \
examples/S2A_33UWP_20250620_0_L2A/stac_chunky_simple.zarr \
--extension-type stac \
--group-layout chunky \
--simple
# Second example.
cog2zarr convert \
S2A_33UWP_20250620_0_L2A \
examples/S2A_33UWP_20250620_0_L2A/cf_planar.zarr \
--extension-type cf \
--group-layout planar
# Third example.
cog2zarr convert \
S2A_33UWP_20250620_0_L2A \
examples/S2A_33UWP_20250620_0_L2A/gdal_chunky.zarr \
--extension-type gdal \
--group-layout chunky
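The commands above omit the chunk-size options that the help text marks as required. As a sketch of a full invocation assembled from Python, the hypothetical `build_convert_command` helper below uses only the flags listed in the help output; the chunk size of 512 is an assumed value, not a recommendation from this project.

```python
import shlex

EXTENSION_TYPES = {"stac", "gdal", "cf", "geotiff"}
GROUP_LAYOUTS = {"planar", "chunky"}


def build_convert_command(
    item_id: str,
    out_store: str,
    *,
    extension_type: str,
    group_layout: str,
    chunk_size_x: int,
    chunk_size_y: int,
    simple: bool,
) -> list[str]:
    """Build the argument list for `cog2zarr convert` (hypothetical helper)."""
    if extension_type not in EXTENSION_TYPES:
        raise ValueError(f"unknown extension type: {extension_type!r}")
    if group_layout not in GROUP_LAYOUTS:
        raise ValueError(f"unknown group layout: {group_layout!r}")
    return [
        "cog2zarr", "convert", item_id, out_store,
        "--extension-type", extension_type,
        "--group-layout", group_layout,
        "--chunk-size-x", str(chunk_size_x),
        "--chunk-size-y", str(chunk_size_y),
        "--simple" if simple else "--complex",
    ]


cmd = build_convert_command(
    "S2A_33UWP_20250620_0_L2A",
    "examples/S2A_33UWP_20250620_0_L2A/gdal_chunky.zarr",
    extension_type="gdal",
    group_layout="chunky",
    chunk_size_x=512,  # assumed value
    chunk_size_y=512,  # assumed value
    simple=False,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the package is installed. Validating the choices up front catches a misspelled layout before the CLI is invoked.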
Python Usage
You may also call this library from Python, as in the example below:
from datetime import date, timedelta
from pathlib import Path

import pystac_client

from cog_to_zarr import cog_to_zarr
from cog_to_zarr.types import GeoZarrExtensionType, GroupLayout

# 1. Query Earth-Search for a recent, low-cloud Sentinel-2 L2A scene.
API = "https://earth-search.aws.element84.com/v1"
coll = "sentinel-2-l2a"
bbox = [16.20, 48.10, 16.45, 48.30]  # Vienna
today = date.today()
# Subtract days rather than replacing the year, which fails on 29 February.
last_year = today - timedelta(days=365)
daterange = f"{last_year:%Y-%m-%d}/{today:%Y-%m-%d}"
item = next(
    pystac_client.Client.open(API)
    .search(
        collections=[coll],
        bbox=bbox,
        datetime=daterange,
        query={"eo:cloud_cover": {"lt": 5}},
        limit=1,
    )
    .items(),
    None,
)
if item is None:
    raise RuntimeError("No matching Sentinel-2 scene found.")

# 2. Convert the STAC item to a Zarr store.
cog_to_zarr.convert(
    item,
    Path("output.zarr"),
    extension_type=GeoZarrExtensionType.stac,
    group_layout=GroupLayout.chunky,
    simple=True,
)