# zagg - Multi-resolution Aggregation

Multi-resolution aggregation for ICESat-2 ATL06 data using Morton/HEALPix indexing.

Aggregate point observations to multi-resolution grids using HEALPix spatial indexing and serverless compute.
## Overview

zagg aggregates sparse point data (e.g., ICESat-2 ATL06 elevation measurements) to gridded products using HEALPix/Morton spatial indexing. Processing runs in parallel on AWS Lambda — each worker handles one spatial cell independently, writing to a shared Zarr v3 store following the DGGS convention.
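To illustrate why a nested (Morton-ordered) index makes hierarchical grids cheap — this is a minimal sketch of the general idea, not zagg's implementation — each cell at one order has four children at the next order, so parent/child lookups are plain bit shifts:

```python
def morton_parent(cell: int, levels: int = 1) -> int:
    """Parent cell in a nested (Morton) ordering: each level up drops 2 bits."""
    return cell >> (2 * levels)

def morton_children(cell: int) -> list[int]:
    """The four children of a cell, one level down."""
    return [(cell << 2) | i for i in range(4)]

cell = 733  # arbitrary example index
assert morton_parent(cell, levels=2) == 45          # grandparent, two orders up
assert all(morton_parent(c) == cell for c in morton_children(cell))
```

This is the property the "parent order" / "child order" split relies on: all points in a parent cell can be binned to child cells without any geometric tests.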
## Features

- Pre-computed granule catalogs — query CMR once, process many times
- Morton-based spatial indexing — HEALPix nested scheme for hierarchical grids
- Massive parallelism — tested with up to 1,700 concurrent Lambda workers
- Direct S3 access — h5coro reads HDF5 via byte-range requests, no downloads
- Cost-effective — $0.006/cell ($2 per full Antarctica run on ARM64)
## End-to-End Workflow

### Step 1: Build a Granule Catalog

Query NASA's CMR to build a mapping of spatial cells to granule S3 URLs.
```bash
# ICESat-2 convenience — cycle number computes dates automatically:
uv run python -m zagg.catalog --cycle 22 --parent-order 6

# General — explicit date range and spatial polygon:
uv run python -m zagg.catalog \
    --start-date 2024-01-06 --end-date 2024-04-07 \
    --short-name ATL06 \
    --polygon my_region.geojson \
    --parent-order 6
```
When `--polygon` is provided, the bounding box for the CMR query is computed automatically from the polygon's extent, and `morton_coverage` uses the polygon for cell discovery. When no polygon is given, Antarctic drainage basins are used as the default.

Output: `catalog_ATL06_2024-01-06_2024-04-07_order6.json`
See Catalog API for full options.
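The authoritative schema lives in `zagg.catalog`; purely as a sketch of the idea (the field names and URLs below are hypothetical, not zagg's actual format), a catalog maps each parent cell to the granules that intersect it, so workers never re-query CMR:

```python
import json

# Hypothetical catalog layout: parent cell ID -> granule S3 URLs.
catalog = {
    "short_name": "ATL06",
    "parent_order": 6,
    "cells": {
        "1042": [
            "s3://example-bucket/ATL06/granule_0001.h5",
            "s3://example-bucket/ATL06/granule_0002.h5",
        ],
        "1043": ["s3://example-bucket/ATL06/granule_0002.h5"],
    },
}

# Each worker would receive one cell's entry and process it independently.
payload = json.dumps(catalog, indent=2)
assert json.loads(payload)["cells"]["1042"][0].startswith("s3://")
```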
### Step 2: Deploy the Lambda Function

Build and deploy the Lambda function and its dependency layer.

```bash
# Build the function package
bash deployment/aws/build_function.sh

# Build the dependency layer (ARM64)
bash deployment/aws/build_arm64_layer.sh

# Deploy
bash deployment/aws/deploy.sh
```
See Lambda Deployment and ARM64 Build Guide.
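Fan-out amounts to invoking the deployed function once per spatial cell. A rough sketch of how an orchestrator might do this with boto3 — the event fields here are hypothetical, not zagg's actual payload schema:

```python
import json

def build_event(cell_id: str, granule_urls: list[str], config: str) -> dict:
    """One Lambda event per spatial cell (hypothetical field names)."""
    return {"cell_id": cell_id, "granules": granule_urls, "config": config}

event = build_event("1042", ["s3://example-bucket/g1.h5"], "atl06.yaml")
payload = json.dumps(event)

# With boto3 (not run here), each event would be dispatched asynchronously:
# import boto3
# boto3.client("lambda").invoke(
#     FunctionName="zagg-worker",      # hypothetical function name
#     InvocationType="Event",          # fire-and-forget enables massive fan-out
#     Payload=payload,
# )
```

Asynchronous (`Event`) invocation is what makes thousand-worker concurrency cheap: the orchestrator only builds and posts payloads, and never waits on results.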
### Step 3: Run Processing

Processing reads a pipeline config YAML (data source, aggregation, output store) and a granule catalog. Run locally or dispatch to Lambda.

```bash
# Local processing (write to local Zarr):
uv run python -m zagg --config atl06.yaml --catalog catalog.json --store ./output.zarr

# Local processing (write to S3):
uv run python -m zagg --config atl06.yaml --catalog catalog.json --store s3://bucket/output.zarr

# Lambda dispatch (requires deployed Lambda function):
uv run python deployment/aws/invoke_lambda.py \
    --config atl06.yaml --catalog catalog.json

# Test with a few cells:
uv run python -m zagg --config atl06.yaml --catalog catalog.json --max-cells 5

# Dry run:
uv run python -m zagg --config atl06.yaml --catalog catalog.json --dry-run
```

The store path and output grid parameters are defined in the YAML config (`output.store`, `output.grid.child_order`) and can be overridden via `--store` on the command line.
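As an illustration of the shape such a config might take — apart from `output.store` and `output.grid.child_order`, which are named above, the keys below are guesses, not zagg's actual schema (see the built-in `atl06.yaml` for the real one):

```yaml
# Hypothetical pipeline config sketch; key names other than
# output.store and output.grid.child_order are illustrative.
data:
  short_name: ATL06
  variables: [h_li]
aggregation:
  statistics: [mean, count]
output:
  store: s3://bucket/output.zarr
  grid:
    child_order: 12
```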
### Step 4: Visualize Results

The output Zarr is a public DGGS dataset. The included notebook rasterizes HEALPix cells to a polar stereographic grid for fast rendering with `imshow`.

```bash
uv run jupyter notebook notebooks/rasterized_zarr.ipynb
```

Adjust `GRID_SPACING` in the notebook to control output resolution (default 2 km).
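The notebook presumably uses a proper projection library (e.g. pyproj with an Antarctic polar stereographic CRS); purely to illustrate the geometry involved, a spherical south-polar stereographic projection looks like this:

```python
import math

R = 6371000.0  # mean Earth radius in metres (spherical approximation)

def south_polar_stereographic(lat_deg: float, lon_deg: float) -> tuple[float, float]:
    """Project (lat, lon) to x/y metres on a plane tangent at the south pole."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Distance from the projection centre grows with colatitude from the pole.
    rho = 2 * R * math.tan((math.pi / 2 + lat) / 2)
    return rho * math.sin(lon), rho * math.cos(lon)

x, y = south_polar_stereographic(-90.0, 0.0)
assert abs(x) < 1e-6 and abs(y) < 1e-6  # the pole maps to the origin
```

Rasterizing then means sampling each output pixel's projected centre and looking up its HEALPix cell value, which is what makes `imshow` rendering fast.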
## Project Structure

```
zagg/
├── src/zagg/            # Main package (cloud-agnostic)
│   ├── __main__.py      # Local processing runner (python -m zagg)
│   ├── config.py        # YAML pipeline configuration
│   ├── processing.py    # Core aggregation pipeline
│   ├── catalog.py       # CMR query + catalog building
│   ├── schema.py        # Output schema + Zarr template
│   ├── store.py         # Store factory (local or S3)
│   ├── auth.py          # NASA Earthdata authentication
│   └── configs/         # Built-in pipeline configs (atl06.yaml)
├── deployment/          # Cloud-specific deployment
│   └── aws/             # Lambda handler, orchestrator, build scripts
├── notebooks/           # Visualization
├── docs/                # Documentation
└── tests/               # Test suite
```
## Documentation
- Architecture — design philosophy, end-to-end flow diagram, key decisions
- Schema — aggregation dispatch, extending with new statistics
- API Reference — catalog, processing, schema, auth modules
- Lambda Deployment — AWS setup and production use
- ARM64 Build Guide — building Lambda layers for ARM64
## Development

```bash
# Install
uv sync --all-groups

# Run tests
uv run pytest

# Lint
uv run ruff check src/
```

Requires Python >= 3.12, uv, AWS credentials (for Lambda), and a NASA Earthdata account (for data access).
## Performance

| Metric | Value |
|---|---|
| Execution time | 2–3 min average per cell |
| Memory | 2 GB configured, 1–1.5 GB typical |
| Throughput | Tested with up to 1,700 concurrent workers |
| Cost | $0.006 per cell ($2 per full Antarctica run on ARM64) |
## License
MIT — see LICENSE file.