Cloud-native geospatial developer toolkit
EarthForge
Working with cloud-native geospatial data means juggling gdalinfo for COGs, stac-client for discovery, geopandas for GeoParquet, xarray for Zarr, and a collection of one-off scripts to glue them together. Each tool has its own CLI conventions, its own output format, and its own assumptions about how you authenticate to cloud storage.
EarthForge is a single composable toolkit that unifies these workflows. One CLI. One config system. One output contract. Every command works against local files, S3, GCS, or Azure, and every command produces both human-readable tables and machine-parseable JSON.
# Inspect any cloud-native geospatial file — format auto-detected
earthforge info s3://bucket/image.tif
earthforge info buildings.parquet
earthforge info climate.zarr
# Search STAC catalogs
earthforge stac search sentinel-2-l2a --bbox -85,37,-84,38 --datetime 2025-06/2025-09
# Generate a quicklook preview from a remote COG without downloading it
earthforge raster preview s3://bucket/scene.tif -o preview.png
# Convert legacy formats to cloud-native
earthforge vector convert buildings.shp --to geoparquet
earthforge raster convert image.tif --to cog
# Query GeoParquet with spatial predicate pushdown
earthforge vector query buildings.parquet --bbox -85,37,-84,38
# Inspect and slice Zarr datacubes
earthforge cube info s3://era5-pds/zarr/2025/01/data/air_temperature_at_2_metres.zarr
earthforge cube slice s3://era5-pds/zarr/ --var t2m --bbox -85,37,-84,38 --time 2025-06/2025-06 -o ky_june.zarr
# Pipe structured JSON into other tools
earthforge stac search sentinel-2-l2a --output json | jq '.items[].assets.B04.href'
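The "one output contract" idea can be illustrated with a small sketch. This is not EarthForge's implementation: the `render` function and its `fmt` parameter are hypothetical names, used only to show how a single result dictionary might back both a human-readable table and JSON.

```python
import json


def render(result: dict, fmt: str = "table") -> str:
    """Render one flat result dict as JSON or as a two-column text table.

    Hypothetical helper for illustration; not part of the EarthForge API.
    """
    if fmt == "json":
        return json.dumps(result, indent=2, sort_keys=True)
    if fmt != "table":
        raise ValueError(f"unsupported format: {fmt!r}")
    # Pad keys to the longest one so the value column lines up.
    width = max((len(key) for key in result), default=0)
    return "\n".join(f"{key.ljust(width)}  {value}" for key, value in result.items())


info = {"format": "COG", "width": 10980, "crs": "EPSG:32618"}
print(render(info))          # aligned key/value table for people
print(render(info, "json"))  # same data, parseable by jq or json.loads
```

Because both renderings come from the same dictionary, the JSON form round-trips back to exactly the data the table displays.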
What EarthForge Is
EarthForge is a developer toolkit that ships as both a Python library and a CLI. Install it as a Python library and call functions directly, or use the CLI from shell scripts and pipelines. Every CLI command is a thin wrapper around a library function, so anything you can do from the terminal you can also do from Python, a Jupyter notebook, or a pipeline runner.
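import asyncio
from earthforge.raster.info import inspect_raster
from earthforge.stac.search import search_catalog
# Library usage: same logic as the CLI, no subprocess needed
async def main():
    items = await search_catalog("sentinel-2-l2a", bbox=(-85, 37, -84, 38))
    metadata = await inspect_raster("s3://bucket/scene.tif")
    return items, metadata
# In a script, drive the coroutine with asyncio.run(); in a notebook, `await main()` works directly
items, metadata = asyncio.run(main())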
Real-World Output
The samples below are actual outputs from EarthForge commands run against public geospatial data. Sample files live in data/samples/.
KyFromAbove 3-inch Orthoimagery — fetched thumbnail
earthforge stac fetch \
https://spved5ihrl.execute-api.us-west-2.amazonaws.com/collections/orthos-phase3/items/N097E305_2024_Season1_3IN_cog \
--assets thumbnail --output-dir data/kyfromabove_fetch
# → 78,026 bytes in 2.34s
3-inch orthoimagery, KyFromAbove Phase 3 (2024). Public domain. Full COG available at kyfromabove.s3.us-west-2.amazonaws.com.
Sentinel-2 STAC Search — --output json
earthforge stac search sentinel-2-l2a \
--bbox -85,37,-84,38 --datetime 2025-06/2025-09 --max-items 5 \
--output json
{
"collection": "sentinel-2-l2a",
"matched": 47,
"returned": 5,
"elapsed_seconds": 1.243,
"items": [
{
"id": "S2A_18SYJ_20250914_0_L2A",
"datetime": "2025-09-14T16:28:43Z",
"properties": { "eo:cloud_cover": 4.2, "platform": "sentinel-2a" }
}
]
}
Full sample: data/samples/stac_search.json
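The JSON above is easy to post-process from Python. The sketch below parses a trimmed copy of that output and keeps only low-cloud items; `clear_items` is a hypothetical helper written for this example, not an EarthForge function.

```python
import json

# Trimmed copy of the search output shown above.
raw = """
{
  "collection": "sentinel-2-l2a",
  "matched": 47,
  "returned": 5,
  "items": [
    {
      "id": "S2A_18SYJ_20250914_0_L2A",
      "datetime": "2025-09-14T16:28:43Z",
      "properties": {"eo:cloud_cover": 4.2, "platform": "sentinel-2a"}
    }
  ]
}
"""


def clear_items(search_result: dict, max_cloud: float) -> list[str]:
    """Return IDs of items whose eo:cloud_cover is at or below max_cloud."""
    return [
        item["id"]
        for item in search_result["items"]
        # Treat a missing cloud-cover value as fully cloudy so it is filtered out.
        if item["properties"].get("eo:cloud_cover", 100.0) <= max_cloud
    ]


result = json.loads(raw)
print(clear_items(result, max_cloud=10.0))
```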
COG Metadata — earthforge raster info
earthforge raster info \
https://sentinel-cogs.s3.us-west-2.amazonaws.com/.../B04.tif \
--output json
{
"format": "COG",
"width": 10980, "height": 10980,
"crs": "EPSG:32618",
"is_tiled": true, "tile_width": 512, "tile_height": 512,
"overview_count": 6,
"compression": "deflate"
}
Full sample: data/samples/raster_info.json
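The tiling fields in this output determine how much data a partial read has to touch. As a rough sketch (plain arithmetic, not an EarthForge API; `tile_grid` and `tiles_for_window` are hypothetical names), the tile grid and the number of tiles under a pixel window can be derived like this:

```python
import math

# Values copied from the raster info output above.
meta = {"width": 10980, "height": 10980, "tile_width": 512, "tile_height": 512}


def tile_grid(meta: dict) -> tuple[int, int]:
    """Return (columns, rows) of internal tiles at full resolution."""
    cols = math.ceil(meta["width"] / meta["tile_width"])
    rows = math.ceil(meta["height"] / meta["tile_height"])
    return cols, rows


def tiles_for_window(meta: dict, col_off: int, row_off: int, width: int, height: int) -> int:
    """Count the tiles a pixel window touches, i.e. the blocks a range read must fetch."""
    first_col = col_off // meta["tile_width"]
    last_col = (col_off + width - 1) // meta["tile_width"]
    first_row = row_off // meta["tile_height"]
    last_row = (row_off + height - 1) // meta["tile_height"]
    return (last_col - first_col + 1) * (last_row - first_row + 1)


cols, rows = tile_grid(meta)
print(cols, rows, cols * rows)
print(tiles_for_window(meta, col_off=1000, row_off=1000, width=1024, height=1024))
```

For the 10980 × 10980 band above this gives a 22 × 22 grid, and a 1024 × 1024 window that is not aligned to tile boundaries touches 9 tiles rather than 4.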
GeoParquet Metadata — earthforge vector info
earthforge vector info ky_wildlife_management_areas.parquet --output json
{
"format": "geoparquet",
"row_count": 83,
"geometry_types": ["MultiPolygon"],
"crs": "EPSG:4326",
"bbox": [-89.57, 36.49, -81.96, 39.15],
"compression": "SNAPPY",
"file_size_bytes": 142863
}
Full sample: data/samples/vector_info.json
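The `bbox` field is what makes spatial filtering cheap: a reader can skip any file (or chunk of a file) whose bounding box does not intersect the query box. The sketch below shows only that intersection test, under the assumption of axis-aligned boxes in the same CRS; it is not EarthForge's query code and `bboxes_intersect` is a hypothetical name.

```python
BBox = tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned boxes overlap or touch."""
    a_min_x, a_min_y, a_max_x, a_max_y = a
    b_min_x, b_min_y, b_max_x, b_max_y = b
    return (
        a_min_x <= b_max_x and b_min_x <= a_max_x
        and a_min_y <= b_max_y and b_min_y <= a_max_y
    )


file_bbox = (-89.57, 36.49, -81.96, 39.15)  # from the vector info output above
query_bbox = (-85.0, 37.0, -84.0, 38.0)     # the --bbox used in the query example
far_away = (10.0, 45.0, 11.0, 46.0)         # made-up box in Europe

print(bboxes_intersect(file_bbox, query_bbox))
print(bboxes_intersect(file_bbox, far_away))
```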
Output Gallery
All images below were generated from real-world data using EarthForge example scripts. No synthetic or simulated data. Each output includes a .txt sidecar with alt text, data provenance, and generation metadata. See examples/outputs/ for full details.
Grand Canyon — DEM with Hillshade + Cross-Section
SRTM 30m elevation data via OpenTopography API. Shows 1,844m of relief from river to rim with elevation cross-section profile.
Swiss Alps — Matterhorn/Zermatt Elevation Analysis
Copernicus DEM 30m via OpenTopography. Elevations from 1,868m (valley) to 4,330m (peaks) with statistics sidebar.
Colorado Front Range — Sentinel-2 NDVI
Vegetation gradient from plains to alpine tundra, showing elevation-driven ecology. BrBG colorblind-safe diverging palette.
Netherlands — Urban/Water/Vegetation NDVI
Sentinel-2 scene over Rotterdam/Delft showing water (NDVI < 0), urban (low NDVI), and agricultural areas (high NDVI).
Amazon Rainforest — Tropical NDVI
Sentinel-2 scene near Manaus, Brazil showing dense tropical forest canopy with uniformly high NDVI.
Copernicus DEM — Elevation Statistics + Histogram
Raster statistics computed from a Copernicus DEM 30m tile with elevation distribution histogram. Viridis colorblind-safe palette.
Yellowstone — Landsat STAC Search Footprints
Landsat Collection 2 Level-2 scene footprints from Earth Search, color-coded by cloud cover percentage.
Yosemite — Multi-Collection STAC Query
Two-panel figure querying both Sentinel-2 scenes and Copernicus DEM tiles from a single STAC API.
STAC-to-NDVI Pipeline
Complete pipeline workflow: STAC search, range-read Sentinel-2 bands, NDVI computation via safe expression evaluator, rendered output with pipeline summary.
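To give a sense of what a "safe expression evaluator" can look like, here is a minimal sketch that evaluates band math by walking a parsed syntax tree instead of calling `eval()`. It is an illustration under stated assumptions, not EarthForge's evaluator: `safe_eval`, the allowed operator set, and the sample band values are all invented for this example.

```python
import ast
import operator

# Only these arithmetic operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_eval(expression: str, variables: dict[str, float]) -> float:
    """Evaluate an arithmetic expression over named variables without eval()."""

    def visit(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -visit(node.operand)
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        # Calls, attribute access, unknown names, etc. all end up here.
        raise ValueError(f"disallowed expression element: {ast.dump(node)}")

    return visit(ast.parse(expression, mode="eval"))


# NDVI for one pixel: (NIR - Red) / (NIR + Red). Band values are made up.
ndvi = safe_eval("(nir - red) / (nir + red)", {"nir": 0.5, "red": 0.1})
print(round(ndvi, 3))
```

The allow-list approach means a function call such as `__import__('os')` is rejected with a `ValueError` rather than executed.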
Format Detection Matrix
EarthForge's three-stage format detection chain (magic bytes, extension, content inspection) tested across 12 geospatial file formats.
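The three stages named above (magic bytes, extension, content inspection) can be sketched in a few lines. This is a simplified Python illustration covering only a handful of formats; the README states that EarthForge's own detection lives in a Rust extension, and the `detect_format` function, the tables, and the fallback order within each stage are assumptions made for this example.

```python
import tempfile
from pathlib import Path

# Leading bytes that identify a format regardless of file name.
MAGIC = {
    b"II*\x00": "tiff",   # little-endian TIFF
    b"MM\x00*": "tiff",   # big-endian TIFF
    b"II+\x00": "tiff",   # little-endian BigTIFF
    b"PAR1": "parquet",
}
EXTENSIONS = {".tif": "tiff", ".tiff": "tiff", ".parquet": "parquet", ".zarr": "zarr"}
ZARR_MARKERS = (".zgroup", ".zarray", "zarr.json")


def detect_format(path: Path) -> str:
    """Try magic bytes, then extension, then content inspection."""
    # Stage 1: magic bytes (regular files only).
    if path.is_file():
        with path.open("rb") as handle:
            head = handle.read(4)
        if head in MAGIC:
            return MAGIC[head]
    # Stage 2: file extension.
    fmt = EXTENSIONS.get(path.suffix.lower())
    if fmt:
        return fmt
    # Stage 3: content inspection, here a Zarr store recognised by its marker files.
    if path.is_dir() and any((path / name).exists() for name in ZARR_MARKERS):
        return "zarr"
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    mislabeled = root / "scene.dat"          # TIFF bytes under a misleading name
    mislabeled.write_bytes(b"II*\x00" + b"\x00" * 8)
    store = root / "cube"                    # Zarr v2 group without a .zarr suffix
    store.mkdir()
    (store / ".zgroup").write_text('{"zarr_format": 2}')
    print(detect_format(mislabeled), detect_format(store))
```

Checking magic bytes first lets a mislabeled file be identified correctly, while the later stages cover directory-based stores and files whose header is not distinctive.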
What EarthForge Is Not
EarthForge is not a platform. It does not include a web server, a tile cache, a database, an ML pipeline, or a Kubernetes deployment. It is not a replacement for QGIS, ArcGIS, or Google Earth Engine. It does not try to be everything — it is a focused set of tools that integrate with existing workflows via structured output, stdin/stdout piping, and Python imports.
If you need a tile server, use TiTiler. If you need a STAC API, use stac-fastapi. If you need a geospatial database, use PostGIS. EarthForge is the CLI toolkit you reach for alongside those tools, not instead of them.
Install
# Full toolkit (quotes keep shells such as zsh from expanding the brackets)
pip install "earthforge[all]"
# Just what you need
pip install "earthforge[stac]" # STAC discovery only
pip install "earthforge[raster]" # COG operations only
pip install "earthforge[vector]" # GeoParquet operations only
pip install "earthforge[cube]" # Zarr datacube operations only
pip install "earthforge[cli]" # CLI framework only
Cloud Storage
EarthForge uses named profiles for cloud storage authentication, similar to AWS CLI profiles:
# Initialize config
earthforge config init
# Search with a specific profile
earthforge stac search sentinel-2-l2a --profile planetary
Profiles are defined in ~/.earthforge/config.toml:
[profiles.default]
stac_api = "https://earth-search.aws.element84.com/v1"
storage = "s3"
[profiles.planetary]
stac_api = "https://planetarycomputer.microsoft.com/api/stac/v1"
storage = "azure"
Architecture
EarthForge is built as a monorepo with independently installable workspace packages. The architecture is documented in detail — not as an afterthought, but as the foundation the implementation is built on.
- ARCHITECTURE.md — System design, dependency graph, module interfaces
- ai-dev/decisions/ — Architectural decision records with alternatives considered and tradeoffs acknowledged
- ai-dev/spec.md — Requirements and acceptance criteria per milestone
Key architectural decisions:
| Decision | Record | Summary |
|---|---|---|
| Monorepo structure | DL-001 | Single repo with Hatch workspace packages, not 15 separate repos |
| Async-first I/O | DL-002 | All network I/O is async via httpx; sync wrappers for convenience |
| obstore for storage | DL-003 | Rust-backed S3/GCS/Azure abstraction, chosen in preference to fsspec |
| Rust extension boundary | DL-005 | Rust for format detection and range reads; Python for everything else |
| Engineering credibility | DL-006 | Nothing ships empty; decisions before code; scope boundaries enforced |
| promptfoo evaluation | DL-007 | Agent prompts and guardrails regression-tested in CI via promptfoo |
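The "async-first with sync wrappers" decision (DL-002) follows a common pattern, sketched below with the standard library only. The names `sync_wrapper`, `fetch_size`, and `fetch_size_sync` are hypothetical and the coroutine is a stand-in for real network I/O; how EarthForge builds its wrappers is described in the decision record, not here.

```python
import asyncio
import functools
from typing import Any, Callable, Coroutine


def sync_wrapper(async_fn: Callable[..., Coroutine[Any, Any, Any]]) -> Callable[..., Any]:
    """Build a blocking version of an async function using asyncio.run()."""

    @functools.wraps(async_fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        # asyncio.run() raises RuntimeError if an event loop is already running
        # in this thread (for example inside Jupyter); callers there should
        # await the async function directly instead.
        return asyncio.run(async_fn(*args, **kwargs))

    return wrapper


async def fetch_size(url: str) -> int:
    """Stand-in for a network call; a real version would use an HTTP client."""
    await asyncio.sleep(0)
    return len(url)


fetch_size_sync = sync_wrapper(fetch_size)
print(fetch_size_sync("s3://bucket/scene.tif"))
```

Keeping the async function as the single source of truth means the blocking variant adds no logic of its own, which mirrors the thin-wrapper relationship between the CLI and the library.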
Formats
| Format | Support | Operations |
|---|---|---|
| COG (Cloud Optimized GeoTIFF) | Full | info, validate, convert, preview, band math, tile |
| GeoParquet | Full | info, validate, convert, query, clip, tile |
| Zarr | Full | info, validate, convert, slice, stats |
| FlatGeobuf | Read/Write | info, validate, convert |
| COPC (Cloud Optimized Point Cloud) | Info | info |
| STAC (SpatioTemporal Asset Catalog) | Full | search, info, validate, fetch, publish |
Contributing
See CONTRIBUTING.md. EarthForge has specific engineering standards — please read the contribution guide before opening a PR.
Code of Conduct
Security
See SECURITY
License
GNU General Public License v3.0. See LICENSE.
File details
Details for the file earthforge-1.0.0.tar.gz.
File metadata
- Download URL: earthforge-1.0.0.tar.gz
- Upload date:
- Size: 17.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a15901423017409928ce52b8a6f33a6ad9043ec4627ce431e125bb1dced86ffe |
| MD5 | 878ff94066e4abae3bee4acb6827e8f4 |
| BLAKE2b-256 | 5d75e73060c0cd2ffe5323cbb89868aeb826eea3193294a72558cd92688345f9 |
File details
Details for the file earthforge-1.0.0-py3-none-any.whl.
File metadata
- Download URL: earthforge-1.0.0-py3-none-any.whl
- Upload date:
- Size: 31.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 51aa0ee31fdfd2d136f484a38897f1d335518d0c66dc30e1fd84d751357b1efd |
| MD5 | b35dc65b17f69257f16a8164bc6f5427 |
| BLAKE2b-256 | 48c628872367fa5bf8f495b6e33e5795870fefd3700588461d46e6e028dc7f2a |