TerraFlow: a reproducible workflow for geospatial agricultural modeling.

TerraFlow: Reproducible Geospatial Agricultural Modeling

TerraFlow is a reproducible, config-driven geospatial workflow for agricultural suitability modeling. Give it a land-cover raster, a climate CSV, and a YAML config — it returns a scored, location-stamped results table with full provenance.

Documentation: terraflow.marupilla.dev — see the Reproducibility page for what the run fingerprint covers and known sources of non-determinism.


Installation

macOS (Homebrew) — handles GDAL and PROJ automatically:

brew tap gmarupilla/terraflow
brew install terraflow

pip / uv:

uv pip install terraflow-agro
# or
pip install terraflow-agro

For kriging-based interpolation:

pip install terraflow-agro pykrige

See Homebrew install docs for update/uninstall instructions and troubleshooting.

Quickstart

terraflow --config config.yml

A minimal config:

raster_path: "data/land_cover.tif"
climate_csv: "data/climate.csv"
output_dir: "outputs"
roi:
  type: bbox
  xmin: -120.5
  ymin: 34.0
  xmax: -118.0
  ymax: 35.5
model_params:
  v_min: 0.0
  v_max: 1.0
  t_min: 10.0
  t_max: 35.0
  r_min: 100.0
  r_max: 800.0
  w_v: 0.4
  w_t: 0.3
  w_r: 0.3
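
Bounds and weights in a config like the one above are easy to mistype. The sketch below checks a few invariants on an already-parsed config before a run. It is not part of TerraFlow: the function name check_config is hypothetical, and the rules it enforces (an ordered bbox, each *_min below its *_max, weights summing to 1) are assumptions about what a sensible config looks like, not documented validation rules.

```python
import math


def check_config(cfg: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    # Bounding box: min must be strictly below max on both axes.
    roi = cfg.get("roi", {})
    if roi.get("type") == "bbox":
        if not roi["xmin"] < roi["xmax"]:
            problems.append("roi: xmin must be less than xmax")
        if not roi["ymin"] < roi["ymax"]:
            problems.append("roi: ymin must be less than ymax")

    # Each v/t/r range must be non-empty.
    params = cfg.get("model_params", {})
    for prefix in ("v", "t", "r"):
        lo = params.get(f"{prefix}_min")
        hi = params.get(f"{prefix}_max")
        if lo is not None and hi is not None and not lo < hi:
            problems.append(f"model_params: {prefix}_min must be less than {prefix}_max")

    # Assumption: the three weights are meant to sum to 1.
    weights = [params[k] for k in ("w_v", "w_t", "w_r") if k in params]
    if weights and not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        problems.append("model_params: w_v + w_t + w_r should sum to 1")

    return problems


# The minimal config from the Quickstart, as a Python dict.
config = {
    "raster_path": "data/land_cover.tif",
    "climate_csv": "data/climate.csv",
    "output_dir": "outputs",
    "roi": {"type": "bbox", "xmin": -120.5, "ymin": 34.0, "xmax": -118.0, "ymax": 35.5},
    "model_params": {
        "v_min": 0.0, "v_max": 1.0,
        "t_min": 10.0, "t_max": 35.0,
        "r_min": 100.0, "r_max": 800.0,
        "w_v": 0.4, "w_t": 0.3, "w_r": 0.3,
    },
}

print(check_config(config))
```

In practice the dict would come from parsing config.yml with a YAML loader; it is written inline here to keep the example free of third-party dependencies.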

Results are written to outputs/runs/<fingerprint>/:

features.parquet   — scored cells (lat, lon, score, label, …)
results.csv        — same data in CSV
manifest.json      — full provenance record
report.json        — QA stats and timings
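
Because each run lands in its own fingerprint-named directory, it can be handy to list what has been produced so far. The following is a minimal sketch using only the standard library. It assumes the outputs/runs/<fingerprint>/manifest.json layout described above and makes no assumption about the fields inside the manifest; the helper name load_manifests is hypothetical.

```python
import json
from pathlib import Path


def load_manifests(output_dir: str) -> dict[str, dict]:
    """Map each run fingerprint (directory name) to its parsed manifest.json."""
    runs_dir = Path(output_dir) / "runs"
    manifests = {}
    if not runs_dir.is_dir():
        return manifests  # nothing has been run yet
    for run_dir in sorted(runs_dir.iterdir()):
        manifest_path = run_dir / "manifest.json"
        if manifest_path.is_file():
            manifests[run_dir.name] = json.loads(
                manifest_path.read_text(encoding="utf-8")
            )
    return manifests


# List every run found under the default output directory.
for fingerprint, manifest in load_manifests("outputs").items():
    keys = sorted(manifest) if isinstance(manifest, dict) else []
    print(fingerprint, keys)
```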

CLI subcommands

Subcommand                                   Purpose
terraflow run -c config.yml                  Run the full pipeline
terraflow sensitivity -c config.yml          Sobol' / Morris sensitivity indices for model weights
terraflow validate -c config.yml             Spatial block CV, Cohen's kappa, Moran's I on residuals
terraflow export --format h3 -c config.yml   H3-indexed export for interop with H3-native visualization tools (pip install terraflow-agro[h3])

See CLI docs for full reference.

Climate interpolation

Three spatial algorithms are available via interpolation_method:

Method             Notes
linear (default)   scipy.interpolate.griddata; fast, no extra deps
kriging            Ordinary Kriging via pykrige; adds {var}_krig_std uncertainty columns
idw                Inverse Distance Weighting (power=2); faster than kriging, no uncertainty
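
To make the idw row concrete, here is a minimal sketch of inverse distance weighting with power 2 for a single query point. It is a textbook version written for illustration, not TerraFlow's implementation; the function name idw is hypothetical, and coordinates are treated as planar, which is an assumption.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: iterable of (station_x, station_y, value) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # query coincides with a station: use its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total


stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(stations, 1.0, 0.0))  # equidistant from both stations: prints 15.0
print(idw(stations, 0.5, 0.0))  # closer to the first station, so nearer to 10
```

Each station's weight falls off with the square of its distance, so nearby stations dominate the estimate. Unlike kriging, nothing in this calculation yields an uncertainty estimate, which matches the "no uncertainty" note in the table.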

Combine interpolation_method: kriging with uncertainty_samples: N in model_params to get Monte Carlo confidence intervals on the score (score_ci_low / score_ci_high). For kriging, variogram_mode: extended evaluates additional nested variogram candidates and records every candidate's leave-one-out cross-validation (LOOCV) score in report.json. For large station networks, keep the default standard mode unless nested structures are needed. See the extended variogram notebook in the docs for a worked synthetic example.
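
The idea behind a Monte Carlo score interval can be sketched as follows: perturb each interpolated climate value according to its kriging standard deviation, rescore, and read off empirical percentiles. Everything specific here is an assumption made for illustration, including Gaussian noise, the 2.5th and 97.5th percentiles, the variable name "temperature", and the toy score function; TerraFlow's actual sampling scheme and scoring model are not reproduced.

```python
import random


def score_interval(means, stds, score_fn, n_samples=1000, seed=0):
    """Empirical 2.5th and 97.5th percentile of score_fn under Gaussian input noise.

    means: variable name -> interpolated value
    stds:  variable name -> standard deviation (missing names get 0.0)
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    scores = sorted(
        score_fn({name: rng.gauss(mu, stds.get(name, 0.0)) for name, mu in means.items()})
        for _ in range(n_samples)
    )
    low = scores[int(0.025 * (n_samples - 1))]
    high = scores[int(0.975 * (n_samples - 1))]
    return low, high


def toy_score(sample):
    # Hypothetical score: temperature rescaled over 10..35 and clipped to [0, 1].
    return min(1.0, max(0.0, (sample["temperature"] - 10.0) / 25.0))


low, high = score_interval({"temperature": 22.5}, {"temperature": 1.5}, toy_score)
print(low, high)
```

With a standard deviation of zero the interval collapses to a single point, which is a quick sanity check for any implementation of this pattern.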

See Config Schema for the full reference.

Python API

from terraflow.pipeline import run_pipeline

results_df = run_pipeline("config.yml")

Development

git clone https://github.com/gmarupilla/AgroTerraFlow.git
cd AgroTerraFlow
make dev       # create .venv and install dev deps
make test      # run test suite
make lint      # ruff + black
make docs-build   # build the documentation site

Architecture

Core modules: cli, config, climate, geo, ingest, model, pipeline, stats, viz.

Key design decisions are documented in Architecture Decision Records under docs/architecture/.

Project Scope

TerraFlow is a reproducible pipeline for geospatial agricultural modeling. It handles raster ingestion, ROI clipping, climate interpolation, suitability scoring, and deterministic artifact generation.

In scope:

  • Configuration-driven pipeline execution (YAML → Parquet + provenance artifacts)
  • Spatial interpolation of point climate observations (linear, kriging, IDW)
  • Per-cell suitability scoring with uncertainty quantification (Monte Carlo)
  • Deterministic run fingerprinting and artifact caching
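
As an illustration of the last item, a deterministic fingerprint can be derived by hashing a canonical encoding of the configuration together with digests of the input files. The sketch below shows that general pattern only. The function name, the choice of SHA-256, the 16-character truncation, and the set of hashed fields are assumptions; the Reproducibility page in the docs describes what TerraFlow's fingerprint actually covers.

```python
import hashlib
import json


def fingerprint(config: dict, input_digests: dict[str, str]) -> str:
    """Hash a canonical JSON encoding of the config and input file digests."""
    payload = json.dumps(
        {"config": config, "inputs": input_digests},
        sort_keys=True,            # key order must not affect the result
        separators=(",", ":"),     # no whitespace variation
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


a = fingerprint({"w_v": 0.4, "w_t": 0.3}, {"land_cover.tif": "abc123"})
b = fingerprint({"w_t": 0.3, "w_v": 0.4}, {"land_cover.tif": "abc123"})
c = fingerprint({"w_v": 0.5, "w_t": 0.3}, {"land_cover.tif": "abc123"})

print(a == b)  # same settings in a different key order: prints True
print(a == c)  # one changed weight: prints False
```

A run directory named after such a value can double as a cache key: if the directory already exists, the same inputs and settings have been processed before.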

Out of scope:

  • Real-time data ingestion or streaming workflows
  • General-purpose raster analysis (use rioxarray or rasterstats instead)
  • Cloud-scale distributed processing (no Dask/Spark integration planned)
  • Web application or GUI layer

Maintenance & Support

TerraFlow is actively maintained. Bug fixes are prioritized; the test suite and CI pipeline are kept green on every commit.

Feature requests are evaluated against project scope — open an issue to discuss before building. Not all requests will be accepted.

Support is provided on a best-effort basis via GitHub Issues. Response time is typically within a week. There is no paid support tier.

Contributing

See CONTRIBUTING.md.

Citation

If you use TerraFlow in your research, please cite our JOSS paper (manuscript in preparation).

License

MIT License — free for academic, commercial, and open-source use.
