AI-Hydro: hydrological research tools as an MCP server for AI agents
aihydro-tools
Stop writing plumbing. Give AI agents real hydrological superpowers.
What is aihydro-tools?
aihydro-tools is the Python backbone of the AI-Hydro platform. It turns a conversation with an AI agent into real hydrological computation — watershed delineation, streamflow retrieval, signature extraction, terrain analysis, and model calibration — with full structured provenance recorded automatically at every step.
Tools are exposed via the Model Context Protocol (MCP), an open standard for agent-tool communication. Any agent running in an MCP-capable client, whether backed by Claude, GPT, or Gemini, can call these tools directly, so you do not have to write processing code yourself. Because aihydro-tools is built as a community platform, any researcher can register domain-specific tools (flood frequency, sediment transport, groundwater, remote sensing) via Python entry points, extending the ecosystem without touching the core.
Quick Start
# Install (quote the extras so shells such as zsh do not expand the brackets)
pip install "aihydro-tools[all]"
# Verify
aihydro-mcp --diagnose
# Run the server
aihydro-mcp
The AI-Hydro VS Code extension auto-detects aihydro-mcp on startup — no manual configuration needed.
Built-in Tools
Analysis Tools
| Category | Tool | Description |
|---|---|---|
| Watershed | delineate_watershed | NHDPlus watershed delineation from USGS NLDI given a gauge ID |
| Streamflow | fetch_streamflow_data | Daily discharge time series from USGS NWIS |
| Signatures | extract_hydrological_signatures | 15+ flow statistics: BFI, runoff ratio, FDC percentiles, recession constants |
| Geomorphic | extract_geomorphic_parameters | 28 basin morphometry metrics (area, slope, elevation, shape factors) |
| Terrain | compute_twi | Topographic Wetness Index from 3DEP 10 m DEM |
| Curve Number | create_cn_grid | NRCS Curve Number grid from NLCD land cover + POLARIS soils |
| Forcing | fetch_forcing_data | Basin-averaged GridMET climate data (prcp, tmax, tmin, PET, srad, wind) |
| CAMELS | extract_camels_attributes | Full CAMELS-US attribute set (671 gauges) via pygeohydro |
| Modelling | train_hydro_model | Differentiable HBV-light (PyTorch) or NeuralHydrology LSTM |
| Modelling | get_model_results | Retrieve cached model performance (NSE, KGE, RMSE) |
| Session | start_session | Initialize or resume a per-gauge research session |
| Session | get_session_summary | Overview of computed and pending analysis slots |
| Session | clear_session | Reset cached results to force re-computation |
| Session | add_note | Attach research notes to the session |
| Session | export_session | Export a reproducible research capsule with data, figures, methods, and environment |
| Session | get_session_raw_state | Retrieve raw computed results for LLM interpretation (Phase 1 of two-phase split) |
| Session | write_research_interpretation | Store LLM-authored scientific interpretation (Phase 2 of two-phase split) |
| Session | archive_session | Archive completed session to a timestamped ZIP |
| Session | merge_session_shards | Merge parallel sub-agent shards into the main session |
| Ledger | add_claim | Add a scoped scientific claim to the session ledger |
| Ledger | update_claim_status | Update the status/confidence of an existing claim |
| Ledger | list_claims | List all claims in the session (filterable by status) |
| Ledger | add_assumption | Record a scientific assumption or caveat |
| Ledger | list_assumptions | List all assumptions in the session |
| Ledger | promote_claim_to_registry | Promote a validated claim to the global knowledge registry |
| Ledger | draft_claim_from_run | Auto-draft a claim pre-filled with evidence from a Tier 1 tool run |
| Validators | check_water_balance_consistency | Flag mass-balance violations in signature + streamflow data |
| Validators | check_temporal_alignment | Verify forcing and streamflow cover the same time window |
| Validators | check_unit_consistency | Confirm a session slot carries the expected physical units |
| Visualization | show_on_map | Push any GeoJSON geometry onto the AI-Hydro map panel |
| Discover | list_available_tools | Enumerate all installed tools, including community plugins |
| Discover | list_skills | List available workflow playbooks by domain |
| Discover | load_skill | Load a workflow playbook for a multi-step analysis |
| Discover | get_library_reference | Retrieve an API idiom card for a Python library |
| Discover | list_relevant_clis | List relevant external CLI tools |
| Discover | get_variable_definition | Look up a hydrology variable by ID (units, aliases, notes) |
| Discover | get_metric_definition | Look up a performance metric by ID |
| Discover | get_dataset_definition | Look up a dataset by ID (provider, resolution, variables) |
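The scores that get_model_results reports (NSE, KGE, RMSE) have standard textbook definitions. The sketch below computes them with the standard library only, assuming the common 2009 form of KGE. The function names and sample values are illustrative and are not part of aihydro-tools, whose implementation may differ.

```python
import math
from statistics import fmean, pstdev


def rmse(obs, sim):
    """Root-mean-square error, in the units of the input series."""
    return math.sqrt(fmean((s - o) ** 2 for o, s in zip(obs, sim)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = fmean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def kge(obs, sim):
    """Kling-Gupta efficiency built from correlation, variability ratio, and bias ratio."""
    mean_obs, mean_sim = fmean(obs), fmean(sim)
    std_obs, std_sim = pstdev(obs), pstdev(sim)
    cov = fmean((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    r = cov / (std_obs * std_sim)      # linear correlation
    alpha = std_sim / std_obs          # variability ratio
    beta = mean_sim / mean_obs         # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up discharge values, for illustration only.
observed = [1.2, 3.4, 2.8, 5.1, 4.0]
simulated = [1.0, 3.0, 3.1, 4.8, 4.4]
print(rmse(observed, simulated), nse(observed, simulated), kge(observed, simulated))
```

All three functions expect equal-length series with the same units and no missing values; gap handling is left out of the sketch.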
Project, Literature & Researcher Memory
| Category | Tool | Description |
|---|---|---|
| Project | start_project | Create or resume a named research project spanning multiple gauges or topics |
| Project | get_project_summary | Overview of all gauges, journal entries, literature, and metrics in a project |
| Project | add_gauge_to_project | Associate a gauge session with the active project |
| Project | search_experiments | Full-text search across all gauge sessions in a project |
| Literature | index_literature | Scan a folder of PDF, txt, or md files and build a searchable text index |
| Literature | search_literature | Query the index and return excerpts for the agent to synthesise |
| Literature | add_journal_entry | Log a timestamped experiment note to the project journal |
| Persona | get_researcher_profile | Retrieve the persistent researcher profile (expertise, focus, preferences) |
| Persona | update_researcher_profile | Update profile fields (agent- or researcher-driven) |
| Persona | log_researcher_observation | Record an observation about the researcher's evolving interests and methods |
Community plugins can add further tools via the entry-point system (see below).
Data Sources
All data is fetched from authoritative public sources, most of them U.S. federal agencies:
- USGS NWIS — daily streamflow via dataretrieval (official USGS Python client)
- NHDPlus / NLDI — watershed delineation via pynhd
- GridMET — climate forcing via pygridmet
- 3DEP — DEM and terrain analysis via py3dep
- NLCD — land cover classification
- POLARIS — soil properties
- CAMELS-US — catchment attributes via pygeohydro
Memory & Provenance
AI-Hydro maintains a three-tier memory hierarchy so research context survives between conversations, sessions, and projects.
HydroSession — per-gauge state at ~/.aihydro/sessions/<gauge_id>.json. Expensive computations (watershed delineation, multi-year streamflow downloads) are done once and reused across days or weeks. Every result carries structured provenance metadata — data source, parameters, timestamp — making reproducibility a natural byproduct rather than a documentation chore.
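The compute-once, reuse-later behaviour described for HydroSession can be pictured as a JSON-backed cache that stores provenance next to each result. The sketch below is a minimal illustration under stated assumptions: the helper cached_call, the slot layout, and the provenance field names are hypothetical and do not reflect the actual session file schema. It writes to a temporary directory rather than ~/.aihydro.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def cached_call(session_file, slot, compute, source, params):
    """Return the cached slot if present; otherwise compute, store, and return it."""
    state = json.loads(session_file.read_text()) if session_file.exists() else {}
    if slot not in state:
        state[slot] = {
            "data": compute(**params),
            "meta": {  # hypothetical provenance fields
                "source": source,
                "params": params,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            },
        }
        session_file.write_text(json.dumps(state, indent=2))
    return state[slot]


calls = []


def fake_streamflow(start_date, end_date):
    """Stand-in for an expensive download; records every invocation."""
    calls.append((start_date, end_date))
    return {"dates": ["2015-01-01"], "q_cms": [12.3]}


session_file = Path(tempfile.mkdtemp()) / "01031500.json"
params = {"start_date": "2015-01-01", "end_date": "2015-01-01"}
first = cached_call(session_file, "streamflow", fake_streamflow, "USGS NWIS", params)
second = cached_call(session_file, "streamflow", fake_streamflow, "USGS NWIS", params)
print(f"compute function ran {len(calls)} time(s)")
```

The second call is served from the file, so the stand-in download runs only once and both calls return the same data and provenance.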
ProjectSession — project-scoped state at ~/.aihydro/projects/<name>/project.json. Organises research across multiple gauges, topics, or datasets. Supports cross-session experiment search, a timestamped journal, and literature indexing.
ResearcherProfile — a persistent persona at ~/.aihydro/researcher.json. Built up from agent-researcher interactions over time: expertise areas, preferred models, active projects, and accumulated observations. Injected into every conversation automatically so the agent knows who it is working with.
Scientific Trust
AI-Hydro goes beyond computation — it records why results should be believed and what remains uncertain.
Tool Tier System
Every tool is assigned to one of three evidence tiers (machine-readable via get_tool_tier(name)):
| Tier | Label | Automatic enforcement |
|---|---|---|
| 1 | Scientific output | quality_flags injected into every result; _run_id minted for evidence binding |
| 2 | Workflow / data | No automatic enforcement; best-effort provenance |
| 3 | Infrastructure | No validation load; session plumbing only |
Tier 1 tools (watershed delineation, signatures, TWI, CN, model training) fire registered post-run validators automatically. Every Tier 1 result carries:
- quality_flags: list of validator outcomes (pass / warning / fail, with severity)
- _run_id: stable evidence-binding key for linking claims to specific runs
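A consumer of a Tier 1 result will usually want the most severe validator outcome before trusting the numbers. The sketch below shows one way to summarise quality_flags. The per-flag keys (validator, status, message) and the run ID value are assumptions for illustration; only the names quality_flags and _run_id and the pass / warning / fail outcomes come from the description above.

```python
# Rank outcomes so the most severe one can be picked with max().
SEVERITY_ORDER = {"pass": 0, "warning": 1, "fail": 2}


def worst_outcome(result):
    """Return the most severe status among a result's quality flags ("pass" if none)."""
    flags = result.get("quality_flags", [])
    statuses = (flag["status"] for flag in flags)
    return max(statuses, key=SEVERITY_ORDER.__getitem__, default="pass")


# Hypothetical Tier 1 result; the flag layout is an assumption.
example_result = {
    "_run_id": "run-0001",
    "quality_flags": [
        {
            "validator": "check_water_balance_consistency",
            "status": "warning",
            "message": "runoff ratio is close to 1",
        },
        {"validator": "check_unit_consistency", "status": "pass"},
    ],
}

print(example_result["_run_id"], worst_outcome(example_result))
```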
Scientific Ledger
Results become actionable knowledge through the ledger:
- draft_claim_from_run(session_id, run_id, metric_ref): reads the run log for any Tier 1 tool call and returns a claim template with evidence_spans pre-populated. The agent authors only the scientific interpretation.
- add_claim(..., evidence_spans=[...]): records the claim. EvidenceSpan ties the claim to a run, paper, or dataset with typed attribution (source_type, source_id, metric_ref).
- promote_claim_to_registry(..., researcher_approved=True): the claim must pass a promotion gate that requires at least one evidence_span, at least one limitation, and a status of supported or weakly_supported. Researcher approval is required.
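The promotion gate described above can be read as four independent checks. The sketch below expresses them as a plain function over a claim dictionary. The dictionary keys (evidence_spans, limitations, status) and the helper name promotion_blockers are assumptions made for illustration, not the library's data model.

```python
ALLOWED_STATUSES = {"supported", "weakly_supported"}


def promotion_blockers(claim, researcher_approved):
    """Return the reasons a claim fails the gate; an empty list means it passes."""
    blockers = []
    if not claim.get("evidence_spans"):
        blockers.append("needs at least one evidence span")
    if not claim.get("limitations"):
        blockers.append("needs at least one limitation")
    if claim.get("status") not in ALLOWED_STATUSES:
        blockers.append("status must be supported or weakly_supported")
    if not researcher_approved:
        blockers.append("researcher approval is required")
    return blockers


# Hypothetical claim; the evidence span uses the typed attribution fields named above.
good_claim = {
    "status": "supported",
    "evidence_spans": [
        {"source_type": "run", "source_id": "run-0001", "metric_ref": "metric.kge"}
    ],
    "limitations": ["Single gauge, ten-year record"],
}

print(promotion_blockers(good_claim, researcher_approved=True))
print(promotion_blockers({"status": "draft"}, researcher_approved=False))
```

Returning the full list of blockers, rather than stopping at the first failure, lets an agent report everything that still has to be fixed in one pass.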
Verified Knowledge
Built-in knowledge entries can be marked verified: true in the YAML registry (e.g., metric.kge, variable.streamflow, dataset.usgs_nwis). A workspace override file that changes a verified entry must supply scientific_justification in addition to overrides and override_reason, so that any departure from a peer-reviewed convention is deliberate and documented.
Retrieve all verified entries programmatically:
from ai_hydro.knowledge.loader import get_verified_knowledge
verified = get_verified_knowledge() # {"variables": [...], "metrics": [...], "datasets": [...]}
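The stricter rule for overriding verified entries can be sketched as a small check. This is an illustration under assumptions: the helper override_problems and the in-memory dictionaries are hypothetical stand-ins for the YAML registry and workspace override files, and the example override content is made up.

```python
def override_problems(entry, override):
    """List the fields an override is missing for the given registry entry."""
    required = ["overrides", "override_reason"]
    if entry.get("verified"):
        # Verified entries additionally demand a documented scientific rationale.
        required.append("scientific_justification")
    return [field for field in required if not override.get(field)]


# Hypothetical registry entry and override.
verified_entry = {"id": "metric.kge", "verified": True}
override = {
    "overrides": {"notes": "Use the modified 2012 formulation"},
    "override_reason": "Project convention",
}

print(override_problems(verified_entry, override))
```

For an unverified entry the same override would pass, because scientific_justification is only demanded when verified is true.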
Quality Assurance
Automatic Post-Run Validation
Tier 1 tools fire registered validators automatically (no agent action needed). Currently active wiring:
- extract_hydrological_signatures → check_water_balance_consistency
- fetch_streamflow_data → check_unit_consistency (expected: m³/s)
Validators never raise — failures appear in quality_flags without crashing the tool. Register additional validators via:
from ai_hydro.mcp.enforcement import register_post_validator
register_post_validator("my_tool", my_validator_fn, lambda sid: {"session_id": sid})
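To make the "validators never raise" behaviour concrete, the sketch below shows a toy registry that mirrors the three-argument shape of the call above (tool name, validator, keyword-argument builder) and converts any validator exception into a fail flag. It is an assumption-laden stand-in, not the code in ai_hydro.mcp.enforcement: the names register, run_post_validators, and the flag keys are hypothetical.

```python
# Toy registry: tool name -> list of (validator, kwargs builder) pairs.
_POST_VALIDATORS = {}


def register(tool_name, validator, build_kwargs):
    """Attach a validator to a tool; build_kwargs maps a session ID to keyword arguments."""
    _POST_VALIDATORS.setdefault(tool_name, []).append((validator, build_kwargs))


def run_post_validators(tool_name, session_id):
    """Run every validator for a tool, turning exceptions into 'fail' flags."""
    flags = []
    for validator, build_kwargs in _POST_VALIDATORS.get(tool_name, []):
        try:
            outcome = validator(**build_kwargs(session_id))
        except Exception as exc:  # a broken validator must not crash the tool
            outcome = {"status": "fail", "message": f"validator error: {exc!r}"}
        flags.append({"validator": validator.__name__, **outcome})
    return flags


def check_session_id_format(session_id):
    """Hypothetical validator: warn when the ID is not an 8-digit gauge number."""
    ok = session_id.isdigit() and len(session_id) == 8
    return {"status": "pass" if ok else "warning", "message": f"session {session_id}"}


def broken_validator(session_id):
    """Hypothetical validator that always fails with an exception."""
    raise KeyError("missing slot")


register("my_tool", check_session_id_format, lambda sid: {"session_id": sid})
register("my_tool", broken_validator, lambda sid: {"session_id": sid})

flags = run_post_validators("my_tool", "01031500")
print([(f["validator"], f["status"]) for f in flags])
```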
aihydro-bench
A deterministic fixture benchmark suite (bench/tasks.yaml, ~26 tasks) verifies every core computation path without live network calls. Tasks span all tiers:
- Group A–C: validators, compute functions, and their edge cases
- Group D–G: ledger gates, knowledge registry guards, conflict resolution
- Group H–J: enforcement layer, verified namespace, claim coupling (draft → add)
Run it:
pytest tests/test_bench.py -m bench -v # fast fixture suite (no network)
pytest tests/test_bench.py -m bench_live -v # live USGS calls (nightly CI only)
Extending with Plugins
aihydro-tools is a platform, not a closed product. Any researcher can package domain knowledge as a plugin and make it immediately available to every AI agent that uses AI-Hydro — flood frequency analysis, sediment transport, groundwater modelling, remote sensing workflows, or anything else the core doesn't yet cover.
Entry-point plugins load into the same process with full access to HydroSession and cached data:
# In your package's pyproject.toml
[project.entry-points."aihydro.tools"]
my_tool = "my_package.tools:my_tool_function"
Install the package, restart the server, and the tool is automatically discovered — no changes to the core required.
Standalone MCP servers let you build fully independent toolkits with their own dependencies, registered alongside the core ai-hydro server.
See the Plugin Guide for complete walkthroughs of both paths, the data contract, and session integration.
Use as a Python Library
You don't need an AI agent to benefit from aihydro-tools. Every tool is a regular Python function — import and call directly in scripts, notebooks, or pipelines:
from ai_hydro.analysis.watershed import delineate_watershed
from ai_hydro.data.streamflow import fetch_streamflow_data
from ai_hydro.analysis.signatures import extract_hydrological_signatures
# Delineate a watershed
ws = delineate_watershed("01031500")
print(f"Watershed area: {ws.data['area_km2']} km2")
# Fetch streamflow
sf = fetch_streamflow_data("01031500", start_date="2015-01-01", end_date="2024-12-31")
print(f"Records: {len(sf.data['dates'])} days")
# Extract signatures
sigs = extract_hydrological_signatures("01031500")
print(f"Baseflow index: {sigs.data['baseflow_index']}")
All functions return HydroResult objects with .data (dict) and .meta (provenance metadata).
Installation Details
Extras
Install only what you need:
| Extra | What it adds | Install command |
|---|---|---|
| [data] | Streamflow, forcing, land cover, soil, CAMELS retrieval | pip install aihydro-tools[data] |
| [analysis] | Watershed, signatures, TWI, geomorphic, curve number | pip install aihydro-tools[analysis] |
| [modelling] | PyTorch differentiable HBV-light, NeuralHydrology LSTM | pip install aihydro-tools[modelling] |
| [viz] | Matplotlib, Plotly, Folium visualisations | pip install aihydro-tools[viz] |
| [all] | Everything above | pip install aihydro-tools[all] |
PATH Troubleshooting
If aihydro-mcp is not found after install, pip most likely placed the script in a directory that is not on your PATH. Typical locations:
| OS | Typical location |
|---|---|
| Windows (user) | %APPDATA%\Python\Python3XX\Scripts\aihydro-mcp.exe |
| Windows (system) | C:\Python3XX\Scripts\aihydro-mcp.exe |
| macOS/Linux (user) | ~/.local/bin/aihydro-mcp |
| macOS/Linux (system) | /usr/local/bin/aihydro-mcp |
| Conda | ~/miniconda3/bin/aihydro-mcp or ~/anaconda3/bin/aihydro-mcp |
Universal fallback: python -m ai_hydro.mcp works regardless of PATH. The AI-Hydro extension auto-detects both the console script and the module fallback.
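A launcher script can apply the same two-step lookup: prefer the console script when it is on PATH, otherwise fall back to the module form. The sketch below only builds the command; it does not start the server. The helper name server_command is hypothetical, and this is not the extension's detection code.

```python
import shutil
import sys


def server_command():
    """Return the argv list for starting the server, preferring the console script."""
    script = shutil.which("aihydro-mcp")
    if script:
        return [script]
    # Module fallback: works whenever the package is importable by this interpreter.
    return [sys.executable, "-m", "ai_hydro.mcp"]


command = server_command()
print(" ".join(command))
```

Pass the resulting list to subprocess.run or to your MCP client's server configuration as needed.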
Plugin Entry Points
AI-Hydro uses Python entry points for a clean plugin system. Community packages can contribute any of four capability layers without modifying the core:
| Entry-point group | Contributes | Served by |
|---|---|---|
| aihydro.tools | MCP tool functions | list_available_tools() |
| aihydro.knowledge | Library reference cards (JSON) | get_library_reference() |
| aihydro.skills | Workflow playbooks (SKILL.md) | list_skills() / load_skill() |
| aihydro.clis | CLI descriptor (binary + help) | list_relevant_clis() |
Example: add a custom tool
# my_hydro_pkg/tools.py
from typing import Optional

from ai_hydro.core.types import HydroResult, HydroMeta, DataSource


def compute_soil_moisture(session_id: str, workspace_dir: Optional[str] = None) -> dict:
    """Estimate soil moisture from session forcing data."""
    # ... your computation ...
    result = HydroResult(data={...}, meta=HydroMeta(tool="compute_soil_moisture", ...))
    # to_dict() converts the result to a plain dict, matching the declared return type.
    return result.to_dict()
# pyproject.toml
[project.entry-points."aihydro.tools"]
compute_soil_moisture = "my_hydro_pkg.tools:compute_soil_moisture"
After pip install my-hydro-pkg, the tool appears automatically in list_available_tools() on the next MCP server restart — no changes to aihydro-tools required.
Example: add a knowledge card
[project.entry-points."aihydro.knowledge"]
my_lib = "my_hydro_pkg.knowledge:get_refs_dir"
where get_refs_dir() returns a Path to a directory of *.json cards (same schema as the built-in cards in ai_hydro/knowledge/library_refs/).
Example: add a workflow skill
[project.entry-points."aihydro.skills"]
my_skills = "my_hydro_pkg.skills:get_skills_dir"
where get_skills_dir() returns a Path to a directory of *.md skill files (YAML frontmatter + markdown body). Skills appear in list_skills() and are loadable via load_skill(name).
Example: advertise a CLI
[project.entry-points."aihydro.clis"]
my_tool = "my_hydro_pkg.aihydro.cli_descriptor:descriptor"
where descriptor() returns {name, binary, description, help_subcommand}. The CLI appears in list_relevant_clis().
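A descriptor function can be as small as the sketch below. Only the four keys come from the description above; the values, the example binary name my-tool, and the availability check are illustrative assumptions.

```python
import shutil


def descriptor():
    """Describe a hypothetical CLI using the four keys listed above."""
    return {
        "name": "my_tool",
        "binary": "my-tool",  # hypothetical executable name
        "description": "Example CLI that post-processes hydrographs.",
        "help_subcommand": "--help",
    }


info = descriptor()
# Optional sanity check: is the advertised binary actually installed?
is_installed = shutil.which(info["binary"]) is not None
print(info["name"], "installed" if is_installed else "not on PATH")
```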
See the Plugin Guide for full walkthroughs.
Contributing
The most impactful contributions to AI-Hydro are new domain tools — knowledge that currently lives in papers and custom scripts, packaged so any AI agent can use it. High-priority areas include flood frequency analysis, sediment transport, groundwater modelling, remote sensing workflows (MODIS, Landsat, SAR), snow hydrology, and water quality.
You don't need to fork the core. Write a Python package, register an entry point, publish to PyPI. That's it.
- Contributing Guide — Development setup, code style, testing
- Plugin Guide — Step-by-step walkthroughs for both contribution paths
Citation
If you use aihydro-tools in your research, please cite:
@software{aihydro_tools_2026,
title = {aihydro-tools: Python MCP Server for AI-Automated
Hydrological Research},
author = {Galib, Mohammad and Merwade, Venkatesh},
year = {2026},
version = {1.6.0},
doi = {10.5281/zenodo.19597589},
url = {https://doi.org/10.5281/zenodo.19597589}
}
For the VS Code extension, cite:
@software{aihydro_extension_2026,
title = {AI-Hydro: An Open Platform for End-to-End AI-Automated
Hydrological Research (VS Code Extension)},
author = {Galib, Mohammad and Merwade, Venkatesh},
year = {2026},
version = {0.1.3},
doi = {10.5281/zenodo.19597664},
url = {https://doi.org/10.5281/zenodo.19597664}
}
Links
- Documentation: ai-hydro.github.io/AI-Hydro
- AI-Hydro Extension: github.com/AI-Hydro/AI-Hydro
- PyPI: pypi.org/project/aihydro-tools
- YouTube: AI-Hydro Channel
- X / Twitter: @aihydro
- Issues: github.com/AI-Hydro/aihydro-tools/issues
License
Apache 2.0 © 2026 Mohammad Galib