AI-Hydro: hydrological research tools as an MCP server for AI agents

aihydro-tools

AI-Hydro — Intelligent Hydrological Computing

Stop writing plumbing. Give AI agents real hydrological superpowers.

What is aihydro-tools?

aihydro-tools is the Python backbone of the AI-Hydro platform. It turns a conversation with an AI agent into real hydrological computation — watershed delineation, streamflow retrieval, signature extraction, terrain analysis, and model calibration — with full structured provenance recorded automatically at every step.

144 validated, tiered tools are exposed via the Model Context Protocol (MCP), the open standard for agent-tool communication. Any AI model that supports MCP — Claude, GPT, Gemini — can call these tools directly, without writing a single line of processing code. And because aihydro-tools is built as a community platform, any researcher can register domain-specific tools (flood frequency, sediment transport, groundwater, remote sensing) via Python entry points, extending the ecosystem without touching the core.

Quick Start

# Install (the quotes stop shells such as zsh from expanding the brackets)
pip install "aihydro-tools[all]"

# Verify
aihydro-mcp --diagnose

# Run the server
aihydro-mcp

The AI-Hydro VS Code extension auto-detects aihydro-mcp on startup — no manual configuration needed.

Built-in Tools

Analysis Tools

Category	Tool	Description
Watershed	`delineate_watershed`	NHDPlus watershed delineation from USGS NLDI given a gauge ID
Streamflow	`fetch_streamflow_data`	Daily discharge time series from USGS NWIS
Signatures	`extract_hydrological_signatures`	15+ flow statistics: BFI, runoff ratio, FDC percentiles, recession constants
Geomorphic	`extract_geomorphic_parameters`	28 basin morphometry metrics (area, slope, elevation, shape factors)
Terrain	`compute_twi`	Topographic Wetness Index from 3DEP 10m DEM
Curve Number	`create_cn_grid`	NRCS Curve Number grid from NLCD land cover + POLARIS soils
Forcing	`fetch_forcing_data`	Basin-averaged GridMET climate data (prcp, tmax, tmin, PET, srad, wind)
CAMELS	`extract_camels_attributes`	Full CAMELS-US attribute set (671 gauges) via pygeohydro
Modelling	`train_hydro_model`	Differentiable HBV-light or NeuralHydrology zoo (LSTM · EA-LSTM · hybrid · forecast) with bootstrap CI
Modelling	`get_model_results`	Retrieve cached model performance (NSE, KGE, RMSE)
Modelling	`describe_model_space`	Introspect the full knob schema + per-backend availability
Modelling	`propose_and_train`	Validate a `ModelSpec` and train returning provenance-stamped `HydroResult`
Modelling	`run_autoresearch`	Run a CI-aware autoresearch episode (propose → train → paired-difference CI → keep/discard)
Modelling	`get_leaderboard`	Retrieve the autoresearch leaderboard with defensibility-annotated episode summary
Session	`start_session`	Initialize or resume a per-gauge research session
Session	`get_session_summary`	Overview of computed and pending analysis slots
Session	`clear_session`	Reset cached results to force re-computation
Session	`add_note`	Attach research notes to the session
Session	`export_session`	Export a reproducible research capsule with data, figures, methods, and environment
Session	`get_session_raw_state`	Retrieve raw computed results for LLM interpretation (Phase 1 of two-phase split)
Session	`write_research_interpretation`	Store LLM-authored scientific interpretation (Phase 2 of two-phase split)
Session	`archive_session`	Archive completed session to a timestamped ZIP
Session	`merge_session_shards`	Merge parallel sub-agent shards into the main session
Ledger	`add_claim`	Add a scoped scientific claim to the session ledger
Ledger	`update_claim_status`	Update the status/confidence of an existing claim
Ledger	`list_claims`	List all claims in the session (filterable by status)
Ledger	`add_assumption`	Record a scientific assumption or caveat
Ledger	`list_assumptions`	List all assumptions in the session
Ledger	`promote_claim_to_registry`	Promote a validated claim to the global knowledge registry
Ledger	`draft_claim_from_run`	Auto-draft a claim pre-filled with evidence from a Tier 1 tool run
Validators	`check_water_balance_consistency`	Flag mass-balance violations in signature + streamflow data
Validators	`check_temporal_alignment`	Verify forcing and streamflow cover the same time window
Validators	`check_unit_consistency`	Confirm a session slot carries the expected physical units
Visualization	`show_on_map`	Push any GeoJSON geometry onto the AI-Hydro map panel
Discover	`list_available_tools`	Enumerate all installed tools, including community plugins
Discover	`list_skills`	List available workflow playbooks by domain
Discover	`load_skill`	Load a workflow playbook for a multi-step analysis
Discover	`get_library_reference`	Retrieve an API idiom card for a Python library
Discover	`list_relevant_clis`	List relevant external CLI tools
Discover	`get_variable_definition`	Look up a hydrology variable by ID (units, aliases, notes)
Discover	`get_metric_definition`	Look up a performance metric by ID
Discover	`get_dataset_definition`	Look up a dataset by ID (provider, resolution, variables)
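
For orientation, the Topographic Wetness Index listed in the table above is conventionally defined as ln(a / tan β), where a is the specific catchment area and β is the local slope. The sketch below applies that standard definition to a single cell. It is not the package's `compute_twi` implementation; the helper name `twi_from_slope` and the minimum-slope floor are assumptions made for illustration.

```python
import math


def twi_from_slope(specific_catchment_area_m: float, slope_deg: float,
                   min_slope_deg: float = 0.1) -> float:
    """Standard TWI = ln(a / tan(beta)). Hypothetical helper, not the aihydro-tools API."""
    # A perfectly flat cell gives tan(beta) = 0, so clamp the slope to a small
    # floor first (the 0.1 degree default is an assumed value).
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area_m / math.tan(beta))
```

Larger contributing area and gentler slope both raise the index, which is why valley bottoms score higher than ridges.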

Project, Literature & Researcher Memory

Category	Tool	Description
Project	`start_project`	Create or resume a named research project spanning multiple gauges or topics
Project	`get_project_summary`	Overview of all gauges, journal entries, literature, and metrics in a project
Project	`add_gauge_to_project`	Associate a gauge session with the active project
Project	`search_experiments`	Full-text search across all gauge sessions in a project
Literature	`index_literature`	Scan a folder of PDF, txt, or md files and build a searchable text index
Literature	`search_literature`	Query the index and return excerpts for the agent to synthesise
Literature	`add_journal_entry`	Log a timestamped experiment note to the project journal
Persona	`get_researcher_profile`	Retrieve the persistent researcher profile (expertise, focus, preferences)
Persona	`update_researcher_profile`	Update profile fields — agent or researcher driven
Persona	`log_researcher_observation`	Record an observation about the researcher's evolving interests and methods

Community plugins can add further tools via the entry-point system (see below).

Data Sources

All data is fetched from authoritative federal sources:

USGS NWIS — daily streamflow via dataretrieval (official USGS Python client)
NHDPlus / NLDI — watershed delineation via pynhd
GridMET — climate forcing via pygridmet
3DEP — DEM and terrain analysis via py3dep
NLCD — land cover classification
POLARIS — soil properties
CAMELS-US — catchment attributes via pygeohydro

Memory & Provenance

AI-Hydro maintains a three-tier memory hierarchy so research context survives between conversations, sessions, and projects.

HydroSession — per-gauge state at ~/.aihydro/sessions/<gauge_id>.json. Expensive computations (watershed delineation, multi-year streamflow downloads) are done once and reused across days or weeks. Every result carries structured provenance metadata — data source, parameters, timestamp — making reproducibility a natural byproduct rather than a documentation chore.

ProjectSession — project-scoped state at ~/.aihydro/projects/<name>/project.json. Organises research across multiple gauges, topics, or datasets. Supports cross-session experiment search, a timestamped journal, and literature indexing.

ResearcherProfile — a persistent persona at ~/.aihydro/researcher.json. Built up from agent-researcher interactions over time: expertise areas, preferred models, active projects, and accumulated observations. Injected into every conversation automatically so the agent knows who it is working with.
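
The three storage locations described above can be resolved with a few lines of standard-library code. This is a minimal sketch: the helper names are hypothetical, and because the JSON schema of these files is not documented here, the loader only parses the file and does not interpret its contents.

```python
import json
from pathlib import Path
from typing import Optional

# Root directory named in the text above.
AIHYDRO_HOME = Path.home() / ".aihydro"


def session_path(gauge_id: str, home: Path = AIHYDRO_HOME) -> Path:
    """Per-gauge HydroSession file."""
    return home / "sessions" / f"{gauge_id}.json"


def project_path(name: str, home: Path = AIHYDRO_HOME) -> Path:
    """Project-scoped ProjectSession file."""
    return home / "projects" / name / "project.json"


def researcher_path(home: Path = AIHYDRO_HOME) -> Path:
    """Persistent ResearcherProfile file."""
    return home / "researcher.json"


def load_json_if_present(path: Path) -> Optional[dict]:
    """Return the parsed JSON, or None when the file has not been created yet."""
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `load_json_if_present(session_path("01031500"))` returns `None` until a session for that gauge has been started.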

Scientific Trust

AI-Hydro goes beyond computation — it records why results should be believed and what remains uncertain.

Tool Tier System

Every tool is assigned to one of three evidence tiers (machine-readable via get_tool_tier(name)):

Tier	Label	Automatic enforcement
1	Scientific output	`quality_flags` injected into every result; `_run_id` minted for evidence binding
2	Workflow / data	No automatic enforcement; best-effort provenance
3	Infrastructure	No validation load; session plumbing only

Tier 1 tools (watershed delineation, signatures, TWI, CN, model training) fire registered post-run validators automatically. Every Tier 1 result carries:

quality_flags — list of validator outcomes (pass/warning/fail with severity)
_run_id — stable evidence-binding key for linking claims to specific runs
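
A consumer of Tier 1 results might summarise these two fields as sketched below. The exact shape of each flag is not specified in this document, so the `status` key used here is an assumption, as is the helper name `summarize_quality_flags`.

```python
from collections import Counter


def summarize_quality_flags(result: dict) -> dict:
    """Count validator outcomes in a result dict (flag field names are assumed)."""
    flags = result.get("quality_flags", [])
    # Each flag is assumed to be a dict with a "status" of pass / warning / fail.
    counts = Counter(flag.get("status", "unknown") for flag in flags)
    return {
        "run_id": result.get("_run_id"),
        "counts": dict(counts),
        "ok": counts.get("fail", 0) == 0,  # no failed validator
    }
```

Keeping the `_run_id` next to the counts makes it easy to cite the exact run when a claim is later drafted from it.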

Scientific Ledger

Results become actionable knowledge through the ledger:

draft_claim_from_run(session_id, run_id, metric_ref) — reads the run log for any Tier 1 tool call and returns a claim template with evidence_spans pre-populated. The agent authors only the scientific interpretation.
add_claim(..., evidence_spans=[...]) — records the claim. EvidenceSpan ties the claim to a run, paper, or dataset with typed attribution (source_type, source_id, metric_ref).
promote_claim_to_registry(..., researcher_approved=True) — passes a promotion gate: at least one evidence_span, at least one limitation, and status supported or weakly_supported. Researcher approval is required.
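
The promotion gate described in the last item can be pictured as a simple pre-flight check. The sketch below restates the four stated conditions in plain Python; it is not the library's implementation, and the claim dictionary keys (`evidence_spans`, `limitations`, `status`) are assumed names for illustration.

```python
PROMOTABLE_STATUSES = {"supported", "weakly_supported"}


def promotion_blockers(claim: dict, researcher_approved: bool) -> list[str]:
    """Return the reasons a claim would not pass the gate (empty list = eligible)."""
    blockers = []
    if not claim.get("evidence_spans"):
        blockers.append("needs at least one evidence_span")
    if not claim.get("limitations"):
        blockers.append("needs at least one limitation")
    if claim.get("status") not in PROMOTABLE_STATUSES:
        blockers.append("status must be supported or weakly_supported")
    if not researcher_approved:
        blockers.append("researcher approval is required")
    return blockers
```

Returning every blocker at once, rather than stopping at the first, lets an agent fix a draft claim in a single pass.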

Verified Knowledge

Built-in knowledge entries can be marked verified: true in the YAML registry (e.g., metric.kge, variable.streamflow, dataset.usgs_nwis). A workspace override file that changes a verified entry must supply scientific_justification in addition to overrides and override_reason, so that any departure from a peer-reviewed convention is deliberate and documented.

Retrieve all verified entries programmatically:

from ai_hydro.knowledge.loader import get_verified_knowledge
verified = get_verified_knowledge()  # {"variables": [...], "metrics": [...], "datasets": [...]}

Quality Assurance

Automatic Post-Run Validation

Tier 1 tools fire registered validators automatically (no agent action needed). Currently active wiring:

extract_hydrological_signatures → check_water_balance_consistency
fetch_streamflow_data → check_unit_consistency (expected: m³/s)

Validators never raise — failures appear in quality_flags without crashing the tool. Register additional validators via:

from ai_hydro.mcp.enforcement import register_post_validator
register_post_validator("my_tool", my_validator_fn, lambda sid: {"session_id": sid})
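
The "validators never raise" behaviour can be illustrated with a small wrapper. This is a sketch under stated assumptions: the real validator signature expected by `register_post_validator` is not shown in this document, so `run_validator_safely`, `runoff_ratio_check`, and the flag fields below are all hypothetical.

```python
from typing import Callable


def run_validator_safely(name: str, validator: Callable[..., dict], **kwargs) -> dict:
    """Run a validator and turn any exception into a 'fail' flag instead of raising."""
    try:
        flag = validator(**kwargs)
    except Exception as exc:  # deliberate catch-all: a validator must not crash the tool
        return {"validator": name, "status": "fail", "severity": "error",
                "message": f"validator crashed: {exc!r}"}
    flag.setdefault("validator", name)
    return flag


def runoff_ratio_check(precip_mm: float, runoff_mm: float) -> dict:
    """Toy mass-balance check: long-term runoff should not exceed precipitation."""
    ratio = runoff_mm / precip_mm  # raises ZeroDivisionError when precip_mm == 0
    status = "pass" if ratio <= 1.0 else "warning"
    return {"status": status, "message": f"runoff ratio = {ratio:.2f}"}
```

Calling `run_validator_safely("water_balance", runoff_ratio_check, precip_mm=0.0, runoff_mm=10.0)` yields a flag with status `fail` rather than an exception, which mirrors how failures are described as surfacing in quality_flags.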

aihydro-bench

A deterministic fixture benchmark suite (bench/tasks.yaml, ~26 tasks) verifies every core computation path without live network calls. Tasks span all tiers:

Group A–C: validators, compute functions, and their edge cases
Group D–G: ledger gates, knowledge registry guards, conflict resolution
Group H–J: enforcement layer, verified namespace, claim coupling (draft → add)

Run it:

pytest tests/test_bench.py -m bench -v       # fast fixture suite (no network)
pytest tests/test_bench.py -m bench_live -v  # live USGS calls (nightly CI only)

Extending with Plugins

aihydro-tools is a platform, not a closed product. Any researcher can package domain knowledge as a plugin and make it immediately available to every AI agent that uses AI-Hydro — flood frequency analysis, sediment transport, groundwater modelling, remote sensing workflows, or anything else the core doesn't yet cover.

Entry-point plugins load into the same process with full access to HydroSession and cached data:

# In your package's pyproject.toml
[project.entry-points."aihydro.tools"]
my_tool = "my_package.tools:my_tool_function"

Install the package, restart the server, and the tool is automatically discovered — no changes to the core required.

Standalone MCP servers let you build fully independent toolkits with their own dependencies, registered alongside the core ai-hydro server.

See the Plugin Guide for complete walkthroughs of both paths, the data contract, and session integration.

Use as a Python Library

You don't need an AI agent to benefit from aihydro-tools. Every tool is a regular Python function — import and call directly in scripts, notebooks, or pipelines:

from ai_hydro.analysis.watershed import delineate_watershed
from ai_hydro.data.streamflow import fetch_streamflow_data
from ai_hydro.analysis.signatures import extract_hydrological_signatures

# Delineate a watershed
ws = delineate_watershed("01031500")
print(f"Watershed area: {ws.data['area_km2']} km2")

# Fetch streamflow
sf = fetch_streamflow_data("01031500", start_date="2015-01-01", end_date="2024-12-31")
print(f"Records: {len(sf.data['dates'])} days")

# Extract signatures
sigs = extract_hydrological_signatures("01031500")
print(f"Baseflow index: {sigs.data['baseflow_index']}")

All functions return HydroResult objects with .data (dict) and .meta (provenance metadata).

Installation Details

Extras

Install only what you need:

Extra	What it adds	Install command
`[data]`	Streamflow, forcing, land cover, soil, CAMELS retrieval	`pip install "aihydro-tools[data]"`
`[analysis]`	Watershed, signatures, TWI, geomorphic, curve number	`pip install "aihydro-tools[analysis]"`
`[modelling]`	HBV-light, NeuralHydrology zoo (LSTM · EA-LSTM · hybrid · forecast), CI-aware autoresearch loop, defensibility scoring	`pip install "aihydro-tools[modelling]"`
`[viz]`	Matplotlib, Plotly, Folium visualisations	`pip install "aihydro-tools[viz]"`
`[all]`	Everything above	`pip install "aihydro-tools[all]"`

PATH Troubleshooting

If aihydro-mcp is not found after install, pip most likely placed the console script in a directory that is not on your PATH. Typical locations:

OS	Typical location
Windows (user)	`%APPDATA%\Python\Python3XX\Scripts\aihydro-mcp.exe`
Windows (system)	`C:\Python3XX\Scripts\aihydro-mcp.exe`
macOS/Linux (user)	`~/.local/bin/aihydro-mcp`
macOS/Linux (system)	`/usr/local/bin/aihydro-mcp`
Conda	`~/miniconda3/bin/aihydro-mcp` or `~/anaconda3/bin/aihydro-mcp`

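Universal fallback: python -m ai_hydro.mcp does not depend on the console script being on PATH; it only needs the Python interpreter in which aihydro-tools is installed. The AI-Hydro extension auto-detects both the console script and the module fallback.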
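
The same two-step lookup is easy to reproduce in a launcher script. The sketch below is only an illustration of the idea, not the extension's actual detection code; `resolve_server_command` is a hypothetical helper.

```python
import shutil
import sys


def resolve_server_command(script: str = "aihydro-mcp") -> list[str]:
    """Prefer the console script if it is on PATH, else use the module fallback."""
    found = shutil.which(script)
    if found:
        return [found]
    # Fallback: run the module with the interpreter executing this code.
    return [sys.executable, "-m", "ai_hydro.mcp"]
```

The returned list can be passed straight to `subprocess.Popen` or `subprocess.run`. Using `sys.executable` ties the fallback to the interpreter that is running the launcher, so the launcher should be run from the environment where aihydro-tools is installed.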

Extending with Plugins

AI-Hydro uses Python entry points for a clean plugin system. Community packages can contribute any of four capability layers without modifying the core:

Entry-point group	Contributes	Served by
`aihydro.tools`	MCP tool functions	`list_available_tools()`
`aihydro.knowledge`	Library reference cards (JSON)	`get_library_reference()`
`aihydro.skills`	Workflow playbooks (SKILL.md)	`list_skills()` / `load_skill()`
`aihydro.clis`	CLI descriptor (binary + help)	`list_relevant_clis()`

Example: add a custom tool

# my_hydro_pkg/tools.py
from typing import Optional

from ai_hydro.core.types import HydroResult, HydroMeta, DataSource

def compute_soil_moisture(session_id: str, workspace_dir: Optional[str] = None) -> dict:
    """Estimate soil moisture from session forcing data."""
    # ... your computation ...
    # {...} and the trailing ... are placeholders for your data and the remaining metadata fields
    result = HydroResult(data={...}, meta=HydroMeta(tool="compute_soil_moisture", ...))
    return result.to_dict()

# pyproject.toml
[project.entry-points."aihydro.tools"]
compute_soil_moisture = "my_hydro_pkg.tools:compute_soil_moisture"

After pip install my-hydro-pkg, the tool appears automatically in list_available_tools() on the next MCP server restart — no changes to aihydro-tools required.

Example: add a knowledge card

[project.entry-points."aihydro.knowledge"]
my_lib = "my_hydro_pkg.knowledge:get_refs_dir"

where get_refs_dir() returns a Path to a directory of *.json cards (same schema as the built-in cards in ai_hydro/knowledge/library_refs/).

Example: add a workflow skill

[project.entry-points."aihydro.skills"]
my_skills = "my_hydro_pkg.skills:get_skills_dir"

where get_skills_dir() returns a Path to a directory of *.md skill files (YAML frontmatter + markdown body). Skills appear in list_skills() and are loadable via load_skill(name).
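
To make the "YAML frontmatter + markdown body" layout concrete, the sketch below splits a skill file into its two parts using only the standard library. It handles flat `key: value` pairs only, which is a simplification of real YAML, and it is not the loader used by `load_skill`; the function name and the frontmatter keys in the usage note are assumptions.

```python
def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split '---' delimited frontmatter from the markdown body (flat keys only)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at the top of the file
    # Index of the closing '---' line, searching after the opening one.
    end = next((i for i, line in enumerate(lines[1:], start=1)
                if line.strip() == "---"), None)
    if end is None:
        return {}, text  # unterminated frontmatter: treat everything as body
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body
```

A file starting with `---`, then `name: flood_frequency` and `domain: floods`, then `---` would yield those two keys as metadata and everything after the second delimiter as the body.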

Example: advertise a CLI

[project.entry-points."aihydro.clis"]
my_tool = "my_hydro_pkg.aihydro.cli_descriptor:descriptor"

where descriptor() returns {name, binary, description, help_subcommand}. The CLI appears in list_relevant_clis().
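
A descriptor function with the four keys listed above might look like the following sketch. Every value is a placeholder for a hypothetical CLI; in particular, whether `help_subcommand` expects a flag such as `--help` or a subcommand name is an assumption that should be checked against the Plugin Guide.

```python
# my_hydro_pkg/aihydro/cli_descriptor.py (hypothetical module)

def descriptor() -> dict:
    """Describe an external CLI so that it can be advertised to agents."""
    return {
        "name": "my-tool",            # display name (placeholder)
        "binary": "my-tool",          # executable expected on PATH (placeholder)
        "description": "Example CLI for a custom hydrology workflow.",
        "help_subcommand": "--help",  # assumed form of the help invocation
    }
```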

See the Plugin Guide for full walkthroughs.

Contributing

The most impactful contributions to AI-Hydro are new domain tools — knowledge that currently lives in papers and custom scripts, packaged so any AI agent can use it. High-priority areas include flood frequency analysis, sediment transport, groundwater modelling, remote sensing workflows (MODIS, Landsat, SAR), snow hydrology, and water quality.

You don't need to fork the core. Write a Python package, register an entry point, publish to PyPI. That's it.

Contributing Guide — Development setup, code style, testing
Plugin Guide — Step-by-step walkthroughs for both contribution paths

Citation

If you use aihydro-tools in your research, please cite:

@software{aihydro_tools_2026,
  title   = {aihydro-tools: Python MCP Server for AI-Automated
             Hydrological Research},
  author  = {Galib, Mohammad and Merwade, Venkatesh},
  year    = {2026},
  version = {1.6.0},
  doi     = {10.5281/zenodo.19597589},
  url     = {https://doi.org/10.5281/zenodo.19597589}
}

For the VS Code extension, cite:

@software{aihydro_extension_2026,
  title   = {AI-Hydro: An Open Platform for End-to-End AI-Automated
             Hydrological Research (VS Code Extension)},
  author  = {Galib, Mohammad and Merwade, Venkatesh},
  year    = {2026},
  version = {0.1.3},
  doi     = {10.5281/zenodo.19597664},
  url     = {https://doi.org/10.5281/zenodo.19597664}
}

License

Release history

Version	Release date
2.1.0 (this version)	Jun 25, 2026
2.0.0	Jun 16, 2026
1.7.0	May 25, 2026
1.5.0	Apr 19, 2026
1.4.0	Apr 17, 2026
1.2.0	Apr 11, 2026
1.1.0	Apr 9, 2026
1.0.5	Apr 9, 2026
1.0.4	Apr 9, 2026
1.0.3	Apr 9, 2026
1.0.2	Apr 9, 2026
1.0.1	Apr 9, 2026
1.0.0	Apr 7, 2026

Download files

Source Distribution

aihydro_tools-2.1.0.tar.gz (500.3 kB)

Uploaded Jun 25, 2026 Source

Built Distribution

aihydro_tools-2.1.0-py3-none-any.whl (447.9 kB)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file aihydro_tools-2.1.0.tar.gz.

File metadata

Download URL: aihydro_tools-2.1.0.tar.gz
Upload date: Jun 25, 2026
Size: 500.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for aihydro_tools-2.1.0.tar.gz
Algorithm	Hash digest
SHA256	`ca0433a5105875d00fa58aaa24597fcf58fc768cd7a99c5b6c33ab126cd536f3`
MD5	`023e4eaa69d997de4ee05e860dc3acd4`
BLAKE2b-256	`450666540cd1780fd26d4f06ba77a146b7fade2325ec700822021abdf24e4735`

File details

Details for the file aihydro_tools-2.1.0-py3-none-any.whl.

File metadata

Download URL: aihydro_tools-2.1.0-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 447.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for aihydro_tools-2.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a8ed5437c4f4b5e95482e553dec435f0e1667e6f90c9fc8d9497641148f7bbd4`
MD5	`ab14bc5783ff04e59d616e93024cdbec`
BLAKE2b-256	`2bc36db2e6ca86e9b3ecbd450e29df4f05b1dfe237cd4a71f96bfebe3138f62a`
