Open-source water data aggregation toolkit with AI-powered research methodology recommendations

AquaScope

Open-source Python toolkit for water data, hydrology, and agricultural water management — with an AI engine that recommends and auto-executes research methodologies.


AquaScope unifies 15 global water-data APIs behind one Python schema, then layers a full scientific computing stack on top, from Bulletin 17C flood frequency to FAO-56 crop water requirements. An AI engine scores 26 research methodologies against your dataset and auto-executes 7 analysis pipelines. The toolkit is validated against a CAMELS benchmark subset and covered by 534 tests, 525 of which pass in the core suite.


✨ What you can do

  • 🌊 Pull water data from USGS, FAO AQUASTAT, FAO WaPOR, GEMStat, EU WFD, Copernicus ERA5, Taiwan MOENV/WRA/Civil IoT/DataGov, Japan MLIT, Korea WAMIS, OpenMeteo, UN SDG 6, US Water Quality Portal — one unified Python API.
  • 📈 Run hydrological analyses — Bulletin 17C flood frequency (GEV / LP3 / Gumbel / non-stationary GEV / EMA), baseflow separation, rating curves, 22 hydrological signatures.
  • 🌾 Plan agricultural water — FAO-56 Penman-Monteith ET₀, crop water requirements for 20 crops, irrigation scheduling, soil water balance with auto-irrigation.
  • 🤖 Ask the AI engine — describe your goal in plain English and get a recommended methodology, scored against your dataset profile and auto-executed. LLM enhancement via OpenAI, Groq (free), HuggingFace (free), or local Ollama.
  • 📊 Visualise + report — 16 plot types, Q-Q / P-P diagnostics, Markdown / HTML reports with embedded figures, threshold alerts (WHO / EPA / EU WFD).
  • 🗺️ Spatial hydrology — DEM processing, D8 flow direction, watershed delineation, Strahler ordering.

For the full capability list see docs/features.md.
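The D8 flow-direction step listed under spatial hydrology can be illustrated with a minimal sketch. It is independent of AquaScope's own implementation, which is not shown here: the function name `d8_flow_direction`, the list-of-lists DEM, and the choice of ESRI-style direction codes are all assumptions made for illustration. Each cell drains to whichever of its eight neighbours gives the steepest downhill slope.

```python
import math

# ESRI-style D8 codes as (row offset, column offset, code).
D8_NEIGHBOURS = [
    (0, 1, 1),      # east
    (1, 1, 2),      # south-east
    (1, 0, 4),      # south
    (1, -1, 8),     # south-west
    (0, -1, 16),    # west
    (-1, -1, 32),   # north-west
    (-1, 0, 64),    # north
    (-1, 1, 128),   # north-east
]


def d8_flow_direction(dem, cell_size=1.0):
    """Return a grid of D8 codes; 0 marks a cell with no lower neighbour."""
    n_rows, n_cols = len(dem), len(dem[0])
    directions = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            best_slope = 0.0
            for dr, dc, code in D8_NEIGHBOURS:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < n_rows and 0 <= cc < n_cols):
                    continue  # neighbours outside the grid are ignored
                # Diagonal neighbours are sqrt(2) cell widths away.
                distance = cell_size * math.hypot(dr, dc)
                slope = (dem[r][c] - dem[rr][cc]) / distance
                if slope > best_slope:
                    best_slope = slope
                    directions[r][c] = code
    return directions


# Hypothetical 3 x 4 DEM that falls by one unit per column towards the east.
dem = [[10 - c for c in range(4)] for _ in range(3)]
flow = d8_flow_direction(dem)
```

On this tilted plane every cell drains east except the easternmost column, which has no lower neighbour and is left as 0. A production routine would also need pit filling and a rule for flat areas, which this sketch omits.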

📊 Why AquaScope

AquaScope HEC-SSP R lmom Standalone collectors
Bulletin 17C FFA + EMA partial
Non-stationary GEV partial
Baseflow separation (Lyne-Hollick, Eckhardt)
FAO-56 Penman-Monteith ET₀ + crop water
15 unified data collectors per-source
AI methodology recommender (OpenAI / Groq / HF / Ollama)
Interactive Streamlit dashboard
Free, MIT, Python-native partial varies

⚡ Install

pip install aquascope              # core — collectors + hydrology
pip install "aquascope[all]"       # everything — ML, viz, spatial, dashboard

Feature-group extras:

pip install "aquascope[ml]"           # sklearn, xgboost, statsmodels
pip install "aquascope[viz]"          # matplotlib, seaborn, folium
pip install "aquascope[scientific]"   # xarray, netcdf4, h5py
pip install "aquascope[spatial]"      # rasterio, geopandas, shapely
pip install "aquascope[dashboard]"    # streamlit
pip install "aquascope[forecast]"     # prophet, torch (for LSTM)

For development:

git clone https://github.com/Rekin226/aquascope.git
cd aquascope
pip install -e ".[all,dev]"

🚀 Examples

1. Flood frequency analysis (Bulletin 17C)

from aquascope.api import flood_analysis

result = flood_analysis(daily_discharge, method="gev", return_periods=[10, 50, 100])
print(result.return_levels)
#   return_period  return_level  lower_ci  upper_ci
# 0           10        1840.2     1690.4    2010.6
# 1           50        2530.7     2280.1    2820.9
# 2          100        2870.4     2540.6    3260.5

Switch method to "lp3", "gumbel", or "gpd" for other distributions, or to "ns_gev" for non-stationary analysis. Pass censored=True to apply EMA to records with peak-over-threshold gaps.
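To make the return-level idea concrete, here is a small standalone sketch that does not use AquaScope. It fits a Gumbel distribution to annual peak flows by the method of moments, which is considerably simpler than the Bulletin 17C procedures, and evaluates the quantile for each return period T at non-exceedance probability 1 - 1/T. The function name `gumbel_return_levels` and the peak values are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def gumbel_return_levels(annual_peaks, return_periods):
    """Method-of-moments Gumbel fit, then the quantile for each return period."""
    mean = statistics.fmean(annual_peaks)
    std = statistics.stdev(annual_peaks)
    scale = std * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    levels = {}
    for t in return_periods:
        p_non_exceed = 1.0 - 1.0 / t
        # Inverse Gumbel CDF: x = loc - scale * ln(-ln(p))
        levels[t] = loc - scale * math.log(-math.log(p_non_exceed))
    return levels


# Hypothetical annual maximum discharges (m3/s), one value per year.
peaks = [812.0, 1040.0, 655.0, 1390.0, 920.0, 1175.0, 760.0, 1510.0, 980.0, 1105.0]
levels = gumbel_return_levels(peaks, [10, 50, 100])
for period, level in levels.items():
    print(f"{period:>4}-year flood: {level:8.1f} m3/s")
```

Ten years of record is far too short for a reliable 100-year estimate; the sample is kept small only to keep the example readable.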

2. Baseflow separation + hydrological signatures

from aquascope.api import baseflow_analysis, compute_all_signatures

bf  = baseflow_analysis(daily_discharge, method="eckhardt")   # or "lyne_hollick"
sig = compute_all_signatures(daily_discharge)

print(bf.bfi)                  # baseflow index, e.g. 0.42
print(sig["Q5"], sig["Q95"])   # high-flow / low-flow exceedances
print(sig["flashiness"])       # Richards-Baker flashiness index

22 signatures across magnitude, variability, timing, recession, and flashiness — see docs/features.md.
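For readers who want to see what sits behind these numbers, the sketch below implements the Eckhardt recursive digital filter, the baseflow index, and the Richards-Baker flashiness index in plain Python. It is not AquaScope's code: the function names, the starting value for the filter, and the default parameters (recession constant 0.98 and BFImax 0.80, values often quoted for perennial streams on porous aquifers) are assumptions for illustration.

```python
def eckhardt_baseflow(flow, alpha=0.98, bfi_max=0.80):
    """Eckhardt two-parameter filter; baseflow never exceeds total flow."""
    baseflow = [flow[0] * bfi_max]  # simple starting assumption
    for q in flow[1:]:
        b = ((1 - bfi_max) * alpha * baseflow[-1]
             + (1 - alpha) * bfi_max * q) / (1 - alpha * bfi_max)
        baseflow.append(min(b, q))
    return baseflow


def baseflow_index(flow, baseflow):
    """Share of total flow volume that is baseflow."""
    return sum(baseflow) / sum(flow)


def richards_baker_flashiness(flow):
    """Sum of absolute day-to-day changes divided by total flow."""
    changes = sum(abs(b - a) for a, b in zip(flow, flow[1:]))
    return changes / sum(flow)


# Hypothetical daily discharge (m3/s) with one storm peak.
daily_flow = [4.0, 4.1, 3.9, 12.5, 30.2, 18.4, 9.7, 6.3, 5.1, 4.6, 4.3, 4.2]
bf = eckhardt_baseflow(daily_flow)
bfi = baseflow_index(daily_flow, bf)
flashiness = richards_baker_flashiness(daily_flow)
print(f"BFI = {bfi:.2f}, flashiness = {flashiness:.2f}")
```

A perfectly steady hydrograph gives a flashiness of zero, which is a quick sanity check for any implementation.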

3. Collect data from any of the 15 sources

from aquascope.collectors import USGSCollector, AquastatCollector, WaporCollector

usgs = USGSCollector()
flow = usgs.collect(station_id="01646500", parameter="00060", days=365)

aquastat = AquastatCollector()
egy_water = aquastat.collect(country="EGY", variables=[4263, 4253, 4312])

wapor = WaporCollector()
et = wapor.collect(
    bbox=(30.5, 29.8, 31.1, 30.2),
    variable="RET",
    start_date="2026-04-01",
    end_date="2026-07-31",
)

Every collector returns records in the same Pydantic schema, so downstream analyses don't care where the data came from. See docs/data_sources.md for the full list.
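The value of a shared schema is easiest to see in a toy version. The sketch below uses a standard-library dataclass as a stand-in for the Pydantic model, so it runs without extra dependencies. The `WaterRecord` fields, the two adapter functions, and both raw payload shapes are hypothetical; they do not reproduce AquaScope's schema or any real API response.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WaterRecord:
    """Hypothetical unified record; a Pydantic model would validate declaratively."""
    source: str
    station_id: str
    parameter: str
    value: float
    unit: str
    timestamp: datetime

    def __post_init__(self):
        if not math.isfinite(self.value):
            raise ValueError(f"non-finite value for {self.parameter}")


def from_source_a(raw):
    """Adapter for a made-up flat payload with an ISO 8601 timestamp."""
    return WaterRecord(
        source="source_a",
        station_id=raw["site"],
        parameter=raw["code"],
        value=float(raw["val"]),
        unit=raw["units"],
        timestamp=datetime.fromisoformat(raw["time"]),
    )


def from_source_b(raw):
    """Adapter for a made-up nested payload with an epoch timestamp."""
    name, value, unit = raw["obs"]
    return WaterRecord(
        source="source_b",
        station_id=raw["station"]["id"],
        parameter=name,
        value=float(value),
        unit=unit,
        timestamp=datetime.fromtimestamp(raw["epoch"], tz=timezone.utc),
    )


records = [
    from_source_a({"site": "A-001", "code": "discharge", "val": "12.5",
                   "units": "m3/s", "time": "2026-04-01T00:00:00+00:00"}),
    from_source_b({"station": {"id": "B-42"}, "obs": ["discharge", 8.1, "m3/s"],
                   "epoch": 1775001600}),
]

# Downstream code only ever sees WaterRecord, whatever the origin.
mean_discharge = sum(r.value for r in records) / len(records)
```

Each new source then needs only one adapter; the analysis code stays unchanged.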

4. FAO-56 crop water requirements + soil water balance

from datetime import date
from aquascope.agri import (
    penman_monteith_daily,
    crop_water_requirement,
    SoilWaterBalance,
)
from aquascope.agri.water_balance import SoilProperties

# Reference ET (FAO-56 Penman-Monteith): Cairo, late June (doy=180 is 29 June)
eto = penman_monteith_daily(
    t_min=18.0, t_max=32.0, rh_min=40, rh_max=80,
    u2=2.0, rs=22.0, latitude=30.0, elevation=70, doy=180,
)

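# Crop water requirement for maize from planting through harvest.
# eto_series is assumed to be a daily ET₀ series covering the season,
# e.g. built by calling penman_monteith_daily once per day.
cwr = crop_water_requirement(eto_series, crop="maize", planting_date=date(2026, 4, 1))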

# Soil water balance with auto-irrigation triggers
soil    = SoilProperties(field_capacity=0.30, wilting_point=0.15, root_depth=1.0)
balance = SoilWaterBalance(soil).auto_irrigate(
    etc=cwr.etc, precip=precip_series, efficiency=0.7,
)
print(balance.total_irrigation_mm, balance.deficit_days)
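The reference-ET step above can be unpacked into the FAO-56 daily equations. The sketch below is a standalone rendering of those equations with soil heat flux taken as zero for daily steps; it is not AquaScope's `penman_monteith_daily`, and the name `fao56_et0` is hypothetical. Units are assumed to be degrees Celsius, percent relative humidity, m/s wind at 2 m, MJ m⁻² day⁻¹ solar radiation, decimal degrees latitude, and metres elevation.

```python
import math


def sat_vapour_pressure(temp_c):
    """Saturation vapour pressure (kPa) at air temperature temp_c."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def fao56_et0(t_min, t_max, rh_min, rh_max, u2, rs, latitude, elevation, doy):
    """Daily reference evapotranspiration (mm/day), FAO-56 Penman-Monteith."""
    t_mean = (t_min + t_max) / 2.0
    es = (sat_vapour_pressure(t_max) + sat_vapour_pressure(t_min)) / 2.0
    ea = (sat_vapour_pressure(t_min) * rh_max / 100.0
          + sat_vapour_pressure(t_max) * rh_min / 100.0) / 2.0
    delta = 4098.0 * sat_vapour_pressure(t_mean) / (t_mean + 237.3) ** 2
    pressure = 101.3 * ((293.0 - 0.0065 * elevation) / 293.0) ** 5.26
    gamma = 0.000665 * pressure

    # Extraterrestrial radiation Ra from latitude and day of year.
    phi = math.radians(latitude)
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * doy / 365.0)
    decl = 0.409 * math.sin(2.0 * math.pi * doy / 365.0 - 1.39)
    ws = math.acos(-math.tan(phi) * math.tan(decl))
    ra = (24.0 * 60.0 / math.pi) * 0.0820 * dr * (
        ws * math.sin(phi) * math.sin(decl)
        + math.cos(phi) * math.cos(decl) * math.sin(ws)
    )

    # Net radiation = net shortwave (albedo 0.23) minus net longwave.
    rso = (0.75 + 2e-5 * elevation) * ra
    rns = (1.0 - 0.23) * rs
    sigma = 4.903e-9  # Stefan-Boltzmann constant, MJ K^-4 m^-2 day^-1
    tk4 = ((t_max + 273.16) ** 4 + (t_min + 273.16) ** 4) / 2.0
    rnl = sigma * tk4 * (0.34 - 0.14 * math.sqrt(ea)) * (
        1.35 * min(rs / rso, 1.0) - 0.35
    )
    rn = rns - rnl

    numerator = (0.408 * delta * rn
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    return numerator / (delta + gamma * (1.0 + 0.34 * u2))


# Same inputs as the example above.
et0 = fao56_et0(t_min=18.0, t_max=32.0, rh_min=40, rh_max=80,
                u2=2.0, rs=22.0, latitude=30.0, elevation=70, doy=180)
print(f"ET0 = {et0:.2f} mm/day")
```

Crop evapotranspiration then follows as ETc = Kc × ET₀, with the crop coefficient Kc varying by growth stage.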

5. AI methodology recommender

from aquascope.ai_engine import recommend

# Describe your dataset and goal — get ranked, scored methodologies
recs = recommend(
    parameters=["DO", "BOD5", "COD"],
    n_records=4_500,
    temporal=True,
    spatial=False,
    goal="detect long-term pollution trends with seasonality",
)

for r in recs[:3]:
    print(f"{r.score:.2f}  {r.method_id:<20}  {r.rationale}")
# 0.92  mann_kendall          Strong fit: temporal data, >30 records, trend goal
# 0.87  stl_decomposition     Seasonal patterns + multi-year data
# 0.81  prophet               Forecasting-capable, handles seasonality natively

Then auto-execute the top result with run_pipeline(recs[0].method_id, df).
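As a rough mental model of how a dataset profile can be scored against candidate methods, consider the toy scorer below. It is purely illustrative: the rule table, the weights, and every name in it (`DatasetProfile`, `METHODS`, `score_method`) are invented for this sketch and say nothing about how AquaScope's engine actually computes its scores.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    n_records: int
    temporal: bool
    spatial: bool
    goal: str


# Hypothetical rule table: hard requirements plus goal keywords.
METHODS = {
    "mann_kendall": {
        "needs_temporal": True, "needs_spatial": False,
        "min_records": 30, "keywords": ("trend", "long-term"),
    },
    "stl_decomposition": {
        "needs_temporal": True, "needs_spatial": False,
        "min_records": 730, "keywords": ("season", "decompos"),
    },
    "kriging": {
        "needs_temporal": False, "needs_spatial": True,
        "min_records": 50, "keywords": ("spatial", "interpolat"),
    },
}


def score_method(profile, spec):
    """0.0 if a hard requirement fails, else 0.5 plus a keyword-match bonus."""
    if spec["needs_temporal"] and not profile.temporal:
        return 0.0
    if spec["needs_spatial"] and not profile.spatial:
        return 0.0
    if profile.n_records < spec["min_records"]:
        return 0.0
    goal = profile.goal.lower()
    hits = sum(keyword in goal for keyword in spec["keywords"])
    return 0.5 + 0.5 * hits / len(spec["keywords"])


profile = DatasetProfile(
    n_records=4_500, temporal=True, spatial=False,
    goal="detect long-term pollution trends with seasonality",
)
ranked = sorted(
    ((name, score_method(profile, spec)) for name, spec in METHODS.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{score:.2f}  {name}")
```

Hard requirements act as filters and the goal text only reorders what survives, which keeps a clearly inapplicable method (here the spatial one) from ever ranking first.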

6. Change-point detection + copula dependence

from aquascope.api import detect_changepoints, fit_copula

cps  = detect_changepoints(annual_runoff, method="pettitt")
cop  = fit_copula(rainfall, runoff, family="auto")    # AIC-selects Gaussian/Clayton/Gumbel/Frank
print(cps.change_year, cps.p_value)
print(cop.family, cop.theta, cop.aic)
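The Pettitt test named above is short enough to write out in full. The following standalone sketch is not AquaScope's implementation; the function name `pettitt_test` and the sample series are hypothetical. It computes the rank-based statistic U for every possible split, takes the split with the largest absolute value, and applies the usual approximate p-value 2·exp(−6K²/(n³+n²)).

```python
import math


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def pettitt_test(series):
    """Return (split, K, p_value); split is the length of the first segment."""
    n = len(series)
    best_split, best_u = 0, 0
    for t in range(1, n):
        # Compare every value before the split with every value after it.
        u = sum(
            sign(series[i] - series[j])
            for i in range(t)
            for j in range(t, n)
        )
        if abs(u) > abs(best_u):
            best_split, best_u = t, u
    k = abs(best_u)
    p_value = min(1.0, 2.0 * math.exp(-6.0 * k ** 2 / (n ** 3 + n ** 2)))
    return best_split, k, p_value


# Hypothetical annual runoff (mm) with a step increase after year 20.
annual_runoff = [10.0] * 20 + [15.0] * 20
split, k_stat, p_value = pettitt_test(annual_runoff)
print(f"change after observation {split}, K = {k_stat}, p = {p_value:.2e}")
```

This direct form costs O(n³) operations, which is fine for annual series of a few hundred values; longer records would call for a rank-based formulation.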

7. Bayesian regression with uncertainty quantification

from aquascope.api import bayesian_regression

# Annual rainfall → runoff with full posterior + convergence diagnostics
posterior = bayesian_regression(X=annual_precip, y=annual_runoff)

print(posterior.posterior_mean)
# {'beta_0': 12.4, 'beta_1': 0.82, 'sigma2': 41.6}

print(posterior.credible_intervals["beta_1"])
# (0.78, 0.86)        ← 95% credible interval on slope

print(posterior.r_hat)
# {'beta_0': 1.00, 'beta_1': 1.00, 'sigma2': 1.00}    ← Gelman–Rubin, converged

print(posterior.dic, posterior.effective_sample_size["beta_1"])
# 124.7  9842.0       ← model fit + effective sample size

Switch to MCMC with degree>1 for polynomial models, or pass prior_precision for informative priors. Conjugate linear, polynomial, and Metropolis-Hastings backends are all available.
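For intuition about the conjugate linear case, here is a dependency-free sketch of Bayesian simple linear regression under a normal-inverse-gamma prior with zero prior mean. It is an assumption-laden illustration, not AquaScope's backend: the function name, the hyperparameters `a0` and `b0`, and the reuse of the name `prior_precision` for an isotropic prior precision are all choices made for this example. The posterior is available in closed form, so no sampling is needed.

```python
def conjugate_linear_regression(x, y, prior_precision=1e-6, a0=1e-3, b0=1e-3):
    """Posterior means for intercept, slope, and noise variance."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    syy = sum(v * v for v in y)

    # Posterior precision matrix: X'X + prior_precision * I (2 x 2, symmetric).
    l00 = n + prior_precision
    l01 = sx
    l11 = sxx + prior_precision
    det = l00 * l11 - l01 * l01

    # Posterior mean of the coefficients: solve the 2 x 2 system against X'y.
    beta_0 = (l11 * sy - l01 * sxy) / det
    beta_1 = (l00 * sxy - l01 * sy) / det

    # Inverse-gamma update for the noise variance.
    a_n = a0 + n / 2.0
    quad = (beta_0 * (l00 * beta_0 + l01 * beta_1)
            + beta_1 * (l01 * beta_0 + l11 * beta_1))
    b_n = b0 + 0.5 * (syy - quad)
    sigma2_mean = b_n / (a_n - 1.0)  # defined for a_n > 1

    return {"beta_0": beta_0, "beta_1": beta_1, "sigma2": sigma2_mean}


# Hypothetical data: runoff = 12 + 0.8 * precipitation, plus alternating noise.
precip = [500.0 + 50.0 * i for i in range(12)]
runoff = [12.0 + 0.8 * p + (3.0 if i % 2 == 0 else -3.0)
          for i, p in enumerate(precip)]
posterior_mean = conjugate_linear_regression(precip, runoff)
print(posterior_mean)
```

With a near-flat prior the coefficient means coincide with ordinary least squares; raising `prior_precision` shrinks them towards zero.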


💻 CLI

AquaScope ships a 14-command CLI for the most common workflows:

# Collect data
aquascope collect --source usgs --station 01646500 --days 365
aquascope collect --source wapor --bbox 30.5,29.8,31.1,30.2 --variable RET --start-date 2026-04-01

# Hydrological analysis
aquascope hydro --method flood_frequency --file discharge.csv
aquascope hydro --method baseflow --file discharge.csv

# Agriculture planning
aquascope agri plan --crop maize --planting-date 2026-04-01 --lat 30.0 --lon 31.25

# AI recommendation + natural-language problem solving
aquascope recommend --parameters DO,BOD5,COD --goal "pollution trend detection"
aquascope solve --problem "Assess flood risk for a 100-year return period"

# Interactive Streamlit dashboard
aquascope dashboard

Run aquascope --help for the full command list.


🌍 Data sources at a glance

15 unified data sources spanning four regions:

  • 🌎 Americas — USGS (streamflow + WQ), Water Quality Portal (400+ agencies)
  • 🌍 Europe — EU Water Framework Directive, Copernicus ERA5
  • 🌏 Asia-Pacific — Taiwan MOENV / WRA / Civil IoT / DataGov, Japan MLIT, Korea WAMIS
  • 🌐 Global — GEMStat (170 countries), UN SDG 6, OpenMeteo, FAO AQUASTAT, FAO WaPOR

Full details, endpoints, and API-key requirements: docs/data_sources.md. Want to add your country's water service? See adding a data source.


🧪 Scientifically validated

  • 534 tests — covering every collector, hydrology method, and pipeline (525 passing in the core suite; spatial tests require rasterio)
  • CAMELS benchmark — a 10-catchment validation subset of the CAMELS dataset ships with the repo at data/camels_benchmark/ and runs as part of CI
  • Every method cited — equations, decision trees, and DOI references for all 26 methodologies live in the theory guide
  • JOSS paper in submission — see paper.md and paper.bib

📚 Documentation

Resource What it covers
Features Full capability list — hydrology, agriculture, ML, spatial, I/O
Data sources All 15 sources, endpoints, API-key requirements
Theory guide Equations, DOI citations, decision trees for every method
Methodology matrix When to use which method
Architecture How AquaScope is structured internally
FAQ · Troubleshooting Common questions and fixes
Use cases Real-world applications and case studies
Integration guides xarray, QGIS, R interoperability
Contributing How to add a data source, methodology, or test

🤝 Contributing

We welcome contributions from the global water and agriculture research community. Highest-impact contributions right now:

  • New data source collectors — your country / region
  • New research methodologies — expand the AI recommender
  • New crop coefficients — extend the FAO Kc table
  • Jupyter tutorials and validation studies — compare against HEC-SSP, R packages, etc.

See CONTRIBUTING.md, the adding a data source guide, and the adding a methodology guide.

📜 Citation

If you use AquaScope in your research, please cite:

@software{aquascope2026,
  title   = {AquaScope: Open-Source Water Data Aggregation, Hydrological Analysis, and Agricultural Water Management Toolkit},
  author  = {AquaScope Contributors},
  year    = {2026},
  url     = {https://github.com/Rekin226/aquascope},
  version = {0.5.0},
  license = {MIT}
}

📄 License

MIT — see LICENSE.
