Open-source water data aggregation toolkit with AI-powered research methodology recommendations
AquaScope
Open-source Python toolkit for water data, hydrology, and agricultural water management — with an AI engine that recommends and auto-executes research methodologies.
AquaScope unifies 12 global water-data APIs behind one Python schema, then layers a scientific computing stack on top, from Bulletin 17C flood frequency to FAO-56 crop water requirements. An AI engine scores 26 research methodologies against your dataset and can auto-execute 7 analysis pipelines. The test suite has 534 tests, and a 10-catchment subset of the CAMELS benchmark runs in CI.
✨ What you can do
- 🌊 Pull water data from USGS, FAO AQUASTAT, FAO WaPOR, GEMStat, EU WFD, Copernicus ERA5, Taiwan MOENV/WRA, Japan MLIT, Korea WAMIS, OpenMeteo, UN SDG 6 — one unified Python API.
- 📈 Run hydrological analyses — Bulletin 17C flood frequency (GEV / LP3 / Gumbel / non-stationary GEV / EMA), baseflow separation, rating curves, 22 hydrological signatures.
- 🌾 Plan agricultural water — FAO-56 Penman-Monteith ET₀, crop water requirements for 20 crops, irrigation scheduling, soil water balance with auto-irrigation.
- 🤖 Ask the AI engine — describe your goal in plain English and get a recommended methodology, scored against your dataset profile and auto-executed.
- 📊 Visualise + report — 16 plot types, Q-Q / P-P diagnostics, Markdown / HTML reports with embedded figures, threshold alerts (WHO / EPA / EU WFD).
- 🗺️ Spatial hydrology — DEM processing, D8 flow direction, watershed delineation, Strahler ordering.
For the full capability list see docs/features.md.
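To make the spatial bullet concrete, here is a minimal sketch of D8 flow direction on a tiny elevation grid. It is not AquaScope's implementation: the function name `d8_flow_direction`, the list-of-lists DEM, and the choice to return `None` for pits and flats are all assumptions made for illustration.

```python
import math

# Offsets to the eight neighbours: (row step, column step)
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def d8_flow_direction(dem, cell_size=1.0):
    """Return, for each cell, the (row, col) offset of its steepest downslope
    neighbour, or None for pits and flat cells (hypothetical helper)."""
    n_rows, n_cols = len(dem), len(dem[0])
    directions = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            best_slope = 0.0
            for dr, dc in NEIGHBOURS:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < n_rows and 0 <= cc < n_cols):
                    continue
                # Diagonal neighbours are sqrt(2) cell widths away
                distance = cell_size * math.hypot(dr, dc)
                slope = (dem[r][c] - dem[rr][cc]) / distance
                if slope > best_slope:
                    best_slope = slope
                    directions[r][c] = (dr, dc)
    return directions


# A small surface that drains toward the lower-right corner
dem = [
    [9.0, 8.0, 7.0],
    [8.0, 6.0, 4.0],
    [7.0, 4.0, 1.0],
]
flow = d8_flow_direction(dem)
print(flow[1][1])  # offset of the steepest descent from the centre cell
```

Watershed delineation and Strahler ordering build on a grid like `flow` by following these offsets downstream; production code would also need pit filling and flat resolution, which this sketch omits.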
📊 Why AquaScope
| Feature | AquaScope | HEC-SSP | R lmom | Standalone collectors |
|---|---|---|---|---|
| Bulletin 17C FFA + EMA | ✅ | ✅ | partial | — |
| Non-stationary GEV | ✅ | — | partial | — |
| Baseflow separation (Lyne-Hollick, Eckhardt) | ✅ | — | — | — |
| FAO-56 Penman-Monteith ET₀ + crop water | ✅ | — | — | — |
| 12 unified data collectors | ✅ | — | — | per-source |
| AI methodology recommender | ✅ | — | — | — |
| Interactive Streamlit dashboard | ✅ | — | — | — |
| Free, MIT, Python-native | ✅ | partial | ✅ | varies |
⚡ Install
pip install aquascope # core — collectors + hydrology
pip install "aquascope[all]" # everything — ML, viz, spatial, dashboard
Feature-group extras:
pip install "aquascope[ml]" # sklearn, xgboost, statsmodels
pip install "aquascope[viz]" # matplotlib, seaborn, folium
pip install "aquascope[scientific]" # xarray, netcdf4, h5py
pip install "aquascope[spatial]" # rasterio, geopandas, shapely
pip install "aquascope[dashboard]" # streamlit
pip install "aquascope[forecast]" # prophet, torch (for LSTM)
For development:
git clone https://github.com/Rekin226/aquascope.git
cd aquascope
pip install -e ".[all,dev]"
🚀 Examples
1. Flood frequency analysis (Bulletin 17C)
from aquascope.api import flood_analysis
result = flood_analysis(daily_discharge, method="gev", return_periods=[10, 50, 100])
print(result.return_levels)
# return_period return_level lower_ci upper_ci
# 0 10 1840.2 1690.4 2010.6
# 1 50 2530.7 2280.1 2820.9
# 2 100 2870.4 2540.6 3260.5
Switch method to "lp3", "gumbel", or "gpd" for other distributions, or to "ns_gev" for non-stationary analysis. Pass censored=True to apply EMA to records with peak-over-threshold gaps.
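For readers new to return levels, the sketch below shows the idea behind the simplest of these options: a Gumbel fit by the method of moments on annual peak flows. It is a stand-alone illustration, not the code behind `flood_analysis`; the function name `gumbel_return_levels` and the peak values are hypothetical, and Bulletin 17C procedures (LP3 with EMA) are considerably more involved.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_return_levels(annual_peaks, return_periods):
    """Fit a Gumbel distribution by the method of moments and return
    {return_period: return_level} (hypothetical helper)."""
    mean = statistics.mean(annual_peaks)
    std = statistics.stdev(annual_peaks)  # sample standard deviation
    scale = std * math.sqrt(6) / math.pi
    location = mean - EULER_GAMMA * scale
    levels = {}
    for t in return_periods:
        non_exceedance = 1.0 - 1.0 / t  # annual non-exceedance probability
        levels[t] = location - scale * math.log(-math.log(non_exceedance))
    return levels


# Hypothetical annual peak discharges (m^3/s), for illustration only
peaks = [820, 1150, 640, 1490, 980, 1320, 760, 1710, 1050, 890]
levels = gumbel_return_levels(peaks, [10, 50, 100])
for t, q in levels.items():
    print(f"{t:>3}-year flood: {q:.0f} m^3/s")
```

The T-year return level is the flow with an annual exceedance probability of 1/T, so the levels always increase with T.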
2. Baseflow separation + hydrological signatures
from aquascope.api import baseflow_analysis, compute_all_signatures
bf = baseflow_analysis(daily_discharge, method="eckhardt") # or "lyne_hollick"
sig = compute_all_signatures(daily_discharge)
print(bf.bfi) # baseflow index, e.g. 0.42
print(sig["Q5"], sig["Q95"]) # high-flow / low-flow exceedances
print(sig["flashiness"]) # Richards-Baker flashiness index
22 signatures across magnitude, variability, timing, recession, and flashiness — see docs/features.md.
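As a rough picture of what a baseflow filter and two of these signatures compute, here is a single-pass Lyne-Hollick filter with a baseflow index and a Richards-Baker style flashiness index. This is a sketch under stated assumptions, not AquaScope's code: the function names, the filter parameter `alpha=0.925`, the zero initial quickflow, and the use of total flow as the flashiness denominator are all choices made here.

```python
def lyne_hollick_baseflow(flow, alpha=0.925):
    """Single forward pass of the Lyne-Hollick filter; returns baseflow."""
    baseflow = [flow[0]]  # assumption: the first value is all baseflow
    quick_prev = 0.0
    for prev, curr in zip(flow, flow[1:]):
        quick = alpha * quick_prev + 0.5 * (1 + alpha) * (curr - prev)
        quick = min(max(quick, 0.0), curr)  # keep 0 <= quickflow <= total flow
        baseflow.append(curr - quick)
        quick_prev = quick
    return baseflow


def baseflow_index(flow, baseflow):
    """Share of total flow volume attributed to baseflow."""
    return sum(baseflow) / sum(flow)


def flashiness_index(flow):
    """Sum of absolute day-to-day changes divided by total flow."""
    changes = sum(abs(b - a) for a, b in zip(flow, flow[1:]))
    return changes / sum(flow)


# Hypothetical daily discharge with one storm peak
hydrograph = [10.0, 10.0, 30.0, 20.0, 12.0, 10.0]
bfi = baseflow_index(hydrograph, lyne_hollick_baseflow(hydrograph))
flashiness = flashiness_index(hydrograph)
print(f"BFI = {bfi:.2f}, flashiness = {flashiness:.2f}")
```

Published applications usually run the filter in several forward and backward passes; a single pass is enough to show the recursion.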
3. Collect data from any of the 12 sources
from aquascope.collectors import USGSCollector, AquastatCollector, WaporCollector
usgs = USGSCollector()
flow = usgs.collect(station_id="01646500", parameter="00060", days=365)
aquastat = AquastatCollector()
egy_water = aquastat.collect(country="EGY", variables=[4263, 4253, 4312])
wapor = WaporCollector()
et = wapor.collect(
    bbox=(30.5, 29.8, 31.1, 30.2),
    variable="RET",
    start_date="2026-04-01",
    end_date="2026-07-31",
)
Every collector returns records in the same Pydantic schema, so downstream analyses don't care where the data came from. See docs/data_sources.md for the full list.
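The value of a shared schema is easiest to see with two differently shaped payloads. The sketch below uses a standard-library dataclass in place of Pydantic, and everything in it is hypothetical: the `WaterRecord` fields, the two payload layouts, and the adapter function names are illustrative and do not describe AquaScope's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WaterRecord:
    """Hypothetical unified record; not AquaScope's real schema."""
    source: str
    station_id: str
    parameter: str
    timestamp: datetime
    value: float
    unit: str


def from_source_a(raw):
    """Hypothetical payload: ISO timestamps, flow in cubic feet per second."""
    return WaterRecord(
        source="source_a",
        station_id=raw["site"],
        parameter="discharge",
        timestamp=datetime.fromisoformat(raw["dateTime"]),
        value=float(raw["value"]) * 0.0283168466,  # ft^3/s -> m^3/s
        unit="m3/s",
    )


def from_source_b(raw):
    """Hypothetical payload: epoch seconds, flow already in m^3/s."""
    return WaterRecord(
        source="source_b",
        station_id=str(raw["station"]),
        parameter="discharge",
        timestamp=datetime.fromtimestamp(raw["epoch"], tz=timezone.utc),
        value=float(raw["q_cms"]),
        unit="m3/s",
    )


records = [
    from_source_a({"site": "A-001", "dateTime": "2026-04-01T00:00:00+00:00", "value": "1000"}),
    from_source_b({"station": 42, "epoch": 1775001600, "q_cms": 31.5}),
]
# Downstream code works on one shape, whatever the origin
mean_flow = sum(r.value for r in records) / len(records)
print(f"mean discharge: {mean_flow:.2f} m3/s")
```

Unit conversion and timestamp normalisation happen once, at the adapter boundary, so analysis code never branches on the source.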
4. FAO-56 crop water requirements + soil water balance
from datetime import date
from aquascope.agri import (
    penman_monteith_daily,
    crop_water_requirement,
    SoilWaterBalance,
)
from aquascope.agri.water_balance import SoilProperties
# Reference ET (FAO-56 Penman-Monteith) for one day: Cairo, late June (day of year 180)
eto = penman_monteith_daily(
    t_min=18.0, t_max=32.0, rh_min=40, rh_max=80,
    u2=2.0, rs=22.0, latitude=30.0, elevation=70, doy=180,
)
# Crop water requirement for maize from planting through harvest
# (eto_series: daily ET₀ values for the whole season, prepared beforehand)
cwr = crop_water_requirement(eto_series, crop="maize", planting_date=date(2026, 4, 1))
# Soil water balance with auto-irrigation triggers
# (precip_series: daily precipitation aligned with eto_series)
soil = SoilProperties(field_capacity=0.30, wilting_point=0.15, root_depth=1.0)
balance = SoilWaterBalance(soil).auto_irrigate(
    etc=cwr.etc, precip=precip_series, efficiency=0.7,
)
print(balance.total_irrigation_mm, balance.deficit_days)
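For reference, the daily FAO-56 Penman-Monteith calculation can be written out with the standard library alone. The sketch below follows the equations of FAO Irrigation and Drainage Paper 56 with soil heat flux set to zero and is checked against that handbook's Example 18; it is an independent illustration, not the body of `penman_monteith_daily`. The name `fao56_et0` is hypothetical, and the parameter names simply mirror the call above.

```python
import math


def saturation_vapour_pressure(temp_c):
    """Saturation vapour pressure in kPa at air temperature temp_c (deg C)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def fao56_et0(t_min, t_max, rh_min, rh_max, u2, rs, latitude, elevation, doy):
    """Daily reference ET in mm/day, soil heat flux taken as zero.

    Units: deg C, %, m/s, MJ/m^2/day, decimal degrees, m, day of year.
    Not valid at polar latitudes, where the sunset hour angle is undefined.
    """
    t_mean = (t_min + t_max) / 2
    delta = 4098 * saturation_vapour_pressure(t_mean) / (t_mean + 237.3) ** 2
    pressure = 101.3 * ((293 - 0.0065 * elevation) / 293) ** 5.26
    gamma = 0.000665 * pressure  # psychrometric constant, kPa/deg C

    e_s = (saturation_vapour_pressure(t_min) + saturation_vapour_pressure(t_max)) / 2
    e_a = (saturation_vapour_pressure(t_min) * rh_max / 100
           + saturation_vapour_pressure(t_max) * rh_min / 100) / 2

    # Extraterrestrial radiation Ra from latitude and day of year
    phi = math.radians(latitude)
    d_r = 1 + 0.033 * math.cos(2 * math.pi * doy / 365)
    decl = 0.409 * math.sin(2 * math.pi * doy / 365 - 1.39)
    omega_s = math.acos(-math.tan(phi) * math.tan(decl))
    ra = (24 * 60 / math.pi) * 0.0820 * d_r * (
        omega_s * math.sin(phi) * math.sin(decl)
        + math.cos(phi) * math.cos(decl) * math.sin(omega_s)
    )

    # Net radiation = net shortwave (albedo 0.23) minus net longwave
    r_so = (0.75 + 2e-5 * elevation) * ra
    r_ns = 0.77 * rs
    t_k4 = ((t_max + 273.16) ** 4 + (t_min + 273.16) ** 4) / 2
    cloud_factor = 1.35 * min(rs / r_so, 1.0) - 0.35
    r_nl = 4.903e-9 * t_k4 * (0.34 - 0.14 * math.sqrt(e_a)) * cloud_factor
    r_n = r_ns - r_nl

    numerator = 0.408 * delta * r_n + gamma * 900 / (t_mean + 273) * u2 * (e_s - e_a)
    return numerator / (delta + gamma * (1 + 0.34 * u2))


# Inputs from FAO-56 Example 18 (Uccle, Belgium, 6 July);
# the handbook reports about 3.9 mm/day for this case
et0 = fao56_et0(t_min=12.3, t_max=21.5, rh_min=63, rh_max=84,
                u2=2.078, rs=22.07, latitude=50.8, elevation=100, doy=187)
print(f"ET0 = {et0:.2f} mm/day")
```

Crop evapotranspiration then follows as ETc = Kc × ET₀, with Kc taken from the crop coefficient table for the current growth stage.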
5. AI methodology recommender
from aquascope.ai_engine import recommend
# Describe your dataset and goal — get ranked, scored methodologies
recs = recommend(
    parameters=["DO", "BOD5", "COD"],
    n_records=4_500,
    temporal=True,
    spatial=False,
    goal="detect long-term pollution trends with seasonality",
)
for r in recs[:3]:
    print(f"{r.score:.2f} {r.method_id:<20} {r.rationale}")
# 0.92 mann_kendall Strong fit: temporal data, >30 records, trend goal
# 0.87 stl_decomposition Seasonal patterns + multi-year data
# 0.81 prophet Forecasting-capable, handles seasonality natively
Then auto-execute the top result with run_pipeline(recs[0].method_id, df).
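To give a feel for what scoring methodologies against a dataset profile can mean, here is a toy rule-based scorer. It is only a sketch: the three-entry rule table, the keyword matching, and the scoring formula are invented for illustration and say nothing about how AquaScope's engine weighs its 26 methodologies.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    n_records: int
    temporal: bool
    spatial: bool
    goal: str


# Hypothetical rule table: requirements plus goal keywords per method
METHODS = {
    "mann_kendall": {
        "needs_temporal": True, "min_records": 30,
        "keywords": {"trends", "long-term"},
    },
    "stl_decomposition": {
        "needs_temporal": True, "min_records": 100,
        "keywords": {"seasonality", "decomposition"},
    },
    "kriging": {
        "needs_spatial": True, "min_records": 50,
        "keywords": {"spatial", "interpolation"},
    },
}


def score_methods(profile):
    """Drop methods whose requirements fail, then rank by keyword overlap."""
    goal_words = set(profile.goal.lower().split())
    ranked = []
    for method_id, rule in METHODS.items():
        if rule.get("needs_temporal") and not profile.temporal:
            continue
        if rule.get("needs_spatial") and not profile.spatial:
            continue
        if profile.n_records < rule["min_records"]:
            continue
        matched = len(goal_words & rule["keywords"])
        score = 0.5 + 0.5 * matched / len(rule["keywords"])
        ranked.append((score, method_id))
    return sorted(ranked, reverse=True)


profile = DatasetProfile(
    n_records=4_500, temporal=True, spatial=False,
    goal="detect long-term pollution trends with seasonality",
)
ranking = score_methods(profile)
for score, method_id in ranking:
    print(f"{score:.2f} {method_id}")
```

Hard requirements act as filters and the goal text only reorders what survives, which keeps a recommendation explainable: each score traces back to a rule.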
6. Change-point detection + copula dependence
from aquascope.api import detect_changepoints, fit_copula
cps = detect_changepoints(annual_runoff, method="pettitt")
cop = fit_copula(rainfall, runoff, family="auto") # AIC-selects Gaussian/Clayton/Gumbel/Frank
print(cps.change_year, cps.p_value)
print(cop.family, cop.theta, cop.aic)
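The Pettitt test behind `method="pettitt"` is compact enough to sketch directly. The version below is a plain-Python illustration using the usual approximate p-value, not AquaScope's implementation; `pettitt_test` and its return shape are hypothetical, and the runoff values are made up. It recomputes the statistic for every split, which is fine for short annual series but slow for long ones.

```python
import math


def sign(value):
    return (value > 0) - (value < 0)


def pettitt_test(series):
    """Return (split, p_value); split is the number of observations before
    the most likely change point (hypothetical helper)."""
    n = len(series)
    best_split, best_stat = 0, 0
    for t in range(1, n):
        # U_t compares every value before the split with every value after it
        u_t = sum(sign(series[j] - series[i])
                  for i in range(t) for j in range(t, n))
        if abs(u_t) > best_stat:
            best_split, best_stat = t, abs(u_t)
    # Standard approximation for the two-sided p-value, capped at 1
    p_value = min(1.0, 2.0 * math.exp(-6.0 * best_stat ** 2 / (n ** 3 + n ** 2)))
    return best_split, p_value


# Hypothetical annual runoff (mm) with a drop after the tenth year
runoff = [410, 395, 430, 405, 420, 415, 400, 425, 390, 435,
          310, 330, 295, 320, 305, 315, 300, 325, 290, 335]
split, p = pettitt_test(runoff)
print(f"change after observation {split}, p = {p:.4f}")
```

Because the test is rank-based, it detects a shift in the median without assuming any particular distribution for the series.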
7. Bayesian regression with uncertainty quantification
from aquascope.api import bayesian_regression
# Annual rainfall → runoff with full posterior + convergence diagnostics
posterior = bayesian_regression(X=annual_precip, y=annual_runoff)
print(posterior.posterior_mean)
# {'beta_0': 12.4, 'beta_1': 0.82, 'sigma2': 41.6}
print(posterior.credible_intervals["beta_1"])
# (0.78, 0.86) ← 95% credible interval on slope
print(posterior.r_hat)
# {'beta_0': 1.00, 'beta_1': 1.00, 'sigma2': 1.00} ← Gelman–Rubin, converged
print(posterior.dic, posterior.effective_sample_size["beta_1"])
# 124.7 9842.0 ← model fit + effective sample size
Switch to MCMC with degree>1 for polynomial models, or pass prior_precision for informative priors. Conjugate linear, polynomial, and Metropolis-Hastings backends are all available.
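The conjugate linear case has a closed form, which the sketch below works through for a single predictor. It simplifies the setting described above in one important way: the noise variance is assumed known, whereas the example output reports a posterior for `sigma2`. The function name, the zero-mean Gaussian prior, and the synthetic data are assumptions for illustration, not AquaScope's backend.

```python
import math


def conjugate_linear_posterior(x, y, noise_var, prior_precision=1e-6):
    """Posterior mean and covariance of (intercept, slope) under a zero-mean
    Gaussian prior and known noise variance (hypothetical helper)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))

    # Posterior precision matrix: prior_precision * I + X^T X / noise_var
    a11 = prior_precision + n / noise_var
    a12 = sx / noise_var
    a22 = prior_precision + sxx / noise_var
    det = a11 * a22 - a12 * a12

    # Posterior covariance is the inverse of the 2x2 precision matrix
    cov = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]

    # Posterior mean: covariance times X^T y / noise_var
    b1, b2 = sy / noise_var, sxy / noise_var
    mean = (cov[0][0] * b1 + cov[0][1] * b2,
            cov[1][0] * b1 + cov[1][1] * b2)
    return mean, cov


# Synthetic data lying exactly on y = 2 + 3x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0, 5.0, 8.0, 11.0, 14.0]
mean, cov = conjugate_linear_posterior(x, y, noise_var=1.0)

# 95% credible interval on the slope from the Gaussian posterior
half_width = 1.96 * math.sqrt(cov[1][1])
print(f"slope = {mean[1]:.3f} +/- {half_width:.3f}")
```

Raising `prior_precision` pulls both coefficients toward zero, which is the effect an informative prior has in this simplified setting.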
💻 CLI
AquaScope ships a 14-command CLI for the most common workflows:
# Collect data
aquascope collect --source usgs --station 01646500 --days 365
aquascope collect --source wapor --bbox 30.5,29.8,31.1,30.2 --variable RET --start-date 2026-04-01
# Hydrological analysis
aquascope hydro --method flood_frequency --file discharge.csv
aquascope hydro --method baseflow --file discharge.csv
# Agriculture planning
aquascope agri plan --crop maize --planting-date 2026-04-01 --lat 30.0 --lon 31.25
# AI recommendation + natural-language problem solving
aquascope recommend --parameters DO,BOD5,COD --goal "pollution trend detection"
aquascope solve --problem "Assess flood risk for a 100-year return period"
# Interactive Streamlit dashboard
aquascope dashboard
Run aquascope --help for the full command list.
🌍 Data sources at a glance
12 unified data sources, grouped by geographic coverage:
- 🌎 Americas — USGS (streamflow + WQ), Water Quality Portal (400+ agencies)
- 🌍 Europe — EU Water Framework Directive, Copernicus ERA5
- 🌏 Asia-Pacific — Taiwan MOENV / WRA / Civil IoT, Japan MLIT, Korea WAMIS
- 🌐 Global — GEMStat (170 countries), UN SDG 6, OpenMeteo, FAO AQUASTAT, FAO WaPOR
Full details, endpoints, and API-key requirements: docs/data_sources.md. Want to add your country's water service? See adding a data source.
🧪 Scientifically validated
- 534 tests — covering every collector, hydrology method, and pipeline
- CAMELS benchmark — a 10-catchment validation subset of the CAMELS dataset ships with the repo at data/camels_benchmark/ and runs as part of CI
- Every method cited — equations, decision trees, and DOI references for all 26 methodologies live in the theory guide
- JOSS paper in submission — see paper.md and paper.bib
📚 Documentation
| Resource | What it covers |
|---|---|
| Features | Full capability list — hydrology, agriculture, ML, spatial, I/O |
| Data sources | All 12 sources, endpoints, API-key requirements |
| Theory guide | Equations, DOI citations, decision trees for every method |
| Methodology matrix | When to use which method |
| Architecture | How AquaScope is structured internally |
| FAQ · Troubleshooting | Common questions and fixes |
| Use cases | Real-world applications and case studies |
| Integration guides | xarray, QGIS, R interoperability |
| Contributing | How to add a data source, methodology, or test |
🤝 Contributing
We welcome contributions from the global water and agriculture research community. Highest-impact contributions right now:
- New data source collectors — your country / region
- New research methodologies — expand the AI recommender
- New crop coefficients — extend the FAO Kc table
- Jupyter tutorials and validation studies — compare against HEC-SSP, R packages, etc.
See CONTRIBUTING.md, the adding a data source guide, and the adding a methodology guide.
📜 Citation
If you use AquaScope in your research, please cite:
@software{aquascope2026,
title = {AquaScope: Open-Source Water Data Aggregation, Hydrological Analysis, and Agricultural Water Management Toolkit},
author = {AquaScope Contributors},
year = {2026},
url = {https://github.com/Rekin226/aquascope},
version = {0.4.0},
license = {MIT}
}
📄 License
MIT — see LICENSE.