Open-source water data aggregation toolkit with AI-powered research methodology recommendations

AquaScope

Open-source Python toolkit for water data, hydrology, and agricultural water management — with an AI engine that recommends and auto-executes research methodologies.


AquaScope unifies 12 global water-data APIs behind one Python schema, then layers a full scientific computing stack on top, from Bulletin 17C flood frequency to FAO-56 crop water requirements. An AI engine scores 26 research methodologies against your dataset and auto-executes 7 analysis pipelines. The toolkit is covered by 534 tests and validated against a 10-catchment subset of the CAMELS benchmark.


✨ What you can do

  • 🌊 Pull water data from USGS, FAO AQUASTAT, FAO WaPOR, GEMStat, EU WFD, Copernicus ERA5, Taiwan MOENV/WRA, Japan MLIT, Korea WAMIS, OpenMeteo, UN SDG 6 — one unified Python API.
  • 📈 Run hydrological analyses — Bulletin 17C flood frequency (GEV / LP3 / Gumbel / non-stationary GEV / EMA), baseflow separation, rating curves, 22 hydrological signatures.
  • 🌾 Plan agricultural water — FAO-56 Penman-Monteith ET₀, crop water requirements for 20 crops, irrigation scheduling, soil water balance with auto-irrigation.
  • 🤖 Ask the AI engine — describe your goal in plain English and get a recommended methodology, scored against your dataset profile and auto-executed.
  • 📊 Visualise + report — 16 plot types, Q-Q / P-P diagnostics, Markdown / HTML reports with embedded figures, threshold alerts (WHO / EPA / EU WFD).
  • 🗺️ Spatial hydrology — DEM processing, D8 flow direction, watershed delineation, Strahler ordering.

For the full capability list see docs/features.md.

📊 Why AquaScope

AquaScope HEC-SSP R lmom Standalone collectors
Bulletin 17C FFA + EMA partial
Non-stationary GEV partial
Baseflow separation (Lyne-Hollick, Eckhardt)
FAO-56 Penman-Monteith ET₀ + crop water
12 unified data collectors per-source
AI methodology recommender
Interactive Streamlit dashboard
Free, MIT, Python-native partial varies

⚡ Install

pip install aquascope              # core — collectors + hydrology
pip install "aquascope[all]"       # everything — ML, viz, spatial, dashboard

Feature-group extras:

pip install "aquascope[ml]"           # sklearn, xgboost, statsmodels
pip install "aquascope[viz]"          # matplotlib, seaborn, folium
pip install "aquascope[scientific]"   # xarray, netcdf4, h5py
pip install "aquascope[spatial]"      # rasterio, geopandas, shapely
pip install "aquascope[dashboard]"    # streamlit
pip install "aquascope[forecast]"     # prophet, torch (for LSTM)

For development:

git clone https://github.com/Rekin226/aquascope.git
cd aquascope
pip install -e ".[all,dev]"

🚀 Examples

1. Flood frequency analysis (Bulletin 17C)

from aquascope.api import flood_analysis

result = flood_analysis(daily_discharge, method="gev", return_periods=[10, 50, 100])
print(result.return_levels)
#   return_period  return_level  lower_ci  upper_ci
# 0           10        1840.2     1690.4    2010.6
# 1           50        2530.7     2280.1    2820.9
# 2          100        2870.4     2540.6    3260.5

Switch method to "lp3", "gumbel", or "gpd" for a different distribution, or to "ns_gev" for non-stationary analysis. Pass censored=True to apply the Expected Moments Algorithm (EMA) to records with peak-over-threshold gaps.
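To make the return-level idea concrete, here is a minimal standalone sketch of a Gumbel fit by the method of moments. It is not AquaScope's implementation: the function name gumbel_return_levels is hypothetical, it takes a list of annual peak flows rather than a daily series, and the sample peaks are made-up illustrative values.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_return_levels(annual_peaks, return_periods):
    """Fit a Gumbel distribution by the method of moments.

    Hypothetical helper for illustration only (not part of AquaScope).
    Returns a dict mapping each return period T (years, T > 1) to the
    flow that is exceeded on average once every T years.
    """
    if len(annual_peaks) < 2:
        raise ValueError("need at least two annual peaks")
    # Moment estimators: std = pi * scale / sqrt(6), mean = loc + gamma * scale
    scale = stdev(annual_peaks) * math.sqrt(6) / math.pi
    loc = mean(annual_peaks) - EULER_GAMMA * scale
    levels = {}
    for t in return_periods:
        p_non_exceed = 1.0 - 1.0 / t  # annual non-exceedance probability
        levels[t] = loc - scale * math.log(-math.log(p_non_exceed))
    return levels


# Made-up annual peak flows, for illustration only
peaks = [1210.0, 980.0, 1540.0, 1105.0, 1870.0,
         1320.0, 1015.0, 1690.0, 1450.0, 1160.0]
levels = gumbel_return_levels(peaks, [10, 50, 100])
for t, q in levels.items():
    print(f"T={t:>3} yr  Q={q:8.1f}")
```

The method of moments is used here only because it fits in a few lines; Bulletin 17C itself prescribes LP3 with EMA, which is considerably more involved.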

2. Baseflow separation + hydrological signatures

from aquascope.api import baseflow_analysis, compute_all_signatures

bf  = baseflow_analysis(daily_discharge, method="eckhardt")   # or "lyne_hollick"
sig = compute_all_signatures(daily_discharge)

print(bf.bfi)                  # baseflow index, e.g. 0.42
print(sig["Q5"], sig["Q95"])   # high-flow / low-flow exceedances
print(sig["flashiness"])       # Richards-Baker flashiness index

22 signatures across magnitude, variability, timing, recession, and flashiness — see docs/features.md.
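For readers who want to see what two of these quantities mean, the sketch below computes a baseflow index with a single forward pass of the Lyne-Hollick filter, plus the Richards-Baker flashiness index. The function names and the default alpha=0.925 are assumptions made for this illustration; AquaScope's own routines may differ (for example in the number of filter passes).

```python
def lyne_hollick_baseflow(flow, alpha=0.925):
    """One forward pass of the Lyne-Hollick recursive digital filter.

    Hypothetical helper (not AquaScope's API). `flow` is a list of daily
    discharges; returns a list of baseflow values of the same length.
    """
    if not flow:
        return []
    baseflow = [flow[0]]  # assume the first day is entirely baseflow
    quick_prev = 0.0
    for prev_q, q in zip(flow, flow[1:]):
        quick = alpha * quick_prev + 0.5 * (1.0 + alpha) * (q - prev_q)
        quick = max(quick, 0.0)  # quickflow cannot be negative
        # Baseflow is what remains, kept between 0 and total flow
        baseflow.append(min(max(q - quick, 0.0), q))
        quick_prev = quick
    return baseflow


def baseflow_index(flow, baseflow):
    """Fraction of total flow volume that is baseflow."""
    return sum(baseflow) / sum(flow)


def richards_baker_flashiness(flow):
    """Sum of absolute day-to-day changes divided by total flow."""
    changes = sum(abs(b - a) for a, b in zip(flow, flow[1:]))
    return changes / sum(flow)


# Made-up storm hydrograph, for illustration only
storm = [5.0, 5.0, 20.0, 40.0, 25.0, 12.0, 8.0, 6.0, 5.0, 5.0]
bf = lyne_hollick_baseflow(storm)
print(round(baseflow_index(storm, bf), 3))
print(round(richards_baker_flashiness(storm), 3))
```

A perfectly steady river gives a baseflow index of 1 and a flashiness of 0; a storm-driven hydrograph pushes the first down and the second up.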

3. Collect data from any of the 12 sources

from aquascope.collectors import USGSCollector, AquastatCollector, WaporCollector

usgs = USGSCollector()
flow = usgs.collect(station_id="01646500", parameter="00060", days=365)

aquastat = AquastatCollector()
egy_water = aquastat.collect(country="EGY", variables=[4263, 4253, 4312])

wapor = WaporCollector()
et = wapor.collect(
    bbox=(30.5, 29.8, 31.1, 30.2),
    variable="RET",
    start_date="2026-04-01",
    end_date="2026-07-31",
)

Every collector returns records in the same Pydantic schema, so downstream analyses don't care where the data came from. See docs/data_sources.md for the full list.
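The value of a shared schema can be sketched without AquaScope at all. The example below uses a standard-library dataclass as a stand-in for the Pydantic model; the class WaterRecord, its field names, and both source payload shapes are hypothetical and are not taken from AquaScope or from any real API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WaterRecord:
    """Hypothetical unified record (stand-in for the real Pydantic schema)."""
    source: str
    station_id: str
    parameter: str
    value: float
    unit: str
    timestamp: datetime


def from_source_a(row):
    """Normalise a flat payload with ISO-8601 timestamps (made-up shape)."""
    return WaterRecord(
        source="source_a",
        station_id=row["site_no"],
        parameter=row["param"],
        value=float(row["val"]),
        unit=row["unit"],
        timestamp=datetime.fromisoformat(row["dateTime"]),
    )


def from_source_b(row):
    """Normalise a nested payload with epoch-second timestamps (made-up shape)."""
    return WaterRecord(
        source="source_b",
        station_id=str(row["station"]["id"]),
        parameter=row["variable"],
        value=float(row["reading"]),
        unit=row["units"],
        timestamp=datetime.fromtimestamp(row["epoch_s"], tz=timezone.utc),
    )


def mean_value(records):
    """Downstream code only sees WaterRecord, whatever the origin."""
    return sum(r.value for r in records) / len(records)


records = [
    from_source_a({"site_no": "A-1", "param": "discharge", "val": "12.0",
                   "unit": "m3/s", "dateTime": "2026-04-01T00:00:00+00:00"}),
    from_source_b({"station": {"id": 77}, "variable": "discharge",
                   "reading": 14.0, "units": "m3/s", "epoch_s": 1775001600}),
]
print(mean_value(records))
```

Each adapter absorbs one source's quirks (string values, nested IDs, epoch timestamps), so analysis functions such as mean_value never branch on where a record came from.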

4. FAO-56 crop water requirements + soil water balance

from datetime import date
from aquascope.agri import (
    penman_monteith_daily,
    crop_water_requirement,
    SoilWaterBalance,
)
from aquascope.agri.water_balance import SoilProperties

# Reference ET (FAO-56 Penman-Monteith) for Cairo in late June (day of year 180)
eto = penman_monteith_daily(
    t_min=18.0, t_max=32.0, rh_min=40, rh_max=80,
    u2=2.0, rs=22.0, latitude=30.0, elevation=70, doy=180,
)

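# Crop water requirement for maize from planting through harvest.
# eto_series is assumed to be a daily ET₀ series (one penman_monteith_daily value per day).
cwr = crop_water_requirement(eto_series, crop="maize", planting_date=date(2026, 4, 1))

# Soil water balance with auto-irrigation triggers.
# precip_series is assumed to be a daily precipitation series aligned with eto_series.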
soil    = SoilProperties(field_capacity=0.30, wilting_point=0.15, root_depth=1.0)
balance = SoilWaterBalance(soil).auto_irrigate(
    etc=cwr.etc, precip=precip_series, efficiency=0.7,
)
print(balance.total_irrigation_mm, balance.deficit_days)
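The auto-irrigation logic can be illustrated with a simplified FAO-56 style root-zone bucket. This is a sketch under stated assumptions, not AquaScope's SoilWaterBalance: the function name, the return keys, and the depletion fraction p=0.5 are hypothetical, and runoff, capillary rise, and deep percolation are ignored apart from clipping depletion at zero.

```python
def simulate_auto_irrigation(etc_mm, precip_mm, field_capacity=0.30,
                             wilting_point=0.15, root_depth_m=1.0,
                             depletion_fraction=0.5, efficiency=0.7):
    """Daily root-zone depletion with refill-to-field-capacity irrigation.

    Hypothetical helper for illustration. Inputs are daily crop
    evapotranspiration and precipitation in mm.
    """
    # Total and readily available water in the root zone (mm)
    taw = 1000.0 * (field_capacity - wilting_point) * root_depth_m
    raw = depletion_fraction * taw
    depletion = 0.0  # start at field capacity
    gross_total = 0.0
    irrigation_days = []
    for day, (etc, rain) in enumerate(zip(etc_mm, precip_mm)):
        # Depletion grows with crop water use and shrinks with rain
        depletion = min(max(depletion + etc - rain, 0.0), taw)
        if depletion > raw:
            # Refill to field capacity; gross depth accounts for losses
            gross_total += depletion / efficiency
            irrigation_days.append(day)
            depletion = 0.0
    return {
        "taw_mm": taw,
        "raw_mm": raw,
        "gross_irrigation_mm": gross_total,
        "irrigation_days": irrigation_days,
    }


# 30 dry days with a constant 6 mm/day crop demand (illustrative values)
result = simulate_auto_irrigation([6.0] * 30, [0.0] * 30)
print(result["irrigation_days"], round(result["gross_irrigation_mm"], 1))
```

With the soil properties from the example above (field capacity 0.30, wilting point 0.15, 1.0 m root depth), total available water is 150 mm, so irrigation is triggered once depletion passes 75 mm.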

5. AI methodology recommender

from aquascope.ai_engine import recommend

# Describe your dataset and goal — get ranked, scored methodologies
recs = recommend(
    parameters=["DO", "BOD5", "COD"],
    n_records=4_500,
    temporal=True,
    spatial=False,
    goal="detect long-term pollution trends with seasonality",
)

for r in recs[:3]:
    print(f"{r.score:.2f}  {r.method_id:<20}  {r.rationale}")
# 0.92  mann_kendall          Strong fit: temporal data, >30 records, trend goal
# 0.87  stl_decomposition     Seasonal patterns + multi-year data
# 0.81  prophet               Forecasting-capable, handles seasonality natively

Then auto-execute the top result with run_pipeline(recs[0].method_id, df).
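The top-ranked method in the sample output, the Mann-Kendall trend test, is simple enough to sketch in plain Python. The version below is an illustration rather than AquaScope's pipeline: the function name is hypothetical, it uses the normal approximation, and it omits the tie correction and any handling of seasonality or autocorrelation.

```python
import math


def mann_kendall(series):
    """Mann-Kendall trend test (normal approximation, no tie correction).

    Hypothetical helper for illustration. Returns (S, Z, two-sided p).
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    # S counts later-minus-earlier pairs that increase, minus those that decrease
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal statistic
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return s, z, p_value


# Made-up, steadily rising annual values for illustration
s, z, p = mann_kendall([3.1, 3.4, 3.3, 3.9, 4.2, 4.1, 4.8, 5.0, 5.3, 5.9])
print(s, round(z, 2), round(p, 4))
```

A positive S with a small p-value indicates a monotonic upward trend; a negative S indicates a downward one.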

6. Change-point detection + copula dependence

from aquascope.api import detect_changepoints, fit_copula

cps  = detect_changepoints(annual_runoff, method="pettitt")
cop  = fit_copula(rainfall, runoff, family="auto")    # AIC-selects Gaussian/Clayton/Gumbel/Frank
print(cps.change_year, cps.p_value)
print(cop.family, cop.theta, cop.aic)
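As a rough picture of what a Pettitt change-point test does, here is a minimal standalone sketch. The function name and return shape are hypothetical, the p-value uses the usual approximation, and the result is a 0-based index into the series rather than a calendar year.

```python
import math


def pettitt_test(series):
    """Pettitt non-parametric test for a single change point.

    Hypothetical helper for illustration. Returns (index, K, p) where
    `index` is the 0-based position of the last value before the change,
    K is max |U_t|, and p is the approximate two-sided p-value.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    u = 0
    best_k = 0
    best_index = 0
    for t in range(n - 1):
        # Recurrence: U_t = U_{t-1} + sum over j of sign(x_t - x_j)
        u += sum((series[t] > x) - (series[t] < x) for x in series)
        if abs(u) > best_k:
            best_k = abs(u)
            best_index = t
    p_value = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_index, best_k, p_value


# Made-up series with an obvious upward shift after the fifth value
shifted = [1, 2, 1, 2, 1, 11, 12, 11, 12, 11]
index, k, p = pettitt_test(shifted)
print(index, k, round(p, 4))
```

To map the index back to a year, look up the time stamp of series[index]; the change is taken to occur between that observation and the next.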

7. Bayesian regression with uncertainty quantification

from aquascope.api import bayesian_regression

# Annual rainfall → runoff with full posterior + convergence diagnostics
posterior = bayesian_regression(X=annual_precip, y=annual_runoff)

print(posterior.posterior_mean)
# {'beta_0': 12.4, 'beta_1': 0.82, 'sigma2': 41.6}

print(posterior.credible_intervals["beta_1"])
# (0.78, 0.86)        ← 95% credible interval on slope

print(posterior.r_hat)
# {'beta_0': 1.00, 'beta_1': 1.00, 'sigma2': 1.00}    ← Gelman–Rubin, converged

print(posterior.dic, posterior.effective_sample_size["beta_1"])
# 124.7  9842.0       ← model fit + effective sample size

Switch to MCMC with degree>1 for polynomial models, or pass prior_precision for informative priors. Conjugate linear, polynomial, and Metropolis-Hastings backends are all available.
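The r_hat values above are a convergence check across MCMC chains. A minimal sketch of the classic (non-split) Gelman-Rubin statistic follows; the function name is hypothetical, the chains are tiny hand-made lists rather than real sampler output, and AquaScope's own diagnostic may use a different variant.

```python
import math
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic Gelman-Rubin potential scale reduction factor.

    Hypothetical helper for illustration. `chains` is a list of
    equal-length lists of draws for one parameter.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two equal-length chains of two or more draws")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return math.sqrt(pooled / within)


# Two chains exploring the same region versus two stuck in different regions
mixed = gelman_rubin([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
stuck = gelman_rubin([[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]])
print(round(mixed, 3), round(stuck, 3))
```

Values close to 1 suggest the chains agree; values well above 1 (a common rule of thumb is 1.1) suggest they have not converged to the same distribution.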


💻 CLI

AquaScope ships a 14-command CLI for the most common workflows:

# Collect data
aquascope collect --source usgs --station 01646500 --days 365
aquascope collect --source wapor --bbox 30.5,29.8,31.1,30.2 --variable RET --start-date 2026-04-01

# Hydrological analysis
aquascope hydro --method flood_frequency --file discharge.csv
aquascope hydro --method baseflow --file discharge.csv

# Agriculture planning
aquascope agri plan --crop maize --planting-date 2026-04-01 --lat 30.0 --lon 31.25

# AI recommendation + natural-language problem solving
aquascope recommend --parameters DO,BOD5,COD --goal "pollution trend detection"
aquascope solve --problem "Assess flood risk for a 100-year return period"

# Interactive Streamlit dashboard
aquascope dashboard

Run aquascope --help for the full command list.


🌍 Data sources at a glance

12 unified data sources spanning four regions:

  • 🌎 Americas — USGS (streamflow + WQ), Water Quality Portal (400+ agencies)
  • 🌍 Europe — EU Water Framework Directive, Copernicus ERA5
  • 🌏 Asia-Pacific — Taiwan MOENV / WRA / Civil IoT, Japan MLIT, Korea WAMIS
  • 🌐 Global — GEMStat (170 countries), UN SDG 6, OpenMeteo, FAO AQUASTAT, FAO WaPOR

Full details, endpoints, and API-key requirements: docs/data_sources.md. Want to add your country's water service? See adding a data source.


🧪 Scientifically validated

  • 534 tests — covering every collector, hydrology method, and pipeline
  • CAMELS benchmark — a 10-catchment validation subset of the CAMELS dataset ships with the repo at data/camels_benchmark/ and runs as part of CI
  • Every method cited — equations, decision trees, and DOI references for all 26 methodologies live in the theory guide
  • JOSS paper in submission — see paper.md and paper.bib

📚 Documentation

Resource and what it covers:

  • Features: full capability list (hydrology, agriculture, ML, spatial, I/O)
  • Data sources: all 12 sources, endpoints, API-key requirements
  • Theory guide: equations, DOI citations, decision trees for every method
  • Methodology matrix: when to use which method
  • Architecture: how AquaScope is structured internally
  • FAQ · Troubleshooting: common questions and fixes
  • Use cases: real-world applications and case studies
  • Integration guides: xarray, QGIS, R interoperability
  • Contributing: how to add a data source, methodology, or test

🤝 Contributing

We welcome contributions from the global water and agriculture research community. Highest-impact contributions right now:

  • New data source collectors — your country / region
  • New research methodologies — expand the AI recommender
  • New crop coefficients — extend the FAO Kc table
  • Jupyter tutorials and validation studies — compare against HEC-SSP, R packages, etc.

See CONTRIBUTING.md, the adding a data source guide, and the adding a methodology guide.

📜 Citation

If you use AquaScope in your research, please cite:

@software{aquascope2026,
  title   = {AquaScope: Open-Source Water Data Aggregation, Hydrological Analysis, and Agricultural Water Management Toolkit},
  author  = {AquaScope Contributors},
  year    = {2026},
  url     = {https://github.com/Rekin226/aquascope},
  version = {0.4.0},
  license = {MIT}
}

📄 License

MIT — see LICENSE.
