Pythonic, typed, modern client for the Hub'Eau water data APIs.

hubeau-data

Typed, modern Python client for the Hub'Eau water data APIs.

Hub'Eau exposes 15+ REST APIs for French national water data, but no official typed Python client exists. This library fills that gap with Pydantic v2 models, strict typing, and a clean interface ready for data science workflows.

Installation

From PyPI (latest stable release):

pip install hubeau-data
# or
uv add hubeau-data

For development (clone + editable install with all tools):

git clone https://github.com/pfei/hubeau-data.git
cd hubeau-data
uv sync                # core dependencies
uv sync --all-extras   # with optional extras (pandas, etc.) if defined
uv run ruff check .          # lint
uv run mypy .                # type check
uv run pytest -m "not live"  # fast mocked tests — no network required
uv run pytest -m "live" -s   # live integration tests against real Hub'Eau APIs

Quickstart

from hubeau_data.client import HubeauClient
from hubeau_data.models.hydrometrie import ObservationTrParams
from hubeau_data.models.qualite_rivieres import StationPcParams

client = HubeauClient()

# Hydrométrie — real-time observations
# Note: use code_entite (not code_station) to filter observations_tr
params = ObservationTrParams(code_entite=["O001004003"], grandeur_hydro=["Q"], size=3)
observations = client.hydrometrie.get_observations_tr(params=params)
print(observations.count)                        # total records available server-side
print(observations.data[0].date_obs, observations.data[0].resultat_obs)
print(observations.next_cursor)                  # pass to next call to paginate

# Qualité Rivières — water quality stations
stations = client.qualite_rivieres.get_stations(
    params=StationPcParams(code_departement=["75"], size=3)
)
print(stations.count)
print(stations.data[0].code_station, stations.data[0].libelle_station)

# Eau potable — drinking water analyses for a commune
from hubeau_data.models.eau_potable import ResultatEauPotableParams
resultats = client.eau_potable.get_resultats_dis(
    params=ResultatEauPotableParams(code_commune=["75056"], size=5)
)
print(resultats.data[0].libelle_parametre, resultats.data[0].resultat_numerique)

# Phytopharmaceutiques — national pesticide sales
from hubeau_data.models.phytopharmaceutiques import VenteSubstanceParams
ventes = client.phytopharmaceutiques.get_ventes_substances(
    params=VenteSubstanceParams(type_territoire="National", size=5)
)
print(ventes.data[0].libelle_substance, ventes.data[0].quantite, ventes.data[0].annee)

# API health check — works on every API
report = client.hydrometrie.check_health(n_requests=3)
print(report.summary())

# Data coverage — spot-check stations
cov = client.hydrometrie.data_coverage(code_station="O001004003")
print(cov.summary())

Async client

For bulk data collection — e.g. fetching many stations before inserting into a database — AsyncHubeauClient mirrors the sync client and supports asyncio.gather() for parallel requests. Concurrency is capped per API via an asyncio.Semaphore (default: 5, configurable via max_concurrent):

import asyncio
from hubeau_data.async_client import AsyncHubeauClient
from hubeau_data.models.hydrometrie import ObservationTrParams

async def main():
    # max_concurrent=3: at most 3 simultaneous requests to the hydrometrie API
    async with AsyncHubeauClient(max_concurrent=3) as client:
        codes = ["O001004003", "K418001001", "A1234567"]
        tasks = [
            client.hydrometrie.get_observations_tr(
                params=ObservationTrParams(code_entite=[c], grandeur_hydro=["Q"], size=10)
            )
            for c in codes
        ]
        results = await asyncio.gather(*tasks)
        for code, obs in zip(codes, results):
            print(code, obs.count, "total /", len(obs.data), "fetched")

asyncio.run(main())

All 11 APIs are available on AsyncHubeauClient with the same method names as the sync client (get_sites, get_stations, etc.) — just await them. Retry logic (tenacity) applies to async requests too. check_health and data_coverage are sync-only (diagnostic tools, not bulk operations).
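
To make the concurrency cap concrete, here is a minimal stdlib-only sketch of the general asyncio.Semaphore plus asyncio.gather() pattern. It is not the library's internal code: fetch_all and fake_request are hypothetical names, and asyncio.sleep stands in for network latency.

```python
import asyncio


async def fetch_all(codes: list[str], max_concurrent: int = 5) -> tuple[list[str], int]:
    """Run one fake request per code, never more than max_concurrent at once.

    Returns the results (in input order) and the peak number of requests
    that were in flight at the same time.
    """
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def fake_request(code: str) -> str:
        nonlocal in_flight, peak
        async with semaphore:  # waits here once the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for network latency
            in_flight -= 1
            return f"result for {code}"

    # gather() schedules every task at once; the semaphore does the throttling.
    results = await asyncio.gather(*(fake_request(c) for c in codes))
    return list(results), peak


results, peak = asyncio.run(
    fetch_all([f"station-{i}" for i in range(10)], max_concurrent=3)
)
print(len(results), "results, peak concurrency:", peak)
```

Because gather() preserves input order, results can be zipped back with the station codes, as in the example above.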

API Coverage

API | Status | Notes
Hydrométrie | ✅ Supported | Sites, stations, real-time and elaborated observations
Qualité des cours d'eau | ⚠️ Partial | Stations and analyses. Upstream API has known stability issues
Piézométrie | ✅ Supported | Stations, chroniques, chroniques temps réel
Qualité des nappes | ⚠️ Partial | Stations and analyses. Known 503/timeout issues
Écoulement | ✅ Supported | Stations, observations, campaigns
Température | ✅ Supported | Stations and chroniques
Prélèvements en eau | ✅ Supported | Ouvrages, points de prélèvement, chroniques
Hydrobiologie | ✅ Supported | Stations, indices (IBGN/IBMR/IBD/IPR), taxons
Poisson | ✅ Supported | Stations, indicateurs IPR/IPR+, observations, operations
Qualité eau potable | ✅ Supported | Communes/UDI links, analysis results
Phytopharmaceutiques | ✅ Supported | Purchases and sales by substance and product
Surveillance Littoral | 🚫 Skipped | API being decommissioned by Hub'Eau
Indicateurs Services | 🚧 Maintenance | API under maintenance (see services.eaufrance.fr)

All supported APIs expose check_health(n_requests) and data_coverage(...). They are available on both HubeauClient (sync) and AsyncHubeauClient (async); check_health and data_coverage themselves are sync-only.
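
The source shows only report.summary() and does not document the report's fields. As a rough illustration of what "latency stats per endpoint, healthy ratio" could mean, the sketch below aggregates a list of probe results. ProbeResult, summarize, and the sample numbers are all hypothetical and are not taken from the library.

```python
import statistics
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """One hypothetical health probe: which endpoint, did it succeed, how long."""

    endpoint: str
    ok: bool
    latency_ms: float


def summarize(probes: list[ProbeResult]) -> dict[str, dict[str, float]]:
    """Group probes by endpoint and compute healthy ratio and latency stats."""
    by_endpoint: dict[str, list[ProbeResult]] = {}
    for probe in probes:
        by_endpoint.setdefault(probe.endpoint, []).append(probe)
    return {
        endpoint: {
            "healthy_ratio": sum(r.ok for r in rows) / len(rows),
            "mean_ms": statistics.fmean(r.latency_ms for r in rows),
            "max_ms": max(r.latency_ms for r in rows),
        }
        for endpoint, rows in by_endpoint.items()
    }


# Made-up sample probes, for illustration only.
summary = summarize(
    [
        ProbeResult("stations", ok=True, latency_ms=120.0),
        ProbeResult("stations", ok=True, latency_ms=80.0),
        ProbeResult("stations", ok=False, latency_ms=400.0),
        ProbeResult("observations_tr", ok=True, latency_ms=50.0),
        ProbeResult("observations_tr", ok=True, latency_ms=70.0),
    ]
)
for endpoint, stats in summary.items():
    print(endpoint, stats)
```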

Features

  • Pydantic v2 models for all responses — strict runtime validation, IDE autocomplete
  • Typed query Params models for every endpoint — no more **kwargs
  • Sync (HubeauClient) and async (AsyncHubeauClient) clients, same method names
  • Automatic retry with exponential backoff (tenacity) on transient errors — Hub'Eau APIs have known stability issues
  • check_health(n_requests) — latency stats per endpoint, healthy ratio
  • data_coverage(...) — data availability windows per station or territory
  • Optional extras: [dataframe], [geo], [viz] — install only what you need
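
The library delegates retries to tenacity. To show the idea behind "exponential backoff on transient errors" without that dependency, here is a stdlib-only sketch of the same general technique. It is not the library's retry configuration: TransientError, retry_with_backoff, and the delay values are assumptions chosen for illustration.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. HTTP 503, timeouts)."""


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, doubling the wait after each transient failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # 0.5 s, 1 s, 2 s, ... capped at max_delay
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
    raise AssertionError("unreachable")


# Demo: a call that fails twice, then succeeds. Delays are recorded, not slept.
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("503 Service Unavailable")
    return "ok"


delays: list[float] = []
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)  # ok [0.5, 1.0]
```

Injecting the sleep function keeps the demo instantaneous and makes the delay schedule easy to inspect. Production retry policies usually add random jitter, which is omitted here to keep the output deterministic.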

Stack

  • Python 3.13+, mypy --strict, ruff, uv, hatchling, src-layout
  • httpx + tenacity for resilient sync/async HTTP
  • pytest-httpx mocked test suite — CI runs without network dependency

Examples & Scripts

uv run python examples/demo.py
uv run jupyter lab            # open examples/demo.ipynb

Health check scripts for every API live under scripts/<api>/check_health.py:

uv run python scripts/hydrometrie/check_health.py --n-requests 3 --random
uv run python scripts/qualite_rivieres/check_health.py --n-requests 2
uv run python scripts/eau_potable/check_health.py --commune 75056
uv run python scripts/phytopharmaceutiques/check_health.py

Exploration scripts are available under scripts/qualite_rivieres/ and scripts/hydrometrie/.

Roadmap

  • Full Hub'Eau API coverage (11 APIs implemented)
  • check_health and data_coverage on all APIs
  • Typed Params models for every endpoint
  • Automatic retry with exponential backoff (tenacity)
  • Async client (AsyncHubeauClient, all 11 APIs)
  • Optional dependency groups — pandas, geopandas, matplotlib as extras
  • CHANGELOG.md + CONTRIBUTING.md
  • PyPI release (0.1.0, 0.2.0)
  • PagedResponse[T] — all get_* methods expose count, data, next_cursor
  • Rate limiting in async client (Semaphore)
  • Full audit of query parameter names across remaining APIs

License

MIT © Pierre Feilles
