
Hubeau client to collect data from the different APIs

Project description

cl-hubeau

Simple hub'eau client for Python

This package is currently under active development. Every API on Hub'eau will be covered by this package in due time.

At this stage, the following APIs are covered by cl-hubeau: piezometry, hydrometry and drinking water quality.

For help on the available kwargs for each endpoint, please refer directly to the hub'eau documentation (these are not covered by the present documentation).

Assume that each function from cl-hubeau is consistent with its hub'eau counterpart, with the exception of the size and page (or cursor) arguments: those are set automatically by cl-hubeau to crawl along the results.
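
For instance, native hub'eau query parameters can be passed as kwargs. A minimal sketch, assuming the chronicles endpoint's date_debut_mesure parameter is forwarded as-is:

from cl_hubeau import piezometry

# hub'eau kwargs are forwarded unchanged; pagination is handled by cl-hubeau
df = piezometry.get_chronicles(
    ["07548X0009/F"],
    date_debut_mesure="2020-01-01",
)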

Configuration

First of all, you will need API keys from INSEE to use some high level operations, which may loop over cities' official codes. Please refer to pynsee's API subscription tutorial for help.
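
A minimal sketch of the key registration, assuming pynsee's init_conn helper and placeholder credentials (refer to the tutorial above for the authoritative procedure):

from pynsee.utils import init_conn

# placeholder credentials: substitute your own INSEE API keys
init_conn(insee_key="my_insee_key", insee_secret="my_insee_secret")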

Basic examples

Clean cache

from cl_hubeau.utils import clean_all_cache
clean_all_cache()

Piezometry

3 high level functions are available (and one class for low level operations).

Get all piezometers (uses a 30-day cache):

from cl_hubeau import piezometry
gdf = piezometry.get_all_stations()

Get chronicles for the first 100 piezometers (uses a 30-day cache):

df = piezometry.get_chronicles(gdf["code_bss"].head(100).tolist())

Get realtime data for the first 100 piezometers:

A small cache is stored to allow for realtime consumption (it expires after only 15 minutes). Please use this functionality responsibly!

df = piezometry.get_realtime_chronicles(gdf["code_bss"].head(100).tolist())

Low level class to perform the same tasks:

Note that:

  • the API forbids results of more than 20,000 rows, so you may need inner loops (a sketch follows the block below);
  • cache handling will be your responsibility, notably for realtime data.

with piezometry.PiezometrySession() as session:
    df = session.get_chronicles(code_bss="07548X0009/F")
    df = session.get_stations(code_departement=['02', '59', '60', '62', '80'], format="geojson")
    df = session.get_chronicles_real_time(code_bss="07548X0009/F")
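
A possible inner loop, sketched here with hypothetical yearly date ranges (date_debut_mesure and date_fin_mesure are hub'eau chronicles parameters) to stay under the 20,000-rows threshold:

import pandas as pd

from cl_hubeau import piezometry

# query one year at a time to keep each result set under 20k rows
chunks = []
with piezometry.PiezometrySession() as session:
    for year in range(2015, 2025):
        chunks.append(
            session.get_chronicles(
                code_bss="07548X0009/F",
                date_debut_mesure=f"{year}-01-01",
                date_fin_mesure=f"{year}-12-31",
            )
        )
df = pd.concat(chunks, ignore_index=True)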

Hydrometry

4 high level functions are available (and one class for low level operations).

Get all stations (uses a 30-day cache):

from cl_hubeau import hydrometry 
gdf = hydrometry.get_all_stations()

Get all sites (uses a 30-day cache):

gdf = hydrometry.get_all_sites()

Get observations for the first 5 sites (uses a 30-day cache); note that this will also work with stations (instead of sites):

df = hydrometry.get_observations(gdf["code_site"].head(5).tolist())

Get realtime data for the first 5 sites:

A small cache is stored to allow for realtime consumption (it expires after only 15 minutes). Please use this functionality responsibly!

df = hydrometry.get_realtime_observations(gdf["code_site"].head(5).tolist())

Low level class to perform the same tasks:

Note that:

  • the API forbids results of more than 20,000 rows, so you may need inner loops (see the piezometry sketch above);
  • cache handling will be your responsibility, notably for realtime data.

with hydrometry.HydrometrySession() as session:
    df = session.get_stations(code_station="K437311001")
    df = session.get_sites(code_departement=['02', '59', '60', '62', '80'], format="geojson")
    df = session.get_realtime_observations(code_entite="K437311001")
    df = session.get_observations(code_entite="K437311001")

Drinking water quality

2 high level functions are available (and one class for low level operations).

Get all water networks (UDI) (uses a 30-day cache):

from cl_hubeau import drinking_water_quality 
df = drinking_water_quality.get_all_water_networks()

Get the sanitary control results for nitrates on all networks of Paris, Lyon & Marseille (uses a 30-day cache):

networks = drinking_water_quality.get_all_water_networks()
networks = networks[
    networks.nom_commune.isin(["PARIS", "MARSEILLE", "LYON"])
    ]["code_reseau"].unique().tolist()

df = drinking_water_quality.get_control_results(
    codes_reseaux=networks,
    code_parametre="1340"
)

Note that this query is heavy, even though it is already restricted to nitrates. In theory, you could also query the API without specifying the substance you are tracking, but you may hit the 20k-rows threshold and trigger an exception.

As it is, the get_control_results function already implements a double loop (sketched after this list):

  • on network codes (20 codes maximum per request);
  • on periods, requesting only yearly datasets (which should be scalable over time and should work nicely with the cache algorithm).
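
To illustrate, here is a simplified sketch of that double loop, reusing the networks list from the example above (this is not cl-hubeau's actual implementation; code_reseau, date_min_prelevement and date_max_prelevement are assumed hub'eau parameter names):

import pandas as pd

from cl_hubeau import drinking_water_quality

# chunk network codes by 20 and request one year at a time;
# parameter names are assumptions, check the hub'eau documentation
results = []
with drinking_water_quality.DrinkingWaterQualitySession() as session:
    for i in range(0, len(networks), 20):
        chunk = networks[i : i + 20]
        for year in range(2020, 2024):
            results.append(
                session.get_control_results(
                    code_reseau=chunk,
                    code_parametre="1340",
                    date_min_prelevement=f"{year}-01-01",
                    date_max_prelevement=f"{year}-12-31",
                )
            )
df = pd.concat(results, ignore_index=True)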

You can also call the same function using official city codes directly:

df = drinking_water_quality.get_control_results(
    codes_communes=['59350'],
    code_parametre="1340"
)

Low level class to perform the same tasks:

Note that:

  • the API forbids results of more than 20,000 rows, so you may need inner loops;
  • cache handling will be your responsibility.

with drinking_water_quality.DrinkingWaterQualitySession() as session:
    df = session.get_cities_networks(nom_commune="LILLE")
    df = session.get_control_results(code_departement='02', code_parametre="1340")


Download files


Source Distribution

cl_hubeau-0.3.0.tar.gz (15.3 kB)

Built Distribution


cl_hubeau-0.3.0-py3-none-any.whl (21.6 kB)

File details

Details for the file cl_hubeau-0.3.0.tar.gz.

File metadata

  • Download URL: cl_hubeau-0.3.0.tar.gz
  • Upload date:
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.9.19 Linux/6.5.0-1024-azure

File hashes

Hashes for cl_hubeau-0.3.0.tar.gz:

  • SHA256: d35a03a79f675d80b4fccb98000927aea9d5d7b53c05511f1a5ffffdb999aa4a
  • MD5: 5864836374fb805bae03fa07cd51bc24
  • BLAKE2b-256: fc4e61baf6cf08a1ec9395b4d1565c6766dd6296b5003a8ba629a05c23c4582b


File details

Details for the file cl_hubeau-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: cl_hubeau-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 21.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.9.19 Linux/6.5.0-1024-azure

File hashes

Hashes for cl_hubeau-0.3.0-py3-none-any.whl:

  • SHA256: 0f0467135d744cf969784b80c580b1099aafce21ba8f111ad93140516e3524bb
  • MD5: 1e6f3664b98b0221676f007879bf0bc8
  • BLAKE2b-256: 4e81ff5bc0c016b5030a852cf64c7264723b0bf97aea7610bf8f3f048c5c3ad5

