
A Python project for working with hydrological data sources


hydrodatasource


Overview

While libraries like hydrodataset exist for accessing standardized, public hydrological datasets (e.g., CAMELS), a common challenge is working with data that isn't in a ready-to-use format. This includes non-public industry data, data from local authorities, or custom datasets compiled for specific research projects.

hydrodatasource is designed to solve this problem. It provides a flexible framework to read, process, and clean these custom datasets, preparing them for hydrological modeling and analysis.

The core of this framework is the SelfMadeHydroDataset class, which allows you to easily access your own data by organizing it into a simple, predefined directory structure.

Reading Custom Datasets with SelfMadeHydroDataset

This is the primary use case for hydrodatasource. If you have your own basin-level time series and attribute data, you can use this class to load it seamlessly.

1. Prepare Your Data Directory

First, organize your data into the following folder structure:

/path/to/your_data_root/
    └── my_custom_dataset/              # Your dataset's name
        ├── attributes/
        │   └── attributes.csv
        ├── shapes/
        │   └── basins.shp
        └── timeseries/
            ├── 1D/                     # Sub-folder for each time resolution (e.g., daily)
            │   ├── basin_01.csv
            │   ├── basin_02.csv
            │   └── ...
            └── 1D_units_info.json      # JSON file with unit information
  • attributes/attributes.csv: A CSV file containing static basin attributes (e.g., area, mean elevation). Must include a basin_id column that matches the filenames in the timeseries folder.
  • shapes/basins.shp: A shapefile with the polygon geometry for each basin.
  • timeseries/1D/: A folder for each time resolution (e.g., 1D for daily, 3h for 3-hourly). Inside, each CSV file should contain the time series data for a single basin and be named after its basin_id.
  • timeseries/1D_units_info.json: A JSON file defining the units for each variable in your time series CSVs (e.g., {"precipitation": "mm/d", "streamflow": "m3/s"}).
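To make the layout above concrete, the following sketch builds a minimal example dataset using only Python's standard library. All names here (`your_data_root`, `my_custom_dataset`, `basin_01`, the variable names) are hypothetical stand-ins for your own data; the shapefile is omitted since writing one requires a GIS library.

```python
import csv
import json
import os

root = "your_data_root/my_custom_dataset"  # hypothetical dataset root

# Create the expected sub-folders
for sub in ("attributes", "shapes", "timeseries/1D"):
    os.makedirs(os.path.join(root, sub), exist_ok=True)

# attributes.csv: one row per basin, keyed by basin_id
with open(os.path.join(root, "attributes", "attributes.csv"), "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["basin_id", "area", "mean_elevation"])
    w.writerow(["basin_01", 1250.0, 820.5])

# One time series CSV per basin, named after its basin_id
with open(os.path.join(root, "timeseries", "1D", "basin_01.csv"), "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["time", "precipitation", "streamflow"])
    w.writerow(["2000-01-01", 3.2, 14.7])

# 1D_units_info.json: units for each time series variable
units = {"precipitation": "mm/d", "streamflow": "m3/s"}
with open(os.path.join(root, "timeseries", "1D_units_info.json"), "w") as f:
    json.dump(units, f)
```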

2. Read the Data in Python

Once your data is organized, you can use SelfMadeHydroDataset to read it with just a few lines of code.

from hydrodatasource.reader.data_source import SelfMadeHydroDataset

# 1. Define the path to your data's parent directory and the dataset name
data_path = "/path/to/your_data_root/"
dataset_name = "my_custom_dataset"

# 2. Initialize the reader
# Specify the time units you want to work with
reader = SelfMadeHydroDataset(data_path=data_path, dataset_name=dataset_name, time_unit=["1D"])

# 3. Get a list of all available basin IDs
basin_ids = reader.read_object_ids()

# 4. Define the time range and variables you want to load
t_range = ["2000-01-01", "2010-12-31"]
variables_to_read = ["precipitation", "streamflow", "temperature"]

# 5. Read the time series data
# The result is a dictionary of xarray.Datasets, keyed by time unit
timeseries_data = reader.read_ts_xrdataset(
    gage_id_lst=basin_ids,
    t_range=t_range,
    var_lst=variables_to_read,
    time_units=["1D"]
)

daily_data = timeseries_data["1D"]

print("Successfully loaded data:")
print(daily_data)

# You can also read the static attributes
attributes_data = reader.read_attr_xrdataset(gage_id_lst=basin_ids, var_lst=["area", "mean_elevation"])
print("\nAttributes:")
print(attributes_data)
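Because the returned datasets are ordinary xarray objects, you can slice and aggregate them with standard xarray operations. The sketch below mimics the shape of a loaded daily dataset with a small hand-built `xarray.Dataset` (the dimension names `basin` and `time` are assumptions, not guaranteed by the library), then selects one basin over a sub-period:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Stand-in for timeseries_data["1D"]: variables indexed by (basin, time).
# The dimension names here are assumptions for illustration.
times = pd.date_range("2000-01-01", periods=4, freq="D")
basins = ["basin_01", "basin_02"]
daily_data = xr.Dataset(
    {
        "streamflow": (
            ("basin", "time"),
            np.array([[10.0, 12.0, 11.0, 9.0],
                      [20.0, 18.0, 22.0, 21.0]]),
        ),
    },
    coords={"basin": basins, "time": times},
)

# Select one basin and a sub-period, then aggregate
flow_01 = daily_data["streamflow"].sel(
    basin="basin_01", time=slice("2000-01-01", "2000-01-02")
)
print(float(flow_01.mean()))  # 11.0
```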

Other Features

Beyond reading data, hydrodatasource also includes modules for:

  • processor: Perform advanced calculations like identifying rainfall-runoff events (dmca_esr.py) and calculating basin-wide mean rainfall from station data (basin_mean_rainfall.py).
  • cleaner: Clean raw time series data. This includes tools for smoothing noisy streamflow data, correcting anomalies in rainfall and water level records, and back-calculating reservoir inflow.

The usage of these modules is described in the API Reference. We will add more examples in the future.
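As a conceptual illustration of what a basin-mean-rainfall calculation involves (this is not the library's actual API; `basin_mean_rainfall` below is a hypothetical helper), a minimal sketch might average gauge readings, optionally with Thiessen-style area weights:

```python
def basin_mean_rainfall(station_rainfall, weights=None):
    """Average station rainfall over a basin.

    station_rainfall: {station_id: rainfall depth (mm)}
    weights: optional {station_id: fractional area weight summing to 1};
    a plain arithmetic mean is used when no weights are given.
    """
    if weights is None:
        return sum(station_rainfall.values()) / len(station_rainfall)
    return sum(station_rainfall[s] * w for s, w in weights.items())

# Arithmetic mean of three gauges
print(basin_mean_rainfall({"g1": 10.0, "g2": 20.0, "g3": 30.0}))  # 20.0

# Thiessen-style weighted mean of two gauges
print(basin_mean_rainfall({"g1": 10.0, "g2": 20.0},
                          {"g1": 0.25, "g2": 0.75}))  # 17.5
```

The library's own implementation derives the weights from the basin and station geometries; this sketch only shows the averaging step.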

Installation

For standard use, install the package from PyPI:

pip install hydrodatasource

Development Setup

For developers, it is recommended to use uv to manage the environment, as this project has local dependencies (e.g., hydroutils, hydrodataset).

  1. Clone the repository:

    git clone https://github.com/iHeadWater/hydrodatasource.git
    cd hydrodatasource
    
  2. Sync the environment with uv: This command will install all dependencies, including the local editable packages.

    uv sync --all-extras
    

Project details


Download files

Download the file for your platform.

Source Distribution

hydrodatasource-0.2.0.tar.gz (112.0 kB)

Uploaded Source

Built Distribution


hydrodatasource-0.2.0-py3-none-any.whl (106.1 kB)

Uploaded Python 3

File details

Details for the file hydrodatasource-0.2.0.tar.gz.

File metadata

  • Download URL: hydrodatasource-0.2.0.tar.gz
  • Upload date:
  • Size: 112.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hydrodatasource-0.2.0.tar.gz

  • SHA256: 8767257d7cc449b04367fe6810a19e0377b69cb862352d895c3421a044448b21
  • MD5: 1281640afa8a10f24944808490304ec2
  • BLAKE2b-256: 7c85d985101af72dffea52a76b9cd21fef6d63bd445eda0494dbcbd84feb8d72


Provenance

The following attestation bundles were made for hydrodatasource-0.2.0.tar.gz:

Publisher: pypi.yml on iHeadWater/hydrodatasource

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file hydrodatasource-0.2.0-py3-none-any.whl.

File metadata

File hashes

Hashes for hydrodatasource-0.2.0-py3-none-any.whl

  • SHA256: 0ee23daecad6869d84926f26bfdedbfdb8810e45cc487199c55447285f7aa478
  • MD5: 6adae0471cc79e4b4658f2249600f856
  • BLAKE2b-256: 706c958217ef0fe51973e4fca8db5cb21870f6e717dc9361ae18c5f830d91891


Provenance

The following attestation bundles were made for hydrodatasource-0.2.0-py3-none-any.whl:

Publisher: pypi.yml on iHeadWater/hydrodatasource

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
