Hydrological modeling package
hydromodel
A lightweight Python package for hydrological model calibration and evaluation, featuring the XinAnJiang (XAJ) model.
- Free software: GNU General Public License v3
- Documentation: https://OuyangWenyu.github.io/hydromodel
What is hydromodel
hydromodel is a Python implementation of conceptual hydrological models, with a focus on the XinAnJiang (XAJ) model, one of the most widely used rainfall-runoff models, especially in China and other parts of Asia.
Key Features:
- XAJ Model Variants: Standard XAJ and optimized versions (xaj_mz with MizuRoute)
- Multiple Calibration Algorithms:
  - SCE-UA: Shuffled Complex Evolution with spotpy
  - GA: Genetic Algorithm with DEAP
  - scipy: L-BFGS-B, SLSQP, and other gradient-based methods
- Multi-Basin Support: Efficient calibration and evaluation for multiple basins simultaneously
- Unified Results Format: All algorithms save results in standardized JSON + CSV format
- Comprehensive Evaluation Metrics: NSE, KGE, RMSE, PBIAS, and more
- Unified API: Consistent interfaces for calibration, evaluation, and simulation
- Flexible Data Integration: Seamless support for CAMELS datasets via hydrodataset and custom data via hydrodatasource
- Configuration-Based Workflow: YAML configuration for reproducibility
- Progress Tracking: Real-time progress display and intermediate results saving
Why hydromodel?
For Researchers:
- Battle-tested XAJ implementations used in published research
- Configuration-based workflow ensures reproducibility
- Easy to extend with new models or calibration algorithms
For Practitioners:
- Simple YAML configuration, minimal coding required
- Handles multi-basin calibration efficiently
- Integration with global CAMELS series datasets (20+ variants)
- Clear documentation and examples
Installation
For Users
pip install hydromodel hydrodataset hydrodatasource
Or using uv (faster):
uv pip install hydromodel hydrodataset hydrodatasource
Development Setup
For developers, it is recommended to use uv to manage the environment, as this project has local dependencies (e.g., hydroutils, hydrodataset, hydrodatasource).
- Clone the repository:
git clone https://github.com/OuyangWenyu/hydromodel.git
cd hydromodel
- Sync the environment with uv. This command installs all dependencies, including the local editable packages:
uv sync --all-extras
Configuration
Option 1: Use Default Paths (Recommended for Quick Start)
No configuration needed! hydromodel automatically uses default paths:
Default data directory:
- Windows: C:\Users\YourUsername\hydromodel_data\
- macOS/Linux: ~/hydromodel_data/
The default structure (aqua_fetch automatically creates uppercase dataset directories):
~/hydromodel_data/
├── datasets-origin/
│ ├── CAMELS_US/ # CAMELS US dataset (created by aqua_fetch)
│ ├── CAMELS_AUS/ # CAMELS Australia dataset (if used)
│ └── ... # Other datasets
├── datasets-interim/ # Your custom basin data
└── ...
Option 2: Custom Paths (For Advanced Users)
Create ~/hydro_setting.yml to specify custom paths:
local_data_path:
root: 'D:/data'
datasets-origin: 'D:/data' # For CAMELS datasets (aqua_fetch adds CAMELS_US automatically)
datasets-interim: 'D:/data/my_basins' # For custom data
Important: For CAMELS datasets, provide only the datasets-origin directory. The system automatically appends the uppercase dataset directory name (e.g., CAMELS_US, CAMELS_AUS). If your data is in D:/data/CAMELS_US/, set datasets-origin: 'D:/data'.
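The path rule above can be illustrated with a small sketch. This is not aqua_fetch's or hydromodel's code; the helper name `resolve_dataset_dir` is hypothetical and only mirrors the behaviour described here (parent directory plus uppercase dataset folder name).

```python
from pathlib import PurePosixPath


def resolve_dataset_dir(datasets_origin: str, dataset_name: str) -> str:
    """Append the uppercase dataset folder name to the datasets-origin directory."""
    origin = PurePosixPath(datasets_origin)
    folder = dataset_name.upper()
    if origin.name == folder:
        # Pointing datasets-origin at the dataset folder itself would give a
        # doubled path such as D:/data/CAMELS_US/CAMELS_US.
        raise ValueError(f"datasets-origin should be the parent of {folder}")
    return str(origin / folder)


print(resolve_dataset_dir("D:/data", "camels_us"))  # D:/data/CAMELS_US
```

The guard against a doubled folder name is an illustrative extra; it catches the most likely misconfiguration of datasets-origin.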
How to Use
1. Data Preparation
Using CAMELS Datasets (hydrodataset):
Install hydrodataset to access public datasets:
pip install hydrodataset
Then run the following code to download the data to your directory:
from hydrodataset.camels_us import CamelsUs
# Auto-downloads if not found. Provide the datasets-origin directory (e.g., "D:/data");
# aqua_fetch automatically appends the dataset name, creating "D:/data/CAMELS_US/".
data_path = "D:/data"  # example path; replace with your own datasets-origin directory
ds = CamelsUs(data_path)
basin_ids = ds.read_object_ids() # Get basin IDs
Note: First-time download may take some time. The complete CAMELS dataset is approximately 70GB (including zipped and unzipped files).
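Given the size quoted above, it may be worth checking free disk space before the first download. The following is a minimal sketch using only the standard library; the helper name `has_enough_space` and the 70 GB default are assumptions taken from the note, not part of hydrodataset.

```python
import shutil

REQUIRED_GB = 70  # approximate CAMELS size quoted in the note above


def has_enough_space(path: str, required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


if not has_enough_space("."):
    print("Less than 70 GB free here; consider another datasets-origin directory.")
```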
Available datasets: please see README.md in hydrodataset
Using Custom Data (hydrodatasource):
To read your own data with hydrodatasource, install the package and prepare the data in the selfmadehydrodataset format:
pip install hydrodatasource
Data structure:
/path/to/your_data_root/
└── my_custom_dataset/ # your dataset name
├── attributes/
│ └── attributes.csv
├── shapes/
│ └── basins.shp
└── timeseries/
├── 1D/ # One sub folder per time resolution (e.g. 1D/3h/1h)
│ ├── basin_01.csv
│ ├── basin_02.csv
│ └── ...
└── 1D_units_info.json # JSON file containing unit information
Required files and formats:
- attributes/attributes.csv: Basin metadata with required columns
  - basin_id: Unique basin identifier (e.g., "basin_001")
  - area: Basin area in km² (mapped to basin_area internally)
  - Additional columns: Any basin attributes (e.g., elevation, slope)
- shapes/basins.shp: Basin boundary shapefiles (all 4 files required: .shp, .shx, .dbf, .prj)
  - Must contain a BASIN_ID column (uppercase) matching basin IDs in attributes.csv
  - Geometries: Polygon features defining basin boundaries
  - Coordinate system: Any valid CRS (e.g., EPSG:4326 for WGS84)
- timeseries/{time_scale}/{basin_id}.csv: Time series data for each basin
  - time: Datetime column (e.g., "2010-01-01")
  - Variable columns: prcp, PET, streamflow (or your chosen variable names)
  - Format: CSV with header row
- timeseries/{time_scale}_units_info.json: Variable units metadata
  - JSON format: {"variable_name": "unit"} (e.g., {"prcp": "mm/day"})
  - Must match variable names in time series files
For detailed format specifications and examples, see:
- Data Guide - Complete guide for both CAMELS and custom data
- hydrodatasource documentation - Source package
- configs/example_config_selfmade.yaml - Complete configuration example for custom datasets
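As a rough illustration of the layout described above, the sketch below writes a minimal dataset skeleton into a temporary directory and checks that every time series variable has a unit entry. It uses only the standard library; the helper names, the numeric values, and the streamflow unit are made up for the example, and the shapefile is left out because it cannot be written without a GIS library.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_minimal_dataset(root: Path, name: str = "my_custom_dataset") -> Path:
    """Create attributes.csv, one daily time series, and the units file."""
    base = root / name
    (base / "attributes").mkdir(parents=True)
    (base / "shapes").mkdir()  # basins.shp and its sidecar files would go here
    (base / "timeseries" / "1D").mkdir(parents=True)

    with open(base / "attributes" / "attributes.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["basin_id", "area"])
        writer.writerow(["basin_01", 1234.5])  # made-up area in km²

    with open(base / "timeseries" / "1D" / "basin_01.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time", "prcp", "PET", "streamflow"])
        writer.writerow(["2010-01-01", 1.2, 0.8, 15.0])  # made-up values
        writer.writerow(["2010-01-02", 0.0, 0.9, 14.1])

    # The streamflow unit is an assumption for illustration only.
    units = {"prcp": "mm/day", "PET": "mm/day", "streamflow": "m^3/s"}
    with open(base / "timeseries" / "1D_units_info.json", "w") as f:
        json.dump(units, f, indent=2)
    return base


def check_units_match(base: Path, time_scale: str = "1D") -> list[str]:
    """Return variables found in time series headers that lack a unit entry."""
    with open(base / "timeseries" / f"{time_scale}_units_info.json") as f:
        units = json.load(f)
    missing = set()
    for csv_path in sorted((base / "timeseries" / time_scale).glob("*.csv")):
        with open(csv_path, newline="") as f:
            header = next(csv.reader(f))
        missing.update(c for c in header if c != "time" and c not in units)
    return sorted(missing)


with tempfile.TemporaryDirectory() as tmp:
    dataset = write_minimal_dataset(Path(tmp))
    missing_units = check_units_match(dataset)
print(missing_units)  # an empty list means every variable has a unit
```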
2. Quick Start: Calibration, Evaluation, Simulation, and Visualization
Option 1: Use Command-Line Scripts (Recommended for Beginners)
We provide ready-to-use scripts for model calibration, evaluation, simulation, and visualization:
# 1. Calibration (saves config files by default)
python scripts/run_xaj_calibration.py --config configs/example_config.yaml
# 2. Evaluation on test period
python scripts/run_xaj_evaluate.py --calibration-dir results/xaj_mz_SCE_UA
# 3. Simulation with custom parameters (no calibration required!)
python scripts/run_xaj_simulate.py --config configs/example_simulate_config.yaml --param-file configs/example_xaj_params.yaml --plot
# 4. Visualization (time series plots with precipitation and streamflow)
python scripts/visualize.py --eval-dir results/xaj_mz_SCE_UA/evaluation_test
# Visualize specific basins
python scripts/visualize.py --eval-dir results/xaj_mz_SCE_UA/evaluation_test --basins 01013500
Configuration Files:
Edit the appropriate configuration file for your data type:
- configs/example_config.yaml - For continuous time series data (e.g., CAMELS datasets)
- configs/example_config_selfmade.yaml - For custom data and flood event datasets
All configuration options work with the same unified API. For detailed flood event data usage, see Usage Guide - Flood Event Data.
Option 2: Use Python API (For Advanced Users)
from hydromodel.trainers.unified_calibrate import calibrate
from hydromodel.trainers.unified_evaluate import evaluate
config = {
"data_cfgs": {
"data_source_type": "camels_us",
"basin_ids": ["01013500"],
"train_period": ["1985-10-01", "1995-09-30"],
"test_period": ["2005-10-01", "2014-09-30"],
"warmup_length": 365,
"variables": ["precipitation", "potential_evapotranspiration", "streamflow"]
},
"model_cfgs": {
"model_name": "xaj_mz",
},
"training_cfgs": {
"algorithm_name": "SCE_UA",
"algorithm_params": {"rep": 5000, "ngs": 1000},
"loss_config": {"type": "time_series", "obj_func": "RMSE"},
"output_dir": "results",
"experiment_name": "my_experiment",
},
"evaluation_cfgs": {
"metrics": ["NSE", "KGE", "RMSE"],
},
}
results = calibrate(config) # Calibrate
evaluate(config, param_dir="results/my_experiment", eval_period="test") # Evaluate
Results are saved in the results/ directory.
Core API
Configuration Structure
The unified API uses a configuration dictionary with four main sections:
config = {
"data_cfgs": {
"data_source_type": "camels_us", # Dataset type
"basin_ids": ["01013500"], # Basin IDs to calibrate
"train_period": ["1990-10-01", "2000-09-30"],
"test_period": ["2000-10-01", "2010-09-30"],
"warmup_length": 365, # Warmup days
"variables": ["precipitation", "potential_evapotranspiration", "streamflow"],
},
"model_cfgs": {
"model_name": "xaj_mz", # Model variant
"model_params": {
"source_type": "sources",
"source_book": "HF",
"kernel_size": 15, # Muskingum routing kernel
},
},
"training_cfgs": {
"algorithm_name": "GA", # Algorithm: SCE_UA, GA, or scipy
# Algorithm-specific parameters (choose one based on algorithm_name)
# For SCE-UA (Shuffled Complex Evolution):
"SCE_UA": {
"rep": 1000, # Iterations (5000+ recommended)
"ngs": 1000, # Number of complexes
"kstop": 500, # Stop if no improvement
"peps": 0.1, # Parameter convergence
"pcento": 0.1, # Percentage change allowed
"random_seed": 1234,
},
# For GA (Genetic Algorithm):
"GA": {
"pop_size": 80, # Population size
"n_generations": 50, # Generations (100+ recommended)
"cx_prob": 0.7, # Crossover probability
"mut_prob": 0.2, # Mutation probability
"random_seed": 1234,
},
# For scipy (gradient-based optimization):
"scipy": {
"method": "SLSQP", # L-BFGS-B, SLSQP, TNC, etc.
"max_iterations": 500, # Maximum iterations
},
"loss_config": {
"type": "time_series",
"obj_func": "RMSE", # RMSE, NSE, or KGE
},
"output_dir": "results",
"experiment_name": "my_exp",
"save_config": True, # Save config files to output directory (default: True)
},
"evaluation_cfgs": {
"metrics": ["NSE", "KGE", "RMSE", "PBIAS"],
},
}
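Before passing such a dictionary to the API, a quick sanity check can catch simple mistakes. The sketch below is a hypothetical helper, not part of hydromodel: it only checks the four sections named above, the ordering of period dates, and whether the algorithm name is one of those listed in this README.

```python
from datetime import date

REQUIRED_SECTIONS = ("data_cfgs", "model_cfgs", "training_cfgs", "evaluation_cfgs")
# Names and aliases as listed in this README; the package may accept others.
KNOWN_ALGORITHMS = {"SCE_UA", "sceua", "GA", "genetic_algorithm", "scipy", "scipy_minimize"}


def check_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in config]

    data_cfgs = config.get("data_cfgs", {})
    for key in ("train_period", "test_period"):
        if key in data_cfgs:
            start, end = (date.fromisoformat(d) for d in data_cfgs[key])
            if start >= end:
                problems.append(f"{key}: start {start} is not before end {end}")

    algorithm = config.get("training_cfgs", {}).get("algorithm_name")
    if algorithm is not None and algorithm not in KNOWN_ALGORITHMS:
        problems.append(f"unknown algorithm_name: {algorithm}")
    return problems


example = {
    "data_cfgs": {
        "train_period": ["1990-10-01", "2000-09-30"],
        "test_period": ["2000-10-01", "2010-09-30"],
    },
    "model_cfgs": {"model_name": "xaj_mz"},
    "training_cfgs": {"algorithm_name": "GA"},
    "evaluation_cfgs": {"metrics": ["NSE", "KGE"]},
}
print(check_config(example))  # []
```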
Configuration for custom datasets:
See configs/example_config_selfmade.yaml for a complete example. Custom datasets require additional parameters:
"data_cfgs": {
"dataset": "selfmadehydrodataset", # or "floodevent" for flood event data
"dataset_name": "my_basin_data", # Your dataset folder name (REQUIRED)
"time_unit": ["1D"], # Time resolution (e.g., ["1h"], ["3h"], ["1D"])
"datasource_kwargs": { # Optional additional parameters
"offset_to_utc": False, # Whether to convert local time to UTC
},
"is_event_data": True, # Whether the data are flood event data
# ... other standard parameters (basin_ids, variables, periods, etc.)
},
Key differences from CAMELS datasets:
- dataset_name: Specifies your custom dataset folder name (required)
- time_unit: Must match the subdirectory names in the timeseries/ folder
- datasource_kwargs: Optional parameters for data preprocessing
Calibration API
from hydromodel.trainers.unified_calibrate import calibrate
results = calibrate(config)
Output: Calibration results saved to {output_dir}/{experiment_name}/
Saved files:
results/my_exp/
├── calibration_results.json # Best parameters for all basins (unified format)
├── {basin_id}_sceua.csv # SCE-UA detailed iteration history
├── {basin_id}_ga.csv # GA generation history with parameters
├── {basin_id}_scipy.csv # scipy iteration history with parameters
├── calibration_config.yaml # Configuration used (saved if save_config=True)
└── param_range.yaml # Parameter ranges for current model only (saved if save_config=True)
Notes:
- calibration_results.json: Always saved, contains best parameters
- calibration_config.yaml and param_range.yaml: Only saved if save_config=True (default)
- param_range.yaml: Contains parameter ranges for the current model only (e.g., only xaj_mz, not all models)
- In calibration_config.yaml, param_range_file is set to the actual saved path
Available algorithms:
- SCE_UA / sceua: Shuffled Complex Evolution (recommended for global optimization)
- GA / genetic_algorithm: Genetic Algorithm with DEAP (flexible, handles complex landscapes)
- scipy / scipy_minimize: scipy.optimize methods (fast for smooth objectives)
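To give a feel for what the GA settings (pop_size, n_generations, cx_prob, mut_prob, random_seed) control, here is a deliberately small genetic algorithm in pure Python. It is a teaching sketch under simple assumptions (parameters normalized to [0, 1], tournament selection, uniform crossover, Gaussian mutation, elitism); it is not DEAP and not hydromodel's calibration code, whose operators may differ.

```python
import random


def toy_ga(objective, n_params, pop_size=80, n_generations=50,
           cx_prob=0.7, mut_prob=0.2, random_seed=1234):
    """Minimize `objective` over [0, 1]^n_params with a very small GA."""
    rng = random.Random(random_seed)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    history = []  # best objective value seen at the start of each generation

    for _ in range(n_generations):
        scored = sorted(pop, key=objective)
        history.append(objective(scored[0]))
        next_pop = [scored[0][:]]  # elitism: the best individual always survives

        while len(next_pop) < pop_size:
            # Tournament selection: the better of two random individuals wins.
            p1 = min(rng.sample(pop, 2), key=objective)
            p2 = min(rng.sample(pop, 2), key=objective)
            child = p1[:]
            if rng.random() < cx_prob:
                # Uniform crossover: take each gene from either parent.
                child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            if rng.random() < mut_prob:
                # Gaussian mutation on one gene, clipped to the unit interval.
                i = rng.randrange(n_params)
                child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.1)))
            next_pop.append(child)
        pop = next_pop

    best = min(pop, key=objective)
    history.append(objective(best))
    return best, history


def sphere(x):
    """Toy objective with its minimum at 0.3 in every dimension."""
    return sum((xi - 0.3) ** 2 for xi in x)


best, history = toy_ga(sphere, n_params=3)
print(best, history[0], history[-1])
```

Because the best individual is always carried over, the recorded best objective never gets worse from one generation to the next, which is the behaviour a per-generation history file lets you inspect.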
Evaluation API
from hydromodel.trainers.unified_evaluate import evaluate
# Evaluate on test period
test_results = evaluate(config, param_dir="results/my_exp", eval_period="test")
# Evaluate on training period
train_results = evaluate(config, param_dir="results/my_exp", eval_period="train")
# Evaluate on custom period
custom_results = evaluate(
config,
param_dir="results/my_exp",
eval_period="custom",
custom_period=["2010-10-01", "2015-09-30"]
)
Output: Evaluation results in {param_dir}/evaluation_{period}/
- basins_metrics.csv - Performance metrics
- basins_norm_params.csv - Calibrated parameters (normalized [0,1])
- basins_denorm_params.csv - Denormalized parameters (physical values)
- xaj_mz_evaluation_results.nc - Full simulation results (NetCDF)
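The relation between normalized and denormalized parameters can be sketched as a linear min-max mapping. This mapping is an assumption made for illustration (the package's param_range.yaml defines the real bounds, and its scaling may differ); the bounds below are hypothetical.

```python
def denormalize(norm_params: dict, ranges: dict) -> dict:
    """Map values in [0, 1] to physical values using (low, high) bounds."""
    return {
        name: ranges[name][0] + value * (ranges[name][1] - ranges[name][0])
        for name, value in norm_params.items()
    }


def normalize(phys_params: dict, ranges: dict) -> dict:
    """Inverse mapping: physical values back to [0, 1]."""
    return {
        name: (value - ranges[name][0]) / (ranges[name][1] - ranges[name][0])
        for name, value in phys_params.items()
    }


example_ranges = {"K": (0.1, 1.0), "B": (0.1, 0.4)}  # hypothetical bounds
physical = denormalize({"K": 0.5, "B": 1.0}, example_ranges)
print(physical)
```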
Parameter Loading Priority:
1. calibration_results.json (⭐ Recommended, works for all algorithms)
2. {basin_id}_ga.csv (GA algorithm CSV)
3. {basin_id}_scipy.csv (scipy algorithm CSV)
4. {basin_id}_sceua.csv (SCE-UA algorithm CSV)
5. {basin_id}_calibrate_params.txt (Legacy format)
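The priority order above amounts to "use the first file that exists". A minimal sketch of that lookup follows; the function names are hypothetical and this is not how hydromodel itself is implemented, only an illustration of the ordering.

```python
import tempfile
from pathlib import Path


def candidate_param_files(param_dir, basin_id):
    """List candidate parameter files in the priority order given above."""
    base = Path(param_dir)
    return [
        base / "calibration_results.json",
        base / f"{basin_id}_ga.csv",
        base / f"{basin_id}_scipy.csv",
        base / f"{basin_id}_sceua.csv",
        base / f"{basin_id}_calibrate_params.txt",
    ]


def first_existing(param_dir, basin_id):
    """Return the highest-priority file that exists, or None if there is none."""
    for path in candidate_param_files(param_dir, basin_id):
        if path.exists():
            return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "01013500_sceua.csv").touch()
    only_csv = first_existing(tmp, "01013500").name
    (Path(tmp) / "calibration_results.json").touch()
    with_json = first_existing(tmp, "01013500").name
print(only_csv, with_json)
```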
Available metrics: NSE, KGE, RMSE, PBIAS, FHV, FLV, FMS
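For reference, four of these metrics can be written in a few lines of plain Python using their standard textbook definitions. This is a sketch, not the code hydromodel uses: the sign convention for PBIAS varies between sources (here a positive value means the simulation overestimates), KGE follows the 2009 formulation with correlation, variability ratio, and bias ratio, and FHV, FLV, and FMS are not shown.

```python
import math


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def pbias(obs, sim):
    """Percent bias; positive means overestimation under this convention."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability and bias ratios."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim)) / n
    r = cov / (std_o * std_s)
    alpha = std_s / std_o
    beta = mean_s / mean_o
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


obs = [1.0, 2.0, 3.0, 4.0]  # made-up observations
sim = [2.0, 3.0, 4.0, 5.0]  # made-up simulation, offset by +1
print(rmse(obs, sim), nse(obs, sim), pbias(obs, sim), kge(obs, sim))
```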
Understanding Results Format
calibration_results.json structure:
{
"01013500": {
"convergence": "success",
"objective_value": 1.234567,
"best_params": {
"xaj": {
"K": 0.567890,
"B": 0.234567,
"IM": 0.045678,
...
}
},
"algorithm_info": {
"generations": 50,
"population_size": 80,
...
}
}
}
CSV files (GA/scipy) structure:
generation,objective_value,param_K,param_B,param_IM,...
0,3.456,0.567,0.234,0.045,...
1,2.345,0.589,0.256,0.047,...
Why two formats?
- JSON: Best parameters only, works with all algorithms, used by evaluation
- CSV: Full iteration/generation history, useful for convergence analysis
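Both formats can be read with the standard library alone. The sketch below parses inline samples that copy only the keys and columns shown above (the elided "..." fields are left out), and it assumes a lower objective value is better, as with RMSE.

```python
import csv
import io
import json

sample_json = """
{
  "01013500": {
    "convergence": "success",
    "objective_value": 1.234567,
    "best_params": {"xaj": {"K": 0.567890, "B": 0.234567, "IM": 0.045678}}
  }
}
"""

results = json.loads(sample_json)
best_by_basin = {}
for basin_id, entry in results.items():
    # best_params is keyed by model name; flatten to {basin: {param: value}}.
    for model_name, params in entry["best_params"].items():
        best_by_basin[basin_id] = params

sample_csv = """generation,objective_value,param_K,param_B,param_IM
0,3.456,0.567,0.234,0.045
1,2.345,0.589,0.256,0.047
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
# Assumes the objective is minimized; pick the generation with the lowest value.
best_row = min(rows, key=lambda row: float(row["objective_value"]))

print(best_by_basin["01013500"]["K"])
print(best_row["generation"], best_row["objective_value"])
```

With real output, replace the inline strings with `open(...)` on calibration_results.json and the per-basin CSV file.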
Simulation API
Important: Simulation does NOT require prior calibration!
UnifiedSimulator provides a flexible interface for running model simulations with any parameter values:
from hydromodel.trainers.unified_simulate import UnifiedSimulator
from hydromodel.datasets.unified_data_loader import UnifiedDataLoader
# Load data
data_loader = UnifiedDataLoader(config["data_cfgs"])
p_and_e, qobs = data_loader.load_data()
# Define parameters (from calibration, literature, or custom values)
parameters = {
"K": 0.75, "B": 0.25, "IM": 0.06,
"UM": 18.0, "LM": 80.0, "DM": 95.0,
# ... other parameters
}
# Create simulator
model_config = {
"model_name": "xaj_mz",
"parameters": parameters
}
# NOTE: basin_config is not defined in this snippet; define it before this call
simulator = UnifiedSimulator(model_config, basin_config)
# Run simulation
results = simulator.simulate(
inputs=p_and_e,
qobs=qobs,
warmup_length=365
)
# Extract results
qsim = results["qsim"] # Simulated streamflow
Command-line usage:
# Using custom parameters (works with any parameter values)
python scripts/run_xaj_simulate.py \
--config configs/example_simulate_config.yaml \
--param-file configs/example_xaj_params.yaml \
--output simulation_results.csv \
--plot
# Using calibrated parameters from SCE-UA (CSV format)
python scripts/run_xaj_simulate.py \
--param-file results/xaj_mz_SCE_UA/01013500_sceua.csv \
--plot
Use cases:
- Parameter sensitivity analysis
- Model comparison
- Scenario testing with custom parameters
- Literature parameter validation
For detailed API documentation and advanced usage, see Usage Guide - Model Simulation.
Project Structure
hydromodel/
├── hydromodel/
│ ├── models/ # Model implementations
│ │ ├── xaj.py # Standard XAJ model
│ │ ├── gr4j.py # GR4J model
│ │ └── ...
│ ├── trainers/ # Calibration, evaluation, and simulation
│ │ ├── unified_calibrate.py # Calibration API
│ │ ├── unified_evaluate.py # Evaluation API
│ │ └── unified_simulate.py # Simulation API
│ └── datasets/ # Data preprocessing and visualization
│ ├── unified_data_loader.py # Data loader
│ ├── data_visualize.py # Visualization functions
│ └── ...
├── scripts/ # Command-line interface scripts
│ ├── run_xaj_calibration.py # Calibration script
│ ├── run_xaj_evaluate.py # Evaluation script
│ ├── run_xaj_simulate.py # Simulation script
│ └── visualize.py # Visualization CLI
├── configs/ # Configuration files
└── docs/ # Documentation
Documentation
- Quick Start: docs/quickstart.md
- Usage Guide: docs/usage.md
- API Reference: https://OuyangWenyu.github.io/hydromodel
References
- Allen, R.G., L. Pereira, D. Raes, and M. Smith, 1998. Crop Evapotranspiration, Food and Agriculture Organization of the United Nations, Rome, Italy. FAO publication 56. ISBN 92-5-104219-5. 290p.
- Duan, Q., Sorooshian, S., and Gupta, V. (1992), Effective and efficient global optimization for conceptual rainfall-runoff models, Water Resour. Res., 28(4), 1015–1031, doi:10.1029/91WR02985.
- François-Michel De Rainville, Félix-Antoine Fortin, Marc-André Gardner, Marc Parizeau, and Christian Gagné. 2012. DEAP: a python framework for evolutionary algorithms. In Proceedings of the 14th annual conference companion on Genetic and evolutionary computation (GECCO '12). Association for Computing Machinery, New York, NY, USA, 85–92. DOI:https://doi.org/10.1145/2330784.2330799
- Houska T, Kraft P, Chamorro-Chavez A, Breuer L (2015) SPOTting Model Parameters Using a Ready-Made Python Package. PLoS ONE 10(12): e0145180. https://doi.org/10.1371/journal.pone.0145180
- Mizukami, N., Clark, M. P., Sampson, K., Nijssen, B., Mao, Y., McMillan, H., Viger, R. J., Markstrom, S. L., Hay, L. E., Woods, R., Arnold, J. R., and Brekke, L. D.: mizuRoute version 1: a river network routing tool for a continental domain water resources applications, Geosci. Model Dev., 9, 2223–2238, https://doi.org/10.5194/gmd-9-2223-2016, 2016.
- Zhao, R.J., Zhuang, Y. L., Fang, L. R., Liu, X. R., Zhang, Q. S. (ed) (1980) The Xinanjiang model, Hydrological Forecasting Proc., Oxford Symp., IAHS Publication, Wallingford, U.K.
- Zhao, R.J., 1992. The xinanjiang model applied in China. J Hydrol 135 (1–4), 371–381.
Related Projects:
- hydrodataset - CAMELS and other datasets
- hydrodatasource - Data preparation utilities
- torchhydro - PyTorch-based hydrological models
Citation
If you use hydromodel in your research, please cite:
@software{hydromodel,
author = {Ouyang, Wenyu},
title = {hydromodel: A Python Package for Hydrological Model Calibration},
year = {2025},
url = {https://github.com/OuyangWenyu/hydromodel}
}
Contributing
Contributions are welcome! For major changes, please open an issue first.
git clone https://github.com/OuyangWenyu/hydromodel.git
cd hydromodel
uv sync --all-extras
pytest tests/
License
GNU General Public License v3.0 - see LICENSE file.
Contact
- Author: Wenyu Ouyang
- Email: wenyuouyang@outlook.com
- GitHub: https://github.com/OuyangWenyu/hydromodel
- Issues: https://github.com/OuyangWenyu/hydromodel/issues