
hydrodataset


A Python package for downloading and reading hydrological datasets

Installation

It is quite easy to install hydrodataset; we provide a pip package:

pip install hydrodataset

We highly recommend installing this package in a virtual environment so that it does not have a negative impact on other packages in your base environment.

For example:

# xxx is your env's name, such as hydrodataset
conda create -n xxx python=3.10
# activate the env
conda activate xxx
# install hydrodataset
conda install pip
pip install hydrodataset

Usage

1. Download datasets

There are many datasets similar to CAMELS(-US), including CAMELS-AUS (Australia), CAMELS-BR (Brazil), CAMELS-CH (Switzerland), CAMELS-CL (Chile), CAMELS-DE (Germany), CAMELS-DK (Denmark), CAMELS-GB (Great Britain), CAMELS-SE (Sweden), LamaH-CE, and HYSETS. Recently, a new global dataset named Caravan was released.

Currently we only support automatic downloading for CAMELS-US (support for the others will come later), but we highly recommend downloading the datasets manually, as automatic downloads can be unreliable due to unstable connections to the servers that host these datasets around the world.

Download the datasets from their official sources, then put the downloaded files in a directory organized as follows (a small sketch after the tree shows one way to create this skeleton):

camels/
├─ camels_aus/
│  ├─ 01_id_name_metadata.zip
│  ├─ 02_location_boundary_area.zip
│  ├─ 03_streamflow.zip
│  ├─ 04_attributes.zip
│  ├─ 05_hydrometeorology.zip
├─ camels_br/
│  ├─ 01_CAMELS_BR_attributes.zip
│  ├─ 02_CAMELS_BR_streamflow_m3s.zip
│  ├─ 03_CAMELS_BR_streamflow_mm_selected_catchments.zip
│  ├─ 04_CAMELS_BR_streamflow_simulated.zip
│  ├─ 05_CAMELS_BR_precipitation_chirps.zip
│  ├─ 06_CAMELS_BR_precipitation_mswep.zip
│  ├─ 07_CAMELS_BR_precipitation_cpc.zip
│  ├─ 08_CAMELS_BR_evapotransp_gleam.zip
│  ├─ 09_CAMELS_BR_evapotransp_mgb.zip
│  ├─ 10_CAMELS_BR_potential_evapotransp_gleam.zip
│  ├─ 11_CAMELS_BR_temperature_min_cpc.zip
│  ├─ 12_CAMELS_BR_temperature_mean_cpc.zip
│  ├─ 13_CAMELS_BR_temperature_max_cpc.zip
│  ├─ 14_CAMELS_BR_catchment_boundaries.zip
│  ├─ 15_CAMELS_BR_gauges_location_shapefile.zip
├─ camels_ch/
│  ├─ camels_ch.zip
├─ camels_cl/
│  ├─ 10_CAMELScl_tmean_cr2met.zip
│  ├─ 11_CAMELScl_pet_8d_modis.zip
│  ├─ 12_CAMELScl_pet_hargreaves.zip
│  ├─ 13_CAMELScl_swe.zip
│  ├─ 14_CAMELScl_catch_hierarchy.zip
│  ├─ 1_CAMELScl_attributes.zip
│  ├─ 2_CAMELScl_streamflow_m3s.zip
│  ├─ 3_CAMELScl_streamflow_mm.zip
│  ├─ 4_CAMELScl_precip_cr2met.zip
│  ├─ 5_CAMELScl_precip_chirps.zip
│  ├─ 6_CAMELScl_precip_mswep.zip
│  ├─ 7_CAMELScl_precip_tmpa.zip
│  ├─ 8_CAMELScl_tmin_cr2met.zip
│  ├─ 9_CAMELScl_tmax_cr2met.zip
│  ├─ CAMELScl_catchment_boundaries.zip
├─ camels_de/
│  ├─ camels_de.zip
├─ camels_dk/
│  ├─ Attributes
│  ├─ Dynamics
│  ├─ Shapefile
├─ camels_gb/
│  ├─ 8344e4f3-d2ea-44f5-8afa-86d2987543a9.zip
├─ camels_se/
│  ├─ catchment properties.zip
│  ├─ catchment time series.zip
│  ├─ catchment_GIS_shapefiles.zip
├─ camels_us/
│  ├─ basin_set_full_res.zip
│  ├─ basin_timeseries_v1p2_metForcing_obsFlow.zip
│  ├─ basin_timeseries_v1p2_modelOutput_daymet.zip
│  ├─ basin_timeseries_v1p2_modelOutput_maurer.zip
│  ├─ basin_timeseries_v1p2_modelOutput_nldas.zip
│  ├─ camels_attributes_v2.0.xlsx
│  ├─ camels_clim.txt
│  ├─ camels_geol.txt
│  ├─ camels_hydro.txt
│  ├─ camels_name.txt
│  ├─ camels_soil.txt
│  ├─ camels_topo.txt
│  ├─ camels_vege.txt
lamah_ce/
├─ 2_LamaH-CE_daily
hysets/
├─ HYSETS_2020_QC_stations.nc
├─ HYSETS_watershed_boundaries.zip
├─ HYSETS_watershed_properties.txt
caravan/
├─ Caravan.zip
├─ Caravan_extension_CH.zip
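
If you prefer to prepare this layout with a script before copying the downloaded archives into place, here is a minimal sketch; the folder names are taken from the tree above, and the root path is only an example that should match your "datasets-origin" setting:

from pathlib import Path

# example root; replace it with the "datasets-origin" path from your hydro_setting.yml
root = Path("D:/data/waterism/datasets-origin")

# sub-directories taken from the layout above
subdirs = [
    "camels/camels_aus", "camels/camels_br", "camels/camels_ch",
    "camels/camels_cl", "camels/camels_de", "camels/camels_dk",
    "camels/camels_gb", "camels/camels_se", "camels/camels_us",
    "lamah_ce", "hysets", "caravan",
]
for sub in subdirs:
    (root / sub).mkdir(parents=True, exist_ok=True)
print("dataset skeleton created under", root)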

2. Run the code

First, run the following Python code:

import hydrodataset

Then, in your home directory, you will find the configuration file for hydrodataset:

  • Windows: C:\Users\xxx\hydro_setting.yml (xxx is your username)
  • Ubuntu: /home/xxx/hydro_setting.yml

The hydro_setting.yml file is a configuration file that contains the paths for your datasets and credentials for services such as MinIO and PostgreSQL.

NOTE: For this repository, we only need the "datasets-origin" entry under "local_data_path", so just fill in that path and leave the other fields empty.

minio:
  server_url: ''
  client_endpoint: ''
  access_key: ''
  secret: ''

local_data_path:
  root: ''
  datasets-origin: 'D:\data\waterism\datasets-origin' # set your path here
  datasets-interim: ''
  basins-origin: ''
  basins-interim: ''
postgres:
  server_url: ''
  port: 0
  username: ''
  password: ''
  database: ''
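
If you want to check that the configuration is picked up correctly, a minimal sketch (assuming PyYAML is available; the key names come from the file above) could be:

import yaml  # PyYAML, assumed to be installed
from pathlib import Path

# hydro_setting.yml lives in your home directory, as described above
setting_file = Path.home() / "hydro_setting.yml"
with open(setting_file, "r") as f:
    setting = yaml.safe_load(f)

# only "datasets-origin" under "local_data_path" matters for this repository
print(setting["local_data_path"]["datasets-origin"])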

Then you can use the functions in hydrodataset; examples can be seen here: https://github.com/OuyangWenyu/hydrodataset/blob/main/examples/scripts.py

NOTE: Please don't modify the interfaces of the functions in hydrodataset, as doing so may cause errors, unless you are prepared to refactor the code entirely.

These functions read attribute, forcing, and streamflow data.

When you first run the code, you should set the parameter "download" to True:

import os
from hydrodataset.camels import Camels
camels = Camels(data_path=os.path.join("camels", "camels_us"), download=True, region="US")

This will unzip all the downloaded files, which takes a few minutes; please be patient.

After the first run, you should set "download" to False:

import os
from hydrodataset.camels import Camels
# default is False
camels = Camels(data_path=os.path.join("camels", "camels_us"), region="US")

You can change data_path to any location where you put the data within the root directory of hydrodataset (the "datasets-origin" path).
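
As a rough sketch of what reading looks like once the data are in place (the method and variable names below are assumptions modeled on the examples script linked above; check it for the actual interface):

import os
from hydrodataset.camels import Camels

camels = Camels(data_path=os.path.join("camels", "camels_us"), region="US")

# ids of all basins in the dataset
basin_ids = camels.read_object_ids()

# streamflow for the first two basins over a sample period
flows = camels.read_target_cols(
    basin_ids[:2], ["1990-01-01", "2000-01-01"], target_cols=["usgsFlow"]
)

# daymet forcing variables for the same basins and period
forcings = camels.read_relevant_cols(
    basin_ids[:2], ["1990-01-01", "2000-01-01"], var_lst=["prcp", "tmax", "tmin"]
)

# static catchment attributes
attrs = camels.read_constant_cols(basin_ids[:2], var_lst=["elev_mean", "slope_mean"])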

Features

HydroDataset is designed to help you (1) download, (2) read, (3) format, and (4) visualize hydrological datasets through a core language (Python) for watershed hydrological modeling.

Note: this repository is still under development and currently only supports fairly simple functions such as downloading and reading data for watersheds.

The dataset zoo currently includes:

Number  Dataset  Description
1       CAMELS   CAMELS series datasets, including CAMELS-AUS/BR/CH/CL/DE/DK/GB/SE/US
2       LamaH    LamaH-CE dataset for Central Europe
3       HYSETS   HYSETS dataset for North America
4       Caravan  Caravan global dataset

For CAMELS-CH/DE/DK/SE, the reading functions are not finished yet, but they will be added soon.

We highly recommend using xarray to read the data, as it is a powerful tool for handling multi-dimensional data, and you can see the units of all variables in the xarray dataset. For CAMELS-US, we provide full support for reading attributes, forcing, and streamflow data with cache-based reading, so repeated reads after the first one are fast.
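
A sketch of the xarray-based reading for CAMELS-US follows; the function names read_ts_xrdataset and read_attr_xrdataset are our assumption based on the examples script, so please verify them against examples/scripts.py:

import os
from hydrodataset.camels import Camels

camels = Camels(data_path=os.path.join("camels", "camels_us"), region="US")
basin_ids = camels.read_object_ids()

# the first call builds a cache and may be slow; later calls read from the cache quickly
forcing_ds = camels.read_ts_xrdataset(
    basin_ids[:2], ["1990-01-01", "2000-01-01"], ["prcp", "tmax", "tmin"]
)
attr_ds = camels.read_attr_xrdataset(basin_ids[:2], ["elev_mean", "slope_mean"])

# units are stored as attributes on each variable of the xarray dataset
print(forcing_ds["prcp"].attrs.get("units"))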

For the other datasets, we only provide reading without caching, so it may take some time to read them; cache-based reading for them will be added soon.

For units, we use pint and pint-xarray.
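
As a generic illustration of what pint-xarray provides (this is plain pint-xarray usage, not a hydrodataset-specific API):

import numpy as np
import xarray as xr
import pint_xarray  # noqa: F401  -- registers the .pint accessor on xarray objects

# a toy precipitation series carrying a units attribute
prcp = xr.DataArray([1.2, 0.0, 5.4], dims="time", attrs={"units": "mm/day"})

# attach pint units from the "units" attribute, then convert to another unit
prcp_q = prcp.pint.quantify()
print(prcp_q.pint.to("m/day"))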

Credits

This package was created with Cookiecutter and the giswqs/pypackage project template.

It was inspired by HydroData and uses some tools made by cheginit.
