Temporal disaggregation of daily precipitation into hourly using Q-CODA.
pyqcoda
pyqcoda is a Python library for temporal disaggregation of daily precipitation into hourly time series using a combination of comonotonicity transformation and an iterative adjusted k-nearest neighbors (KNN) algorithm. It is tailored for hydrological and climate data processing tasks where hourly data is required but only daily observations are available.
🌧️ Overview
Input:
- train_data.csv: Hourly precipitation data with columns datetime (hourly resolution) and precipitation (mm).
- test_data.csv: Daily precipitation data with the same column names but daily resolution (datetime at 00:00:00 for each day).
- (Optional) params.csv: Parameters for the semi-parametric Bernoulli-Gamma mode.
- (Optional) seasons.csv: User-defined climatological seasons. If omitted, the default seasons DJF, MAM, JJA, SON are used.

Output:
- A pandas DataFrame (or CSV) with hourly precipitation disaggregated from the daily values in test_data, using statistical patterns learned from train_data.
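As a minimal sketch of the input layout described above, the snippet below builds a synthetic hourly series and aggregates it to daily totals stamped at 00:00:00. The random values, the seed, and the ten-day period are illustrative assumptions; only the column names datetime and precipitation come from the description above.

```python
import numpy as np
import pandas as pd

# Synthetic hourly precipitation (mm): mostly dry hours with occasional wet ones.
rng = np.random.default_rng(0)
hours = pd.date_range("2020-01-01 00:00", periods=24 * 10, freq="h", name="datetime")
wet = rng.random(len(hours)) < 0.2
values = np.where(wet, rng.gamma(1.5, 2.0, len(hours)), 0.0).round(1)

# Hourly table in the train_data.csv layout: datetime index, precipitation column.
df_train = pd.DataFrame({"precipitation": values}, index=hours)

# Daily table in the test_data.csv layout: one total per day, stamped at 00:00:00.
df_test = df_train.resample("D").sum()

print(df_train.head())
print(df_test.head())
```

Writing either frame with to_csv produces a file that can be read back with pd.read_csv(..., index_col=0, parse_dates=True), as in the usage examples.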
✨ Features
- Disaggregates daily totals into 24-hour precipitation series.
- Preserves sub-daily maxima in reconstructed data.
- Season-aware (DJF, MAM, JJA, SON) to capture seasonal variability.
- Combines comonotonicity with KNN-based iterative adjustments.
- Suitable for hydrological modeling and climate studies.
- Optional semi-parametric Bernoulli-Gamma mode.
- Optional enhanced autocorrelation refinement via permutations.
📦 Installation
From PyPI (recommended)
pip install pyqcoda
From GitHub
git clone https://github.com/carloscorreag/pyqcoda.git
cd pyqcoda
pip install .
🚀 Usage examples
🔹 1. Standard mode (default)
import pandas as pd
from pyqcoda import pyqcoda
# 1. Load your training (hourly) and testing (daily) datasets
df_train = pd.read_csv("train_data.csv", index_col=0, parse_dates=True)
df_test = pd.read_csv("test_data.csv", index_col=0, parse_dates=True)
# 2. Instantiate pyqcoda and disaggregate
qc = pyqcoda()
simulated_series = qc.disaggregate(df_train, df_test)
# 3. Convert results to hourly DataFrame
df_hourly = qc.get_hourly_dataframe(simulated_series)
# 4. Save output
df_hourly.to_csv("disaggregated_output.csv")
print("Hourly disaggregated precipitation saved to disaggregated_output.csv")
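A quick sanity check after disaggregation is to re-aggregate the hourly output and compare it with the daily input. The sketch below is not part of pyqcoda: max_daily_total_error is a hypothetical helper, and it assumes the hourly output has a datetime index and a precipitation column like the inputs. It runs on synthetic data so it can be tried without the library.

```python
import numpy as np
import pandas as pd

def max_daily_total_error(df_hourly, df_daily, column="precipitation"):
    """Largest absolute gap between re-aggregated hourly values and daily totals."""
    rebuilt = df_hourly[column].resample("D").sum()
    # Compare only the days present in both tables.
    rebuilt, target = rebuilt.align(df_daily[column], join="inner")
    return float((rebuilt - target).abs().max())

# Synthetic stand-ins: three daily totals, each spread evenly over 24 hours.
days = pd.date_range("2021-06-01", periods=3, freq="D", name="datetime")
df_daily = pd.DataFrame({"precipitation": [0.0, 12.0, 4.8]}, index=days)
hours = pd.date_range("2021-06-01", periods=72, freq="h", name="datetime")
even_split = np.repeat(df_daily["precipitation"].to_numpy() / 24, 24)
df_hourly_check = pd.DataFrame({"precipitation": even_split}, index=hours)

error = max_daily_total_error(df_hourly_check, df_daily)
print(f"max daily total error: {error:.2e} mm")
```

With real output, pass the DataFrame returned by get_hourly_dataframe and the daily test data instead of the synthetic frames, adjusting the column name if it differs.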
🔹 2. Semi-parametric mode (load Bernoulli-Gamma parameters from CSV)
This mode uses fitted Bernoulli-Gamma distributions instead of the empirical transformation.
import pandas as pd
from pyqcoda import pyqcoda
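# Load the training (hourly) and testing (daily) datasets
df_train = pd.read_csv("train_data.csv", index_col=0, parse_dates=True)
df_test = pd.read_csv("test_data.csv", index_col=0, parse_dates=True)
# Load the Bernoulli-Gamma parameters
params_df = pd.read_csv("params.csv")
# Convert to the nested dictionary required by pyqcoda: params[season][duration]
params = {}
for _, row in params_df.iterrows():
    season = row["season"]
    duration = int(row["duration"])
    params.setdefault(season, {})
    params[season][duration] = {
        "p0": row["p0"],
        "shape": row["shape"],
        "scale": row["scale"]
    }
qc = pyqcoda()
simulated_series = qc.disaggregate(
    df_train,
    df_test,
    semi_parametrical_mode=params
)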
df_hourly = qc.get_hourly_dataframe(simulated_series)
df_hourly.to_csv("disaggregated_output.csv")
print("Hourly disaggregated precipitation saved to disaggregated_output.csv")
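To give a feel for what a Bernoulli-Gamma distribution looks like, the sketch below draws samples from one using NumPy only. It is an illustration, not the library's implementation: sample_bernoulli_gamma is a hypothetical helper, and it assumes that p0 is the probability of a zero value and that shape and scale parameterize a gamma distribution for the non-zero amounts. The convention pyqcoda uses for p0 may differ, so check the library before relying on this reading.

```python
import numpy as np

def sample_bernoulli_gamma(p0, shape, scale, size, rng):
    """Draw from a zero-inflated gamma. ASSUMPTION: p0 = P(value == 0)."""
    wet = rng.random(size) >= p0              # Bernoulli part: wet with probability 1 - p0
    amounts = rng.gamma(shape, scale, size)   # Gamma part: positive amounts
    return np.where(wet, amounts, 0.0)

rng = np.random.default_rng(42)
sample = sample_bernoulli_gamma(p0=0.3, shape=2.1, scale=5.0, size=10_000, rng=rng)

zero_fraction = float(np.mean(sample == 0.0))
wet_mean = float(sample[sample > 0].mean())
print(f"fraction of zeros: {zero_fraction:.3f}")
print(f"mean of non-zero amounts: {wet_mean:.2f} (gamma mean = shape * scale = 10.5)")
```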
📄 Format of params.csv
The file must contain one row per combination of:
- season (DJF, MAM, JJA, SON)
- duration (1, 2, 6, 12, 24)
Example params.csv
season,duration,p0,shape,scale
DJF,24,0.3,2.1,5.0
DJF,1,0.5,1.2,2.0
DJF,2,0.45,1.5,2.5
DJF,6,0.4,2.0,3.0
DJF,12,0.35,2.3,4.0
MAM,24,0.25,2.5,4.5
MAM,1,0.4,1.8,2.2
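Because every season and duration pair needs its own row, it can help to check a parameter table for gaps before passing it on. The following is a minimal sketch: missing_param_rows is a hypothetical helper, not a pyqcoda function, and the inline CSV simply repeats the DJF rows from the example above.

```python
import io
import pandas as pd

def missing_param_rows(params_df,
                       seasons=("DJF", "MAM", "JJA", "SON"),
                       durations=(1, 2, 6, 12, 24)):
    """Return the (season, duration) pairs that have no row in params_df."""
    present = {(s, int(d)) for s, d in zip(params_df["season"], params_df["duration"])}
    return [(s, d) for s in seasons for d in durations if (s, d) not in present]

csv_text = """season,duration,p0,shape,scale
DJF,24,0.3,2.1,5.0
DJF,1,0.5,1.2,2.0
DJF,2,0.45,1.5,2.5
DJF,6,0.4,2.0,3.0
DJF,12,0.35,2.3,4.0
"""
params_df = pd.read_csv(io.StringIO(csv_text))

# DJF is complete; the other three default seasons are still missing.
djf_missing = missing_param_rows(params_df, seasons=("DJF",))
all_missing = missing_param_rows(params_df)
print("missing for DJF:", djf_missing)
print("missing overall:", len(all_missing), "rows")
```

When custom seasons are used, pass their names through the seasons argument so the check matches the season names in the file.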
🔹 3. Custom seasons (user-defined climatological seasons)
By default, pyqcoda uses standard climatological seasons:
- DJF (Dec–Jan–Feb)
- MAM (Mar–Apr–May)
- JJA (Jun–Jul–Aug)
- SON (Sep–Oct–Nov)
However, users can define custom seasonal partitions using a CSV file, in a way fully consistent with the params.csv workflow.
import pandas as pd
from pyqcoda import pyqcoda
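# Load the training (hourly) and testing (daily) datasets
df_train = pd.read_csv("train_data.csv", index_col=0, parse_dates=True)
df_test = pd.read_csv("test_data.csv", index_col=0, parse_dates=True)
seasons_df = pd.read_csv("seasons.csv")
# Build the season -> list of months mapping
seasons = {}
for _, row in seasons_df.iterrows():
    season = row["season"]
    month = int(row["month"])
    seasons.setdefault(season, []).append(month)
qc = pyqcoda()
simulated_series = qc.disaggregate(
    df_train,
    df_test,
    seasons_dict=seasons
)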
df_hourly = qc.get_hourly_dataframe(simulated_series)
df_hourly.to_csv("disaggregated_output.csv")
print("Hourly disaggregated precipitation saved to disaggregated_output.csv")
📄 seasons.csv format
The file must define a mapping between:
- season → custom season name
- month → month number (1–12)
Each month must belong to exactly one season.
Example
season,month
WET,10
WET,11
WET,12
WET,1
WET,2
WET,3
DRY,4
DRY,5
DRY,6
DRY,7
DRY,8
DRY,9
- All 12 months (1–12) must be assigned exactly once.
- Season names in seasons.csv must match those used in params.csv when using semi-parametric mode. For example, if you define WET and DRY seasons, params.csv must contain them:
season,duration,p0,shape,scale
WET,24,0.5,2.1,5.0
WET,1,0.5,1.2,2.0
WET,2,0.5,1.5,2.5
WET,6,0.5,2.0,3.0
WET,12,0.5,2.3,4.0
DRY,24,0.25,2.5,4.5
DRY,1,0.25,1.3,2.5
DRY,2,0.25,1.7,3
DRY,6,0.25,1.9,4
DRY,12,0.25,2.2,4.2
- Overlapping or missing months will raise an error.
- This feature is fully optional: if seasons_dict=None, default climatological seasons are used.
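The rules above (every month assigned, no month assigned twice) can be checked before calling disaggregate. The sketch below is a hypothetical pre-check, not the validation pyqcoda itself performs: validate_seasons takes a dictionary in the season -> list of months layout built in the usage example and returns a month -> season lookup, raising ValueError on any problem. The default_seasons dictionary is the DJF/MAM/JJA/SON split written in that same layout for illustration.

```python
def validate_seasons(seasons):
    """Hypothetical pre-check (not part of pyqcoda): return a month -> season lookup."""
    assigned = [int(m) for months in seasons.values() for m in months]
    valid = set(range(1, 13))
    invalid = sorted(set(assigned) - valid)
    duplicated = sorted({m for m in assigned if assigned.count(m) > 1})
    missing = sorted(valid - set(assigned))
    if invalid or duplicated or missing:
        raise ValueError(
            f"invalid months: {invalid}, duplicated: {duplicated}, missing: {missing}"
        )
    return {int(m): name for name, months in seasons.items() for m in months}

default_seasons = {"DJF": [12, 1, 2], "MAM": [3, 4, 5], "JJA": [6, 7, 8], "SON": [9, 10, 11]}
wet_dry = {"WET": [10, 11, 12, 1, 2, 3], "DRY": [4, 5, 6, 7, 8, 9]}

month_to_season = validate_seasons(wet_dry)
print(month_to_season[1], month_to_season[7])

# A partition that assigns month 3 twice and omits month 4 is rejected.
try:
    validate_seasons({"WET": [10, 11, 12, 1, 2, 3], "DRY": [3, 5, 6, 7, 8, 9]})
except ValueError as exc:
    print("rejected:", exc)
```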
🔹 4. Enhanced autocorrelation refinement (permutations mode)
pyqcoda includes an optional advanced refinement step designed to improve the temporal structure of the reconstructed hourly precipitation series, specifically targeting lag-1 autocorrelation.
This mode applies a local permutation-based optimization over short hourly windows while preserving:
- Daily totals (P24)
- Sub-daily maxima constraints (PMAX1H, PMAX2H, PMAX6H, PMAX12H)
- Physical consistency rules
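To make the preserved quantities concrete, the sketch below computes them for a single day of hourly values. It assumes that PMAXnH denotes the largest n-hour accumulation found within the day and that windows do not cross day boundaries; daily_constraints is a hypothetical helper and the library may define these quantities differently.

```python
import numpy as np
import pandas as pd

def daily_constraints(hourly_day, durations=(1, 2, 6, 12)):
    """Daily total plus the maximum rolling n-hour sums for one day (24 values)."""
    s = pd.Series(hourly_day, dtype=float)
    out = {"P24": float(s.sum())}
    for d in durations:
        # rolling(d).sum() is NaN for the first d - 1 hours; max() skips those.
        out[f"PMAX{d}H"] = float(s.rolling(d).sum().max())
    return out

# One synthetic day with a four-hour event starting at 08:00.
day = np.zeros(24)
day[8:12] = [1.0, 4.0, 2.5, 0.5]

constraints = daily_constraints(day)
for name, value in constraints.items():
    print(name, value)
```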
What this mode does
When enabled, the algorithm:
- Selects short rolling windows (typically 3–5 hours)
- Generates permutations of values within each window
- Evaluates each candidate series using:
  - Sub-daily maxima preservation
  - Constraint consistency
  - Lag-1 autocorrelation improvement
- Keeps the configuration that maximizes temporal coherence
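The steps above can be sketched for a single day as follows. This is a simplified greedy illustration under stated assumptions, not pyqcoda's actual refinement: the function names, the window length of 4, the use of rolling n-hour sums as the sub-daily maxima, and the rule of accepting any permutation that strictly raises lag-1 autocorrelation are all choices made for the example. Permuting values within a window leaves the daily total unchanged by construction.

```python
from itertools import permutations
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D array (0.0 for a constant series)."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    denom = float(np.sum(d * d))
    return float(np.sum(d[:-1] * d[1:]) / denom) if denom > 0 else 0.0

def window_maxima(x, durations=(1, 2, 6, 12)):
    """Largest n-hour accumulation for each duration (assumed PMAXnH definition)."""
    return np.array([np.convolve(x, np.ones(d), mode="valid").max() for d in durations])

def refine_by_permutations(hourly, window=4):
    """Greedy sketch: permute values inside rolling windows to raise lag-1 autocorrelation."""
    best = np.asarray(hourly, dtype=float).copy()
    target = window_maxima(best)          # maxima that every accepted candidate must keep
    best_score = lag1_autocorr(best)
    for start in range(len(best) - window + 1):
        # Distinct orderings of the values currently in this window.
        for perm in set(permutations(best[start:start + window])):
            cand = best.copy()
            cand[start:start + window] = perm
            if not np.allclose(window_maxima(cand), target):
                continue                  # would change a sub-daily maximum
            score = lag1_autocorr(cand)
            if score > best_score + 1e-12:
                best, best_score = cand, score
    return best

# Synthetic day with scattered wet hours.
day = np.zeros(24)
day[[6, 8, 9, 11, 15]] = [1.0, 3.0, 2.0, 0.5, 0.2]

refined = refine_by_permutations(day)
print("lag-1 before:", round(lag1_autocorr(day), 3))
print("lag-1 after: ", round(lag1_autocorr(refined), 3))
```

Short windows keep the search cheap: a window of 4 hours has at most 24 orderings, whereas permuting a whole day at once would be infeasible.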
How to use
Enable the mode by setting use_permutations=True in disaggregate:
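import pandas as pd
from pyqcoda import pyqcoda
# Load the training (hourly) and testing (daily) datasets
df_train = pd.read_csv("train_data.csv", index_col=0, parse_dates=True)
df_test = pd.read_csv("test_data.csv", index_col=0, parse_dates=True)
qc = pyqcoda()
simulated_series = qc.disaggregate(
    df_train,
    df_test,
    use_permutations=True
)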
df_hourly = qc.get_hourly_dataframe(simulated_series)
df_hourly.to_csv("disaggregated_output.csv")
print("Hourly disaggregated precipitation saved to disaggregated_output.csv")
🔧 Requirements
- Python 3.7+
- pandas ≥ 1.2.4
- numpy ≥ 1.21.6
- scikit-learn ≥ 1.0.2
📄 License
This project is licensed under the MIT License — see the LICENSE file for details.
📖 Citation
Correa Guinea, C. (2025). pyqcoda: Temporal disaggregation of daily precipitation into hourly using Q-CODA. DOI: