MELODIST: MEteoroLOgical observation time series DISaggregation Tool
Welcome to MELODIST - MEteoroLOgical observation time series DISaggregation Tool
MELODIST is an open-source toolbox written in Python for disaggregating daily meteorological time series to hourly time steps. It is licensed under GPLv3 (see license file). The software framework consists of disaggregation functions for each variable: temperature, humidity, precipitation, shortwave radiation, and wind speed. These functions are called from a station object, which holds all relevant information about site characteristics. Time series are managed as data frame objects as defined in the pandas package, so input and output data can be prepared and processed easily. For instance, pandas provides data input/output functions and can plot time series using the matplotlib library.
An example file is provided along with the package as a Jupyter notebook. This example demonstrates the usage of MELODIST for all variables. First, a station object is created with some basic details on the site’s characteristics (e.g., latitude and longitude are relevant for radiation disaggregation). Once the station object is defined and filled with data, each disaggregation step is done by calling the designated function for each variable. Each of these functions requires a method argument and, if needed, additional parameters to work properly. Some of these methods (see below) require additional statistical evaluations of hourly time series prior to the disaggregation procedure. This information is stored in a station statistics object that is associated with the station object (see example file for further details).
Station object
In MELODIST, a station object holds all relevant information, including metadata and time series. A station is generated using the constructor method:
s = melodist.Station(lon=longitude, lat=latitude, timezone=timezone)
Data is added by assignment (e.g., the data frame data_obs_daily):
s.data_daily = data_obs_daily
A station statistics object can be generated in a similar manner. As station statistics are derived through analysing hourly observations for calibration, a reference to the data frame including hourly observations is given:
s.statistics = melodist.StationStatistics(data_obs_hourly)
Statistics can be derived for each variable by calling the respective function of the statistics object s.statistics: calc_wind_stats(), calc_humidity_stats(), calc_temperature_stats(), calc_precipitation_stats(), and calc_radiation_stats().
Naming convention for dataframe columns
MELODIST expects exact naming conventions for the time series provided in pandas dataframes. Please find the specification of column names below:
- temp: Temperature [K]
- precip: Precipitation [mm/time step]
- glob: Global (shortwave) radiation [W/m2]
- hum: Relative humidity [%]
- wind: Wind speed [m/s]
- ssd: Sunshine duration [min]
For daily data, additional columns need to be specified (if applicable):
- temp: Average temperature [K]
- tmin: Minimum temperature [K]
- tmax: Maximum temperature [K]
- hum: Average humidity [%]
- hum_min: Minimum humidity [%]
- hum_max: Maximum humidity [%]
- ssd: Sunshine duration [h]
Please note that the dataframe's index must contain datetime values.
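Because the column names must match exactly, a quick check before disaggregation can catch naming mistakes early. The following sketch is not part of MELODIST: the helper unknown_columns and the two name sets are illustrative and simply mirror the lists above.

```python
# Column names as listed above; these sets are illustrative and not exported by MELODIST.
HOURLY_COLUMNS = {"temp", "precip", "glob", "hum", "wind", "ssd"}
DAILY_COLUMNS = HOURLY_COLUMNS | {"tmin", "tmax", "hum_min", "hum_max"}


def unknown_columns(columns, daily=False):
    """Return the column names that do not match the expected naming convention."""
    allowed = DAILY_COLUMNS if daily else HOURLY_COLUMNS
    return sorted(set(columns) - allowed)


# 'temperature' and 'rain' are not recognised names; 'tmin' is valid for daily data only.
print(unknown_columns(["temperature", "tmin", "rain", "wind"], daily=True))
print(unknown_columns(["temp", "tmin"], daily=False))
```

With a pandas data frame, the column index (for example data_obs_daily.columns) can be passed directly to such a helper.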
Disaggregation methods
The Station class provides functions to perform the disaggregation procedures for each variable: disaggregate_temperature(), disaggregate_humidity(), disaggregate_wind(), disaggregate_radiation(), and disaggregate_precipitation(). Moreover, an interpolation approach is also available using the interpolate() function.
Hint: Each of the implemented disaggregation methods is also directly accessible, e.g., melodist.precipitation.disagg_prec(). In this case all relevant parameters (e.g., those derived through calibration) need to be provided in the function call. This method-specific call of functions is not necessary if a station and the corresponding station statistics object are defined. Thus, it is recommended to define objects and to perform the disaggregation procedures using the object’s methods. Also, the names and signatures of these functions are likely subject to change in future versions of MELODIST.
Please find below a list of available disaggregation methods for each variable, which can be specified in the respective disaggregation methods of a Station object:
Temperature
- method='sine_min_max' (T1): standard sine redistribution; preserves Tmin and Tmax but not Tmean.
- method='sine_mean': sine redistribution; preserves Tmean and the diurnal temperature range (Tmax – Tmin) but not Tmin and Tmax.
- method='mean_course_min_max': redistribute following a prescribed temperature course calculated from hourly observations; preserves Tmin and Tmax.
- method='mean_course_mean': redistribute following a prescribed temperature course calculated from hourly observations; preserves Tmean and the diurnal temperature range.
- Possible options for min_max_time are:
  - 'fix' (T1a): The diurnal course of temperature is fixed without any seasonal variations.
  - 'sun_loc' (T1b): The diurnal course of temperature is modelled based on sunrise, noon and sunset calculations.
  - 'sun_loc_shift' (T1c): This option activates empirical corrections of the ideal course modelled by sun_loc (requires calling calc_temperature_stats() prior to the disaggregation).
- An optional parameter mod_nighttime (T1d, bool, default: False) allows one to apply a linear interpolation of night time values, which proves preferable during polar nights.
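To illustrate the idea behind a sine redistribution that preserves Tmin and Tmax, here is a minimal sketch. It is not MELODIST's implementation: it assumes a symmetric 24 h cosine wave with a fixed hour of the maximum (hour_of_max is a hypothetical parameter, loosely comparable to the 'fix' option) and ignores transitions between neighbouring days.

```python
import math


def sine_min_max(tmin, tmax, hour_of_max=14):
    """Hourly temperatures [K] from a 24 h cosine wave through tmin and tmax.

    Simplified illustration with a fixed time of the maximum (hour_of_max is a
    hypothetical parameter); the minimum falls 12 hours earlier.
    """
    mean = 0.5 * (tmin + tmax)
    amplitude = 0.5 * (tmax - tmin)
    return [
        mean + amplitude * math.cos(2.0 * math.pi * (hour - hour_of_max) / 24.0)
        for hour in range(24)
    ]


hourly = sine_min_max(tmin=271.2, tmax=279.8)
print(round(min(hourly), 2), round(max(hourly), 2))
```

The mean of this symmetric wave is (Tmin + Tmax) / 2, which generally differs from the observed daily mean. This is why a method of this kind preserves the extremes but not Tmean.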
Humidity
- method='equal': duplicate mean daily humidity for the 24 hours of the day.
- method='minimal' (H1): The dew point temperature is set to the minimum temperature on that day.
- method='dewpoint_regression' (H2): Based on hourly observations, a regression approach is applied to calculate daily dew point temperature. Regression parameters must be specified (which is automatically done if calc_humidity_stats() is called prior to disaggregation).
- method='linear_dewpoint_variation' (H3): This method extends H2 through linearly varying dew point temperature between consecutive days. The parameter kr needs to be specified (kr=6 if monthly radiation exceeds 100 W/m2 else kr=12).
- method='min_max' (H4): this method requires minimum and maximum relative humidity for each day.
- method='month_hour_precip_mean': calculate hourly humidity from categorical [month, hour, precip(y/n)] mean values derived from observations.
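The dew point based methods convert a dew point temperature and hourly air temperatures into relative humidity. The sketch below shows this step for the idea behind 'minimal' (dew point equal to the daily minimum temperature). It assumes the Magnus formula with one common set of coefficients; MELODIST may use a different vapour pressure formulation, and the function names are illustrative.

```python
import math


def saturation_vapour_pressure(temp_k):
    """Saturation vapour pressure [hPa] over water (Magnus formula)."""
    temp_c = temp_k - 273.15
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def humidity_from_dewpoint(hourly_temp_k, dewpoint_k):
    """Relative humidity [%] for each hourly temperature, capped at 100 %."""
    e_dew = saturation_vapour_pressure(dewpoint_k)
    return [
        min(100.0, 100.0 * e_dew / saturation_vapour_pressure(t))
        for t in hourly_temp_k
    ]


hourly_temp = [273.15, 275.15, 280.15, 283.15]  # made-up values [K]
tmin = min(hourly_temp)  # 'minimal' idea: dew point equals the daily minimum temperature
rh = humidity_from_dewpoint(hourly_temp, tmin)
print([round(value, 1) for value in rh])
```

Relative humidity reaches 100 % at the hour of minimum temperature and decreases as the air warms, because the vapour content is held constant through the day.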
Wind speed
- method='equal' (W1): If this method is chosen, the daily average wind speed is assumed to be valid for each hour on that day.
- method='cosine' (W2): The cosine function option simulates a diurnal course of wind speed and requires calibration (calc_wind_stats()).
- method='random' (W3): This option is a stochastic method that draws random numbers to disaggregate wind speed taking into account the daily average (no parameter estimation required).
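The difference between an equal and a cosine-shaped distribution of wind speed can be sketched as follows. The functional form and the parameters a, b and t_shift are assumptions made for illustration; in MELODIST the corresponding values come from calibration with calc_wind_stats(), and the actual parameterisation may differ.

```python
import math


def wind_equal(daily_mean):
    """Every hour of the day receives the daily average wind speed [m/s]."""
    return [daily_mean] * 24


def wind_cosine(daily_mean, a, b, t_shift):
    """Hourly wind speed [m/s] following a cosine-shaped diurnal course.

    a (relative amplitude), b (offset factor) and t_shift (hour of the daily
    maximum) are hypothetical, hand-picked stand-ins for calibrated parameters.
    """
    return [
        daily_mean * (a * math.cos(math.pi * (hour - t_shift) / 12.0) + b)
        for hour in range(24)
    ]


hourly = wind_cosine(daily_mean=3.0, a=0.3, b=1.0, t_shift=14)
print(round(sum(hourly) / 24.0, 6))
```

With b = 1 the cosine term averages out over 24 hours, so the daily mean is preserved; other values of b would scale it.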
Radiation
- method='pot_rad' (R1): This method allows one to disaggregate daily averages of shortwave radiation using hourly values of potential (clear-sky) radiation calculated for the location of the station.
- method='pot_rad_via_ssd' (R2): If daily sunshine recordings are available, the Angstrom model is applied to transform sunshine duration to shortwave radiation.
- method='pot_rad_via_bc' (R3): In this case, the Bristow-Campbell model is applied which relates minimum and maximum temperature to shortwave radiation.
- method='mean_course': hourly radiation follows an observed average course (calculated for each month) while preserving the daily mean.
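The principle of distributing a daily mean according to potential radiation can be sketched with basic solar geometry. This is a simplified illustration, not MELODIST's radiation code: it assumes Cooper's declination approximation, uses local solar time, ignores atmosphere, terrain and longitude/time zone corrections, and all function names are hypothetical.

```python
import math


def potential_radiation_shape(lat_deg, day_of_year):
    """Relative hourly clear-sky radiation from the cosine of the solar zenith angle.

    Uses Cooper's declination approximation and local solar time at the middle
    of each hour; atmosphere and terrain are ignored.
    """
    lat = math.radians(lat_deg)
    decl = math.radians(23.45) * math.sin(2.0 * math.pi * (284 + day_of_year) / 365.0)
    shape = []
    for hour in range(24):
        hour_angle = math.radians(15.0 * (hour + 0.5 - 12.0))
        cos_zenith = (math.sin(lat) * math.sin(decl)
                      + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
        shape.append(max(0.0, cos_zenith))  # the sun is below the horizon at night
    return shape


def disaggregate_radiation(daily_mean, lat_deg, day_of_year):
    """Scale the potential-radiation shape so that its 24 h mean equals daily_mean."""
    shape = potential_radiation_shape(lat_deg, day_of_year)
    total = sum(shape)
    if total == 0.0:  # polar night: no sunlight to distribute
        return [0.0] * 24
    return [daily_mean * 24.0 * s / total for s in shape]


hourly = disaggregate_radiation(daily_mean=150.0, lat_deg=47.3, day_of_year=172)
print(round(sum(hourly) / 24.0, 6), round(max(hourly), 1))
```

The scaling step is what preserves the daily mean; only the shape of the diurnal course comes from the potential radiation.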
Precipitation
- method='equal' (P1): In order to derive hourly from daily values, the daily total is simply divided by 24, resulting in an equal distribution.
- method='cascade' (P2): The cascade model is more complex and requires a parameter estimation method (calc_precipitation_stats()). Statistics can be calculated using different options (parameters). Using the keyword months, the seasons for which the statistics will be calculated independently can be specified (see example file). The keyword percentile allows one to adjust the threshold to separate precipitation intensities into two classes (low and high) for building the parameters. The default value is 50% (median). An additional optional argument avg_stats is used to decide whether statistics of all cascade levels will be averaged (default is True). All options previously listed are optional and can be changed to tune the disaggregation results. A new feature also allows for working with 5-minute precipitation data. Both functions disagg_prec_cascade() and aggregate_precipitation() offer switching to 5-minute data by setting hourly=False in the function calls. Moreover, the former function allows for different level configurations (levels can be set to 9 (standard), 10, or 11, depending on the number of branching levels to be considered in the cascade model). The usage of the cascade model for sub-hourly precipitation is demonstrated in a Jupyter notebook (see /examples/precip5min_example.ipynb).
- method='masterstation' (P3): If hourly values are available for another site in the vicinity of the station considered, the cumulative sub-daily mass curve can be transferred from the station that provides hourly values to the station of interest.
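The contrast between the equal distribution and a cascade can be sketched as below. The cascade shown is a generic mass-conserving random cascade, not the calibrated model implemented in MELODIST: the branching probability and the uniform weights are made up, whereas MELODIST derives its branching statistics from hourly observations. The function names and the parameter p_all_or_nothing are hypothetical.

```python
import random


def precip_equal(daily_total):
    """Equal distribution: each hour receives 1/24 of the daily total."""
    return [daily_total / 24.0] * 24


def cascade(total, levels, p_all_or_nothing=0.4, rng=None):
    """Split `total` into 2**levels boxes with a mass-conserving random cascade.

    At each branching, the whole amount goes to one half with probability
    p_all_or_nothing; otherwise it is shared using a uniform random weight.
    Probability and weight distribution are made up for illustration.
    """
    rng = rng or random.Random()
    boxes = [total]
    for _ in range(levels):
        next_boxes = []
        for amount in boxes:
            if amount == 0.0:
                next_boxes.extend([0.0, 0.0])  # dry boxes stay dry
                continue
            if rng.random() < p_all_or_nothing:
                weight = float(rng.random() < 0.5)  # 1.0 or 0.0
            else:
                weight = rng.random()
            next_boxes.extend([amount * weight, amount * (1.0 - weight)])
        boxes = next_boxes
    return boxes


boxes = cascade(total=12.0, levels=5, rng=random.Random(42))
print(len(boxes), round(sum(boxes), 6))
```

Unlike the equal distribution, the cascade produces dry intervals and intensity peaks while the total is conserved up to floating-point rounding.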
Utilities
Among other features, the melodist.util module includes some functions that might be useful for data analyses:
- detect_gaps(dataframe, timestep, print_all=False, print_max=5, verbose=True) can be used to find gaps in the data frame. A gap will be detected if any increment of time is not equal to the specified time step (in seconds).
- Some methods require time series with full days (24 h) only. drop_incomplete_days(dataframe) drops leading and trailing days if they are not complete (0-23h).
- For testing purposes an aggregation function is provided which aggregates hourly time series (data frames) to daily time series taking into account the characteristics of each meteorological variable (e.g., mean value for temperature, daily total for precipitation, ...): daily_from_hourly()
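The gap criterion described for detect_gaps() (any time increment that differs from the specified time step) can be sketched with the standard library alone. The helper find_gaps below is a hypothetical stand-in that works on a plain list of timestamps, not the MELODIST function itself.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, timestep):
    """Return (previous, current) pairs whose spacing differs from `timestep` seconds."""
    expected = timedelta(seconds=timestep)
    return [
        (prev, curr)
        for prev, curr in zip(timestamps, timestamps[1:])
        if curr - prev != expected
    ]


start = datetime(2016, 3, 1)
# Hourly series with the 03:00 and 04:00 records missing
times = [start + timedelta(hours=h) for h in range(8) if h not in (3, 4)]
for prev, curr in find_gaps(times, timestep=3600):
    print(f"gap between {prev} and {curr}")
```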
Data input/output
MELODIST includes a feature to read and save parameter settings for all disaggregation methods in order to transfer settings or to continue work at a later time. This feature is based on the JSON format which provides a flexible and easily readable ASCII file format for different applications.
To save the MELODIST parameters included in a station statistics object, call the to_json(filename) method of this object. These settings can be recalled at any time by creating a new station statistics object based on the settings stored in that file:
new_stationstatistics_object = melodist.StationStatistics.from_json(filename)
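The round trip through a JSON file can be illustrated with Python's json module. The parameter names and values below are made up, and the actual layout of the file written by to_json() may differ; the sketch only shows why a plain-text JSON file is convenient for transferring settings.

```python
import json
import os
import tempfile

# Made-up parameter values; the real file layout written by to_json() may differ.
params = {"wind": {"a": 0.3, "b": 1.0, "t_shift": 14.0}, "hum": {"kr": 6}}

path = os.path.join(tempfile.mkdtemp(), "station_params.json")
with open(path, "w", encoding="ascii") as f:
    json.dump(params, f, indent=2, sort_keys=True)

# Reading the file back restores the same nested structure
with open(path, encoding="ascii") as f:
    restored = json.load(f)

print(restored == params)
```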
Since MELODIST is based on pandas, numerous ways to import and export pandas data frames exist. The to_csv() and read_csv() functions of pandas are ideal to load and save time series without any restriction with respect to MELODIST applications.
MELODIST also has some additional specific data input/output capabilities in melodist.data_io, including functions to read data provided by the national weather services of the Netherlands (read_single_knmi_file(), read_knmi_dataset()) and Germany (read_dwd()). Moreover, the SMET format is supported for reading and writing (read_smet(), write_smet()). This format is used in the MeteoIO library.
References
Förster, K., Hanzer, F., Winter, B., Marke, T., and Strasser, U.: An open-source MEteoroLOgical observation time series DISaggregation Tool (MELODIST v0.1.1), Geosci. Model Dev., 9, 2315-2333, doi:10.5194/gmd-9-2315-2016, 2016.
Hanzer, F., Förster, K., Nemec, J., and Strasser, U.: Projected cryospheric and hydrological impacts of 21st century climate change in the Ötztal Alps (Austria) simulated using a physically based approach, Hydrol. Earth Syst. Sci., 22, 1593-1614, doi:10.5194/hess-22-1593-2018, 2018.
Version history
- 0.1.5 (03 Jul 2023):
  - Fix compatibility with recent numpy/pandas versions
- 0.1.4 (05 May 2020):
  - Upgraded to warrant pandas compatibility
- 0.1.3 (21 Nov 2018):
  - Sub-hourly precipitation disaggregation capabilities added along with a new Jupyter notebook
  - bugfix in precipitation.py
  - speed-up of cascade statistics
- 0.1.2 (21 Dec 2017):
  - new disaggregation methods: 'sine_mean', 'mean_course_min_max' and 'mean_course_mean' for temperature, 'month_hour_precip_mean' for humidity and 'mean_course' for radiation
  - add option to preserve daily mean values in humidity disaggregation
  - add option to calculate and apply Angstroem/Bristow-Campbell parameters monthly or seasonally instead of for all values
  - bugfix in calculation of potential radiation
- 0.1.1 (03 Jun 2016):
  - data type corrections in order to avoid conversion warnings
  - new functions for the estimation of Angstrom and Bristow-Campbell parameters
  - handling of gaps in daily data fixed
  - speed-up of radiation computations
  - updates to example.py
  - pandas 0.18 API compatibility
- 0.1.0 (01 Mar 2016): First version of MELODIST