PeakWeather: MeteoSwiss Weather Station Measurements for Spatiotemporal Deep Learning.
PeakWeather
This repository contains the code to load and preprocess the PeakWeather dataset. The dataset is hosted on Hugging Face
and presented in the paper
PeakWeather: MeteoSwiss Weather Station Measurements for Spatiotemporal Deep Learning
Daniele Zambon², Michele Cattaneo¹, Ivan Marisca², Jonas Bhend¹, Daniele Nerini¹, Cesare Alippi² ³
¹ MeteoSwiss, ² USI, IDSIA, ³ PoliMi
Refer to peakweather.readthedocs.io for the documentation.
Quickstart:
Install
Option 1: Clone and install locally
git clone https://github.com/MeteoSwiss/PeakWeather.git
cd PeakWeather
pip install .  # Without extras
pip install ".[topography]"  # With extras (quoted so shells such as zsh do not expand the brackets)
Option 2: Install via pip as a package
pip install git+https://github.com/MeteoSwiss/PeakWeather.git # Install normal package
pip install "peakweather[topography] @ git+https://github.com/MeteoSwiss/PeakWeather@main" # Install with extras
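After either install option, it can help to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `peakweather` is taken from the pip requirement string above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("peakweather"))
```

If this prints `None`, the install most likely went into a different environment than the one running the script.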
Download the data from Hugging Face
from peakweather.dataset import PeakWeatherDataset
# Download the data in the current working directory
ds = PeakWeatherDataset(root=None)
Load pre-downloaded data
from peakweather.dataset import PeakWeatherDataset
# Replace the placeholder with the directory that contains the downloaded data
ds = PeakWeatherDataset(root="<PATH_TO_DATA>")
Get observations
# For a single station, all parameters
ds.get_observations(stations='KLO')
# For two stations, all parameters
ds.get_observations(stations=['KLO', 'GRO'])
# For specific parameters
ds.get_observations(stations='KLO', parameters=['pressure', 'temperature'])
| datetime | ('KLO', 'pressure') | ('KLO', 'temperature') |
|---|---|---|
| 2017-01-01 00:00:00+00:00 | 977.8 | -3.3 |
| 2017-01-01 00:10:00+00:00 | 977.7 | -3.5 |
| 2017-01-01 00:20:00+00:00 | 977.6 | -3.5 |
| 2017-01-01 00:30:00+00:00 | 977.5 | -3.6 |
| 2017-01-01 00:40:00+00:00 | 977.3 | -3.5 |
| ... | ... | ... |
# Get observations for a specific time frame
ds.get_observations(stations='KLO',
                    parameters=['wind_speed', 'wind_direction'],
                    first_date='2024-08-01 16:32',
                    last_date='2024-08-01 17:26')
| datetime | ('KLO', 'wind_speed') | ('KLO', 'wind_direction') |
|---|---|---|
| 2024-08-01 16:40:00+00:00 | 3.9 | 219 |
| 2024-08-01 16:50:00+00:00 | 2.5 | 225 |
| 2024-08-01 17:00:00+00:00 | 2.9 | 231 |
| 2024-08-01 17:10:00+00:00 | 3.1 | 259 |
| 2024-08-01 17:20:00+00:00 | 2.8 | 237 |
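In the table above, the requested bounds 16:32 and 17:26 do not fall on the 10-minute observation grid, and the returned rows run from 16:40 to 17:20. The sketch below reproduces that selection with the standard library only. It assumes inclusive bounds and UTC timestamps, and the helper `grid_between` is hypothetical, not part of peakweather.

```python
from datetime import datetime, timedelta, timezone

STEP = timedelta(minutes=10)  # resolution of the observations shown above


def grid_between(first_date, last_date):
    """Return the 10-minute grid timestamps t with first_date <= t <= last_date."""
    first = datetime.fromisoformat(first_date).replace(tzinfo=timezone.utc)
    last = datetime.fromisoformat(last_date).replace(tzinfo=timezone.utc)
    # Round `first` up to the next multiple of 10 minutes past midnight.
    midnight = first.replace(hour=0, minute=0, second=0, microsecond=0)
    steps = -((midnight - first) // STEP)  # ceiling division via negated floor
    t = midnight + steps * STEP
    stamps = []
    while t <= last:
        stamps.append(t)
        t += STEP
    return stamps


for t in grid_between("2024-08-01 16:32", "2024-08-01 17:26"):
    print(t.isoformat(sep=" "))
```

The five printed timestamps match the `datetime` column of the table above.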
Detailed Usage
For detailed usage and parameter descriptions, refer to the docstring of the PeakWeatherDataset class, which documents its functionality and options.
Re-sampling
ds = PeakWeatherDataset(
root="data", # Path to the dataset
pad_missing_values=True, # Pad missing values with NaN
years=None, # Years to include in the dataset (None for all)
parameters=None, # Parameters to include in the dataset (None for all)
extended_topo_vars="none", # Optional extended topographic variables
extended_nwp_pars="none", # Optional extended NWP model (ICON) variables
imputation_method="zero", # Method for imputing missing values
freq="h", # Frequency of the data (e.g., "h" for hourly)
compute_uv=True, # Compute u and v components of wind
station_type="meteo_station", # Which station type to load (None for all)
aggregation_methods={'temperature': 'mean'} # Use specific aggregation
)
ds.parameters_table["aggregation"]
The above dataset is initialized with hourly frequency. The 10-minute values are aggregated with the default methods below:
| name | aggregation |
|---|---|
| humidity | last |
| precipitation | sum |
| pressure | last |
| sunshine | sum |
| temperature | last |
| wind_direction | circ_mean |
| wind_gust | max |
| wind_speed | mean |
| wind_u | mean |
| wind_v | mean |
These defaults can be overridden per parameter with the aggregation_methods argument. In the example above, temperature is set to 'mean', so it is averaged over the previous hour instead of taking the last 10-minute value.
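To make the method names in the table concrete, here is a minimal stdlib sketch of how six 10-minute values could be reduced to one hourly value. It is an illustration under stated assumptions, not the library's implementation: the `AGGREGATORS` mapping, `aggregate_hour`, and this `circ_mean` are hypothetical helpers, and the sample values are made up.

```python
import math
from statistics import fmean


def circ_mean(degrees):
    """Circular mean of angles in degrees, returned in [0, 360)."""
    s = fmean(math.sin(math.radians(d)) for d in degrees)
    c = fmean(math.cos(math.radians(d)) for d in degrees)
    return math.degrees(math.atan2(s, c)) % 360.0


# Method names mirror the "aggregation" column of the table above.
AGGREGATORS = {
    "last": lambda xs: xs[-1],
    "sum": sum,
    "max": max,
    "mean": fmean,
    "circ_mean": circ_mean,
}


def aggregate_hour(values, method):
    """Reduce the 10-minute values of one hour to a single value."""
    return AGGREGATORS[method](values)


# Made-up 10-minute samples for one hour (illustrative only).
temperature = [-3.3, -3.5, -3.5, -3.6, -3.5, -3.4]
wind_direction = [350.0, 355.0, 0.0, 5.0, 10.0, 10.0]

print(aggregate_hour(temperature, "last"))   # default in the table
print(aggregate_hour(temperature, "mean"))   # override from the example
print(aggregate_hour(wind_direction, "circ_mean"))
```

The circular mean matters for directions that straddle north: an arithmetic mean of 350° and 10° gives 180°, whereas averaging the sine and cosine components gives a direction near 0°.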
Basic information
We can obtain some basic information about the content of the dataset as follows:
# Get printable representation of the dataset
print(ds)
# Show dataset information
print(f"Number of time steps: {ds.num_time_steps}")
print(f"Number of stations: {ds.num_stations}")
print(ds.stations_table.head(10))
print(f"Number of parameters: {ds.num_parameters}")
print("Parameters")
ds.show_parameters_description()
# Show data
print(f"Observations shape: {ds.observations.shape}")
print(ds.observations.head(10))
# Show the amount of missing values considering stations
# equipped with the respective sensor
print(ds.missing_values)
We can get observations for a specific station and parameter as arrays:
# Get wind gust and direction for station KLO
klo_data = ds.get_observations(stations="KLO",
                               parameters=["wind_gust", "wind_direction"],
                               as_numpy=True)
print(f"KLO data shape: {klo_data.shape}")
print(f"KLO maximum wind gust: {klo_data[..., 0].max():.2f} m/s")
Time series windowing
We can obtain the data as sliding windows of size $W$, each paired with a forecast horizon of $H$ steps.
window_size = 12
lead_times = 3
sub_windows = ds.get_windows(window_size=window_size,
                             horizon_size=lead_times,
                             stations=ds.stations[:10],
                             parameters=["wind_speed", "wind_direction"],
                             first_date="2020-01-01",
                             last_date="2022-01-01")
print(f"Windows x shape: {sub_windows.x.shape}")
print(f"Windows mask_x shape: {sub_windows.mask_x.shape}")
print(f"Windows y shape: {sub_windows.y.shape}")
print(f"Windows mask_y shape: {sub_windows.mask_y.shape}")
The object returned contains x of shape $[\text{windows}, W,\text{stations}, \text{params}]$ and mask_x of the same shape, representing the input windows. Associated with them, there are y and mask_y of shape $[\text{windows}, H,\text{stations}, \text{params}]$ representing the future quantities.
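The relationship between the input windows and their horizons can be sketched in plain Python for a single series. This is an illustration only: it assumes a stride of one time step and drops incomplete windows at the end, which may differ from what `get_windows` does, and `sliding_windows` is a hypothetical helper.

```python
def sliding_windows(series, window_size, horizon_size):
    """Split a sequence of time steps into (x, y) pairs with stride 1."""
    xs, ys = [], []
    last_start = len(series) - window_size - horizon_size
    for start in range(last_start + 1):
        split = start + window_size
        xs.append(series[start:split])                 # W past steps
        ys.append(series[split:split + horizon_size])  # H future steps
    return xs, ys


T, W, H = 20, 12, 3
series = list(range(T))  # stand-in for T time steps of one station and parameter
x, y = sliding_windows(series, W, H)

print(f"windows: {len(x)}, W: {len(x[0])}, H: {len(y[0])}")
print("first x:", x[0])
print("first y:", y[0])
```

Under these assumptions a series of $T$ steps yields $T - W - H + 1$ windows, and each `y` starts at the step immediately after its `x` ends.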