Download MeteoSwiss Open Government Data and convert to Parquet / Delta tables
MeteoSwiss Open Data → Parquet → Databricks Delta tables
foehn downloads every MeteoSwiss OGD collection via the STAC API, converts CSV/TXT to Parquet with Polars, and optionally ingests everything into Databricks Unity Catalog Delta tables on a daily schedule.
Why foehn?
- 20+ collections in one command: weather stations, radar, hail maps, forecasts, climate scenarios, and more
- Significantly smaller on disk: columnar Parquet with Snappy compression vs. raw CSVs
- Incremental by default: only re-downloads files that changed since your last run, tracked via _last_run.json
- No Spark required locally: download + conversion uses Polars only; Spark is optional for Delta ingestion
- Ships a Declarative Automation Bundle: ready-to-deploy daily job and historical backfill, no pipeline config needed
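The incremental behaviour can be pictured as a timestamp comparison against a small state file. The sketch below is not foehn's implementation: the schema of _last_run.json (a single `last_run` ISO timestamp) and all helper names are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

STATE_FILE = "_last_run.json"  # file name from the README; the schema used here is assumed


def load_last_run(data_dir: Path) -> Optional[datetime]:
    """Return the timestamp of the previous run, or None on a first run."""
    path = data_dir / STATE_FILE
    if not path.exists():
        return None
    state = json.loads(path.read_text())
    return datetime.fromisoformat(state["last_run"])


def needs_download(remote_updated: datetime, last_run: Optional[datetime]) -> bool:
    """Fetch a file if there was no previous run or it changed since then."""
    return last_run is None or remote_updated > last_run


def save_last_run(data_dir: Path, when: datetime) -> None:
    """Persist the run timestamp so the next run can skip unchanged files."""
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / STATE_FILE).write_text(json.dumps({"last_run": when.isoformat()}))
```

Under this model, a full refresh simply ignores the stored timestamp and treats every file as changed.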
Quick start
pip install foehn
foehn
Recent data (Jan 1 → yesterday) is downloaded and converted to Parquet under ./data/meteoswiss/.
Collections
| Key | Description | Format |
|---|---|---|
| smn | Automatic weather stations (A1) | CSV → Parquet |
| smn_precip | Automatic precipitation stations (A2) | CSV → Parquet |
| smn_tower | Tower stations (A3) | CSV → Parquet |
| nime | Manual precipitation stations (A5) | CSV → Parquet |
| tot | Totaliser precipitation (A6) | CSV → Parquet |
| obs | Visual observations (A8) | CSV → Parquet |
| pollen | Pollen stations (A7) | CSV → Parquet |
| phenology | Phenological observations (A9) | CSV → Parquet |
| nbcn | Homogeneous climate stations (C1) | CSV → Parquet |
| nbcn_precip | Homogeneous precipitation (C2) | CSV → Parquet |
| climate_normals | Station normals 1961–1990 / 1991–2020 (C6) | TXT → Parquet |
| climate_normals_* | Spatial normals (C7) | NetCDF / GeoTIFF |
| surface_derived_grid | Spatial analyses: precip, temp, sunshine (C3/C4) | NetCDF |
| satellite_derived_grid | Spatial analyses: radiation, clouds (C5) | NetCDF |
| climate_scenarios | CH2025 local scenarios (C8) | CSV → Parquet |
| climate_scenarios_grid | CH2025 gridded scenarios (C9) | NetCDF |
| hail_hazard_* | Hail hazard maps | NetCDF / ZIP |
| forecast_local | Local point forecasts (E4) | CSV → Parquet |
| forecast_icon_ch1/ch2 | ICON-CH1/CH2-EPS (E2/E3) | GRIB2 (opt-in) |
| radar_precip/hail | Precipitation + hail radar (D1/D3) | HDF5 (opt-in) |
Installation
From PyPI:
pip install foehn
From source:
git clone https://github.com/kayhendriksen/foehn
cd foehn
pip install -e .
With Databricks extras (PySpark + Delta):
pip install "foehn[databricks]"
Requires Python ≥ 3.10.
Python API
Use foehn directly from notebooks or scripts:
import foehn
# List all available collections
foehn.list_collections()
# [{'category': 'CSV', 'key': 'smn', 'collection_id': 'ch.meteoschweiz.ogd-smn'}, ...]
# Download a single collection
foehn.fetch("smn", data_dir="./data/meteoswiss")
# Download with specific time slices
foehn.fetch("smn", data_types=["historical", "recent"])
# Convert downloaded CSVs to Parquet
foehn.convert("smn", data_dir="./data/meteoswiss")
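The commented output of foehn.list_collections() above suggests a list of dictionaries with `category`, `key`, and `collection_id` fields. As a rough sketch under that assumption, the hypothetical helper below picks out the keys of one category from such a list. It runs on plain data, so it does not need foehn installed; the second sample entry is a made-up placeholder, not a real collection.

```python
from typing import Iterable


def keys_in_category(collections: Iterable[dict], category: str) -> list[str]:
    """Return the 'key' of every collection entry whose 'category' matches."""
    return [c["key"] for c in collections if c.get("category") == category]


# Entries in the shape shown in the README comment.
sample = [
    {"category": "CSV", "key": "smn", "collection_id": "ch.meteoschweiz.ogd-smn"},
    # Placeholder entry for illustration only; not a real foehn collection.
    {"category": "GRID", "key": "example_grid", "collection_id": "example.grid"},
]

print(keys_in_category(sample, "CSV"))  # ['smn']
```

With foehn installed, the same helper could be fed foehn.list_collections(), and each returned key passed to foehn.fetch and foehn.convert with the data_dir argument shown above. Only the "CSV" category value appears in the README; other category names would need to be checked against the real output.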
CLI reference
foehn [options]
Time range — recent (Jan 1 this year → yesterday) is always included; flags extend it:
| Flag | Description |
|---|---|
| (none) | Recent only: Jan 1 this year → yesterday, updated daily at 12 UTC |
| --historical | Also fetch full archive: start of measurement → Dec 31 last year |
| --now | Also fetch realtime slice: yesterday 12 UTC → now, 10-min updates |
| --all | All three slices: historical + recent + now |
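To make the slice boundaries concrete, here is a small sketch that derives them from a given UTC timestamp using only the rules in the table. The function name and return shape are illustrative assumptions and say nothing about how foehn computes them internally.

```python
from datetime import date, datetime, time, timedelta, timezone


def slice_bounds(now: datetime) -> dict:
    """Compute slice boundaries for a UTC 'now', following the table above."""
    today = now.date()
    yesterday = today - timedelta(days=1)
    return {
        # The archive begins at "start of measurement", so only its end is a fixed date.
        "historical_end": date(today.year - 1, 12, 31),
        # Recent slice: Jan 1 of the current year up to and including yesterday.
        "recent": (date(today.year, 1, 1), yesterday),
        # Realtime slice: yesterday 12:00 UTC up to the current moment.
        "now": (datetime.combine(yesterday, time(12), tzinfo=timezone.utc), now),
    }
```

On Jan 1 this naive calculation yields a recent range whose start lies after its end, that is, an empty slice. How foehn treats that day is not stated here.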
Behaviour:
| Flag | Description |
|---|---|
| --full-refresh | Ignore incremental tracking, re-download everything |
| --convert-only | Convert existing CSVs to Parquet without downloading |
Output:
| Flag | Description |
|---|---|
| --list | List available collections and exit |
| --grids | Also fetch GRIB2, radar HDF5, NetCDF, GeoTIFF (large) |
| --no-parquet | Skip conversion, keep raw CSVs only |
| --data-dir PATH | Output root (default: ./data/meteoswiss) |
Parquet files land in <data-dir>/parquet/<collection>/.
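Given that layout, locating the converted files for one collection is a matter of joining paths. The helper below is a hypothetical convenience, not part of foehn, and it assumes the files sit directly in the collection folder with a .parquet extension.

```python
from pathlib import Path


def parquet_files(data_dir: str, collection: str) -> list[Path]:
    """List Parquet files under <data-dir>/parquet/<collection>/, sorted by name."""
    return sorted((Path(data_dir) / "parquet" / collection).glob("*.parquet"))
```

For analysis, an equivalent glob string such as ./data/meteoswiss/parquet/smn/*.parquet can be handed to polars.scan_parquet to read all files of a collection lazily.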
Environment variables
Settings can also be configured via environment variables. CLI flags always take precedence.
| Variable | Equivalent | Description |
|---|---|---|
| FOEHN_DATA_DIR | --data-dir | Root data directory |
| FOEHN_FULL_REFRESH | --full-refresh | Set to 1, true, or yes to ignore incremental tracking |
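The precedence rule (CLI flag first, then environment variable, then default) can be sketched with argparse as follows. This is an illustration of the rule only: the function name is hypothetical, and the case-insensitive matching of the truthy values is an assumption.

```python
import argparse

TRUTHY = {"1", "true", "yes"}  # values listed in the table above


def resolve_settings(argv: list[str], env: dict[str, str]) -> dict:
    """Merge CLI flags and environment variables; CLI flags win."""
    parser = argparse.ArgumentParser(prog="foehn")
    parser.add_argument("--data-dir", default=None)
    parser.add_argument("--full-refresh", action="store_true")
    args = parser.parse_args(argv)

    # CLI value first, then the environment variable, then the documented default.
    data_dir = args.data_dir or env.get("FOEHN_DATA_DIR") or "./data/meteoswiss"
    # Assumption: the comparison ignores case and surrounding whitespace.
    env_refresh = env.get("FOEHN_FULL_REFRESH", "").strip().lower() in TRUTHY
    return {"data_dir": data_dir, "full_refresh": args.full_refresh or env_refresh}
```

In a real entry point the function would be called with sys.argv[1:] and os.environ. Because --full-refresh is an on/off switch, the flag can only enable a refresh; it cannot cancel one requested through the environment.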
Databricks pipeline
The recommended setup uses Declarative Automation Bundles.
1. Set variables:
export BUNDLE_VAR_host=https://adb-xxx.azuredatabricks.net
export BUNDLE_VAR_alert_email=you@example.com
2. Deploy:
# Requires the newer standalone Databricks CLI; the legacy databricks-cli package on PyPI does not provide the bundle command
databricks bundle validate
databricks bundle deploy -t prod
This deploys two jobs:
- foehn_daily: runs at 13:30 UTC every day; downloads recent data and refreshes Delta tables
- foehn_historical: paused by default; trigger manually for the first run or on Jan 1 for the annual archive slice
Data sources
| Resource | URL |
|---|---|
| STAC API | https://data.geo.admin.ch/api/stac/v1 |
| Documentation | https://opendatadocs.meteoswiss.ch |
| MeteoSwiss OGD | https://github.com/MeteoSwiss/opendata |
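For readers who want to look at the STAC API directly, the sketch below builds the standard STAC items URL for a collection and extracts asset download links from a response. It relies on the generic STAC API layout (`/collections/{id}/items` returning a GeoJSON FeatureCollection whose features carry an `assets` map); the helper names are hypothetical and this is not how foehn itself is written.

```python
import json
from urllib.request import urlopen

STAC_ROOT = "https://data.geo.admin.ch/api/stac/v1"


def items_url(collection_id: str) -> str:
    """Build the items endpoint for a collection, per the STAC API layout."""
    return f"{STAC_ROOT}/collections/{collection_id}/items"


def asset_hrefs(feature_collection: dict) -> list[str]:
    """Collect the href of every asset in a STAC items response."""
    hrefs = []
    for feature in feature_collection.get("features", []):
        for asset in feature.get("assets", {}).values():
            hrefs.append(asset["href"])
    return hrefs


def fetch_first_page(collection_id: str) -> list[str]:
    """Download one page of items (network call) and return its asset links."""
    with urlopen(items_url(collection_id), timeout=30) as resp:
        return asset_hrefs(json.load(resp))
```

Calling fetch_first_page("ch.meteoschweiz.ogd-smn") would return only the first page of results; a complete listing would also need to follow the `next` links in the response, which this sketch omits.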
License
MIT