
Download MeteoSwiss Open Government Data and convert to Parquet / Delta tables


foehn

MeteoSwiss Open Data → Parquet → Databricks Delta tables



foehn downloads every MeteoSwiss OGD collection via the STAC API, converts CSV/TXT to Parquet with Polars, and optionally ingests everything into Databricks Unity Catalog Delta tables on a daily schedule.

Why foehn?

  • 20+ collections in one command — weather stations, radar, hail maps, forecasts, climate scenarios, and more
  • 5–10× smaller on disk — columnar Parquet with Zstd compression vs. raw CSVs
  • Incremental by default — only re-downloads files that changed since your last run, tracked via _last_run.json
  • No Spark required locally — download + conversion uses Polars only; Spark is optional for Delta ingestion
  • Ships a Declarative Automation Bundle — ready-to-deploy daily job and historical backfill, no pipeline config needed

Quick start

pip install foehn
foehn

This downloads recent data (Jan 1 of the current year → yesterday) and converts it to Parquet under ./data/meteoswiss/.


Collections

| Key | Description | Format |
|---|---|---|
| smn | Automatic weather stations (A1) | CSV → Parquet |
| smn_precip | Automatic precipitation stations (A2) | CSV → Parquet |
| smn_tower | Tower stations (A3) | CSV → Parquet |
| nime | Manual precipitation stations (A5) | CSV → Parquet |
| tot | Totaliser precipitation (A6) | CSV → Parquet |
| obs | Visual observations (A8) | CSV → Parquet |
| pollen | Pollen stations (A7) | CSV → Parquet |
| phenology | Phenological observations (A9) | CSV → Parquet |
| nbcn | Homogeneous climate stations (C1) | CSV → Parquet |
| nbcn_precip | Homogeneous precipitation (C2) | CSV → Parquet |
| climate_normals | Station normals 1961–1990 / 1991–2020 (C6) | TXT → Parquet |
| climate_normals_* | Spatial normals (C7) | NetCDF / GeoTIFF |
| surface_derived_grid | Spatial analyses: precipitation, temperature, sunshine (C3/C4) | NetCDF |
| satellite_derived_grid | Spatial analyses: radiation, clouds (C5) | NetCDF |
| climate_scenarios | CH2025 local scenarios (C8) | CSV → Parquet |
| climate_scenarios_grid | CH2025 gridded scenarios (C9) | NetCDF |
| hail_hazard_* | Hail hazard maps | NetCDF / ZIP |
| forecast_local | Local point forecasts (E4) | CSV → Parquet |
| forecast_icon_ch1/ch2 | ICON-CH1/CH2-EPS (E2/E3) | GRIB2 (opt-in) |
| radar_precip/hail | Precipitation + hail radar (D1/D3) | HDF5 (opt-in) |

Installation

From PyPI:

pip install foehn

From source:

git clone https://github.com/kayhendriksen/foehn
cd foehn
pip install -e .

With Databricks extras (PySpark + Delta):

pip install "foehn[databricks]"

Requires Python ≥ 3.10.


CLI reference

foehn [options]

Time range — recent (Jan 1 this year → yesterday) is always included; flags extend it:

| Flag | Description |
|---|---|
| (none) | Recent only: Jan 1 this year → yesterday, updated daily at 12 UTC |
| --historical | Also fetch the full archive: start of measurement → Dec 31 last year |
| --now | Also fetch the realtime slice: yesterday 12 UTC → now, updated every 10 minutes |
| --all | All three slices: historical + recent + now |
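
To make the slice boundaries concrete, here is a minimal sketch that computes them for a given moment. The function name and return shape are made up for illustration and are not part of foehn; the historical start is left as None because "start of measurement" depends on the station.

```python
from datetime import date, datetime, time, timedelta, timezone


def time_slices(now: datetime) -> dict:
    """Boundaries of the three slices for a timezone-aware UTC 'now'."""
    today = now.date()
    yesterday = today - timedelta(days=1)
    return {
        # Start of measurement varies by station, so no fixed start date here.
        "historical": (None, date(today.year - 1, 12, 31)),
        "recent": (date(today.year, 1, 1), yesterday),
        "now": (datetime.combine(yesterday, time(12), tzinfo=timezone.utc), now),
    }


if __name__ == "__main__":
    slices = time_slices(datetime(2025, 6, 15, 14, 0, tzinfo=timezone.utc))
    print(slices["recent"])  # (datetime.date(2025, 1, 1), datetime.date(2025, 6, 14))
```

Read this way, the recent slice is empty on 1 January, because yesterday then falls in the previous year and belongs to the historical archive.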

Behaviour:

| Flag | Description |
|---|---|
| --full-refresh | Ignore incremental tracking, re-download everything |
| --convert-only | Convert existing CSVs to Parquet without downloading |
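
For readers curious what a conversion step like --convert-only might involve, the sketch below reads one CSV with Polars and writes Zstd-compressed Parquet. It is not foehn's own code: the function names are hypothetical, and the semicolon separator is an assumption about the input files that you should check against your data.

```python
from pathlib import Path


def parquet_path(csv_path: Path, out_dir: Path) -> Path:
    """Map e.g. raw/abc.csv to <out_dir>/abc.parquet."""
    return out_dir / csv_path.with_suffix(".parquet").name


def convert_csv(csv_path: Path, out_dir: Path, separator: str = ";") -> Path:
    """Read one CSV with Polars and write it as Zstd-compressed Parquet."""
    import polars as pl  # imported lazily so the path helper works without Polars

    out_dir.mkdir(parents=True, exist_ok=True)
    target = parquet_path(csv_path, out_dir)
    # separator=";" is an assumption; pass another value if your files differ.
    pl.read_csv(csv_path, separator=separator).write_parquet(target, compression="zstd")
    return target
```

Keeping the path mapping separate from the I/O makes it easy to check where a file would land before converting anything.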

Output:

| Flag | Description |
|---|---|
| --grids | Also fetch GRIB2, radar HDF5, NetCDF, GeoTIFF (large) |
| --no-parquet | Skip conversion, keep raw CSVs only |
| --data-dir PATH | Output root (default: ./data/meteoswiss) |

Parquet files land in <data-dir>/parquet/<collection>/.
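
Given that layout, one collection can be loaded lazily with Polars. This is a sketch under assumptions: the helper names are hypothetical, and it assumes the files sit directly in the collection folder with a .parquet extension.

```python
from pathlib import Path


def collection_glob(collection: str, data_dir: str = "./data/meteoswiss") -> str:
    """Glob pattern for one collection under <data-dir>/parquet/<collection>/."""
    return (Path(data_dir) / "parquet" / collection / "*.parquet").as_posix()


def scan_collection(collection: str, data_dir: str = "./data/meteoswiss"):
    """Return a Polars LazyFrame over every Parquet file of the collection."""
    import polars as pl  # imported lazily so the path helper works without Polars

    return pl.scan_parquet(collection_glob(collection, data_dir))
```

For example, `scan_collection("smn").head(5).collect()` would materialise the first five rows of the weather station data without reading the whole collection into memory.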


Databricks pipeline

The recommended setup uses Declarative Automation Bundles.

1. Set variables:

export BUNDLE_VAR_host=https://adb-xxx.azuredatabricks.net
export BUNDLE_VAR_alert_email=you@example.com

2. Deploy:

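Install the Databricks CLI first. The bundle commands ship with the newer standalone CLI (versions 0.205 and above); the legacy databricks-cli package on PyPI does not provide them.
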
databricks bundle validate
databricks bundle deploy -t prod

This deploys two jobs:

  • foehn_daily — runs at 13:30 UTC every day; downloads recent data and refreshes Delta tables
  • foehn_historical — paused by default; trigger manually for first run or on Jan 1 for the annual archive slice
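
The refresh step of such a job can be sketched with PySpark as follows. This is an illustration, not the code the bundle runs: the catalog and schema names are placeholders, the function names are hypothetical, and overwriting the whole table is only the simplest strategy.

```python
def table_name(collection: str, catalog: str = "main", schema: str = "meteoswiss") -> str:
    """Build a three-level Unity Catalog name (catalog and schema are placeholders)."""
    return f"{catalog}.{schema}.{collection}"


def ingest_collection(spark, collection: str, parquet_root: str) -> str:
    """Load one collection's Parquet folder and overwrite its Delta table."""
    target = table_name(collection)
    df = spark.read.parquet(f"{parquet_root.rstrip('/')}/{collection}")
    # Overwrite keeps the sketch simple; an append or merge may fit better in practice.
    df.write.format("delta").mode("overwrite").saveAsTable(target)
    return target
```

On Databricks, `spark` is the session that notebooks and jobs already provide, so a call would look like `ingest_collection(spark, "smn", parquet_root)` with `parquet_root` pointing at the parquet folder of your data directory.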

Data sources

  • STAC API: https://data.geo.admin.ch/api/stac/v1
  • Documentation: https://opendatadocs.meteoswiss.ch
  • MeteoSwiss OGD: https://github.com/MeteoSwiss/opendata
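
To explore the STAC API directly, the sketch below lists collection IDs using only the standard library. The `ch.meteoschweiz.ogd-` prefix used to pick out MeteoSwiss collections is an assumption, as are the function names; the code also reads only the first page of results and does not follow pagination links.

```python
import json
from urllib.request import urlopen

STAC_ROOT = "https://data.geo.admin.ch/api/stac/v1"


def fetch_collections(root: str = STAC_ROOT) -> dict:
    """GET <root>/collections and return the parsed JSON document."""
    with urlopen(f"{root}/collections", timeout=30) as resp:
        return json.load(resp)


def meteoswiss_ids(collections_doc: dict, prefix: str = "ch.meteoschweiz.ogd-") -> list[str]:
    """Sorted IDs of collections whose ID starts with the (assumed) prefix."""
    return sorted(
        c["id"]
        for c in collections_doc.get("collections", [])
        if c["id"].startswith(prefix)
    )


if __name__ == "__main__":
    for collection_id in meteoswiss_ids(fetch_collections()):
        print(collection_id)
```

Separating the network call from the filtering keeps the filter testable offline with a hand-written response document.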

License

MIT
