Skip to main content

Download MeteoSwiss Open Government Data and convert to Parquet / Delta tables

Project description

foehn

MeteoSwiss Open Data → Parquet → Databricks Delta tables

PyPI Latest Release Python Versions MIT License Monthly Downloads


foehn downloads every MeteoSwiss OGD collection via the STAC API, converts CSV/TXT to Parquet with Polars, and optionally ingests everything into Databricks Unity Catalog Delta tables on a daily schedule.

Why foehn?

  • 20+ collections in one command — weather stations, radar, hail maps, forecasts, climate scenarios, and more
  • Significantly smaller on disk — columnar Parquet with Snappy compression vs. raw CSVs
  • Incremental by default — only re-downloads files that changed since your last run, tracked via _last_run.json
  • No Spark required locally — download + conversion uses Polars only; Spark is optional for Delta ingestion
  • Ships a Declarative Automation Bundle — ready-to-deploy daily job and historical backfill, no pipeline config needed

Quick start

pip install foehn
foehn

Recent data (Jan 1 → yesterday) is downloaded and converted to Parquet under ./data/meteoswiss/.


Collections

Key Description Format
smn Automatic weather stations (A1) CSV → Parquet
smn_precip Automatic precipitation stations (A2) CSV → Parquet
smn_tower Tower stations (A3) CSV → Parquet
nime Manual precipitation stations (A5) CSV → Parquet
tot Totaliser precipitation (A6) CSV → Parquet
obs Visual observations (A8) CSV → Parquet
pollen Pollen stations (A7) CSV → Parquet
phenology Phenological observations (A9) CSV → Parquet
nbcn Homogeneous climate stations (C1) CSV → Parquet
nbcn_precip Homogeneous precipitation (C2) CSV → Parquet
climate_normals Station normals 1961–1990 / 1991–2020 (C6) TXT → Parquet
climate_normals_* Spatial normals (C7) NetCDF / GeoTIFF
surface_derived_grid Spatial analyses — precip, temp, sunshine (C3/C4) NetCDF
satellite_derived_grid Spatial analyses — radiation, clouds (C5) NetCDF
climate_scenarios CH2025 local scenarios (C8) CSV → Parquet
climate_scenarios_grid CH2025 gridded scenarios (C9) NetCDF
hail_hazard_* Hail hazard maps NetCDF / ZIP
forecast_local Local point forecasts (E4) CSV → Parquet
forecast_icon_ch1/ch2 ICON-CH1/CH2-EPS (E2/E3) GRIB2 (opt-in)
radar_precip/hail Precipitation + hail radar (D1/D3) HDF5 (opt-in)

Installation

From PyPI:

pip install foehn

From source:

git clone https://github.com/kayhendriksen/foehn
cd foehn
pip install -e .

With Databricks extras (PySpark + Delta):

pip install "foehn[databricks]"

Requires Python ≥ 3.10.


Python API

Use foehn directly from notebooks or scripts:

import foehn

# List all available collections
foehn.list_collections()
# [{'category': 'CSV', 'key': 'smn', 'collection_id': 'ch.meteoschweiz.ogd-smn'}, ...]

# Download a single collection
foehn.fetch("smn", data_dir="./data/meteoswiss")

# Download with specific time slices
foehn.fetch("smn", data_types=["historical", "recent"])

# Convert downloaded CSVs to Parquet
foehn.convert("smn", data_dir="./data/meteoswiss")

CLI reference

foehn [options]

Time range — recent (Jan 1 this year → yesterday) is always included; flags extend it:

Flag Description
(none) Recent only — Jan 1 this year → yesterday, updated daily at 12 UTC
--historical Also fetch full archive — start of measurement → Dec 31 last year
--now Also fetch realtime slice — yesterday 12 UTC → now, 10-min updates
--all All three slices: historical + recent + now

Behaviour:

Flag Description
--full-refresh Ignore incremental tracking, re-download everything
--convert-only Convert existing CSVs to Parquet without downloading

Output:

Flag Description
--list List available collections and exit
--grids Also fetch GRIB2, radar HDF5, NetCDF, GeoTIFF (large)
--no-parquet Skip conversion, keep raw CSVs only
--data-dir PATH Output root (default: ./data/meteoswiss)

Parquet files land in <data-dir>/parquet/<collection>/.


Environment variables

Settings can also be configured via environment variables. CLI flags always take precedence.

Variable Equivalent Description
FOEHN_DATA_DIR --data-dir Root data directory
FOEHN_FULL_REFRESH --full-refresh Set to 1, true, or yes to ignore incremental tracking

Databricks pipeline

The recommended setup uses Declarative Automation Bundles.

1. Set variables:

export BUNDLE_VAR_host=https://adb-xxx.azuredatabricks.net
export BUNDLE_VAR_alert_email=you@example.com

2. Deploy:

pip install databricks-cli
databricks bundle validate
databricks bundle deploy -t prod

This deploys two jobs:

  • foehn_daily — runs at 13:30 UTC every day; downloads recent data and refreshes Delta tables
  • foehn_historical — paused by default; trigger manually for first run or on Jan 1 for the annual archive slice

Data sources

STAC API https://data.geo.admin.ch/api/stac/v1
Documentation https://opendatadocs.meteoswiss.ch
MeteoSwiss OGD https://github.com/MeteoSwiss/opendata

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

foehn-0.2.3.tar.gz (41.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

foehn-0.2.3-py3-none-any.whl (17.6 kB view details)

Uploaded Python 3

File details

Details for the file foehn-0.2.3.tar.gz.

File metadata

  • Download URL: foehn-0.2.3.tar.gz
  • Upload date:
  • Size: 41.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for foehn-0.2.3.tar.gz
Algorithm Hash digest
SHA256 163f91777eb91c0d1e19a4f8b7feefb9f19762b37dcf0a1d74651f1643a90b60
MD5 d98a5e589546f7d3306dcd08aaa894cd
BLAKE2b-256 a171e60db34c460d4be5ff05de01d221961087c21e4f5b5c4b497964233aaab3

See more details on using hashes here.

Provenance

The following attestation bundles were made for foehn-0.2.3.tar.gz:

Publisher: publish.yml on kayhendriksen/foehn

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file foehn-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: foehn-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 17.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for foehn-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 793b157be7936baf32b53f2403a048f6b543431392a931b6da8d8bc1b5953a61
MD5 efa50d6db399f5eb4590f4b5f356d129
BLAKE2b-256 7b0ccdd39ddce3f08119d0ae9a8e39be9651259cf8153fd6953be721688d7f98

See more details on using hashes here.

Provenance

The following attestation bundles were made for foehn-0.2.3-py3-none-any.whl:

Publisher: publish.yml on kayhendriksen/foehn

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page