Download MeteoSwiss Open Government Data and convert to Parquet / Delta tables

foehn

MeteoSwiss Open Data → Parquet → Databricks Delta tables

foehn downloads every MeteoSwiss OGD collection via the STAC API, converts CSV/TXT to Parquet with Polars, and optionally ingests everything into Databricks Unity Catalog Delta tables on a daily schedule.
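
To illustrate the STAC lookup that the download step relies on, here is a minimal sketch using only the standard library. It is not foehn's actual implementation. The helper names (items_url, asset_hrefs, list_assets) are hypothetical; the base URL and the collection ID come from this README, and the /collections/{id}/items route and the features/assets/href layout follow the generic STAC API convention.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL as listed under "Data sources" in this README
STAC_ROOT = "https://data.geo.admin.ch/api/stac/v1"


def items_url(collection_id: str, limit: int = 100) -> str:
    """Build the standard STAC 'items' endpoint URL for one collection."""
    return f"{STAC_ROOT}/collections/{quote(collection_id)}/items?limit={limit}"


def asset_hrefs(feature_collection: dict) -> list:
    """Collect every asset download link from a STAC items response."""
    hrefs = []
    for item in feature_collection.get("features", []):
        for asset in item.get("assets", {}).values():
            hrefs.append(asset["href"])
    return hrefs


def list_assets(collection_id: str) -> list:
    """Fetch the first page of items and return their asset URLs (needs network)."""
    with urlopen(items_url(collection_id), timeout=30) as resp:
        return asset_hrefs(json.load(resp))
```

Calling list_assets("ch.meteoschweiz.ogd-smn") would return the asset links on the first result page only; following the paging links is left out of this sketch.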

Why foehn?

  • 20+ collections in one command — weather stations, radar, hail maps, forecasts, climate scenarios, and more
  • Significantly smaller on disk — columnar Parquet with Snappy compression vs. raw CSVs
  • Incremental by default — only re-downloads files that changed since your last run, tracked via _last_run.json
  • No Spark required locally — download + conversion uses Polars only; Spark is optional for Delta ingestion
  • Ships a Declarative Automation Bundle — ready-to-deploy daily job and historical backfill, no pipeline config needed
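
The incremental behaviour can be pictured roughly as follows. This is a sketch under stated assumptions, not foehn's code: the README only says that changes are tracked via _last_run.json, so the file layout used here (a single "last_run" ISO timestamp) and the helper names are hypothetical, as is the idea that each remote file carries an "updated" timestamp to compare against.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def load_last_run(state_file: Path) -> Optional[datetime]:
    """Return the stored run time, or None when no run has been recorded yet."""
    if not state_file.exists():
        return None
    state = json.loads(state_file.read_text())
    # Hypothetical layout: {"last_run": "<ISO 8601 timestamp with UTC offset>"}
    return datetime.fromisoformat(state["last_run"])


def needs_download(asset_updated: str, last_run: Optional[datetime]) -> bool:
    """A file is fetched on the first run, or when it changed after the last run."""
    if last_run is None:
        return True
    # Both timestamps must be timezone-aware for the comparison to be valid
    return datetime.fromisoformat(asset_updated) > last_run


def save_last_run(state_file: Path, when: datetime) -> None:
    """Record the time of the current run for the next invocation."""
    state_file.write_text(json.dumps({"last_run": when.isoformat()}))


if __name__ == "__main__":
    state = Path("_last_run.json")
    print(needs_download("2025-03-01T00:00:00+00:00", load_last_run(state)))
    save_last_run(state, datetime.now(timezone.utc))
```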

Quick start

pip install foehn
foehn

Running foehn with no flags downloads recent data (Jan 1 of the current year → yesterday) and converts it to Parquet under ./data/meteoswiss/.


Collections

Key                     Description                                       Format
smn                     Automatic weather stations (A1)                   CSV → Parquet
smn_precip              Automatic precipitation stations (A2)             CSV → Parquet
smn_tower               Tower stations (A3)                               CSV → Parquet
nime                    Manual precipitation stations (A5)                CSV → Parquet
tot                     Totaliser precipitation (A6)                      CSV → Parquet
obs                     Visual observations (A8)                          CSV → Parquet
pollen                  Pollen stations (A7)                              CSV → Parquet
phenology               Phenological observations (A9)                    CSV → Parquet
nbcn                    Homogeneous climate stations (C1)                 CSV → Parquet
nbcn_precip             Homogeneous precipitation (C2)                    CSV → Parquet
climate_normals         Station normals 1961–1990 / 1991–2020 (C6)        TXT → Parquet
climate_normals_*       Spatial normals (C7)                              NetCDF / GeoTIFF
surface_derived_grid    Spatial analyses: precip, temp, sunshine (C3/C4)  NetCDF
satellite_derived_grid  Spatial analyses: radiation, clouds (C5)          NetCDF
climate_scenarios       CH2025 local scenarios (C8)                       CSV → Parquet
climate_scenarios_grid  CH2025 gridded scenarios (C9)                     NetCDF
hail_hazard_*           Hail hazard maps                                  NetCDF / ZIP
forecast_local          Local point forecasts (E4)                        CSV → Parquet
forecast_icon_ch1/ch2   ICON-CH1/CH2-EPS (E2/E3)                          GRIB2 (opt-in)
radar_precip/hail       Precipitation + hail radar (D1/D3)                HDF5 (opt-in)

Installation

From PyPI:

pip install foehn

From source:

git clone https://github.com/kayhendriksen/foehn
cd foehn
pip install -e .

With Databricks extras (PySpark + Delta):

pip install "foehn[databricks]"

Requires Python ≥ 3.10.


Python API

Use foehn directly from notebooks or scripts:

import foehn

# List all available collections
foehn.list_collections()
# [{'category': 'CSV', 'key': 'smn', 'collection_id': 'ch.meteoschweiz.ogd-smn'}, ...]

# Download a single collection
foehn.fetch("smn", data_dir="./data/meteoswiss")

# Download with specific time slices
foehn.fetch("smn", data_types=["historical", "recent"])

# Convert downloaded CSVs to Parquet
foehn.convert("smn", data_dir="./data/meteoswiss")
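
As a small usage sketch built on the list_collections() output shown above, the helper below groups collection keys by their category field. The function name keys_by_category is hypothetical; it only assumes the dictionary keys ('category', 'key') that appear in the sample output.

```python
from collections import defaultdict


def keys_by_category(collections):
    """Group collection keys by the 'category' field of each entry."""
    grouped = defaultdict(list)
    for entry in collections:
        grouped[entry["category"]].append(entry["key"])
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        {"category": "CSV", "key": "smn", "collection_id": "ch.meteoschweiz.ogd-smn"},
    ]
    print(keys_by_category(sample))
```

With foehn installed, keys_by_category(foehn.list_collections()) should give one list of keys per category, which is handy for looping over only the tabular collections.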

CLI reference

foehn [options]

Time range — recent (Jan 1 this year → yesterday) is always included; flags extend it:

Flag              Description
(none)            Recent only: Jan 1 this year → yesterday, updated daily at 12 UTC
--historical      Also fetch full archive: start of measurement → Dec 31 last year
--now             Also fetch realtime slice: yesterday 12 UTC → now, 10-min updates
--all             All three slices: historical + recent + now
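
The boundaries of the three slices can be written down as date arithmetic. The sketch below is illustrative only and mirrors the descriptions in the table; the function name slice_bounds and the returned structure are assumptions, not part of foehn.

```python
from datetime import date, datetime, time, timedelta, timezone


def slice_bounds(now: datetime) -> dict:
    """Return (start, end) for each time slice; `now` must be timezone-aware UTC."""
    today = now.date()
    yesterday = today - timedelta(days=1)
    return {
        # Archive: start of measurement (open-ended here) → Dec 31 last year
        "historical": (None, date(today.year - 1, 12, 31)),
        # Recent: Jan 1 this year → yesterday
        "recent": (date(today.year, 1, 1), yesterday),
        # Realtime: yesterday 12 UTC → now
        "now": (datetime.combine(yesterday, time(12), tzinfo=timezone.utc), now),
    }


if __name__ == "__main__":
    for name, bounds in slice_bounds(datetime.now(timezone.utc)).items():
        print(name, bounds)
```

Note that on Jan 1 the recent slice computed this way is empty, because yesterday falls in the previous year, which belongs to the historical archive.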

Behaviour:

Flag              Description
--full-refresh    Ignore incremental tracking, re-download everything
--convert-only    Convert existing CSVs to Parquet without downloading

Output:

Flag              Description
--list            List available collections and exit
--grids           Also fetch GRIB2, radar HDF5, NetCDF, GeoTIFF (large)
--no-parquet      Skip conversion, keep raw CSVs only
--data-dir PATH   Output root (default: ./data/meteoswiss)

Parquet files land in <data-dir>/parquet/<collection>/.
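
To locate the converted files from a script, a sketch like the following may help. It assumes only the <data-dir>/parquet/<collection>/ layout stated above; the helper names are hypothetical, and the recursive search is a precaution because the README does not say whether files sit in subfolders.

```python
from pathlib import Path


def parquet_dir(collection: str, data_dir: str = "./data/meteoswiss") -> Path:
    """Directory where converted files for one collection are expected."""
    return Path(data_dir) / "parquet" / collection


def parquet_files(collection: str, data_dir: str = "./data/meteoswiss") -> list:
    """Sorted list of .parquet files for a collection (empty if none exist)."""
    folder = parquet_dir(collection, data_dir)
    if not folder.is_dir():
        return []
    return sorted(folder.rglob("*.parquet"))


if __name__ == "__main__":
    for path in parquet_files("smn"):
        print(path)
```

The resulting paths can be passed to any Parquet reader; Polars, for example, accepts a glob pattern in pl.scan_parquet, so the whole collection folder can be scanned lazily.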


Environment variables

Settings can also be configured via environment variables. CLI flags always take precedence.

Variable            Equivalent      Description
FOEHN_DATA_DIR      --data-dir      Root data directory
FOEHN_FULL_REFRESH  --full-refresh  Set to 1, true, or yes to ignore incremental tracking
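
The precedence rule (CLI flag first, then environment variable, then default) can be sketched as below. This is an illustration, not foehn's own configuration code: the function names are hypothetical, and treating the truthy values case-insensitively is an assumption beyond what the table states.

```python
import os
from typing import Mapping, Optional

# Values the README lists as enabling FOEHN_FULL_REFRESH
TRUTHY = {"1", "true", "yes"}


def resolve_data_dir(cli_value: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """CLI value wins; otherwise FOEHN_DATA_DIR; otherwise the documented default."""
    if cli_value is not None:
        return cli_value
    return env.get("FOEHN_DATA_DIR", "./data/meteoswiss")


def resolve_full_refresh(cli_flag: bool, env: Mapping[str, str] = os.environ) -> bool:
    """True when --full-refresh was passed or the variable holds a truthy value."""
    if cli_flag:
        return True
    # Case-insensitive matching is an assumption of this sketch
    return env.get("FOEHN_FULL_REFRESH", "").strip().lower() in TRUTHY


if __name__ == "__main__":
    print(resolve_data_dir(None), resolve_full_refresh(False))
```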

Databricks pipeline

The recommended setup uses Declarative Automation Bundles.

1. Set variables:

export BUNDLE_VAR_host=https://adb-xxx.azuredatabricks.net
export BUNDLE_VAR_alert_email=you@example.com

2. Deploy:

# Requires the current Databricks CLI (not the legacy databricks-cli PyPI package, which lacks the bundle commands)
databricks bundle validate
databricks bundle deploy -t prod

This deploys two jobs:

  • foehn_daily — runs at 13:30 UTC every day; downloads recent data and refreshes Delta tables
  • foehn_historical — paused by default; trigger manually for first run or on Jan 1 for the annual archive slice

Data sources

STAC API        https://data.geo.admin.ch/api/stac/v1
Documentation   https://opendatadocs.meteoswiss.ch
MeteoSwiss OGD  https://github.com/MeteoSwiss/opendata

License

MIT
