Download MeteoSwiss Open Government Data and convert to Parquet / Delta tables

foehn

MeteoSwiss Open Data → Parquet → Databricks Delta tables


foehn downloads every MeteoSwiss OGD collection via the STAC API, converts CSV/TXT to Parquet with Polars, and optionally ingests everything into Databricks Unity Catalog Delta tables on a daily schedule.

Why foehn?

  • 20+ collections in one command — weather stations, radar, hail maps, forecasts, climate scenarios, and more
  • Significantly smaller on disk — columnar Parquet with Snappy compression vs. raw CSVs
  • Incremental by default — only re-downloads files that changed since your last run, tracked via _last_run.json
  • No Spark required locally — download + conversion uses Polars only; Spark is optional for Delta ingestion
  • Ships a Declarative Automation Bundle — ready-to-deploy daily job and historical backfill, no pipeline config needed
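
The incremental behaviour described above can be sketched as follows. This is a minimal illustration, not foehn's actual implementation: only the file name `_last_run.json` comes from this README, while the JSON layout (asset name mapped to last seen timestamp) and the helper names `load_state`, `needs_download`, and `save_state` are assumptions.

```python
import json
from pathlib import Path

STATE_FILE = "_last_run.json"  # name from the README; the JSON layout used here is assumed


def load_state(data_dir: Path) -> dict:
    """Return the recorded {asset name: last seen timestamp} map, or {} on a first run."""
    path = data_dir / STATE_FILE
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def needs_download(state: dict, asset: str, remote_updated: str,
                   full_refresh: bool = False) -> bool:
    """Fetch an asset when forced, never seen, or its remote timestamp changed."""
    return full_refresh or state.get(asset) != remote_updated


def save_state(data_dir: Path, state: dict) -> None:
    """Persist the map so the next run can skip unchanged assets."""
    data_dir.mkdir(parents=True, exist_ok=True)
    (data_dir / STATE_FILE).write_text(json.dumps(state, indent=2), encoding="utf-8")
```

Comparing timestamps for inequality rather than ordering keeps the sketch robust if a remote file is ever republished with an older timestamp.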

Quick start

pip install foehn
foehn

Recent data (Jan 1 of the current year → yesterday) is downloaded and converted to Parquet under ./data/meteoswiss/.


Collections

| Key | Description | Format |
|---|---|---|
| smn | Automatic weather stations (A1) | CSV → Parquet |
| smn_precip | Automatic precipitation stations (A2) | CSV → Parquet |
| smn_tower | Tower stations (A3) | CSV → Parquet |
| nime | Manual precipitation stations (A5) | CSV → Parquet |
| tot | Totaliser precipitation (A6) | CSV → Parquet |
| obs | Visual observations (A8) | CSV → Parquet |
| pollen | Pollen stations (A7) | CSV → Parquet |
| phenology | Phenological observations (A9) | CSV → Parquet |
| nbcn | Homogeneous climate stations (C1) | CSV → Parquet |
| nbcn_precip | Homogeneous precipitation (C2) | CSV → Parquet |
| climate_normals | Station normals 1961–1990 / 1991–2020 (C6) | TXT → Parquet |
| climate_normals_* | Spatial normals (C7) | NetCDF / GeoTIFF |
| surface_derived_grid | Spatial analyses: precip, temp, sunshine (C3/C4) | NetCDF |
| satellite_derived_grid | Spatial analyses: radiation, clouds (C5) | NetCDF |
| climate_scenarios | CH2025 local scenarios (C8) | CSV → Parquet |
| climate_scenarios_grid | CH2025 gridded scenarios (C9) | NetCDF |
| hail_hazard_* | Hail hazard maps | NetCDF / ZIP |
| forecast_local | Local point forecasts (E4) | CSV → Parquet |
| forecast_icon_ch1/ch2 | ICON-CH1/CH2-EPS (E2/E3) | GRIB2 (opt-in) |
| radar_precip/hail | Precipitation + hail radar (D1/D3) | HDF5 (opt-in) |

Installation

From PyPI:

pip install foehn

From source:

git clone https://github.com/kayhendriksen/foehn
cd foehn
pip install -e .

With Databricks extras (PySpark + Delta):

pip install "foehn[databricks]"

Requires Python ≥ 3.10.


Python API

Use foehn directly from notebooks or scripts:

import foehn

# List all available collections
foehn.list_collections()
# [{'category': 'CSV', 'key': 'smn', 'collection_id': 'ch.meteoschweiz.ogd-smn'}, ...]

# Download a single collection
foehn.fetch("smn", data_dir="./data/meteoswiss")

# Download with specific time slices
foehn.fetch("smn", data_types=["historical", "recent"])

# Convert downloaded CSVs to Parquet
foehn.convert("smn", data_dir="./data/meteoswiss")
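
As a small illustration of working with the result of `foehn.list_collections()`, the sketch below groups collection keys by category. It assumes only the dictionary shape shown in the comment above (`category`, `key`, `collection_id`). The helper name `keys_by_category` and the second sample entry are hypothetical placeholders, and the sample list stands in for a real call so that the snippet runs without foehn installed.

```python
from collections import defaultdict


def keys_by_category(collections: list) -> dict:
    """Group collection keys under their category, preserving input order."""
    grouped = defaultdict(list)
    for entry in collections:
        grouped[entry["category"]].append(entry["key"])
    return dict(grouped)


# Stand-in for foehn.list_collections(); the second entry is a made-up placeholder.
sample = [
    {"category": "CSV", "key": "smn", "collection_id": "ch.meteoschweiz.ogd-smn"},
    {"category": "GRID", "key": "example_grid", "collection_id": "example.placeholder"},
]

print(keys_by_category(sample))
# {'CSV': ['smn'], 'GRID': ['example_grid']}
```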

CLI reference

foehn [options]

Time range — recent (Jan 1 this year → yesterday) is always included; flags extend it:

| Flag | Description |
|---|---|
| (none) | Recent only: Jan 1 this year → yesterday, updated daily at 12 UTC |
| --historical | Also fetch full archive: start of measurement → Dec 31 last year |
| --now | Also fetch realtime slice: yesterday 12 UTC → now, 10-min updates |
| --all | All three slices: historical + recent + now |
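
The boundaries of the three time slices can be worked out with plain date arithmetic. The sketch below is an illustration of the ranges described in the table, not code taken from foehn; the function names `recent_range`, `historical_end`, and `now_start` are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def recent_range(today: date) -> tuple:
    """Recent slice: Jan 1 of the current year through yesterday."""
    return date(today.year, 1, 1), today - timedelta(days=1)


def historical_end(today: date) -> date:
    """Historical slice ends on Dec 31 of the previous year."""
    return date(today.year - 1, 12, 31)


def now_start(now_utc: datetime) -> datetime:
    """Realtime slice starts at 12:00 UTC on the previous day."""
    yesterday = now_utc.date() - timedelta(days=1)
    return datetime.combine(yesterday, time(12, 0), tzinfo=timezone.utc)


print(recent_range(date(2025, 3, 10)))
print(historical_end(date(2025, 3, 10)))
print(now_start(datetime(2025, 3, 10, 8, 0, tzinfo=timezone.utc)))
```

Under this reading, on Jan 1 the start of the recent range falls after its end, so a caller would treat the recent slice as empty on that day.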

Behaviour:

| Flag | Description |
|---|---|
| --full-refresh | Ignore incremental tracking, re-download everything |
| --convert-only | Convert existing CSVs to Parquet without downloading |

Output:

| Flag | Description |
|---|---|
| --list | List available collections and exit |
| --grids | Also fetch GRIB2, radar HDF5, NetCDF, GeoTIFF (large) |
| --no-parquet | Skip conversion, keep raw CSVs only |
| --data-dir PATH | Output root (default: ./data/meteoswiss) |

Parquet files land in <data-dir>/parquet/<collection>/.
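
Given that layout, locating the converted files for one collection needs only `pathlib`. This is a sketch under assumptions: the helper names are hypothetical, and the `*.parquet` pattern assumes the files carry a `.parquet` extension, which this README does not state.

```python
from pathlib import Path


def parquet_dir(data_dir: str, collection: str) -> Path:
    """Directory for one collection: <data-dir>/parquet/<collection>/."""
    return Path(data_dir) / "parquet" / collection


def list_parquet_files(data_dir: str, collection: str) -> list:
    """Sorted Parquet files for a collection, or [] if nothing was converted yet."""
    folder = parquet_dir(data_dir, collection)
    # Assumes a .parquet file extension; adjust the pattern if the files differ.
    return sorted(folder.glob("*.parquet")) if folder.is_dir() else []
```

The returned paths can then be handed to any Parquet reader.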


Environment variables

Settings can also be configured via environment variables. CLI flags always take precedence.

| Variable | Equivalent | Description |
|---|---|---|
| FOEHN_DATA_DIR | --data-dir | Root data directory |
| FOEHN_FULL_REFRESH | --full-refresh | Set to 1, true, or yes to ignore incremental tracking |
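
The precedence rule (CLI flag first, then environment variable, then default) can be sketched with `argparse`. This is an illustration rather than foehn's own argument handling: the function name `resolve_settings` is hypothetical, and treating the truthy values case-insensitively is an assumption.

```python
import argparse

TRUTHY = {"1", "true", "yes"}  # values listed in the table; case-insensitivity is assumed
DEFAULT_DATA_DIR = "./data/meteoswiss"


def resolve_settings(argv: list, env: dict) -> tuple:
    """Return (data_dir, full_refresh), letting CLI flags override the environment."""
    parser = argparse.ArgumentParser(prog="foehn")
    parser.add_argument("--data-dir", default=None)
    parser.add_argument("--full-refresh", action="store_true")
    args = parser.parse_args(argv)

    # CLI value wins; otherwise fall back to the environment, then the default.
    data_dir = args.data_dir or env.get("FOEHN_DATA_DIR") or DEFAULT_DATA_DIR
    env_refresh = env.get("FOEHN_FULL_REFRESH", "").strip().lower() in TRUTHY
    return data_dir, args.full_refresh or env_refresh
```

In a real script the `env` argument would be `os.environ` and `argv` would be `sys.argv[1:]`; passing them in explicitly keeps the function easy to test.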

Databricks pipeline

The recommended setup uses Declarative Automation Bundles.

1. Set variables:

export BUNDLE_VAR_host=https://adb-xxx.azuredatabricks.net
export BUNDLE_VAR_alert_email=you@example.com

2. Deploy:

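# Requires a current release of the Databricks CLI. The legacy databricks-cli
# package on PyPI does not provide the bundle commands, so install the CLI
# following the Databricks documentation rather than with pip.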
databricks bundle validate
databricks bundle deploy -t prod

This deploys two jobs:

  • foehn_daily — runs at 13:30 UTC every day; downloads recent data and refreshes Delta tables
  • foehn_historical — paused by default; trigger manually for first run or on Jan 1 for the annual archive slice

Data sources

  • STAC API: https://data.geo.admin.ch/api/stac/v1
  • Documentation: https://opendatadocs.meteoswiss.ch
  • MeteoSwiss OGD: https://github.com/MeteoSwiss/opendata
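
For readers who want to inspect the STAC API directly, the sketch below builds an items URL for one collection and extracts asset download links from a response page. It relies on the standard STAC API layout (`/collections/{id}/items`, with each feature carrying an `assets` map of objects that have an `href`). The helper names and the `limit` default are assumptions, and this is not how foehn itself is implemented.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

STAC_ROOT = "https://data.geo.admin.ch/api/stac/v1"  # from the Data sources list


def items_url(collection_id: str, limit: int = 100) -> str:
    """URL of the first items page for a collection (limit value is an assumption)."""
    return f"{STAC_ROOT}/collections/{quote(collection_id, safe='')}/items?limit={limit}"


def asset_hrefs(item_page: dict) -> list:
    """Collect every asset href from one page of a STAC items response."""
    hrefs = []
    for feature in item_page.get("features", []):
        for asset in feature.get("assets", {}).values():
            hrefs.append(asset["href"])
    return hrefs


def fetch_page(url: str) -> dict:
    """Download and decode one JSON page (performs a network request)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)
```

A typical use would be `asset_hrefs(fetch_page(items_url("ch.meteoschweiz.ogd-smn")))`, where the collection ID is the one shown in the Python API section. Large collections span several pages, which a fuller version would follow through the `next` link in the response.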

License

MIT
