A downloader for KNMI weather datasets
KNMI Dataset Downloader
A Python package for easily downloading datasets from the KNMI (Royal Netherlands Meteorological Institute) Data Platform. This tool supports concurrent downloads and provides both a command-line interface and a Python API.
Background
This project grew out of my time at Clairify (https://www.clairify.io), where I worked extensively with KNMI datasets. After leaving, I had time to build a tool that streamlines the download process. The goal is to simplify dataset acquisition for Python projects, so that developers and data scientists can work with KNMI's meteorological data more easily.
Features
- Concurrent downloads for improved performance
- Progress bars for overall and per-file downloads
- Date range filtering (CLI and API translate times to UTC for the KNMI list-files API)
- Skips files that are already present on disk
- CLI and Python async API
- Download statistics (DownloadStats)
- Anonymous API key: optional automatic fetch from the KNMI developer portal (HTTP client timeout on that request)
- Kiota-generated client for the KNMI Open Data API
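The concurrent-download and skip-existing features can be pictured with a small asyncio sketch. This is an illustration under assumptions, not the package's actual implementation: `fetch_all` and the caller-supplied `fetch_one` coroutine are hypothetical names, and the real downloader may structure its workers differently.

```python
import asyncio
from pathlib import Path


async def fetch_all(names, output_dir, fetch_one, max_concurrent=10):
    """Run fetch_one(name, target) for each missing file, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)
    output_dir = Path(output_dir)
    skipped = []

    async def worker(name):
        target = output_dir / name
        async with semaphore:  # waits while max_concurrent workers are active
            await fetch_one(name, target)
        return name

    pending = []
    for name in names:
        if (output_dir / name).exists():
            skipped.append(name)  # already on disk: do not download again
        else:
            pending.append(worker(name))

    # gather() returns results in the order the workers were submitted
    downloaded = await asyncio.gather(*pending)
    return list(downloaded), skipped
```

A semaphore keeps the number of in-flight requests bounded without having to manage a worker pool by hand.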
Installation
From PyPI:
pip install knmi-dataset-downloader
From source (dependencies are declared in pyproject.toml; lockfile is uv.lock if you use uv):
git clone https://github.com/tiborrr/knmi-dataset-downloader.git
cd knmi-dataset-downloader
uv sync # recommended: creates .venv and installs project + dev tools
# or: pip install .
Prerequisites
- Python 3.14+ (see requires-python in pyproject.toml)
- KNMI Data Platform API key (optional): if you omit --api-key / api_key, an anonymous key is fetched from the developer portal
Usage
Command line
# With your own API key
knmi-download --api-key YOUR_API_KEY --start-date 2024-01-01T00:00:00 --end-date 2024-01-31T23:59:59
# Anonymous key (fetched for you)
knmi-download --start-date 2024-01-01 --end-date 2024-01-31
# Cap how many files to download
knmi-download --start-date 2024-01-01 --end-date 2024-01-31 --limit 5
If you omit --start-date / --end-date, the CLI defaults to the window from 1 hour 30 minutes ago until now, both in UTC.
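For reference, the same default window can be computed in Python if you want to pass it explicitly. This is a sketch of the arithmetic only; `default_window` is a hypothetical helper, not a function exported by the package.

```python
from datetime import datetime, timedelta, timezone


def default_window(now=None):
    """Return (start, end) covering the last 1 h 30 min, both timezone-aware UTC."""
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(hours=1, minutes=30)
    return start, end


start, end = default_window()
print(start.isoformat(), end.isoformat())
```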
Use -o / --output-dir to choose where files go (default: ./datasets relative to the current working directory).
Typical options (see knmi-download --help for the full list):
| Option | Description |
|---|---|
| -d, --dataset | Dataset name (default: Actuele10mindataKNMIstations) |
| -v, --version | Dataset version (default: 2) |
| -c, --concurrent | Max concurrent downloads (default: 10) |
| -s, --start-date | ISO 8601 start (default: ~1h30 ago UTC) |
| -e, --end-date | ISO 8601 end (default: now UTC) |
| --api-key | KNMI API key (optional) |
| -o, --output-dir | Output directory (default: ./datasets) |
| --limit | Maximum number of files |
Python API
import asyncio
from datetime import datetime

from knmi_dataset_downloader import download, DownloadStats


async def main() -> None:
    stats: DownloadStats = await download(
        api_key="YOUR_API_KEY",  # Optional; anonymous key is used if omitted / None
        dataset_name="Actuele10mindataKNMIstations",
        version="2",
        max_concurrent=10,
        output_dir="path/to/output",  # default: ./datasets
        start_date=datetime(2024, 1, 1, 0, 0, 0),
        end_date=datetime(2024, 1, 31, 23, 59, 59),
        limit=5,
    )
    print(f"Total files found: {stats.total_files}")
    print(f"Files downloaded: {stats.downloaded_files}")
    print(f"Files skipped: {stats.skipped_files}")


if __name__ == "__main__":
    asyncio.run(main())
Public re-exports also include DEFAULT_DATASET_NAME, DEFAULT_DATASET_VERSION, DEFAULT_MAX_CONCURRENT, and DEFAULT_OUTPUT_DIR from knmi_dataset_downloader.
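The package translates times to UTC for the KNMI list-files API, but this README does not say how naive datetimes are interpreted. If you want to remove that ambiguity, one option is to build timezone-aware UTC values yourself before passing start_date and end_date. The `to_utc` helper below is a hypothetical sketch, not part of the package, and its choice to treat naive values as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value, assume_tz=timezone.utc):
    """Return value as an aware UTC datetime; naive values are taken to be in assume_tz."""
    if value.tzinfo is None:
        value = value.replace(tzinfo=assume_tz)
    return value.astimezone(timezone.utc)


# Midnight on 2024-01-01 at a fixed UTC+01:00 offset is 23:00 on 2023-12-31 in UTC.
cet = timezone(timedelta(hours=1))
print(to_utc(datetime(2024, 1, 1, 0, 0, tzinfo=cet)).isoformat())
```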
Download statistics
Each run reports:
- Total files matching the query
- Skipped (already on disk)
- Downloaded
- Failures (with names in stats.failed_files)
- Total bytes downloaded
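As a rough sketch of how these numbers might be consumed in a script, the helper below builds a one-line summary and a success flag. `StatsLike` is a hypothetical stand-in that carries only the attribute names mentioned in this README; it assumes failed_files is a list of file names, and the real DownloadStats may carry more fields.

```python
from dataclasses import dataclass, field


@dataclass
class StatsLike:
    """Stand-in with the DownloadStats attributes named in this README."""
    total_files: int = 0
    downloaded_files: int = 0
    skipped_files: int = 0
    failed_files: list = field(default_factory=list)


def summarize(stats):
    """Build a one-line summary and report whether the run had no failures."""
    line = (
        f"{stats.downloaded_files} downloaded, "
        f"{stats.skipped_files} skipped, "
        f"{len(stats.failed_files)} failed "
        f"of {stats.total_files} found"
    )
    return line, not stats.failed_files


line, ok = summarize(
    StatsLike(total_files=5, downloaded_files=3, skipped_files=1, failed_files=["example.nc"])
)
print(line)
```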
Configuration
There is no DATASET_OUTPUT_DIR environment variable in this package. Outputs go to:
- Default: ./datasets (see DEFAULT_OUTPUT_DIR in knmi_dataset_downloader.defaults), or
- CLI: --output-dir / -o, or
- API: output_dir= on download().
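If your own project does want environment-based configuration, you can resolve the directory yourself and pass the result as output_dir. The sketch below is an assumption-laden example: `KNMI_OUTPUT_DIR` and `resolve_output_dir` are hypothetical names chosen for illustration and are not read or provided by this package.

```python
import os
from pathlib import Path

DEFAULT_DIR = "./datasets"  # mirrors the documented default


def resolve_output_dir(env=None, var="KNMI_OUTPUT_DIR"):
    """Return the directory named by the (hypothetical) env var, else the default."""
    env = os.environ if env is None else env
    return Path(env.get(var) or DEFAULT_DIR)


print(resolve_output_dir({"KNMI_OUTPUT_DIR": "/data/knmi"}))
```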
Error handling
- Existing files are skipped (not re-downloaded by default).
- Partial files are removed if a download fails.
- Failures are logged and listed on DownloadStats.failed_files.
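Removing partial files matters because a truncated file left on disk could otherwise be mistaken for a finished download and skipped on the next run. The general pattern can be sketched as follows; `write_or_cleanup` is a hypothetical helper written for illustration, not the package's own code.

```python
from pathlib import Path


def write_or_cleanup(target, chunks):
    """Write chunks to target; delete the partial file if anything fails."""
    target = Path(target)
    try:
        with target.open("wb") as handle:
            for chunk in chunks:
                handle.write(chunk)
    except BaseException:
        # The file is closed by now, so it is safe to remove the truncated copy.
        target.unlink(missing_ok=True)
        raise
    return target.stat().st_size
```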
Heavy use of the anonymous Open Data API can result in HTTP 429; KNMI may require a cooldown (on the order of an hour) before retrying.
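For unattended jobs, one way to respect such a cooldown is to wait and retry a limited number of times. The sketch below is generic: `RateLimited` and `call_with_cooldown` are hypothetical names, and how this package surfaces a 429 to the caller is not specified here, so you would need to map its behaviour onto the exception yourself.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's code on HTTP 429."""


def call_with_cooldown(action, cooldown_seconds=3600, max_attempts=2, sleep=time.sleep):
    """Call action(); on RateLimited, wait cooldown_seconds and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except RateLimited:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            sleep(cooldown_seconds)
```

Passing `sleep` as a parameter keeps the helper easy to test without actually waiting an hour.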
Developing
- Tests: pytest with pytest-asyncio (uv run pytest or pytest tests with dev deps installed).
- Lint / types: uv run ruff check src tests, uv run basedpyright src tests (see pyproject.toml).
- Integration tests call the real KNMI API; they may skip on 429.
KNMI Open Data API client (Kiota)
The HTTP client under src/knmi_dataset_downloader/knmi_dataset_api is generated with Kiota from the KNMI OpenAPI description. Workspace metadata lives in .kiota/workspace.json (and .kiota/apimanifest.json).
To explore the API surface and OpenAPI in the editor, install the Kiota extension for Visual Studio Code, open this repository, and use the extension’s explorer (e.g. browse the description and see how it maps to the generated request builders). Regenerating the client is optional; if you need to, use the Kiota CLI or the extension’s generate flow with that workspace configuration.
Contributing
Contributions are welcome. Please open a Pull Request; for larger changes, open an issue first.
License
This project is licensed under the GNU General Public License v3.0 or later — see the LICENSE file.
Acknowledgments
- KNMI for the Data Platform API
- Async I/O via asyncio and httpx
Support
Problems or suggestions: open an issue.