A downloader for KNMI weather datasets
KNMI Dataset Downloader
A Python package for easily downloading datasets from the KNMI (Royal Netherlands Meteorological Institute) Data Platform. This tool supports concurrent downloads and provides both a command-line interface and a Python API.
Background
This project grew out of my time at Clairify [www.clairify.io], where I worked extensively with KNMI datasets. After leaving, I had the time to build this tool and streamline the download process. The goal is to simplify dataset acquisition in Python projects, so that developers and data scientists can work more easily with KNMI's meteorological data.
Features
- Concurrent downloads for improved performance
- Progress bars for both overall and individual file downloads
- Support for date range filtering
- Skips already downloaded files
- Both a CLI and a Python API
- Detailed download statistics
- Anonymous API key support with automatic fetching
- Built with Kiota-generated API client for type-safe KNMI API interactions
- Request timeouts for improved reliability
Installation
You can install the package using pip:
pip install knmi-dataset-downloader
Prerequisites
- Python 3.7 or higher
- A KNMI Data Platform API key (optional - will use anonymous API key if not provided)
Usage
Command Line Interface
The simplest way to use the downloader is through the command line:
# Using your own API key
knmi-download --api-key YOUR_API_KEY --start-date 2024-01-01T00:00:00 --end-date 2024-01-31T23:59:59
# Using anonymous API key (automatically fetched)
knmi-download --start-date 2024-01-01 --end-date 2024-01-31
# Limit the number of files to download
knmi-download --start-date 2024-01-01 --end-date 2024-01-31 --limit 5
Available options:
Options:
  -d, --dataset TEXT     Name of the dataset to download (default: Actuele10mindataKNMIstations)
  -v, --version TEXT     Version of the dataset (default: 2)
  -c, --concurrent INT   Maximum number of concurrent downloads (default: 10)
  -s, --start-date TEXT  Start date in ISO 8601 format (e.g., 2024-01-01T00:00:00 or 2024-01-01)
                         Default is 1 hour and 30 minutes ago
  -e, --end-date TEXT    End date in ISO 8601 format (e.g., 2024-01-01T00:00:00 or 2024-01-01)
                         Default is now
  --api-key TEXT         KNMI API key (optional - will fetch anonymous API key if not provided)
  -o, --output-dir PATH  Output directory for downloaded files
  --limit INT            Maximum number of files to download (optional)
  --help                 Show this message and exit
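The date options accept either a full timestamp or a date-only string, and both have defaults. The following is a minimal sketch of how such values could be interpreted with the standard library. The helper names `parse_iso` and `resolve_range` are hypothetical and are not part of this package; the tool's actual parsing may differ.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def parse_iso(value: str) -> datetime:
    """Parse '2024-01-01' or '2024-01-01T00:00:00' into a datetime."""
    # A date-only string parses to midnight at the start of that day.
    return datetime.fromisoformat(value)


def resolve_range(
    start: Optional[str] = None,
    end: Optional[str] = None,
    now: Optional[datetime] = None,
) -> Tuple[datetime, datetime]:
    """Apply the documented defaults: end = now, start = 1 h 30 min ago."""
    now = now or datetime.now()
    end_dt = parse_iso(end) if end else now
    start_dt = parse_iso(start) if start else now - timedelta(hours=1, minutes=30)
    if start_dt > end_dt:
        raise ValueError("start date must not be later than end date")
    return start_dt, end_dt


if __name__ == "__main__":
    print(resolve_range("2024-01-01", "2024-01-31T23:59:59"))
```

Under this interpretation a date-only end date such as 2024-01-31 means midnight at the start of that day, so passing an explicit time like 2024-01-31T23:59:59 is the safer way to include the whole day. Whether the tool itself treats date-only values this way is not stated here.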
Python API
You can also use the package in your Python code:
from knmi_dataset_downloader import dataset
import asyncio
from datetime import datetime

async def main():
    # Download files for a specific date range
    stats = await dataset.download(
        api_key="YOUR_API_KEY",  # Optional - will use anonymous API key if not provided
        dataset_name="Actuele10mindataKNMIstations",  # Optional - uses default if not provided
        version="2",  # Optional - uses default if not provided
        max_concurrent=10,  # Optional - uses default if not provided
        output_dir="path/to/output",  # Optional - uses default if not provided
        start_date=datetime(2024, 1, 1),
        end_date=datetime(2024, 1, 31),
        limit=5,  # Optional - limit the number of files to download
    )

    # Access download statistics
    print(f"Total files found: {stats.total_files}")
    print(f"Files downloaded: {stats.downloaded_files}")
    print(f"Files skipped: {stats.skipped_files}")

# Run the download
if __name__ == "__main__":
    asyncio.run(main())
Download Statistics
After each download session, the tool provides detailed statistics including:
- Total number of files found
- Number of files already present (skipped)
- Number of files downloaded
- Number of failed downloads
- Total data downloaded
- List of any failed downloads
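To make the shape of these statistics concrete, here is a rough sketch of a container that could hold them. The attribute names `total_files`, `downloaded_files`, and `skipped_files` are the ones used in the Python API example; `failed_files`, `total_bytes`, `failed_names`, the class name `DownloadStats`, and the `summarize` helper are assumptions made for illustration and may not match the package.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DownloadStats:
    total_files: int = 0        # files found for the requested range
    skipped_files: int = 0      # files already present on disk
    downloaded_files: int = 0   # files fetched in this session
    failed_files: int = 0       # hypothetical name
    total_bytes: int = 0        # hypothetical name
    failed_names: List[str] = field(default_factory=list)  # hypothetical name


def summarize(stats: DownloadStats) -> str:
    """Render the statistics as a short multi-line report."""
    lines = [
        f"Total files found: {stats.total_files}",
        f"Files skipped: {stats.skipped_files}",
        f"Files downloaded: {stats.downloaded_files}",
        f"Failed downloads: {stats.failed_files}",
        f"Total data downloaded: {stats.total_bytes / 1_000_000:.2f} MB",
    ]
    # One extra line per failed file, if any.
    lines.extend(f"  failed: {name}" for name in stats.failed_names)
    return "\n".join(lines)


if __name__ == "__main__":
    print(summarize(DownloadStats(total_files=3, downloaded_files=3)))
```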
Configuration
By default, files are downloaded to a directory specified by DATASET_OUTPUT_DIR in your configuration. You can modify this by setting the appropriate environment variable or updating the config file.
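One way such a lookup could work is sketched below. It assumes the environment variable is literally named `DATASET_OUTPUT_DIR` and that an explicitly passed directory takes precedence; the function name `resolve_output_dir` and the `"downloads"` fallback are hypothetical and not taken from the package.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_output_dir(explicit: Optional[str] = None, fallback: str = "downloads") -> Path:
    """Pick the output directory: explicit argument, then env var, then fallback."""
    # Assumption: the variable name mirrors the DATASET_OUTPUT_DIR setting.
    raw = explicit or os.environ.get("DATASET_OUTPUT_DIR") or fallback
    return Path(raw).expanduser()


if __name__ == "__main__":
    print(resolve_output_dir())
```

In practice, passing `--output-dir` on the command line or `output_dir` in the Python API avoids relying on any default at all.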
Error Handling
- The downloader automatically skips existing files
- Partially downloaded files are removed in case of failures
- Failed downloads are logged and reported in the final statistics
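The behaviour listed above can be approximated with plain `asyncio`. The sketch below is not the package's implementation: `download_all` and the injected `fetch` coroutine are hypothetical, and writing to a temporary `.part` file before renaming is one possible way to guarantee that no partial file is left behind.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, Dict, List


async def download_all(
    names: List[str],
    fetch: Callable[[str], Awaitable[bytes]],
    output_dir: Path,
    max_concurrent: int = 10,
) -> Dict[str, List[str]]:
    """Download `names` into `output_dir`, at most `max_concurrent` at a time."""
    output_dir.mkdir(parents=True, exist_ok=True)
    semaphore = asyncio.Semaphore(max_concurrent)
    result: Dict[str, List[str]] = {"downloaded": [], "skipped": [], "failed": []}

    async def one(name: str) -> None:
        target = output_dir / name
        if target.exists():
            # Already present from an earlier run: skip it.
            result["skipped"].append(name)
            return
        partial = target.with_name(target.name + ".part")
        async with semaphore:  # bounds the number of simultaneous fetches
            try:
                data = await fetch(name)
                partial.write_bytes(data)
                partial.replace(target)  # rename only after a complete write
                result["downloaded"].append(name)
            except Exception:
                # Remove any partial file and record the failure.
                if partial.exists():
                    partial.unlink()
                result["failed"].append(name)

    await asyncio.gather(*(one(name) for name in names))
    return result
```

Because `fetch` is passed in, the same function can be exercised with a stub coroutine in tests and with a real HTTP client in production. The file write here is blocking, which is acceptable for a sketch but would be worth moving off the event loop for large files.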
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
Acknowledgments
- KNMI for providing the Data Platform API
- Built with Python's asyncio for efficient concurrent downloads
Support
If you encounter any problems or have suggestions, please open an issue on GitHub.