
A downloader for KNMI weather datasets

Project description

KNMI Dataset Downloader

A Python package for easily downloading datasets from the KNMI (Royal Netherlands Meteorological Institute) Data Platform. This tool supports concurrent downloads and provides both a command-line interface and a Python API.

Background

This project was inspired by my work at Clairify [www.clairify.io], where I used KNMI datasets extensively. After leaving, I had time to build this tool to streamline the download process. The goal is to simplify dataset acquisition for Python projects, so that developers and data scientists can work with KNMI's meteorological data more easily.

Features

  • Concurrent downloads for improved performance
  • Progress bars for both overall and individual file downloads
  • Support for date range filtering
  • Skips already downloaded files
  • Both a CLI and a Python API
  • Detailed download statistics
  • Anonymous API key support with automatic fetching
  • Built on a Kiota-generated API client for type-safe KNMI API interactions
  • Request timeouts for improved reliability
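
To illustrate what capping the number of concurrent downloads means in practice, here is a minimal sketch based on asyncio.Semaphore. It is not the package's actual implementation: run_bounded, fake_download, and demo are hypothetical names, and the short sleep stands in for a real HTTP request.

```python
import asyncio


async def run_bounded(jobs, max_concurrent=10):
    """Run coroutine factories with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        # Each job waits here until one of the slots is free.
        async with semaphore:
            return await job()

    # gather() returns results in the same order as the input jobs.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_download(name, tracker):
    """Hypothetical stand-in for a download; records peak concurrency."""
    tracker["active"] += 1
    tracker["peak"] = max(tracker["peak"], tracker["active"])
    await asyncio.sleep(0.01)  # pretend to transfer data
    tracker["active"] -= 1
    return name


def demo(file_count=8, max_concurrent=3):
    tracker = {"active": 0, "peak": 0}
    jobs = [
        (lambda n=f"file_{i}.nc": fake_download(n, tracker))
        for i in range(file_count)
    ]
    results = asyncio.run(run_bounded(jobs, max_concurrent))
    return results, tracker["peak"]
```

Calling demo() returns the file names in submission order together with the highest number of downloads that were active at once, which never exceeds max_concurrent.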

Installation

You can install the package using pip:

pip install knmi-dataset-downloader

Prerequisites

  • Python 3.7 or higher
  • A KNMI Data Platform API key (optional - will use anonymous API key if not provided)

Usage

Command Line Interface

The simplest way to use the downloader is through the command line:

# Using your own API key
knmi-download --api-key YOUR_API_KEY --start-date 2024-01-01T00:00:00 --end-date 2024-01-31T23:59:59

# Using anonymous API key (automatically fetched)
knmi-download --start-date 2024-01-01 --end-date 2024-01-31

# Limit the number of files to download
knmi-download --start-date 2024-01-01 --end-date 2024-01-31 --limit 5

Available options:

Options:
  -d, --dataset TEXT     Name of the dataset to download (default: Actuele10mindataKNMIstations)
  -v, --version TEXT     Version of the dataset (default: 2)
  -c, --concurrent INT   Maximum number of concurrent downloads (default: 10)
  -s, --start-date TEXT  Start date in ISO 8601 format (e.g., 2024-01-01T00:00:00 or 2024-01-01)
                         Default is 1 hour and 30 minutes ago
  -e, --end-date TEXT    End date in ISO 8601 format (e.g., 2024-01-01T00:00:00 or 2024-01-01)
                         Default is now
  --api-key TEXT         KNMI API key (optional - will fetch anonymous API key if not provided)
  -o, --output-dir PATH  Output directory for downloaded files
  --limit INT            Maximum number of files to download (optional)
  --help                 Show this message and exit
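
The date options accept both full timestamps and date-only values. The sketch below shows one way such input could be resolved with the standard library, including the documented defaults (start 1 hour and 30 minutes ago, end now). resolve_range is a hypothetical helper, not part of the package, and the sketch assumes naive local time, which the tool itself may handle differently.

```python
from datetime import datetime, timedelta


def resolve_range(start=None, end=None, now=None):
    """Turn optional ISO 8601 strings into a (start, end) datetime pair."""
    now = now or datetime.now()
    # fromisoformat() accepts "2024-01-01T00:00:00" as well as "2024-01-01";
    # a date-only value becomes midnight at the start of that day.
    end_dt = datetime.fromisoformat(end) if end else now
    start_dt = (
        datetime.fromisoformat(start)
        if start
        else now - timedelta(hours=1, minutes=30)
    )
    if start_dt > end_dt:
        raise ValueError("start date must not be after end date")
    return start_dt, end_dt


print(resolve_range("2024-01-01", "2024-01-31T23:59:59"))
```

In this sketch a date-only end value such as 2024-01-31 resolves to midnight at the start of that day, so pass an explicit time (for example 2024-01-31T23:59:59) if the whole final day should be covered.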

Python API

You can also use the package in your Python code:

from knmi_dataset_downloader import dataset
import asyncio
from datetime import datetime

async def main():
    # Download files for a specific date range
    stats = await dataset.download(
        api_key="YOUR_API_KEY",  # Optional - will use anonymous API key if not provided
        dataset_name="Actuele10mindataKNMIstations",  # Optional - uses default if not provided
        version="2",  # Optional - uses default if not provided
        max_concurrent=10,  # Optional - uses default if not provided
        output_dir="path/to/output",  # Optional - uses default if not provided
        start_date=datetime(2024, 1, 1),
        end_date=datetime(2024, 1, 31),
        limit=5  # Optional - limit the number of files to download
    )
    
    # Access download statistics
    print(f"Total files found: {stats.total_files}")
    print(f"Files downloaded: {stats.downloaded_files}")
    print(f"Files skipped: {stats.skipped_files}")

# Run the download
if __name__ == "__main__":
    asyncio.run(main())

Download Statistics

After each download session, the tool provides detailed statistics including:

  • Total number of files found
  • Number of files already present (skipped)
  • Number of files downloaded
  • Number of failed downloads
  • Total data downloaded
  • List of any failed downloads
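
As a rough illustration of how these figures could be turned into a report, the sketch below formats a stats-like object. The attributes total_files, skipped_files, and downloaded_files match the Python API example above; failed_files and total_bytes are assumed names used only for this sketch, as are the helpers format_size and summarize.

```python
from types import SimpleNamespace


def format_size(num_bytes):
    """Render a byte count with decimal (1000-based) units."""
    size = float(num_bytes)
    for unit in ("B", "kB", "MB"):
        if size < 1000:
            return f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} GB"


def summarize(stats):
    """Build a short plain-text report from a stats-like object."""
    lines = [
        f"Total files found: {stats.total_files}",
        f"Files skipped: {stats.skipped_files}",
        f"Files downloaded: {stats.downloaded_files}",
        # Assumed attribute names, not confirmed by the package:
        f"Files failed: {len(stats.failed_files)}",
        f"Data downloaded: {format_size(stats.total_bytes)}",
    ]
    lines.extend(f"  failed: {name}" for name in stats.failed_files)
    return "\n".join(lines)


# Stand-in object with made-up values, used only to exercise the helpers.
example = SimpleNamespace(
    total_files=5,
    skipped_files=1,
    downloaded_files=3,
    failed_files=["file_4.nc"],
    total_bytes=2_500_000,
)
print(summarize(example))
```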

Configuration

By default, files are downloaded to the directory specified by DATASET_OUTPUT_DIR in your configuration. To change it, set the corresponding environment variable or update the config file. For a single run, you can also pass --output-dir on the command line or output_dir in the Python API.
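
The lookup order can be pictured with a small sketch. It assumes the environment variable carries the same name as the setting (DATASET_OUTPUT_DIR) and that an explicit argument takes precedence; both are assumptions, and resolve_output_dir and the "knmi_data" fallback are hypothetical.

```python
import os
from pathlib import Path


def resolve_output_dir(explicit=None, env=None, fallback="knmi_data"):
    """Pick an output directory: explicit value, then environment, then fallback."""
    env = os.environ if env is None else env
    if explicit:
        return Path(explicit)
    # Assumed variable name; an unset or empty value falls through to the fallback.
    return Path(env.get("DATASET_OUTPUT_DIR") or fallback)


print(resolve_output_dir(env={"DATASET_OUTPUT_DIR": "data/knmi"}))
```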

Error Handling

  • The downloader automatically skips existing files
  • Partially downloaded files are removed in case of failures
  • Failed downloads are logged and reported in the final statistics
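
The behaviour described above can be sketched as follows. This is an illustration of the skip-and-clean-up pattern, not the package's code: save_file and demo are hypothetical, and fetch_chunks stands in for whatever yields the bytes of a remote file.

```python
import tempfile
from pathlib import Path


def save_file(target, fetch_chunks):
    """Write chunks to `target`; skip existing files, remove partial ones."""
    target = Path(target)
    if target.exists():
        return "skipped"
    try:
        with open(target, "wb") as handle:
            for chunk in fetch_chunks():
                handle.write(chunk)
    except Exception:
        # Do not leave a truncated file behind, or the next run would skip it.
        if target.exists():
            target.unlink()
        return "failed"
    return "downloaded"


def demo():
    def complete():
        yield b"ab"
        yield b"cd"

    def interrupted():
        yield b"ab"
        raise ConnectionError("connection dropped")

    with tempfile.TemporaryDirectory() as tmp:
        good, bad = Path(tmp) / "good.nc", Path(tmp) / "bad.nc"
        outcomes = [
            save_file(good, complete),    # first attempt writes the file
            save_file(good, complete),    # second attempt finds it on disk
            save_file(bad, interrupted),  # failure part-way through
        ]
        return outcomes, good.read_bytes(), bad.exists()
```

Removing the partial file matters because the skip check only looks at whether the file exists: a truncated file left on disk would otherwise be treated as complete on the next run.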

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

Acknowledgments

  • KNMI for providing the Data Platform API
  • Built with Python's asyncio for efficient concurrent downloads

Support

If you encounter any problems or have suggestions, please open an issue on GitHub.

Download files

Source Distribution

knmi_dataset_downloader-1.8.0.tar.gz (28.8 kB)

Uploaded Source

Built Distribution

knmi_dataset_downloader-1.8.0-py3-none-any.whl (36.7 kB)

Uploaded Python 3

File details

Details for the file knmi_dataset_downloader-1.8.0.tar.gz.

File metadata

  • Download URL: knmi_dataset_downloader-1.8.0.tar.gz
  • Upload date:
  • Size: 28.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for knmi_dataset_downloader-1.8.0.tar.gz
Algorithm    Hash digest
SHA256       e9abdfecd83717b441956efd096fda216166a4d68da071b9ec5c73ecd1c0a4ca
MD5          b68424438d14bae8bb0f39e43f329530
BLAKE2b-256  cbbfea573b8392318d5986b758796507a339785e2645dc1b1280a12280a5ad4d

Provenance

The following attestation bundles were made for knmi_dataset_downloader-1.8.0.tar.gz:

Publisher: python-publish.yml on tiborrr/knmi-dataset-downloader

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file knmi_dataset_downloader-1.8.0-py3-none-any.whl.

File hashes

Hashes for knmi_dataset_downloader-1.8.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       7e2a7b37af306dfacec5094cd365e5add3e789347e87702876c60db01a22bfba
MD5          3787fda21091a24a9e4c1e0f225a1e1f
BLAKE2b-256  83ccce6ef0774f71a1035276482e114745b63b140b84268b0fbe9025ed495b64

Provenance

The following attestation bundles were made for knmi_dataset_downloader-1.8.0-py3-none-any.whl:

Publisher: python-publish.yml on tiborrr/knmi-dataset-downloader
