climdata

This project automates fetching and extracting weather data from multiple sources, including MSWX, DWD HYRAS, ERA5-Land, and NASA NEX-GDDP, for a given location and time range.

📦 Data Sources

The project draws on climate and weather datasets from the following sources:

  • DWD Station Data
    Retrieved using the DWD API. Provides high-resolution observational data from Germany's national meteorological service.

  • MSWX (Multi-Source Weather)
    Accessed via GloH2O's Google Drive. Combines multiple satellite and reanalysis datasets for global gridded weather variables.

  • DWD HYRAS
    Downloaded from the DWD Open Data FTP Server. Offers gridded observational data for Central Europe, useful for hydrological applications.

  • ERA5, ERA5-Land
    Accessed through Google Earth Engine. Provides reanalysis datasets from ECMWF with high temporal and spatial resolution.

  • NASA NEX-GDDP
    Also retrieved via Earth Engine. Downscaled CMIP5/CMIP6 climate projections developed by NASA for local-scale impact assessment.

  • CMIP6
    Obtained using ESGPull from the ESGF data nodes. Includes multi-model climate simulations following various future scenarios.

It supports:

✅ Automatic file download (e.g., from Google Drive or online servers)
✅ Flexible configuration via config.yaml
✅ Time series extraction for a user-specified latitude/longitude
✅ Batch processing for many locations from a CSV file

🚀 How to Run and Explore Configurations

✅ Run a download job with custom overrides

You can run the data download script and override any configuration value directly on the command line using Hydra.

For example, to download ERA5-Land data for January 1–4, 2020, run:

python download_location.py dataset='era5-land' \
  time_range.start_date='2020-01-01' \
  time_range.end_date='2020-01-04' \
  location.lat=52.5200 \
  location.lon=13.4050

To download data for multiple locations listed in a CSV file named locations.csv, run:

python download_csv.py dataset='era5-land' \
  time_range.start_date='2020-01-01' \
  time_range.end_date='2020-01-04'

An example locations.csv:

lat,lon,city
52.5200,13.4050,berlin
48.1351,11.5820,munich
53.5511,9.9937,hamburg
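
As a rough illustration of how such a file could be consumed in batch mode, the sketch below parses the example with Python's standard csv module and returns one record per row. The helper name read_locations is hypothetical and not part of climdata; the actual parsing in download_csv.py may differ.

```python
import csv
import io


# Hypothetical helper, not part of climdata: parse a locations CSV
# with the columns lat, lon, city into a list of dictionaries.
def read_locations(handle):
    locations = []
    for row in csv.DictReader(handle):
        lat = float(row["lat"])
        lon = float(row["lon"])
        # Reject coordinates outside the valid geographic range.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError(f"invalid coordinates for {row['city']}: {lat}, {lon}")
        locations.append({"lat": lat, "lon": lon, "city": row["city"].strip()})
    return locations


# The example file from above, inlined so the sketch is self-contained.
EXAMPLE_CSV = """lat,lon,city
52.5200,13.4050,berlin
48.1351,11.5820,munich
53.5511,9.9937,hamburg
"""

if __name__ == "__main__":
    for loc in read_locations(io.StringIO(EXAMPLE_CSV)):
        print(loc["city"], loc["lat"], loc["lon"])
```

Validating the coordinates before any download starts means a typo in one row fails fast instead of surfacing halfway through a long batch.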

What this does:

  • dataset='era5-land' tells the script which dataset to use.
  • time_range.start_date and time_range.end_date override the default dates in your YAML config.
  • All other settings use your existing config.yaml in the conf folder.
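
To make the override mechanism concrete, here is a minimal sketch of how dotted key=value overrides can be merged into a nested configuration. It is a plain-Python illustration of the idea, not Hydra's actual implementation; the function apply_overrides and the default values are assumptions made for this example.

```python
import copy


# Illustrative only: Hydra performs this merge (and much more) internally.
def apply_overrides(config, overrides):
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        value = raw_value.strip("'\"")  # drop optional shell-style quotes
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Walk down the nested dictionaries, creating levels as needed.
            node = node.setdefault(name, {})
        node[leaf] = value  # values are kept as strings in this sketch
    return result


# Hypothetical defaults standing in for conf/config.yaml.
defaults = {
    "dataset": "mswx",
    "time_range": {"start_date": "2000-01-01", "end_date": "2000-12-31"},
}

merged = apply_overrides(
    defaults,
    [
        "dataset='era5-land'",
        "time_range.start_date='2020-01-01'",
        "time_range.end_date='2020-01-04'",
    ],
)
print(merged)
```

Keys that are not overridden keep their default values, which is why only the changed settings need to appear on the command line.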

✅ List all available datasets defined in your configuration

To see what datasets are available (without running the downloader), you can dump the resolved configuration and filter it using yq.

Run:

python download_location.py --cfg job | yq '.mappings | keys'

What this does:

  • --cfg job tells Hydra to output the final resolved configuration and exit.
  • | yq '.mappings | keys' filters the output to show only the dataset names defined under the mappings section.

⚡️ Tip

  • Make sure yq is installed:

    brew install yq   # macOS (Go implementation by Mike Farah)
    # OR
    pip install yq    # Python wrapper around jq; requires jq to be installed
    
  • To see available variables for a specific dataset (for example mswx), run:

    python download_location.py --cfg job | yq '.mappings.mswx.variables | keys'
    


⚙️ Key Features

  • Supports multiple weather data providers
  • Uses xarray for robust gridded data extraction
  • Handles curvilinear and rectilinear grids
  • Uses a Google Drive Service Account for secure downloads
  • Easily reproducible runs using Hydra
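
The difference between rectilinear and curvilinear grids matters when picking the grid cell closest to a requested point. The sketch below shows the idea in plain Python rather than xarray, so it runs without extra dependencies; the function names are hypothetical and this is not the extraction code used by climdata.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def nearest_rectilinear(lats, lons, lat, lon):
    """Rectilinear grid: 1-D coordinate axes, so each axis is searched on its own."""
    i = min(range(len(lats)), key=lambda k: abs(lats[k] - lat))
    j = min(range(len(lons)), key=lambda k: abs(lons[k] - lon))
    return i, j


def nearest_curvilinear(lat2d, lon2d, lat, lon):
    """Curvilinear grid: 2-D coordinate arrays, so every cell is compared by distance."""
    best = None
    for i, row in enumerate(lat2d):
        for j, cell_lat in enumerate(row):
            dist = haversine_km(lat, lon, cell_lat, lon2d[i][j])
            if best is None or dist < best[0]:
                best = (dist, i, j)
    return best[1], best[2]


# Toy grid around Berlin; the same grid is expressed in both layouts.
lats = [52.0, 52.5, 53.0]
lons = [13.0, 13.5, 14.0]
lat2d = [[la] * len(lons) for la in lats]
lon2d = [list(lons) for _ in lats]

print(nearest_rectilinear(lats, lons, 52.52, 13.405))
print(nearest_curvilinear(lat2d, lon2d, 52.52, 13.405))
```

The sketch ignores longitude wrap-around (for example grids stored as 0 to 360 degrees), which a real extraction routine would need to handle.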

📡 Google Drive API Setup

This project uses the Google Drive API with a Service Account to securely download weather data files from a shared Google Drive folder.

Follow these steps to set it up correctly:


✅ 1. Create a Google Cloud Project

  • Go to Google Cloud Console.
  • Click “Select Project” → “New Project”.
  • Enter a project name (e.g. WeatherDataDownloader).
  • Click “Create”.

✅ 2. Enable the Google Drive API

  • In the left sidebar, go to APIs & Services → Library.
  • Search for “Google Drive API”.
  • Click it, then click “Enable”.

✅ 3. Create a Service Account

  • Go to IAM & Admin → Service Accounts.
  • Click “Create Service Account”.
  • Enter a name (e.g. weather-downloader-sa).
  • Click “Create and Continue”. You can skip assigning roles for read-only Drive access.
  • Click “Done” to finish.

✅ 4. Create and Download a JSON Key

  • After creating the Service Account, click on its email address to open its details.
  • Go to the “Keys” tab.
  • Click “Add Key” → “Create new key” → choose JSON → click “Create”.
  • A .json key file will download automatically. Store it securely!

✅ 5. Store the JSON Key Securely

  • Place the downloaded .json key in the conf folder with the name service.json.
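
Before running a download, it can help to confirm that the key file is in place and looks like a service account key. The following is a minimal sketch using only the standard library; the function name check_service_key is hypothetical, and the path conf/service.json follows the step above. It checks three fields that Google service account key files contain (type, client_email, private_key) and does not contact Google.

```python
import json
from pathlib import Path

# Fields present in a Google service account JSON key file.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


# Hypothetical helper, not part of climdata.
def check_service_key(path):
    """Return the service account email if the key file looks valid."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if data["type"] != "service_account":
        raise ValueError(f"unexpected key type: {data['type']!r}")
    return data["client_email"]


if __name__ == "__main__":
    key_path = Path("conf") / "service.json"
    if key_path.exists():
        print("Service account:", check_service_key(key_path))
    else:
        print(f"{key_path} not found")
```

Because the file contains a private key, it is usually worth keeping it out of version control.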

Setup Instructions for the ERA5 API

1. CDS API Key Setup

  1. Create a free account on the Copernicus Climate Data Store

  2. Once logged in, go to your user profile

  3. Click on the "Show API key" button

  4. Create the file ~/.cdsapirc with the following content:

    url: https://cds.climate.copernicus.eu/api/v2
    key: <your-api-key-here>
    
  5. Make sure the file has the correct permissions: chmod 600 ~/.cdsapirc
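
The last two steps can be verified with a short script. This is a sketch under the assumption that the file uses the simple "name: value" layout shown above; the function name check_cdsapirc is hypothetical. It reads the url and key entries and reports whether the file is readable by the owner only, which is what chmod 600 achieves on POSIX systems.

```python
import stat
from pathlib import Path


# Hypothetical helper, not part of climdata or the cdsapi package.
def check_cdsapirc(path):
    """Parse 'name: value' lines and report whether only the owner can access the file."""
    path = Path(path)
    settings = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        if ":" in line:
            # Split at the first colon only, so URLs keep their own colons.
            name, _, value = line.partition(":")
            settings[name.strip()] = value.strip()
    for name in ("url", "key"):
        if not settings.get(name):
            raise ValueError(f"{path} has no '{name}' entry")
    mode = stat.S_IMODE(path.stat().st_mode)
    # No permission bits for group or others means the file is private (e.g. 0o600).
    owner_only = (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
    return settings, owner_only


if __name__ == "__main__":
    rc_file = Path.home() / ".cdsapirc"
    if rc_file.exists():
        settings, owner_only = check_cdsapirc(rc_file)
        print("url:", settings["url"])
        print("permissions OK" if owner_only else "run: chmod 600 ~/.cdsapirc")
    else:
        print(f"{rc_file} not found")
```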
