This project automates the fetching and extraction of weather data from multiple sources — such as MSWX, DWD HYRAS, ERA5-Land, NASA-NEX-GDDP, and more — for a given location and time range.

climdata

📦 Data Sources

This project draws on climate and weather datasets from the following sources:

  • DWD Station Data
    Retrieved using the DWD API. Provides high-resolution observational data from Germany's national meteorological service.

  • MSWX (Multi-Source Weather)
    Accessed via GloH2O's Google Drive. Combines multiple satellite and reanalysis datasets for global gridded weather variables.

  • DWD HYRAS
    Downloaded from the DWD Open Data FTP Server. Offers gridded observational data for Central Europe, useful for hydrological applications.

  • ERA5, ERA5-Land
    Accessed through the Google Earth Engine. Provides reanalysis datasets from ECMWF with high temporal and spatial resolution.

  • NASA NEX-GDDP
    Also retrieved via Earth Engine. Downscaled CMIP5/CMIP6 climate projections developed by NASA for local-scale impact assessment.

  • CMIP6
    Obtained using ESGPull from the ESGF data nodes. Includes multi-model climate simulations following various future scenarios.

It supports:

✅ Automatic file download (e.g., from Google Drive or online servers)
✅ Flexible configuration via config.yaml
✅ Time series extraction for a user-specified latitude/longitude
✅ Batch processing for many locations from a CSV file

🚀 How to Run and Explore Configurations

✅ Run a download job with custom overrides

You can run the data download script and override any configuration value directly on the command line using Hydra.

For example, to download ERA5-Land data for January 1–4, 2020, run:

python download_location.py dataset='era5-land' \
  time_range.start_date='2020-01-01' \
  time_range.end_date='2020-01-04' \
  location.lat=52.5200 \
  location.lon=13.4050

To download data for multiple locations listed in a CSV file named locations.csv, run:

python download_csv.py dataset='era5-land' \
  time_range.start_date='2020-01-01' \
  time_range.end_date='2020-01-04'

An example locations.csv:

lat,lon,city
52.5200,13.4050,berlin
48.1351,11.5820,munich
53.5511,9.9937,hamburg
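
The batch script presumably reads one location per row from this file. As a minimal sketch of that step (not the project's actual code; `read_locations` is a hypothetical helper), the standard-library csv module can parse the lat, lon, and city columns and reject out-of-range coordinates:

```python
import csv
import io


def read_locations(handle):
    """Parse rows with lat, lon and city columns into a list of dicts.

    Hypothetical helper for illustration; download_csv.py may parse the
    file differently.
    """
    locations = []
    for row in csv.DictReader(handle):
        lat = float(row["lat"])
        lon = float(row["lon"])
        # Reject coordinates outside the valid geographic range.
        if not -90.0 <= lat <= 90.0 or not -180.0 <= lon <= 180.0:
            raise ValueError(f"coordinates out of range: {row}")
        locations.append({"city": row["city"], "lat": lat, "lon": lon})
    return locations


# The example file from above, inlined so the sketch runs on its own.
EXAMPLE_CSV = """lat,lon,city
52.5200,13.4050,berlin
48.1351,11.5820,munich
53.5511,9.9937,hamburg
"""

locations = read_locations(io.StringIO(EXAMPLE_CSV))
for loc in locations:
    print(f"{loc['city']}: lat={loc['lat']}, lon={loc['lon']}")
```

To read a real file instead, pass `open("locations.csv", newline="")` as the handle.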

What these commands do:

  • dataset='era5-land' tells the script which dataset to use.
  • time_range.start_date and time_range.end_date override the default dates in your YAML config.
  • All other settings use your existing config.yaml in the conf folder.
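
To make the override semantics concrete, here is a minimal sketch in plain Python of how dotted `key=value` overrides replace individual values in a nested configuration while leaving everything else untouched. It is a simplified illustration, not Hydra's implementation, and the values in `DEFAULTS` are hypothetical stand-ins for whatever your config.yaml contains.

```python
import copy

# Hypothetical defaults standing in for conf/config.yaml.
DEFAULTS = {
    "dataset": "mswx",
    "time_range": {"start_date": "2000-01-01", "end_date": "2000-12-31"},
    "location": {"lat": 0.0, "lon": 0.0},
}


def apply_overrides(config, overrides):
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        value = raw_value.strip("'\"")  # drop optional shell-style quotes
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node[name]  # KeyError if the path is not in the config
        # Keep the type of the value being replaced (e.g. float for lat).
        node[leaf] = type(node[leaf])(value)
    return result


config = apply_overrides(
    DEFAULTS,
    [
        "dataset='era5-land'",
        "time_range.start_date='2020-01-01'",
        "time_range.end_date='2020-01-04'",
        "location.lat=52.5200",
    ],
)
print(config)
```

Keys that are not overridden (here `location.lon`) keep their default values, which mirrors the behaviour described in the last bullet.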

✅ List all available datasets defined in your configuration

To see which datasets are available (without running the downloader), you can dump the composed configuration and filter it using yq.

Run:

python download_location.py --cfg job | yq '.mappings | keys'

What this does:

  • --cfg job tells Hydra to print the composed job configuration and exit without running the downloader. Add --resolve if you also want interpolations resolved.
  • | yq '.mappings | keys' filters the output to show only the dataset names defined under the mappings section.

⚡️ Tip

  • Make sure yq is installed:

    brew install yq   # macOS (Go implementation)
    # OR
    pip install yq    # Python wrapper around jq; requires jq to be installed
    
  • To see available variables for a specific dataset (for example mswx), run:

    python download_location.py --cfg job | yq '.mappings.mswx.variables | keys'
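
The two yq filters assume that the configuration has a `mappings` section keyed by dataset name, each with a `variables` sub-section. The same lookups expressed on a Python dictionary look roughly like this; the dataset entries and variable names below are hypothetical placeholders, not the project's real mappings.

```python
# Hypothetical excerpt of the composed configuration, as a nested dict.
config = {
    "mappings": {
        "era5-land": {"variables": {"tas": {}, "pr": {}}},
        "mswx": {"variables": {"tasmin": {}, "tasmax": {}, "pr": {}}},
    }
}


def dataset_names(cfg):
    """Counterpart of the filter `.mappings | keys` (sorted for stable output)."""
    return sorted(cfg["mappings"])


def variable_names(cfg, dataset):
    """Counterpart of `.mappings.<dataset>.variables | keys` (sorted)."""
    return sorted(cfg["mappings"][dataset]["variables"])


print(dataset_names(config))
print(variable_names(config, "mswx"))
```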
    


⚙️ Key Features

  • Supports multiple weather data providers
  • Uses xarray for robust gridded data extraction
  • Handles curvilinear and rectilinear grids
  • Uses a Google Drive Service Account for secure downloads
  • Easily reproducible runs using Hydra
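
To illustrate what point extraction from a gridded dataset involves, the following sketch finds the grid cell nearest to a target coordinate when latitude and longitude are stored as 2D arrays, which is the curvilinear case (a rectilinear grid is the special case where every row shares the same longitudes and every column the same latitudes). It uses plain Python and a brute-force great-circle search for clarity rather than xarray, and `nearest_cell` and the toy grid are hypothetical, so treat it as a conceptual sketch rather than the project's implementation.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_cell(lat2d, lon2d, lat, lon):
    """Return (row, col) of the grid cell closest to (lat, lon)."""
    best = None
    best_dist = math.inf
    for i, (lat_row, lon_row) in enumerate(zip(lat2d, lon2d)):
        for j, (cell_lat, cell_lon) in enumerate(zip(lat_row, lon_row)):
            dist = haversine_km(lat, lon, cell_lat, cell_lon)
            if dist < best_dist:
                best, best_dist = (i, j), dist
    return best


# Toy 3x3 grid over Germany; each cell stores its own latitude and longitude.
lat2d = [[54.0, 54.0, 54.0], [52.5, 52.5, 52.5], [51.0, 51.0, 51.0]]
lon2d = [[10.0, 12.0, 14.0], [10.1, 12.1, 14.1], [10.2, 12.2, 14.2]]
# One time step of a made-up variable on the same grid.
values = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

row, col = nearest_cell(lat2d, lon2d, 52.52, 13.405)  # Berlin
print(row, col, values[row][col])
```

A time series for the location is then the value at `(row, col)` for every time step. On large grids a vectorised or tree-based search would replace the double loop.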

📡 Google Drive API Setup

This project uses the Google Drive API with a Service Account to securely download weather data files from a shared Google Drive folder.

Follow these steps to set it up correctly:


✅ 1. Create a Google Cloud Project

  • Go to Google Cloud Console.
  • Click “Select Project” → “New Project”.
  • Enter a project name (e.g. WeatherDataDownloader).
  • Click “Create”.

✅ 2. Enable the Google Drive API

  • In the left sidebar, go to APIs & Services → Library.
  • Search for “Google Drive API”.
  • Click it, then click “Enable”.

✅ 3. Create a Service Account

  • Go to IAM & Admin → Service Accounts.
  • Click “Create Service Account”.
  • Enter a name (e.g. weather-downloader-sa).
  • Click “Create and Continue”. You can skip assigning roles for read-only Drive access.
  • Click “Done” to finish.

✅ 4. Create and Download a JSON Key

  • After creating the Service Account, click on its email address to open its details.
  • Go to the “Keys” tab.
  • Click “Add Key” → “Create new key” → choose JSON → click “Create”.
  • A .json key file will download automatically. Store it securely!

✅ 5. Store the JSON Key Securely

  • Place the downloaded .json key in the conf folder with the name service.json.
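
Before running a download, it can help to confirm that the file really is a service-account key. The sketch below is an optional sanity check, not part of climdata: `check_service_account_key` is a hypothetical helper, and it relies only on field names that Google's JSON key files normally contain (`type`, `client_email`, `private_key`).

```python
import json
import os
import tempfile

REQUIRED_FIELDS = ("type", "client_email", "private_key")


def check_service_account_key(path):
    """Return the service account email if the key file looks valid."""
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    missing = [name for name in REQUIRED_FIELDS if name not in key]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    return key["client_email"]


# Demonstrate with a dummy key written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "service.json")
    with open(demo_path, "w", encoding="utf-8") as handle:
        json.dump(
            {
                "type": "service_account",
                "client_email": "weather-downloader-sa@example.iam.gserviceaccount.com",
                "private_key": "-----BEGIN PRIVATE KEY-----\n(dummy)\n",
            },
            handle,
        )
    email = check_service_account_key(demo_path)

print(email)
```

For the real key, call `check_service_account_key("conf/service.json")`, and keep that file out of version control.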

Setup Instructions for the ERA5 API

1. CDS API Key Setup

  1. Create a free account on the Copernicus Climate Data Store

  2. Once logged in, go to your user profile

  3. Click on the "Show API key" button

  4. Create the file ~/.cdsapirc with the following content. If your CDS profile page shows a different url value alongside the key, use that one:

    url: https://cds.climate.copernicus.eu/api/v2
    key: <your-api-key-here>
    
  5. Make sure the file has the correct permissions: chmod 600 ~/.cdsapirc
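
Steps 4 and 5 can also be scripted. The following sketch writes the two-line file and creates it with owner-only permissions in one go; `write_cdsapirc` is a hypothetical helper, and the demo writes to a temporary directory with a placeholder key instead of touching your real ~/.cdsapirc.

```python
import os
import stat
import tempfile


def write_cdsapirc(path, url, key):
    """Write a CDS API credentials file readable only by its owner."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # Mode 0o600 applies when the file is newly created.
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(f"url: {url}\nkey: {key}\n")
    # Also tighten permissions if the file already existed.
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, ".cdsapirc")
    write_cdsapirc(
        demo_path,
        url="https://cds.climate.copernicus.eu/api/v2",  # URL from step 4
        key="<your-api-key-here>",  # placeholder, not a real key
    )
    with open(demo_path, encoding="utf-8") as handle:
        content = handle.read()
    mode = stat.S_IMODE(os.stat(demo_path).st_mode)

print(content)
print(oct(mode))
```

For real use, pass `os.path.expanduser("~/.cdsapirc")` as the path together with your own url and key values.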
