Load publicly available electric load forecasting datasets as pandas DataFrames.

padelf

Easy pandas DataFrame access to publicly available electric load forecasting datasets

padelf provides a minimal Python API to download, cache, and standardize electric load forecasting datasets for research. Every dataset is returned as a pandas DataFrame with a UTC DatetimeIndex and a standardized consumption_kW column.

Installation

pip install padelf

Quick Start

import padelf

# See what's available
padelf.list_datasets()
# ['AEMO', 'ELD', 'ENTSO-E', 'GEFCOM12', 'IHPC', 'ISO-NE', 'NYISO', 'OPSD', 'Pecan-Street', 'RTE-France', 'VEA']

# Load a dataset — one line, sensible defaults
df = padelf.get_dataset("OPSD")
print(df.head())

Output:

                           consumption_kW  DE_solar_generation_actual  DE_wind_onshore_generation_actual
datetime
2015-01-01 00:00:00+00:00         41209.0                         NaN                             7568.0
2015-01-01 01:00:00+00:00         40029.0                         NaN                             7666.0
2015-01-01 02:00:00+00:00         38891.0                         NaN                             7637.0

What You Get

Every call to get_dataset() returns a DataFrame with:

  • DatetimeIndex: UTC timezone, equidistant at the dataset's native resolution
  • consumption_kW: load/consumption column, unit-converted to kilowatts
  • Additional columns: as available in the original dataset (e.g., temperature, solar generation)
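
The contract above can be checked programmatically. The following is a minimal sketch, not part of padelf: check_standardized is a hypothetical helper, and the DataFrame is a synthetic stand-in for the result of get_dataset() so that the example runs without a download.

```python
import pandas as pd


def check_standardized(df: pd.DataFrame) -> None:
    """Raise ValueError if df violates the documented output contract.

    Hypothetical helper for illustration; padelf does not ship this function.
    """
    idx = df.index
    if not isinstance(idx, pd.DatetimeIndex):
        raise ValueError("index must be a DatetimeIndex")
    if idx.tz is None or str(idx.tz) != "UTC":
        raise ValueError("index must be timezone-aware and in UTC")
    if "consumption_kW" not in df.columns:
        raise ValueError("missing consumption_kW column")
    # Equidistant means every pair of consecutive timestamps has the same spacing.
    if idx.to_series().diff().dropna().nunique() > 1:
        raise ValueError("index is not equidistant")


# Synthetic stand-in for a frame returned by padelf.get_dataset(...)
index = pd.date_range("2015-01-01", periods=4, freq="h", tz="UTC")
df = pd.DataFrame({"consumption_kW": [41209.0, 40029.0, 38891.0, 38000.0]}, index=index)

check_standardized(df)  # passes silently for a conforming frame
```

A check like this can be useful in a test suite or at the start of a forecasting pipeline, where a silently irregular index tends to cause hard-to-trace errors later.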

Optional Parameters

df = padelf.get_dataset(
        "OPSD",
        resolution="15min",       # Resample to 15-minute intervals
        consumption_unit="MW",    # Keep original MW units
        interpolate_limit="4h",   # Fill gaps up to 4 hours
        cache_dir="/tmp/padelf",  # Custom cache location
)
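
The interpolate_limit parameter bounds how long a gap may be and still get filled. How padelf implements this internally is not shown here; the sketch below illustrates one plausible interpretation in plain pandas. The helper interpolate_short_gaps and the convention that a gap's length is its number of missing samples times the step are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def interpolate_short_gaps(s: pd.Series, max_gap: str) -> pd.Series:
    """Linearly fill NaN runs no longer than max_gap; leave longer runs as NaN.

    Hypothetical helper, not padelf's implementation. Assumes an equidistant index.
    """
    step = s.index[1] - s.index[0]
    max_steps = int(pd.Timedelta(max_gap) / step)
    is_nan = s.isna()
    # Give every run of consecutive equal values in is_nan its own id.
    run_id = is_nan.astype(int).diff().ne(0).cumsum()
    # Length of the NaN run each sample belongs to (0 for valid samples).
    run_len = is_nan.groupby(run_id).transform("sum")
    filled = s.interpolate(method="linear", limit_area="inside")
    # Keep interpolated values only where the gap was short enough.
    return filled.where(~is_nan | (run_len <= max_steps), s)


index = pd.date_range("2015-01-01", periods=8, freq="h", tz="UTC")
s = pd.Series([0.0, 1.0, np.nan, 3.0, np.nan, np.nan, np.nan, 7.0], index=index)

# The one-hour gap is filled; the three-hour gap exceeds the limit and stays NaN.
out = interpolate_short_gaps(s, "2h")
```

The run length is checked explicitly because pandas' own interpolate(limit=n) fills the first n samples of a longer gap rather than skipping the gap entirely.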

Available Datasets

Dataset                                 Abbreviation  Resolution  Region     Status
Open Power System Data                  OPSD          60 min      Europe     ✅ Ready
Individual Household Power Consumption  IHPC          1 min       France     ✅ Ready
ElectricityLoadDiagrams20112014         ELD           15 min      Portugal   ✅ Ready
5359 industrial VEA load profiles       VEA           15 min      Germany    ✅ Ready
GEFCom 2012                             GEFCOM12      60 min      US         ⏩ Planned
ENTSO-E Transparency                    ENTSO-E       60 min      Europe     ⏩ Planned
ISO New England                         ISO-NE        60 min      US         ⏩ Planned
NYISO                                   NYISO         5 min       US         ⏩ Planned
AEMO                                    AEMO          60 min      Australia  ⏩ Planned
RTE France                              RTE-France    30 min      France     ⏩ Planned
Pecan Street                            Pecan-Street  15 min      US         ⏩ Planned

Repository Structure

The project uses a src layout with per-dataset YAML configs:

.
├── README.md
├── pyproject.toml                # Build config (hatchling backend)
├── mkdocs.yml                    # Documentation site config
├── LICENSE
├── src/padelf/
│   ├── __init__.py               # Public API: list_datasets(), get_dataset()
│   ├── loader.py                 # Core loader logic: download, cache, parse, standardize
│   ├── utils.py                  # Unit conversion, gap interpolation, resampling
│   └── configs/
│       ├── _template.yaml        # Template for new loader configs
│       ├── OPSD.yaml             # Ready
│       ├── IHPC.yaml             # Ready
│       ├── ELD.yaml              # Ready
│       ├── VEA.yaml              # Ready
│       ├── GEFCOM12.yaml         # Ready (source URL intermittent)
│       ├── ENTSO-E.yaml          # API placeholder
│       ├── ISO-NE.yaml           # API placeholder
│       ├── NYISO.yaml            # API placeholder
│       ├── AEMO.yaml             # API placeholder
│       ├── RTE-France.yaml       # API placeholder
│       └── Pecan-Street.yaml     # API placeholder
├── docs/                         # mkdocs source files
│   ├── index.md
│   ├── getting-started.md
│   ├── api.md
│   └── datasets.md
└── tests/
    ├── test_loader.py
    ├── test_utils.py
    └── test_smoke.py

How It Works

The loader architecture follows a per-dataset config pattern. Each YAML file in src/padelf/configs/ defines a dataset's download URL, file format, column mappings, unit, and preprocessing parameters. When get_dataset() is called, loader.py reads the corresponding config, downloads the file (or uses a local cache), parses it, and applies standardization via utils.py:

  • The load column is renamed to consumption_kW with automatic unit conversion (MW, kWh, MWh to kW).
  • The index is converted to an equidistant UTC DatetimeIndex.
  • Gaps of up to 2 hours are interpolated by default.
  • Optional resampling is applied if requested.

Datasets flagged with requires_api: true in their config raise NotImplementedError with a descriptive message. These are placeholders for future implementation.
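
The standardization steps can be sketched in plain pandas. This is an illustrative sketch under stated assumptions, not the code in loader.py or utils.py: the config keys, the column name load_MW, and the functions to_kw and standardize are all hypothetical, and the treatment of energy units (energy per interval divided by the interval length) is an assumption about what the conversion means.

```python
import pandas as pd

# Hypothetical config; the real YAML schema in src/padelf/configs/ may differ.
config = {
    "load_column": "load_MW",
    "unit": "MW",
    "timezone": "Europe/Berlin",
    "resolution": "60min",
}


def to_kw(values: pd.Series, unit: str, step: pd.Timedelta) -> pd.Series:
    """Convert power (kW, MW) or per-interval energy (kWh, MWh) to kW."""
    hours = step / pd.Timedelta("1h")
    # Assumption: energy values are totals per interval, so the average
    # power over the interval is energy divided by the interval length.
    factors = {"kW": 1.0, "MW": 1000.0, "kWh": 1.0 / hours, "MWh": 1000.0 / hours}
    return values * factors[unit]


def standardize(raw: pd.DataFrame, cfg: dict) -> pd.DataFrame:
    step = pd.Timedelta(cfg["resolution"])
    df = raw.rename(columns={cfg["load_column"]: "consumption_kW"})
    df["consumption_kW"] = to_kw(df["consumption_kW"], cfg["unit"], step)
    # Localize naive timestamps to the source timezone, then convert to UTC.
    if df.index.tz is None:
        df.index = df.index.tz_localize(cfg["timezone"])
    df.index = df.index.tz_convert("UTC")
    df.index.name = "datetime"
    # Reindex onto an equidistant grid; missing timestamps become NaN rows.
    return df.asfreq(cfg["resolution"])


# Synthetic raw data with the 02:00 local-time sample missing.
raw = pd.DataFrame(
    {"load_MW": [41.0, 40.0, 38.0]},
    index=pd.to_datetime(["2015-01-01 00:00", "2015-01-01 01:00", "2015-01-01 03:00"]),
)
out = standardize(raw, config)
```

Gap interpolation and optional resampling would follow as further steps on the equidistant frame.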

Adding a New Loader

See the Loader Developer Guide for details on the loader architecture and how to add new datasets.

API Placeholder Pattern

Six datasets (ENTSO-E, ISO-NE, NYISO, AEMO, RTE-France, Pecan-Street) are currently configured as API placeholders. Their YAML configs exist with requires_api: true, and calling get_dataset() on them raises NotImplementedError. To convert a placeholder into a working loader, remove the requires_api flag and either provide a direct download URL or implement API-specific download logic in loader.py. Note that ENTSO-E and ISO-NE have direct CSV downloads available and could be implemented as file-based loaders without API integration.
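
The placeholder check described above might look roughly like the following. This is a sketch of the pattern only: resolve_source is a hypothetical function, the config dictionaries stand in for parsed YAML files, and the URL is a placeholder rather than a real data source.

```python
def resolve_source(name: str, cfg: dict) -> str:
    """Return the download URL for a dataset, or fail for API placeholders.

    Hypothetical illustration of the requires_api pattern; not padelf's code.
    """
    if cfg.get("requires_api", False):
        raise NotImplementedError(
            f"Dataset '{name}' requires API access, which is not implemented yet."
        )
    return cfg["url"]


# Stand-ins for parsed YAML configs (keys other than requires_api are assumed).
placeholder_cfg = {"requires_api": True}
file_cfg = {"url": "https://example.org/data.csv"}  # placeholder URL

print(resolve_source("SOME-FILE-DATASET", file_cfg))

try:
    resolve_source("ENTSO-E", placeholder_cfg)
except NotImplementedError as err:
    print(f"Skipped: {err}")
```

Callers that iterate over list_datasets() can catch NotImplementedError in the same way to skip datasets that are not yet available.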

Original Catalog

To explore more datasets, check out the original PADELF Repository.

Citation

If this work has helped you with your scientific work, we would appreciate a proper mention. ❤️

@inproceedings{baur2024datasets,
    title     = {Publicly Available Datasets For Electric Load Forecasting -- An Overview},
    author    = {Baur, Lukas and Chandramouli, Vignesh and Sauer, Alexander},
    booktitle = {6th Conference on Production Systems and Logistics (CPSL 2024)},
    year      = {2024},
    doi       = {10.15488/17659}
}

License

MIT
