Load publicly available electric load forecasting datasets as pandas DataFrames.

padelf

Easy pandas DataFrame access to publicly available electric load forecasting datasets

padelf provides a minimal Python API to download, cache, and standardize electric load forecasting datasets for research. Every dataset is returned as a pandas DataFrame with a UTC DatetimeIndex and a standardized consumption_kW column.

Installation

pip install padelf

Quick Start

import padelf

# See what's available
padelf.list_datasets()
# ['AEMO', 'ELD', 'ENTSO-E', 'GEFCOM12', 'IHPC', 'ISO-NE', 'NYISO', 'OPSD', 'Pecan-Street', 'RTE-France', 'VEA']

# Load a dataset — one line, sensible defaults
df = padelf.get_dataset("OPSD")
print(df.head())

Output:

                           consumption_kW  DE_solar_generation_actual  DE_wind_onshore_generation_actual
datetime
2015-01-01 00:00:00+00:00         41209.0                         NaN                             7568.0
2015-01-01 01:00:00+00:00         40029.0                         NaN                             7666.0
2015-01-01 02:00:00+00:00         38891.0                         NaN                             7637.0

What You Get

Every call to get_dataset() returns a DataFrame with:

  • DatetimeIndex: UTC timezone, equidistant at the dataset's native resolution
  • consumption_kW: the load/consumption column, unit-converted to kilowatts
  • Additional columns: as available in the original dataset (e.g., temperature, solar generation)
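
These guarantees can be checked on any frame before it goes into a model. The sketch below is not part of padelf: check_standard_frame is a hypothetical helper written only for illustration, and the demo frame is synthetic so that the example runs without downloading anything.

```python
import pandas as pd


def check_standard_frame(df: pd.DataFrame) -> None:
    """Raise ValueError if df does not match the layout described above.

    Hypothetical helper for illustration; not part of the padelf API.
    """
    if not isinstance(df.index, pd.DatetimeIndex):
        raise ValueError("index is not a DatetimeIndex")
    if df.index.tz is None or str(df.index.tz) != "UTC":
        raise ValueError("index is not in UTC")
    if "consumption_kW" not in df.columns:
        raise ValueError("missing consumption_kW column")
    # Equidistant: every pair of consecutive timestamps is the same distance apart.
    if len(df.index) > 1 and df.index.to_series().diff().dropna().nunique() != 1:
        raise ValueError("index is not equidistant")


# Synthetic stand-in for the result of padelf.get_dataset(...)
index = pd.date_range("2015-01-01", periods=4, freq="h", tz="UTC")
demo = pd.DataFrame({"consumption_kW": [41209.0, 40029.0, 38891.0, 38000.0]}, index=index)
check_standard_frame(demo)  # passes silently
```

A check like this is cheap to run at the top of a training script, where a silently shifted timezone or a missing timestamp would otherwise only show up as a worse forecast.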

Optional Parameters

df = padelf.get_dataset(
    "OPSD",
    resolution="15min",       # Resample to 15-minute intervals
    consumption_unit="MW",    # Keep original MW units
    interpolate_limit="4h",   # Fill gaps up to 4 hours
    cache_dir="/tmp/padelf",  # Custom cache location
)
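
How padelf applies interpolate_limit internally is not shown in this README. As a rough mental model, and assuming nothing about the library's internals, a duration-limited gap fill can be written with plain pandas as below. fill_short_gaps is a hypothetical name, and the sketch treats a gap of k missing points at step s as having length k * s.

```python
import pandas as pd


def fill_short_gaps(series: pd.Series, max_gap: str) -> pd.Series:
    """Interpolate NaN runs no longer than max_gap; leave longer runs as NaN.

    Assumes an equidistant DatetimeIndex with at least two points.
    Hypothetical helper for illustration; not padelf code.
    """
    step = series.index[1] - series.index[0]
    max_points = int(pd.Timedelta(max_gap) / step)
    is_gap = series.isna()
    # Label each run of consecutive equal flags, then measure the NaN runs.
    run_id = is_gap.astype(int).diff().ne(0).cumsum()
    run_length = is_gap.astype(int).groupby(run_id).transform("sum")
    filled = series.interpolate(method="time", limit_area="inside")
    # Keep interpolated values only where the gap is short enough.
    return filled.where(~is_gap | (run_length <= max_points), series)


index = pd.date_range("2015-01-01", periods=8, freq="h", tz="UTC")
load_kw = pd.Series([10.0, None, 30.0, None, None, None, 70.0, 80.0], index=index)

filled = fill_short_gaps(load_kw, max_gap="2h")
print(filled)  # the 1-hour gap is filled, the 3-hour gap stays NaN
```

Plain Series.interpolate(limit=n) would instead fill the first n points of a longer gap and leave the rest, which is why the sketch measures each gap before deciding whether to fill it.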

Available Datasets

Dataset                                 Abbreviation  Resolution  Region     Status
Open Power System Data                  OPSD          60 min      Europe     Ready
Individual Household Power Consumption  IHPC          1 min       France     Ready
ElectricityLoadDiagrams20112014         ELD           15 min      Portugal   Ready
5359 industrial VEA load profiles       VEA           15 min      Germany    Ready
GEFCom 2012                             GEFCOM12      60 min      US         Planned
ENTSO-E Transparency                    ENTSO-E       60 min      Europe     Planned
ISO New England                         ISO-NE        60 min      US         Planned
NYISO                                   NYISO         5 min       US         Planned
AEMO                                    AEMO          60 min      Australia  Planned
RTE France                              RTE-France    30 min      France     Planned
Pecan Street                            Pecan-Street  15 min      US         Planned

Repository Structure

The project uses a src layout with per-dataset YAML configs:

.
├── README.md
├── pyproject.toml                # Build config (hatchling backend)
├── mkdocs.yml                    # Documentation site config
├── LICENSE
├── src/padelf/
│   ├── __init__.py               # Public API: list_datasets(), get_dataset()
│   ├── loader.py                 # Core loader logic: download, cache, parse, standardize
│   ├── utils.py                  # Unit conversion, gap interpolation, resampling
│   └── configs/
│       ├── _template.yaml        # Template for new loader configs
│       ├── OPSD.yaml             # Ready
│       ├── IHPC.yaml             # Ready
│       ├── ELD.yaml              # Ready
│       ├── VEA.yaml              # Ready
│       ├── GEFCOM12.yaml         # Ready (source URL intermittent)
│       ├── ENTSO-E.yaml          # API placeholder
│       ├── ISO-NE.yaml           # API placeholder
│       ├── NYISO.yaml            # API placeholder
│       ├── AEMO.yaml             # API placeholder
│       ├── RTE-France.yaml       # API placeholder
│       └── Pecan-Street.yaml     # API placeholder
├── docs/                         # mkdocs source files
│   ├── index.md
│   ├── getting-started.md
│   ├── api.md
│   └── datasets.md
└── tests/
    ├── test_loader.py
    ├── test_utils.py
    └── test_smoke.py

How It Works

The loader architecture follows a per-dataset config pattern. Each YAML file in src/padelf/configs/ defines a dataset's download URL, file format, column mappings, unit, and preprocessing parameters. When get_dataset() is called, loader.py reads the corresponding config, downloads the file (or reuses the locally cached copy), parses it, and standardizes the result via utils.py:

  • The load column is renamed to consumption_kW, with automatic unit conversion (MW, kWh, MWh to kW).
  • The index is converted to an equidistant UTC DatetimeIndex.
  • Gaps of up to 2 hours are interpolated by default.
  • Optional resampling is applied if requested.

Datasets flagged with requires_api: true in their config raise NotImplementedError with a descriptive message; these are placeholders for future implementation.
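
The unit conversion mentioned above mixes power units (MW) with energy units (kWh, MWh), and the latter only become kilowatts once the length of the metering interval is known. The sketch below shows that arithmetic in plain Python. It is an illustration under the assumption that energy readings are converted to average power over their interval; to_kw and its parameters are hypothetical and do not describe the actual code in utils.py.

```python
from datetime import timedelta

# Scale factors from each supported unit to its kilo-unit (kW or kWh).
_TO_KILO = {"kW": 1.0, "MW": 1000.0, "kWh": 1.0, "MWh": 1000.0}
_ENERGY_UNITS = {"kWh", "MWh"}


def to_kw(value: float, unit: str, interval: timedelta) -> float:
    """Convert one reading to (average) kilowatts over its interval.

    Hypothetical helper for illustration; not part of padelf.
    """
    if unit not in _TO_KILO:
        raise ValueError(f"unsupported unit: {unit!r}")
    kilo = value * _TO_KILO[unit]
    if unit in _ENERGY_UNITS:
        # Energy per interval -> average power: divide by interval length in hours.
        hours = interval / timedelta(hours=1)
        return kilo / hours
    return kilo


print(to_kw(1.5, "MW", timedelta(hours=1)))      # 1500.0
print(to_kw(2.5, "kWh", timedelta(minutes=15)))  # 10.0
print(to_kw(0.5, "MWh", timedelta(minutes=30)))  # 1000.0
```

The second call illustrates why the interval matters: 2.5 kWh consumed within 15 minutes corresponds to an average draw of 10 kW, not 2.5 kW.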

Adding a New Loader

See the Loader Developer Guide for details on the loader architecture and how to add new datasets.

API Placeholder Pattern

Six datasets (ENTSO-E, ISO-NE, NYISO, AEMO, RTE-France, Pecan-Street) are currently configured as API placeholders. Their YAML configs exist with requires_api: true, and calling get_dataset() on them raises NotImplementedError. To convert a placeholder into a working loader, remove the requires_api flag and either provide a direct download URL or implement API-specific download logic in loader.py. Note that ENTSO-E and ISO-NE have direct CSV downloads available and could be implemented as file-based loaders without API integration.
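
The placeholder behavior described above can be pictured as a simple guard on the parsed config. The following is a minimal sketch, not the code in loader.py: resolve_source is a hypothetical function, configs are shown as plain dicts instead of YAML files, and the url key and the example.com address are assumptions made for illustration. Only the requires_api flag is taken from this README.

```python
def resolve_source(name: str, config: dict) -> str:
    """Return the download URL for a dataset, or refuse if it is an API placeholder.

    Hypothetical helper for illustration; not part of padelf.
    """
    if config.get("requires_api", False):
        raise NotImplementedError(
            f"Dataset {name!r} requires API access, which is not implemented yet."
        )
    return config["url"]  # "url" is an assumed key name


configs = {
    "OPSD": {"url": "https://example.com/opsd.csv"},  # placeholder URL, not the real source
    "NYISO": {"requires_api": True},
}

print(resolve_source("OPSD", configs["OPSD"]))
try:
    resolve_source("NYISO", configs["NYISO"])
except NotImplementedError as err:
    print(err)
```

Under this model, converting a placeholder into a file-based loader amounts to dropping the flag and supplying a URL, which matches the conversion steps described above.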

Original Catalog

To explore more datasets, check out the original PADELF Repository.

Citation

If padelf has helped your research, we would appreciate a citation. ❤️

@inproceedings{baur2024datasets,
    title     = {Publicly Available Datasets For Electric Load Forecasting -- An Overview},
    author    = {Baur, Lukas and Chandramouli, Vignesh and Sauer, Alexander},
    booktitle = {6th Conference on Production Systems and Logistics (CPSL 2024)},
    year      = {2024},
    doi       = {10.15488/17659}
}

License

MIT
