TopoStateGrid
TopoStateGrid is a physically informed graph construction method that converts power-grid topology, component attributes, and operating-state variables into machine-learning-ready graph datasets.
The Python package import name is topostategrid.
Scope
TopoStateGrid focuses on physically grounded, state-dependent, and optionally time-indexed graph dataset construction for power-system machine learning. PowerGraph can be used as a reference dataset, and pandapower can be used as a parsing or simulation tool, but the main output is a reusable graph-construction pipeline.
This prototype does not build a GNN model, does not implement a cascading-failure simulator, and does not claim to be the first power-grid graph dataset tool.
Graph Definition
Each graph sample represents:
G_t = (V, E, X_t, A_t, y_t)
where:
- V are bus nodes.
- E are physical line and transformer branches.
- X_t contains node features for scenario or time t.
- A_t contains edge features for scenario or time t.
- y_t is an optional label.
For the MVP, TopoStateGrid builds a homogeneous bus-branch graph and exports a PyTorch Geometric Data object with:
- data.x
- data.edge_index
- data.edge_attr
- data.y, optional label value; unlabeled graphs use data.has_label=False with placeholder label tensors for PyG batching
- data.network_id
- data.sample_id
- data.timestamp, optional
- data.scenario_id, optional
- data.contingency_id, optional
- data.metadata, a JSON string for source-specific metadata
Edges are stored bidirectionally so message passing can use both branch directions.
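As a rough illustration of bidirectional storage, the sketch below doubles a list of branches into a two-row edge index in plain Python. The helper name to_bidirectional is hypothetical and not part of the package, and reusing the same feature row for the reverse direction is an assumption; direction-dependent features such as pf and pt may need swapping in a real pipeline.

```python
def to_bidirectional(branches, branch_features):
    """Store every branch in both directions (hypothetical helper).

    branches: list of (from_idx, to_idx) pairs with 0-based node indices.
    branch_features: one feature list per branch, in the same order.
    Returns (edge_index, edge_attr) where edge_index is [senders, receivers].
    """
    if len(branches) != len(branch_features):
        raise ValueError("branches and branch_features must have the same length")
    senders, receivers, edge_attr = [], [], []
    for (u, v), feats in zip(branches, branch_features):
        # Forward direction u -> v.
        senders.append(u)
        receivers.append(v)
        edge_attr.append(list(feats))
        # Reverse direction v -> u; this sketch copies the same feature row.
        senders.append(v)
        receivers.append(u)
        edge_attr.append(list(feats))
    return [senders, receivers], edge_attr


edge_index, edge_attr = to_bidirectional(
    [(0, 1), (1, 2)],
    [[0.01, 0.05], [0.02, 0.06]],
)
print(edge_index)  # [[0, 1, 1, 2], [1, 0, 2, 1]]
```

Two physical branches become four directed edges, so the edge feature table has four rows as well.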
Source-specific metadata is stored as a JSON string rather than a Python dict so OPFData and MATPOWER graphs remain batchable together with PyTorch Geometric DataLoader.
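A minimal sketch of the idea, using only the standard json module: dictionaries with different keys all become plain strings, which is the uniform type that batching needs. The helper names and the example keys are hypothetical, not the package's API.

```python
import json


def encode_metadata(metadata):
    """Serialize a source-specific dict to a JSON string (hypothetical helper)."""
    # sort_keys gives a stable string; default=str covers values such as paths.
    return json.dumps(metadata, sort_keys=True, default=str)


def decode_metadata(text):
    """Recover the dict from the stored JSON string."""
    return json.loads(text)


# Example keys are made up for illustration.
opf_meta = encode_metadata({"source": "opfdata", "group": 0})
mp_meta = encode_metadata({"source": "matpower", "base_mva": 100.0})

# Both samples now carry the same attribute type, regardless of their keys.
print(type(opf_meta).__name__, type(mp_meta).__name__)  # str str
print(decode_metadata(mp_meta)["base_mva"])  # 100.0
```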
Supported Inputs
Supported input sources in v1.1:
- OPFData JSON
- MATPOWER / PGLib .m
- pandapower net object
- pandas DataFrame tables
- CSV tables
Current working input paths:
- Extracted OPFData JSON samples under data/opfdata/**/group_*/example_*.json
- Static MATPOWER/PGLib .m files with mpc.bus and mpc.branch tables
The MATPOWER parser accepts common matrix syntax: comma-delimited or whitespace-delimited rows, semicolons, % comments, scientific notation, multi-line matrices, and explicit empty matrices such as mpc.branch = [ ];. Missing required mpc.bus or mpc.branch declarations raise ValueError; an explicitly present empty mpc.branch is allowed for isolated-bus fixtures.
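The accepted syntax can be illustrated with a small regex-based reader. This is a simplified sketch written for this description, not the package's parser; the function name parse_mpc_matrix and the sample case text are assumptions.

```python
import re


def parse_mpc_matrix(text, name):
    """Parse `mpc.<name> = [ ... ];` into a list of float rows (sketch only)."""
    # Drop MATLAB-style % comments up to the end of each line.
    no_comments = re.sub(r"%[^\n]*", "", text)
    match = re.search(
        r"mpc\." + re.escape(name) + r"\s*=\s*\[(.*?)\]",
        no_comments,
        re.DOTALL,
    )
    if match is None:
        raise ValueError(f"missing required mpc.{name} declaration")
    rows = []
    # Rows end at a semicolon or a newline; commas and whitespace split columns.
    for chunk in re.split(r"[;\n]", match.group(1)):
        tokens = chunk.replace(",", " ").split()
        if tokens:
            rows.append([float(tok) for tok in tokens])
    return rows


SAMPLE = """
mpc.bus = [
    1, 3, 0.0, 0.0;   % slack bus
    2  1  1.5e0  0.4;
];
mpc.branch = [ ];
"""

print(parse_mpc_matrix(SAMPLE, "bus"))     # [[1.0, 3.0, 0.0, 0.0], [2.0, 1.0, 1.5, 0.4]]
print(parse_mpc_matrix(SAMPLE, "branch"))  # []
```

An explicitly empty matrix yields an empty list, while a matrix that is never declared raises ValueError.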
The OPFData parser validates that JSON is well-formed and that grid.nodes.bus is present and non-empty. Malformed JSON and missing required fields raise ValueError with the source path included.
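The validation described above could look roughly like the following. The function name load_opfdata_text and the exact error wording are assumptions; only the checked field grid.nodes.bus and the ValueError behavior come from the description.

```python
import json


def load_opfdata_text(text, source_path):
    """Validate one OPFData JSON document given as text (sketch only)."""
    try:
        sample = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"{source_path}: malformed JSON ({exc.msg})") from exc
    try:
        buses = sample["grid"]["nodes"]["bus"]
    except (KeyError, TypeError) as exc:
        # TypeError covers documents whose top level is not a mapping.
        raise ValueError(f"{source_path}: missing required field grid.nodes.bus") from exc
    if not buses:
        raise ValueError(f"{source_path}: grid.nodes.bus is empty")
    return sample


good = '{"grid": {"nodes": {"bus": [[1.0, 1.0, 0.9, 1.1]]}}}'
sample = load_opfdata_text(good, "example_0.json")
print(len(sample["grid"]["nodes"]["bus"]))  # 1
```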
The local environment used for this prototype contains extracted OPFData samples for pglib_opf_case14_ieee and pglib_opf_case30_ieee, plus a static PGLib MATPOWER case for pglib_opf_case118_ieee.
pandapower support is optional. Install it with:
python -m pip install -e ".[pandapower]"
The pandapower converter supports bus nodes and line/transformer branch edges. For lines, rate_a uses max_i_ka as an approximate rating proxy when no direct MVA rating is available. The graph remains homogeneous bus-branch only.
Graph rendering support is also optional. Install it with:
python -m pip install -e ".[visual]"
The renderer writes GIF or MP4 files from existing graph samples for inspection. It does not simulate grid dynamics.
Features
Node features:
bus_status, bus_type, pd, qd, vm, va, vmax, vmin, normalized_demand
For OPFData, pd and qd are aggregated from load nodes through load_link edges. vm and va are read from solved bus states when available. Missing values are filled with zero after NaN-safe conversion.
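The aggregation step can be sketched as a scatter-add from load nodes to buses. The input layout used here, a list of (pd, qd) pairs and a list of (load_index, bus_index) links, is a simplifying assumption for illustration rather than the raw OPFData schema, and both helper names are hypothetical.

```python
import math


def nan_safe(value):
    """Convert to float, mapping None, non-numeric values, and NaN to 0.0."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return 0.0
    return 0.0 if math.isnan(number) else number


def aggregate_load_to_buses(num_buses, load_pq, load_links):
    """Sum load (pd, qd) onto the bus each load is linked to (sketch only)."""
    pd_per_bus = [0.0] * num_buses
    qd_per_bus = [0.0] * num_buses
    for load_idx, bus_idx in load_links:
        p, q = load_pq[load_idx]
        pd_per_bus[bus_idx] += nan_safe(p)
        qd_per_bus[bus_idx] += nan_safe(q)
    return pd_per_bus, qd_per_bus


pd_bus, qd_bus = aggregate_load_to_buses(
    num_buses=3,
    load_pq=[(1.0, 0.25), (0.5, 0.25), (float("nan"), 0.5)],
    load_links=[(0, 1), (1, 1), (2, 2)],
)
print(pd_bus)  # [0.0, 1.5, 0.0]
print(qd_bus)  # [0.0, 0.5, 0.5]
```

Buses without any linked load keep a demand of zero, and a NaN load value contributes zero instead of propagating.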
Edge features:
component_type, r, x, b_from, b_to, rate_a, pf, qf, pt, qt, loading_ratio, outage_flag
component_type is 0 for AC lines and 1 for transformers. OPFData solution flows are used when present. Static MATPOWER/PGLib cases include physical branch attributes, but solved flow fields are set to zero unless supplied by another source.
Static Topology vs Operating State
Topology and component attributes come from buses, lines, transformers, and branch parameters. Operating state comes from scenario-dependent demand, solved bus voltage, solved branch flow, and derived loading ratio.
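The exact loading-ratio formula is not stated here. One common choice, shown below purely as an assumption, divides the larger of the two end-of-branch apparent powers by rate_a and returns zero when no positive rating is available.

```python
import math


def loading_ratio(pf, qf, pt, qt, rate_a):
    """Apparent-power loading relative to rate_a (assumed definition)."""
    if rate_a is None or rate_a <= 0:
        # Assumption: a missing or zero rating means no usable limit.
        return 0.0
    s_from = math.hypot(pf, qf)  # apparent power at the from end
    s_to = math.hypot(pt, qt)    # apparent power at the to end
    return max(s_from, s_to) / rate_a


print(loading_ratio(3.0, 4.0, -3.0, -4.0, 10.0))  # 0.5
print(loading_ratio(3.0, 4.0, -3.0, -4.0, 0.0))   # 0.0
```

Under this definition the topology inputs (rate_a) stay fixed per branch while the flow inputs change per scenario, which matches the split between static attributes and operating state.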
For the same network, edge_index can remain fixed across scenarios while data.x and data.edge_attr vary by sample. This supports later supervised GNNs, contrastive or masked-feature self-supervision, and temporal forecasting when ordered timestamps are available.
Labels
topostategrid.labels.attach_stress_proxy_labels can attach temporary proxy labels:
risk_score = max_line_loading_ratio
y_cls = 1 if max_line_loading_ratio > 1.0 else 0
y_reg = risk_score
This is only a stress proxy for graph-construction experiments. It is not a real cascading-failure target.
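The three proxy quantities above can be reproduced in a few lines. The function below is a standalone sketch, not attach_stress_proxy_labels itself; the threshold parameter and the zero fallback for graphs without branches are assumptions.

```python
def stress_proxy_labels(loading_ratios, threshold=1.0):
    """Compute the proxy labels from per-branch loading ratios (sketch only)."""
    # Assumption: a graph with no branches gets a risk score of 0.0.
    risk_score = max(loading_ratios, default=0.0)
    return {
        "risk_score": risk_score,
        "y_cls": 1 if risk_score > threshold else 0,
        "y_reg": risk_score,
    }


print(stress_proxy_labels([0.4, 1.2, 0.9]))
# {'risk_score': 1.2, 'y_cls': 1, 'y_reg': 1.2}
print(stress_proxy_labels([0.4, 1.0])["y_cls"])  # 0
```

The comparison is strict, so a branch loaded at exactly 1.0 is not flagged.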
Proxy label attachment is in-place and will not overwrite existing data.y, data.y_cls, data.y_reg, or data.risk_score by default. Pass overwrite=True only when replacing existing labels is intentional.
Splits
Implemented split strategies:
- Random split
- Time-based split when timestamps exist, otherwise input order
- Leave-One-Network-Out split with create_lono_split(dataset, test_network="...")
LONO is useful for cross-topology evaluation, for example training on case14 and testing on case30 or case118.
Random and time-based splits require each positive-ratio split to receive at least one graph by default. Tiny datasets raise ValueError; pass allow_empty=True to permit empty splits. LONO raises ValueError when the test network is absent, when graph objects lack network_id, or when train/test would be empty.
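The LONO error conditions can be illustrated with a standalone function. This is a sketch under the assumption that each graph exposes a network_id attribute; lono_split_sketch is a hypothetical name and the returned dictionary layout is not necessarily what create_lono_split produces.

```python
from types import SimpleNamespace


def lono_split_sketch(graphs, test_network):
    """Hold out every graph of one network as the test set (sketch only)."""
    train_idx, test_idx = [], []
    for i, graph in enumerate(graphs):
        network_id = getattr(graph, "network_id", None)
        if network_id is None:
            raise ValueError(f"graph {i} has no network_id")
        if network_id == test_network:
            test_idx.append(i)
        else:
            train_idx.append(i)
    if not test_idx:
        raise ValueError(f"test network {test_network!r} is not in the dataset")
    if not train_idx:
        raise ValueError("train split would be empty")
    return {"train": train_idx, "test": test_idx}


# Stand-in objects; real samples would be PyTorch Geometric Data objects.
dataset = [
    SimpleNamespace(network_id="case14"),
    SimpleNamespace(network_id="case30"),
    SimpleNamespace(network_id="case14"),
]
print(lono_split_sketch(dataset, "case30"))  # {'train': [0, 2], 'test': [1]}
```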
Time-based splitting treats None, empty strings, and NaN-like timestamps as missing. It sorts only when all timestamps are valid and comparable; otherwise it falls back to input order. Temporal windows use the same timestamp rule by default through make_temporal_windows(..., sort_by_timestamp=True).
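The ordering rule can be sketched as follows: sort only when every timestamp is present and mutually comparable, and otherwise keep input order. The helper names are hypothetical, and the NaN check covers only float NaN, which is narrower than whatever "NaN-like" covers in the package.

```python
import math


def is_missing(timestamp):
    """Treat None, empty or blank strings, and float NaN as missing."""
    if timestamp is None:
        return True
    if isinstance(timestamp, str) and timestamp.strip() == "":
        return True
    if isinstance(timestamp, float) and math.isnan(timestamp):
        return True
    return False


def time_order(timestamps):
    """Return sample indices in time order, or input order as a fallback."""
    indices = list(range(len(timestamps)))
    if any(is_missing(ts) for ts in timestamps):
        return indices
    try:
        # sorted is stable, so equal timestamps keep their input order.
        return sorted(indices, key=lambda i: timestamps[i])
    except TypeError:
        # Mixed, non-comparable types such as int and str.
        return indices


print(time_order([3, 1, 2]))     # [1, 2, 0]
print(time_order([3, None, 1]))  # [0, 1, 2]
print(time_order([1, "a"]))      # [0, 1]
```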
Normalization
FeatureNormalizer fits node and edge feature statistics only on the training split, then transforms train/validation/test graphs using the same statistics. This avoids data leakage from validation or test graphs.
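The leakage-free pattern is shown below with a tiny z-score standardizer over plain lists. Whether FeatureNormalizer uses z-scores is not stated here, so the class name, the population standard deviation, and the fallback of 1.0 for constant columns are all assumptions.

```python
import math


class TrainOnlyStandardizer:
    """Fit per-column mean and std on training rows only (sketch only)."""

    def fit(self, train_rows):
        n = len(train_rows)
        if n == 0:
            raise ValueError("cannot fit on an empty training split")
        columns = list(zip(*train_rows))
        self.mean = [sum(col) / n for col in columns]
        # Constant columns get std 1.0 to avoid division by zero.
        self.std = [
            math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(columns, self.mean)
        ]
        return self

    def transform(self, rows):
        return [
            [(v - m) / s for v, m, s in zip(row, self.mean, self.std)]
            for row in rows
        ]


train_x = [[0.0, 5.0], [2.0, 5.0]]
test_x = [[3.0, 6.0]]

scaler = TrainOnlyStandardizer().fit(train_x)  # statistics come from train only
print(scaler.transform(train_x))  # [[-1.0, 0.0], [1.0, 0.0]]
print(scaler.transform(test_x))   # [[2.0, 1.0]]
```

The test rows never influence the mean or standard deviation; they are only transformed with the training statistics.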
Usage
Build one graph:
python examples/01_build_single_graph.py
Build multiple scenario graphs:
python examples/02_build_multiple_state_graphs.py
Create temporal windows over ordered samples:
python examples/03_create_temporal_windows.py
Create random, ordered, and LONO splits:
python examples/04_create_splits.py
Render a small graph-state sequence to GIF:
python examples/07_render_graph_animation.py
Render a 20-second GIF from pandapower's 300-bus benchmark:
python examples/08_render_large_pandapower_gif.py
Run tests:
python -m unittest discover -s tests -q
The tests are also compatible with pytest if it is installed.
Install optional test tooling with:
python -m pip install -e ".[test]"
On systems where the default matplotlib cache directory is not writable, use a writable cache directory for tests or rendering:
MPLCONFIGDIR=/private/tmp/topostategrid-mpl python -m unittest discover -s tests -q
On some macOS/conda environments, importing torch, torch_geometric, and other numeric packages in the same process may expose an OpenMP runtime conflict between binary dependencies. Prefer a clean, consistent conda or virtualenv environment and avoid mixing package channels where possible.
pandapower may warn that numba is not installed. That warning only affects pandapower runtime speed; install numba separately if pandapower performance matters.
Output Files
The examples write to outputs/, including:
- graphs.pt
- metadata.csv
- graphs_multi.pt
- metadata_multi.csv
- split_random.json
- split_time.json
- split_lono.json
- temporal_windows.pt
- graphs_tables.pt
- graphs_pandapower.pt, when pandapower is installed
- topostategrid_sequence.gif, when visualization dependencies are installed
- topostategrid_case300_20s.gif, when pandapower and visualization dependencies are installed
- README_generated.md
Use topostategrid.export.load_graphs to load .pt files because it handles recent PyTorch weights_only defaults.
The example scripts assume the repository-local data/ layout used by this prototype and overwrite their corresponding files in outputs/ on repeated runs. Use the package functions directly when you need custom input paths or run-specific output directories.
v1.1 Table And pandapower Examples
Build from pandas DataFrames:
import pandas as pd
from topostategrid import build_graph_from_tables

bus_df = pd.DataFrame({
    "bus_id": [1, 2, 3],
    "bus_type": [3, 1, 1],
    "pd": [0.0, 1.5, 0.8],
    "qd": [0.0, 0.4, 0.2],
})
branch_df = pd.DataFrame({
    "from_bus": [1, 2],
    "to_bus": [2, 3],
    "r": [0.01, 0.02],
    "x": [0.05, 0.06],
})
data = build_graph_from_tables(
    bus_df,
    branch_df,
    network_id="toy_3bus",
    sample_id="sample_0",
)
Build from CSV tables:
from topostategrid import build_graph_from_csv_tables

data = build_graph_from_csv_tables(
    "bus.csv",
    "branch.csv",
    network_id="toy_3bus",
)
Build from pandapower:
import pandapower as pp
from topostategrid import build_graph_from_pandapower
net = pp.create_empty_network()
b1 = pp.create_bus(net, vn_kv=110)
b2 = pp.create_bus(net, vn_kv=110)
b3 = pp.create_bus(net, vn_kv=110)
pp.create_ext_grid(net, b1)
pp.create_load(net, b2, p_mw=10.0, q_mvar=3.0)
pp.create_line_from_parameters(net, b1, b2, 1.0, 0.1, 0.2, 0.0, 0.4)
pp.create_line_from_parameters(net, b2, b3, 1.0, 0.1, 0.2, 0.0, 0.4)
pp.runpp(net)
data = build_graph_from_pandapower(net, network_id="pandapower_3bus")
Render constructed graph samples to GIF:
from topostategrid import render_graph_sequence

render_graph_sequence(
    [data],
    "outputs/topostategrid_sequence.gif",
    node_value="vm",
    edge_value="loading_ratio",
)
TopoStateGrid v1.1 still does not include a GNN model, cascading-failure simulator, .mat support, or heterogeneous graph construction.
File details
Details for the file topostategrid-1.1.0.tar.gz.
File metadata
- Download URL: topostategrid-1.1.0.tar.gz
- Upload date:
- Size: 30.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7c841df4a928901d793bcbee8275b7b7169105a878b8ba05825008c11c00d5b8 |
| MD5 | 48f737a4f028b46079fe644a03b9ad94 |
| BLAKE2b-256 | 0544b74e20106271de57fc001ea8160a42829f67ae4c6a59562b2ebc299d71e5 |
File details
Details for the file topostategrid-1.1.0-py3-none-any.whl.
File metadata
- Download URL: topostategrid-1.1.0-py3-none-any.whl
- Upload date:
- Size: 29.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 421fdfb1c5f177e9fb66029ddb0b76bbe8261b87023788976b5be5f8dddcd05d |
| MD5 | b775277fd619fb4e99d842d1faa1839e |
| BLAKE2b-256 | b37c687c2e1f7bedab57345c3a024f9dfe5bc4b006c686f07a2b1c56a2334eac |