Grid Foundation Model

Project description

gridfm-graphkit

This library is brought to you by the GridFM team to train, fine-tune, and interact with a foundation model for the electric power grid.

Installation

Create and activate a virtual environment. Supported Python versions are 3.10, 3.11, and 3.12; 3.12 is recommended:

python -m venv venv
source venv/bin/activate

Install gridfm-graphkit from PyPI

pip install gridfm-graphkit

torch-scatter is a required dependency. It cannot be bundled in pyproject.toml because the correct wheel depends on your PyTorch and CUDA versions, so it must be installed separately.

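Get the PyTorch + CUDA version string used to select the torch-scatter wheel. `+cpu` is appended only when the build is CPU-only and the version string does not already carry a local tag such as `+cpu` or `+cu124`:

TORCH_CUDA_VERSION=$(python -c "import torch; v = torch.__version__; print(v if '+' in v else v + ('+cpu' if torch.version.cuda is None else ''))")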

Install the matching torch-scatter and torch-sparse wheels

pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH_CUDA_VERSION}.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-${TORCH_CUDA_VERSION}.html
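
As a rough illustration of how the wheel index URL above is put together, here is a small Python sketch that follows the same idea as the shell snippet. The helper name `pyg_wheel_index` is hypothetical and not part of gridfm-graphkit, and the version strings in the demo are only example inputs. It takes the values of `torch.__version__` and `torch.version.cuda` as plain arguments so it can be tried without PyTorch installed, and it guards against appending `+cpu` twice.

```python
from typing import Optional


def pyg_wheel_index(torch_version: str, cuda_version: Optional[str]) -> str:
    """Return the data.pyg.org wheel index URL for a given PyTorch build.

    torch_version: value of torch.__version__, e.g. "2.5.1" or "2.5.1+cu124".
    cuda_version:  value of torch.version.cuda, or None for a CPU-only build.
    """
    tag = torch_version
    # Append "+cpu" only for a CPU-only build that carries no local tag yet.
    if "+" not in tag and cuda_version is None:
        tag += "+cpu"
    return f"https://data.pyg.org/whl/torch-{tag}.html"


if __name__ == "__main__":
    # Example inputs only; substitute the values reported by your own install.
    print(pyg_wheel_index("2.5.1", None))
    print(pyg_wheel_index("2.5.1+cu124", "12.4"))
```

The resulting URL is what `pip install torch-scatter -f <url>` expects as its find-links page.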

For documentation generation and unit testing, install with the optional dev and test extras:

pip install "gridfm-graphkit[dev,test]"

CLI commands

Interface to train, fine-tune, evaluate, and run inference on GridFM models using YAML configs and MLflow tracking.

gridfm_graphkit <command> [OPTIONS]

Available commands:

train - Train a new model from scratch
finetune - Fine-tune an existing pre-trained model
evaluate - Evaluate model performance on a dataset
predict - Run inference and save predictions
benchmark - Measure dataloader throughput

Training Models

gridfm_graphkit train --config path/to/config.yaml

Arguments

Argument	Type	Description	Default
`--config`	`str`	Required. Path to the training configuration YAML file.	`None`
`--exp_name`	`str`	MLflow experiment name.	`timestamp`
`--run_name`	`str`	MLflow run name.	`run`
`--log_dir`	`str`	MLflow tracking/logging directory.	`mlruns`
`--data_path`	`str`	Root dataset directory.	`data`
`--compile [MODE]`	`str`	Enable `torch.compile` mode. Valid values: `default`, `reduce-overhead`, `max-autotune`, `max-autotune-no-cudagraphs`. If flag is passed without a value, mode is `default`.	`None`
`--bfloat16`	`flag`	Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`).	`False`
`--tf32`	`flag`	Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`.	`False`
`--dataset_wrapper`	`str`	Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`.	`None`
`--plugins`	`list[str]`	Python packages to import for plugin registration, e.g. `gridfm_graphkit_ee`.	`[]`
`--num_workers`	`int`	Override `data.workers` from YAML. Use `0` to debug worker crashes.	`None`
`--dataset_wrapper_cache_dir`	`str`	Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population.	`None`
`--profiler`	`str`	Enable Lightning profiler (`simple`, `advanced`, `pytorch`).	`None`
`--compute_dc_ac_metrics`	`flag`	Compute ground-truth AC/DC power balance metrics on the test split.	`False`
`--mp_context`	`str`	DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning.	`None`
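
The `--compile [MODE]` row describes a flag whose value is optional. One common way to get that behaviour is `argparse` with `nargs="?"`; the sketch below assumes that approach and is not taken from the gridfm-graphkit parser. The program name `compile-flag-demo` and the function `build_parser` are made up for the example.

```python
import argparse

COMPILE_MODES = (
    "default",
    "reduce-overhead",
    "max-autotune",
    "max-autotune-no-cudagraphs",
)


def build_parser() -> argparse.ArgumentParser:
    """Build a demo parser with a single optional-value --compile flag."""
    parser = argparse.ArgumentParser(prog="compile-flag-demo")
    parser.add_argument(
        "--compile",
        nargs="?",            # the value after the flag is optional
        const="default",      # used when the flag appears without a value
        default=None,         # used when the flag is absent
        choices=COMPILE_MODES,
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    for argv in ([], ["--compile"], ["--compile", "max-autotune"]):
        print(argv, "->", parser.parse_args(argv).compile)
```

With this declaration, omitting the flag yields `None`, a bare `--compile` yields `"default"`, and an explicit value must be one of the listed modes.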

Examples

Standard Training:

gridfm_graphkit train --config examples/config/case30_ieee_base.yaml --data_path examples/data

Fine-Tuning Models

gridfm_graphkit finetune --config path/to/config.yaml --model_path path/to/model.pt

Arguments

Argument	Type	Description	Default
`--config`	`str`	Required. Fine-tuning configuration file.	`None`
`--model_path`	`str`	Required. Path to a pre-trained model state dict.	`None`
`--exp_name`	`str`	MLflow experiment name.	`timestamp`
`--run_name`	`str`	MLflow run name.	`run`
`--log_dir`	`str`	MLflow logging directory.	`mlruns`
`--data_path`	`str`	Root dataset directory.	`data`
`--compile [MODE]`	`str`	Enable `torch.compile` mode. Valid values: `default`, `reduce-overhead`, `max-autotune`, `max-autotune-no-cudagraphs`. If flag is passed without a value, mode is `default`.	`None`
`--bfloat16`	`flag`	Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`).	`False`
`--tf32`	`flag`	Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`.	`False`
`--dataset_wrapper`	`str`	Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`.	`None`
`--plugins`	`list[str]`	Python packages to import for plugin registration, e.g. `gridfm_graphkit_ee`.	`[]`
`--num_workers`	`int`	Override `data.workers` from YAML. Use `0` to debug worker crashes.	`None`
`--dataset_wrapper_cache_dir`	`str`	Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population.	`None`
`--profiler`	`str`	Enable Lightning profiler (`simple`, `advanced`, `pytorch`).	`None`
`--compute_dc_ac_metrics`	`flag`	Compute ground-truth AC/DC power balance metrics on the test split.	`False`
`--mp_context`	`str`	DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning.	`None`
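
To make the `--mp_context` row more concrete, the following sketch shows one way a start-method name could be validated and turned into a multiprocessing context using only the standard library. The function `resolve_mp_context` is hypothetical, and the warning text is illustrative rather than the message gridfm-graphkit emits.

```python
import multiprocessing as mp
import sys
import warnings
from typing import Optional


def resolve_mp_context(name: Optional[str]):
    """Return a multiprocessing context for `name`, or None to keep the default."""
    if name is None:
        return None  # leave the choice to the DataLoader's platform default
    available = mp.get_all_start_methods()
    if name not in available:
        raise ValueError(f"start method {name!r} not available; choose from {available}")
    if sys.platform.startswith("linux") and name != "spawn":
        # Forked workers inherit parent state, which is unsafe once CUDA is initialised.
        warnings.warn(f"start method {name!r} may be unsafe with CUDA; consider 'spawn'")
    return mp.get_context(name)


if __name__ == "__main__":
    ctx = resolve_mp_context("spawn")
    print(ctx.get_start_method())
```

The returned object can be passed as the `multiprocessing_context` argument of `torch.utils.data.DataLoader`, which only takes effect when `num_workers` is greater than 0.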

Evaluating Models

gridfm_graphkit evaluate --config path/to/eval.yaml --model_path path/to/model.pt

Arguments

Argument	Type	Description	Default
`--config`	`str`	Required. Path to evaluation config.	`None`
`--model_path`	`str`	Path to the trained model state dict.	`None`
`--normalizer_stats`	`str`	Path to `normalizer_stats.pt` from a training run. Restores `fit_on_train` normalizers from saved statistics instead of re-fitting on current split.	`None`
`--exp_name`	`str`	MLflow experiment name.	`timestamp`
`--run_name`	`str`	MLflow run name.	`run`
`--log_dir`	`str`	MLflow logging directory.	`mlruns`
`--data_path`	`str`	Dataset directory.	`data`
`--compile [MODE]`	`str`	Enable `torch.compile` mode. Valid values: `default`, `reduce-overhead`, `max-autotune`, `max-autotune-no-cudagraphs`. If flag is passed without a value, mode is `default`.	`None`
`--bfloat16`	`flag`	Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`).	`False`
`--tf32`	`flag`	Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`.	`False`
`--dataset_wrapper`	`str`	Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`.	`None`
`--plugins`	`list[str]`	Python packages to import for plugin registration, e.g. `gridfm_graphkit_ee`.	`[]`
`--num_workers`	`int`	Override `data.workers` from YAML. Use `0` to debug worker crashes.	`None`
`--dataset_wrapper_cache_dir`	`str`	Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population.	`None`
`--profiler`	`str`	Enable Lightning profiler (`simple`, `advanced`, `pytorch`).	`None`
`--compute_dc_ac_metrics`	`flag`	Compute ground-truth AC/DC power balance metrics on the test split.	`False`
`--save_output`	`flag`	Save predictions as `<grid_name>_predictions.parquet` under MLflow artifacts (`.../artifacts/test`).	`False`
`--mp_context`	`str`	DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning.	`None`
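
The `--plugins` and `--dataset_wrapper` rows suggest a registry that is populated when plugin packages are imported. The sketch below shows that general pattern under stated assumptions: `WRAPPER_REGISTRY`, `register_wrapper`, `load_plugins`, and `IdentityWrapper` are hypothetical names, not the gridfm-graphkit API, and the standard-library module `json` merely stands in for a real plugin package.

```python
import importlib
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for a registry such as DATASET_WRAPPER_REGISTRY.
WRAPPER_REGISTRY: Dict[str, Callable] = {}


def register_wrapper(name: str) -> Callable[[Callable], Callable]:
    """Decorator that records a wrapper class or factory under `name`."""
    def decorator(obj: Callable) -> Callable:
        WRAPPER_REGISTRY[name] = obj
        return obj
    return decorator


def load_plugins(package_names: Iterable[str]) -> List[str]:
    """Import each package so that its module-level registrations run."""
    loaded = []
    for pkg in package_names:
        importlib.import_module(pkg)  # import side effects populate the registry
        loaded.append(pkg)
    return loaded


@register_wrapper("IdentityWrapper")
class IdentityWrapper:
    """Toy wrapper that returns items from the wrapped dataset unchanged."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        return self.dataset[idx]


if __name__ == "__main__":
    load_plugins(["json"])  # "json" stands in for a real plugin package
    wrapped = WRAPPER_REGISTRY["IdentityWrapper"]([10, 20, 30])
    print(len(wrapped), wrapped[1])
```

Looking a wrapper up by its registered name is what lets a command-line string select a class without the CLI importing it directly.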

Example with saved normalizer stats

When evaluating a model on a dataset, you can pass the normalizer statistics from the original training run to ensure the same normalization parameters are used:

gridfm_graphkit evaluate \
  --config examples/config/HGNS_PF_datakit_case118.yaml \
  --model_path mlruns/<experiment_id>/<run_id>/artifacts/model/best_model_state_dict.pt \
  --normalizer_stats mlruns/<experiment_id>/<run_id>/artifacts/stats/normalizer_stats.pt \
  --data_path data

Note: The --normalizer_stats flag only affects normalizers with fit_strategy = "fit_on_train" (e.g. HeteroDataMVANormalizer). Per-sample normalizers (HeteroDataPerSampleMVANormalizer) always recompute their statistics from the current dataset regardless of this flag.
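
The difference between the two strategies in the note can be illustrated with a toy example. The classes below are hypothetical and use plain mean/standard-deviation and peak scaling; the source does not show the formulas used by `HeteroDataMVANormalizer` or `HeteroDataPerSampleMVANormalizer`, so treat this as a sketch of the idea only.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Optional


class FitOnTrainScaler:
    """Toy standardiser whose statistics come from one split and are then reused."""

    def __init__(self) -> None:
        self.stats: Optional[Dict[str, float]] = None

    def fit(self, values: List[float]) -> None:
        # Fall back to 1.0 so a constant split does not cause division by zero.
        self.stats = {"mean": fmean(values), "std": pstdev(values) or 1.0}

    def load_stats(self, stats: Dict[str, float]) -> None:
        # Restoring saved statistics skips re-fitting on the current split.
        self.stats = dict(stats)

    def transform(self, values: List[float]) -> List[float]:
        if self.stats is None:
            raise RuntimeError("call fit() or load_stats() first")
        return [(v - self.stats["mean"]) / self.stats["std"] for v in values]


def per_sample_scale(values: List[float]) -> List[float]:
    """Per-sample variant: the scale is recomputed from the sample itself."""
    peak = max(abs(v) for v in values) or 1.0
    return [v / peak for v in values]


if __name__ == "__main__":
    train_split, eval_split = [1.0, 2.0, 3.0], [10.0, 20.0]
    fitted = FitOnTrainScaler()
    fitted.fit(train_split)

    restored = FitOnTrainScaler()
    restored.load_stats(fitted.stats)  # analogous to passing saved stats
    print(restored.transform(eval_split) == fitted.transform(eval_split))

    refit = FitOnTrainScaler()
    refit.fit(eval_split)  # analogous to re-fitting on the current split
    print(refit.transform(eval_split) == fitted.transform(eval_split))
```

Restoring the saved statistics reproduces the training-time transform, whereas re-fitting on the evaluation split generally does not, which is why a model can see differently scaled inputs if the statistics are not carried over.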

Running Predictions

gridfm_graphkit predict --config path/to/config.yaml --model_path path/to/model.pt

Arguments

Argument	Type	Description	Default
`--config`	`str`	Required. Path to prediction config file.	`None`
`--model_path`	`str`	Path to trained model state dict. Optional; may be defined in config.	`None`
`--normalizer_stats`	`str`	Path to `normalizer_stats.pt` from a training run. Restores `fit_on_train` normalizers from saved statistics.	`None`
`--exp_name`	`str`	MLflow experiment name.	`timestamp`
`--run_name`	`str`	MLflow run name.	`run`
`--log_dir`	`str`	MLflow logging directory.	`mlruns`
`--data_path`	`str`	Dataset directory.	`data`
`--dataset_wrapper`	`str`	Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`.	`None`
`--plugins`	`list[str]`	Python packages to import for plugin registration, e.g. `gridfm_graphkit_ee`.	`[]`
`--num_workers`	`int`	Override `data.workers` from YAML. Use `0` to debug worker crashes.	`None`
`--dataset_wrapper_cache_dir`	`str`	Disk cache directory for dataset wrapper; cache is loaded from here when present and saved after first population.	`None`
`--output_path`	`str`	Directory where predictions are saved as `<grid_name>_predictions.parquet`.	`data`
`--compile [MODE]`	`str`	Enable `torch.compile` mode. Valid values: `default`, `reduce-overhead`, `max-autotune`, `max-autotune-no-cudagraphs`. If flag is passed without a value, mode is `default`.	`None`
`--bfloat16`	`flag`	Cast model to `torch.bfloat16` (`model.to(torch.bfloat16)`).	`False`
`--tf32`	`flag`	Enable TF32 on Ampere+ GPUs via `torch.set_float32_matmul_precision("high")`.	`False`
`--profiler`	`str`	Enable Lightning profiler (`simple`, `advanced`, `pytorch`).	`None`
`--mp_context`	`str`	DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning.	`None`

Benchmarking Dataloader Throughput

gridfm_graphkit benchmark --config path/to/config.yaml

Arguments

Argument	Type	Description	Default
`--config`	`str`	Required. Path to configuration YAML file.	`None`
`--data_path`	`str`	Root dataset directory.	`data`
`--epochs`	`int`	Number of epochs to iterate through the train dataloader.	`3`
`--dataset_wrapper`	`str`	Registered dataset wrapper name (see `DATASET_WRAPPER_REGISTRY`), e.g. `SharedMemoryCacheDataset`.	`None`
`--dataset_wrapper_cache_dir`	`str`	Directory for dataset wrapper disk cache.	`None`
`--num_workers`	`int`	Override `data.workers` from YAML.	`None`
`--plugins`	`list[str]`	Python packages to import for plugin registration.	`[]`
`--mp_context`	`str`	DataLoader multiprocessing start method (`spawn`, `fork`, `forkserver`). Defaults to PyTorch's automatic choice. On Linux, `spawn` is recommended for safety (CUDA + fork is unsafe); other choices emit a warning.	`None`
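
For intuition about what a dataloader throughput benchmark measures, here is a minimal sketch that times plain iteration over any iterable for a number of epochs. It is not the implementation behind `gridfm_graphkit benchmark`; the function `benchmark_loader` and the list used as a stand-in loader are assumptions made for the example.

```python
import time
from typing import Iterable, List


def benchmark_loader(loader: Iterable, epochs: int = 3) -> List[float]:
    """Iterate `loader` for `epochs` passes and return batches/second per epoch."""
    rates = []
    for _ in range(epochs):
        start = time.perf_counter()
        n_batches = 0
        for _batch in loader:
            n_batches += 1  # only fetch the batch; no model work is timed
        elapsed = time.perf_counter() - start
        rates.append(n_batches / elapsed if elapsed > 0 else float("inf"))
    return rates


if __name__ == "__main__":
    # A list of lists stands in for a real DataLoader here.
    fake_loader = [list(range(32)) for _ in range(1000)]
    for epoch, rate in enumerate(benchmark_loader(fake_loader, epochs=3)):
        print(f"epoch {epoch}: {rate:,.0f} batches/s")
```

Timing several epochs separately is useful because the first pass often includes one-off costs such as worker start-up or cache population.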

Use built-in help for full command details:

gridfm_graphkit --help
gridfm_graphkit <command> --help

Running Tests

Unit and Integration Tests

Install the test dependencies first, if not already done. This editable install must be run from the root of a source checkout:

pip install -e ".[dev,test]"

Run the full unit test suite:

pytest ./tests

Run the base set integration tests:

pytest ./integrationtests/test_base_set.py

Running Base Set Tests on an LSF Cluster (GPU)

To submit the base set integration tests as an interactive LSF job with GPU access, use bsub. Adjust the paths to match your environment:

bsub -gpu "num=1" \
     -n 16 \
     -R "rusage[mem=32GB] span[hosts=1]" \
     -Is \
     -J gridfm_base_set_tests \
     /bin/bash -c "
       cd /path/to/gridfm-graphkit && \
       export PATH=/path/to/cuda/bin:\$PATH \
               CUDA_HOME=/path/to/cuda \
               LD_LIBRARY_PATH=/path/to/cuda/lib64:\$LD_LIBRARY_PATH && \
       source /path/to/venv/bin/activate && \
       pytest ./integrationtests/test_base_set.py
     "

Key bsub options used above:

Option	Description
`-gpu "num=1"`	Request 1 GPU
`-n 16`	Request 16 CPU slots
`-R "rusage[mem=32GB] span[hosts=1]"`	Reserve 32 GB of memory on a single host
`-Is`	Run as an interactive shell session
`-J <job_name>`	Assign a name to the job
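
If the job is submitted from a script rather than typed by hand, the same options can be assembled as an argument list. The sketch below only builds and prints the command; the helper `build_bsub_command` is hypothetical, and the paths are the same placeholders used above.

```python
import shlex
from typing import List


def build_bsub_command(job_name: str, inner_script: str, gpus: int = 1,
                       slots: int = 16, mem_gb: int = 32) -> List[str]:
    """Assemble the bsub argument list shown above; nothing is executed here."""
    return [
        "bsub",
        "-gpu", f"num={gpus}",
        "-n", str(slots),
        "-R", f"rusage[mem={mem_gb}GB] span[hosts=1]",
        "-Is",
        "-J", job_name,
        "/bin/bash", "-c", inner_script,
    ]


if __name__ == "__main__":
    script = "cd /path/to/gridfm-graphkit && pytest ./integrationtests/test_base_set.py"
    argv = build_bsub_command("gridfm_base_set_tests", script)
    print(shlex.join(argv))  # shell-quoted form, handy for logging
```

Passing the list to `subprocess.run` avoids a second layer of shell quoting, so variables such as `$PATH` inside the inner script would not need the backslash escapes used in the interactive form.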

Concrete example (adapt paths to your cluster setup):

bsub -gpu "num=1" -n 16 -R "rusage[mem=32GB] span[hosts=1]" -Is -J hpo_trial_190 /bin/bash -c "cd /dccstor/terratorch/users/rkie/gitco/gridfm-graphkit && export PATH=/opt/share/cuda-12.8.1/bin:\$PATH CUDA_HOME=/opt/share/cuda-12.8.1 LD_LIBRARY_PATH=/opt/share/cuda-12.8.1/lib64:\$LD_LIBRARY_PATH && source /u/rkie/venvs/venv_gridfm-graphkit/bin/activate && pytest ./integrationtests/test_base_set.py"

Release history

Version	Release date	Notes
0.8.1	Jul 13, 2026	This version
0.8.0	Jul 13, 2026	
0.8.0rc3	Jun 30, 2026	pre-release
0.8.0rc1	Jun 26, 2026	pre-release
0.0.7	Jun 12, 2026	
0.0.7rc1	Jun 12, 2026	pre-release
0.0.6	Sep 3, 2025	
0.0.5	Sep 1, 2025	
0.0.4	Aug 22, 2025	
0.0.3	Aug 5, 2025	
0.0.2	Jun 24, 2025	
0.0.2a0	Jun 24, 2025	pre-release
0.0.1	Jun 24, 2025	

Download files

Source Distribution

gridfm_graphkit-0.8.1.tar.gz (90.7 kB)

Uploaded Jul 13, 2026 Source

Built Distribution

gridfm_graphkit-0.8.1-py3-none-any.whl (99.1 kB)

Uploaded Jul 13, 2026 Python 3

File details

Details for the file gridfm_graphkit-0.8.1.tar.gz.

File metadata

Download URL: gridfm_graphkit-0.8.1.tar.gz
Upload date: Jul 13, 2026
Size: 90.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for gridfm_graphkit-0.8.1.tar.gz
Algorithm	Hash digest
SHA256	`7311ace3f3ba75728ad25b4287e0f760320f9516876f8a22f0abaa5d45e5d624`
MD5	`b457ea1dc49f7e6515ddfe8f3ce27da9`
BLAKE2b-256	`7b7b9cd8b4a6a2c99e5b4fa3addf7b8cd860dee335731e61f4dcc862b01e8ffb`

File details

Details for the file gridfm_graphkit-0.8.1-py3-none-any.whl.

File metadata

Download URL: gridfm_graphkit-0.8.1-py3-none-any.whl
Upload date: Jul 13, 2026
Size: 99.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for gridfm_graphkit-0.8.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2065037edb4a4e08d03acf4b22ec4ccbdcad3f63a34e93f785010353f09b37a0`
MD5	`06d96fdde3b14f1ed7b44c5bfd534986`
BLAKE2b-256	`c1bb26acc1d4c97f19b6402ef81675e45b2cfdb6c64f63ee149ceb082383f836`
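
A downloaded file can be checked against the SHA256 values listed above with the standard library alone. The helpers `sha256_of` and `verify_download` below are hypothetical names for this sketch, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> bool:
    """Compare a file's digest with the published value, ignoring case."""
    return sha256_of(path) == expected_sha256.strip().lower()


if __name__ == "__main__":
    wheel = Path("gridfm_graphkit-0.8.1-py3-none-any.whl")
    published = "2065037edb4a4e08d03acf4b22ec4ccbdcad3f63a34e93f785010353f09b37a0"
    if wheel.exists():
        print("match" if verify_download(wheel, published) else "MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 99.1 kB wheel but makes the helper reusable for larger artifacts.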
